Journal of information and communication convergence engineering 2024; 22(3): 249-255
Published online September 30, 2024
https://doi.org/10.56977/jicce.2024.22.3.249
© Korea Institute of Information and Communication Engineering
Correspondence to : DaeHan Ahn (E-mail: daehani4@gmail.com)
Department of Electrical, Electronic, and Computer Engineering, University of Ulsan, 44610, Republic of Korea
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Advancements in deep learning have enhanced vision-based aggregate analysis. However, further development and studies have encountered challenges, particularly in acquiring large-scale datasets. Data collection is costly and time-consuming, posing a significant challenge in acquiring large datasets required for training neural networks. To address this issue, this study introduces a simulation that efficiently generates the necessary data and labels for training neural networks. We utilized a genetic algorithm (GA) to create optimized lists of aggregates based on the specified values of weight and particle size distribution for the aggregate sample. This enabled sample data collection without conducting sieving tests. Our evaluation of the proposed simulation and GA methodology revealed errors of 1.3% and 2.7 g for aggregate size distribution and weight, respectively. Furthermore, we assessed a segmentation model trained with data from the simulation, achieving a promising preliminary F1 score of 78.18 on the actual aggregate image.
Keywords Aggregate Analysis, Data Generation, Aggregate Detection, Aggregate Segmentation, Genetic Algorithm
The emergence of deep learning has precipitated rapid advancements in the various industries over a short period. This evolution extended its benefits to architecture and led to the development of various innovative applications. In particular, the aggregate image analysis significantly enhanced the quality assessment of cement and concrete. This technology facilitates a direct approach to image analysis using a camera, thereby eliminating the need for manual measurements [1-4]. Despite these advancements, deep-learning-based applications in aggregate inspection face challenges in accelerating research and service development, primarily because of the necessity of compiling large-scale datasets for training.
The training data essential for conducting aggregate image analysis demands aggregate images and their specific attributes, such as particle size distribution (PSD) and weight, as both input and output for the learning models. Additionally, segmentation models that require meticulous labeling efforts to segregate each aggregate gravel individually can be employed for a detailed analysis of aggregates. To achieve this, the conventional data collection method is handled in the following sequence: first, sample preparation and raw image capture; second, aggregate analysis via an actual test; and finally, recording and storing the analysis results. Large-scale data collection faces time and cost challenges, as the entire process—from preparing samples to recording results—is labor-intensive, time-consuming, and costly at every stage. Additionally, for segmentation, manual labor is required to categorize images captured by humans using a direct masking technique, which also significantly increases both time and cost. The second issue is the diversity of the aggregate labels or attributes. Conventional methods do not support the pre-assignment of labels to samples and only permit random sample collection, making it difficult to gather data with a diverse range of labels and produce the desired labels. In particular, aggregate samples collected from the same location tend to have similar labels, making imbalanced labels unavoidable.
To address these challenges, we introduce a simulation-driven approach for collecting a large-scale dataset, as illustrated in Fig. 1, which is designed to emulate aggregate images with characteristics that correspond to user-defined labels in the simulation, collecting three key elements: label, aggregate images corresponding to the label, and aggregate images for the segmentation task. Consequently, this method enables the automated generation of significant volumes of aggregate data and does not require manual intervention.
Moreover, to capture an aggregate image corresponding to the desired label, we propose an aggregate generation model using a genetic algorithm (GA) [5]. For instance, when the model receives user-defined labels regarding the PSD and total weight of the aggregate, the GA model generates a list of aggregates that align with the aggregatés specific labels to be produced. Subsequently, the simulation uses the list to emulate the aggregate, as guided by the list. This approach facilitates the generation of a diverse aggregate dataset featuring the desired characteristics. Additionally, to acquire a ground-truth image for segmentation, we designed an outline shader to emphasize the edge of the aggregate grain. This process enables the extraction of edge information for segmentation, thereby eliminating the need for human effort.
This study proposes a simulation-based approach for collecting large-scale datasets. The approach was designed to emulate the characteristics of a set of images according to custom labels, collecting labels, corresponding set images, and setting images for segmentation tasks. This method allows for the automatic generation of large-scale datasets without manual intervention.
In summary, this study makes three primary contributions: 1) we propose a simulation-based large-scale dataset collection method for vision-based aggregate analysis; 2) we introduce an aggregate-set generation model using a genetic algorithm; and 3) we introduce a ground-truth image generation method for segmentation tasks.
The PSD is crucial in determining the physical properties of concrete mixtures, such as their strength, durability, and stability, which are crucial for the construction industry. Achieving the appropriate PSD is essential for ensuring the desired strength and workability of concrete and asphalt compositions [6,7]. Sieving is one of the most common methods used to assess PSD [8,9,10]. This process involves passing the aggregate through a series of sieves that have progressively smaller mesh sizes, as illustrated in Fig. 2 The amount of aggregate that each sieve retains is weighed, and from these measurements, a PSD curve is generated. This curve provides a detailed analysis of the aggregate granularity, allowing engineers to accurately derive the composition of aggregates. Based on this distribution, predictions can be made regarding the strength and durability of the aggregates. Understanding these properties is fundamental to optimizing the quality and enhancing the performance and longevity of construction materials.
Conducting a traditional sieve analysis, a method that necessitates drying, washing, and measuring as per standards [9,10], is a resource-intensive task that is costly and time-consuming. The advent of modern computer vision techniques has transformed aggregate analysis by providing realtime testing capabilities and significantly reducing manual effort, duration, and cost, thereby offering a more efficient and precise method for evaluating PSD.
An application based on computer vision for aggregate analysis was developed by analyzing 2D images captured by cameras. In previous studies [1-4], models utilizing convolutional layers to extract various features from aggregate images were proposed. This model presents an approach for inferring the characteristics of aggregates by mapping the overall characteristics of aggregate images to the actual distribution values of the aggregates. To enhance the performance of aggregate analysis, models incorporating segmentation techniques such as the Canny and Laplacian filters [11,12] have been proposed, which separate individual aggregate gravels in the image and analyze them based on a combination of their characteristics. Recently, there has been a trend toward improved performance using deep-learning-based segmentation models [13,14].
Despite these advancements, computer vision-based solutions for aggregate analysis remain limited, primarily because of the challenge of the large-scale dataset collection required for training deep neural networks. The conventional data collection process requires manual effort for edge delineation, culminating in considerable time and cost expenditure. Furthermore, the inability to predefine data labels renders the collection of diverse datasets more challenging. To address these obstacles, our primary objective is to establish a simulation environment capable of generating large-scale datasets with diverse labels. This facilitates the training of deep-learning-based segmentation models, overcoming the limitations of current data collection and preparation methods.
For a vision-based aggregate analysis, the dataset must contain both an image of the aggregate sample and its corresponding test result, which is referred to as a label. Additionally, for segmentation, an image that delineates the highlighted outlines is essential. Considering these prerequisites, we propose a simulation-based approach to generate an aggregate dataset, as illustrated in Fig. 1. By leveraging a virtual engine to mirror the real world, this approach facilitates the rapid and accurate acquisition of diverse data, surpassing the efficacy of manual testing conducted by humans.
Following prior studies [1,2,3], we designed the simulation environment to include a plate measuring 35 cm × 30 cm × 10 cm, positioned under a camera at a height of 50 cm, specifically designed to capture aggregate images. The design of the aggregates generated in the simulation environment is referenced in [15]. Once the desired particle size distribution and total weight of the aggregate are input into the proposed simulation, as shown in Fig. 1(a), the GA model generates a list of aggregates to be created within the simulation, as shown in Fig. 1(b). The simulation then sequentially generates aggregates in the virtual environment according to this list. After the aggregate generation is complete, the pair of generated aggregate images captured by the virtual camera and the input characteristics for the aggregate are collected. These processes are illustrated in Fig. 1(c). Additionally, an outline shader was applied to the generated aggregate images to produce images with emphasized outlines, which were stored as labels for segmentation tasks, as shown in Fig. 1(d).
The simulation aims to produce aggregates to obtain data that adhere to the desired criteria, such as the distribution and weight value. Hence, merely generating aggregates at random is insufficient, and an optimization process is essential. To accomplish this, we implemented a genetic algorithm as illustrated in Fig. 3. The overall processes reflect the basic GA, which consists of initialization, evaluation, selection, crossover, and mutation. We customized the algorithm to address our specific problem and organized it as follows. Initially, we introduced each individual, also known as a chromosome, to represent a possible solution, with further details discussed in Section III.B.1. Second, we developed a fitness function to evaluate and select a well-defined chromosome for the next generation, as detailed in Section III.B.2. Finally, we implemented a crossover strategy that addressed the overweight and underweight issues encountered during the crossover. for which solutions are presented in Section III.B.3.
A chromosome, which was the target to be optimized by the GA model, was designed to represent a predefined list of aggregates created in the simulation. Consequently, a gene in the chromosome represents gravel and contains characteristics, such as shape, size, volume, and intended placement, to be created in the simulation.
Fig. 4 depicts the chromosome generation process. Initially, a gene that symbolizes aggregate gravel in the simulation is randomly structured as a vector
The fitness function is designed to produce aggregate lists that satisfy both the desired weight and particle size distribution curve criteria. To achieve this goal, we introduced Eq. (1) to serve as the fitness function.
The equation comprises two components: an evaluation of the weight condition and an assessment of the particle size distribution curve. For weight condition evaluation, the fitness function calculates the ratio between the total weight WT of the generated aggregates and the desired weight WD. The desired weight represents a predetermined parameter, whereas the total weight is determined by aggregating the weight of each gene within a chromosome. The formula is as follows:
Thus, as the total weight of the aggregates to be generated in the simulation approximated the desired weight, the resulting value closed 1.0. Based on this fact, the fitness for the total weight can be evaluated by the difference between 1.0 and the ratio W. Conversely, the second component of the proposed Eq. (1) assesses the extent to which the particle size distribution within a chromosome aligns with the specified desired distribution, as calculated using Eq. (3).
Dn represents the fitness associated with the distribution of aggregate gravel passing through an n-sized sieve, and dn signifies the percentile of particles passing through the n-sized sieve, calculated by aggregating particles within a chromosome. dn represents the desired distribution value for each sieve. The constant a serves as a regularization factor to adjust the strictness of the fitness function, while γ (≤1.0) is another constant that modulates a value. During the GA iteration, the value of a is updated via a ← aγ, indicating a gradual decrease towards zero with each GA cycle. The fitness function for the distribution was formulated by averaging the values for each sieve.
Finally, the outcome of the fitness function will approach 0.0 when a chromosome precisely aligns with the desired weight and distribution. Subsequently, using this fitness function, the GA model selects chromosomes in a rank-based manner.
The crossover stage involves creating a new chromosome by combining two or more existing chromosomes. In our GA procedure, this process can introduce variations in weight as the crossover operation is conducted without considering the weight factor and is performed randomly. To address the total weight discrepancy post-crossover, we introduce the post-processing method depicted in Fig. 5, which is detailed in Algorithm 2. This method accounts for two scenarios, in which the resulting chromosome is either over or under the desired weight. In cases of overweight, a reduction was applied to align with the target weight. Because the weight needs to be reduced, genes are deleted individually starting from the last index of the chromosome to decrease the weight. If the weight value reaches the set value, the process is halted, and the result is output as a new chromosome. Conversely, for chromosomes falling under weight, additional genes were incorporated using the chromosome generator described in the Algorithm. 1, until the weight requirement was satisfied. Consequently, this approach guarantees that all chromosomes consistently match the desired weight value throughout the GA process.
For effective image segmentation, a sufficient number of target images, specifying how the input images should be segmented, must be provided to train the model. However, generating a segmented target image is a laborious, time-, and cost-intensive task. This challenge can be addressed in a simulation by emphasizing the edges of objects, using a technique known as the outline shader. The outline shader is a shading technique used to highlight the outlines of 2D or 3D objects. This method achieves an edge-highlighting effect by rendering a slightly expanded version of the original object's mesh and then rendering the original object on top. The process of creating expanded object Oe can be expressed mathematically as follows:
By duplicating the original object to create a new object, and obtaining the values of the vertices of the object mesh, an expanded mesh can be obtained by slightly moving each vertex in the direction of the normal vector. Applying a single color to the expanded object and rendering it before the original object causes the two objects to overlap with the expanded object appearing behind the original object. We adopted this method and designed a simulation to easily obtain aggregate images for segmentation labels from the original aggregate images by applying this technique.
This study aims to generate aggregates in a simulation that meets the user’s desired specifications, such as total weight and particle size distribution. To achieve this goal, we present a GA model that creates a list of aggregates to be generated in a simulation while satisfying the user specifications. Therefore, the evaluation of this model involved a comparison between the user requirements and the characteristics of the aggregates produced by the model. Specifically, the model was evaluated based on the accuracy of how the characteristics of aggregates obtained through the model matched the user’s requirements. Fig. 6 shows the accuracy obtained while optimizing the list of aggregates to be generated when the requirements are given. Our model consistently meets the required weight value during the generation and crossover processes. Performance exceeds 98.2% from the initial iterations and remains stable with further iterations. The simulation showed a difference of 2.7 g in weight. It means an error of less than 1.0% (≈2.7 g/5 Kg). For particle size distribution, it initially showed considerably low performance because it was randomly set, but as iterations proceeded, it approached the requirement conditions. Figs. 6 and 7 present the comparison results between the user requirements for the ground truth (PSD) and the distribution generated by the model. As shown in the results, there is a high degree of alignment between the two outcomes, with an evaluation showing a performance score of up to 98.7%. These results demonstrate that we can obtain an aggregate image in the simulation that matches the user requirements.
Fig. 8 demonstrates the proposed simulation and application of outline shading for segmentation. Because all the aggregates can be identified and controlled in the simulation, it is possible to accurately emphasize the outline for all the aggregates. Based on this, it is clear that the proposed model can contribute to minimizing human effort and time costs.
The datasets generated by the simulation can help train deep neural networks and are particularly effective for segmentation models. To demonstrate the feasibility of the dataset, we preliminarily evaluate the segmentation model [14], as shown in Fig. 9. The model used in this experiment was trained solely on the generated datasets and not on actual aggregate datasets and was applied to real aggregate data for evaluation. The evaluation metric used was the F1 score, which assessed how well the segmentation lines (i.e., the outlines of the objects) were created. The results showed that the performance improved as more virtually generated data were used, achieving a maximum F1 score of 78.18. Additionally, the accuracy of the aggregate recognition based on the segmentation results was evaluated to determine how well the aggregates could be recognized. This evaluation measures the number of aggregates that can be recognized using an object-recognition algorithm [16,17]. The evaluation results proved that, at an F1 score of 78.18, the object recognition accuracy was 96.53%, demonstrating that nearly all objects could be recognized. This evaluation sufficiently showed the possibility of analyzing real aggregates, even when training was conducted using simulated aggregates.
In this discussion, we provide a simulation for collecting aggregate analysis data based on image processing. However, additional tasks are required to improve the performance. For example, incorporating diverse stone modeling techniques and high-resolution images for meshes can enhance the quality of the dataset. Furthermore, optimizing the neural network performance with the generated dataset requires consideration of variables, such as the location and intensity of the light source, camera noise, and other environmental factors, all of which can be readily simulated. Future evaluations will focus on the impact of texture resolution, lighting conditions, plate size, and other pertinent external factors on the segmentation performance. Advancing the current gravel-generation algorithm to ensure data diversity can significantly boost the effectiveness of the simulations.
This study introduced an innovative strategy for gathering large-scale datasets for vision-based deep learning models for aggregate analysis. The simulator designed for this purpose efficiently creates aggregates within a virtual setting and instantly provides reference data for segmentation, drastically diminishing the resources and time required for data acquisition. A GA was employed to enhance the precision of the simulated data and ensure that they met the specific weight and particle size distribution criteria for the aggregates. The method has a remarkably low error rate, achieving 2.7 g and 98.7% for the weight value and particle size distribution, respectively. Additionally, the segmentation model was trained and evaluated using data generated by the simulation, resulting in an F1 score of 78.18 and an aggregate detection accuracy of 96.53%, proving that the developed model can adequately analyze actual data.
The results obtained from this investigation offer substantial support for the development and assessment of segmentation models tailored for deep-learning-driven aggregate analyses. Notably, the simulator delineated in this study is invaluable for verifying the segmentation model performance and identifying errors in 2D images, thereby enabling further advancements in segmentation research.
This research was supported by Training of experts in the smart manufacture in 2022 funded by the Korea Industrial Complex Corporation.
DaeHan Ahn
DaeHan Ahn is an assistant professor at University of Ulsan. He holds a Ph.D in Information and Communication Convergence Engineering from Daegu Gyeongbuk Institute of Science and Technology. His current research interests include industrial AI, Data Science, and Cyber Physical Systems.
Journal of information and communication convergence engineering 2024; 22(3): 249-255
Published online September 30, 2024 https://doi.org/10.56977/jicce.2024.22.3.249
Copyright © Korea Institute of Information and Communication Engineering.
1Department of Electrical, Electronic, and Computer Engineering, University of Ulsan 44610, Republic of Korea
Correspondence to:DaeHan Ahn (E-mail: daehani4@gmail.com)
Department of Electrical, Electronic, and Computer Engineering, University of Ulsan, 44610, Republic of Korea
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Advancements in deep learning have enhanced vision-based aggregate analysis. However, further development and studies have encountered challenges, particularly in acquiring large-scale datasets. Data collection is costly and time-consuming, posing a significant challenge in acquiring large datasets required for training neural networks. To address this issue, this study introduces a simulation that efficiently generates the necessary data and labels for training neural networks. We utilized a genetic algorithm (GA) to create optimized lists of aggregates based on the specified values of weight and particle size distribution for the aggregate sample. This enabled sample data collection without conducting sieving tests. Our evaluation of the proposed simulation and GA methodology revealed errors of 1.3% and 2.7 g for aggregate size distribution and weight, respectively. Furthermore, we assessed a segmentation model trained with data from the simulation, achieving a promising preliminary F1 score of 78.18 on the actual aggregate image.
Keywords: Aggregate Analysis, Data Generation, Aggregate Detection, Aggregate Segmentation, Genetic Algorithm
The emergence of deep learning has precipitated rapid advancements in the various industries over a short period. This evolution extended its benefits to architecture and led to the development of various innovative applications. In particular, the aggregate image analysis significantly enhanced the quality assessment of cement and concrete. This technology facilitates a direct approach to image analysis using a camera, thereby eliminating the need for manual measurements [1-4]. Despite these advancements, deep-learning-based applications in aggregate inspection face challenges in accelerating research and service development, primarily because of the necessity of compiling large-scale datasets for training.
The training data essential for conducting aggregate image analysis demands aggregate images and their specific attributes, such as particle size distribution (PSD) and weight, as both input and output for the learning models. Additionally, segmentation models that require meticulous labeling efforts to segregate each aggregate gravel individually can be employed for a detailed analysis of aggregates. To achieve this, the conventional data collection method is handled in the following sequence: first, sample preparation and raw image capture; second, aggregate analysis via an actual test; and finally, recording and storing the analysis results. Large-scale data collection faces time and cost challenges, as the entire process—from preparing samples to recording results—is labor-intensive, time-consuming, and costly at every stage. Additionally, for segmentation, manual labor is required to categorize images captured by humans using a direct masking technique, which also significantly increases both time and cost. The second issue is the diversity of the aggregate labels or attributes. Conventional methods do not support the pre-assignment of labels to samples and only permit random sample collection, making it difficult to gather data with a diverse range of labels and produce the desired labels. In particular, aggregate samples collected from the same location tend to have similar labels, making imbalanced labels unavoidable.
To address these challenges, we introduce a simulation-driven approach for collecting a large-scale dataset, as illustrated in Fig. 1, which is designed to emulate aggregate images with characteristics that correspond to user-defined labels in the simulation, collecting three key elements: label, aggregate images corresponding to the label, and aggregate images for the segmentation task. Consequently, this method enables the automated generation of significant volumes of aggregate data and does not require manual intervention.
Moreover, to capture an aggregate image corresponding to the desired label, we propose an aggregate generation model using a genetic algorithm (GA) [5]. For instance, when the model receives user-defined labels regarding the PSD and total weight of the aggregate, the GA model generates a list of aggregates that align with the aggregatés specific labels to be produced. Subsequently, the simulation uses the list to emulate the aggregate, as guided by the list. This approach facilitates the generation of a diverse aggregate dataset featuring the desired characteristics. Additionally, to acquire a ground-truth image for segmentation, we designed an outline shader to emphasize the edge of the aggregate grain. This process enables the extraction of edge information for segmentation, thereby eliminating the need for human effort.
This study proposes a simulation-based approach for collecting large-scale datasets. The approach was designed to emulate the characteristics of a set of images according to custom labels, collecting labels, corresponding set images, and setting images for segmentation tasks. This method allows for the automatic generation of large-scale datasets without manual intervention.
In summary, this study makes three primary contributions: 1) we propose a simulation-based large-scale dataset collection method for vision-based aggregate analysis; 2) we introduce an aggregate-set generation model using a genetic algorithm; and 3) we introduce a ground-truth image generation method for segmentation tasks.
The PSD is crucial in determining the physical properties of concrete mixtures, such as their strength, durability, and stability, which are crucial for the construction industry. Achieving the appropriate PSD is essential for ensuring the desired strength and workability of concrete and asphalt compositions [6,7]. Sieving is one of the most common methods used to assess PSD [8,9,10]. This process involves passing the aggregate through a series of sieves that have progressively smaller mesh sizes, as illustrated in Fig. 2 The amount of aggregate that each sieve retains is weighed, and from these measurements, a PSD curve is generated. This curve provides a detailed analysis of the aggregate granularity, allowing engineers to accurately derive the composition of aggregates. Based on this distribution, predictions can be made regarding the strength and durability of the aggregates. Understanding these properties is fundamental to optimizing the quality and enhancing the performance and longevity of construction materials.
Conducting a traditional sieve analysis, a method that necessitates drying, washing, and measuring as per standards [9,10], is a resource-intensive task that is costly and time-consuming. The advent of modern computer vision techniques has transformed aggregate analysis by providing realtime testing capabilities and significantly reducing manual effort, duration, and cost, thereby offering a more efficient and precise method for evaluating PSD.
An application based on computer vision for aggregate analysis was developed by analyzing 2D images captured by cameras. In previous studies [1-4], models utilizing convolutional layers to extract various features from aggregate images were proposed. This model presents an approach for inferring the characteristics of aggregates by mapping the overall characteristics of aggregate images to the actual distribution values of the aggregates. To enhance the performance of aggregate analysis, models incorporating segmentation techniques such as the Canny and Laplacian filters [11,12] have been proposed, which separate individual aggregate gravels in the image and analyze them based on a combination of their characteristics. Recently, there has been a trend toward improved performance using deep-learning-based segmentation models [13,14].
Despite these advancements, computer vision-based solutions for aggregate analysis remain limited, primarily because of the challenge of the large-scale dataset collection required for training deep neural networks. The conventional data collection process requires manual effort for edge delineation, culminating in considerable time and cost expenditure. Furthermore, the inability to predefine data labels renders the collection of diverse datasets more challenging. To address these obstacles, our primary objective is to establish a simulation environment capable of generating large-scale datasets with diverse labels. This facilitates the training of deep-learning-based segmentation models, overcoming the limitations of current data collection and preparation methods.
For a vision-based aggregate analysis, the dataset must contain both an image of the aggregate sample and its corresponding test result, which is referred to as a label. Additionally, for segmentation, an image that delineates the highlighted outlines is essential. Considering these prerequisites, we propose a simulation-based approach to generate an aggregate dataset, as illustrated in Fig. 1. By leveraging a virtual engine to mirror the real world, this approach facilitates the rapid and accurate acquisition of diverse data, surpassing the efficacy of manual testing conducted by humans.
Following prior studies [1,2,3], we designed the simulation environment to include a plate measuring 35 cm × 30 cm × 10 cm, positioned under a camera at a height of 50 cm, specifically designed to capture aggregate images. The design of the aggregates generated in the simulation environment is referenced in [15]. Once the desired particle size distribution and total weight of the aggregate are input into the proposed simulation, as shown in Fig. 1(a), the GA model generates a list of aggregates to be created within the simulation, as shown in Fig. 1(b). The simulation then sequentially generates aggregates in the virtual environment according to this list. After the aggregate generation is complete, the pair of generated aggregate images captured by the virtual camera and the input characteristics for the aggregate are collected. These processes are illustrated in Fig. 1(c). Additionally, an outline shader was applied to the generated aggregate images to produce images with emphasized outlines, which were stored as labels for segmentation tasks, as shown in Fig. 1(d).
The simulation aims to produce aggregates to obtain data that adhere to the desired criteria, such as the distribution and weight value. Hence, merely generating aggregates at random is insufficient, and an optimization process is essential. To accomplish this, we implemented a genetic algorithm as illustrated in Fig. 3. The overall processes reflect the basic GA, which consists of initialization, evaluation, selection, crossover, and mutation. We customized the algorithm to address our specific problem and organized it as follows. Initially, we introduced each individual, also known as a chromosome, to represent a possible solution, with further details discussed in Section III.B.1. Second, we developed a fitness function to evaluate and select a well-defined chromosome for the next generation, as detailed in Section III.B.2. Finally, we implemented a crossover strategy that addressed the overweight and underweight issues encountered during the crossover. for which solutions are presented in Section III.B.3.
A chromosome, which was the target to be optimized by the GA model, was designed to represent a predefined list of aggregates created in the simulation. Consequently, a gene in the chromosome represents gravel and contains characteristics, such as shape, size, volume, and intended placement, to be created in the simulation.
Fig. 4 depicts the chromosome generation process. Initially, a gene that symbolizes aggregate gravel in the simulation is randomly structured as a vector
The fitness function is designed to produce aggregate lists that satisfy both the desired weight and particle size distribution curve criteria. To achieve this goal, we introduced Eq. (1) to serve as the fitness function.
The equation comprises two components: an evaluation of the weight condition and an assessment of the particle size distribution curve. For weight condition evaluation, the fitness function calculates the ratio between the total weight WT of the generated aggregates and the desired weight WD. The desired weight represents a predetermined parameter, whereas the total weight is determined by aggregating the weight of each gene within a chromosome. The formula is as follows:
Thus, as the total weight of the aggregates to be generated in the simulation approximated the desired weight, the resulting value closed 1.0. Based on this fact, the fitness for the total weight can be evaluated by the difference between 1.0 and the ratio W. Conversely, the second component of the proposed Eq. (1) assesses the extent to which the particle size distribution within a chromosome aligns with the specified desired distribution, as calculated using Eq. (3).
Dn represents the fitness associated with the distribution of aggregate gravel passing through an n-sized sieve, and dn signifies the percentile of particles passing through the n-sized sieve, calculated by aggregating particles within a chromosome. dn represents the desired distribution value for each sieve. The constant a serves as a regularization factor to adjust the strictness of the fitness function, while γ (≤1.0) is another constant that modulates a value. During the GA iteration, the value of a is updated via a ← aγ, indicating a gradual decrease towards zero with each GA cycle. The fitness function for the distribution was formulated by averaging the values for each sieve.
Finally, the outcome of the fitness function will approach 0.0 when a chromosome precisely aligns with the desired weight and distribution. Subsequently, using this fitness function, the GA model selects chromosomes in a rank-based manner.
The crossover stage involves creating a new chromosome by combining two or more existing chromosomes. In our GA procedure, this process can introduce variations in weight as the crossover operation is conducted without considering the weight factor and is performed randomly. To address the total weight discrepancy post-crossover, we introduce the post-processing method depicted in Fig. 5, which is detailed in Algorithm 2. This method accounts for two scenarios, in which the resulting chromosome is either over or under the desired weight. In cases of overweight, a reduction was applied to align with the target weight. Because the weight needs to be reduced, genes are deleted individually starting from the last index of the chromosome to decrease the weight. If the weight value reaches the set value, the process is halted, and the result is output as a new chromosome. Conversely, for chromosomes falling under weight, additional genes were incorporated using the chromosome generator described in the Algorithm. 1, until the weight requirement was satisfied. Consequently, this approach guarantees that all chromosomes consistently match the desired weight value throughout the GA process.
For effective image segmentation, a sufficient number of target images, specifying how the input images should be segmented, must be provided to train the model. However, generating a segmented target image is a laborious, time-, and cost-intensive task. This challenge can be addressed in a simulation by emphasizing the edges of objects, using a technique known as the outline shader. The outline shader is a shading technique used to highlight the outlines of 2D or 3D objects. This method achieves an edge-highlighting effect by rendering a slightly expanded version of the original object's mesh and then rendering the original object on top. The process of creating expanded object Oe can be expressed mathematically as follows:
By duplicating the original object to create a new object, and obtaining the values of the vertices of the object mesh, an expanded mesh can be obtained by slightly moving each vertex in the direction of the normal vector. Applying a single color to the expanded object and rendering it before the original object causes the two objects to overlap with the expanded object appearing behind the original object. We adopted this method and designed a simulation to easily obtain aggregate images for segmentation labels from the original aggregate images by applying this technique.
This study aims to generate aggregates in a simulation that meets the user’s desired specifications, such as total weight and particle size distribution. To achieve this goal, we present a GA model that creates a list of aggregates to be generated in a simulation while satisfying the user specifications. Therefore, the evaluation of this model involved a comparison between the user requirements and the characteristics of the aggregates produced by the model. Specifically, the model was evaluated based on the accuracy of how the characteristics of aggregates obtained through the model matched the user’s requirements. Fig. 6 shows the accuracy obtained while optimizing the list of aggregates to be generated when the requirements are given. Our model consistently meets the required weight value during the generation and crossover processes. Performance exceeds 98.2% from the initial iterations and remains stable with further iterations. The simulation showed a difference of 2.7 g in weight. It means an error of less than 1.0% (≈2.7 g/5 Kg). For particle size distribution, it initially showed considerably low performance because it was randomly set, but as iterations proceeded, it approached the requirement conditions. Figs. 6 and 7 present the comparison results between the user requirements for the ground truth (PSD) and the distribution generated by the model. As shown in the results, there is a high degree of alignment between the two outcomes, with an evaluation showing a performance score of up to 98.7%. These results demonstrate that we can obtain an aggregate image in the simulation that matches the user requirements.
Fig. 8 demonstrates the proposed simulation and application of outline shading for segmentation. Because all the aggregates can be identified and controlled in the simulation, it is possible to accurately emphasize the outline for all the aggregates. Based on this, it is clear that the proposed model can contribute to minimizing human effort and time costs.
The datasets generated by the simulation can help train deep neural networks and are particularly effective for segmentation models. To demonstrate the feasibility of the dataset, we preliminarily evaluate the segmentation model [14], as shown in Fig. 9. The model used in this experiment was trained solely on the generated datasets and not on actual aggregate datasets and was applied to real aggregate data for evaluation. The evaluation metric used was the F1 score, which assessed how well the segmentation lines (i.e., the outlines of the objects) were created. The results showed that the performance improved as more virtually generated data were used, achieving a maximum F1 score of 78.18. Additionally, the accuracy of the aggregate recognition based on the segmentation results was evaluated to determine how well the aggregates could be recognized. This evaluation measures the number of aggregates that can be recognized using an object-recognition algorithm [16,17]. The evaluation results proved that, at an F1 score of 78.18, the object recognition accuracy was 96.53%, demonstrating that nearly all objects could be recognized. This evaluation sufficiently showed the possibility of analyzing real aggregates, even when training was conducted using simulated aggregates.
In this discussion, we provide a simulation for collecting aggregate analysis data based on image processing. However, additional tasks are required to improve the performance. For example, incorporating diverse stone modeling techniques and high-resolution images for meshes can enhance the quality of the dataset. Furthermore, optimizing the neural network performance with the generated dataset requires consideration of variables, such as the location and intensity of the light source, camera noise, and other environmental factors, all of which can be readily simulated. Future evaluations will focus on the impact of texture resolution, lighting conditions, plate size, and other pertinent external factors on the segmentation performance. Advancing the current gravel-generation algorithm to ensure data diversity can significantly boost the effectiveness of the simulations.
This study introduced an innovative strategy for gathering large-scale datasets for vision-based deep learning models for aggregate analysis. The simulator designed for this purpose efficiently creates aggregates within a virtual setting and instantly provides reference data for segmentation, drastically diminishing the resources and time required for data acquisition. A GA was employed to enhance the precision of the simulated data and ensure that they met the specific weight and particle size distribution criteria for the aggregates. The method has a remarkably low error rate, achieving 2.7 g and 98.7% for the weight value and particle size distribution, respectively. Additionally, the segmentation model was trained and evaluated using data generated by the simulation, resulting in an F1 score of 78.18 and an aggregate detection accuracy of 96.53%, proving that the developed model can adequately analyze actual data.
The results obtained from this investigation offer substantial support for the development and assessment of segmentation models tailored for deep-learning-driven aggregate analyses. Notably, the simulator delineated in this study is invaluable for verifying the segmentation model performance and identifying errors in 2D images, thereby enabling further advancements in segmentation research.
This research was supported by Training of experts in the smart manufacture in 2022 funded by the Korea Industrial Complex Corporation.
Rong Ran, Member, KIICE
Journal of information and communication convergence engineering 2022; 20(2): 73-78 https://doi.org/10.6109/jicce.2022.20.2.73Hong, Chul-Eui;
The Korea Institute of Information and Commucation Engineering 2010; 8(1): 13-18 https://doi.org/10.6109/jicce.2010.8.1.013Kim, Myoung-Jong;Kim, Hong-Bae;Kang, Dae-Ki;
The Korea Institute of Information and Commucation Engineering 2010; 8(4): 370-376 https://doi.org/10.6109/jicce.2010.8.4.370