Summary of classification results.
Grading, sorting, and classification of agricultural products are important steps to ensure a profitable and sustainable food industry. Labor-intensive manual work is being replaced by devices and machines that can be used in-line and deliver measurements fast enough for high production volumes. Most previous works focused on only one of the external quality parameters, such as color, size, mass, shape, or defects. In this work, we propose an integrated machine vision system that can grade, sort, and classify mangoes using multiple features, including weight, size, and external defects. We found that weight estimation using our proposed algorithm based on visual information was not statistically different from a conventional weight measurement using a static digital load cell; the estimation error is relatively small (4–5%). We also constructed an artificial neural network model to classify mangoes with multiple types of external defects; the classification error is less than 8% for the worst possible case. The results indicate that our system shows great potential for use in a real industrial setting. Future work will investigate other features, such as ripeness and bruises, to increase the effectiveness and practicality of the system.
- image processing
- machine vision
- weighing system
- defect classification
- real time
- neural network
Food standards are evolving both to ensure the sustainability of agriculture and to satisfy consumer needs. The reputation of producers, and consequently their market share, is based on the quality of the product, which makes quality control crucial. The market, together with ever-increasing social concerns about good agricultural practices, including environmental, economic, and social sustainability and traceability, requires guarantees of high quality from the earliest stages of the crop to postharvest storage and treatments.
Optical sensors have been used extensively in the industry, in applications ranging from the automatic sorting of products into categories to the control of processes that are difficult to observe, for instance, because of their long duration. At this point it is important to note that the quality of biological products is not easy to assess, as individuals of the same category may differ greatly from one another in terms of color, shape, or size. Furthermore, because they are living products, their physicochemical properties evolve over time. Their inherent variability sometimes introduces a certain amount of subjectivity into quality control, thus increasing the difficulty involved in developing automated inspection systems. Addressing these challenges often requires research in advanced and multidisciplinary technologies and sometimes the use of expensive equipment.
In this work, we focus on mango (Mangifera indica), in particular the Cat-Chu cultivar, because of its increasing export potential in our country, Vietnam.
Postharvest handling of mangoes is usually completed in several steps: washing, sorting, grading, packing, storage, and transportation, as shown in Figure 1. Among these, sorting and grading are considered the most important, especially for fresh agricultural products.
Sorting of agricultural products is based on external quality parameters such as color, defects, shape, and size. Manual sorting relies on traditional visual quality inspection performed by trained human operators situated on one or both sides of a conveyor belt. They visually inspect the produce and remove the pieces that do not satisfy the predetermined quality standards. Pieces are transported slowly enough to allow the workers to inspect all of them and even manipulate them to ensure the inspection of most of their surface. The process is normally tedious, slow, subjective, expensive, and inconsistent. Cost-effective, consistent, faster, and more accurate sorting can be achieved with machine vision-assisted sorting.
In this work, we present an integrated machine vision-based inspection system including sorting, grading, and weighing of mangoes—particularly, the Cat-Chu cultivar.
2. Our research
2.1. Mass estimation
Consumers usually prefer fruits having almost uniform masses and shapes. This is also one of the requirements for export. However, mango shapes, which are neither round nor oval, are not easy to model. Commonly accepted laboratory instruments are shown in Figure 2, including a Vernier caliper for size/length measurements, a water displacement setup to estimate volumes, and a planimeter to calculate areas. These methods are time-consuming and not suitable for use in a real production line.
Several studies have tried to formulate a relationship between the masses of mangoes and their sizes [2, 3, 4, 5]. Guzman-Estrada et al. used a set of complicated geometrical parameters to estimate the mass of mangoes; most of the parameters can only be obtained using a mechanical measurement tool. Vasquez-Caicedo et al. tried to use five parameters such as length, width, and thickness at maximum width and minimum width to estimate mango weight. Yimyam et al. used four digital photographs to produce a three-dimensional model of Nam-Dokmai mangoes. Most of these methods did not provide easy-to-obtain parameters, except for Spreer et al., who provide an experimental weight-size correlation based on just three parameters, length (L), maximum width (W), and maximum thickness (T), for a specific mango cultivar (Chok Anan) from Thailand.
The weight estimation using Spreer’s method is as follows:
In this work, we use Spreer’s approach to find a relationship between the shape parameters and the masses of Cat-Chu mangoes. We used over 200 mangoes as a training dataset to establish the weight-size relationship. We also obtained a linear relationship, as shown in Figure 3. The constant in our case is 4.879 × 10⁻⁴, and the obtained R² is about 97.6%.
Our estimated mass is:
To validate our findings, we collected an additional 68 mangoes as a validation dataset. The average error was 3.23%, which supports using the simple linear correlation between mass and size to estimate the corresponding mass.
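The estimation and validation steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the fitted correlation has the product form M = k · L · W · T, with k set to the constant reported above, and it assumes dimensions in millimetres and mass in grams. The function names and the sample measurements are hypothetical.

```python
K_CAT_CHU = 4.879e-4  # constant reported in the text; units ASSUMED to be g/mm^3


def estimate_mass(length_mm, width_mm, thickness_mm, k=K_CAT_CHU):
    """Estimate mass, ASSUMING the product form M = k * L * W * T."""
    return k * length_mm * width_mm * thickness_mm


def mean_abs_percent_error(estimated, measured):
    """Average of |estimate - reference| / reference, in percent."""
    errors = [abs(e - m) / m * 100.0 for e, m in zip(estimated, measured)]
    return sum(errors) / len(errors)


# Hypothetical caliper measurements (mm) and scale readings (g), for illustration only.
samples = [(120.0, 80.0, 70.0), (130.0, 85.0, 72.0)]
scale_g = [330.0, 385.0]

estimates = [estimate_mass(*dims) for dims in samples]
print(estimates, mean_abs_percent_error(estimates, scale_g))
```

Applied to a real validation set, the second function would give the kind of average error percentage quoted above.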
We also designed and constructed an image capturing platform to obtain the images from two different viewpoints (top and side views). The platform would also be used to test the algorithm’s ability to estimate mangoes’ masses solely based on their sizes. An algorithm was developed to capture and process the images while mangoes travel along a conveyor.
Top and side views of the mango were captured to estimate the mango mass using Eq. (2), and the result was compared with a conventional mass measurement using a calibrated digital scale. The masses estimated using this technique were not statistically different from those measured with the digital scale at the 0.05 significance level. Grading mangoes solely by mass gave a classification accuracy of 95–96%.
2.2. Image segmentation
In this section we review a few methods for automatic selection of threshold values; the most important methods that we will discuss are Otsu’s method and the valley-emphasis method. For a more general discussion of thresholding techniques, see “Machine Vision” by Davies.
2.2.1. Otsu’s method
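Otsu’s method used to be one of the de facto algorithms in image segmentation. An image is a two-dimensional matrix of N pixels, each with an intensity level between 0 and L − 1, where L is the number of distinct gray levels. The number of pixels with gray level i is denoted f_i, and the probability of occurrence of gray level i is given by

$$p_i = \frac{f_i}{N}, \qquad p_i \ge 0, \qquad \sum_{i=0}^{L-1} p_i = 1.$$

The average intensity level of the whole image can be calculated as

$$\mu_T = \sum_{i=0}^{L-1} i\,p_i.$$

By segmenting the image using a single threshold, we get two disjoint regions C1 and C2, which are formed by the pixels with gray levels [0, …, t] and [t + 1, …, L − 1], respectively, where t is the threshold level. Normally, C1 and C2 correspond to the object of interest and the background. The probability distributions of C1 and C2 are

$$C_1:\ \frac{p_0}{\omega_1(t)}, \ldots, \frac{p_t}{\omega_1(t)}, \qquad C_2:\ \frac{p_{t+1}}{\omega_2(t)}, \ldots, \frac{p_{L-1}}{\omega_2(t)},$$

where

$$\omega_1(t) = \sum_{i=0}^{t} p_i, \qquad \omega_2(t) = \sum_{i=t+1}^{L-1} p_i.$$

The mean gray-level values of the two classes can be computed as

$$\mu_1(t) = \sum_{i=0}^{t} \frac{i\,p_i}{\omega_1(t)}, \qquad \mu_2(t) = \sum_{i=t+1}^{L-1} \frac{i\,p_i}{\omega_2(t)}.$$

Using discriminant analysis, Otsu showed that the optimal threshold t* can be determined by maximizing the between-class variance, that is

$$t^* = \arg\max_{0 \le t < L} \sigma_B^2(t),$$

where the between-class variance σ_B² is defined as

$$\sigma_B^2(t) = \omega_1(t)\,\bigl(\mu_1(t) - \mu_T\bigr)^2 + \omega_2(t)\,\bigl(\mu_2(t) - \mu_T\bigr)^2.$$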
Otsu’s method works well when the images have clear peaks and valleys; in other words, it works for images whose histograms show clear bimodal or multimodal distributions. However, when an image contains regions with widely varying numbers of pixels, such as small external defects on a large fruit surface, Otsu’s method does not give the correct threshold level, as shown in Figure 4.
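A minimal sketch of Otsu's threshold selection is given below. It works directly on a gray-level histogram and searches exhaustively for the threshold that maximizes the between-class variance. The function name is hypothetical, and the class convention (C1 holds levels 0..t, C2 holds levels t+1..L−1) is an assumption of this sketch.

```python
def otsu_threshold(hist):
    """Return the threshold t that maximizes the between-class variance.

    hist[i] is the number of pixels with gray level i (i = 0 .. L-1).
    Class C1 holds levels 0..t and class C2 holds levels t+1..L-1.
    """
    total = sum(hist)                                      # N
    weighted_total = sum(i * f for i, f in enumerate(hist))
    mu_T = weighted_total / total                          # global mean

    best_t, best_var = 0, -1.0
    n1 = 0   # number of pixels in C1
    s1 = 0   # sum of gray levels in C1
    for t in range(len(hist) - 1):
        n1 += hist[t]
        s1 += t * hist[t]
        n2 = total - n1
        if n1 == 0 or n2 == 0:
            continue                                       # one class is empty
        w1, w2 = n1 / total, n2 / total
        mu1, mu2 = s1 / n1, (weighted_total - s1) / n2
        var_b = w1 * (mu1 - mu_T) ** 2 + w2 * (mu2 - mu_T) ** 2
        if var_b > best_var:                               # ties keep the lowest t
            best_t, best_var = t, var_b
    return best_t


# Toy 8-level histogram with two well-separated modes.
print(otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5]))
```

Pixels with gray level at or below the returned value would be assigned to C1 and the rest to C2.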
2.2.2. Valley-emphasis method
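To overcome the drawbacks of Otsu’s method, Ng et al. proposed the valley-emphasis method. The idea of the valley-emphasis method is to select a threshold value that has a small probability of occurrence (a valley in the gray-level histogram) and that also maximizes the between-group variance, as in Otsu’s method. The formulation for the valley-emphasis method is

$$t^* = \arg\max_{0 \le t < L} \Bigl\{ (1 - p_t)\,\bigl(\omega_1(t)\,\mu_1^2(t) + \omega_2(t)\,\mu_2^2(t)\bigr) \Bigr\},$$

where p_t is the probability of occurrence of gray level t, and ω_k(t) and μ_k(t) are the probability and the mean gray level of class C_k (k = 1, 2). The bracketed sum equals the between-class variance plus the constant μ_T², so without the weight (1 − p_t) it is maximized by the same threshold as in Otsu’s method.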
The extra weight factor, (1 − p_t), favors thresholds whose gray level has a small probability of occurrence p_t. Hence the name valley-emphasis: the selected threshold tends to reside at a valley of the histogram. For images that have an apparent bimodal distribution, the valley-emphasis method should give a threshold value that is very close to the value generated by Otsu’s method because both methods attempt to maximize the between-group variance of the histogram.
The same segmentation experiment done previously using Otsu’s method is repeated using the valley-emphasis method, as shown in Figure 5. The segmentation result is clearly much better, and it can be used in the subsequent analysis steps.
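The effect of the weight factor can be illustrated with a short sketch. It assumes the objective takes the form (1 − p_t)(ω1(t)μ1²(t) + ω2(t)μ2²(t)), where the second factor differs from the between-class variance only by a constant; the function name and the toy histogram are hypothetical.

```python
def valley_emphasis_threshold(hist):
    """Return the threshold t maximizing (1 - p_t) * (w1*mu1^2 + w2*mu2^2).

    hist[i] is the number of pixels with gray level i (i = 0 .. L-1).
    Class C1 holds levels 0..t and class C2 holds levels t+1..L-1.
    """
    total = sum(hist)
    weighted_total = sum(i * f for i, f in enumerate(hist))

    best_t, best_score = 0, -1.0
    n1 = 0   # number of pixels in C1
    s1 = 0   # sum of gray levels in C1
    for t in range(len(hist) - 1):
        n1 += hist[t]
        s1 += t * hist[t]
        n2 = total - n1
        if n1 == 0 or n2 == 0:
            continue                                   # one class is empty
        w1, w2 = n1 / total, n2 / total
        mu1, mu2 = s1 / n1, (weighted_total - s1) / n2
        p_t = hist[t] / total                          # probability of level t
        score = (1.0 - p_t) * (w1 * mu1 ** 2 + w2 * mu2 ** 2)
        if score > best_score:                         # ties keep the lowest t
            best_t, best_score = t, score
    return best_t


# Toy histogram: two modes (levels 0-1 and 6-7) with an empty valley between.
print(valley_emphasis_threshold([5, 5, 0, 0, 0, 0, 5, 5]))
```

In this toy histogram, level 1 is occupied, so the weight (1 − p_t) penalizes it and the selected threshold moves to level 2, the first empty level of the valley.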
2.3. Defect isolation
2.3.1. Defect isolation
Because the mangoes are green, we use the G channel as the main channel, since defects are much easier to observe in it. To make the defects stand out, we apply a simple linear contrast enhancement. The results shown in Figure 6 illustrate the effectiveness of the contrast enhancement.
After image enhancement, we apply another round of valley-emphasis segmentation on the area of the mango mask to isolate the defect zones. The result is shown in Figure 7.
To reduce the computational effort, we consider only defects with an area of 30 pixels or more. After segmenting the defect zones in the previous steps, we use their sizes and locations on the original image to generate the defect candidates for the subsequent classification steps, as shown in Figure 8.
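One common form of linear contrast enhancement is a min-max stretch, sketched below for the G channel. The exact mapping used in this work is not reproduced here, so the stretch to the full 0–255 range is an assumption, as are the function name and the sample image.

```python
import numpy as np


def stretch_green_channel(rgb, out_min=0, out_max=255):
    """Linearly rescale the G channel so that it spans [out_min, out_max].

    rgb is an (H, W, 3) array in RGB order. A min-max stretch is ASSUMED;
    the original enhancement formula may use different limits.
    """
    g = rgb[..., 1].astype(np.float64)
    lo, hi = g.min(), g.max()
    if hi == lo:
        # Flat channel: there is no contrast to stretch.
        return np.full(g.shape, out_min, dtype=np.uint8)
    stretched = (g - lo) / (hi - lo) * (out_max - out_min) + out_min
    return np.round(stretched).astype(np.uint8)


# Hypothetical 2 x 2 image whose G values occupy only the range 50..200.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[..., 1] = [[50, 100], [150, 200]]
print(stretch_green_channel(image))
```

After the stretch, the darkest and brightest G values map to the ends of the output range, which widens the gap between defect pixels and healthy skin.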
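The size filter and the generation of defect candidates can be sketched as a connected-component pass over the binary defect mask. This is an illustration under assumptions: 8-connectivity, bounding boxes as the "location", and the function name are all hypothetical choices; only the 30-pixel limit comes from the text.

```python
from collections import deque

import numpy as np

MIN_DEFECT_AREA = 30  # pixels, the limit stated in the text


def defect_candidates(mask, min_area=MIN_DEFECT_AREA):
    """Return bounding boxes (top, left, bottom, right) of connected regions
    of a boolean mask whose area is at least min_area pixels.

    8-connectivity is ASSUMED; bottom and right are exclusive indices.
    """
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape
    seen = np.zeros_like(mask)
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first flood fill of one connected region.
            queue = deque([(r0, c0)])
            seen[r0, c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and mask[rr, cc] and not seen[rr, cc]):
                            seen[rr, cc] = True
                            queue.append((rr, cc))
            if len(pixels) >= min_area:
                rs = [p[0] for p in pixels]
                cs = [p[1] for p in pixels]
                boxes.append((min(rs), min(cs), max(rs) + 1, max(cs) + 1))
    return boxes


# Hypothetical mask: one 36-pixel region (kept) and one 4-pixel region (dropped).
demo = np.zeros((20, 20), dtype=bool)
demo[2:8, 3:9] = True
demo[15:17, 15:17] = True
print(defect_candidates(demo))
```

Each returned box can then be used to crop the original image, for example `image[top:bottom, left:right]`, to obtain one defect candidate.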
2.3.2. Defect classification
Many kinds of defects degrade the quality of mangoes. The four most commonly seen kinds are shown in Figure 9: stripe-type scars, dark patches, sap burns, and small spots. The defect classification steps tell us how many kinds of defects are present on the fruit skin area, as shown in Figure 10.
2.3.2.1. Color features
We use an artificial neural network whose inputs are color features, shape features, and image statistical information. Li et al. suggested that using the HSV (HSI) color space instead of RGB improves segmentation results. In this research, we use 18 H bins, 3 S bins, and 3 V bins, which gives 18 × 3 × 3 = 162 features in HSV space.
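A joint HSV histogram with these bin counts can be sketched as follows. The uniform quantization of each channel, the normalization to frequencies, and the function name are assumptions of this sketch; only the bin counts come from the text.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 18, 3, 3  # bin counts stated in the text


def hsv_histogram(pixels):
    """Build a joint H-S-V histogram with 18 * 3 * 3 = 162 bins.

    pixels is an iterable of (r, g, b) tuples with components in 0..255.
    Uniform bin edges are ASSUMED. Returns normalized bin frequencies.
    """
    hist = [0] * (H_BINS * S_BINS * V_BINS)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # The min(...) keeps the value 1.0 inside the last bin.
        hb = min(int(h * H_BINS), H_BINS - 1)
        sb = min(int(s * S_BINS), S_BINS - 1)
        vb = min(int(v * V_BINS), V_BINS - 1)
        hist[(hb * S_BINS + sb) * V_BINS + vb] += 1
        count += 1
    return [n / count for n in hist] if count else hist


# Hypothetical defect patch: two green pixels and one black pixel.
features = hsv_histogram([(0, 255, 0), (0, 255, 0), (0, 0, 0)])
print(len(features))
```

The resulting vector has one entry per (H, S, V) bin combination and can be concatenated with the shape and histogram-based features.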
2.3.2.2. Shape features
2.3.2.3. Histogram-based features
The histogram-based features used in this work are first-order statistics that include mean, variance, skewness, and kurtosis for all R, G, and B channels. Let z be a random variable denoting image gray levels and p(z_i), i = 0, 1, 2, …, L − 1, be the corresponding histogram, where L is the number of distinct gray levels. The five following features for each color channel are calculated using the abovementioned histogram:
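The four statistics named above can be computed from a channel histogram as sketched below. The sketch assumes the standard definitions based on standardized central moments, which may differ from the normalization used in this work, and the function name is hypothetical.

```python
def histogram_statistics(hist):
    """Return (mean, variance, skewness, kurtosis) of gray level z.

    hist[z] is the number of pixels with gray level z in one color channel.
    Skewness and kurtosis are ASSUMED to be the standardized third and
    fourth central moments.
    """
    total = sum(hist)
    p = [f / total for f in hist]                    # p(z_i)
    mean = sum(z * pz for z, pz in enumerate(p))

    def central_moment(k):
        return sum((z - mean) ** k * pz for z, pz in enumerate(p))

    variance = central_moment(2)
    if variance == 0:
        # Single gray level: higher standardized moments are undefined.
        return mean, 0.0, 0.0, 0.0
    std = variance ** 0.5
    skewness = central_moment(3) / std ** 3
    kurtosis = central_moment(4) / std ** 4
    return mean, variance, skewness, kurtosis


# Toy histogram: half of the pixels at level 0 and half at level 2.
print(histogram_statistics([1, 0, 1]))
```

Running the function once per R, G, and B channel yields the per-channel statistics that enter the feature vector.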
2.3.2.4. Manual labeling of training data
We prepare our dataset from standardized defect templates of 20 × 20 pixels, each described by the 193 abovementioned features: 162 color-based, 16 shape-based, and 15 histogram-based. We also manually label the images in the training dataset with the kinds of defects they contain. For example, “Image 1” has nine defect zones: one is of the first defect type and the rest are of the fourth defect type. The same procedure is applied to the rest of the training images.
2.3.2.5. Building a neural network model
Our classification problem is a nonlinear one with 193 inputs corresponding to the 193 chosen features and 4 outputs corresponding to the four types of defects. Usually, the number of hidden neurons is chosen experimentally to be about half of the sum of the number of inputs and outputs. Therefore, we chose 98 hidden neurons ((193 + 4)/2 = 98.5, rounded down). The neural network model is illustrated in Figure 11.
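The layer sizes described above can be illustrated with a forward pass through a network with one hidden layer. The tanh hidden activation, the softmax output, the random initialization, and all names are assumptions made for illustration; training is omitted.

```python
import numpy as np

N_INPUTS, N_OUTPUTS = 193, 4
N_HIDDEN = (N_INPUTS + N_OUTPUTS) // 2  # 98, following the rule in the text


def init_params(rng):
    """Small random weights; the initialization scheme is an ASSUMPTION."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(N_INPUTS, N_HIDDEN)),
        "b1": np.zeros(N_HIDDEN),
        "W2": rng.normal(0.0, 0.1, size=(N_HIDDEN, N_OUTPUTS)),
        "b2": np.zeros(N_OUTPUTS),
    }


def forward(params, x):
    """Map feature vectors of shape (n, 193) to class probabilities (n, 4)."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])   # ASSUMED activation
    logits = hidden @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)         # softmax


rng = np.random.default_rng(0)
params = init_params(rng)
probs = forward(params, rng.random((5, N_INPUTS)))  # five hypothetical feature vectors
print(probs.shape, probs.argmax(axis=1))
```

The index of the largest probability in each row would be read as the predicted defect type for that candidate.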
We split our dataset into five smaller ones with different characteristics:
Set 1: All images show no defects.
Set 2: Images show only one type of defect.
Set 3: Images show only two types of defects.
Set 4: Images show three types of defects.
Set 5: Images have all four types of defects.
The classification results are summarized in Table 1. The statistics show that the classification accuracy decreases as the number of defect zones increases, and the computation time also grows. The results are promising for use in an automated sorting and grading system. In the current version, no acceleration techniques have been applied; in the near future, parallel programming on graphics processing units (GPUs) could be used to speed up the process, with the aim of reaching real-time performance.
| | Set 1 | Set 2 | Set 3 | Set 4 | Set 5 | Total |
|---|---|---|---|---|---|---|
| Number of photos | 70 | 50 | 40 | 30 | 30 | 220 |
| Number of defects | 70 | 106 | 272 | 425 | 722 | 1595 |
| Number of defects correctly identified | 70 | 103 | 258 | 392 | 657 | 1480 |
| Number of wrong identifications | 0 | 3 | 14 | 33 | 65 | 115 |
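The per-set and overall error rates follow directly from the counts in the table. The short script below only restates those counts; the variable and function names are hypothetical.

```python
# Counts copied from the table above.
defects = {"Set 1": 70, "Set 2": 106, "Set 3": 272, "Set 4": 425, "Set 5": 722}
correct = {"Set 1": 70, "Set 2": 103, "Set 3": 258, "Set 4": 392, "Set 5": 657}


def error_rate(n_defects, n_correct):
    """Share of wrongly identified defects, in percent."""
    return (n_defects - n_correct) / n_defects * 100.0


rates = {name: error_rate(defects[name], correct[name]) for name in defects}
overall = error_rate(sum(defects.values()), sum(correct.values()))

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}% wrong")
print(f"Overall: {overall:.2f}% wrong")
```

With these counts, the overall error rate is about 7.2%, and the per-set rate rises from 0% for Set 1 to about 9.0% for Set 5.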
In this work, we have established an integrated framework for an automated grading, sorting, and weighing system for Cat-Chu mangoes using features including weight, size, and external defects. We found a simple, easy-to-calculate relationship between a few size parameters and mango mass. The estimation error is small: about 3% when the sizes are obtained with a mechanical measurement tool and less than 5% when they are obtained optically from top- and side-view images. We also proposed a procedure to classify external defects based on an artificial neural network. The classification error is less than 8% for the worst possible case. The results indicate that our system has great potential for use in a real industrial setting. Future work will investigate other features, such as ripeness and bruises, to increase the effectiveness and practicality of the system, as well as a possible speedup to real-time performance using graphics processing units (GPUs) and further code parallelism.
This work is being carried out at the International University, Vietnam National University Ho Chi Minh City, and being funded by Ho Chi Minh City Department of Science and Technology under the Contract Number 236/2017/HD-SKHCN. The project title is “Research, design, fabricate a prototype of a mango sorter for export.”