Nondestructive Assessment of Citrus Fruit Quality and Ripening by Visible–Near Infrared Reflectance Spectroscopy

As non-climacteric, citrus fruit are only harvested at their optimal edible ripening stage. The usual approach followed by producers and packinghouses to establish the internal quality and ripening of citrus fruit is to collect fruit sets throughout ripening and use them to determine the quality attributes (QA) by standard and, in many cases, destructive and time-consuming methods. However, due to the large variability within and between orchards, the number of measured fruits is seldom statistically representative of the batch, resulting in a fallible assessment of their internal QA (IQA) and a weak traceability in the citrus supply chain. Visible/near-infrared reflectance spectroscopy (Vis–NIRS) is a nondestructive method that addresses this problem, and has been shown to predict many IQA in a wide range of fruit, including citrus. Yet, its application on a daily basis is not straightforward, and researchers still have several questions to address before it can be implemented routinely in the crop supply chain. This chapter reviews the application of Vis–NIRS in the assessment of the quality and ripening of citrus fruit, and critically evaluates the technique's limiting issues that need further attention by researchers.


Introduction
Citrus fruit are grown commercially in more than 50 countries around the world and are major commodities in the international trade [1,2]. In Europe, the exceptional characteristics of some of these products have earned them Protected Geographical Indication (PGI) status, such as the lemons (Citrus limon (L.) Osbeck) of Menton in France, those of Sorrento, Amalfi and Syracuse and the Sicilian blood orange (Citrus × sinensis) in Italy, the "Algarve Citrus" in Portugal, and the "Valencianos Citrus" in Spain.
As non-climacteric, citrus fruit are only harvested at their optimal edible ripening stage, and are required to meet the expectations of the current consumer who demands for fruit not only with the best appearance, flavor, and nutritional However, all QA vary greatly inside the same orchard, either in terms of absolute values and/or in terms of spatial and temporal distribution, and even in the same tree. This has been shown in citrus orchards of 'Shiranuhi' mandarin (C. unshiu × C. sinensis) × C. reticulata [17], 'Ortanique' (Citrus reticulata Blanco × Citrus sinensis (L) Osbeck) [18], mandarin (Citrus reticulata Blanco) [19], and 'Newhall' and 'Valencia Late' orange [20]. Multiple factors, such as the level of sunlight exposure and the associated fruit temperature on the tree, fruit yield and size, tree vigor and age, rootstocks, site-specific nutritional requirements and micro topographies within the orchard, are reportedly associated with this variability [21][22][23][24]. Furthermore, the location of the orchards and their edaphoclimatic conditions, as well as the cultural practices, also induce variability in the fruit maturation process, leading to different levels of QA and different ripening rates observed for the same cultivar at different sites [20,21]. Consequently, the number of fruits tested with the standard methods is seldom statistically representative of the orchard, leading to the under-representation of the effective ripening stage of the fruit within and between orchards, which results in a limited assessment of their ripening, heterogeneous fruit quality, a deficient OHD management and a weak traceability in the citrus supply chain [25][26][27].
Overall, there is a need to upgrade the management and the sustainability of the citrus fruit supply chain with smart and nondestructive technologies that allow a fast, objective, accurate and extensive assessment of fruit QA and ripening on-tree and in the following postharvest, to replace conventional methods. Their aim would be to deliver the best produce to the markets, and to contribute to reducing the current level of food loss around the globe, which involves a large portion of fruit and vegetables [28][29][30]. Considering how much of the world's population lacks food security, and the importance of these commodities in the provision of essential nutrients and vitamins, which could prevent malnutrition, such technologies would be in line with the sustainable development goals (SDGs) proposed by the Food and Agriculture Organization (FAO), International Fund for Agricultural Development (IFAD), and the World Food Programme (WFP), in the 2030 Sustainable Development Agenda, which supports a global commitment to end poverty, hunger and malnutrition by 2030, creating a #ZeroHunger world [31,32].
The large number of reports published in the past two decades shows active and highly motivated research on the development of various nondestructive technologies for the assessment of quality and ripening parameters of a wide variety of fruit, including citrus [16][33][34][35][36][37]. These techniques are used on inline sorting systems, on the bench or in the field and come in many forms, prices and commercial brands. Among them, visible-near infrared reflectance spectroscopy (Vis-NIRS) is conceivably one of the most suitable and advanced nondestructive technologies currently used to monitor several horticultural products. It has been implemented in applications ranging from inline automated grading systems, assessing up to 10-12 fruit per second, to handheld units suitable for field use, operating in full sunlight and varying ambient temperature [38,39]. Additionally, it continues to grow stronger as a major investigation topic worldwide, with a major potential for improvement and contribution to the state of the art of precision agriculture and agronomic systems management [40].
This chapter comprises a brief explanation of Vis-NIRS fundamentals and a review of the various reports on its application published since 2012. Reports published before 2012 were already covered in the last review by [41] and will not be repeated here, with a few exceptions that represent relevant breakthroughs in the area. It will further attempt a critical evaluation on the limiting issues that need further research, to implement it as an effective nondestructive method to assess these commodities' quality and optimal ripening. The authors invite the reader to complement this chapter with some of the most outstanding reviews published throughout the years, by the main researchers working on the subject (but not only in citrus). These reviews comprise the principles of the technique, its various methods and the listing of fruit and the respective QA for which it has provided calibration models [41][42][43][44][45], the overview on the publications and main research groups in the field [40], various recommendations for future research activity in the area regarding the adequate experimental design and the reporting requirements [38], as well as the current real-life applications available on the market that seem to comply with the warranted robustness for the technology to be integrated in the supply chain of many crops, including citrus [38,39].

Fundamentals of visible-near infrared reflectance spectroscopy (Vis-NIRS)
In this review we will adopt the most common definition, that Vis-NIRS covers the wavelength range 400-2500 nm of the electromagnetic spectrum. The lower limit is widely accepted, since it is the onset of the visible range, but the upper limit is mainly defined by the spectral response of the most common spectrometers. It comprises the visible (Vis) region (400-750 nm), the more penetrative short wave NIR (SWNIR), or Herschel region (750-1100 nm), and the near infrared region (750-2500 nm) of the spectrum [38,39,45]. The NIR radiation was discovered by Friedrich Wilhelm Herschel in 1800, and was first used in agricultural applications to measure the moisture in grain in the late 1960s [45]. The first Vis-NIRS application was commercialized in Japan in 1989 to sort peaches based on SSC in an automated grading line, but research on its principles and applications, and on the development of new customized systems, only followed some decades later and is quite active nowadays [38][39][40].

Interaction of radiation with the fruit
When a light beam from the sun or a tungsten lamp hits a fruit or any other sample, the incident radiation may be specularly reflected, absorbed or transmitted, and the relative contribution of each phenomenon depends on the chemical constitution and physical parameters of the sample (Figure 1) [46]. The spectral distribution of the radiation that penetrates the product changes through wavelength-dependent scattering and absorption processes. The photons that enter the fruit may emerge through multiple scattering in the tissue. Light emerging on the same side of incidence is described as diffuse reflection, while light emerging on the opposite side is described as diffuse transmission. Both diffuse modes may be understood in a general sense as 'transmitted', according to the initial description. The emerging diffuse light is collected by a spectrometer, originating the term diffuse reflection spectroscopy. The spectral features depend on the chemical composition of the product, as well as on its light scattering properties, which are related to the sample microstructure. Fruit and vegetables are turbid media, in which scattering events dominate over absorption in the visible (400-750 nm), and particularly in the SWNIR and NIR ranges of the electromagnetic spectrum (750-2500 nm) [44] (see Figure 1).
In thin rind fruit most of the light interaction takes place in the flesh, and the skin has mainly a modulation effect upon the spectra. In most citrus, however, most of the light interaction occurs in the thick rind and few photons probe the flesh. Thus, the assessment of IQA depends on the interplay between pulp and skin biochemistry and their optical properties [47][48][49].
Absolute quantification of diffuse reflected light (for example, as spectral radiance [W·sr⁻¹·m⁻²·Hz⁻¹]) is of little use, because it obviously depends on the characteristics of the light source. The calculation of reflectance avoids that subjectivity, since it normalizes the absolute measurement of the sample's reflection by that of a reference material, usually a near perfect reflector ('white') in the wavelength range under study. It should be stressed, however, that even the reflectance depends on the collection geometry (solid angle of collection, viewed area, etc.). Common choices for the reference material include Spectralon or Teflon, with nearly 100% reflection in the Vis-NIR. Reflectance R(λ) is calculated according to Eq. (1):

$$R(\lambda) = \frac{S(\lambda) - D(\lambda)}{\mathrm{Ref}(\lambda) - D(\lambda)} \qquad (1)$$

where S stands for the Sample counts, D for the Dark counts, Ref for the Reference counts and λ is the wavelength. Here, counts refer to the digitized output of the spectrometer, which are proportional to the spectral radiance measured in a specific geometry. The dark counts are obtained with the spectrometer closed and represent the electronic noise, which must be subtracted from the sample and reference measurements.
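The dark-corrected ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the processing chain of any particular instrument: the function name `reflectance` and the count values are hypothetical, and returning NaN for channels where the reference is no brighter than the dark signal is an assumption made here for robustness.

```python
def reflectance(sample, reference, dark):
    """Return (S - D) / (Ref - D) for each wavelength channel."""
    if not (len(sample) == len(reference) == len(dark)):
        raise ValueError("sample, reference and dark must have the same length")
    result = []
    for s, ref, d in zip(sample, reference, dark):
        denom = ref - d
        if denom <= 0:
            # Reference no brighter than dark: the channel carries no usable signal.
            result.append(float("nan"))
        else:
            result.append((s - d) / denom)
    return result


# Hypothetical raw counts for four wavelength channels (illustrative values only).
sample_counts = [1100.0, 2100.0, 3100.0, 600.0]
reference_counts = [2100.0, 4100.0, 4100.0, 100.0]
dark_counts = [100.0, 100.0, 100.0, 100.0]

R = reflectance(sample_counts, reference_counts, dark_counts)
```

In practice the three acquisitions would be taken with the same integration time and collection geometry, since the reflectance depends on both.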

'Point' and imaging measurements
Vis-NIRS is most commonly applied on specific 'points' of the fruit, by observing a small area, which produces an average spectrum for that specific site. These are called point measurements. But Vis-NIRS can also be applied on extensive sections across the fruit, through multispectral and hyperspectral measurements, which create an image of the measured sections for each wavelength band [44]. The main difference between the multi- and hyperspectral modes is the number of wavebands used. Multispectral imaging uses a set of filters and a common digital camera to deliver typically no more than ten bands, while hyperspectral cameras merge imaging and spectral separation in the optical hardware to produce hundreds of contiguous wavebands. Another way to look at a hyperspectral image is to think that it yields the reflectance spectrum for each spatial position of a sample (i.e., for each pixel of the image) [44]. Both techniques, although costly, have been shown to successfully assess several IQA, diseases and defects in several fruit, including citrus fruit [50,51]. Yet, extensive investigation is needed to allow both the acquisition and image processing software to be implemented in real-time systems. Thus, this chapter will only address the systems that perform 'point' measurements, given their much wider adoption, cost-effectiveness and ease of use through the supply chain, particularly under field conditions. Further information on the principles and applications of multispectral and hyperspectral Vis-NIRS technologies can be found in the reviews by [44,50].

Instrumentation and measurement setup
There is currently a large variety of both commercial and lab-made customized Vis-NIRS systems, with various shapes, sizes, and prices, that operate in many spectral ranges, and have reportedly allowed the assessment of several QA, the content of critical compounds and the diagnosis and/or prediction of disorders in a wide range of fruits, including citrus (Tables 2-4). Nevertheless, most of the current commercial fruit applications of Vis-NIRS are based on the use of silicon-based spectrophotometers covering the Vis-SWNIR region (400-1100 nm), because of their accessible prices and the larger light penetration depth in this band, in comparison to the significantly more expensive InGaAs-based devices (900-2500 nm), which do not add much value to the quality assessment procedure [39].
From handheld and benchtop units to inline automated grading systems, all Vis-NIRS devices comprise the following fundamental components: an optical spectrometer, a light source (usually a tungsten halogen light bulb) and collection optics (optical fibers, lenses, integrating spheres, dedicated probes). Current spectrometers typically include a connection for an optical fiber, an entrance slit (that defines the spectral resolution), a diffraction grating to separate the light into its spectral components, mirrors for collimation and focusing, and a light sensing device that is usually a one-dimensional CCD (Charge-Coupled Device) or CMOS (Complementary Metal-Oxide Semiconductor).
The Vis-NIR spectra may be acquired according to three principal geometrical configurations, as depicted in Figure 2: the reflectance mode (a), the transmittance mode (b), and the interactance mode (c). The reflectance mode is susceptible to receiving specularly reflected light, which may be a disadvantage, since only a fraction of the collected photons probes the fruit interior. In the transmittance mode the photons necessarily probe the fruit interior; however, the optical signal may be weak and noisy. The interactance mode is a tradeoff between the two previous modes: by using a contact probe it avoids specularly reflected photons and receives only those traveling through the fruit flesh. Also, the distance between light injection and collection is small, ensuring a good optical signal. However, this is also a disadvantage, since the probing depth into the fruit pulp is shallow.
The choice of the geometry is thus of the utmost importance for obtaining good results, and should account for the fruit and the assessed QA. The penetration of NIR radiation into fruit tissue decreases exponentially with the depth, which is quite critical in thick rind fruit such as citrus [50]. Furthermore, the choice of the detection mode might be influenced by the spectral range used, as reported by [48], in which both interactance and reflectance modes produced similar models to assess the SSC of 'Sunkist' navel oranges in the Vis-SWNIR range, but the inclusion of the Vis region degraded this assessment in the transmittance mode. In general, to detect internal defects, the transmittance mode should be chosen, while the other two modes are quite reliable regarding other IQA (Tables 2-4).

Typical Vis-NIRS spectrum and its interpretation
All studies on the use of Vis-NIRS to assess fruit QA start by acquiring the reflectance (R) spectrum, which is then converted to the respective absorbance, log(1/R), spectrum (Figure 3). The main spectral differences observed in a wide variety of fruit are in the visible region, namely in the 400-750 nm range. This is due to changes in the content of pigments through ripening, namely chlorophylls, carotenoids and anthocyanins present in the fruit rind [38]. In fruit that change from green to yellow/orange/red colors through ripening, the spectral information in the pigments' absorption range may provide accessory indirect correlations with IQA such as firmness, as found in 'Rocha' pear (Pyrus communis L.) [77]. This, however, is not so clear in citrus fruit because their color change does not correlate with their maturity and depends on the climate at the orchard's location [6]. Otherwise, the pattern of the absorption spectra in the NIR range is quite similar among the various fruit species, although the position and magnitude of the peaks are fruit specific, even among citrus fruit varieties [41]. The magnitude of the peaks and minima is also dependent on the acquisition mode used, but in general the same features are present and the landscape of the spectra is similar within the same fruit, as reported for 'Sunkist' oranges [48].
The spectra in the NIR range convey information mainly related to vibrational bands (stretching and bending) of the relevant functional organic groups, such as O-H, C-H, C-O and C=O. The main wavebands present in citrus fruit, namely the O-H and C-H vibration absorptions, are presented graphically in Figure 3. These groups exist in all fruit organic molecules, but variations associated with water and storage reserves may induce slight changes in the spectra that may be related to the IQA. Vibration states are quantized and the transitions between states are said to be fundamental or overtones. The fundamental transitions (corresponding to a fundamental band) refer to the transition from the ground state to the first excited state, and take place mainly in the infrared range, that is, above 2500 nm. In this range, the absorption peaks are distinguishable and correlate directly to specific compounds, allowing a better assessment of organic compounds such as vitamin C, citric acid or sucrose, as reported in 'Valencia' orange [74]. In contrast, the overtone bands correspond to transitions to higher excited states, with a large number falling in the NIR range. For example, the first overtone band corresponds to the transition from the ground state to the second excited state. A very crude approximation is that the n-th overtone frequency is close to (n+1) times the fundamental frequency. Thus, a general rule is that overtones have higher frequencies and lower amplitudes than the fundamental. For example, the fundamental band for O-H stretching (1ν) lies around 2700 nm, which means that it is beyond the range of the most common NIR dispersive-type spectrometers. However, the overtones are within their instrumental range: 2ν at 1420 nm (first overtone, strong intensity band), 3ν at 970 nm (2nd overtone, medium intensity) and 4ν at 750 nm (3rd overtone, low/very low intensity band). The quoted values are only indicative of a typical band central value.
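To see how crude the harmonic rule of thumb is, the short sketch below places the n-th overtone at (n + 1) times the fundamental frequency (so that the first overtone, 2ν, lies near twice the fundamental) and compares the result with the O-H band positions quoted above. The function name is hypothetical; only the 2700, 1420, 970 and 750 nm values come from the text.

```python
def harmonic_overtone_wavelength(fundamental_nm, n):
    """Wavelength of the n-th overtone under the harmonic approximation.

    The n-th overtone frequency is taken as (n + 1) times the fundamental,
    so its wavelength is the fundamental wavelength divided by (n + 1).
    """
    if n < 1:
        raise ValueError("n must be >= 1 (n = 1 is the first overtone)")
    return fundamental_nm / (n + 1)


OH_STRETCH_FUNDAMENTAL_NM = 2700.0                      # value quoted in the text
QUOTED_OVERTONES_NM = {1: 1420.0, 2: 970.0, 3: 750.0}   # values quoted in the text

for n, quoted in QUOTED_OVERTONES_NM.items():
    estimate = harmonic_overtone_wavelength(OH_STRETCH_FUNDAMENTAL_NM, n)
    print(f"overtone {n}: harmonic estimate {estimate:.0f} nm, quoted {quoted:.0f} nm")
```

The harmonic estimates fall at shorter wavelengths than the quoted band centers, which is the expected direction of the deviation for anharmonic vibrations and illustrates why the rule is only indicative.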
Indeed, the vibrations are dependent on the chemical environment, which results in a frequency spread of the bands. Finally, it is important to mention the combination bands. These correspond to the superposition of vibration motions. For example, the fundamental bending mode (1δ) of water at 6300 nm (infrared range) may combine with the fundamental stretching mode at 2700 nm (1ν), to generate a combination band around 1900 nm [ν+δ (O-H)]. Bearing in mind that the fruit tissue is composed of many different organic molecules, it is easy to understand that the spectral landscape of fruit NIR reflectance is a continuum, due to band superposition, as previously shown by [74] when comparing NIR and mid-infrared (MIR) spectroscopy to assess several compounds in 'Valencia' oranges. In summary, the NIR spectrum of a fruit contains mainly overtones and combination bands of stretching and bending vibrations of the main functional groups, such as O-H and C-H, of the organic compounds relevant to the fruit IQA. The large number of possible vibrations and corresponding bands originates a spectral landscape with very broad and unspecific features, from which it is nevertheless possible to retrieve useful information. For instance, [74] obtained better prediction of fructose and reducing sugars when using NIRS than MIRS.
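The position of the water combination band quoted above can be checked with a small calculation. The sketch assumes that the frequencies of the combined modes simply add, which neglects anharmonicity; since frequency is proportional to the reciprocal of the wavelength, the reciprocals are summed. The function name is hypothetical.

```python
def combination_band_nm(*wavelengths_nm):
    """Wavelength of a combination band, assuming the mode frequencies add.

    Frequencies are proportional to 1/wavelength, so the reciprocals are summed.
    """
    if not wavelengths_nm:
        raise ValueError("at least one wavelength is required")
    return 1.0 / sum(1.0 / w for w in wavelengths_nm)


# Water stretching (2700 nm) and bending (6300 nm) fundamentals quoted in the text.
estimate = combination_band_nm(2700.0, 6300.0)
print(f"estimated combination band: {estimate:.0f} nm")
```

The estimate is about 1890 nm, consistent with the band "around 1900 nm" mentioned in the text.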

The typical Vis-NIR and NIR reflectance and absorption spectra of 'Valencia Late' orange and 'Rocha' pear are depicted in Figure 3 [78]. Both fruit spectra were relatively flat from 700 to 910 nm, followed by strong water absorption peaks around 970, 1450 and 1940 nm. However, the C-H bands may slightly distort the water peaks, and the analysis of this distortion conveys more information than the main peaks alone. It is from these patterns associated with the OH and CH vibrations that it is possible to retrieve the information about sugars. Even the water bands by themselves may convey information about the sugars, because the concentrations of sugars and water are interdependent [41,66,79,80].
In the Vis-NIR range the most prominent feature is the 3ν(O-H) peak at approximately 975 nm. This peak is reported in the literature in the range 960-980 nm [79], but the actual location depends on multiple factors: (i) the degree of OH bonding; (ii) the temperature; (iii) the presence of other close bands. In other words, within different chemical environments, the OH group will peak at different wavelengths. The effect of (iii) is more clearly observed in Figure 3c. Indeed, the smooth 975 nm peak observed in absorbance has a finer structure that is revealed upon differentiation. Thus, the 4ν(CH2) peak at 930 nm is actually coalescing with the main water peak, causing a depression in the left positive peak of the 2nd derivative. The form of this overlap is a source of information about the content of all organic compounds, namely sugars, acids, proteins, etc.
A minor feature, but consistently observed in most fruit, is the slight inflection around 840 nm, which is caused by the band 3ν+δ(O-H). In this case it is more clearly observable in the reflectance spectrum, Figure 3a. In this discussion it is important to bear in mind that the second derivative of a symmetric peak yields a negative peak at the same position, with two smaller lateral positive peaks. This is clearly observed for the 840 nm feature in the 2nd derivative plot, with the negative peak coinciding with the 3ν+δ(O-H) absorption wavelength. On the contrary, the peaks of the 3ν(O-H) features do not coincide in the absorbance and 2nd derivative plots, which clearly indicates spectral overlap.
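The behavior of the second derivative of a symmetric peak can be verified numerically. The sketch below uses a synthetic Gaussian band centered at 840 nm (the width and height are arbitrary illustrative choices, not fitted to fruit spectra) and a plain central finite difference; real spectra would normally be differentiated with a smoothing method such as Savitzky-Golay, which is not used here.

```python
import math


def gaussian_band(wavelengths, center, width, height=1.0):
    """Synthetic symmetric absorption band (illustrative shape only)."""
    return [height * math.exp(-0.5 * ((w - center) / width) ** 2) for w in wavelengths]


def second_derivative(values, step):
    """Central finite-difference second derivative; the two end points are dropped."""
    return [
        (values[i - 1] - 2.0 * values[i] + values[i + 1]) / step ** 2
        for i in range(1, len(values) - 1)
    ]


step_nm = 1.0
wavelengths = [800.0 + step_nm * i for i in range(81)]          # 800 to 880 nm
absorbance = gaussian_band(wavelengths, center=840.0, width=8.0)
d2 = second_derivative(absorbance, step_nm)
inner_wavelengths = wavelengths[1:-1]                           # grid matching d2

trough_index = min(range(len(d2)), key=d2.__getitem__)
trough_nm = inner_wavelengths[trough_index]                     # negative peak position
left_lobe = max(d2[:trough_index])                              # positive side lobe
right_lobe = max(d2[trough_index + 1:])                         # positive side lobe
```

For an isolated symmetric band the negative peak sits at the band center and the two side lobes are equal; an asymmetry between the lobes, as described for the 975 nm water peak, is a sign of an overlapping neighbor.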
Similar curves are observed in other cultivars. For example, in 'Newhall' orange [66] the same structure for the second derivative plot was observed, although a different technique was used, namely the Norris derivative [81].
Concerning the NIR spectra, those from the oranges show three main peaks at 1190, 1450 and 1940 nm, whose origin may be traced to the 3ν(C-H), 2ν(O-H) and ν+δ(O-H) bands, respectively. However, satellite bands overlap, as in the Vis/NIR case. The most 'pure' peak is the first, around 1200 nm, corresponding to the 3ν(C-H) band. As with the 840 nm band, the absorbance and 2nd derivative peaks coincide. The other two main peaks are more complex blends of two or more bands. For example, the second peak around 1450 nm, although dominated by the stronger 2ν(O-H) band, has contributions from 2ν+δ(C-H) and 2ν+2δ(C-H). Consequently, the 2nd derivative feature associated with this mix is more complex. The same could be said about the third peak.
Furthermore, due to several causes, the various peaks, even when they coincide among different fruit, may differ in how strongly they relate to the various IQA. For instance, the combination band of OH reported at 839 nm correlated highly with SSC in 'Rocha' pear samples, but not in 'Valencia Late' oranges [78].

Chemometrics
As mentioned earlier, Vis-NIR spectra of fresh fruits tend to be composed of a large superposition of absorption bands. The presence of a substantial amount of water in fresh fruit has a big impact on the spectra, dominating most of the spectral landscape. Therefore, the signals corresponding to the absorption bands of key chemical compounds such as sugars and acids become masked by water and are only discernible as weak fluctuations in the spectrum. Given the complex interplay between the multiple absorption bands and their weak amplitudes, the linear relation between chemical compound concentration and absorption (Lambert-Beer law) is, most of the time, almost lost. In order to extract information about chemical concentrations from this type of spectrum, we have to look for patterns in the relationships between different wavelengths. This is done by resorting to multivariate statistical techniques, whose application in the field of analytical chemistry is called Chemometrics. This research field can be considered a subset of the broader area of Machine Learning, and pursues the same goal, i.e., to infer critical information from high dimensional data. There is a vast literature on this subject for those who wish to learn more about Chemometrics.
Here are some suggestions for introductory and advanced levels [82][83][84][85][86]. This section presents a brief introduction to the scientific language used in this area, with the sole objective of familiarizing the reader with the main concepts that are often presented in the literature. In Vis-NIRS of fresh fruit, the input data are spectra, i.e. one-dimensional arrays of values, each one corresponding to the intensity of light (diffusely reflected or absorbed) at a specific wavelength. Each spectrum X_n (n = 1, …, number of samples) represents a measurement or sample, and the points x_i (i = 1, …, number of variables) of the spectrum are usually referred to as input variables, spectral features or simply 'wavelengths'. The macroscopic properties or QA features that are obtained through laboratory testing (e.g. SSC, firmness, etc.) are commonly defined as target variables Y_n or simply attributes. Chemometrics consists of the application of mathematical/statistical methods that allow mapping the spectral features x_i onto the target variables Y. These methods can be subdivided into two broad subcategories: unsupervised and supervised. In the former type of method, only the input variables x_i are used, and the main purpose is to find, for example, trends within the data, clusters that can be used for classification or other general characteristics of the data set. On the other hand, supervised methods use both input and target variables and can be used for classification tasks (e.g. to discriminate between different fruit sub-species or origins) or for quantitative (regression) prediction of attributes that have continuous distributions (e.g. SSC, firmness, TA, etc.).

Spectral pre-processing and outlier detection
In order to implement the best calibration model possible to predict the expected QA, often the spectral data has to be preprocessed before being used. Preprocessing techniques are used to remove irrelevant information (noise, systematic errors and faulty samples) that can degrade the performance of the numerical algorithm used to develop the calibration model. Several preprocessing methods have been created for this purpose and reviews, such as [83,87], present a wider scope of these techniques. A brief summary of the most commonly used methods is presented in Table 1. For Vis-NIRS, the most common forms of spectral preprocessing can be stacked into two groups: scatter corrections (SC) and derivative techniques (DT). Scatter-corrective methods are used to remove the influence of scattered light that can contaminate the diffuse reflectance spectra. The rationale behind SC techniques is to remove effects that are unrelated to the chemical composition of the samples and that just depend on the measurement geometry or samples morphology. On the other hand, DT are designed to improve signal to noise ratios, eliminate systematic baseline biases and enhance spectral variations. Another type of preprocessing commonly mentioned in the literature is outlier detection. This process consists in identifying and removing from the data set, samples that are very different from the rest of the samples. These outliers can be for example, reflectance spectra that were defectively acquired or fruit with odd properties. The idea is to remove these samples from the data set, using some pre-defined metric in order to feed the model only with the most representative samples in the data set that lead to a correct mapping of the attribute being predicted by the calibration model constructed.
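As an illustration of the scatter-correction idea, the sketch below implements two of the methods summarized in Table 1, SNV and MSC, using only the Python standard library. It is a minimal sketch under stated assumptions: the function names and the toy spectra are hypothetical, the mean spectrum is used as the MSC reference, and SNV uses the sample standard deviation (n - 1 denominator), which is a choice made here rather than something fixed by the text.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


def msc(spectra):
    """Multiplicative Scatter Correction against the mean spectrum.

    Each spectrum x is fitted as x ~ a + b * reference by least squares and
    then corrected to (x - a) / b.
    """
    n_points = len(spectra[0])
    reference = [mean(s[i] for s in spectra) for i in range(n_points)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


# Toy data: the same underlying spectrum distorted by additive and
# multiplicative effects (illustrative values, not measured spectra).
base = [0.2, 0.5, 0.9, 0.4, 0.3]
spectra = [[a + b * x for x in base] for a, b in [(0.0, 1.0), (0.1, 1.2), (-0.05, 0.8)]]

snv_spectra = [snv(s) for s in spectra]
msc_spectra = msc(spectra)
```

Because the toy spectra differ only by an offset and a gain, both corrections collapse them onto a single curve; with real spectra the collapse is only partial, since chemical differences between samples remain.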

Clustering
In Vis-NIR spectra of fresh fruit, the common spectrum is often described by smooth mounds and soft depressions. This means that adjacent wavelengths can be highly correlated. Therefore, in order to reduce the redundancy of information provided by neighbouring features [x_i, x_{i+1}, x_{i+2}, …] (also called co-linearity), it is sometimes beneficial to restrict the number of input variables used. This is often called dimensionality reduction and is very important for the proper operation of certain calibration models. The simplest way to deal with this problem is subsampling, where a certain number of spectral features are discarded, e.g. every 3rd or 5th point in the spectra. More elaborate algorithms have also been introduced for dimensionality reduction, the most common being Principal Component Analysis (PCA), Hierarchical Clustering Analysis (HCA) and K-Means. The latter two methods can be used for classification tasks (mapping a cluster to a class) and for outlier detection as well. If a sample lies too far from the defined clusters (according to some metric such as the Euclidean distance or the Mahalanobis distance), this suggests that it might be an outlier.

Table 1. Summary of the most commonly used spectral preprocessing methods.

Technique | Description
MSC (Multiplicative Scattering Correction) [88] | Each spectrum X_n is regressed against a reference spectrum (usually the mean spectrum, X_m) using a least squares method. Then, the corrected spectrum is computed based on these regression coefficients. There are more sophisticated variants of this method (e.g. Extended MSC [89]) that try to correct for additional additive effects.
SNV (Standard Normal Variate) [90] | Each individual spectrum X_n is normalized to have zero mean and unit variance. In some cases, this can also appear as sample standardization.
Smooth derivatives (Savitzky-Golay (SG) [76] and Norris-Williams (NW) [81]) | The derivative of the spectrum with respect to the wavelength is computed. Usually the first derivative removes baseline effects and the second derivative removes baseline and linear trends in the spectrum. Smooth derivatives are computed using an averaging multi-point window in order to be more robust against spurious noise. SG and NW algorithms are the most common methods to compute these derivatives.
De-trending | This process consists of fitting a polynomial (usually 1st or 2nd order) to the spectrum and subtracting it from the signal. This provides baseline correction.
Scaling | Scaling is a sort of umbrella under which we can find multiple types of data manipulations. The most common ones are: baseline subtraction, sub-sampling, normalization on columns (all spectral features in the data set are scaled between max and min values) and normalization on rows (the individual spectral features are scaled between min and max).
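The distance-based outlier screening mentioned in the clustering discussion can be sketched as follows, here combined with PCA for dimensionality reduction. This is a hedged illustration, not a prescribed workflow: it assumes NumPy is available, the data are synthetic, the helper names are hypothetical, and the cut-off of 3.0 is an arbitrary value chosen for the example.

```python
import numpy as np


def pca_scores(X, n_components):
    """Scores of mean-centred spectra on the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def mahalanobis_distances(scores):
    """Mahalanobis distance of each sample from the centre of the score cloud."""
    centred = scores - scores.mean(axis=0)
    inv_cov = np.linalg.inv(np.atleast_2d(np.cov(centred, rowvar=False)))
    return np.sqrt(np.einsum("ij,jk,ik->i", centred, inv_cov, centred))


# Synthetic data set (illustrative only): 30 similar spectra plus one faulty one.
rng = np.random.default_rng(0)
points = np.arange(50)
base = 0.5 + 0.3 * np.sin(points / 8.0)
scales = 1.0 + 0.05 * rng.standard_normal(30)
X = scales[:, None] * base + 0.005 * rng.standard_normal((30, 50))
faulty = base + 0.5 * np.exp(-((points - 25) / 3.0) ** 2)  # spurious bump
X = np.vstack([X, faulty])
outlier_index = X.shape[0] - 1

distances = mahalanobis_distances(pca_scores(X, n_components=2))
THRESHOLD = 3.0  # arbitrary illustrative cut-off, not a recommended value
flagged = np.flatnonzero(distances > THRESHOLD)
```

Working in the space of a few principal components keeps the covariance matrix invertible even when there are more wavelengths than samples, which is the usual situation with spectra.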

Classification and regression models
As we mentioned earlier, depending on the problem at hand, we might need to implement a classification or a regression model for our data. Multiple Linear Regression (MLR) is perhaps one of the most straightforward methods to implement. It extends simple linear regression to the multivariate case by linearly combining the input variables. Due to its simplicity, this method has some drawbacks, namely that it performs poorly when the data are highly co-linear and when the number of features in the data set is higher than the number of samples. This is often the case in fresh fruit Vis-NIRS datasets, and hence its applicability has been limited. One way of overcoming these limitations is to use a dimensionality reduction method, such as PCA, and then perform MLR on the resulting lower-dimensional components. This workflow is known as Principal Component Regression (PCR). Partial Least Squares Regression (PLS) is without question the most widely used method to create calibration models to predict most QA of fresh fruit (Tables 2-4). As opposed to PCA, the PLS algorithm takes into account the covariance between the input variables x_i and the target variables Y_i. In the same spirit as PCA, PLS also projects the data into a latent space, but this time the components are defined along the directions of maximum covariance between x_i and Y_i. These components are called latent variables (also named factors by some researchers) and are built in order to model the target variable; their number is what defines the quality of the PLS model. In general, a low number of latent variables leads to more robust predictions, but that might not always be the case. A variant of PLS named PLS Discriminant Analysis (PLS-DA) can be used to deal with classification scenarios in which the target variables Y_i are not continuous (e.g. 0 and 1 for fruits without and with defects).
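The difference between PCR and PLS can be illustrated with a short sketch. The code below is a minimal NumPy illustration under our own assumptions, not the procedure of any cited study: `fit_pcr` performs MLR on the first PCA scores (obtained by singular value decomposition), and `fit_pls1` is a basic NIPALS-type PLS for a single target variable. The simulated spectra, the band positions and the "SSC" values are hypothetical.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """PCR: PCA on centred X, then MLR of y on the first k score vectors."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = n_components
    gamma = (U[:, :k].T @ yc) / S[:k]   # regression coefficients in score space
    coef = Vt[:k].T @ gamma             # mapped back to the wavelength space
    return coef, y_mean - x_mean @ coef

def fit_pls1(X, y, n_components):
    """PLS1 (NIPALS-type): components follow the X-y covariance direction."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                     # weight: direction of maximum covariance
        w /= np.linalg.norm(w)
        t = E @ w                       # latent variable (scores)
        tt = t @ t
        p = E.T @ t / tt                # X loadings
        q_a = (f @ t) / tt              # y loading
        E = E - np.outer(t, p)          # deflate X
        f = f - q_a * t                 # deflate y
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def r2(y_true, y_pred):
    return 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

# Hypothetical spectra: one band scales with the target, another with an interferent.
rng = np.random.default_rng(1)
wl = np.linspace(700, 1000, 120)
band_target = np.exp(-((wl - 910) / 25) ** 2)
band_other = np.exp(-((wl - 780) / 40) ** 2)

def simulate(n):
    ssc = rng.uniform(8, 14, n)         # target, in arbitrary "degrees Brix"
    other = rng.uniform(0, 1, n)        # interfering constituent
    X = 0.1 * np.outer(ssc, band_target) + np.outer(other, band_other)
    return X + rng.normal(0, 1e-3, X.shape), ssc

X_train, y_train = simulate(40)         # fewer samples (40) than features (120)
X_test, y_test = simulate(20)

pcr_coef, pcr_b0 = fit_pcr(X_train, y_train, n_components=2)
pls_coef, pls_b0 = fit_pls1(X_train, y_train, n_components=2)
r2_pcr = r2(y_test, X_test @ pcr_coef + pcr_b0)
r2_pls = r2(y_test, X_test @ pls_coef + pls_b0)
```

Because the simulated data contain only two sources of variation and very little noise, both methods predict the test set almost perfectly with two components. With real spectra, the number of latent variables would need to be tuned, for instance by cross-validation.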
The models mentioned so far can be described as linear because they rely on a linear combination of the input variables. Besides their ease of implementation, they are also classically appreciated in Chemometrics because they are easy to interpret in terms of feature importance, i.e., after fitting the model to the data we can back-trace some parameters (e.g. regression coefficients) and find which wavelengths or spectral bands contributed most to the prediction. In turn, this allows inferring information about the chemical concentrations and can be used to identify biological and metabolic behaviors.
In the last couple of decades, non-linear models imported from other areas of Machine Learning have begun to permeate Chemometrics and, judging by its frequent use in the literature, the Support Vector Machine (SVM) is one of the most popular. The strategy of this model consists in searching for boundaries that separate two clusters or classes. The algorithm tries to find the best boundary between classes by maximizing a distance margin between neighbouring samples. It has the advantage that it can use kernel tricks to transform the data points into another mathematical space, where these boundaries are easier to establish. SVMs were initially used for classification tasks, but have been extended to deal with regression problems as well (SVR). SVR has been used successfully for many datasets, and its most often mentioned drawback is the complexity of its optimization task. Another popular type of non-linear model that is often used for classification and regression problems is the Neural Network (NN). NNs represent a wide class of algorithms with many types of architectures and are derived from the field of Artificial Intelligence. In recent years, classical NN architectures such as the Multi-Layer Perceptron have been increasingly replaced by more modern architectures developed for Deep Learning. The most promising of these are the so-called Convolutional Neural Networks, which have been very successful in image recognition tasks.

The quality of a calibration model
Independent of the type of model used for prediction or classification, the important thing is to find how well it performs on the desired data. To assess the quality of the predictions made by the calibration models, several metrics are commonly used. In a recent review [38], the author makes a case for uniform reporting of error metrics in future publications. In what follows, these recommendations are highlighted. The partitioning of the data for model development is very important. The data set should always be split into two sub-sets, called train and test sets. The train set, as its name suggests, is used to develop the calibration model and, once the main hyper-parameters have been established, the model is used to predict the test set and its performance is assessed. Model development can be done with the full train set using a cross-validation strategy, or by further splitting it into calibration and tuning (also called assessment or validation) sets. As a note of caution, the names given to these data splits vary between areas of Machine Learning, which can lead to some confusion. Ideally, the test set should be drawn from a different distribution from that of the train set, in which case it is called an external validation set [45]. For example, data from two consecutive harvest seasons are used as the train set, while the test set uses data from a third season. A similar situation arises when the train set is collected from orchards or origins different from those used for validation. Yet, given the large amount of time invested in acquiring this type of data, multi-seasonal or multi-orchard test sets are not often found in the literature. In contrast, laboratory models with homogeneous fruit sets are abundant (Tables 2-4).
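The partitioning strategies described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`random_split`, `external_split`, `kfold_indices`), the season labels and the sample counts are hypothetical, and the 20% test fraction is just a default for the example.

```python
import numpy as np

def random_split(n_samples, test_fraction=0.2, seed=0):
    """Internal validation: random train/test split of one pooled data set."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    return idx[n_test:], idx[:n_test]          # train indices, test indices

def external_split(groups, held_out):
    """External validation: hold out whole groups (seasons, orchards, origins)."""
    test_mask = np.isin(np.asarray(groups), held_out)
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

def kfold_indices(n_samples, k=5, seed=0):
    """Cross-validation inside the train set: yields (calibration, tuning) indices."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        calibration = np.concatenate([folds[j] for j in range(k) if j != i])
        yield calibration, folds[i]

# Hypothetical data set: 50 fruits measured in each of three harvest seasons.
seasons = np.repeat([2019, 2020, 2021], 50)

# Two seasons for training, the third season as external validation set.
train_idx, test_idx = external_split(seasons, held_out=[2021])

# Calibration/tuning folds are drawn from the train set only.
folds = list(kfold_indices(train_idx.size, k=5))

# For comparison, a random 80/20 split mixes all seasons in both sets.
rand_train, rand_test = random_split(seasons.size)
```

With the external split, no fruit from the held-out season is seen during model development, whereas the random split will, in general, place fruit from every season in both the train and the test sets.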
Currently, the most common procedure when constructing and validating Vis-NIRS models for the various fruit QA is to separate a fraction of the available samples as the train/calibration set (usually 80%), and the remainder as the test/validation set (usually 20%). Furthermore, the validation samples are typically chosen as the best possible representation of the whole set and within the variation range of the train set. This has been applied even when the models comprise several species and/or cultivars, orchard locations and harvest years, which are mixed in the calibration and validation sets [68]. All studies included in Tables 2-4 that used this approach were labeled as internal in the validation column. Internal validation does not ensure the success of a continuous monitoring application, which is a dynamic and open process, particularly if one aims to use Vis-NIRS in real-world applications, whether in inline grading systems or in handheld devices. Once the model is applied to the test set and a final prediction is made, one can assess how well the model performs by computing several metrics. If the model was developed for regression, the most used are the root mean squared error (RMSE), the bias, the slope, the coefficient of determination (R²), and the standard deviation ratio (SDR), the ratio of performance to deviation (RPD) or the residual standard deviation (RSD). If the model was developed for classification, the advised metrics are the accuracy (ACC), the F1 score and the receiver operating characteristic (ROC) curve. For completeness, these metrics are often computed not only for the test set, but also for the calibration and tuning sets. Comparing the error metrics of the calibration (C), tuning (CV) and test (P) sets makes it possible to understand how well the model generalizes, i.e. how the information learned by the model during training carries over to the final external validation dataset.

[Table 5: caption not recovered; it refers to Tables 2-4]
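As an illustration of the metrics listed above, the sketch below computes them with NumPy for a small hand-made example. Several definitions vary between publications, so the choices made here are assumptions and are flagged in the comments: R² is computed as 1 − SS_res/SS_tot (rather than as a squared correlation), the slope is that of the least-squares line of predicted on measured values, and RPD is taken as the standard deviation of the measured values divided by the RMSE. The function names and the numbers are hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, bias, slope, R2 and RPD for measured (y_true) vs predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_pred - y_true
    rmse = np.sqrt(np.mean(residuals ** 2))
    bias = residuals.mean()
    # Assumption: slope of the least-squares line of predicted on measured values
    slope = np.polyfit(y_true, y_pred, 1)[0]
    # Assumption: R2 defined as 1 - SS_res / SS_tot
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # Assumption: RPD defined as SD of the measured values divided by the RMSE
    rpd = y_true.std(ddof=1) / rmse
    return {"rmse": rmse, "bias": bias, "slope": slope, "r2": r2, "rpd": rpd}

def classification_metrics(y_true, y_pred):
    """Accuracy and F1 score for binary labels (1 = positive class)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = np.mean(y_true == y_pred)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"accuracy": accuracy, "f1": f1}

# Hypothetical SSC values: every prediction is 0.5 units too high.
reg = regression_metrics([10, 11, 12, 13], [10.5, 11.5, 12.5, 13.5])

# Hypothetical defect labels (1 = with defect, 0 = without defect).
cls = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In this example the constant over-prediction gives a bias and an RMSE of 0.5, a slope of 1 and an R² of 0.8, which shows why the bias and the slope should be reported together with the RMSE: the same RMSE could also arise from purely random errors.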

Prediction of quality attributes
Vis-NIRS combined with various chemometric methods has produced calibration models that predict, often simultaneously, multiple QA of various citrus species and varieties; these are presented in Tables 2-4. The attributes range from fruit size, weight and color [68] to SSC, MI, external and internal defects, and several compounds such as sugars, acids, pigments and antioxidants. As expected, these models predominantly address several varieties of orange and mandarin, but also grapefruit, lime, pomelo, sweet lemon and tangelo. The spectral ranges used cover the whole Vis-NIRS range. The majority of the Vis-NIRS calibration models were obtained from samples collected and assessed under controlled conditions in the laboratory, after fruit temperature equilibration, with either benchtop or handheld devices (Tables 2 and 3). Despite the wide market availability of the latter, with different levels of portability, spectral ranges, sizes and prices, only a few studies have focused on their application to assess the quality and ripening of oranges and mandarins on-tree (Table 4), perhaps due to the complexities involved under field conditions and the deterioration in the performance of calibration models, regardless of the spectral range used [63][64][65][66][67][68]. Nevertheless, the QA assessed on-tree (Table 4) comprise fruit mass and size, color parameters [63,67,68], pericarp thickness, SSC, TA, firmness, MI, juice pH and mass, and the BrimA index, which measures the balance between sweetness and acidity as described by [12]. Notably, the majority of the calibration models exhibited R² < 0.8, regardless of the range used, and did not include external validation, except for [65,66].
Grading lines equipped with Vis-NIR sensors are now commercially available from various companies, to assess both the EQA and IQA of citrus fruit [38][39][40]. Unfortunately, the scientific evidence about the accuracy of these systems is very scarce, due to 'industrial secrecy'. Nevertheless, there are a few cases of partnership between the industrial sector and research groups to establish the real applicability and performance of such equipment in assessing citrus fruit QA [91], in most cases still at the prototype stage, as reported by [92,93]. [91] evaluated the performance of a customized NIR equipment installed underneath the fruit conveyor to sort oranges and mandarins in a Spanish packinghouse. This system, working in transmittance mode in the 650-970 nm spectral range, only provided calibration models that could discriminate between low and high values of SSC for both mandarin (R = 0.76-0.86; SEP ~ 0.9 °Brix; RPD ~ 0.74) and orange (R = 0.87; SEP ~ 0.7 °Brix; RPD < 1.5). No acceptable models were obtained for TA in either species. [92] evaluated the performance of a Vis-NIRS system to assess the SSC of 'Indian River' red grapefruit and 'Honey' tangerine from Florida in an inline sorting prototype, with R² ranging from 0.15 to 0.67. The percentage of correct classification by the prototype averaged 85% for a 10 °Brix SSC setpoint and 79% for an 11 °Brix setpoint in the second year of tests. In turn, [93] reported on the development and laboratory testing of a nondestructive citrus fruit quality monitoring prototype system, which consisted of light detection and ranging (LIDAR) and Vis-NIRS sensors installed on an inclined conveyor to mimic real-time measurement of fruit size and SSC, respectively, during harvest. Laboratory tests in 'Valencia' orange revealed that the system was applicable for instantaneous determination of fruit size (R² = 0.91) and SSC (R² = 0.677, SEP = 0.48 °Brix).
The various Vis-NIRS calibration models presented in this chapter show different levels of accuracy, prediction and robustness for the various attributes, SSC being the IQA most successfully assessed at all spectral ranges, independently of the devices used (Tables 2-4). Both juice pH and vitamin C also seem to be easily assessed by devices operating in the Vis-SWNIRS range [55,94,95], but TA has been shown to require wavelengths > 1000 nm [13,96]. Additionally, calibration models for firmness have been difficult to obtain, although there are a few exceptions reported for several orange varieties, in the reflectance mode and in the ranges 500-1690 nm [67] and 1000-2500 nm [73]. Of course, the calibration models for specific compounds, such as sugars, acids or antioxidants, require in most cases the longer NIR spectral range [12,69,74].
Among the chemometric methods used to construct the calibration models, PLS is still the main linear regression technique used, and the one that produces the best models for the widest number of QA (Tables 2-4). However, there are some exceptions in which non-linear techniques were shown to deliver calibration models with equivalent or even better prediction and accuracy than PLS for several QA. Among these are WT-LSSVR, BP-NN and LS-SVM, which provided models with higher prediction capacity for SS in 'Gannan' orange [58], for SSC, TA and vitamin C in Nanfeng mandarin [55], and in 'Newhall' orange [96], respectively. The LOCAL algorithm has also been shown to produce better models than MPLS for firmness and juice mass in 'Powell Summer Navel' orange [67] and in 'Clemenvilla' mandarin [68]. SIMCA, SVM and particularly PCA-ANN also allowed freeze damage in sweet lemon to be assessed with a total accuracy > 98% [13], and PCA-GRNN allowed granulation in 'Honey' pomelo to be assessed with a classification accuracy (CA) > 95% [14].
Independently of the chemometric technique used, for the majority of the models presented in Tables 2-4 the train and test fruit sets were chosen from the same batch, orchards or seasons. Even when the whole data set comprised fruit from several orchards, harvest seasons or citrus varieties, the usual approach was to randomly choose 80% of the whole data set to construct the calibration model and 20% to validate it, as reported by [68,97]. A truly stringent external validation is thus required to have a realistic idea of the models' performance in orchard and/or cargo batch monitoring. External validation means validation with a dataset of a different origin (spatial or temporal) from the datasets used in calibration. There are some clear examples of this approach, such as those previously reported for mandarin [53,56,97,98,99], orange [12,56,65,66,70,72,73,97] and grapefruit [12,70]. Without effective external validation, it is not possible to know exactly how well these models would work in real conditions, due to the large variability within the trees, orchards, sites and harvest seasons. A certain degree of deterioration of the initial model prediction is nevertheless expected and warrants further attention, as reported by [54,65,66]. Yet, there is space and potential for improvement beyond the 'proof of concept', if one aims to use these devices in the daily routines of orchard management. This has been suggested by [55,66] through model recalibration using a few fruits from the new harvest season/orchard, or by achieving a strong degree of robustness through the construction of a multi-seasonal and multi-orchard model, as reported by [67] for 'Newhall' orange, which will be much more advantageous when assessing the ripening of fruit on-tree.

Future research and perspectives
Vis-NIRS has been incorporated by a large number of companies into commercial inline, benchtop and handheld systems. However, several topics regarding the full potential and limitations of this technology require attention and further research in order to provide the consistency demanded by the daily routines of the citrus supply chain when assessing fruit quality and ripening. Firstly, all researchers engaged in this area should report their results in a uniform way, particularly with respect to the metrics of the models obtained [38]. This would allow a better understanding of the effective advances and contributions made by each study. Other model metrics, such as the prediction gain, may also be useful, as reported by [100]. Secondly, the robustness of the calibration models must be addressed and solved through a stringent multi-year, multi-cultivar and multi-orchard validation, such as previously reported by [66]. The usual approach of validating calibration models with a random fraction of the total available data set, even when the models comprise several varieties, orchard sites and harvest years, does not ensure the success of a continuous monitoring application and delivers unrealistic performance metrics [53,54]. The usual recalibration and spiking approaches, which improve the initial calibration models with a few fruits from the independent data sets that will then be assessed, are common techniques in various commercial devices for inline and benchtop systems. They assume that the fruits used to recalibrate/update the model constitute a faithful representation of the new population. However, this becomes quite difficult to apply if one aims to monitor the evolution of on-tree fruit ripening through time, because the fruit sampled in the first weeks cannot represent those to be measured in the last weeks of the harvest season.
Thirdly, there is a large potential for improving the models by using the non-linear techniques of machine learning and those of deep learning. Fourthly, there is much to understand about the effect of the rind on the assessment of the pulp IQA in citrus fruit, since the NIR radiation hardly reaches the fruit pulp, and both biochemical and optical properties have a major role to play in the spectral data acquired [47,49,95,101,102]. Fifthly, the calibration models should be able to predict attributes that are closer to the organoleptic evaluation of the fruit. This is the case of the BrimA index, a better indicator of fruit sweetness than the SSC, which was satisfactorily predicted in orange, grapefruit and mandarin [12,64]. Finally, the handheld devices must really be tested under field conditions, if one aims to assess the fruit on-tree, which is essential for the OHD decision.

Conclusions
The use of Vis-NIRS combined with different chemometric techniques in the supply chain of citrus fruit is already quite extensive and growing, similarly to many other commodities. In this chapter the authors only addressed the classic spectral 'point' measurements, but it is quite clear that inline, benchtop and handheld devices are all used to assess nondestructively multiple QA in various citrus species and cultivars, with a clear predominance of orange and mandarin. Among these attributes, there are both EQA and IQA, as well as defects caused by various factors, such as physiological disorders. The devices available on the market are from various brands, operate in various ranges and present a wide variety of prices. Aside from the "proof of concept" made by many studies, which the authors tried to cover as much as possible in this chapter, there are several issues that still need to be addressed by researchers, a major one being the need for a stringent external validation of the calibration models, in order to assure robustness and to fulfill the essential requirements for including this technology in the daily routines of this crop supply chain. This is of the utmost importance when considering the assessment of fruit ripening on-tree to determine the optimal harvest date for each orchard, or for sections of the orchard. This is highly significant given the determinant effect of producing and harvesting the fruit at its best ripening stage, thus assuring the best quality throughout the whole postharvest period and shelf-life. As a concluding remark, it is very important to add that these devices are of medium and high cost, and that they are not the kind of technology to 'set and forget', as reiterated by [39]. This demands a budget not only to acquire the systems, but also to maintain them and to keep the calibration models continuously updated and improved, which in most cases requires the assistance of the selling company.
Thus, there must be a cost-benefit that both the producers and packinghouses have to meet through the added commercial value to citrus fruit by these systems, and the consumer willingness to pay for fresh fruit graded in terms of IQA such as sugar content, acidity and nutraceutical properties.