Coffea arabica L. is a coffee species that probably originated in Abyssinia, now Ethiopia. The genetic diversity of C. arabica has direct economic implications, because breeding new varieties for a global market depends on it. The economic value of C. arabica genetic resources is estimated at US$ 420 million, considering a 10% discount rate. Understanding the extent of trait variability and genetic diversity is essential to guide crosses between genotypes, targeting the development of new varieties with high economic value. This chapter presents the economic importance of C. arabica, primarily to Brazil, the world's largest producer; outlines the origin and dispersion of arabica coffee; and briefly describes the leading germplasm banks. We also point out the contribution of genetic diversity studies, based on morphological and agronomic traits and on molecular markers, to the development of new varieties. Finally, we present an outlook for the future.
- economic importance
- genetic resources
- molecular markers
1. Introduction
Coffee is an everyday beverage consumed enthusiastically throughout the world. This popular beverage is a primary source of annual income and employment, contributing economically on four continents and in many emerging nations. In the second half of the nineteenth century, coffee was transformed into an industrial product as a consequence of the accelerated expansion of coffee production in Brazil, which in turn nurtured the growth of a mass consumer market in the United States.
Coffee is currently grown on over 10 million hectares in more than 80 tropical and subtropical nations. Socially, it plays a relevant role, notably for the subsistence of nearly 20 million coffee-farming families in underdeveloped countries of Asia, Africa and Latin America. Worldwide, coffee ranks as the second-largest export commodity, behind only petroleum products.
According to the USDA 2020/21 Forecast Overview, world coffee production is estimated at approximately 9 million bags (60 kilograms each) above the previous year's record of 176.1 million. Brazil is forecast to account for most of the increase, because its arabica crop enters the on-year of the biennial production cycle and robusta is achieving record output. Brazil is also the leading contributor to the forecast expansion in world exports. Arabica output in Brazil is forecast to rise 6.8 million bags above the preceding season, to 47.8 million. Favorable climate conditions prevailed in most coffee regions, promoting fruit set, development and filling, and thus resulting in high yields.
The genus Coffea includes approximately 124 well-identified species. Coffea canephora P. and Coffea arabica L. are the most commercially important species. C. arabica is a member of the family Rubiaceae and is the only polyploid species in the genus Coffea. This true allotetraploid has 2n = 4x = 44 chromosomes. It is the oldest and most cultivated coffee species [6, 7]. Concerning floral biology, each species of the genus Coffea has its particularities. C. arabica is self-fertile, meaning that reproduction occurs mainly through self-fertilization, with an allogamy index of about 10% on average. Research carried out at the Instituto Agronômico de Campinas concluded that insects (bees in particular), wind and gravity are the main agents responsible for the pollination of coffee.
Genetic diversity is an essential component of biodiversity, necessary for species reproduction and for adapting species to a dynamic environment. Moreover, the assessment of genetic diversity directly affects the development of new varieties through breeding: valuable genetic traits can be transferred to existing cultivars to increase crop yield, improve crop quality, and confer resistance to diseases and pests, among other goals. Plant breeding based on the genetic information of wild races is usual for most global crops and has made an essential contribution to increasing global food security. Nevertheless, the genetic information held in wild races has been declining at an alarming rate, especially for tropical crop species, including C. arabica [11, 12]. Among other scholars, Labouisse et al. cite deforestation as a notable factor contributing to the genetic erosion of coffee in Ethiopia. McNeely et al. list land-use conversion, overexploitation, and the introduction of exotic species as factors contributing to the decimation of native populations.
In plant breeding, it is decisive to identify the phenotypic traits most critical to increasing plant production. The assessment of trait occurrences and dissimilarities in a population is therefore the key to defining potentially useful crosses among accessions. The first step is to understand the extent of the variability of a species. To that end, many countries have strategically invested money and human capital in collecting, assessing, and maintaining genetic resources in germplasm banks. Although many studies emphasize genetic diversity assessed with molecular markers, it is also useful for plant breeders to consider the morphological and agronomic diversity of traits of interest. In this context, we briefly show the economic value of the coffee chain, summarize the leading germplasm banks, and concisely describe the use of genetic diversity assessed through morphological and agronomic traits, along with molecular marker approaches.
2. Economic importance of Coffea arabica
Coffee is an agricultural commodity that stands out in international trade and domestic supply in terms of quantity and value. Developing countries are the leading suppliers, while the main buyers are developed countries, where coffee consumption is high. The soil characteristics of the intertropical and equatorial regions of the world play a fundamental role in the coffee marketing chain worldwide.
Approximately 170 countries are coffee producers, and almost all countries are consumers, highlighting the commercial importance of coffee, which has grown steadily over the last 150 years. This wide distribution of the crop has not prevented production from becoming increasingly concentrated in a few nations. Currently, 70% of the coffee consumed worldwide comes from Brazil, Vietnam, Colombia, and Indonesia. In contrast, the primary consumers are the United States, the European Union, Brazil, and Japan, which account for two-thirds of the global demand for coffee.
Agricultural products usually cannot be stored for long without severe losses of quality. Coffee beans, however, can be stored for decades while remaining in reasonable condition for consumption, provided that limits on humidity, light, and temperature are observed. This characteristic allows coffee growers to manage the harvest strategically. Many of them prefer to store the beans in bags instead of selling them immediately, hoping to obtain better prices.
Coffee was introduced into Brazil in 1727 through French Guiana and spread from northern Brazil to the southeastern states, mainly in the mountain regions. Coffee thrived in these areas owing to climate conditions favorable to its growth, such as mild temperatures, heavy rains, and a distinct dry season.
The cultivation of coffee has evolved significantly and contributed to economic development throughout the history of the Brazilian regions, particularly at the times and places where the crop was first established. Coffee farming dates back to the 18th century in the north of the country, specifically in the State of Pará. Later, it moved to the states of Rio de Janeiro and São Paulo (the Paraíba Valley). In 1850, cultivation spread rapidly towards Serra da Mantiqueira and Santos. In the 20th century, coffee cultivation continued its expansion in the states of São Paulo, southern Minas Gerais, Espírito Santo and Paraná, and also into the northern region of Brazil, in the State of Rondônia. During this period of growth, the Brazilian economy in general was strongly associated with the coffee market, which the Brazilian Federal Government heavily regulated until the mid-1990s.
Twelve states represent the primary coffee-producing regions in Brazil, and there are an estimated 300,000 coffee plantations in the country, spread over 1950 cities. The state of Minas Gerais holds about 50% of the total coffee production in Brazil. Minas Gerais offers a topography and mountain climate ideal for coffee cultivation, which, along with low-cost land and an abundance of cheap labor, may contribute to its outstanding position.
Minas Gerais accounts for approximately 50% of the coffee area cultivated in Brazil, 98% of which is occupied by C. arabica, the most economically relevant species. Some of the most economically outstanding cultivars of C. arabica in Brazil are Mundo Novo, Bourbon, Catuaí Vermelho, and Catuaí Amarelo.
On the world scene, the International Coffee Organization estimated coffee consumption at 150.3 million 60 kg bags in 2014, rising to 152.1 million bags in 2015. Over the last four years, the annual increase has averaged 2%. Consumption increased significantly in Asia, with growth rates between 4.5 and 9% in Indonesia, the Philippines, India, and Thailand. World coffee production in 2015 was 143.4 million 60 kg bags.
World coffee exports were estimated at 10.82 million bags in April 2020, compared with 11.17 million in April 2019. In the first seven months of coffee year 2019/20 (October 2019 to April 2020), exports decreased 3.8% relative to the same period of 2018/19, totaling 72.78 million bags against 75.67 million. Shipments of C. arabica beans in the 12 months ending in April 2020 totaled 81.30 million bags, against 80.75 million bags the previous year. In this context, Brazil has been responsible for 20% of world coffee exports. Owing to the rapid growth of global consumption and its capacity to produce in large quantities, Brazil has become one of the largest exporters of coffee beans. In numbers, this represents more than 34 thousand bags, corresponding to US$ 5.4 billion in revenue, 15% of which consists of specialty coffee. The United States and Germany are the major importing countries. Coffee farming employs approximately 26 million people, many of whom are small farmers who depend mainly on coffee for their livelihood.
On the price side, the International Coffee Organization's composite indicator fell 4.1% in May 2020, to an average of 104.45 US cents per pound, the second consecutive month of decline. The price trend for all C. arabica groups was bearish. From October 2019 to April 2020, shipments from Africa increased 7% to 7.66 million bags, and those from Asia and Oceania increased 0.6% to 23.62 million bags. In the same period, shipments from Central America and Mexico fell 4.9% to 8.77 million bags, and those from South America fell 8.6% to 32.74 million.
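The export figures quoted above can be cross-checked with simple arithmetic. The short Python sketch below recomputes the change in exports between the two October to April periods and sums the regional shipments. The function `percent_change` and the variable names are illustrative choices, not part of any ICO tool or report.

```python
def percent_change(current, previous):
    """Relative change from `previous` to `current`, in percent."""
    return (current - previous) / previous * 100.0


# Figures quoted in the text, in million 60 kg bags.
exports_oct_apr_2019_20 = 72.78
exports_oct_apr_2018_19 = 75.67

# Regional shipments, October 2019 to April 2020.
regional_shipments = {
    "Africa": 7.66,
    "Asia and Oceania": 23.62,
    "Central America and Mexico": 8.77,
    "South America": 32.74,
}

change = percent_change(exports_oct_apr_2019_20, exports_oct_apr_2018_19)
regional_total = sum(regional_shipments.values())

print(f"Change in exports: {change:.1f}%")  # -3.8%
print(f"Sum of regional shipments: {regional_total:.2f} million bags")
print(f"Gap to reported total: {regional_total - exports_oct_apr_2019_20:+.2f}")
```

The regional figures add up to the reported seven-month total to within rounding, and the recomputed decline matches the 3.8% stated in the text.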
Several studies recognize the economic value of genetic diversity. However, these authors point to a market failure in the conservation of coffee genetic resources, especially in the Ethiopian highland forests, warning that the coffee forests will disappear within 10 years if the current rate of devastation persists. This study addressed coffee genetic resources in Ethiopia, the primary centre of diversity, and estimated their potential economic value at nearly US$ 1458 million at a 5% discount rate and US$ 420 million at a 10% discount rate. A plausible explanation for this large effect of the discount rate is the long time lag between the costs incurred by coffee breeding programmes and the gains resulting from the development of enhanced cultivars.
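To illustrate why the discount rate weighs so heavily when benefits arrive long after costs, the following Python sketch discounts a purely hypothetical cash-flow stream at 5% and 10%. The cost, benefit, lag and horizon values are invented for illustration and are not the figures used in the cited valuation; the helper names `present_value` and `breeding_cash_flows` are likewise assumptions.

```python
def present_value(cash_flows, rate):
    """Sum of cash_flows[t] / (1 + rate)**t for years t = 0, 1, 2, ..."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def breeding_cash_flows(cost, benefit, lag_years, horizon_years):
    """Hypothetical stream: a yearly cost during the lag, then a yearly benefit."""
    return [-cost if t < lag_years else benefit for t in range(horizon_years)]


# Illustrative values only (arbitrary monetary units), not data from the study.
flows = breeding_cash_flows(cost=1.0, benefit=10.0, lag_years=15, horizon_years=50)

pv_5 = present_value(flows, 0.05)
pv_10 = present_value(flows, 0.10)

print(f"Present value at 5%:  {pv_5:.1f}")
print(f"Present value at 10%: {pv_10:.1f}")
print(f"Ratio (5% / 10%):     {pv_5 / pv_10:.1f}")
```

With these illustrative numbers, the present value at 5% is several times larger than at 10%, because benefits that begin only after a long lag are discounted far more strongly at the higher rate.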
3. Origin and distribution of Coffea arabica
The study of plant domestication, beyond its role in human cultural evolution, is an excellent experimental system for the study of biological evolution. Numerous differences between wild and domesticated types are related to essential features and basic processes of plant biology, such as adaptation, development, and reproduction.
C. arabica originated in the highland tropical forests of southwestern Ethiopia. From a biological perspective, the genetic base of the world's coffee plantations is considerably narrow, since most commercial coffee varieties to date derive from a limited number of accessions from Ethiopia's forests.
C. arabica is one of the most popular beverage crops globally, accounting for about 70% of the total international coffee market. It is the most valuable coffee species owing to its high beverage quality, and it is consumed every day by millions of people worldwide. C. arabica is thought to have originated in southwestern Ethiopia, specifically in the area called Keffa. The possibility that C. arabica originated on the Boma plateau in Sudan and on Mount Marsabit in Kenya has also been considered. Ethiopia is well substantiated as the primary centre of diversity for arabica coffee [29, 30, 31].
In ancient times, coffee was first noticed by Arab merchants in Ethiopia and taken to Yemen. The origin of C. arabica has been the subject of both molecular and archeological studies, which confirm its Ethiopian origin [28, 33, 34]. C. arabica is a true allotetraploid species (2n = 4x = 44) considered to have originated from the interspecific hybridization of C. canephora and C. eugenioides [35, 36].
Cultivation of C. arabica started after wild coffee was introduced from Ethiopia to Yemen as early as 575 AD. Cultivated arabica coffee is divided into C. arabica var. typica and C. arabica var. Bourbon. After its introduction to Yemen, arabica coffee was distributed worldwide and became the most popular beverage crop. The crop was taken from Yemen to Réunion Island and then introduced to India and Java (Indonesia) [38, 39]. It was then taken from Java to Europe (Amsterdam botanical garden) in 1710 [28, 40]. After that, the coffee plant was taken from Europe to South America in 1718. It was introduced to Martinique in 1720 or 1723 and to Brazil via French Guiana in 1727 [40, 41, 42]. Finally, coffee spread throughout the world from South America. Ferreira et al. (2019) illustrate the origin and dispersion of C. arabica in detail (Figure 1).
4. Coffea arabica genetic resources
The efficient use of available germplasm for breeding purposes requires detailed information on the genetic relatedness among the accessions that compose it, which is primarily affected by the domestication process. The prospect of improving coffee in all desirable aspects depends on the availability and use of the mostly untapped genes found in the wild, in farmers' fields, and in ex-situ germplasm collections.
In-situ conservation of plant species makes it possible to maintain a greater diversity of species and gene pools in a dynamic environment, supporting populations that continue to evolve. Wild coffee grows spontaneously as an understory tree in the tropical forests of Africa. Its range covers a wide geographic area from Guinea in West Africa through Central to eastern Africa, with additional centres of diversity in the Mascarene Islands (La Réunion and Mauritius) in the Indian Ocean, Madagascar, and the Comoros Islands.
Between 1971 and 1997, deforestation affected around 235,400 ha of closed and slightly disturbed forests on the highland plateau of southwest Ethiopia. Numerous international organizations have outlined proposals for the in-situ conservation of C. arabica, but implementation has regrettably lagged as a result of financial constraints.
An effort to preserve the last remaining coffee forests in Ethiopia and to prevent the loss of biodiversity resulted in the creation of the Yayu Biosphere Reserve and the Kafa Biosphere Reserve in 2010. At that time, owing to their strategic interest for sustainability, they became components of the United Nations World Network of Biosphere Reserves. The Yayu Coffee Forest Biosphere Reserve plays a crucial role in in-situ conservation, as it comprises the last remaining montane rainforest fragments with wild C. arabica populations in the world.
Given this alarming scenario, the strategic importance of wild C. arabica prompted exploration missions to its primary centre of origin (Ethiopia and Kenya) and to the secondary centre of diversity, Yemen. In 1964–1965, the Food and Agriculture Organization of the United Nations (FAO) conducted a collecting expedition for coffee germplasm in different locations in Ethiopia. In 1966, an expedition by ORSTOM (Office de la Recherche Scientifique et Technique Outre-Mer, the former designation of the Institut de Recherche pour le Développement [IRD]) collected germplasm from 70 different origins. Despite the original purpose, most accessions were collected from cultivated coffee, and only some were native to the understory of the tropical forest.
The accelerated devastation of tropical forest ecosystems in Africa, Madagascar, the Comoros and the Mascarene Islands drove collecting missions for other Coffea species. Those expeditions yielded a total of 20,000 wild coffee trees, representing more than 70 species, and identified 300 wild coffee populations.
According to Bramel et al., the consensus among the majority of institutions worldwide is that the conservation of the collections is secure, owing to the commitment and engagement of the institutes and their teams. Nevertheless, most institutions are challenged, to some degree, to cover the yearly cost of routine conservation operations. A costing study for the Centro Agronómico Tropical de Investigación y Enseñanza (CATIE) confirms the long-term consequences of neglect when funding is insufficient, which is quite alarming.
Broadly, the conservation strategies applied to C. arabica accessions may be in-situ, ex-situ, or both. In-situ conservation involves maintaining genetic material within native populations by establishing ecosystem reserves such as national parks and refuges. Ex-situ conservation, on the other hand, deals with the maintenance of a species outside its original habitat. In this approach, cultivated and wild plants are collected and transferred to a specific site in order to conserve their genetic information. The accessions are then maintained as seedlings, seeds, or in vitro cultures.
In this sense, the main way of knowing and measuring the extent of a species' variability is to carry out collecting expeditions to acquire material across its wide natural geographic range. Each accession must then be documented and, subsequently, its phenotype measured. In ex-situ germplasm conservation, the most common scheme used in coffee, this evaluation must be made with suitable statistical designs, adequately sized plots, and an adequate number of replicates and field locations.
According to Giomo et al., the first and critical step in a breeding programme is the presence and understanding of genetic diversity. Developing a new coffee cultivar requires knowledge of a series of desirable traits, such as adaptability, architecture, fruit color, longevity, maturation, precocity, productivity, resistance to pests and diseases, size, type of grain, cup quality, and vigor, among others. It is therefore imperative to characterize the accessions and select those with traits of particular interest, from the agronomic characterization of plants to the chemical composition and sensory quality of the beans, to meet the specific demands of the coffee production chain.
Coffee is a significant commercial crop, and the genetic improvement research carried out by renowned research centers around the world has germplasm banks as its primary source of raw material, essentially for C. arabica and C. canephora. Germplasm banks guard and preserve extensive collections of genetic resources used in breeding research and biotechnology to obtain increasingly adapted and productive cultivars.
Among the world's leading institutions for germplasm resources and conservation of the genus Coffea, we highlight the following: Centre National de Recherche Agronomique (CNRA), United States Department of Agriculture - National Plant Germplasm System, CATIE, Centro de Cooperación Internacional de Investigación Agricola para el Desarrollo (CIRAD), Ethiopian Institute of Agricultural Research, Jimma Agricultural Research Center (JARC), Institute of Biodiversity Conservation, Instituto Agronômico do Paraná (IAPAR), and Instituto Agronômico de Campinas (IAC). These institutes enable the acquisition, exchange, conservation, duplication, and documentation of this crop's valuable genetic resources, with a view to world food security. They also perform phenotypic, cytogenetic, and molecular evaluations in search of elite accessions with specific attractive traits, which is especially important given the known low variability of Coffea arabica, thereby enabling potentially successful crosses.
Genebanks around the world hold collections in which C. arabica stands out with the largest number of accessions (11,415), followed by C. canephora (625), C. liberica (94), C. eugenioides (81) and other Coffea species (7756).
CNRA was founded in 1998 and is headquartered in Abidjan, Ivory Coast. According to Labouisse, CNRA has the most extensive field genebank of coffee in the world, with 8003 accessions that resulted from prospecting conducted in eight African countries, including Cote d’Ivoire, Guinea, Cameroon, Tanzania, Kenya, Madagascar and the Democratic Republic of the Congo.
The United States Department of Agriculture (USDA) is currently rebuilding a Coffea collection as part of the National Plant Germplasm System, with approximately 300 accessions. In the past, this department maintained 500 accessions of arabica coffee [31, 51].
Established in 1942, the CATIE botanical garden and germplasm collection has its headquarters in Turrialba, Costa Rica. In 1948, field collections of rubber, cocoa and coffee launched germplasm preservation in Turrialba. The CATIE International Coffee Germplasm Center is in the public domain because of its designation to the international network of ex-situ collections under the auspices of FAO. Its coffee field genebank ranks third in the world and covers, to a large extent, the genetic diversity of C. arabica, with 1987 accessions and more than 9000 coffee trees. The genetic diversity of two other Coffea species is represented to a lesser extent, with 68 introductions of C. canephora and 24 introductions of C. liberica. The C. arabica germplasm bank of CATIE holds 880 wild and semi-wild genotypes, 581 of them acquired from the collecting expeditions performed by FAO and ORSTOM in Ethiopia, a known biodiversity hotspot; 923 cultivars, mutants and selections; 19 interspecific hybrids; and 165 intraspecific hybrids. Because field collections are very costly to maintain and the conserved genetic material is continuously exposed to biotic and abiotic stress, the CATIE research team developed a methodology for the cryopreservation of coffee seeds in liquid nitrogen for long-term germplasm conservation. CATIE currently maintains a cryopreserved core subset of 63 accessions from Ethiopia, thus establishing the world's first coffee cryobank [3, 53].
CIRAD has conducted collecting missions since the 1960s, some of them in association with other institutions, viz., ORSTOM, the International Plant Genetic Resources Institute (IPGRI), and IRD. In 1977, an ORSTOM/CIRAD mission to Kenya collected eighty different accessions of C. arabica at Mount Marsabit, along with samples of C. eugenioides, C. zanguebariae, and C. fadenii. Subsequently, in 1989, an IPGRI/CIRAD mission to Yemen collected samples from coffee plantations of 22 different origins and recognized six morphologically different types of coffee plants. According to the FAO-WIEWS database (1990–2001), CIRAD maintains a total of 3800 ex-situ coffee accessions in Guyana.
In Ethiopia, the Jimma Agricultural Research Center (JARC) is committed to being a leading centre of research excellence for arabica coffee, operating ten research stations strategically located in the main coffee production areas. The Jimma Research Station initiated variety development and germplasm conservation in 1966–1967. Since 1966, the field collection has assembled 5853 accessions of C. arabica, grouped by programme or type as follows: National collection, 1431; Exotic collection, 78; Coffee Berry Disease (CBD) resistance collection, 825; and Local landrace, 3519. To date, JARC has released 42 coffee varieties. In Ethiopia, JARC is the only public institution that has taken the initiative of multiplying and providing basic coffee seed, primarily of adapted varieties and CBD-resistant material. Furthermore, this institute plays a considerable role in the dissemination and adoption of improved coffee technologies by innovative farmers and by private and state-owned farms throughout the country [13, 54, 55]. Another important genetic resources organization in Ethiopia is the Institute of Biodiversity Conservation, which established a field genebank in Choche (Limu) with 5196 conserved accessions.
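Because accession counts can be inconsistent across reports, it is worth tallying the category breakdowns quoted above for the CATIE and JARC C. arabica collections against their stated totals. The Python sketch below does this; the dictionary layout and the helper `check_collection` are illustrative assumptions, and the numbers are only those given in the text.

```python
# Category counts as quoted in the text for two C. arabica field collections.
collections = {
    "CATIE": {
        "reported_total": 1987,
        "categories": {
            "wild and semi-wild genotypes": 880,
            "cultivars, mutants and selections": 923,
            "interspecific hybrids": 19,
            "intraspecific hybrids": 165,
        },
    },
    "JARC": {
        "reported_total": 5853,
        "categories": {
            "National collection": 1431,
            "Exotic collection": 78,
            "CBD resistance collection": 825,
            "Local landrace": 3519,
        },
    },
}


def check_collection(entry):
    """Return (computed total, whether it matches the reported total, shares in %)."""
    total = sum(entry["categories"].values())
    shares = {name: 100.0 * n / total for name, n in entry["categories"].items()}
    return total, total == entry["reported_total"], shares


for name, entry in collections.items():
    total, consistent, shares = check_collection(entry)
    print(f"{name}: {total} accessions, matches reported total: {consistent}")
    for category, share in shares.items():
        print(f"  {category}: {share:.1f}%")
```

Both breakdowns sum exactly to the totals reported in the text (1987 and 5853 accessions).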
In Brazil, IAPAR was founded in 1972 and is headquartered in Londrina, in the state of Paraná. IAPAR operates a 300 ha farm, of which 40 ha are cultivated with coffee. The coffee field genebank was established in 1975 and was primarily composed of IAC accessions, with later inclusion of accessions from the FAO/IBPGR collection. IAPAR also has a partnership with five farmers to test the F3/F4 generations. IAPAR has released several cultivars improved for high yield, drought tolerance, resistance to rust, nematodes, bacterial blight, and leaf miner, and different ripening cycles. It combines testing, seed production and demonstration to farmers in the F6 generation, thereby shortening the time to release genetic material. IAPAR has a good reputation among coffee farmers in Paraná.
The Instituto Agronômico de Campinas (IAC), a member institution of the Brazilian Coffee Consortium, maintains the largest and oldest coffee germplasm bank in the country, with 5451 records. Building on this diversity, the active germplasm bank of the “Instituto Agronômico” has contributed significant results to Brazilian coffee research for 87 years. IAC also conducts research in collaboration with other institutions within the Brazilian Coffee Consortium [50, 56].
IAC continuously performs morphological, agronomic, chemical and molecular characterization of the genetic material maintained in its germplasm banks. This is essential for identifying the most genetically promising materials, with better productivity and other attributes considered in each study. Obtaining a desired coffee cultivar takes a long time, owing to the time required to advance the genetic material from generation to generation. In Brazil, the two most widely adopted cultivars in coffee plantations, Mundo Novo and Catuaí, result from breeding research conducted by the “Instituto Agronômico” using its germplasm bank. They are planted on about 80% of the Brazilian coffee area today.
Besides the germplasm banks of IAPAR and IAC, there are five other coffee germplasm banks in Brazil: Empresa de Pesquisa Agropecuária de Minas Gerais (EPAMIG), Universidade Federal de Viçosa (UFV), Instituto Capixaba de Pesquisa, Assistência Técnica e Extensão Rural (INCAPER), Fundação Procafé, and Embrapa Rondônia. According to Bramel, the collections of these germplasm banks are estimated at about 13,856 accessions; however, the number of accessions may be inconsistent across reports.
As stated by Bramel, a compilation of world coffee collections estimates 21,026 accessions across 52 holders of coffee germplasm collections with at least ten accessions each.
5. Breeding and genetic diversity based on morphological and agronomic traits
In plant breeding, it is crucial to identify the phenotypic traits most critical to boosting plant production. Consequently, evaluating trait occurrences and differences in a population is key to determining potentially valuable crosses among accessions. Although most studies focus on genetic diversity assessed with molecular markers, it is also useful for plant breeders to recognize the morphological diversity of traits of interest.
Around the world, arabica breeding programmes have the primary purpose of developing new cultivars, taking into account the economic benefits to be returned to coffee growers. The target characteristics of a desired arabica cultivar are productivity, with a focus on bean size, as well as cup quality and resistance to major diseases and pests. Each breeding programme, however, has its own particularities that establish the priorities among selection criteria, usually defined according to local conditions of weather, soil, biotic and abiotic stresses, cropping systems, socio-economic factors, market dynamics and consumer preferences. In arabica coffee, four primary methods of breeding and selection are typically used: (1) pure line selection; (2) pedigree selection after hybridization (sometimes also backcrossing); (3) intraspecific F1 hybrids; and (4) interspecific hybridization (arabica × robusta), backcrossing and pedigree selection. A comprehensive overview of the selection criteria and outcomes of each breeding method is presented by Van der Vossen.
Gathering a series of studies, Monge and Guevara compiled the critical phenotypic markers for the evaluation of coffee and suggest a list of appropriate descriptors: morphological descriptors, viz. architectural (degree of ramification, number of internodes, and length of plagiotropic branches) and physical (dimensions and color of leaves, flowers and fruits, flush color, stem diameter, etc.); phenological descriptors (flowering dates, duration of the fructification cycle); ecological adaptation descriptors (altitude, dry or humid regions, resistance to pests and diseases); productive descriptors (productivity level, early or late flowering, and fruit set); and technological descriptors (coffee quality, weight of 100 beans, caracoli rate, etc.).
In a review, Monge and Guevara also compiled the results of two studies on the phenotypic evaluation of 300 wild C. arabica accessions collected in eight areas of Ethiopia and added to the CATIE collections in 1985. The review highlighted the high variability in fruit maturation length (ranging from 130 to 258 days), caracoli rate (varying from 1 to 71%), leaf size, internode length and bean size. Furthermore, a correlation among morphological variables was detected: the lower the ramification of the tree, the bigger its leaves and the beans it produces.
Cilas et al. studied the prediction of genetic value for C. arabica production through the evaluation of morpho-agronomic traits, with yield recorded throughout the first four years of production. They concluded that coffee yield may be increased by introducing a medium level of heterozygosity, since the hybrids were markedly superior to the parental lines. The authors also state that yield prediction may be achieved by combining morphological traits such as stem diameter, number of primary branches and tree height.
Bertrand et al., addressing sustainability, carried out a study in three Central American countries comprising 15 trials between 2000 and 2006 to assess F1 hybrids of C. arabica in agroforestry (shaded) systems compared with full-sun (unshaded) systems. The experiment involved thirteen lines and twenty-one F1 hybrids, whose average production was measured throughout the first production cycle, before pruning and coppicing. The results show that green coffee yield per tree was 58% higher in F1 hybrids than in traditional cultivars under agroforestry, an increment corresponding to 170 g, whereas in the full-sun system the increment was 34%, corresponding to 190 g. In this respect, the economic outcomes of the two systems look quite similar. The study also discussed the economic advantage of renovating agroforestry systems with hybrids, indicating that six years after replacing the traditional cultivar with hybrids, growers could earn up to 5000 USD/ha. The authors also pointed to the facilitation of credit policies and the opportunity of reaching new market niches with differentiated prices.
The first original phenotypic structure within C. arabica was presented by Montagnon and Bouharmont. The authors observed eighteen morphological and agronomic characteristics in a field collection of 148 accessions and analyzed them with a multivariate approach. Interestingly, the results revealed a sharp structure split into two main groups, comprising 53 and 76 accessions, respectively. The other six groups are each composed of fewer than five accessions. The first two axes of the principal component analysis explained 77% of the accumulated variation, which is reasonably good. The authors also argue that the arrangement of the two main groups, combined with the historical evidence on those accessions, suggests that group 1 has not been engaged in the domestication pathway of C. arabica. Traits modified in the course of domestication partly explain the well-defined separation of the two main groups.
A genetic diversity study conducted at the Tepi National Spices Agricultural Research Center on 93 C. arabica accessions, based on 22 quantitative characteristics, detected five clusters using hierarchical clustering and principal component analysis. According to Klief, the significant inter-cluster distances indicate a high probability of obtaining transgressive segregants and maximizing heterosis by crossing germplasm accessions from distinct clusters.
A study carried out in southwestern Saudi Arabia evaluated the genetic variation of C. arabica accessions conserved in situ in 19 localities where stressful conditions prevail. A multivariate approach applied to 17 quantitative traits detected five groups. Interestingly, four accessions from the same place fell into four different clusters, supporting the importance of an in situ conservation strategy. All clusters showed significant inter-cluster distances, with two clusters showing the greatest distance. Tounekti et al. therefore state that this variability is suitable for exploitation in breeding programmes to overcome environmental stresses.
The biochemical composition of coffee liquor is highly important. From this point of view, a study addressed genetic diversity based on caffeine content together with the physical characteristics of green beans and coffee cup quality. Dissimilarities were examined by cluster analysis using the unweighted pair group method with arithmetic mean (UPGMA), together with correlations among the variables analyzed. Two main groups were distinguished. The first cluster was formed by 11 accessions distinguished by high caffeine content, undesirable physical characteristics of green beans and poor cup quality. The other cluster split into two subgroups: the first with 26 accessions with low to average caffeine content and cup quality; the second with five accessions characterized by a medium level of caffeine content, desirable physical qualities of green beans and high-grade cup quality. The authors also identified significant negative associations between caffeine content and all other variables related to cup quality. From that perspective, simultaneous improvement of cup quality and reduction of caffeine content is possible.
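The UPGMA clustering mentioned above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the analysis performed in the cited study: the accession labels and trait values are invented, the traits are assumed to have been standardized beforehand, and Euclidean distance is an assumed choice of dissimilarity.

```python
from itertools import combinations
from math import dist


def upgma(points):
    """Agglomerate labelled points with UPGMA (average linkage).

    points: dict mapping accession label -> tuple of trait values.
    Returns a list of merges as (cluster_a, cluster_b, distance),
    where each cluster is a sorted tuple of labels.
    """
    # Each cluster is a tuple of member labels; start with singletons.
    clusters = [(label,) for label in sorted(points)]
    merges = []

    def average_distance(a, b):
        # UPGMA distance: unweighted mean of all pairwise member distances.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the closest pair of clusters at each step.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        merges.append((a, b, average_distance(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
    return merges


# Hypothetical standardized values: (caffeine content, cup quality score).
accessions = {
    "A1": (1.2, -1.0),
    "A2": (1.0, -1.2),
    "A3": (-0.9, 0.8),
    "A4": (-1.3, 1.1),
}

for left, right, height in upgma(accessions):
    print(left, right, round(height, 3))
```

Each printed row is one merge of the dendrogram together with the height at which it occurs. In this toy data the high-caffeine, low-quality accessions (A1, A2) join each other before they join the low-caffeine, high-quality pair (A3, A4), which mirrors the kind of two-group split described in the text.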
A study performed at IAC evaluated the effectiveness of a minimum set of descriptors established for distinctness, uniformity and stability testing in C. arabica. Twenty-nine cultivars were distributed into 11 groups when assessed for 35 morphological characteristics and three agronomic traits over three years. The results show that the descriptors were able to discriminate cultivar groups but played only a minor role in identifying cultivars within each group. The authors therefore recommend adopting molecular markers and biochemical descriptors to identify cultivars to be protected more accurately.
Weldemichael et al. conducted a study estimating genetic parameters in 49 accessions of C. arabica. Twenty-six quantitative traits were used to estimate phenotypic variation, and a series of genetic parameters was estimated. The findings showed variability for some morphological traits among the coffee germplasm accessions. Interestingly, coffee berry disease showed a pronounced genetic gain as a percentage of the population mean (88.8%); this draws particular attention, since disease resistance is a top-priority breeding objective in arabica coffee. The low genetic advance as per cent of the mean and/or low genotypic coefficients of variation found for most traits indicate that these characteristics could not be improved through simple selection but rather through heterosis breeding. The authors also caution that high morphological variation is no guarantee of pronounced genetic variation; from this viewpoint, it is helpful to consider molecular and biochemical studies as a complementary approach.
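The genetic parameters named above (genotypic coefficient of variation and genetic advance as per cent of the mean) are usually derived from variance components. The sketch below uses the estimators commonly found in the quantitative genetics literature; the chapter does not state which formulas the cited study applied, so the function, its parameter names, the selection differential k = 2.06 (top 5% selected) and the numbers are all illustrative assumptions.

```python
from math import sqrt


def genetic_parameters(mean, var_genotypic, var_phenotypic, k=2.06):
    """Return GCV, PCV, broad-sense heritability and genetic advance.

    k is the standardized selection differential; 2.06 corresponds to
    selecting the top 5% of the population (an assumed intensity).
    """
    gcv = 100 * sqrt(var_genotypic) / mean    # genotypic CV (%)
    pcv = 100 * sqrt(var_phenotypic) / mean   # phenotypic CV (%)
    h2 = var_genotypic / var_phenotypic       # broad-sense heritability
    ga = k * sqrt(var_phenotypic) * h2        # expected genetic advance
    gam = 100 * ga / mean                     # advance as % of the mean
    return {"GCV": gcv, "PCV": pcv, "H2": h2, "GA": ga, "GAM": gam}


# Hypothetical trait: mean 50, genotypic variance 16, phenotypic variance 25.
result = genetic_parameters(mean=50.0, var_genotypic=16.0, var_phenotypic=25.0)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A trait with a low GCV and a low genetic advance as per cent of the mean, as reported for most traits in the study, leaves little room for gain from simple selection, which is why heterosis breeding is suggested instead.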
6. Genetic diversity based on molecular markers
The progress achieved in plant breeding programmes culminated in reduced genetic variability in the improved populations [36, 67, 68, 69]. This problem may be worse in species with a narrow genetic base, such as Arabica coffee (C. arabica). The narrow genetic base of this species is associated with its autogamy, the low number of plants that were initially distributed worldwide, and the recent evolution of the species [30, 36, 70]. Thus, genotype discrimination based on differences in phenotypic characteristics may be difficult because individuals who are genetically distinct may be phenotypically similar, which reduces the selective efficiency. To overcome this difficulty, molecular markers have been used as an important tool in the accurate discrimination of genotypes [71, 72].
DNA markers allow the detection of variations in DNA sequences between individuals of the same species. Because they identify variations in DNA, they are stable and are unaffected by the environment or by pleiotropic or epistatic effects . Thus, molecular markers have been used in breeding programmes as an efficient tool for the discrimination of genotypes and the analysis of genetic variability, as their analysis is a precise association strategy between phenotypic and genotypic variability.
Genetic diversity assisted by molecular markers has been used in several stages of Arabica coffee breeding programmes. The molecular characterization of coffee accessions is an accurate tool for the conservation and more efficient use of genetic resources by breeders. This molecular information is useful in evaluating the redundancies and deficiencies of the germplasm and generates information on the efficiency of the collection, maintenance, and expansion of a germplasm bank. In addition, the study of molecular diversity provides fundamental information to help breeders choose parents to integrate into cross-breeding schemes, as well as in directing the improvement of the genetic base during the course of a breeding programme.
Different molecular markers, such as simple sequence repeats (SSRs), sequence-characterized amplified regions (SCARs), and single-nucleotide polymorphisms (SNPs), have been identified and made available for coffee [71, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82]. These species-specific markers, combined with random markers such as inter-simple sequence repeats (ISSRs), random amplified polymorphic DNA (RAPD) and amplified fragment length polymorphisms (AFLPs), support the genetic breeding of this crop.
Genetic studies and analyses of diversity and molecular characterizations of different germplasm banks and cultivars of C. arabica have benefited from molecular marker technology. Coffee plants belonging to the group of the Híbrido de Timor (HdT) from the Brazilian germplasm bank of the Universidade Federal de Viçosa (UFV) in partnership with Empresa de Pesquisa Agropecuária de Minas Gerais (EPAMIG) and Empresa Brasileira de Pesquisa Agropecuária (Embrapa Café) have been studied in detail using AFLP and SSR markers . HdT coffee plants are the result of natural hybridization between C. arabica and C. canephora and are one of the main sources of resistance genes to coffee diseases and pests [84, 85, 86]. Through molecular markers, redundancy was observed in the core collection of the HdT, so that two plants with different identifications corresponded to the same genotype. One of them was eliminated, resulting in a core collection containing 151 unique and properly discriminated HdTs. The data obtained allowed fingerprinting of the accessions . The fingerprinting of each genotype allows the identification of individuals through a unique code. This information will provide reliability to breeders for germplasm maintenance, preservation, and exchange.
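The redundancy check described above, in which two differently labelled plants turn out to share one genotype, can be illustrated with a short sketch. The accession IDs, locus names and allele sizes below are hypothetical, and exact matching of marker profiles is an assumed simplification; real data would also need rules for missing calls and genotyping error.

```python
from collections import defaultdict


def find_redundant(profiles):
    """Group accession IDs that share an identical marker profile.

    profiles: dict mapping accession ID -> dict of locus -> genotype,
    where a genotype is any iterable of allele sizes.
    Returns a list of sorted ID lists, one per duplicated profile.
    """
    groups = defaultdict(list)
    for accession, loci in profiles.items():
        # Canonical, hashable fingerprint: loci sorted by name,
        # alleles sorted within each locus.
        key = tuple((locus, tuple(sorted(alleles)))
                    for locus, alleles in sorted(loci.items()))
        groups[key].append(accession)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


# Hypothetical SSR profiles (allele sizes in base pairs).
profiles = {
    "HdT-001": {"SSR1": (150, 152), "SSR2": (200, 200)},
    "HdT-002": {"SSR1": (152, 150), "SSR2": (200, 200)},
    "HdT-003": {"SSR1": (148, 152), "SSR2": (200, 204)},
}

print(find_redundant(profiles))  # [['HdT-001', 'HdT-002']]
```

The canonical key built for each accession is in effect the "unique code" of the fingerprint: accessions with the same key are candidates for consolidation in the core collection.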
With 52 alleles from 22 SSRs, it was possible to assess the diversity of the Core Collection of HdT. Considerable variability was observed between the accessions, which were separated into 21 groups. This grouping was analyzed together with the resistance data obtained for the main coffee diseases, rust and coffee berry disease. Individuals resistant to both diseases were concentrated in eight groups. Through this analysis, it was possible to identify HdT coffee plants belonging to distinct genetic diversity groups that have not yet been used in genetic breeding. This made it possible to select genotypes in the dendrogram that were as distinct as possible from the sources already explored and that carry different disease resistance genes. The selected HdT accessions are potential parents for breeding aimed at resistance to multiple diseases.
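The idea of picking parents that are as distinct as possible from sources already in use can be sketched numerically. The study itself worked from a dendrogram; the version below swaps in a simpler rule, a Jaccard distance on shared alleles combined with a "farthest from the nearest used parent" criterion, purely as an assumed illustration. All names and allele sizes are hypothetical.

```python
def allele_set(profile):
    """Flatten a locus -> alleles mapping into a set of (locus, allele)."""
    return {(locus, a) for locus, alleles in profile.items() for a in alleles}


def jaccard_distance(p, q):
    """1 - shared alleles / total distinct alleles across both profiles.

    Assumes at least one profile is non-empty.
    """
    a, b = allele_set(p), allele_set(q)
    return 1 - len(a & b) / len(a | b)


def most_divergent(candidates, used_parents):
    """Return the candidate whose nearest used parent is farthest away."""
    def nearest(name):
        return min(jaccard_distance(candidates[name], parent)
                   for parent in used_parents.values())
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return max(sorted(candidates), key=nearest)


# Hypothetical SSR profiles (allele sizes in base pairs).
used_parents = {"P1": {"SSR1": (150, 152), "SSR2": (200,)}}
candidates = {
    "C1": {"SSR1": (150, 152), "SSR2": (200, 204)},
    "C2": {"SSR1": (148, 156), "SSR2": (204,)},
}

print(most_divergent(candidates, used_parents))  # C2
```

In practice such a ranking would be filtered by the resistance data first, so that only candidates resistant to both rust and coffee berry disease are compared.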
Molecular markers were also analyzed in the HdT to understand the introgression of the genomes of its parental species (C. arabica and C. canephora), as well as the potential impact of this introgression on the cup quality of C. arabica cultivars. The largest portion of the HdT genome corresponds to C. arabica; however, the small portion from C. canephora provides disease resistance genes. This portion, even though small, raises concern that C. canephora may affect cup quality, since the beverage quality of C. canephora is known to be lower. Thus, the effect of C. canephora introgression on HdT derivatives was evaluated [88, 89]. The study also demonstrated the presence of disease-resistant genotypes combined with the good cup quality typical of C. arabica cultivars. The genetic diversity analysis showed high genetic similarity between HdT and C. arabica and clear differentiation among coffee species. The introgression of C. canephora in the HdT accessions did not reach 30%. The sensory analysis of the coffee genotypes showed no significant difference in beverage quality parameters between C. arabica cv. Bourbon and HdT-derived cultivars, which demonstrated the possibility of developing HdT-derived C. arabica cultivars without affecting beverage quality.
Accessions of different species and interspecific hybrids from the germplasm bank of UFV/EPAMIG/Embrapa were also analyzed with genomic SSRs and expressed sequence tag–SSR markers. The combination of these two types of markers allowed all accessions to be discriminated, including traditional C. arabica genotypes, genotypes containing introgression of HdT, C. canephora, HdT, C. racemosa, and triploids of C. arabica and C. racemosa. This study also identified unique alleles that are useful for discriminating accessions in breeding programmes and for cultivar fingerprinting [90, 91].
Using the currently available large-scale genotyping technology, genetic diversity between and within Brazilian coffee breeding progenies was assessed by 49,567 SNPs. The significant number of SNP molecular markers distributed throughout C. arabica genome was efficient in discriminating all evaluated accessions by grouping them according to their genealogies. Mixtures within the families were identified. New parents to be introduced in the ongoing breeding were identified, and the parents currently used were analyzed in detail. The population structure and its effect on obtaining the improved varieties of C. arabica were discussed .
Accessions from the germplasm bank and cultivars launched by the breeding programme of the Instituto Agronômico de Campinas were analyzed with RAPD, AFLP, and SSR markers . The variability observed between accessions was small, and only two groups were formed, one containing genotypes that included most cultivars and the other containing accessions/cultivars derived from interspecific crosses.
A more comprehensive analysis of Brazilian coffee plants was performed on 34 cultivars belonging to the Brazilian Cultivar Trial, using SSR markers. The molecular pattern obtained allowed the discrimination of all cultivars and the creation of a fingerprinting dataset for the main cultivars of the country. The ability of the markers to detect varietal mixtures and the diversity between and within cultivars was demonstrated.
The genetic variability of C. arabica accessions from other countries, such as Costa Rica , Mexico , Nicaragua , India [97, 98, 99], Indonesia , China , Kenya  and Ethiopia [34, 103, 104, 105], has also been analyzed using markers such as ISSRs, SSRs, sequence-related amplified polymorphisms (SRAPs), AFLPs, and SNPs. In Ethiopia, different studies have shown the presence of great genetic variability in coffee plants. This variability has been attributed to the particular ecological characteristics of the country, such as its rainfall amplitude and its different altitudes, temperatures, and soil fertility, which are suitable for the crop. The presence of indigenous coffee production methods in the country has also contributed to this diversity [5, 106]. Greater genetic diversity has been reported among wild coffee populations than cultivated genotypes .
A broader study of the diversity and fingerprinting of Arabica coffee accessions from various producing regions of the world was carried out on 2533 genotypes. These genotypes corresponded to the Core Collection of the germplasm of the Tropical Agricultural Research and Higher Education Center, accessions from Southern Sudan, and cultivars/germplasm from North, Central, and South America as well as Africa and Asia. The fingerprinting obtained was efficient. Based on this tool, farmers can verify and trust the identity of the cultivars being planted, and coffee roasters can rely on marketing related to the cultivars they are growing and selling. The seed and nursery sector can become more professional and reliable by using this new monitoring tool to establish and verify the genetic purity of the seed and seedling stock.
Currently, SNP markers are being used for genome-wide investigation [72, 82, 108]. In an original genome-wide association study, candidate genes associated with lipid and diterpene contents in C. arabica were identified. The study also detected signatures of the domestication and breeding process in C. arabica, pointing out shifts in allele frequency and revealing high allelic richness in wild accessions. The identification of these candidate genes outlines potential targets for improving cup quality in coffee breeding programmes.
Genetic resources provide the basis for genetic solutions to numerous problems of coffee-growing areas throughout the world. The experimental schemes that lead to the introgression of new agronomic traits are known and have been validated with large populations. This approach has allowed several desirable traits to be combined in a single coffee cultivar. Plant breeders can now also rely on molecular genetics to improve their ability to introduce desirable characteristics into new cultivars. Molecular markers, in association with morpho-agronomic characterization and diversity studies, help to maintain germplasm banks efficiently and facilitate their use by breeders. Molecular tools are also useful for detecting genetic structure and divergent breeding subpopulations. The application of genomics as a supplementary approach to conventional coffee breeding is highly recommended: it improves the productivity of the breeding programme by reducing the time to variety development and helps to assure the selection of desirable traits in the course of the breeding process. This is especially relevant for coffee, a perennial crop with a narrow genetic base. Furthermore, molecular and morphological diversity approaches give nurseries, farmers and the whole coffee industry an opportunity to increase their knowledge of the genetic identity of the coffee trees planted or traded.
A highly regarded strategy in the coffee sector is the elaboration of a wide-ranging catalog of existing germplasm collections, including their marker profiles. Worldwide, the use of the genetic diversity available in germplasm collections faces two significant problems: limited access to the conserved genetic resources and deficiencies in genetic evaluation. Anthropogenic disturbances have modified the natural habitats where wild coffee species spontaneously evolved, and consequently much relevant germplasm is at risk of destruction. Efforts by the scientific community are therefore essential to design and implement conservation strategies. The ongoing partnership between Latin American and African countries involved in the conservation and evaluation of coffee genetic resources is a well-intentioned strategy. This network aims to revitalize and advance research to boost the productivity and cup quality of coffee.
Conflict of interest
The authors declare no conflict of interest.