Neotropical fish account for approximately 30% of all fish species worldwide. The diversity of fish species found in Neotropical basins reflects variation in life-history strategies and in particular morphological, physiological and ecological attributes. These attributes are mainly related to different forms of feeding, life maintenance and reproduction. Today, fish populations are threatened by anthropogenic actions that are having a visible impact on the natural state of continental aquatic ecosystems. The main causes are overfishing, non-native species introduction, reservoir-dam systems, mining, pollution and deforestation. The biology and population dynamics of many species remain unclear due to a lack of research. Genetic tools can support the conservation of Neotropical fish species in several ways. Molecular genetic markers are powerful tools for identifying cryptic and hybrid fish, and they also allow the genetic variability and structure of populations of Neotropical ichthyofauna to be evaluated. Several types of molecular marker have been analysed in Neotropical fish, including allozymes, restriction fragment length polymorphisms (RFLP), amplified fragment length polymorphisms (AFLP), randomly amplified polymorphic DNA (RAPD), microsatellites, single nucleotide polymorphisms (SNPs) and mitochondrial DNA (mtDNA) markers. Next generation sequencing now allows a high number of markers to be analysed, generating a large amount of genomic information that can be applied to the conservation of Neotropical fish.
- molecular markers
- genetic conservation
- Neotropical ichthyofauna
1. Rivers of the Neotropical region
The distribution of freshwater fish around the world was shaped by historical climatic and geological events at different points in time. Today, each global region has a distinct pattern of distribution, reflecting physical barriers that obstruct species dispersal and differing tolerances to environmental variables. The tropics of the American continent are well known for their high biodiversity, which is due to habitat heterogeneity and a complex geological history. The Neotropical region is a biogeographic region that comprises Central America (including the southern part of Mexico and the peninsula of Baja California), the south of Florida, the Caribbean and South America. The Neotropical region and its fauna originated and evolved through the combined action of local rainfall variations and gradual climate change, resulting in a mosaic of habitats controlled by river migrations, sea-level fluctuations, local dryness and local uplifts [2, 3].
Regional geographical formations can affect the local hydrography and species distribution by forming distinct biogeographic barriers and allowing speciation of isolated populations. Consequently, large basins that are separated by physical barriers and distributed heterogeneously across thousands of river systems tend to have distinct species, with behaviours related to local environmental characteristics [4, 5]. The main hydrographic basins of the Neotropical region are concentrated in South America and include the Amazon Basin, which covers the Colombian and Brazilian hydrographic regions, the Upper Paraná River Basin, the Paraguay-Paraná Basin, the São Francisco River Basin and the Uruguay River Basin.
The Amazon drainage basin covers 7.05 million km², occupying approximately 39% of the South American land mass. Around 72% of the basin lies in Brazil, but it spans almost the whole continent from the Andes Mountains in the west to the Atlantic Ocean in the east. The mean water temperature in the basin is 27–29°C and can reach 34°C. Rainfall is the main source of water for the Amazon Basin, with about 50% of water originating from precipitation, and 6% of the basin area is continuously flooded by large and medium rivers. Several fish species of economic relevance in the Amazon River are migratory, such as catfishes of the order Siluriformes that can migrate for thousands of kilometres, and seed dispersers, such as
The Upper Paraná River Basin is formed by the junction of the Grande and Paranaíba rivers in the south-central region of Brazil. The Paraná is one of the longest rivers in the world at 4695 km, with a drainage area of 2.8 × 10⁶ km². The basin comprises 10.5% of the total area of Brazil and drains the most densely populated region of the country, which is subject to dam construction and agricultural, industrial and urban pollution. The Upper Paraná region has a tropical and subtropical climate with an average temperature of 22°C and 140 cm of precipitation per year. A large floodplain lies in the 230 km dam-free stretch between the Porto Primavera and Itaipu dams, and this region is considered important for the conservation of the local fish fauna. There are large migrators such as
The Paraná-Paraguay Basin covers most of the southeastern region of Brazil and parts of other countries, such as Paraguay, eastern Bolivia and northern Argentina. Together with the Uruguay River, it covers most of central South America. The hydrographic basin covers 2.8 million km² and is considered the second biggest Brazilian basin. In contrast to the Amazon Basin, the climate in the Paraná-Paraguay Basin is drier, and the basin oscillates between harsh dryness and shallowness, with rains from October to March. The annual precipitation is 800–1200 mm, leading to the formation of significant seasonal floodplains. One of the biggest and most important wetlands in the world is the Pantanal, located in the Upper Paraguay Basin. The complex hydrological cycle of the Pantanal wetland creates selective pressures on the adaptive and diversified traits of fish species. The Pantanal wetland harbours 5% of all existing Neotropical species [14, 15], but surprisingly few studies of the diversity, structure and population dynamics of its fish populations have been carried out. Most of the fish in the Paraguay-Paraná River are economically important and are migratory, such as
The Uruguay River Basin system is located in the temperate latitudes near the southern coast of Brazil, with altitudes reaching 1800 m. The river runs along the border between the Santa Catarina and Rio Grande do Sul states of Brazil until it meets the Paraná River, where it forms the estuary of the Plata River in Argentina. As a result of its sloping profile and abundance of rapids, the Uruguay River is hard to navigate compared to other rivers. Because of the fast-flowing water, there are a considerable number of hydropower dams in the basin, which can affect the reproduction of migratory species and the drift of their eggs and larvae.
The São Francisco River Basin covers 7.4% of Brazil and contains a large number of reservoirs, making it the second highest source of hydropower in the country. The headwaters rise in the southern region of Minas Gerais state, and the river runs through Bahia, Pernambuco, Sergipe and Alagoas states before emptying into the Atlantic Ocean. The altitude reaches 1600 m above sea level, and climate conditions are diverse, ranging from humid tropical to semi-arid, with temperatures from 18 to 27°C and high evaporation rates (2300–3000 mm/year). The São Francisco River is rich in floodplains and marginal lagoons that are used by fish species as a habitat for feeding, reproduction and refuge. Around 8% of the species migrate to reproduce and are considered important commercial fish. These include some Characiformes (
2. Diversity and biology of Neotropical fish
Neotropical fish comprise approximately 30% of all fish species in the world (5160 species) and are found in only 0.003% of all the freshwater on the planet . The Neotropical region has some of the highest numbers of fish families, and unlike other zoogeographic zones where Cypriniformes predominate, there is a high proportion of endemic families belonging to the Characiformes (~1200 species) and Siluriformes (~1300 species) orders. Despite the predominance of Characiformes and Siluriformes species in the Neotropical basins, the heterogeneity of species between basins and their unequal distribution are considerable. This is largely due to the formation of lakes, puddles, streams, rapids, rivers and floodplains that have become determining factors for the high diversity of fish species that exist today in the Neotropical region [1, 21].
There are estimates that the number of freshwater fish species in the Neotropical region exceeds 8000; however, the total number remains unknown. Many factors make it difficult to study biodiversity in this region, in addition to problematic taxonomic issues such as cryptic species. Furthermore, few institutions [the United Nations Food and Agriculture Organisation (FAO) and the Brazilian Institute of Geography and Statistics (IBGE)] provide fishing statistics, which ultimately affects the management of these species.
Despite the low number of fish species described, there was an increase following the advent of molecular biology and cytogenetic techniques. Many single species were re-described as a complex of cryptic species after genetic analyses .
Efforts to describe new species of Neotropical fish have focused mainly on the Amazon Basin, with more than half of the fish species described found in this region . Of these, there are around 200 poorly known species that are exploited by commercial and subsistence fishing, and this number may be even greater due to misidentification errors. This makes it difficult to implement fishery management and conservation policies [24, 25].
The diversity of fish species found in the Neotropical basins reflects variations in life-history strategies and exhibition of particular morphological, physiological and ecological attributes. Such attributes are related to the different forms of feeding, life maintenance and reproduction [26, 27, 28]. Research into the pattern of life strategies may have practical applications in conservation and is fundamental to fishery management in identifying appropriate measures to reduce the impact from reservoirs and other anthropogenic activities .
Neotropical fish species range from colonising and opportunist species to periodic and equilibrium species. Colonising populations, such as those belonging to the genera
The high degree of variability in reproductive strategies is closely related to their environment and the selective forces present over the life history of the species. As a result, reproductive strategies can be expressed in different ways, including the type of egg fertilisation, differences in age of maturation, parental care, spawning and migratory patterns . Seasonal periods of flooding and dryness influence reproductive strategies, especially in migratory species. However, the connection between reproductive events and fluctuations in hydrometric level is not entirely clear. The increase in water level combined with changes in temperature, photoperiod and water conductivity may lead to physiological changes in the individual that stimulate gonad maturation, migratory movements, spawning, egg fertilisation and offspring development . Although migratory movements are due to reproductive needs, they have seasonal, trophic and ontogenetic characteristics that are all associated with the hydrological regime of the river [21, 37, 38].
In addition to large migratory species that require long stretches of river and seasonal stimuli to exercise their life strategies, there are also sedentary species that carry out their vital activities in a restricted area and are more influenced by local environmental variations. Due to their small size, they spend their lives associated to a substrate, such as trunks, rocks and aquatic plants where they find protection, food and a suitable surface for egg deposition. The displacement of these species is generally short and occurs in more lentic environments, where there may be occasional variations. The reproductive period of some species occurs during lower levels of precipitation so that eggs and larvae are not dragged to stretches of the river where they will be unable to find suitable conditions for development. On the other hand, some species of sedentary fish such as the lambari (
Some sedentary species also show seasonality in reproduction.
3. Environmental impacts and risks for Neotropical fish
Neotropical fish are currently being threatened by anthropogenic activities that are showing visible effects on freshwater ecosystems. These effects are related to overfishing; non-native species introduction; dam construction for hydropower; river contamination from mining activities; and industrial and agricultural pollution and deforestation .
In Neotropical areas, the limits of exploitation of the majority of commercially valued fish stocks are close to the maximum sustainable yield . Aquaculture would indirectly alleviate pressures on threatened wild stocks and, therefore, needs to be carried out in a sustainable way with the least possible impact on natural populations . In 2015, of the 25 countries with the highest production from aquaculture (97.1% of total production), three Neotropical countries were included (Chile, Brazil and Ecuador). Moreover, with regard to freshwater aquaculture, Brazil leads this ranking, producing 474,300 tons, followed by Chile producing around 68.7 tons and Ecuador 28.2 tons .
In 2016, according to IBGE (Brazilian Institute of Geography and Statistics) data, Brazilian aquaculture had continued to grow and reached a total of 580,000 tons, with a production value of R$ 4.2 billion. A total of 77.32% of this production originated from fish farms followed by shrimp farms (21.5%). The development of farming technologies directed at the native species has helped to accelerate fish production, and relieve the pressure exerted by extractive fisheries.
Despite the advantages of aquaculture, uncontrolled fish production and lack of proper inspection by government agencies can be problematic for the Neotropical ichthyofauna. Uncontrolled hybridisation of fish, introduction of non-native species and loading of excess nutrients originating from effluents from aquaculture production can become a serious threat to wild fish populations.
Currently, the production of fish hybrids involves many Neotropical species, resulting in viable products of high interest to farmers. Nevertheless, the main threat posed by hybridisation is genetic introgression into wild populations [50, 51]. If fertile, hybrids can genetically contaminate natural and farmed stocks through genetic homogenisation and compete with the native parental lineages (in sexual behaviour, territory, food, etc.).
Brazil plays an important role in the conservation of its rich diversity of Neotropical fish. However, policy initiatives have threatened the biodiversity of these species and the functioning of their ecosystems. In some countries, there is specific legislation for hybridisation (in contrast to the Brazilian legislation, which does not require a licence for hybrid production); for example, in the state of California, there are laws prohibiting unlicensed fish hybridisation . In Brazil, most commercial establishments are unlicensed, and there are few legislative proposals to regulate the activity . According to Hashimoto et al., legislation is necessary to guarantee the safety of hybridisation techniques used in Brazil .
There is a particularly high concentration of hydroelectric dams in the Upper Paraná and São Francisco rivers (many of the rivers in South America are so heavily dammed that they have become chains of reservoirs). The largest dams in Latin America are Itaipu (Paraná River, Paraguay-Brazil), Guri (Rio Caroni, Venezuela), Tucuruí (Rio Tocantins, Brazil) and Yacyretá (Paraná River, Argentina-Paraguay). Currently, 90% of the energy consumed in Brazil originates from hydroelectric plants, with an annual output of 78,000 MW. Dams have been built in almost all hydrographic basins, with the consequent formation of reservoirs. These reservoirs alter the natural distribution of seasonal flows and nutrients, leading to the formation of new ecosystems with specific structures and functioning. Several features of these new ecosystems affect the local ichthyofauna and have a serious impact on the life cycle of the fish. Dams act as barriers to the natural flow of rivers. They are built mainly to produce electricity, but also to supply water to residential, agricultural and industrial areas. The change in and/or loss of water flow affects the distribution of aquatic biodiversity. Dams also affect the watershed and lower water quality, impacting not only the river itself but also its tributaries. This may harm native species by destabilising the ecosystem and its living communities. For example, migratory fish suffer from the interruption of their migratory routes because they require different habitats to complete their life cycle. These species generally migrate upstream to spawn during the wet season, producing numerous small eggs. The eggs and larvae are transported by the current, without any parental care, to nurseries downstream, where they find ideal conditions for initial development and protection from predators [59, 60].
The consequences of blocking migratory fish routes are observed in the reproductive cycle for years, and can lead to the depletion of natural stocks and the extinction of species.
The new ecosystem formed modifies the structure of fish communities that inhabit the river, and the establishment of new communities depends on the physical, chemical, hydrological and geomorphological changes as a result of the spatial and temporal redistribution of the river flow [62, 63, 64, 65]. Changes in species composition and abundance can increase the numbers of some species and eliminate others, causing collapse of the ecosystem .
In order to mitigate such effects, management measures have been put in place to preserve the Neotropical ichthyofauna . Until the 1950s, the main objective of management programmes was to ensure that species could migrate through the reservoirs to complete their life cycle. Transposition mechanisms (fish ladders) were created in the main Brazilian dams. In the 1990s, dozens of fish transposition systems were constructed, even with few studies into the efficacy of the method and despite the costs and effort required . Most of these mechanisms are based on ladders, structures that reduce the velocity and gradient of the water so that fish can climb and pass through the dam .
However, these mechanisms have species selectivity and allow the movement of only some species of fish. This divergence between species can cause dramatic imbalances in the population and the Neotropical ecosystem . The main process of passage is recognition of the entrance . If the fish cannot recognise the entrance to the passage, they remain where they are, which delays migration and spawning and interferes significantly with their reproductive process .
Stocking and repopulation of fish are alternative methods of mitigating the impacts of hydroelectric dams. Several breeding programmes were implemented to produce fish for restocking the reservoirs, mainly to improve fishing activities. Some non-native species were introduced to southern and southeastern regions of Brazil over 20 years (1970–1990) due to the difficulty of producing native species, a trend that has declined in recent years, though it still exists. Hydroelectric companies have begun to produce native species for restocking (repopulation), but for this to be successful, evaluation of the efficiency and genetic quality of the parents is essential [21, 72]. In repopulation programmes, genetic monitoring is a fundamental step, since a reduction in genetic variability reduces the adaptability of the species to different environmental conditions and interferes significantly with the survival of young fish [66, 73]. Molecular markers have been shown to be effective for genetic management aimed at maximising diversity and reducing inbreeding in the repopulation centres [67, 68, 74].
Aquatic organisms are fragile and sensitive to a wide range of stressors. Reproduction, growth and population survival are highly dependent on water quality. Environmental pollutants such as metals and pesticides present a serious risk to local ichthyofauna. The physiological effects of toxicants include disruption of hormonal, neurological and metabolic systems and elimination of behaviours that are essential to fitness and survival in natural ecosystems . Studies into many Neotropical fish have corroborated this.
Mining activities impact the aquatic ecosystem in the basins of Upper Paraguay and in the Colombian, Brazilian and Peruvian Amazon . Mercury, the main compound released during gold mining, accumulates in the sediment and in the muscle and tissues of fish (bioaccumulation). This means that through the trophic chain, the predators that are high in the food chain tend to accumulate more metals (biomagnification). Fish in the rivers of Madre de Dios city (Perú), affected by illegal mining, revealed that the species
Mining can also lead to the collapse of dams, as occurred in Mariana city (Minas Gerais state, Brazil) in 2015. This event, considered the biggest environmental disaster in Brazil, released approximately 55–62 million m³ of mining waste directly into the watershed of the River Doce, from where it spread along the Atlantic coast [85, 86]. It affected the ichthyofauna through fragmentation and destruction of habitats, water contamination, changes in water flow, impacts on estuaries and mangroves at the mouth of the River Doce, destruction of fish breeding areas, destruction of the nurseries of the ichthyofauna (feeding areas for larvae and juveniles), disruption of gene flow between different areas, loss of species with habitat specificity and collapse of fish stocks.
4. Genetic applications
Genetic tools are important resources for the conservation of Neotropical fish species. The biology and population dynamics of many species are still poorly known due to insufficient research. In spite of the high diversity that characterises Neotropical fish, there are many species with a large geographical distribution and differing population structure. Along a hydrographic basin, one can find many kinds of population, from panmictic populations of long-distance migratory species, characterised by large gene flow, to restricted populations of local organisms with well-defined population structures. Research into the genetic variability and structure of populations belonging to different river basins will aid the construction of policies and management measures for the maintenance of natural populations. In addition, genetic tools are increasingly being used to identify new species whose recognition was previously impossible because of morphological similarity. Furthermore, the effects of anthropogenic activities such as aquaculture and pollution are increasingly being studied at the molecular level, particularly with respect to hybrid fish and the effects of contaminants.
Biodiversity is conceptualised into distinct biological levels (genetic, species, community and ecosystem) that have each been impacted by human activities. The impact on genetic diversity is one of the biggest concerns, affecting species adaptation and taxa speciation [89, 90]. Knowledge of how the genetic diversity of Neotropical fish is maintained and how the populations are structured is important to determine how these species can be conserved. Many species of freshwater fish display genetic variation with adaptive traits that enhance survival and reproduction in particular environments and increase the capability of the organisms to adapt to environmental changes and anthropogenic activities .
Genetic variability in populations can be measured by the number of alleles and by heterozygosity. Intrapopulation variability is influenced by factors such as mutation, genetic drift and natural selection. Genetic variation originates from mutation and is reduced by genetic drift, which, because population size is finite, also increases differentiation between populations; gene flow between populations counteracts this differentiation. Natural selection can also reduce genetic variation through allele fixation. Anthropogenic activities, such as habitat fragmentation, increase the risk of genetic drift and reduce gene flow, diminishing the genetic variability of populations and interrupting the flow of adaptive genes, which can lead to the extinction of some species. Molecular genetic markers have emerged as a powerful tool for assessing genetic variability in populations and have had a substantial impact on the fields of ecology, evolution and conservation.
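As an illustration of the two measures named above, the following minimal Python sketch computes the number of alleles and the observed and expected heterozygosity at a single codominant locus. The function name `locus_diversity`, the genotype values and the use of the uncorrected gene diversity formula (one minus the sum of squared allele frequencies) are assumptions made for this example and are not taken from the studies cited.

```python
from collections import Counter

def locus_diversity(genotypes):
    """Summarise one codominant locus (hypothetical helper).

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    Returns (number of alleles, observed heterozygosity, expected heterozygosity).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    # Count every allele copy carried by the sampled individuals.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_copies = 2 * n
    freqs = [count / total_copies for count in allele_counts.values()]
    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity (assumption: uncorrected gene diversity, 1 - sum p_i^2).
    h_exp = 1.0 - sum(p * p for p in freqs)
    return len(allele_counts), h_obs, h_exp

# Illustrative genotypes only (allele sizes in bp); not real data.
sample = [(152, 156), (152, 152), (156, 160), (152, 156)]
n_alleles, h_obs, h_exp = locus_diversity(sample)
print(n_alleles, h_obs, h_exp)
```

In practice these statistics would be averaged over many loci and compared between populations; a marked deficit of observed relative to expected heterozygosity is one signal that merits further investigation.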
The identification of cryptic species is an important genetic application for the ecology and conservation of Neotropical freshwater fish. This taxonomic challenge has been overcome due to the advent and availability of rapid DNA sequencing for detecting and differentiating morphologically similar species . The destruction and disturbance of river basins, especially those caused by human interference, have led to the threat of complete extinction of several fish species . However, many species exposed to these threats are still undescribed, and efforts to catalogue and identify these fish are increasingly important. Most species have been described by morphological and typological characteristics . However, speciation is not always accompanied by differences in morphology, and due to the difficulty of identification, the actual number of existing fish species is greater than previously described .
DNA sequencing has introduced a new method of species discovery known as DNA barcoding. DNA barcodes are short, standardised sequences from a part of the mitochondrial genome that can be used to distinguish different species. Species can be differentiated readily when genetic variation between species exceeds that within species. The barcode sequence from each unknown specimen is compared with a library of reference barcode sequences derived from individuals of known identity. Research has been carried out to evaluate the effectiveness of this technique in identifying cryptic species in insects, birds and plants. The Neotropical freshwater ichthyofauna is the richest in the world and makes up around 25% of the total freshwater fish fauna on Earth. However, the lack of knowledge of its diversity makes taxonomic identification a great challenge.
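The barcode logic described above can be sketched in a few lines of Python. The example assumes pre-aligned sequences of equal length and uses the simple proportion of differing sites (p-distance) as the distance measure; the function names, the ten-base toy sequences and the species labels are all hypothetical, and real barcode studies use much longer sequences and model-based distances.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y) / len(seq_a)

def identify(query, library):
    """Return (species, distance) for the closest reference barcode."""
    return min(((sp, p_distance(query, ref)) for sp, ref in library),
               key=lambda item: item[1])

def has_barcode_gap(library):
    """True when every between-species distance exceeds every within-species distance."""
    within, between = [], []
    for i, (sp_i, seq_i) in enumerate(library):
        for sp_j, seq_j in library[i + 1:]:
            (within if sp_i == sp_j else between).append(p_distance(seq_i, seq_j))
    if not within or not between:
        raise ValueError("need at least two species, one with two or more sequences")
    return min(between) > max(within)

# Toy reference library (illustrative sequences, not real barcodes).
library = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTTC"),
    ("species_B", "ACGAACCTTT"),
]
print(has_barcode_gap(library))
print(identify("ACGTACGAAC", library))
```

When no gap exists between within-species and between-species distances, assignment by nearest reference becomes unreliable, which is one reason morphological or ecological confirmation remains valuable.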
Genetic methods facilitate the identification of cryptic species and species with few identifiable phenotypic characteristics. The presumed neutrality of some molecular markers, in conjunction with phylogenetic methods, provides a new perspective on species identification, especially in hierarchical relatedness and relative rates of evolution. The increased frequency with which cryptic species can be discovered with DNA sequence data, and often subsequently confirmed with morphological and/or ecological data, suggests that molecular data should be routinely incorporated into taxonomic research.
Another major problem for the natural populations of Neotropical fish (that can be reduced or controlled using genetic resources) is accidental or deliberate release of non-native fish species . Hybridisation is the mating of genetically differentiated individuals and may involve individuals within a species or between species . Conventional approaches to detect interspecific hybridisation include morphometric and molecular analyses. In recent years, DNA polymorphisms have been used for investigating fish hybridisation . Nuclear genetic markers, in particular, allow hybrid species identification because contributions to the hybrid genome of both the father and mother can be identified .
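To show why nuclear markers reveal both parental contributions, the sketch below scores an individual at a set of diagnostic loci, assumed here to be fixed for different alleles in the two parental species. It returns the proportion of alleles from species B and the proportion of heterozygous loci. The locus names, alleles and the function `hybrid_index` are hypothetical and serve only as an illustration.

```python
def hybrid_index(genotype, diagnostic):
    """Summarise one individual scored at species-diagnostic nuclear loci.

    genotype:   dict mapping locus name -> (allele, allele)
    diagnostic: dict mapping locus name -> (allele of species A, allele of species B)
    Returns (proportion of species-B alleles, proportion of heterozygous loci).
    """
    if not diagnostic:
        raise ValueError("need at least one diagnostic locus")
    b_count = 0
    het_count = 0
    for locus, (allele_a, allele_b) in diagnostic.items():
        pair = genotype[locus]
        if any(allele not in (allele_a, allele_b) for allele in pair):
            raise ValueError(f"unexpected allele at {locus}: {pair}")
        # Count the allele copies inherited from species B at this locus.
        b_count += sum(1 for allele in pair if allele == allele_b)
        # A locus is heterozygous when the two copies differ.
        het_count += int(pair[0] != pair[1])
    n_loci = len(diagnostic)
    return b_count / (2 * n_loci), het_count / n_loci

# Hypothetical diagnostic loci and individuals (all names are illustrative).
diagnostic = {"locus1": ("A", "G"), "locus2": ("C", "T"), "locus3": ("T", "C")}
pure_a = {"locus1": ("A", "A"), "locus2": ("C", "C"), "locus3": ("T", "T")}
f1 = {"locus1": ("A", "G"), "locus2": ("T", "C"), "locus3": ("T", "C")}

print(hybrid_index(pure_a, diagnostic))
print(hybrid_index(f1, diagnostic))
```

Under the stated assumption, a first-generation hybrid is heterozygous at every diagnostic locus and carries half of its alleles from each species, whereas backcrossed individuals show intermediate values. A maternally inherited marker alone could not make this distinction.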
5. Molecular markers
A number of molecular markers have been used in Neotropical fish, such as allozyme markers, restriction fragment length polymorphisms (RFLP), randomly amplified polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLP), microsatellite markers, high genome coverage markers [single nucleotide polymorphisms (SNPs)] and maternally inherited markers (mtDNA).
Allozymes were considered the first molecular marker, discovered in the 1960s in enzymes. When DNA sequences of two or more alleles in the same locus are divergent, and the corresponding RNA encodes different amino acids, multiple variants of the same protein are created. However, not every mutation in a DNA sequence results in changes to the amino acid sequences, and this is one of the disadvantages of using an allozyme as a molecular marker . Other disadvantages include heterozygote deficiencies due to null alleles and the amount and quality of tissue samples required . The limitations and disadvantages of these markers led to the development of DNA-based genetic markers.
In the 1980s, the first DNA-based molecular markers were developed. They can be classified into dominant and codominant markers. It is not possible to identify heterozygotes in dominant markers, whereas in codominant markers, this differentiation can be determined, and it is possible to estimate allele frequencies. Molecular markers can also be classified into those with known function (type I markers) or with anonymous regions (type II markers) .
5.2. Restriction fragment length polymorphisms (RFLP)
RFLP markers were the first markers discovered that were based on DNA sequences . They are considered codominant markers and are type I or type II. They are based on bacterial enzymes that recognise specific DNA sequences. The DNA is then cut into fragments where these sequences are found. The digestion of DNA by restriction enzymes results in fragments that vary between individuals, populations and species. The fragments can be analysed using the polymerase chain reaction (PCR), and the PCR products are digested by restriction enzymes. RFLP markers have low potential in determining genetic variation when compared to new, recently discovered molecular markers, mainly due to the low level of polymorphism. In addition, sequence information of the specimen is required, making it difficult to determine markers in species without molecular information. However, one advantage of these markers is that they are codominant .
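The principle of RFLP analysis can be illustrated with a small in silico digest. The sketch below cuts a linear sequence at every occurrence of a recognition site and reports the fragment lengths; it uses the EcoRI site GAATTC, which is cut after the first base. The two short "alleles" are invented for the example, and the function `digest` is a hypothetical helper rather than part of any published pipeline.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset: number of bases into the site at which the enzyme cuts.
    Returns the fragment lengths in 5' to 3' order.
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]

ECORI_SITE = "GAATTC"  # EcoRI cuts between G and AATTC
ECORI_OFFSET = 1

# Invented alleles: allele_2 carries a point mutation that destroys the second site.
allele_1 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
allele_2 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

print(digest(allele_1, ECORI_SITE, ECORI_OFFSET))
print(digest(allele_2, ECORI_SITE, ECORI_OFFSET))
```

Because a heterozygous individual would show the fragments of both alleles on a gel, the two alleles can be told apart in a single assay, which is what makes the marker codominant.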
5.3. Randomly amplified polymorphic DNA (RAPD)
RAPD techniques use PCR to amplify random anonymous segments of genomic DNA with identical pairs of primers 8–10 bp in length. Unlike RFLP markers, RAPD does not require any knowledge of the DNA sequences of the organism. Nearly all RAPD markers are dominant, so it is not possible to distinguish whether a DNA segment is amplified from a heterozygous or a homozygous locus. The primers used are short and anneal at low temperatures, amplifying multiple products from different loci. Because most of the nuclear genome is non-coding, most amplified loci are assumed to be neutral. Genetic variation is assessed by treating each band as a bi-allelic locus, scored as the presence or absence of the amplified product generated by PCR. One disadvantage of this technique is the variation in intensity that can occur between bands, which can make it difficult to determine whether bands represent different loci or alternative alleles of a locus. The markers also have low reproducibility owing to the low annealing temperature used in PCR amplification, and thus have limited application in fisheries science. Despite these disadvantages, the level of polymorphism detected is considered high [108, 111].
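Because heterozygotes cannot be scored directly with dominant markers, allele frequencies have to be inferred from the proportion of individuals lacking a band. The sketch below shows the usual calculation under a strong assumption that is not stated in the text above: the population is in Hardy-Weinberg equilibrium, so band-absent individuals are homozygous for the null allele and its frequency is the square root of their proportion. The function name and the band scores are hypothetical.

```python
import math

def dominant_locus_frequencies(band_scores):
    """Estimate allele frequencies at one dominant (presence/absence) locus.

    band_scores: list of 1 (band present) or 0 (band absent), one per individual.
    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch).
    Returns (p, q, expected heterozygosity), where q is the null-allele frequency.
    """
    n = len(band_scores)
    if n == 0:
        raise ValueError("need at least one scored individual")
    absent_fraction = band_scores.count(0) / n
    q = math.sqrt(absent_fraction)  # band-absent individuals are null homozygotes
    p = 1.0 - q
    return p, q, 2.0 * p * q

# Illustrative scores: 12 individuals with the band, 4 without.
scores = [1] * 12 + [0] * 4
print(dominant_locus_frequencies(scores))
```

With a codominant marker the same frequencies could be counted directly from genotypes, without relying on the equilibrium assumption.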
5.4. Amplified fragment length polymorphism (AFLP)
AFLP is a combination of the RFLP and RAPD techniques, using PCR to randomly amplify anonymous fragments of nuclear DNA (type II marker). The technique involves digestion of DNA using a restriction enzyme, as in RFLP analysis, producing a high number of dominant fragments that, depending on their concentration, are not detected by electrophoresis. The DNA is digested with different types of endonucleases, generating fragments of different sizes. The following steps are similar to the principles of RAPD, where small, known DNA sequences (adapters) are coupled to the ends of the fragments and are annealed with specific primers during PCR . A unique feature of this technique is the addition of known sequence adapters to DNA fragments generated by complete genomic DNA digestion. This allows subsequent PCR amplification of the many fragments generated that are then separated by denaturing polyacrylamide gel electrophoresis . The AFLP technique has some advantages, such as detection of greater numbers of loci generating a higher number of polymorphisms, broad coverage of the genome with high reproducibility (due to high PCR annealing temperatures) and low cost . Like RAPDs, they are considered dominant markers and although there are packages for codominant scoring of AFLP bands, their applicability in population studies is difficult. The major disadvantage of the technique is the need for automated gene sequencers for electrophoretic analysis of fluorescent labels, although traditional electrophoretic methods can also be employed using radioactive labels or silver staining techniques .
5.5. mtDNA markers
Mitochondrial DNA (mtDNA) markers were the first widely used DNA markers and are one of the most popular markers for molecular diversity studies in fish . This part of the genome consists of a small, circular, abundant and easy to amplify DNA molecule as there are multiple copies in the cell. Moreover, the mitochondrial gene content is strongly conserved across species, with little duplication, no intronic regions and very short intergenic regions . Studies of vertebrate species have shown a mutation rate that exceeds, by multiple times, nuclear DNA mutation rates that may be due to a lack of repair mechanisms during replication . The complete mtDNA sequences have been sequenced to facilitate analyses of molecular markers in many economically important Neotropical fish species, such as
The DNA of cytoplasmic organelles has a non-Mendelian inheritance, and the mtDNA must be considered a single locus in genetic investigations. Inheritance occurs via the mitochondria of the oocyte from which an animal develops. This maternal transmission gives information on maternal lineages of fish stocks and provides a more sensitive tool for detecting population subdivision, making it an efficient marker when compared to typical nuclear markers such as microsatellites and SNPs.
Many studies of mtDNA have focused on the major non-coding region, often called the control region, because of its rapid rate of evolution. The control region includes transcriptional promoters in both strands and the D-loop region. In these non-coding D-loop regions, the evolution rates are higher than in the rest of the molecule. These changes lead to the formation of multiple alleles, called haplotypes, that can be phylogenetically ordered within the same population and confirm intrapopulation phylogenetic relationships in population studies.
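As an illustration of how haplotypes are summarised in population studies, the following Python sketch computes haplotype diversity using Nei's unbiased estimator, h = n/(n − 1) × (1 − Σp²), a commonly used statistic that is swapped in here for illustration and is not named in the text. The short sequences stand in for aligned control-region fragments and are hypothetical.

```python
from collections import Counter

def haplotype_diversity(sequences):
    """Nei's unbiased haplotype diversity for a sample of aligned sequences.

    Identical sequences are treated as the same haplotype.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)
    sum_squared_freqs = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_squared_freqs)

# Hypothetical D-loop fragments from four individuals (three haplotypes).
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(round(haplotype_diversity(sample), 3))
```

A value near 1 indicates that most sampled individuals carry distinct haplotypes, whereas a value of 0 indicates a single shared haplotype.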
5.6. Microsatellites
Microsatellites or simple sequence repeats (SSRs) have been a popular marker in genetic fish research due to their abundance in the genome in all regions of the chromosome. There can be a small number to a few hundred copies of tandem repeat sequences of mono-, di-, tri- and tetranucleotide motifs. They are codominant and mostly type II markers, and are abundant in all species of fish, with an estimated occurrence of one in every 10 kb in coding genes, intronic regions and regulatory sequences [122, 123].
These markers are useful in evaluating structure and genetic diversity between different populations because their high polymorphism gives high power in analyses of population genetics. The polymorphisms are identified by size differences that result from varying numbers of repeat units in alleles of a single locus. Mutation rates as high as 10⁻² per generation have been detected.
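A minimal Python sketch of how size-based microsatellite genotypes can be turned into diversity estimates is given below. Genotypes are written as pairs of allele sizes in base pairs; the sizes, sample and function names are hypothetical.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele frequencies at one codominant locus.

    genotypes: list of (allele_1, allele_2) tuples of fragment sizes in bp.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    return {size: count / total for size, count in Counter(alleles).items()}

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)

def gene_diversity(frequencies):
    """Expected heterozygosity, 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in frequencies.values())

# Hypothetical genotypes for four fish at one dinucleotide-repeat locus.
locus = [(152, 156), (152, 152), (156, 160), (152, 160)]
freqs = allele_frequencies(locus)
print(freqs, observed_heterozygosity(locus), gene_diversity(freqs))
```

Because the marker is codominant, heterozygotes are observed directly, so observed and expected heterozygosity can be compared without the equilibrium assumption that dominant markers require.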
There have been many studies of wild fish stocks using microsatellites that allowed the analysis of historical population structures, colonisation histories and connectivity between populations . These population characteristics are generally controlled by environmental effects [126, 127, 128] or by anthropogenic intervention [129, 130, 131] that can induce the structuring of fish populations with a reduction in gene flow exchange and genetic variability.
However, the use of microsatellite markers has some drawbacks. They require a large investment of time and laboratory effort due to the genotyping step. Moreover, they require species-specific markers and carry a high potential for null alleles, imperfect repeats caused by polymerase slippage during replication, and genotyping errors, all of which affect population studies by providing unreliable genetic information for conservation biology, molecular ecology and population genetic research.
5.7. Single nucleotide polymorphisms (SNPs)
SNPs are type I or type II polymorphisms caused by point mutations that generate different alleles for a given nucleotide belonging to a specific locus. These molecular markers are unique nucleotide substitutions of a sequence at a single site and have been well characterised since the beginning of DNA sequencing . SNPs are the main focus in molecular marker development as they constitute the most abundant polymorphism in any organism’s genome, with a frequency estimated at approximately 1 SNP per 200–500 bp . This marker is adaptable to the automation of genotyping and reveals hidden polymorphisms that are not detected by other markers and methods . Moreover, they can be efficiently identified in any organism without the need for genomic information.
Theoretically, the SNP of a particular locus can contain up to four alleles (A, T, C and G). In practice, however, most SNPs are limited to two alleles (often the two pyrimidines C/T or the two purines A/G) with codominant inheritance. The level of polymorphism is not as high as in microsatellite markers (multi-alleles), but this disadvantage is counterbalanced by their abundance in the genome. To be considered an SNP, the least frequent allele must have a frequency of 1% or higher.
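The 1% criterion can be expressed as a simple minor allele frequency (MAF) filter. The Python sketch below is illustrative only; the allele counts are hypothetical, and for sites with more than two alleles it takes the second most common allele as the minor allele, which is an assumption of this sketch.

```python
from collections import Counter

def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at one site.

    alleles: list of observed allele copies, e.g. ["C", "C", "T", ...].
    Returns 0.0 for a monomorphic site.
    """
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    return counts.most_common()[1][1] / total

def is_snp(alleles, threshold=0.01):
    """Apply the 1% minor allele frequency criterion from the text."""
    return minor_allele_frequency(alleles) >= threshold

# Hypothetical site genotyped in 100 diploid fish (200 allele copies).
common_variant = ["C"] * 197 + ["T"] * 3   # MAF = 0.015
rare_variant = ["C"] * 199 + ["T"] * 1     # MAF = 0.005
print(is_snp(common_variant), is_snp(rare_variant))
```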
These characteristics demonstrate that this marker is ideal for several biological studies because they allow complex genomic analyses with high yield and coverage. This marker has been revolutionary in fish population research. The SNP markers have already been used in comparative studies of evolutionary genomics, population genomics, identification of interspecific hybrids, identification of sex-related sequences, genomic selection, mapping of genes by linkage maps and detection of alleles associated with economically important characteristics in aquaculture [135, 136, 137, 138, 139].
For the routine use of SNPs, genotyping platforms for analysing a large number of markers and samples, in a fast and economical manner, are fundamental. For low-throughput SNP genotyping, candidate loci can be tested using different methodologies. In summary, each platform uses a specific detection chemistry, which generates differences in the cost of genotyping, price of equipment, number of markers, expertise for use, sample volume analysis and automation .
One of the greatest barriers to the routine use of SNPs is the characterisation and discovery of these markers. Historically, numerous approaches to SNP discovery have been described, primarily from the comparison of specific locus sequences. Direct sequencing (Sanger) of candidate genes was considered the simplest, though expensive, strategy for SNP discovery. On a larger scale, the comparison of sequences of cloned fragments, particularly expressed sequence tag (EST) designs using different types of tissues, is the best alternative . However, in addition to the high costs, a considerable amount of laboratory work, time and expertise is required for this type of analysis.
5.8. Next generation sequencing (NGS) in molecular marker discovery
Next generation sequencing (NGS) allowed researchers to generate a large amount of sequencing data at relatively low cost as compared with other methods such as Sanger sequencing. To identify a greater number of gene-associated markers, a greater yield of sequence readings is required. Next generation sequencers are particularly adapted to produce high precision sequence coverage [141, 142]. Furthermore, NGS provides an enormous number of reads, which allows entire genomes to be sequenced at a fraction of the cost for Sanger sequencing  and is inclusive of non-model organisms . Therefore, NGS technologies have become useful for
Transcriptome sequencing of genomes is one of the most common analytical approaches. Complementary DNA (cDNA) is produced from the mRNA of a specific tissue or life stage. Thus, whole mRNA sequences (cDNA library) from a specific tissue or set of tissues can be aligned to a reference genome (or reference transcripts) or assembled
RAD-seq is an important method of genome reduction in non-model fish for identifying and genotyping SNPs, and unlike RNA-seq, uses genomic DNA as a template. The technique uses the principles of RFLP by reducing the complexity of the genome by subsampling at sites defined by restriction enzymes . This technique consists of digesting the genomic DNA with restriction enzymes, followed by mechanical fragmentation to reduce the size of the fragments making them suitable for sequencing. The digested fragments are then attached to adapters with single barcodes for each individual so they can be multiplexed in a pool of samples. Thus, the regions adjacent to the restriction sites of multiple individuals are sequenced simultaneously in a single run . There are numerous variations of the RAD-seq technique with single restriction enzyme cut sites (original RAD, 2bRAD) or with two restriction enzyme cut sites (GBS, CRoPS, RRL, ddRAD) that promise to increase the number of loci assayed at low cost and effort in ecological and evolutionary studies .
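The two core ideas of RAD-seq described above, subsampling the genome at restriction sites and pooling individuals with barcoded adapters, can be sketched in Python as follows. The sequence, barcodes and sample names are hypothetical; GAATTC is the EcoRI recognition sequence. The sketch assumes that no barcode is a prefix of another and ignores sequencing errors.

```python
def find_restriction_sites(sequence, site):
    """Return the start positions of every occurrence of a recognition site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions

def demultiplex(reads, barcodes):
    """Assign pooled reads to samples by their leading barcode.

    barcodes: dict mapping barcode sequence to sample name (hypothetical).
    Reads with no matching barcode are discarded.
    """
    by_sample = {name: [] for name in barcodes.values()}
    for read in reads:
        for barcode, name in barcodes.items():
            if read.startswith(barcode):
                # Strip the barcode and keep the genomic part of the read.
                by_sample[name].append(read[len(barcode):])
                break
    return by_sample

genome = "AAGAATTCGGGAATTCTT"
print(find_restriction_sites(genome, "GAATTC"))

pooled_reads = ["ACGTAATTCGG", "TGCAAATTCTT", "ACGTAATTCAA", "GGGGAATTC"]
sample_barcodes = {"ACGT": "fish_1", "TGCA": "fish_2"}
print(demultiplex(pooled_reads, sample_barcodes))
```

Only the regions flanking the restriction sites are sequenced, which is why the same loci tend to be recovered across individuals and can be compared for SNP discovery.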
The identification of SNPs using the RAD-seq method has the advantage of avoiding unequal gene expression problems that may impair the discovery of SNPs using transcriptome sequencing. Another advantage of the RAD-seq technique is the possibility of assigning DNA barcodes to individual samples or pools of samples during the preparation of DNA libraries, thus reducing costs. However, as with transcriptome analysis, the ability to identify true SNPs is hampered by errors introduced by high-throughput sequencing. To mitigate this problem, a sufficient sequence read depth is necessary for both techniques.
RNA-seq and RAD-seq techniques have allowed the detection of many microsatellite markers [155, 156] and SNP markers [154, 157, 158] in model and non-model fish species around the world. Although they have been increasingly used in the aquaculture industry for Neotropical fish, microsatellites have also been identified and characterised for research in the field of biology and conservation [159, 160, 161, 162]. In previous studies, microsatellite loci in closely related species have been identified. These include species belonging to the Anostomidae, Characidae [164, 165], Cichlidae, Pimelodidae, Prochilodontidae [168, 169, 170] and Serrasalmidae families.
With respect to SNP identification, few studies have been carried out in relation to the conservation of Neotropical freshwater species. Researchers have focused on commercially valuable species such as the tambaqui (
6. Application of molecular markers
6.1. Identification of Neotropical cryptic species
mtDNA has been a marker of choice for reconstructing historical patterns of population demography, admixture, biogeography and speciation [88, 176] and can help identify cryptic individuals in many Neotropical fish species . However, the main problem in genetic studies aimed at the maintenance of biodiversity is the difficulty of developing a method of species identification, since there are millions of unidentified and unknown species. The use of DNA barcodes, segments of approximately 600 bp of the mitochondrial gene cytochrome oxidase I (COI), has been considered an efficient technique to catalogue all biodiversity. The Neotropical freshwater ichthyofauna is considered the most diverse in the world, and very few fish species have been identified. It has been estimated that 30–40% of species have not been described, and genetic identification is a challenge, even with molecular techniques [176, 177].
Barcode research has already been performed in the São Francisco River Basin and provided evidence of the effectiveness of barcodes to catalogue the diversity of Neotropical basins by discovering new species and genera (
Advances in the use of barcodes have also been achieved in the Pampas plain region of Argentina and have shown that specimens of
6.2. Genetic variability
Levels of genetic diversity between individuals in the same population and between populations are essential for species conservation in the face of environmental changes. In general, most of the wild populations tend to have high levels of genetic diversity . This is largely due to formation of these groups by migratory fish, representing panmictic populations, since high gene flow and the size of the population reduce the effects of genetic drift .
Several factors that may interfere in the fragmentation of populations, or their migratory potential, may cause a population bottleneck and decrease the genetic variability. Bottlenecks reduce population size by making individuals subject to genetic drift and inbreeding, thereby reducing the species evolutionary potential .
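The effect of a bottleneck on genetic variability can be illustrated with the standard population-genetic expectation that heterozygosity declines by a factor of (1 − 1/(2Ne)) per generation under drift, where Ne is the effective population size. This formula is a textbook idealisation (closed population, no mutation or selection) that is introduced here for illustration and does not appear in the text; the parameter values are hypothetical.

```python
def heterozygosity_after_drift(initial_heterozygosity, effective_size, generations):
    """Expected heterozygosity after t generations of genetic drift.

    Uses H_t = H_0 * (1 - 1 / (2 * Ne)) ** t, an idealised model that
    ignores mutation, migration and selection.
    """
    if effective_size <= 0:
        raise ValueError("effective size must be positive")
    decay = 1.0 - 1.0 / (2.0 * effective_size)
    return initial_heterozygosity * decay ** generations

# Hypothetical comparison over 10 generations, starting from H_0 = 0.8.
bottlenecked = heterozygosity_after_drift(0.8, effective_size=50, generations=10)
large = heterozygosity_after_drift(0.8, effective_size=500, generations=10)
print(round(bottlenecked, 4), round(large, 4))
```

Under these assumptions the smaller population loses close to 10% of its heterozygosity in ten generations, compared with about 1% in the larger one.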
Several studies carried out in the Paraná River Basin have already demonstrated a decline and genetic homogenisation among fish populations in this basin [185, 186, 187, 188]. These studies indicate that the fragmentation of the basin due to the large hydroelectric dams installed in the Paraná River Basin, mainly in the Upper Paraná region, is one of the major factors affecting these populations.
Brazil is the third largest producer of hydroelectric power, accounting for up to 10% of total world production. The conversion of free-flowing tropical rivers into the regulated systems associated with hydroelectric dams is one of the major concerns for the conservation of freshwater Neotropical fish. In addition to the impact on water velocity and temperature, hydroelectric dams block the natural river flow, which affects freshwater fish populations through habitat fragmentation, with increased risks of population isolation and consequent disruption of gene flow. This has already been reported using microsatellite markers for
In order to mitigate the damage caused by hydroelectric dams, programmes to reintroduce affected species are a potential solution. However, lack of knowledge about the genetics of local species can have the opposite effect. Analyses of restocking programmes for
In addition to hydroelectric dams and inappropriate programmes for genetic restocking, the inadequate management of cultivated populations may also interfere with the genetic variability of species. Fish escaping from aquaculture facilities may influence the level of genetic diversity in natural populations living in the vicinity of fish farms. The introduction of cultivated individuals to wild populations may result in a mixture of populations with different genetic characteristics that reduce the average genetic diversity (Wahlund effect), as has already been observed in many fish populations [191, 192].
6.3. Genetic structure
As mentioned previously, Neotropical ichthyofauna is subject to many environmental factors that may affect their rate of retention in the environment of origin, including the destruction of their habitat and consequent fragmentation of populations. The effects on the spatial distribution of fish populations may result in genetic processes that affect gene frequency, including dispersive processes, genetic drift and founder effects. These genetic processes intensify systematic migration, mutation and selection. Due to the high levels of polymorphisms and abundance throughout the genome, molecular markers are useful for genetic structure analyses in different populations. Studies directed towards the verification of structures of Neotropical populations using microsatellite markers are concentrated on populations affected by the construction of dams.
Many freshwater fish species that inhabit Neotropical rivers have migratory behaviour and reproduce during the rainy season, when water levels increase and temperatures rise. Normally, fish migration occurs in the main river or its tributaries for the spawning of eggs that are subsequently carried downstream to the floodplains, where they find suitable conditions for development . This ability to migrate long distances suggests that these fish species constitute a single panmictic population, as reported in several studies of
Some research has also been carried out using mtDNA for population structure analyses. D-loop regions were used to infer structural analyses of populations of pacu (
There are few studies that use SNPs for the conservation of Neotropical freshwater species, and there are insufficient data to evaluate the genetic structure of natural stocks. More genetic studies using SNP markers for species identification need to be conducted in order to better understand population structure and to develop management measures and conservation policies [172, 173].
6.4. Identification of Neotropical hybrid fish
In the Neotropical region, particularly in Brazil, serrasalmid and pimelodid hybrids represent important advances in aquaculture. Hybrid fish originating from serrasalmid species such as the pacu (
Genetic technologies for hybrid fish identification include cytogenetic methods and PCR techniques. The morphological similarity of fish hybrids to their parental species, mainly in juvenile stages or post-F1 generations, means they can only be differentiated using molecular marker techniques. Initially, non-PCR-dependent molecular methods (using allozymes and RFLPs) were used to identify hybrids of serrasalmid species. These markers are no longer used because of the advantages of PCR techniques. Mendelian codominant PCR markers, such as microsatellites and SNPs, are suitable for identifying hybrids and detecting introgression events. However, more studies are required to define genetic markers, such as SNPs, that are essential for the identification of fish hybrids, together with production monitoring and management measures, particularly in detecting escaped fish hybrids in the natural environment.
Alternative and less costly techniques, such as PCR-RFLP and multiplex-PCR, are easier to carry out and have proved to be efficient methodologies that can be quickly and inexpensively executed, allowing the identification by simple PCRs based on single nucleotide polymorphisms . The PCR-RFLP method allows the analysis of DNA variation. Base substitutions in specific fragments formed at the enzyme recognition sites result in patterns of restriction fragments . Multiplex-PCR uses species-specific primers for determining loci that differ between the analysed species by a few nucleotide substitutions, and two or more reactions can take place in the same tube . PCR-RFLP and multiplex-PCR techniques are well established for the identification of hybrids between Neotropical species [55, 204]. However, genetic monitoring of hybridisation programmes should be applied in a routine way to verify whether the trade and management of hybrids are being performed correctly in fish breeding farms.
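The principle behind PCR-RFLP can be sketched in Python: a single base substitution inside an enzyme recognition site abolishes the cut, so the two parental species give different fragment patterns. The amplicon sequences below are hypothetical, and the enzyme is modelled as EcoRI (recognition sequence GAATTC, cutting after the first base). For a nuclear locus, an F1 hybrid would be expected to show the fragments of both parents.

```python
def fragment_lengths(amplicon, site, cut_offset):
    """Lengths of the fragments produced by digesting a PCR amplicon.

    site: recognition sequence; cut_offset: position within the site
    at which the enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = amplicon.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = amplicon.find(site, start + 1)
    boundaries = [0] + cuts + [len(amplicon)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]

# Hypothetical 18 bp amplicons differing by one substitution in the site.
species_a = "ATGCCGAATTCGGTACCA"  # site present: cut into two fragments
species_b = "ATGCCGAGTTCGGTACCA"  # site lost: amplicon stays intact

pattern_a = fragment_lengths(species_a, "GAATTC", 1)
pattern_b = fragment_lengths(species_b, "GAATTC", 1)
hybrid_pattern = sorted(set(pattern_a + pattern_b))
print(pattern_a, pattern_b, hybrid_pattern)
```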
6.5. Traceability of Neotropical fish
Traceability is the ability to identify species and their origin. It is considered important for the conservation of natural stocks and for certification of food quality. DNA-based methodology of traceability has greater reliability and accuracy and is an important tool for the conservation of threatened stocks of Neotropical fish. Furthermore, SNP arrays for species identification, or for identification of a specific population, can be used in processed fish samples that have been frozen, salted, cooked and canned with a high attribution power. This makes it possible to identify the origin of the fish consumed and avoid commercial fishing in places with threatened stocks. However, the fish traceability test alone is not sufficient to reduce the decline in fish numbers; rather, traceability techniques should be used in conjunction with sustainable fisheries, by-catch reduction and management-based policies [125, 154, 205]. Despite traceability research in fish populations worldwide to avoid predatory and indiscriminate overfishing, there is still a lack of important studies related to DNA traceability markers in freshwater Neotropical fish species.
The methodological advances and the development of sequencing technologies can enable an efficient applicability of molecular markers in the conservation of Neotropical fish. Despite the negative impact that human activities have had on fish from the Neotropical region (such as deforestation, construction of dams, overfishing and non-native species introduction to the basins), there are few genetic studies into population structure, genetic variability and hybrid identification.
This work was supported by grants from CNPq (446779/2014-8, 130629/2015-4 and 305916/2015-7), FAPESP (2014/03772-7, 2016/18294-9, 2015/14185-8 and 2016/21011-9), and CAPES.