Metabarcoding approaches for the study of human vector-borne diseases using natural populations of vectors as biological samples.
The implementation of sustainable control strategies aimed at disrupting the transmission of vector-borne pathogens requires a comprehensive knowledge of the vector ecology in the different eco-epidemiological contexts, as well as the local pathogen transmission cycles and their dynamics. However, even when focusing only on one specific vector-borne disease, achieving this knowledge is highly challenging, as the pathogen may exhibit a high genetic diversity and multiple vector species or subspecies and host species may be involved. In addition, the development of the pathogen and the vectorial capacity of the vectors may be affected by their midgut and/or salivary gland microbiome. The recent advent of Next-Generation Sequencing (NGS) technologies has brought powerful tools that can allow for the simultaneous identification of all these essential components, although their potential is only just starting to be realized. We present a metabarcoding approach that can facilitate the description of comprehensive host-pathogen networks, integrate important microbiome and coinfection data, identify at-risk situations, and disentangle the transmission cycles of vector-borne pathogens. This powerful approach should be generalized to unravel the transmission cycles of any pathogen and their dynamics, which in turn will help the design and implementation of sustainable, effective, and locally adapted control strategies.
- vector-borne diseases
- transmission cycles
- vector ecology
- next-generation sequencing (NGS)
- blood meals
- One Health
Vector-borne diseases affecting human health are caused by pathogens transmitted by “living organisms” between humans or from animals to humans. These “living organisms” are known as “vectors,” which are generally bloodsucking arthropods, such as mosquitoes, ticks, flies, sandflies, fleas, or triatomine bugs. These arthropods ingest disease-producing microorganisms during a blood meal from an infected host (human or animal) and later transmit them to a new host during subsequent blood meals. According to the World Health Organization (WHO), vector-borne diseases, such as malaria, dengue, human African trypanosomiasis, leishmaniasis, Chagas disease, yellow fever, Japanese encephalitis, or onchocerciasis, account for almost 20% of all infectious diseases worldwide. They cause more than 700,000 deaths annually, and more than half of the world’s population is estimated to be at risk of these diseases. They are a major obstacle to development, and the poorest segments of societies and least-developed countries are the most affected. The deadliest vector-borne disease, malaria, causes more than 400,000 deaths annually, mainly among children under 5 years of age. However, the world’s fastest-growing vector-borne disease is dengue, with a 30-fold increase in disease incidence over the last 50 years [1, 2]. Currently, an estimated 96 million cases of dengue occur each year, and more than 3.9 billion people in over 128 countries are at risk of contracting this disease [1, 3]. Chagas disease, which is one of the primary study models of our research group and is classified by the WHO among the Neglected Tropical Diseases (NTDs), is a major public health problem in Latin America, where 6–7 million people are currently infected [4, 5].
The control of vector-borne diseases relies mainly on control programs targeted against the different vectors. Nevertheless, the efficiency of the different vector control strategies is closely linked to the local ecology of the vectors, which in turn defines local transmission cycles. Consequently, implementing sustainable control strategies aimed at disrupting the transmission of vector-borne pathogens requires comprehensive knowledge of vector ecology and behavior in the different eco-epidemiological contexts, as well as of the local transmission cycles of the pathogens and their dynamics. However, even when focusing on only one specific vector-borne disease, achieving this knowledge is challenging. Indeed, the pathogen may exhibit high genetic diversity, and multiple vector species or subspecies and host species may be involved. In addition, the development of the pathogen and the vectorial capacity of the vectors may be affected by their midgut and/or salivary gland microbiome. Sometimes, several pathogen species can also be involved. For example, leishmaniases are caused by more than 20
The recent advent of Next-Generation Sequencing (NGS) technologies has brought powerful tools, with enormous potential, allowing the simultaneous identification of all these components for the understanding of the eco-epidemiology of vector-borne diseases. Nevertheless, their potential is only just starting to be realized. Here, we present a metabarcoding approach based on NGS that can facilitate the creation of comprehensive host-pathogen networks, integrate important microbiome and coinfection data, identify at-risk situations, and disentangle the transmission cycles of vector-borne pathogens.
2. Complexity of vector-borne pathogen transmission cycles and their dynamics
The transmission cycles of vector-borne pathogens are shaped by the ecology and behavior of hosts and vectors in their specific environments and defined by the specific interactions between the vectors, the pathogens, and their hosts (which also act as blood-feeding sources for the vectors). Consequently, the comprehensive identification of these interactions is critical to disentangle transmission cycles and understand their dynamics. In most cases, an extraordinary diversity of organisms is involved, making the identification of those interactions challenging. In the case of Chagas disease, for example, the causative agent, a protozoan parasite called
3. Metabarcoding: a highly sensitive and integrative approach to disentangle vector-borne pathogen transmission cycles
NGS technologies can generate millions of sequencing reads in parallel. This massive sequencing throughput can produce reads from fragmented libraries of a specific genome (i.e., genome sequencing) or from a pool of PCR products. Metabarcoding approaches rely on this technology to sequence a large number of different amplicons of taxonomically informative genes (barcodes). While metagenomics refers to the identification of all genomes within a particular ecosystem or sample, metabarcoding aims to identify only a subset of them (those that are of interest for a particular question) by sequencing millions of different amplicons of these barcodes, without the need for cloning (i.e., sequences are obtained directly from a mix of different amplicons of the different barcodes of interest).
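To make the idea of reading sequences directly from a mix of amplicons more concrete, the following minimal sketch sorts pooled reads into per-marker bins according to the primer each read starts with. It is only an illustration under stated assumptions: the marker names, the five-base "primers", and the reads are all hypothetical toy values, exact prefix matching stands in for the mismatch-tolerant primer detection that real pipelines use, and the function `bin_reads_by_marker` is not taken from any published workflow.

```python
from collections import defaultdict

# Hypothetical toy primers: real primers are longer and marker-specific.
TOY_PRIMERS = {
    "vertebrate_12S": "ACTGG",
    "bacterial_16S": "GTGCC",
    "pathogen_marker": "TTAGC",
}


def bin_reads_by_marker(reads, primers):
    """Group reads by the marker whose primer they start with.

    Reads matching no primer are collected under "unassigned".
    Matching is exact here; this is a simplifying assumption.
    """
    bins = defaultdict(list)
    for read in reads:
        for marker, primer in primers.items():
            if read.startswith(primer):
                # Keep only the barcode region, with the primer trimmed off.
                bins[marker].append(read[len(primer):])
                break
        else:
            # No primer matched this read.
            bins["unassigned"].append(read)
    return dict(bins)


# A toy pool mixing amplicons from different markers.
pooled_reads = ["ACTGGAAAT", "GTGCCTTTA", "ACTGGCCCA", "NNNNNAAAA"]
binned = bin_reads_by_marker(pooled_reads, TOY_PRIMERS)
for marker, marker_reads in binned.items():
    print(marker, marker_reads)
```

Each bin can then be compared against the reference library for its own barcode, which is what allows a single sequencing run to address several taxonomic questions at once.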
Consequently, in the case of vector-borne pathogens, starting only from the vectors as biological samples, it is possible to target and amplify well-chosen molecular markers (barcodes) of interest with universal primer sets to identify the different actors of transmission cycles (e.g., vertebrate blood sources, midgut microbiome, pathogen diversity, and vector diversity). Other ecological interactions that are not directly involved in the transmission cycles but are relevant for understanding the vector ecology and the dynamics of the transmission cycles (e.g., plant-feeding sources, sometimes required as a source of energy for routine activities such as flight, mating, and walking or as a source of protein for egg maturation) can also be identified. A schematic representation of the metabarcoding approach for the identification of ecological interactions of disease vectors is given in Figure 2. After purification of the total DNA (and RNA if working with RNA pathogens) contained in each vector midgut (and salivary glands, depending on the kind of vector)
Currently, the most common systems provide up to 384 different tags and 25 million reads per sequencing run. The depth (i.e., the number of reads or sequences) obtained per molecular marker and sample depends on the number of tagged samples and the number of markers amplified per sample. For instance, if we amplify 10 molecular markers for 100 vector specimens and run at a depth of 25 million sequences, about 250,000 reads per vector specimen and 25,000 reads per marker and specimen will theoretically be obtained. This kind of multiplexing considerably lowers sequencing costs per sample. Downstream analyses with bioinformatics tools, such as those provided on the open-access Galaxy platform, make it possible to obtain and identify the sequences corresponding to each targeted marker for each vector specimen. This approach is thus extremely powerful for reconstructing pathogen transmission cycles and understanding their dynamics, since it can reveal, after adequate analyses, the existing ecological interactions through the simultaneous identification, for each specimen, of its species or subspecies, its blood-feeding source(s), the pathogen(s) of interest and their species or lineage(s), the composition of its midgut and salivary gland microbiomes, its plant-feeding source(s), mutations associated with insecticide resistance, etc.
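The depth estimate in this paragraph is a simple division, and a small helper makes it easy to redo for other study designs. The sketch below assumes that amplicons are pooled in equal proportions and that no reads are lost to quality filtering, which is why the figures are theoretical upper bounds; the function name `expected_depth` is our own and does not come from any sequencing toolkit.

```python
def expected_depth(total_reads, n_specimens, n_markers):
    """Return (reads per specimen, reads per marker and specimen).

    Assumes equal pooling of all specimens and markers and no read loss,
    so the values are theoretical maxima, not expected yields.
    """
    reads_per_specimen = total_reads // n_specimens
    reads_per_marker = reads_per_specimen // n_markers
    return reads_per_specimen, reads_per_marker


# The example from the text: 25 million reads, 100 specimens, 10 markers.
per_specimen, per_marker = expected_depth(25_000_000, 100, 10)
print(per_specimen, per_marker)  # 250000 25000
```

Using all 384 tags with the same 10 markers, the same calculation gives roughly 65,000 reads per specimen and 6,500 reads per marker and specimen, which illustrates the trade-off between the number of multiplexed samples and the depth available for each marker.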
T. cruzi transmission cycles in the Yucatan peninsula (Mexico): an example of the use of the metabarcoding approach
As a proof of concept, we recently performed a pilot study of the metabarcoding approach presented above using Chagas disease in the Yucatan peninsula (Mexico). In this region,
This study, which was based on 14
To further assess potential transmission cycles of
Consequently, the approach presented here provides high-value information that can be used in multiple ways for the design and implementation of sustainable, effective, and locally adapted control strategies, and it deserves to be extended to other eco-epidemiological contexts and to any vector-borne pathogen. To date, metabarcoding approaches for the study of human vector-borne diseases using natural populations of vectors are being progressively adopted, but they are still used sparingly [54, 55]. Moreover, they generally focus on only one of the components of transmission cycles, such as blood-feeding hosts [56, 57, 58, 59], plant-feeding hosts, microbiome composition [60, 61], or vector diversity (Table 1), and thus provide limited information, although the approach can easily be made more integrative, as we illustrated here, to simultaneously identify the different actors involved in transmission.
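As a rough illustration of what the integrative use of such data might look like, the sketch below tallies vector-host and host-pathogen co-occurrences from per-specimen results. Everything in it is assumed for the example: the specimen records, the taxon and lineage labels, and the function `build_network` are hypothetical and are not data or code from the pilot study. Note also that finding a host and a pathogen lineage in the same vector specimen shows co-occurrence only and does not by itself prove that the host was infected.

```python
from collections import Counter

# Hypothetical per-specimen results after taxonomic assignment.
specimens = [
    {"id": "bug_01", "vector": "vector_sp_A",
     "hosts": ["human", "dog"], "pathogen_lineages": ["lineage_I"]},
    {"id": "bug_02", "vector": "vector_sp_A",
     "hosts": ["dog"], "pathogen_lineages": []},
    {"id": "bug_03", "vector": "vector_sp_A",
     "hosts": ["human", "mouse"],
     "pathogen_lineages": ["lineage_I", "lineage_II"]},
]


def build_network(records):
    """Count, across specimens, how often each pair co-occurs.

    Returns two Counters keyed by (vector, host) and (host, lineage).
    Each specimen contributes at most once to any given pair.
    """
    vector_host = Counter()
    host_pathogen = Counter()
    for rec in records:
        for host in set(rec["hosts"]):
            vector_host[(rec["vector"], host)] += 1
            for lineage in set(rec["pathogen_lineages"]):
                host_pathogen[(host, lineage)] += 1
    return vector_host, host_pathogen


vector_host, host_pathogen = build_network(specimens)
for edge, weight in sorted(vector_host.items()):
    print(edge, weight)
for edge, weight in sorted(host_pathogen.items()):
    print(edge, weight)
```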
| Vector | Geographic origin | Target DNA | Main findings | Reference |
|---|---|---|---|---|
| Mosquitoes | Different villages in Papua New Guinea | Mammalian blood-feeding hosts | Unbiased characterization of mammalian blood-feeding hosts, including unsuspected hosts and mixed blood meals. Human, dog, and pig were the most common host-feeding sources. The approach can also be adapted to evaluate interindividual variations among human blood meals | |
| Mosquitoes ( | Different sites on the coast of the Caspian Sea in northern Iran | Vertebrate blood-feeding hosts | The four most common mosquito species had similar host-feeding patterns. The most commonly detected hosts in these species were humans, cattle, and ducks | |
| Mosquitoes and sand flies | Forest sites in French Guiana | Mammal blood-feeding hosts | Accuracy of the short 12S marker proposed for the identification of Amazonian mammals. The accuracy of taxonomic assignments depends strongly on the comprehensiveness of the reference library | |
| Triatomine bugs ( | Two sampling sites in Panama | Vertebrate blood-feeding hosts | Reliability of the metabarcoding approach proposed for the identification of vertebrate blood-feeding hosts | |
| Phlebotomine sand flies ( | Different sampling sites in Brazil, Israel, and Ethiopia | Plant-feeding hosts | Sand flies preferentially feed on | |
| Mosquitoes ( | Different habitats across central Thailand | Bacterial and eukaryotic microbiome | Patterns of microbial composition and diversity that affect pathogen prevalence appeared to differ by both vector species and habitat for a given species. Microbial composition was less diverse in urban areas | |
| Tsetse flies ( | Two trypanosomiasis foci in Cameroon | Bacterial microbiome | Endosymbiont | |
| Phlebotomine sand flies ( | Various locations in French Guiana | Sand flies | Efficiency of metabarcoding based on the mitochondrial 16S rRNA for identification of sand fly diversity in bulk samples | |
| Mosquitoes and sand flies (various species) | Three sites along a gradient of anthropogenic pressure in French Guiana (Saint-Georges de l’Oyapock area) | Vectors and vertebrate blood-feeding hosts | Contrasting ecological features and feeding behavior among dipteran species, which revealed arboreal and terrestrial mammals, as well as birds, lizards, and amphibians, as hosts. Lower vertebrate diversity was found in sites undergoing higher levels of human-induced perturbation | |
| Triatomine bugs ( | Different habitats in rural Yucatan (Mexico) | Vertebrate blood-feeding hosts, | Ecological associations of triatomines which shape | |
In this chapter, we presented a metabarcoding approach to study vector-borne pathogen transmission cycles and their dynamics and illustrated the feasibility and high sensitivity of the proposed approach with a recent study that used Chagas disease in the Yucatan peninsula (Mexico) as a study model. NGS technologies are quickly becoming more affordable and cost-effective. Moreover, many bioinformatics tools have greatly simplified analyses in recent years. Consequently, this powerful approach deserves to be generalized to other eco-epidemiological contexts to unravel the transmission cycles of any vector-borne pathogen and their dynamics, which in turn will help the implementation of sustainable, effective, and locally adapted strategies to control their transmission.
This work received financial support from CONACYT (National Council of Science and Technology, Mexico) Basic Science (Project ID: CB2015-258752) and National Problems (Project ID: PN2015-893) Programs. This work was also funded by the Louisiana Board of Regents through the Board of Regents Support Fund [# LESASF (2018-2021)-RD-A-19] and grant #632083 from Tulane University School of Public Health and Tropical Medicine.