Most species in the mustard family are restricted to higher elevations and latitudes where they also have restricted local spatial distributions. In this chapter, we describe a novel hypothesis for the development of low-elevation range limits in upland mustard species. The hypothesis suggests that defense regulation of glucosinolates could underlie the evolution of the spatially restricted distributions. A list of testable predictions is presented to evaluate the hypothesis. An interdisciplinary Ecological Genomics approach is needed to test the predictions; therefore, we also describe the field of Ecological Genomics. Although there is already support for some of the predictions, which we discuss, most of the predictions remain untested. Therefore, we also describe several tests that help evaluate each of the predictions.
- Range limits
- signaling pathways
- evolutionary constraints
- ecological genomics
Mustard plants (Brassicaceae) comprise approximately 3,700 species, including several crop species (cabbage, radish, canola, etc.) and the model for molecular plant biology, Arabidopsis thaliana. Despite this diversity, mustard species generally inhabit high-altitude temperate regions where populations have patchy distributions (Al-Shehbaz personal communication). At lower elevations and latitudes, species often face both abiotic and biotic stressors, to which populations must adapt in order to expand their range or survive climate shifts. Mustard species are also characterized by the production of glucosinolate defense toxins. In this chapter, we evaluate a recent hypothesis that regulation of glucosinolates could underlie the evolution of the spatially restricted distributions. This hypothesis contrasts with previous hypotheses on defense evolution, which argue the opposite: that variation in defensive chemistry is the consequence of spatial distributions, life history patterns, inherent growth rates, etc. [3, 4]. We therefore begin this chapter with a description of the central hypothesis, followed by a set of predictions and then a description of the interdisciplinary approach needed to evaluate the hypothesis.
Understanding the causes and dynamics of naturally occurring range limits in plants has become a central issue in both basic (evolutionary ecology) and applied (conservation and agriculture) areas of biology because of climate change and land use concerns [5-8]. Most transplant studies show decreased performance just across geographic range boundaries ( for review); therefore, it is generally assumed that many range boundaries are spatial manifestations of niche limits, requiring adaptation for local range expansion or for the persistence of populations at range edges as climate changes. However, the existence of range limits suggests that adaptation to stressful environments just outside the range is often prevented. What prevents this adaptation from occurring?
Many factors and processes can contribute to range limit development. These factors include lack of genetic variation in range margin populations, barriers to dispersal, gene flow from elsewhere within the range, and various kinds of tradeoffs . Any of these factors, alone or in combination with other factors, could prevent adaptation to stressful environments outside the range. However, there is often sufficient genetic variation within and among range margin populations  for natural selection to presumably act upon, and often there are no obvious barriers to dispersal at range boundaries. In these cases, possible constraints on the process of adaptation to stressful environments just outside the range would include gene flow and tradeoffs. But because many range margin populations are geographically and genetically isolated (e.g., ), it is thought that the study of range limit development should often focus on molecular, physiological, or developmental tradeoffs . What kind of tradeoffs might be important to mustards at low elevation or low latitudinal range limits?
The process of adaptation often proceeds by modifying existing structures and pathways. Within ranges, stress response signal transduction pathways help plants to survive temporary challenges from abiotic and biotic stressors . Just across range boundaries, some of these same stressors increase in frequency; therefore, one would predict that adaptation to stressful environments across range limits would involve the upregulation of stress response pathways such that the pathways and the traits that they regulate were expressed more frequently or stably.
However, evolutionary models predict that a problem may arise when antagonistic response pathways are co-opted simultaneously for evolutionary change. The problem lies in negative pleiotropic and epistatic effects. Multiple signaling pathways often form networks of regulatory genes (transcription factors) that may interact to produce multiple positive and negative integrative effects. An excellent example is the flowering time signaling network in Arabidopsis, which involves many positive and negative interactions among the photoperiod, circadian clock, vernalization, autonomous, and gibberellic acid pathways. In general, quantitative geneticists predict that the evolution of complex traits may involve many genetic correlations, which is why there has been such interest in the analysis of quantitative genetic variance–covariance G-matrices. Indeed, early theoreticians such as Fisher predicted that the evolution of complex traits would involve only many genes of small effect, which would avoid such pleiotropic effects. Epistatic interactions between the major flowering time network genes FRI and FLC, for example, were among the factors contributing to the maintenance of genetic variation in Arabidopsis flowering time and therefore may represent an evolutionary constraint. FRI and FLC are major transcription factors (TFs) in the flowering time signaling network that allow large behavioral shifts involving many genes, but these major effects might impede evolution through multiple epistatic and pleiotropic effects.
In another example, it is generally assumed that plants face both abiotic and biotic stressors across range boundaries, especially at low elevation “trailing edges” of species ranges , yet it is well known that stress response pathways, such as abscisic acid (ABA) signaling for coping with abiotic stressors (e.g., drought) and jasmonic acid (JA) signaling for coping with biotic stressors (e.g., herbivores) may negatively interfere with one another ([18 - 20] for reviews). Thus, the simultaneous co-option of these antagonistic pathways for low-elevation range expansion where organisms face both increased abiotic and biotic stressors may be problematic because of the crosstalk. Although limited phylogenetic evidence suggests that ancient antagonistic crosstalk between signaling pathways may not constrain evolution , more thorough and experimental work is needed to address this issue.
Specifically, our hypothesis states that components of defense (e.g., JA – jasmonic acid) and stress tolerance (e.g., ABA – abscisic acid) signaling pathways are not genetically independent of one another, which may constrain the simultaneous evolution of defense and stress tolerance. Candidate genes in Arabidopsis for the evolutionary constraint include specific TFs that are involved in gene regulation within and among signaling pathways (Figure 1). For example, if there exists genetic variation in TFs MYC2 or AF2 that natural or artificial selection could act upon to increase drought stress tolerance traits for more stable expression, we predict that (1) a glucosinolate defense response would also change because these TFs help to regulate both pathways, and (2) other pleiotropic or epistatic effects would reduce fitness.
1. Defense and stress tolerance phenotypes will be genetically correlated in family structured common garden experiments.
2. Genetically diverged populations from different elevations will also be diverged for defense and abiotic stress tolerance traits; neither population will have high values of both traits simultaneously.
3. Defense allocation and stress tolerance phenotypes will not segregate independently from one another in extended generation crosses between diverged populations.
4. Defense and stress tolerance genetic covariation will be associated with markers linked to candidate transcription factors (TFs) that regulate both defense and drought tolerance pathways (AtMYC2 [At1g32640], AtMYB2 [At2g47190], and a NAC TF AtAF2 [At5g08790]).
5. DNA sequence variation of candidate regulatory genes (TFs) implicated in prediction #4 will also correlate with the defense and stress tolerance tradeoff in (a) the segregating crosses and in (b) unrelated individuals (linkage disequilibrium association analysis). Comparison of DNA sequence of coding regions will also show molecular evidence for evolution (McDonald-Kreitman test and dN/dS ratios).
6. There may be other regions of the genome besides those containing the candidate TFs that also simultaneously affect defense and stress tolerance phenotypes and thus would also provide molecular evidence for the tradeoff. For whole genome marker analysis away from candidate TFs, defense and stress tolerance phenotypes will either co-locate in linkage mapping or their QTL will show negative epistasis.
7. Gene expression of candidate genes in genetically diverged populations will reflect the evolution of more stable expression of defense and stress tolerance response pathways.
4. Approach and Discussion
An Ecological Genomics approach would satisfy the need for experimental and molecular genetics to evaluate the hypothesis in several of the predictions. Ecological Genomics (EG) is an interdisciplinary approach in the biological sciences that seeks to find the genes underlying species interactions in natural habitats and to study the evolutionary forces that have shaped these genes, their expression patterns, and the phenotypes that they encode [22-26]. As such, nucleic acid sequencing, forward and reverse genetic tools, comparative methods, and other molecular techniques are required to find the relevant genes, to establish databases of candidate genes, and to study their expression patterns and allelic variants.
Although many model species (species whose genomes have been completely sequenced and that have broad interest from molecular biologists) meet and exceed these molecular criteria, often they lack the attributes for ecological studies. For example, the natural distribution of A. thaliana is in Eurasia, making it difficult for study by North American researchers. Furthermore, there are still relatively few model species. With the advent of affordable high-throughput next-generation sequencing (NGS), however, sequence information is becoming available for more species and populations. As of June 2015, for example, there were 6,653 completely sequenced genomes, but only a small fraction of these are eukaryotes (http://www.genomesonline.org). The GOLD website also listed 60,631 genome sequencing projects (June, 2015), but only 9,059 or 15% were eukaryotes. Thus, although NGS is helping the field of EG to move away from model species, especially, for example, in evolutionary studies , there is still a need to understand the attributes that constitute an ideal model organism for EG studies.
Feder and Mitchell-Olds  listed the criteria for an ideal model species in Evolutionary and Ecological Functional Genomics (EEFG), which is synonymous with EG. The criteria stated that there needs to be: (1) a co-operative community of researchers from different disciplines that share resources and information; (2) the tools to find genes and study their variation within and among species; (3) natural, undisturbed habitats such that genetically diverged populations can be studied for local adaptation; (4) molecular data on sequences and chromosomal maps for marker development and mapping, cis and trans regulatory regions identified, and gene function and its fitness consequences known under natural conditions; and, finally, (5) the ability to study the ecological consequences of natural genetic sequence variation in the genes for evolutionary inferences. Thus, access to NGS alone does not necessarily make a species ideally suited for EG study.
Boechera stricta, a close wild relative of Arabidopsis, satisfies many of these criteria and is an emerging ecological model species that inhabits environments that differ substantially in drought stress, herbivore community, and other abiotic and biotic conditions . The selfing rate of B. stricta in the northern portion of its geographic range is 0.95 , enabling the creation of experimental advanced generation hybrids for forward ecological genetic studies (e.g., ). The genome of B. stricta has also been recently sequenced (http://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Bstricta).
Forward genetic approaches for finding candidate genes in ecology include population genomics, association mapping, linkage mapping, and transcriptomics [31, 32]. Population genomics identifies outlier marker loci through statistical analysis of population genetic parameters, but it provides no knowledge of associated phenotypes. Association and linkage mapping both include measurement of phenotypes, but a distinct advantage of association mapping is that no pedigree is required and that allelic variants representative of naturally occurring populations can be used in analyses [26, 33]. Association mapping can be conducted on unrelated individuals because it relies on the linkage disequilibrium (LD) inherent in natural populations. When markers for association analysis are developed from candidate genes, significant associations may identify the relevant genes themselves. We assume that candidate genes in Arabidopsis could often be studied with success under natural conditions in close relatives in the genus Boechera. Unidentified causal genes in LD with significant candidate markers are unlikely in B. stricta, where LD decays rapidly, within 10 kb. If present, population substructure (i.e., stratification, admixture, or inbreeding) must be controlled for in genetic association analyses because its confounding effects on LD can lead to false positives.
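As an illustration of the LD concept that association mapping depends on, the following minimal Python sketch computes the disequilibrium coefficient D and the squared correlation r² between two biallelic loci. It assumes that haplotypes are known directly, which is plausible for homozygous lines of a highly selfing species but is an assumption here; the function name `ld_r2`, the allele labels, and the example haplotypes are hypothetical and not taken from any B. stricta dataset.

```python
def ld_r2(haplotypes):
    """Return (D, r2) for two biallelic loci.

    `haplotypes` is a list of two-character strings such as "AB" or "ab".
    The first character is the allele at locus 1 ("A" or "a"), the second
    is the allele at locus 2 ("B" or "b").
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "A" for h in haplotypes) / n   # frequency of A at locus 1
    p_b = sum(h[1] == "B" for h in haplotypes) / n   # frequency of B at locus 2
    p_ab = sum(h == "AB" for h in haplotypes) / n    # frequency of the AB haplotype
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


# Hypothetical samples: complete association versus independent assortment.
d_linked, r2_linked = ld_r2(["AB"] * 5 + ["ab"] * 5)
d_free, r2_free = ld_r2(["AB", "Ab", "aB", "ab"])
```

Computing r² for marker pairs at increasing physical distance is one way to visualize how quickly LD decays, which in turn indicates how close a marker must lie to a causal variant for an association to be detectable.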
4.1. Tests of prediction #1
Using family structured quantitative genetic analyses, previous studies have examined the genetic correlation between defense and stress tolerance of prediction #1. This work has been conducted on a close wild relative of Arabidopsis in the genus Boechera. Boechera stricta is a genetically diverse, diploid, predominantly self-fertilizing species that occurs at higher elevations throughout western North America in natural habitats [29, 34]. The phenotype of an individual of any species is determined by genetic and environmental factors (P = G + E), and these factors and thus the phenotypes vary among individuals within populations (VP = VG + VE). If the phenotypes are measured in a common garden experiment, environmental variation is minimized so that the phenotypic variation approximates the genetic variation (VP ≈ VG). Among full-siblings of clonal or self-fertilizing species such as B. stricta, the genetic variation (VG) measured from a common environment can be used for evolutionary inferences because all allele combinations within and among loci are inherited without change. In accordance with prediction #1, the negative genetic correlation between glucosinolate (GS) toxin defense allocation and stress tolerance associated with range limits of B. stricta has been observed using these methods in the field and lab five times previously [2, 35-37]. Stressors involved in the tradeoff have included drought, nutrient deficiency, change in plant community structure across the range boundary (suggesting competition), and the multivariate stress of the range boundary itself. In these studies, it was hypothesized that these tradeoffs occurred because of antagonistic crosstalk between abscisic acid (ABA) stress tolerance and jasmonic acid/ethylene (JA/ET) or salicylic acid (SA) defense-signaling pathways. Circumstantial evidence implicating the pathways in the tradeoff comes from experimental ABA soil inoculations, the effects of which depended on endogenous GS level.
These conclusions are based mainly on correlative methods and circumstantial evidence. Experimental and molecular genetics and more direct measures of pathway components are needed.
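The family structured analysis described in this section can be sketched in Python as follows. This is a simplified illustration, not the analysis used in the cited studies: it assumes that the correlation among family means from a common garden approximates the genetic correlation between two traits, and the function names, family labels, and trait values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def family_means(records):
    """Average two traits within each family.

    `records` is an iterable of (family_id, trait_x, trait_y) tuples,
    one per plant grown in the common garden.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for family, x, y in records:
        entry = sums[family]
        entry[0] += x
        entry[1] += y
        entry[2] += 1
    return {fam: (sx / n, sy / n) for fam, (sx, sy, n) in sums.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def family_mean_correlation(records):
    """Correlation between traits across family means."""
    means = family_means(records)
    xs = [m[0] for m in means.values()]
    ys = [m[1] for m in means.values()]
    return pearson(xs, ys)


# Hypothetical data: (family, glucosinolate content, root:shoot ratio).
plants = [
    ("fam1", 10.0, 1.0), ("fam1", 12.0, 1.2),
    ("fam2", 8.0, 1.5), ("fam2", 8.0, 1.7),
    ("fam3", 5.0, 2.0), ("fam3", 5.0, 2.2),
]
r_family = family_mean_correlation(plants)
```

A negative value of `r_family` would be in the direction expected under prediction #1. A full analysis would also partition variance within and among families and attach a significance test, which this sketch omits.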
4.2. Tests of prediction #2
For gene mapping in B. stricta, a replicated cross has been conducted between populations from the Big Horn Mountains, Wyoming, and the Black Hills, South Dakota. These are geographically isolated and genetically diverged populations [36, 38]. The populations are located at different ends of the altitudinal range of B. stricta (Big Horns 3,000 m, Black Hills 1,700 m), and thus the sites differ in several environmental factors. The populations have diverged for glucosinolate content and stress tolerance traits such as root:shoot ratio as predicted; neither population had high values of both traits (Figure 2).
4.3. Tests of prediction #3
The segregating F2 populations from the crosses can be used to test prediction #3 that the defense and stress tolerance traits will not be inherited independently of one another. F2s from more than one cross allow for broader inference and increased statistical power . Because each genotype in the F2 segregating generation cannot be replicated, drought treatments in the lab would need to be imposed on all plants after the plants were first monitored for performance (e.g., growth) under controlled watering conditions. Measures of drought tolerance could include relative growth (before and after drought treatments), leaf mass area (LMA), water use efficiency (carbon isotope ratio), and root:shoot ratio. Several measures of stress tolerance increase the probability of detecting the tradeoff, and provide a general assessment of stress tolerance . Glucosinolate analysis should be conducted on leaf tissue at the time of drought stress.
Of course, the goal should be to test for the tradeoff in the field across the low-elevation range boundary using the segregating generation of the experimental crosses. However, this requires replication of each F2 genotype so that it can be compared within and just outside the range boundary. In the F3 generation, there is replication of each F2 lineage. Common garden experiments of F3 families could be established within and across low-elevation range boundaries. Previous field transplant experiments [2, 35, 36] have established the areas just across the low-elevation range boundary as stressful in terms of several correlated abiotic and biotic stressors, which manifest as slower growth and reduced survivorship. Large sample sizes would increase the power to detect effects if they exist, and would also allow for assessment across multiple years and the possibility of ecological gradient manipulations. For example, removal treatments of candidate competitors, such as Lithophragma parviflorum, could be included within and outside the range. Competitive interactions also induce ABA and therefore might also induce the tradeoff.
4.4. Tests of predictions #4 and #5
Here the goal is to test whether TFs can be implicated in the tradeoff using TF-linked markers in the crosses, and then to conduct further genetic association analyses using markers within the implicated candidate TFs. Previous molecular work allows one to locate markers on a B. stricta linkage map that are linked to the candidate TFs in the crosses. Chromosomal painting and end sequencing have shown that there are large syntenic blocks that align between the A. thaliana and B. stricta genomes . These Arabidopsis blocks have been located in the B. stricta genome. This can now also be verified with the recently sequenced B. stricta genome.
Analyses within the candidate genes could be conducted in populations of related and unrelated individuals. The analyses of related individuals could use F2 and F3 mapping populations, but there would be confounding effects of linkage and linkage disequilibrium (LD) from other unknown genes and alleles. Genotyping in the F2 generation also allows for genetic association analyses in the F3 field experiments. Significant associations using unrelated individuals would help to eliminate these confounding effects. Another set of markers, such as microsatellite markers (e.g., [29, 36]), would need to be used to identify and control for population genetic structure among the unrelated individuals.
For marker development within genes of B. stricta, identification of genetic variation (e.g., single nucleotide polymorphisms – SNPs) in marker-implicated candidate TFs can be performed by first sequencing the TFs and their promoter regions from the diverged populations. Alignment of the diverged sequences would identify any polymorphisms. The recently sequenced B. stricta genome allows one to readily design primers to amplify and then sequence the genes.
To test whether polymorphic genes are under selection, statistical tests of the ratio of nonsynonymous to synonymous substitutions (dN/dS) in coding regions would be performed. Comparing dN/dS ratios allows for detection of positive or negative selection, albeit conservatively. The McDonald-Kreitman test is more liberal for detecting deviation from the neutral model of molecular evolution.
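The McDonald-Kreitman test compares nonsynonymous and synonymous changes that are fixed between species (Dn, Ds) with those that are polymorphic within a species (Pn, Ps), arranged as a 2x2 contingency table. The Python sketch below is one minimal way to evaluate such a table, assuming a two-sided Fisher exact test as the test statistic and reporting the neutrality index; the function names and the example counts are hypothetical and are not results for any candidate TF.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Return (neutrality_index, p_value) from the four MK counts.

    dn, ds: fixed nonsynonymous and synonymous differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    """
    if ps * dn == 0:
        raise ValueError("neutrality index is undefined when Ps or Dn is zero")
    neutrality_index = (pn * ds) / (ps * dn)   # (Pn/Ps) / (Dn/Ds)
    return neutrality_index, fisher_exact_two_sided(dn, ds, pn, ps)


# Hypothetical counts for one coding region.
ni, p_value = mk_test(dn=2, ds=4, pn=1, ps=4)
```

A neutrality index below 1 indicates an excess of fixed nonsynonymous differences, as expected under positive selection, whereas a value above 1 indicates an excess of nonsynonymous polymorphism. With counts as small as those in the example, the test has little power.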
4.5. Test of prediction #6
To determine whether drought tolerance and defensive QTL co-localize on the B. stricta genome or whether there are epistatic interactions between QTL from different traits, linkage mapping would need to be conducted in F2 lab and F3 field experiments. Genotyping conducted in the F2 generation can also be used for successful mapping among F3 sib families. Linkage analysis is often extended to incorporate information from several markers, called multi-point or interval mapping; therefore, composite interval mapping followed by multiple interval mapping using QTL Cartographer would be used.
4.6. Test of prediction #7
If natural selection has acted upon genetic variation in signaling pathways for more stable or frequent expression of traits, this should be detectable in a comparison of the high- and low-elevation populations for trait responses and gene expression. For example, the populations could be compared in a double challenge experiment in which plants are first drought-stressed and then fed upon by generalist insect herbivores. One would predict that drought stress responses would take precedence over herbivore-induced defense responses in close wild relatives of Arabidopsis, reflecting the crosstalk between ABA and JA signaling that has been well documented in Arabidopsis. But we would predict that this plastic tradeoff would vary between the diverged high- and low-elevation populations and that this would be reflected in differential gene expression involving the candidate signaling pathways. To examine the expression levels of candidate TFs, one could use qPCR, but genome-wide gene expression analysis using RNA-seq would also allow the assessment of other genes.
An Ecological Genomics approach is needed to evaluate most of the listed predictions that remain untested. We mainly advocate a candidate gene approach that leverages the vast functional genomics literature of Arabidopsis for functional ecological genetics studies on close wild relatives. However, next-generation sequencing is rapidly increasing the potential of many ecological systems. The suggested studies outlined here are important because if our hypothesis continues to be supported, there may be important implications for understanding range limits, defense evolution, canalization, conservation, and crop improvement. If defense regulation can be used to help predict population sensitivity to environmental stress, then there would be several important applied implications.