Since the dawn of classical microbiology, scientists have worked to unravel the ecological patterns of occurrence, distribution and function of microorganisms in many ecosystems, including soil. In recent years, the famous Baas-Becking statement, “Everything is everywhere, but the environment selects”, has been widely used as a central question in microbial biogeographic studies and in niche-based theories.
Although we know the importance of “small” soil microorganisms in maintaining the dynamic balance and resilience of “big” ecosystems, little is known about their patterns of assembly and how these relate to functional resilience when natural areas, such as tropical forests, are converted to agriculture. Advances arising from the genomics era have enabled microbial ecologists to access the ecological dimension of genetic and functional biodiversity, through genomic sequencing techniques, at a scale and depth never seen before.
However, it remains unclear how to analyze and interpret the patterns of biodiversity generated by millions (or even billions) of genetic and functional data points so that they yield robust and concise answers to ecological questions, among them: what are the effects of converting forest to agriculture on microbial ecological patterns? Moreover, how can this dimension of genetic and functional biodiversity be integrated with the dimension expressed by the metabolic products and ecological relations of microorganisms, and with a third, no less important, environmental dimension, which can modulate the patterns of occurrence and distribution of microorganisms in ecosystems around the globe? Elucidating these dimensions through metacommunity ecology and biogeography may allow us to open the black box of microbial assembly and functionality in agroecosystems, and to give answers to both supporters and opponents of the Baas-Becking statement.
Dimension 1 has been analyzed extensively through large-scale sequencing of nucleic acids. Recently, metagenomic libraries from soil microbial DNA have been used as templates to evaluate the effects of the conversion of forest into agriculture. Dimension 2 has been assessed by biochemical assays of the production and consumption of microbially mediated greenhouse gases and of microbial enzyme activities in soil [5–8]. Dimension 3 is evaluated by observation or by collection and analysis of environmental data, including soil physicochemical, climatic and geographical attributes, and their relation to microbial molecular parameters [9–12]. A multidimensional approach linking these dimensions, which together modulate the distribution and abundance of microorganisms in ecosystems, is obtained by multivariate pairwise correlations among the parameters evaluated in all three dimensions, generating an integrative, systems-biology view (Figure 1).
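The pairwise-correlation step can be illustrated with a minimal Python sketch. It assumes one parameter per dimension measured at the same five sites and uses the Pearson coefficient; the chapter does not specify which correlation measure is used, and the parameter names and values below are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical measurements at five sampling sites (illustrative values only).
parameters = {
    "gene_abundance": [10.0, 12.0, 15.0, 19.0, 24.0],  # dimension 1 (genetic)
    "enzyme_activity": [2.0, 2.4, 3.0, 3.8, 4.8],      # dimension 2 (metabolic)
    "soil_pH": [6.5, 6.1, 5.8, 5.2, 4.9],              # dimension 3 (environmental)
}

def pairwise_correlations(table):
    """Correlate every pair of parameters across the sampling sites."""
    return {(a, b): pearson(table[a], table[b]) for a, b in combinations(table, 2)}

correlations = pairwise_correlations(parameters)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

With real data, a rank-based coefficient and a correction for multiple testing would probably be more appropriate; the sketch only shows the structure of the comparison.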
The aim of this chapter is to provide readers with conceptual and practical bases for analyzing and interpreting microbial metacommunity assembly (structure), functions and their linkage, with applications in agroecosystem conservation. To that end, we consider results from recent advances in high-throughput next-generation sequencing (NGS) and bioinformatics, which allow a deep assessment of taxonomic, phylogenetic and genetic microbial biodiversity and establish a new frontier in microbial metacommunity ecology. We argue that metacommunity ecology and biogeography may serve as cornerstones of microbial ecology studies, helping to elucidate tricky questions about microbial distribution and ecological relationships, from the local community level to the global level.
2. Molecular advances in microbial ecology
The rapid development of molecular biology techniques at the end of the twentieth century, and their successful application to microbial ecology, has changed the way we assess the structure and function of microorganisms. In recent years, advances in molecular microbial ecology, including NGS techniques, have revealed a previously unknown microbial biodiversity that classical microbiology had not detected.
NGS tools have lowered the relative cost of sequencing and massively increased the capacity and quality of data production. These advances have contributed significantly to the understanding of the structure and function of soil microbial communities. Several molecular methods (e.g. shotgun metagenomics) have been used to investigate microbial diversity and changes in microbial community structure in a wide range of environments.
Studies in microbial ecology have improved with the development of sequencing technologies. DNA sequencing is the process of determining the order of the nucleotides that constitute a DNA molecule, that is, the order of the four bases adenine (A), guanine (G), cytosine (C) and thymine (T) in a strand of DNA. DNA sequencing provides a means of identifying organisms by comparison with databases: a molecular marker (e.g. the 16S rRNA gene) from a known, identified species is sequenced and deposited in a publicly available database for future comparison. DNA sequencing is suitable for sequencing individual genes, molecular markers, larger genetic regions, full chromosomes or entire genomes. The sequencing approach is a powerful tool for studying the microbial communities that inhabit soil; it could be useful for predicting changes in soil properties and quality, as well as for understanding community assembly in these environments. The assessment of microbial diversity will be advanced by new technologies that answer key questions about the “who, what, when, where, why and how” of microbial communities.
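The idea of identifying an organism by comparing a marker sequence with a reference database can be sketched in a few lines of Python. This is a toy illustration under strong assumptions: the sequences are made-up fragments (not real 16S rRNA data), they are taken to be already aligned and of equal length, and similarity is scored as simple percent identity.

```python
def percent_identity(a, b):
    """Share of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

# Toy reference "database": hypothetical marker fragments, not real sequences.
reference_db = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTTCGTCA",
    "taxon_C": "TTGTACGGAC",
}

def best_match(query, db):
    """Return the (name, identity) pair of the most similar reference sequence."""
    scored = ((name, percent_identity(query, seq)) for name, seq in db.items())
    return max(scored, key=lambda item: item[1])

name, identity = best_match("ACGTACGTAA", reference_db)
print(name, identity)
```

Real pipelines rely on alignment or k-mer based classifiers against curated databases; the sketch only conveys the compare-and-rank logic.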
The advance in sequencing technologies from Sanger to 454 pyrosequencing and Illumina has opened new possibilities in microbial community analysis by making it possible to collect millions of sequences spanning hundreds of samples. On the figures quoted here, the increase in the number of sequences per run from parallel pyrosequencing technologies such as the Roche 454 GS FLX (5 × 10⁵) to the Illumina GAIIx (1 × 10⁸) is about 200-fold, comparable in magnitude to the 50- to 500-fold increase from Sanger (1 × 10³ to 1 × 10⁴) to 454. In addition, the use of barcode strategies allows the analysis of thousands of samples in a single run. With the advance of such technologies the read length has increased, although reads remain far shorter than the desirable length and than the reads obtained from traditional Sanger sequencing (~1000 bp). 454 pyrosequencing was the first next-generation sequencing technology available as a commercial product and can be considered the cornerstone of the sequencing revolution. The pyrosequencing method advanced metagenome studies by increasing the number of reads and decreasing the cost per sequence, enabling deep phylogenetic community analysis.
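The barcode strategy mentioned above amounts to tagging each sample's fragments with a short known sequence and sorting the pooled reads afterwards. A minimal Python sketch follows, assuming fixed-length barcodes at the start of each read and exact matching; the barcodes, sample names and reads are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample):
    """Assign each read to a sample by its leading barcode and trim the barcode."""
    barcode_len = len(next(iter(barcode_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        # Reads whose barcode is not recognized are kept apart, not discarded.
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)

# Hypothetical barcodes and pooled reads from a single sequencing run.
barcodes = {"AACC": "forest_soil", "GGTT": "soybean_field"}
reads = ["AACCACGTAC", "GGTTTTGACA", "AACCGGATCC", "CCCCACGTAC"]

samples = demultiplex(reads, barcodes)
print(samples)
```

Production demultiplexers usually tolerate a limited number of barcode mismatches; exact matching is used here only to keep the example short.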
The use of metagenomics in studies of microbial communities has given researchers an overview not only of diversity but also of functional traits, which is important for linking microbial community structure to function. The rapid advance of sequencing technologies, allied to bioinformatics tools, is increasing the possibility of massive studies in microbial ecology aimed at a deep comprehension of the composition of soil microbial communities and the functions they perform in a wide range of ecosystems. The new information will be useful for a better understanding of microbial assembly in both its phylogenetic and functional aspects.
3. The metacommunity concept in microbial ecology
The first and simplest concept defines a metacommunity as a set of local communities linked by dispersal, which interact with each other by exchanging individuals of multiple species. Different species interact with each other through ecological relations at the community (local) level. Events such as immigration and dispersal, among others, modulate the exchange of individuals from local communities to a broader range of communities, culminating in species evolution. The use of different terms and perspectives is a matter of concern in metacommunity ecology. To define scales of organization and to place population and community dynamics within metacommunities, we adopt terms and definitions that have been conceptualized and applied previously. To assess metacommunity assembly we consider the four central theories of metacommunity ecology, namely: (I) patch-dynamic, (II) species-sorting, (III) mass-effects and (IV) neutral theory.
Metacommunity studies have been applied to ecology [19,23]. Patterns of community distribution can vary at the regional scale across environments, between environments in the same region, and within a specific environment. Thus, a multidimensional approach is needed to obtain a comprehensive picture. To that end, four metacommunity paradigms can be considered (Figure 2):
Patch-dynamic (stochastic+deterministic) – as in the neutral model, it assumes that habitat quality is constant across the different patches (microbial cores) of the landscape. In this model, both stochastic and deterministic extinctions are affected by interspecific relations and counterbalanced by dispersal.
The approaches underlying this paradigm come in two versions, both based on occupancy formalisms in which patches are either occupied by or vacant of given populations at equilibrium. Both versions assume a time lag between local and regional population dynamics, which means that changes in extinction-colonization patterns in the local community take a certain period to affect regional metacommunity dynamics.
In the first version of this paradigm, only regional coexistence is considered to influence metacommunity patterns. It assumes that coexisting species compete for niche resources, but includes no interactions between species that influence local community dynamics, since local communities are not considered in the model. In the second version, given a homogeneous environment where a set of species co-occur in equilibrium, regional coexistence is possible owing to a trade-off between competition and dispersal, or between fecundity and dispersal (Figure 1a). A limitation of the patch-dynamic paradigm is that all local communities or patches are assumed to be identical.
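An occupancy formalism of this kind can be made concrete with the classic Levins patch-occupancy model, used here as an assumed stand-in because the chapter does not name a specific equation. The fraction p of occupied patches changes as dp/dt = c·p·(1 − p) − e·p, where c (colonization rate) and e (extinction rate) are illustrative parameters, and the equilibrium occupancy is 1 − e/c when c > e. The sketch below is single-species and omits the time lag and the competition-dispersal trade-off discussed above.

```python
def levins_equilibrium(c, e):
    """Equilibrium fraction of occupied patches; zero if extinction outpaces colonization."""
    return max(0.0, 1.0 - e / c)

def simulate_occupancy(c, e, p0=0.05, dt=0.01, steps=20000):
    """Integrate dp/dt = c*p*(1-p) - e*p with a simple forward Euler scheme."""
    p = p0
    for _ in range(steps):
        p += dt * (c * p * (1.0 - p) - e * p)
    return p

# Hypothetical rates: colonization 0.5, extinction 0.1 per unit time.
print(simulate_occupancy(0.5, 0.1), levins_equilibrium(0.5, 0.1))
```

Because all patches share the same c and e, the sketch also makes the stated limitation visible: every patch is treated as identical.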
Species-sorting (deterministic) – based on the traditional theory of niche segregation among species that co-occur in a given environment. The theory addresses changes in communities across environmental gradients.
In this case, environmental parameters such as soil fertility, plant cover species and soil organic matter content, among others, acquire fundamental importance in modulating the patterns of distribution and abundance of microorganisms. The species-sorting paradigm holds that local patches differ in some attributes and that the outcome of local species interactions depends on abiotic environmental factors.
This paradigm assumes that habitat patches are heterogeneous and that rates of dispersal are moderate, which means that all species have a similar probability of reaching all patches of the metacommunity (Figure 1b). Species sorting is therefore expected to occur through niche partitioning, since species are differently adapted to particular conditions defined by environmental gradients.
Mass-effects (deterministic) – assumes that the size of a given population can vary at regional and local scales and can be affected quantitatively by dispersal. This model of mass effects due to dispersal requires that different habitat patches have different conditions at a given moment, and that the patches be sufficiently tightly connected. Dispersal thus results in source/sink dynamics between populations in different patches of the landscape.
Dispersal plays a major role in the mass-effects paradigm. On the one hand, an increase in immigration rates might enhance the abundance of certain populations in a local community, to the detriment of neighboring communities in the metacommunity. On the other hand, an increase in emigration rates can lower the birth rates of local populations, moving their abundance away from that expected in a closed metacommunity (Figure 1c). Considering competing species of microorganisms in an environment where the total community has a constant abundance, fluctuations in local population abundances may occur by chance.
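Source/sink dynamics can be illustrated with a deliberately simple two-patch Python sketch. All numbers are hypothetical: the source patch grows by a factor of 1.5 per generation up to a carrying capacity, the sink patch halves each generation when isolated, and a fixed fraction of the source population disperses to the sink.

```python
def simulate_source_sink(dispersal, generations=200):
    """Return (source, sink) abundances after repeated growth and one-way dispersal."""
    capacity = 100.0      # carrying capacity of the source patch (assumed)
    source_growth = 1.5   # per-generation growth factor in the source (assumed)
    sink_growth = 0.5     # the sink declines without immigration (assumed)
    source, sink = capacity, 0.0
    for _ in range(generations):
        emigrants = dispersal * source
        source = min(capacity, (source - emigrants) * source_growth)
        sink = sink * sink_growth + emigrants
    return source, sink

print(simulate_source_sink(0.0))  # isolated patches: the sink stays empty
print(simulate_source_sink(0.2))  # dispersal maintains a sink population
```

In this sketch the sink persists only because of immigration, which is the quantitative effect of dispersal that the mass-effects paradigm emphasizes.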
Neutral Model (stochastic) – all the previous paradigms share the assumption that species in local communities differ from each other in their capacity to compete for niche occupation. The dynamics of a metacommunity then depend on the trade-offs resulting from the assembly of several co-occurring populations.
The neutral paradigm emerges as a null hypothesis for microbial assembly. In its models, persistence in a given community is the result of random processes of immigration and extinction (Figure 1d), and species have equal competitive capacity. Neutral theory is the simplest way to characterize the complexity of a set of populations in a local community. To assess it, we need only a parameter θ, the number of potential species in a given community, and an immigration rate parameter m.
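A neutral local community with immigration can be simulated in a few lines of Python. The sketch assumes a community of fixed size J in which, at each step, one randomly chosen individual dies and is replaced either by an immigrant (with probability m) or by the offspring of another randomly chosen local individual. For simplicity the metacommunity is a fixed, hypothetical pool of individuals rather than one generated from θ.

```python
import random
from collections import Counter

def simulate_neutral(J, metacommunity, m, steps, seed=0):
    """Zero-sum neutral drift: community size stays at J, species are equivalent."""
    rng = random.Random(seed)
    community = [rng.choice(metacommunity) for _ in range(J)]
    for _ in range(steps):
        dead = rng.randrange(J)                    # a random individual dies
        if rng.random() < m:
            community[dead] = rng.choice(metacommunity)  # immigrant replaces it
        else:
            parent = rng.randrange(J - 1)          # pick a parent other than the dead one
            if parent >= dead:
                parent += 1
            community[dead] = community[parent]    # local birth replaces it
    return community

# Hypothetical metacommunity pool with unequal species abundances.
pool = ["A"] * 50 + ["B"] * 30 + ["C"] * 15 + ["D"] * 5
local = simulate_neutral(J=200, metacommunity=pool, m=0.1, steps=5000)
print(Counter(local).most_common())
```

No species has any competitive advantage in this sketch, so differences in final abundance arise only from chance and from the composition of the immigrant pool.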
A classical approach to evaluating the neutral paradigm re-interprets Ewens’ sampling distribution, which was initially developed for genetic studies. The model underlying this approach is based on the zero-sum dynamics of a metacommunity. Indices derived from this view are also being used for local communities, with recent applications in studies of the patterns of microbial assembly in agroecosystems and the rhizosphere [4,21].
A more comprehensive picture of the application of these paradigms in microbial ecology can be obtained from both theoretical models (e.g. the Classic Metapopulation model, neutral) and numerical models (e.g. the Zero-Sum Model, neutral; the Broken-Stick Model, niche-based). These models describe the organization of microorganisms into communities at the local level and along metacommunities at the regional level (e.g. through rates of dispersal and the immigration coefficient).
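One consequence of Ewens' sampling distribution is a closed-form expectation for the number of distinct types in a sample of n individuals, E[S] = Σ θ/(θ + i) for i = 0, …, n − 1. Read ecologically, types are species and θ is the biodiversity parameter of the neutral model. The Python sketch below evaluates this expectation; the θ values and sample size are hypothetical.

```python
def expected_richness(theta, n):
    """Expected number of species in a sample of n individuals under Ewens' formula."""
    return sum(theta / (theta + i) for i in range(n))

# Hypothetical values of theta for a sample of 1000 individuals (e.g. sequence reads).
for theta in (1.0, 10.0, 50.0):
    print(theta, round(expected_richness(theta, 1000), 1))
```

Comparing an observed richness with this expectation is one simple way to ask whether a sample is compatible with neutral assembly, although formal tests use the full distribution rather than its mean.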
4. Application of metacommunity models to unravel microbial structure and functions in agroecosystems
Although we know a lot about the distribution, demography and functions of plants and animals, these patterns remain obscure for microorganisms. Knowledge of how microbial assembly and functions respond to the conversion of pristine forests into agricultural systems is vital for understanding the possible ecological consequences.
Biogeography and metacommunity ecology have made it possible to investigate the mechanisms that generate and maintain microbial diversity in these ecosystems, such as the emergence of new species, extinction, dispersal and ecological interactions, at several levels of complexity and ranges of scale. A framework for investigating microbial patterns is needed, one that refers to the patterns found for macroorganisms and establishes possible exceptions regarding microbial assembly and functional niche occupation. This includes knowing whether microbial assembly differs among environments and across space, as well as the effects that modulate this variation. Moreover, a multiscale biogeographic approach can help us to determine whether spatial variation is due to specific environmental factors, such as land use and seasonality, or to evolutionary selective events.
As mentioned in the previous section, different species within a local community, and even communities along a metacommunity, tend to have different patterns of assembly through space and time, owing to different abilities to compete, occupy niches and disperse along the ecosystem gradient. Thus, besides applying metacommunity paradigms to describe microbial assembly and niche occupancy, we can also consider the limitations and barriers to dispersal that make the behavior of some species differ at the biogeographic scale (Figure 3).
An early conceptual groundwork for microbial biogeography can be found in Candolle's definitions of province and habitat for plants. Applied to microorganisms, a province could be any area in which microbial structure reflects historical evolutionary events. The limits of a single province may therefore vary greatly in size and are closely linked to the resolution and the taxa under study. Areas of soybean cultivation hundreds of miles apart might be considered separate provinces, considering the general structure of the bacterial communities that inhabit their rhizosphere and surrounding soil. However, those areas may also be treated as members of a single province, given that members of the bacterial class Alphaproteobacteria are able to colonize the rhizosphere and nodulate these plants regardless of distance, owing to a high level of conservation of the genes related to this mechanism and a large number of signaling strategies suited to a broad range of environments and plant species.
4.1. Local, regional and global factors affect soil microbial community structure
Understanding controls over the distribution of soil microbial communities is a fundamental step toward describing soil ecosystems, understanding their functional capabilities, and predicting their responses to environmental change. However, the complexity of these communities and their interactions with environmental characteristics have made generalizations difficult. Recently, high-throughput sequencing technologies have facilitated the investigation of soil bacterial communities at local, regional, and global scales.
The association of microbial groups with environmental characteristics has been recognized as the most important mechanism controlling soil microbial communities, with soil chemical factors identified as master variables that explain significant portions of the variation in soil microbial diversity and community structure at local [40,41], regional [42–44] and global [45,46] scales.
However, while environmental factors have been identified as exerting primary control on soil microbial distribution, on average approximately 50% of the variation in microbial diversity and structure remains unexplained. Additionally, very few studies have examined how controls on soil microbial communities operate simultaneously at multiple scales, contrasting local and regional drivers of microbial diversity and community structure.
The factors that control soil microbial community composition are much debated. It has been suggested that while local-scale variation in soil microbial communities can be explained by plant identity, substrate hotspots and soil chemical factors [47–49], at regional and continental scales additional factors, such as climate, topography and soil pH, become more important [50–52]. However, pH has been shown to shape soil microbial communities at scales below 1 m², as well as at field and continental scales [38,41]. Microbial communities have also been shown to be influenced by vegetation type, land use, soil nutrient status, and soil organic matter quality and quantity at landscape scales [51,52,54].
Knowledge of the relationships among soil microbial communities, landscape factors, soil factors and plant communities at different spatial scales is relatively lacking. This gap in our understanding of the factors that explain variation in microbial communities at larger spatial scales is surprising given their functional importance in regulating ecosystem processes, such as carbon and nitrogen cycling [49,55], and the resistance of nutrient cycles to climate change-related disturbances. In this sense, no studies have simultaneously tested the importance of a range of abiotic factors, including climate and soil properties, and biotic factors, such as vegetation composition, across a wide range of spatial scales. This represents a major gap in knowledge given the potential for both abiotic and biotic factors to explain variation in soil microbial communities.
4.2. Case study: Niche-based theory explains microbial assembly in soybean rhizosphere
The rhizosphere is the immediate surroundings of the plant root, the portion of soil under the influence of root exudates. It is considered a hot spot of microbial species, and the communities inhabiting this environment are shaped by the materials released by the plant, such as exudates, border cells and mucilage. Studies of the rhizosphere microbiome have increased in recent years, mainly because this microbiota can have profound effects on the growth, nutrition and health of plants in agroecosystems.
In one experimental study, the process of microbial selection in the rhizosphere from the bulk soil reservoir was examined under agricultural management of soybean in Amazon forest soils. The authors used a shotgun metagenomics approach to investigate the taxonomic and functional diversity of microbial communities and to test the validity of neutral and niche theories in explaining community assembly in the rhizosphere. The species rank abundance distribution generated by metagenomics was fitted to five theoretical models of assembly. Neutral theory predicts that the rank abundance distribution will be consistent with the zero-sum multinomial (ZSM) model, whereas niche-based theory predicts a fit to the pre-emption, broken-stick, log-normal or Zipf-Mandelbrot models [58–60].
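Fitting a rank abundance distribution to competing models can be sketched as follows in Python. Two of the niche-based models are shown in their standard forms: the broken-stick model, where the expected abundance at rank r of S species is (N/S)·Σ 1/k for k = r, …, S, and the pre-emption (geometric) model, N·α·(1 − α)^(r − 1). The observed counts and the value of α are hypothetical, and goodness of fit is scored by a plain sum of squared errors, which is an assumption of this sketch and not necessarily the criterion used in the study.

```python
def broken_stick(total, n_species):
    """Expected abundances by rank under the broken-stick model."""
    return [
        total / n_species * sum(1.0 / k for k in range(rank, n_species + 1))
        for rank in range(1, n_species + 1)
    ]

def preemption(total, n_species, alpha):
    """Expected abundances by rank under the niche pre-emption (geometric) model."""
    return [total * alpha * (1.0 - alpha) ** (rank - 1)
            for rank in range(1, n_species + 1)]

def sum_squared_error(observed, expected):
    """Simple misfit score between observed and expected rank abundances."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))

# Hypothetical counts for five taxa, sorted from most to least abundant.
observed = [60, 24, 10, 4, 2]
total, n_species = sum(observed), len(observed)

fits = {
    "broken_stick": sum_squared_error(observed, broken_stick(total, n_species)),
    "preemption": sum_squared_error(observed, preemption(total, n_species, alpha=0.6)),
}
best_model = min(fits, key=fits.get)
print(fits, best_model)
```

The log-normal, Zipf-Mandelbrot and ZSM models are omitted for brevity; in practice all candidate models would be fitted by maximum likelihood and compared with an information criterion.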
The authors collected samples of bulk and rhizosphere soil from soybean grown in agricultural fields in order to evaluate which microbial groups and functional genes are selected in the rhizosphere relative to the bulk soil. First, they showed that a selection process occurs in the rhizosphere, where species abundance fitted the log-normal distribution model, an indicator of niche-based processes. Niche theory predicts that changes in community composition are related to changes in environmental variables, since species have unique properties that allow them to exploit the unique niches available. Thus, root exudates may select the organisms that inhabit the rhizosphere environment.
With the sequencing data, the authors could also show which groups of organisms are selected in the rhizosphere and which functions they perform. They showed that a selection process operates at both taxonomic and functional levels in the assembly of the rhizosphere community, resulting in a community structure different from that of the bulk soil. The phyla Actinobacteria, Acidobacteria, Chloroflexi, Cyanobacteria, Chlamydiae, Tenericutes, Deferribacteres, Chlorobi, Verrucomicrobia and Aquificae were selected in the rhizosphere. In addition, the functional analysis indicated that functions related to the metabolism of nutrients such as nitrogen, phosphorus, potassium and iron were more abundant in the rhizosphere than in the bulk soil (Figure 4).
The phyla indicated in Figure 4 were selected in the rhizosphere and perform important functions related to the metabolism of nutrients that are important to the plant. Community selection in the rhizosphere is influenced by exudates released from the roots, which create different niches to be exploited by soil microorganisms. Root deposits consist mainly of carbon compounds and secondary metabolites such as antimicrobial compounds and flavonoids. Other soil parameters are also affected by the root system, such as pH, moisture, oxygen pressure and nutrient availability. In this study, the authors used a community assembly approach to understand the microbial selection process in the rhizosphere, and showed that soybean selects a specific rhizosphere microbial community based on functional traits, which may be related to benefits to the plant such as growth promotion and nutrition. Microbial community assembly in the rhizosphere largely follows niche-based mechanisms, showing that variations in the rhizosphere promoted by root exudates shape microbial community structure.
5. Concluding remarks
In this chapter, we have discussed the application of metacommunity theories and biogeography to land-use management and agroecosystem conservation. The contribution of these models to explaining the patterns of structure, abundance and functional traits at local and regional scales was emphasized. We set out some bases for the application of metacommunity models to community assembly and microbial functions in agroecosystems, drawing on recent results from our group and on several theoretical and experimental studies available in the literature.
Studies of microbial assembly and its linkage to functional resilience in agroecosystems are very important for microbial ecologists. Comparative studies in different agroecosystems and regions of the globe are needed to establish a broad conceptual view of the patterns of microbial distribution and ecological relationships. Based on such studies, we can assess soil quality and find global bioindicators of soil health, as well as endemic microbial populations of local and regional importance that, acting as niche holders, maintain ecosystem equilibrium and safeguard biodiversity.
Metacommunity and biogeography concepts emerge as important tools for evaluating bioindicators of soil quality and functional resilience, since both can be applied to a broad range of environments, from the microcosm scale up to the landscape or regional scale, independently of the type of soil, management or species under study.
We wish to thank Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP 2008/58114-3, 2011/51749-6) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq 485801/2011-6) for funding.