Construction and Analysis of Metagenome Library from Bacterial Community Associated with Toxic Dinoflagellate Alexandrium tamiyavanichii

Previous studies have suggested that a specific community of bacteria coexists within the phycosphere of marine dinoflagellates. To better understand dinoflagellate-bacteria relationships, a fosmid clone library was constructed from the metagenomic DNA and analyzed, and some of the fosmid clones were end-sequenced. A total of 1501 fosmid clones with insert sizes of 30–40 kbp were produced. End sequencing of 238 clones showed that 55% of the genes had known functions, 11% were of putative function and 34% were genes of unknown function or had no match in GenBank. Approximately 14% of the sequences had no classification and could potentially represent novel genes. Analysis of these partial sequences also revealed some promising enzymes with various potential industrial applications, such as chitinases, kinases, agarases and oxygenases. The results also showed that the bacterial flora of the Alexandrium tamiyavanichii culture was dominated by the Alpha-proteobacteria, followed by Bacteroidetes and Gamma-proteobacteria. The findings of this study suggest that the bacterial community may play various roles in its association with the dinoflagellate. This study also shows that the dinoflagellate-associated bacterial community is a valuable source for the discovery of novel genes and gene products.


Introduction
Microalgae are the major producers of biomass and organic compounds in the ocean. More than 5000 species of marine microalgae are known to date and are separated into six major divisions: Chlorophyta (green algae), Ochrophyta (yellow algae, golden brown and diatoms), Haptophyta (coccolithophorids), Pyrrhophyta (dinoflagellates), Euglenophyta and Cyanophyta (blue-green algae) [1]. Among the 5000 species about 300 can proliferate in high numbers to form the so-called red tide and brown tide phenomena [1,2].
Many planktonic organisms can form mass occurrences in the water column. When cell densities reach values considerably higher than their general background distribution, these occurrences are called blooms [3]. Some blooms are almost monospecific, whereas others are formed by a combination of species [4,5]. Many prominent blooms can be traced back to high nutrient loads [6,7,8], but they can occur whenever a species is able to outgrow its competitors while partially reducing grazer pressure [9].
Microscopic marine algae can be vectors of microbial communities because the two are universally associated in the ocean. In nature, most microbial communities are found adhered to microalgae, other organisms and inanimate surfaces. These interactions are dynamic and are important factors in microbial proliferation and survival. Aquatic algae, in situ as well as in laboratory culture, are often found to be associated with a variety of bacterial strains [10]. A bacterial community can be defined as a multi-species assemblage of bacteria in which organisms live together in a contiguous environment (host) and interact with each other [11]. Bacteria reproduce asexually, range in size from 0.1 to 20 μm, and can be rod-, coccus- or comma-shaped.
For marine microorganisms (bacterioplankton), there are approximately 10^6 bacterial cells per ml of surface seawater throughout the world's oceans [12]. While this number has been known for at least 30 years, how many bacterial species are actually present in the bacterioplankton is still unknown. Bacterioplankton commonly found in the marine environment belong mainly to the Proteobacteria, the Cytophaga-Flexibacter-Bacteroides (CFB) group, marine Archaea and other groups, of which the Proteobacteria is the largest. The Proteobacteria are divided into several classes: Alpha (α-), Beta (β-), Delta (δ-), Epsilon (ε-) and Gamma (γ-) Proteobacteria [13]. Up until now, estimates of the abundance and genetic diversity of bacterioplankton have been based on data in the GenBank database. Hagström et al. [14] analyzed all of the 16S rDNA sequences submitted to GenBank to estimate the number of marine bacterioplankton species represented in the database. Their study showed that the richness of marine bacterioplankton species in GenBank was relatively low.
The ecology of bacterioplankton and phytoplankton is widely recognized to be tightly coupled. Interactions between bacteria and phytoplankton such as dinoflagellates may play an important role in regulating dinoflagellate toxin production. Previous studies have shown the interactions between bacteria and dinoflagellates to be highly variable and sometimes specific. Effects of bacteria on toxic dinoflagellates include negative effects such as cell lysis and death [15] and positive effects such as growth enhancement when bacteria are added to cultures [16]. Factors through which bacteria may stimulate or inhibit dinoflagellates include the production of co-factors and the secretion of signaling molecules that control dinoflagellate cellular processes [17]. In addition, bacterial influences on nutrient availability may stimulate or inhibit the growth or toxin production of dinoflagellates. Toxin production has been documented both in dinoflagellates and in bacteria associated with toxic or non-toxic dinoflagellates. For example, Gallacher et al. [18] described evidence of paralytic shellfish toxin (PST) production by bacteria associated with dinoflagellate cultures.
Cultures of dinoflagellates contain a considerable number of bacteria, which probably accompanied the dinoflagellates in the original sample. The bacterial assemblage found in the phycosphere of dinoflagellates may play an important role in regulating dinoflagellate toxin production. While several studies have suggested that bacteria-phytoplankton interactions have the potential to dramatically influence harmful algal bloom dynamics, little is known about how bacterial and phytoplankton communities interact at the level of species composition. Other studies have indicated that within a phytoplankton bloom, α-Proteobacteria dominate the free-living bacterioplankton, while microorganisms attached to phytoplankton mainly belong to the CFB, γ-Proteobacteria and Planctomycetes groups [19,20].
At present, the precise association of bacteria with cultured dinoflagellates is still not well understood. Moreover, current estimates indicate that more than 99% of the microorganisms present in many natural environments are not readily culturable and are therefore not accessible for biotechnology or basic research [21]. Technology that accesses the genomic DNA or RNA of microorganisms directly from environmental samples, without prior cultivation, has opened new ways of understanding microbial diversity and function. The present study is thus an important step towards understanding bacteria-dinoflagellate interactions.
Metagenomics has become a powerful tool for investigating the biodiversity of complex microbial communities and for studying their metabolic pathways. It can be considered a revolutionary approach for studying microbial communities that are inaccessible to conventional methods, and it can capture the total genomes present in a community of interest. According to Schloss and Handelsman [22], metagenomics builds on advances in microbial genomics and in the polymerase chain reaction (PCR) amplification and cloning of genes. The field of metagenomics has played a pivotal role in the significant progress made in microbial ecology, evolution and diversity over the past years. This approach has allowed researchers to elucidate some possible mechanisms governing ecosystem function and diversity.

Methodology
Dinoflagellate Culture. A clonal culture of Alexandrium tamiyavanichii was obtained from UKM Microalgae Culture Collections, maintained in ES-DK medium [23] and grown under a 14:10 h light-dark cycle at 26°C in an incubator (model 2015, Shelab, USA).
DNA Extraction. Bulk genomic DNA was extracted directly from 2.0 L of dinoflagellate culture in mid-exponential growth phase [24]. First, the culture medium was filtered through a 0.2 μm cellulose nitrate membrane (Whatman, England). Cell pellets were then concentrated and resuspended in buffer (100 mM EDTA, 10 mM Tris-HCl [pH 8.0]) and treated with proteinase K (0.5 mg/mL) and 1% sodium dodecyl sulfate (SDS) for 1 h at 37°C. Lysates were further treated by CTAB extraction (0.5 M NaCl, 1% CTAB) for 10 min at 65°C. DNA was then extracted once with an equal volume of chloroform-isoamyl alcohol (24:1) and phenol-chloroform-isoamyl alcohol (25:24:1) and centrifuged at 21000 x g for 5 min at 4°C. After that, DNA was precipitated with 0.6 volume of isopropanol by centrifugation at 21000 x g for 15 min. The DNA pellet was then washed with one volume of 70% ethanol and centrifuged at 21000 x g for 10 min. Finally, the DNA was resuspended in 50 μL of ddH2O and stored at −20°C.
Synthetic Biology - New Interdisciplinary Science
Metagenomic Library Construction and End-sequences Analysis. Sheared DNA fragments ranging from 30 to 40 kbp were used to construct the metagenomic library. The library was constructed using the CopyControl pCC1FOS library construction kit (Epicentre, USA) following the manufacturer's protocol. Fosmids were then purified using the Millipore 96-well prep BAC purification kit (Millipore, USA) following the manufacturer's protocol and end-sequenced using the T7 universal primer (5′-TAATACGACTCACTATAGGG-3′). End-sequences were edited using the Staden Package software [25]. Low-quality DNA sequences were identified and trimmed using Pregap4. The resulting high-quality sequences were assembled into contigs using Gap4. The entire dataset was then analyzed by BLASTX [26]. Taxonomic analysis of the sequence matches was performed using MEGAN version 4.0 [27], and gene ontology analysis was carried out using the Blast2GO suite [28].
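The quality-trimming step of the end-sequence analysis can be illustrated with a minimal sketch. It is not Pregap4's algorithm: it uses a generic sliding-window mean on per-base quality scores, and the function name, the threshold (min_q) and the window size are assumptions chosen for illustration only.

```python
def trim_low_quality(seq, quals, min_q=20, window=10):
    """Clip low-quality ends from a read (generic sketch, not Pregap4).

    seq    : base calls as a string
    quals  : per-base quality scores (same length as seq)
    min_q  : hypothetical minimum mean quality for a window to be kept
    window : hypothetical window length in bases
    Returns the retained subsequence, or "" if no window passes.
    """
    if len(seq) != len(quals):
        raise ValueError("seq and quals must have the same length")
    n = len(seq)
    if n < window:
        return ""

    def window_ok(i):
        # Mean quality of the window starting at position i.
        return sum(quals[i:i + window]) / window >= min_q

    # Advance from the 5' end until a window passes the threshold.
    start = 0
    while start + window <= n and not window_ok(start):
        start += 1
    if start + window > n:
        return ""  # no acceptable window anywhere in the read

    # Retreat from the 3' end until the last window passes.
    end = n
    while end - window >= start and not window_ok(end - window):
        end -= 1
    return seq[start:end]


if __name__ == "__main__":
    read = "NNNNN" + "ACGT" * 5 + "NNNNN"
    scores = [5] * 5 + [30] * 20 + [5] * 5
    print(trim_low_quality(read, scores, min_q=30, window=5))
```

A stricter threshold removes more of each read end but leaves fewer bases for the BLASTX search, so the cutoff is a trade-off between sequence quality and usable length.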

Results and Discussion
About 4-6 x 10^30 microbes exist on Earth [29]. They form the foundation of the biosphere, regulate biogeochemical cycles and influence geology, hydrology, and local and global climate. Furthermore, microorganisms have the potential to produce products beneficial to humans, such as bioactive compounds, enzymes and polymers. Research on microbial interactions in the natural environment allows us to better understand complex global issues such as greenhouse gases and the biodegradation of harmful compounds, and enables us to discover new natural products such as antibiotics. However, it is estimated that 99% of microbes are "viable but nonculturable" [21,30]. Consequently, the function and role of the majority of microbes present in the natural environment are still not well understood, even though they are a valuable resource for biotechnology applications and new product discoveries. The development of metagenomic techniques has allowed us to study in depth the interactions and roles of microbial communities in the natural environment without the need for culturing [31].
The metagenomic approach to studying bacterial communities began more than 20 years ago. Since then, the analysis of bacterial communities using this technique has been widely reported. However, most metagenomic studies have been carried out on bacterial communities from seawater, sediment and freshwater samples. Some metagenomic studies on bacterial communities associated with other organisms have also been reported, such as those from marine sponges [32], beetles [33], polychaetes [34] and tubeworms [35]. However, metagenomic analyses of bacterial communities associated with harmful algal blooms (HAB) are still rarely reported.
A total of 1501 fosmid clones with insert sizes of 30 kbp to 40 kbp were selected for amplification. Sequences of 80 bp to 550 bp in length were obtained from 238 clones. BLASTX results showed that 23% of the sequences had no match with GenBank data (e-value > 10^-4), 11% were functionally unknown and 11% were putative (Figure 1); the remaining 55% matched genes of known function. Figure 2 shows the functional classification of the significant sequences. Most of the sequences could be functionally categorized into a metabolism cluster (37%). Approximately 14% of the sequences had no classification and could potentially represent novel genes. Analysis of these partial sequences also revealed some promising enzymes with various potential industrial applications, such as chitinases, kinases, agarases and oxygenases. The results also showed that the bacterial flora of the Alexandrium tamiyavanichii culture was dominated by the Alpha-proteobacteria, followed by Bacteroidetes and Gamma-proteobacteria (Figure 3). This is similar to the findings of Hold et al. [36] and Green et al. [37]. Alpha-proteobacteria is the largest group in the Proteobacteria clade, and many members of this taxon are as yet uncultivated. In this study, some of the partial sequences matched those of as yet uncultivated bacterial species. These results suggest that bacteria associated with dinoflagellates are a valuable source for metagenomic studies. Such studies could yield products useful for environmental monitoring, bioremediation and biodegradation [38].
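The way BLASTX results fall into the four categories reported here (no match, unknown function, putative, known function) can be sketched as follows. The text gives only the e-value cutoff of 10^-4; the keyword lists, function names and the rule of classifying by hit description are assumptions made for illustration and are not the authors' documented procedure.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-4  # cutoff reported in the text

# Hypothetical keyword lists; the source does not state how hits were labeled.
UNKNOWN_KEYWORDS = ("hypothetical", "unknown function", "uncharacterized")
PUTATIVE_KEYWORDS = ("putative", "probable", "predicted")


def classify_hit(description, e_value):
    """Assign one end-sequence to a category from its best BLASTX hit.

    description : text of the best hit, or None if BLASTX returned nothing
    e_value     : e-value of the best hit, or None if there was no hit
    """
    if description is None or e_value is None or e_value > E_VALUE_CUTOFF:
        return "no match"
    text = description.lower()
    if any(word in text for word in UNKNOWN_KEYWORDS):
        return "unknown function"
    if any(word in text for word in PUTATIVE_KEYWORDS):
        return "putative"
    return "known function"


def summarize(hits):
    """Return the percentage of sequences in each category."""
    counts = Counter(classify_hit(desc, e) for desc, e in hits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    # Made-up example hits, not data from the study.
    example_hits = [
        ("chitinase", 1e-30),
        ("putative kinase", 1e-10),
        ("hypothetical protein", 1e-8),
        (None, None),
        ("agarase", 0.5),
    ]
    for category, percent in summarize(example_hits).items():
        print(f"{category}: {percent:.0f}%")
```

Under this scheme, the 34% figure for genes that are of unknown function or have no match is the sum of the "no match" (23%) and "unknown function" (11%) categories.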
Fosmid end-sequencing was carried out to assess the diversity of gene reservoirs in the constructed metagenomic fosmid library. Analysis of the nucleotide sequences obtained from the 238 fosmid clones showed that a substantial share of the sequences (34%, including those with no GenBank match) are still functionally unknown; these may represent undiscovered proteins and could provide important new information or pathways if analyzed more deeply. The analysis of the fosmid end sequences also revealed that the majority of the sequences are similar to sequences from the Proteobacteria phylum. Other studies have also shown that the microflora of the marine dinoflagellate phycosphere is dominated by the Proteobacteria phylum [36,37,39]. We also found that part of the sequences matched genes derived from the Roseobacter-Sulfitobacter-Silicibacter clade. Many Roseobacter species have been shown to utilize dimethylsulfoniopropionate (DMSP) as both a carbon source and a sulfur source, and it is likely that DMSP metabolism is important in Roseobacter-phytoplankton interactions [40]. Analysis of these partial sequences also revealed some genes that might be important in bacteria-algal interactions. One of the contigs was similar to the response regulator of the LuxR family protein from Roseovarius sp. This protein is known to be responsible for a variety of biological processes in the natural environment, including quorum sensing and the production of toxins [41]. In a complex community such as that found during an algal bloom, this protein may play a significant role in determining population structure and function through signaling or by inducing the production of certain proteins [42]. Thus, it is believed that bacteria use this type of protein to adapt to changing conditions around the dinoflagellate phycosphere, such as changes in nutrients, cell densities and increasing concentrations of paralytic shellfish poisoning (PSP) toxins.
The end-sequences obtained cannot be used to describe the metabolic activity of each bacterial taxon involved, but the analysis of the nucleotide sequences has shown that the constructed metagenomic library has great potential as a source for studying the physiology and function of the bacterial community involved.

Conclusion
The advantage of the metagenomic method is that it allows us to study the genomes of a bacterial community directly from the natural environment, so that the function and role of certain bacteria in that environment can be determined. Studies have shown that metagenomics is a very useful technology for finding genes that can be applied in the industrial, biotechnology, pharmaceutical and medical fields, such as those encoding esterases, lipases, agarases, polymerases, polyketide synthases and chitinases. Some potentially uncultured microbes and new genes were discovered through this study. In this study, a metagenomic library using a fosmid vector was constructed from the bacterial community associated with A. tamiyavanichii. A total of 1501 fosmid clones with inserts ranging from 30 to 40 kbp were obtained, which is equivalent to approximately 13 bacterial genomes. Finally, to our knowledge, the metagenomic library in this study is the first to be constructed from the bacterial community associated with a toxic marine dinoflagellate. This library can be used as a major source for finding new genes or biosynthetic pathways and for studying the interactions of dinoflagellates in greater depth, especially with regard to the production of saxitoxins, as there is a hypothesis that the toxins produced by dinoflagellates are derived from bacteria.
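The estimate of about 13 genome equivalents can be checked with simple arithmetic. The sketch below assumes an average bacterial genome size of 4 Mbp, a value that is not stated in the text and is used here only because it is consistent with the reported figure; the function name is likewise illustrative.

```python
def genome_equivalents(n_clones, insert_kbp, genome_mbp):
    """Total cloned DNA expressed as a number of average-sized genomes.

    n_clones   : number of fosmid clones in the library
    insert_kbp : average insert size in kbp
    genome_mbp : assumed average bacterial genome size in Mbp
    """
    total_mbp = n_clones * insert_kbp / 1000.0  # kbp -> Mbp
    return total_mbp / genome_mbp


if __name__ == "__main__":
    N_CLONES = 1501            # from the text
    ASSUMED_GENOME_MBP = 4.0   # assumption, not given in the source
    # Lower bound, midpoint and upper bound of the reported insert range.
    for insert_kbp in (30, 35, 40):
        n = genome_equivalents(N_CLONES, insert_kbp, ASSUMED_GENOME_MBP)
        print(f"{insert_kbp} kbp inserts: {n:.1f} genome equivalents")
```

With the midpoint insert size of 35 kbp, the library holds about 52.5 Mbp of cloned DNA, so 13 genome equivalents corresponds to an average genome of roughly 4 Mbp. This figure describes the total amount of cloned DNA and does not imply that 13 complete genomes are present in the library.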