Genome-Wide Identification of Estrogen Receptor Alpha Regulated miRNAs Using Transcription Factor Binding Data


and "tumour suppressor miRNAs" are inappropriately expressed in cancers. However, our understanding of the TFs or chromatin modifications responsible for governing the expression levels of these essential miRNAs remains limited. At the transcriptional level, gene expression is governed by interactions among TFs and cis-elements such as promoters and enhancers. Chromatin immunoprecipitation (ChIP) experiments detect specific protein-DNA interactions in a given cell type and are regarded as a major tool for investigating interactions between TFs and their binding sites. By pairing ChIP with DNA microarray or high-throughput sequencing technologies (ChIP-chip and ChIP-seq), genome-wide maps of TF binding sites can now be readily produced. Many groups have used ChIP-chip and ChIP-seq assays to globally study direct targets of TFs and have provided significant insights into gene regulation networks (Farnham 2009). Together with mRNA-based expression microarrays, vast amounts of data are publicly available for bioinformatic analysis. Indeed, gene expression network analysis (or systems biology) is gaining popularity as a way to uncover the underlying physiological regulation and interpret the biological meaning of these networks. In principle, one can use the genome-wide binding map of a specific TF (or a chromatin-modifying factor) to search for its putative target miRNAs, i.e. locate putative binding sites inside miRNA regulatory regions (such as promoters or enhancers) according to genomic coordinates. In the following section, we present a procedure which uses published ChIP-chip data to predict candidate miRNAs regulated by a specific TF. Specifically, based on one genome-wide estrogen receptor (ER) binding map, we found 59 miRNA regulatory regions in which there is at least one ER binding site. Several putative ER-regulated oncogenic and tumour suppressor miRNAs were further confirmed in a breast cancer cell model.

Prediction of ER-regulated miRNAs
Access to and analysis of the genomic sequence and functional annotations were based on the UCSC Genome Browser (Rhead et al. 2010) and the Galaxy platform (Goecks, Nekrutenko, and Taylor 2010). The UCSC Genome Browser is a web tool for conveniently displaying and accessing genome sequences together with rich annotation tracks. The Galaxy platform is an interactive system that combines multiple existing genome resources via a simple web portal. Users can manipulate remote resources and perform flexible operations such as intersections, unions, and subtractions. Currently, 718 miRNAs have been annotated in the human genome (hg18). Their regulatory regions (50 kb upstream of the pre-miRNAs) were collected from the UCSC Genome Browser. The original ChIP-chip data were produced by Carroll et al. (Carroll et al. 2006). In total, 3,665 high-confidence estrogen receptor binding regions were used in this study. Since the original coordinates were annotated in hg17, the web-based liftOver utility with default settings was used to convert the genomic coordinates to the hg18 version of the human genome (http://genome.ucsc.edu/cgi-bin/hgLiftOver). Overlapping regions were identified in the Galaxy platform.
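The overlap step described above can be illustrated with a minimal Python sketch. It is not the Galaxy workflow itself: the function names, the BED-style 0-based half-open coordinate convention, and all coordinates and identifiers in the example are hypothetical and chosen only for illustration. The sketch builds a 50 kb strand-aware upstream window for each pre-miRNA and reports every binding region that intersects it.

```python
from collections import namedtuple

# A genomic interval in 0-based, half-open coordinates (assumed convention).
Interval = namedtuple("Interval", "chrom start end name")

UPSTREAM = 50_000  # 50 kb upstream window, as stated in the text


def upstream_region(chrom, start, end, strand, name, size=UPSTREAM):
    """Return the window of `size` bp upstream of a pre-miRNA, respecting strand."""
    if strand == "+":
        return Interval(chrom, max(0, start - size), start, name)
    # On the minus strand, "upstream" lies at higher coordinates.
    return Interval(chrom, end, end + size, name)


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_overlaps(regions, sites):
    """Map each regulatory region name to the binding sites overlapping it."""
    hits = {}
    for region in regions:
        for site in sites:
            if overlaps(region, site):
                hits.setdefault(region.name, []).append(site.name)
    return hits


# Hypothetical pre-miRNA annotations: (chrom, start, end, strand, name).
mirnas = [
    ("chr1", 100_000, 100_080, "+", "mir-A"),
    ("chr1", 300_000, 300_090, "-", "mir-B"),
]
# Hypothetical binding regions (already converted to the same assembly).
sites = [
    Interval("chr1", 60_000, 60_500, "site1"),
    Interval("chr1", 310_000, 310_400, "site2"),
    Interval("chr2", 60_000, 60_500, "site3"),
]

regions = [upstream_region(*m) for m in mirnas]
hits = find_overlaps(regions, sites)
print(hits)
```

The nested loop is quadratic and is only adequate for small inputs; for a few thousand binding regions and a few hundred miRNAs it is still fast, but sorted or indexed intervals would be preferable at larger scale.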

Motif analysis
The ER binding regions which overlapped with miRNA regulatory sequences were downloaded from the UCSC Genome Browser and further analyzed with TOUCAN 2, a widely used regulatory sequence analysis suite (Aerts et al. 2005). TOUCAN 2 screens the input sequences against a precompiled library of motifs to find statistically over-represented motifs. TRANSFAC is a database of eukaryotic transcription factors and transcription-regulating DNA sequence elements (Matys et al. 2006). Position weight matrices (PWMs) were obtained from the TRANSFAC 7.0 database. The Eukaryotic Promoter Database (EPD) is a collection of annotated eukaryotic RNA polymerase II promoter sequences around experimentally determined transcription start sites (TSSs) (Schmid et al. 2006). The human promoter sequences (-499 to +100 relative to the TSS) from EPD were used as background sequences. A 0th-order Markov model with a prior of 0.1 was chosen to compute both the background and the actual sequence frequencies. The p-value and significance value indicate the probability that the observed over-representation of the motif is achieved by random selection for a single or multiple TFs, respectively.
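To make the scoring idea concrete, the following Python sketch shows how a PWM can be scored against a sequence using a 0th-order Markov background (independent single-nucleotide frequencies) and a pseudocount of 0.1. It is an illustration under stated assumptions, not TOUCAN 2's implementation: how TOUCAN applies the prior internally is not described here, and the count matrix, background sequences and function names below are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"


def background_freqs(seqs, prior=0.1):
    """0th-order background: single-nucleotide frequencies with a pseudocount."""
    counts = Counter()
    for s in seqs:
        counts.update(b for b in s.upper() if b in BASES)
    total = sum(counts[b] for b in BASES) + prior * len(BASES)
    return {b: (counts[b] + prior) / total for b in BASES}


def log_odds_matrix(count_matrix, bg, prior=0.1):
    """Convert per-position base counts into log2 odds against the background."""
    pwm = []
    for col in count_matrix:
        total = sum(col[b] for b in BASES) + prior * len(BASES)
        pwm.append({b: math.log2(((col[b] + prior) / total) / bg[b]) for b in BASES})
    return pwm


def best_hit(seq, pwm):
    """Return (score, position) of the highest-scoring window (ACGT-only input)."""
    w = len(pwm)
    scored = [
        (sum(pwm[i][seq[p + i]] for i in range(w)), p)
        for p in range(len(seq) - w + 1)
    ]
    return max(scored, key=lambda t: t[0])


# Hypothetical count matrix for the half-site AGGTCA (10 sites, no variation).
consensus = "AGGTCA"
count_matrix = [{b: (10 if b == c else 0) for b in BASES} for c in consensus]

# Hypothetical background sequences with uniform base composition.
bg = background_freqs(["ACGTACGT", "AACCGGTT"])
pwm = log_odds_matrix(count_matrix, bg)
score, pos = best_hit("TTAGGTCATT", pwm)
print(round(score, 2), pos)
```

The pseudocount keeps unseen bases from producing a log of zero, which is the practical reason a small prior is added to the observed counts.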

Cell culture, cell counting and qRT-PCR
The MCF-7 cell line was obtained from the American Type Culture Collection (Manassas, VA). Cells were grown in phenol red-free D-MEM supplemented with 0.5% charcoal-stripped FBS for 3 days. The estrogen-deprived MCF-7 cells were treated with 10 nM 17β-estradiol (E2, Sigma-Aldrich Co.) or DMSO as a control. At the indicated time points, cells were rinsed with PBS and counted manually under the microscope. Total RNA was extracted with Trizol reagent (Invitrogen). Reverse transcription of mature miRNAs and quantitative real-time PCR analysis were performed as previously reported (Xu, Liao, and Wong 2010). Primers specific for the indicated miRNAs are available upon request.

Identifying putative miRNAs regulated by ERs
Estrogen receptor alpha (ERα) and beta (ERβ) are members of the nuclear receptor superfamily of ligand-regulated transcription factors. Estrogen (17β-estradiol, E2) is a potent ligand for both ERs. ERs either interact directly with cis-regulatory elements of target genes by binding to estrogen-response elements (EREs) or are tethered indirectly through transcription factors such as AP1 and SP1 (Ali and Coombes 2002). Through transcriptional control of a large number of target genes, ERs regulate a wide variety of cellular processes including development and differentiation (Deroo and Korach 2006). In particular, ER is thought to be involved in the progression of breast cancer. Depending on the status of ER expression, breast cancer is classified into ER+ and ER- subtypes. Different anti-hormone treatments are prescribed in conjunction with anti-cancer drugs to manage these breast cancer subtypes. Therefore, understanding how ER affects the expression of oncogenic and tumour suppressor miRNAs may assist the development of better anti-cancer therapies. Carroll et al. previously used ChIP-chip technology to analyze ER binding regions genome-wide. They found that the majority of ER binding regions are located outside of the classical promoter-proximal regions, suggesting distal regulation by ER (Carroll et al. 2006). To avoid the bias that would arise from focusing only on promoter-proximal regions, we regarded the 50 kb upstream of each known pre-miRNA as a possible regulatory region. In addition, a wider candidate region is beneficial at this exploratory step. In total, 59 miRNA regulatory regions overlapped with 65 of the ER binding regions reported by Carroll et al. (Table 1). As shown in Figure 1, there are three representative patterns of ER binding regions relative to specific miRNAs. For example, the promoter of hsa-miR-342 contains both promoter-proximal and distal ER binding regions, whereas the promoter of hsa-miR-21 is characterized by two proximal ER binding regions.
In contrast, there is only one distal ER binding region within the upstream regulatory region of the miR-143~145 cluster. We then used TOUCAN 2 to test whether any TF binding sites are over-represented in these miRNA-related ER-binding regions. The top 10 significant binding motifs are listed in Table 2. Not surprisingly, we identified the consensus ERE (AGGTCANNNTGACC) as the most common TF binding motif present in these miRNA regulatory regions bound by ER.
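As a simple illustration of what an exact match to the consensus ERE quoted above looks like, the Python sketch below scans a sequence on both strands with a regular expression. This is a strict consensus search, not the PWM-based scan performed by TOUCAN 2, so it will miss the degenerate sites a matrix would score; the function names and the example sequence are hypothetical.

```python
import re

ERE = "AGGTCANNNTGACC"  # consensus ERE as given in the text

# Only the symbols needed for this consensus; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a sequence over A, C, G, T, N."""
    return seq.translate(COMPLEMENT)[::-1]


def to_regex(consensus):
    """Compile a consensus into a lookahead regex so overlapping hits are found."""
    body = "".join(IUPAC[c] for c in consensus)
    return re.compile("(?=(" + body + "))")


def scan(seq, consensus=ERE):
    """Return sorted (start, strand, matched forward-strand text) tuples."""
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", consensus), ("-", revcomp(consensus))):
        for m in to_regex(motif).finditer(seq):
            hits.append((m.start(), strand, m.group(1)))
    return sorted(hits)


# Hypothetical sequence containing one ERE-like site.
for hit in scan("TTAGGTCACAGTGACCTTT"):
    print(hit)
```

Because the 14-nt consensus is one base short of a perfect palindrome, its reverse complement (GGTCANNNTGACCT) is a different string, which is why the minus strand is scanned separately.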
In addition, we observed enrichment of activator protein 1 (AP-1) and forkhead (FKH) motifs among the miRNA-related ER-binding regions. The AP-1 family consists of proteins belonging to the JUN, FOS and ATF subfamilies. These subunits can hetero-dimerize and bind to the DNA of their target genes. The AP-1 complex modulates a variety of cellular processes in response to environmental stimuli. In particular, the AP-1 complex is an important regulator of tumour development since its target genes are involved in oncogenic transformation, tumour suppression, invasive growth and angiogenesis (Wagner 2001; Jochum, Passegue, and Wagner 2001). FKH proteins are a super-family of transcription factors that participate in regulating the expression of genes involved in cell growth, proliferation and differentiation. Many FKH proteins are important for embryonic development, glucose homeostasis, tumourigenesis and even vocal learning (Hannenhalli and Kaestner 2009). In a previous analysis of mRNA targets, these two binding motifs were also shown to be enriched in ER binding regions, suggesting a role in the transcription of ER-regulated mRNAs (Carroll et al. 2006). Our findings further imply that AP-1 and forkhead family members cooperate with ER to regulate ER-responsive miRNAs in a combinatorial fashion. In our results, the p53 motif is the fourth most significantly enriched binding site in ER binding regions. p53 is an essential tumour suppressor: mutations or aberrations in the expression of the p53 gene are frequently observed in a variety of cancer cell lines and clinical tumour samples (Nigro et al. 1989). Liu et al. also found that ER can bind directly to p53 and repress its target genes.
This important finding has profound translational implications because the same group of investigators recently demonstrated that (1) ionizing radiation disrupts the ERα-p53 interaction in breast tumours and leads to functional p53 restoration in breast tumours subjected to radiation therapy, which points to a novel mechanism underlying the anti-tumour effect of radiation therapy; and (2) the presence of wild-type p53 is an important determinant of responsiveness to anti-estrogen therapy, since anti-estrogens can reactivate p53 by disrupting the ERα-p53 interaction, after which p53 activates many tumour suppressor genes (Konduri et al. 2010). By analogy with the regulation of mRNA targets, we therefore hypothesize that the ERα-p53 interaction may also be involved in modulating the transcription of "oncogenic miRNAs" and "tumour suppressor miRNAs", although the exact mechanism needs further analysis. Besides acting as co-regulators of ERα, the enriched TFs may also interact with each other. For example, cross-talk between glucocorticoid receptor (GR) and AP-1 has been well established (Herrlich 2001). In our results, both GR and AP-1 motifs are enriched in the binding regions; whether such interactions are involved in the regulation of miRNA targets warrants further investigation. In the original analysis of binding sites in mRNA promoters, Carroll et al. found a strong correlation among ERα, Forkhead, Oct, AP-1 and C/EBP (Carroll et al. 2006). In our analysis of miRNA promoter regions, we did not find significant enrichment for Oct and C/EBP. Interestingly, however, several novel TF binding sites (v-Maf, Meis-1, p53, GR-, RORα1, Hand1) are over-represented in miRNA-related ER-binding regions. This observation perhaps reflects both similar (in the case of the common TFs, i.e. ERα, FKH and AP-1) and distinct modes of ER modulation in miRNA and mRNA gene regulation.
Table 2. Top 10 enriched motifs in the miRNA-related ER-binding sites.
N: number of times the TF site appears in the input sequences. Note that a TF binding site may appear more than once in one sequence. P-value: probability of finding N or more occurrences in the input sequences by chance. Sig value: a significance coefficient used to select the most over-represented patterns among the distinct motifs. When only one TF site is analyzed, a motif with a P-value smaller than 0.05 can be considered over-represented. When multiple TF sites are analyzed, the sig value is used to select the significant results. Generally, positive sig values indicate significance.

Confirmation of ER-regulated miRNA in a breast cancer cell model
MCF-7 is a well-established ER+ cell line that reflects hormone-dependent breast cancer; namely, E2 increases MCF-7 cell proliferation. In our hands, the cell number increased significantly after four days of treatment with 10 nM E2 compared to the DMSO control, and a late-phase increase in cell number was observed starting on day 9 (Figure 2a). Among the predicted E2-regulated miRNAs, we randomly selected 8 miRNAs and used qRT-PCR to detect the time-dependent changes in their expression levels during cell proliferation. Compared to the DMSO control, miR-342, miR-21, miR-422a, miR-124, and miR-181c were generally found to be up-regulated by E2 treatment, whereas miR-143, miR-145, and miR-483 were down-regulated (Figure 2b and 2c), suggesting that they are under the respective influences of positive and negative EREs. Intriguingly, the down-regulated miRNAs exhibit wave patterns of expression, i.e., they are significantly suppressed on day 4, restored to differing levels on day 7, and then undergo another round of suppression and partial rebound. Other than miR-124, which displays a wave pattern of induction, the E2-induced miRNAs show a gradual pattern of induction. The determinants and regulatory networks that dictate these patterns of expression await comprehensive investigation. Of the up-regulated miRNAs, miR-342 was induced to the highest extent by E2. MiR-342 is encoded in an intron of the gene EVL and is commonly suppressed in human colorectal cancer (Grady et al. 2008). Over-expression of miR-342 in the colorectal cancer cell line HT-29 induced apoptosis, pointing towards a pro-apoptotic tumour suppressor function (Grady et al. 2008). On the other hand, the pattern of miR-342 expression in breast tumours is more complicated, with the highest level in ER and HER2/neu-positive luminal B tumours but the lowest level in ER, PR and HER2/neu triple-negative tumours (Lowery et al. 2009).
Adding to the uncertainty regarding its role, miR-342 is down-regulated in tamoxifen-resistant MCF-7 cells (Miller et al. 2008). Consistent with these findings, Cittelly et al. compared miRNA expression profiles between MCF-7/pcDNA (tamoxifen-sensitive) and MCF-7/HER2Δ16 (tamoxifen-resistant) cells when both cell lines were treated for 24 hr with 100 pM 17β-estradiol (E2) and 1 μM 4-hydroxytamoxifen (TAM). They found that miR-342 was the most dramatically down-regulated miRNA in the tamoxifen-resistant MCF-7/HER2Δ16 cells. They further showed that other tamoxifen-resistant cell lines, such as TAMR1 and LCC2, all exhibited dramatically suppressed levels of miR-342, whereas another tamoxifen-sensitive cell line, MCF-7/HER2, also expressed high levels of miR-342, indicating that loss of miR-342 is a common feature of tamoxifen resistance (Cittelly et al. 2010). The expression level of miR-21 was previously found to be significantly changed in various cancers; in particular, it is higher in ER+ than in ER- breast tumours (Mattie et al. 2006; Volinia et al. 2006). However, inconsistent results have been reported regarding the effect of E2 on miR-21 expression in MCF-7. Wickramasinghe et al. reported that E2 inhibited miR-21 expression after 6 hr (Wickramasinghe et al. 2009). In contrast, another group found that miR-21 was induced after a 4 hr E2 treatment (Bhat-Nakshatri et al. 2009). In our investigation, we found that miR-21 was initially repressed after 4 days on E2. However, miR-21 was up-regulated by E2 upon long-term treatment. MiR-21 is thought to be an oncogenic miRNA (oncomiR), and several confirmed endogenous targets such as PDCD-4 and PTEN are important tumour suppressors (Asangani et al. 2008; Folini et al. 2010; Meng et al. 2007). Consistent with its proposed role as an oncomiR, our results showed that miR-21 expression progressively increased from day 7 to day 12 in parallel with the late-phase increase in cell number.
Fig. 3. ER, AP-1 and p53 binding motifs relative to miR-342, miR-21 and miR-143~145. The red, blue and pink boxes represent ER, AP-1 and p53 binding sites, respectively. ER_2920 and ER_2921 are located in the miR-342 upstream region; ER_3188 and ER_3189 are located in the miR-21 regulatory region; ER_1259 is located in the miR-21 upstream region (see Figure 1).
MiR-143 and miR-145 are clustered miRNAs whose expression levels are co-ordinately down-regulated in multiple forms of cancer (Akao et al. 2007; Michael et al. 2003; Sevignani et al. 2007; Wang et al. 2008). They can function as important tumour suppressors by targeting multiple key genes in apoptosis, proliferation, and metastasis signalling pathways (Chen et al. 2009; Chiyomaru et al. 2010; Sevignani et al. 2007; Zaman et al. 2010). However, whether miR-143 and miR-145 are regulated by E2 in MCF-7 breast cancer cells is unclear. In this study, we found that both were repressed by E2 upon long-term treatment. Importantly, we also observed that ectopic expression of miR-145 repressed MCF-7 cell proliferation (data not shown). These observations are consistent with previous studies in other cancers, which indicate that miR-145 is repressed in cancers compared to the normal control. Analyzing the miR-342, miR-21 and miR-143~145 regulatory regions, we found AP-1 binding motifs in all of them, supporting the role of AP-1 as a basal activator (Figure 3) (Wagner 2001). In addition, there are both ER and p53 motifs in the miR-342 upstream region; miR-342 may therefore be a dual target of these two TFs, and its expression perhaps depends on both the integrity of the estrogen signalling pathway and the status of p53. Estrogen-response elements were detected in both the miR-342 and miR-143 promoter regions, indicating direct estrogen receptor binding. However, no ERE is present in the miR-21 upstream regulatory region, and it is possible that transcriptional activation of miR-21 is mediated via estrogen receptor tethered to AP-1 motifs.
Apart from miR-342, miR-21, miR-143, and miR-145, little is known about the roles of the other miRNAs in breast cancer development. Since our analysis has already implicated several oncogenic and tumour suppressor miRNAs as being regulated by ER, we believe that this strategy can provide promising miRNA candidates for additional functional exploration.

Discussion
Understanding gene regulation is crucial to elucidating the mechanisms of development, differentiation and signaling response. Over the past three decades, advances in technologies such as genomic sequencing and expression profiling by microarray have paved the way for more thorough investigations into gene regulatory networks. These advances also necessitated the development of bioinformatics. Namely, analytical tools and methods are continually being invented for processing the vast amount of information generated and mining the corresponding datasets; hence, new discoveries are made and novel concepts are developed for hypothesis building and testing. Nonetheless, the accumulation of datasets sometimes outpaces the development of bioinformatics, and a certain amount of valuable information is left un-mined. In this chapter, we present a case of utilizing established bioinformatics tools to learn more about gene regulation networks based on published transcription factor binding datasets. We used a previously published ER ChIP-chip dataset to find a set of putative ER-regulated miRNAs. This concept and method can be extended in several ways. Firstly, several ChIP-based techniques, such as ChIP-PET (paired-end tag) and ChIP-DSL (DNA selection and ligation), have been developed to map TF binding sites. The genome-wide TF binding sites generated from these variations of ChIP-chip techniques could also be used to map miRNA promoters. Secondly, our methods can be extended to other nuclear hormone receptors (NHRs) and TFs, provided that the corresponding genomic coordinates of TF binding are available. Importantly, the specificity of TF binding sites could be investigated by comparing different but related TF binding data. For example, by comparing ER and estrogen-related receptor (ERR) binding data in the breast cancer cell line MCF-7, Deblois et al. recently showed that ERR and ER display strict binding site specificity, while a small number of binding sites are shared by both transcription factors (Deblois et al. 2009).
Another prominent feature of this versatile procedure lies in its easy application and low cost. In recent years, ChIP-based techniques have become popular assays for studying direct targets of TFs genome-wide. For example, many NHR binding maps have been published (Deblois and Giguere 2008). Surprisingly, few miRNAs regulated by a specific NHR have been mined from these valuable datasets. The miRNAs directly targeted by a specific NHR or TF can be readily discovered through our procedure if the genome-wide binding sites for this TF have been produced by others. Thus, it avoids redundant experiments and greatly facilitates rapid discovery. MiRNA microarrays are commonly used to identify miRNA expression changes upon a specific treatment. However, there are some limitations inherent in the microarray platform. For instance, microarray data usually contain a mixture of primary, secondary, and even tertiary gene expression changes, making it difficult to dissect which TFs are responsible for these different levels of regulation. Our procedure directly links the candidate TFs with putative target miRNAs by analyzing ChIP-chip and ChIP-seq binding data. Uniquely, our analysis also allows investigation into the relationships between mRNAs and miRNAs co-ordinately regulated by a specific TF in a given cell type upon a particular treatment, providing a set of information not revealed by microarray analysis alone. However, it should be noted that not all regulatory regions are included in the original design of the ChIP-chip platform. Thus, our analysis can only provide a partial picture that depends on the completeness of the ChIP-chip design. As more comprehensive technologies such as ChIP-seq come into use, genomic coverage will be significantly improved. In addition, TF binding sites may be located outside the 50 kb upstream regulatory region defined in our analysis. Therefore, it is best to complement ChIP data analysis with microarray studies to obtain comprehensive information on TF and miRNA regulation networks.

Conclusion
Understanding the relationships between transcription factors and their target mRNAs is greatly facilitated by genome-wide analysis based on the pairing of chromatin immunoprecipitation with DNA microarray. However, few miRNAs regulated by transcription factors have been mined from these data. Our bioinformatics procedure efficiently utilizes genome-wide binding data to screen the upstream regulatory regions of all human miRNAs and identify miRNA targets modulated by a specific transcription factor.
As an example, we predicted 59 putative estrogen-responsive miRNAs based on a published genome-wide ER binding dataset. Several ER-regulated miRNAs were further confirmed in a breast cancer cell model. Among these, miR-342, miR-21, miR-422a, miR-124, and miR-181c were generally found to be up-regulated by estrogen treatment, whereas miR-143, miR-145, and miR-483 were down-regulated. This example demonstrates the power and efficiency of the analysis method. Furthermore, it indicates that the miRNA targets of a specific TF can be detected from ChIP-chip based binding data, which are usually produced for identifying mRNA targets. Integrating our method with routine analysis procedures will give a fuller picture of the gene regulation network by simultaneously elucidating the miRNA and mRNA targets of a specific TF.

Acknowledgment
We are indebted to Ms. Xuemei Liao for assistance with graphic preparation. The research is supported by the science foundation of the education department of Henan province (Grant No. 2011A180009) and a start-up grant from Henan University of Technology (#2009BS040).