
"Bioinformatics", book edited by Horacio Pérez-Sánchez, ISBN 978-953-51-0878-8, Published: November 28, 2012 under CC BY 3.0 license. © The Author(s).

Chapter 12

Novel microRNA Cloning Using Bioinformatics

By Yoshiaki Mizuguchi, Takuya Mishima, Eiji Uchida and Toshihiro Takizawa
DOI: 10.5772/53803


Yoshiaki Mizuguchi1, Eiji Uchida1, Takuya Mishima2 and Toshihiro Takizawa2

1. Introduction

MicroRNAs (miRNAs) participate in several biological processes, including development, differentiation, apoptosis, and proliferation (1, 2), through imperfect pairing with target messenger RNAs (mRNAs) of protein-coding genes and transcriptional or post-transcriptional regulation of their expression (3, 4). Approaches to miRNA detection, such as parallel sequencing technologies, may replace conventional sequencing (5). The GS 454 technology can produce a similar number of longer (100–150 nucleotides (nt)) sequence reads in a single analysis run, with the advantage that the complete sequence of the mature miRNA can be derived. Moreover, recent miRNA profiling studies performed with cloning techniques suggest that sequencing methods are suitable for detecting novel miRNAs, modifications, and precise compositions, and that cloning frequencies calculated by clone count analysis are reproducible and correlate strongly with the concentrations measured by Northern blotting. Comprehensive profiling of miRNAs in human diseases requires exhaustive qualitative and quantitative analyses. Here we describe the techniques and some of the results of sequencing the miRNA transcriptomes of the liver. This serves as a critical step in clarifying the functional significance of specific miRNAs as they relate to liver diseases.

2. Techniques of MicroRNA cloning and bioinformatics for MiRNAome

2.1. MicroRNA cloning

The method for microRNA cloning and sequencing, which we modified from the original protocols, is shown in Fig. 1. We cloned small RNAs by a modification of the published miRNA cloning protocol of Lagos-Quintana et al. (6). In brief, total RNA samples were extracted using ISOGEN (Nippon Gene, Tokyo, Japan) and separated in a denaturing polyacrylamide gel, and the 18–24 nt fraction was recovered. Next, DNA/RNA chimeric linkers were ligated to both termini of the small RNAs [3′ linker oligonucleotide (5′-/5Phos/rCrUrGrUAGGCACCATCAAT-di-deoxyC-3′) and 5′ linker oligonucleotide (5′-ATCGTrArGrGrCrArCrCrUrGrArArA-3′)], and RT-PCR was carried out.


Figure 1.

Overview of the miRNA cloning. See paper body for details.

The cDNA fragments were amplified by two consecutive rounds of PCR. Specific restriction enzyme digestion of the adaptors allowed for concatemerization of the cDNA into larger fragments. These fragments were then cloned into a vector to create a cDNA library. Concatemerization increases the length of informative sequence obtainable from each clone. We concatenated more than 20 cDNAs into a single fragment using the BanI restriction enzyme (New England Biolabs, Ipswich, MA, USA), a DNA ligation kit ver. 2.1 (Takara Bio, Shiga, Japan), and a Geneclean III kit (Qbiogene, Irvine, CA, USA) prior to TA cloning. The concatenated products were then inserted into plasmids and sequenced (Fig. 1).

The sequences were compared to human DNA to determine the genomic origin of the small RNAs. It was important to avoid contamination from other samples and from molecular-weight markers during electrophoresis, because such contaminants considerably diminished the accuracy and efficiency of miRNA cloning. We avoided contamination by performing the cloning procedure separately for each sample, by using a special gel with a small plastic rod that divided the sample and marker lanes, and by using a separate vat for each gel during ethidium bromide staining. We made small RNA libraries by excising the portion of the polyacrylamide gel containing species 18–24 nt in length, to avoid contaminating our purified RNAs with piRNAs (7).

2.2. Bioinformatics analysis of the sequence data

We performed a homology search for all cloned small RNAs and a secondary structural analysis for all novel miRNA candidates.

2.2.1. Comparing the cloned sequences with those of known RNAs

The vector sequence, the 5′ and 3′ linkers, and their coupled sequences (CTGTAGGCACCTGAAA) were removed. The extracted sequences of 16–30 nt were defined as valid small RNAs and were subjected to the following analyses. The small RNA sequences were analyzed for homology with known RNAs, including miRNA, piwi-interacting (pi) RNA, rRNA, tRNA, small nuclear (sn) RNA, small nucleolar (sno) RNA, and mRNA, and with human genomic DNA sequences. The databases used were: miRNA (mature and precursor), the Sanger Database; piRNA, the NCBI Entrez Nucleotide database; rRNA, the European ribosomal RNA database; tRNA, the Genomic tRNA database; sn/snoRNA, RNAdb and NONCODE; mRNA, NCBI Reference Sequence; and human genomic sequences, the UCSC Genome Bioinformatics Site. In our searches, cloned sequences with higher than 90% homology were defined as valid if they met our criteria for sequence error, erroneous PCR amplification, and 3′- and 5′-end variations. Clones that had 100% homology with human genomic DNA but did not match known RNAs in the above databases were termed novel miRNA candidates and were subjected to further analysis. The results are shown in Table 1.

| Reads (%) | HCC | ANL | Sum |
|---|---|---|---|
| miRNA | 256,649 (81.6) | 208,038 (77.4) | 464,687 (79.7) |
| piRNA | 2,983 (0.9) | 1,440 (0.5) | 4,423 (0.8) |
| rRNA | 5,474 (1.7) | 10,161 (3.8) | 15,635 (2.7) |
| tRNA | 1,703 (0.5) | 621 (0.2) | 2,324 (0.4) |
| snRNA | 700 (0.2) | 343 (0.1) | 1,043 (0.2) |
| snoRNA | 654 (0.2) | 747 (0.3) | 1,401 (0.2) |
| mRNA | 6,053 (1.9) | 7,279 (2.7) | 13,332 (2.3) |
| Genome | 2,799 (0.9) | 3,149 (1.2) | 5,948 (1.0) |
| Others | 15,588 (5.0) | 34,686 (12.9) | 71,174 (12.2) |

Table 1.

Annotation of the sequenced small RNAs
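The linker removal and length filter described in Section 2.2.1 can be illustrated with a minimal Python sketch. It assumes that the vector sequence has already been trimmed, that the coupled linker sequence quoted in the text appears as an exact match between neighbouring inserts, and that reads are in a single orientation; the function name `extract_small_rnas` and the example inserts are hypothetical and are not taken from the authors' pipeline.

```python
# Coupled linker sequence quoted in Section 2.2.1; it separates neighbouring
# inserts in a concatemer read.
JUNCTION = "CTGTAGGCACCTGAAA"
MIN_LEN, MAX_LEN = 16, 30  # valid small RNA length range given in the text


def extract_small_rnas(read: str) -> list[str]:
    """Split one concatemer read on the junction and keep 16-30 nt inserts.

    Assumption: exact junction matches only; sequencing errors inside the
    junction and reverse-complemented reads are not handled here.
    """
    read = read.upper().replace("U", "T")
    pieces = read.split(JUNCTION)
    return [piece for piece in pieces if MIN_LEN <= len(piece) <= MAX_LEN]


# Hypothetical inserts used only to exercise the function.
insert_a = "TTGACCGTAAGCTAGGCTTAC"   # 21 nt, kept
too_short = "ACGTACGT"               # 8 nt, discarded
insert_b = "GGCATTCAGGTCCATAGCAGTA"  # 22 nt, kept

read = insert_a + JUNCTION + too_short + JUNCTION + insert_b
small_rnas = extract_small_rnas(read)
```

A production pipeline would normally tolerate mismatches in the junction and check both strands; the exact-match split is used here only to keep the example short.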

2.2.2. Secondary structure analysis of novel microRNAs

The two-dimensional precursor miRNA (pre-miRNA) configurations of our novel miRNA candidates were predicted according to a previously described method (8) with some modifications. Briefly, 196 nt of genomic sequence was added to each candidate sequence (88 nt at each end). Each extended candidate sequence was divided into 110-nt windows and subjected to two-dimensional analysis along its entire length using the RNAfold software (Vienna RNA Secondary Structure Package (9)). Configurations that had the lowest free energy, were highly conserved (described below), and met the following criteria were termed novel miRNAs: (a) the configuration contained a stem-loop; (b) more than 16 nt of the cloned mature miRNA sequence lay in the double-stranded region; (c) the loop contained fewer than 20 nt; (d) the internal loop contained fewer than 10 nt; and (e) the bulge contained fewer than 5 nt. Furthermore, novel sequences with overlapping positions in the genome were grouped together. Novel antisense miRNAs were defined by criteria (a)–(e) above, without the conservation score requirement, if they were encoded in the same chromosomal region.
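As an illustration of how criteria (a)–(e) might be checked, the following Python sketch works on a dot-bracket string of the kind RNAfold prints. It does not call RNAfold itself. Several choices are assumptions rather than the authors' stated method: a "stem-loop" is taken to mean a single unbranched hairpin, an internal loop is measured as the sum of unpaired nucleotides on both strands, a bulge as unpaired nucleotides on one strand only, and the window step of 1 nt is arbitrary. All function names are hypothetical.

```python
def windows(sequence: str, size: int = 110, step: int = 1) -> list[tuple[int, str]]:
    """Return (start, subsequence) windows of the given size (step is an assumption)."""
    last_start = max(len(sequence) - size, 0)
    return [(s, sequence[s:s + size]) for s in range(0, last_start + 1, step)]


def pair_table(structure: str) -> dict[int, int]:
    """Map every paired position of a dot-bracket string to its partner."""
    stack, pairs = [], {}
    for pos, char in enumerate(structure):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            left = stack.pop()
            pairs[left], pairs[pos] = pos, left
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def passes_hairpin_criteria(structure: str, mature_start: int, mature_end: int) -> bool:
    """Check criteria (a)-(e); mature coordinates are 0-based, end exclusive."""
    pairs = pair_table(structure)
    opens = sorted(i for i, j in pairs.items() if i < j)
    if not opens:
        return False
    # (a) single stem-loop: every "(" lies before every ")".
    if structure.rfind("(") > structure.find(")"):
        return False
    # (b) more than 16 nt of the mature sequence are base-paired.
    if sum(1 for p in range(mature_start, mature_end) if p in pairs) <= 16:
        return False
    # (c) terminal loop has fewer than 20 nt.
    innermost = opens[-1]
    if pairs[innermost] - innermost - 1 >= 20:
        return False
    # (d) internal loops < 10 nt in total, (e) bulges < 5 nt.
    for outer, inner in zip(opens, opens[1:]):
        left_gap = inner - outer - 1
        right_gap = pairs[outer] - pairs[inner] - 1
        if left_gap and right_gap:
            if left_gap + right_gap >= 10:
                return False
        elif max(left_gap, right_gap) >= 5:
            return False
    return True


# Toy structure: a perfect 20-bp stem closed by a 4-nt loop.
perfect_hairpin = "(" * 20 + "...." + ")" * 20
accepted = passes_hairpin_criteria(perfect_hairpin, 0, 22)
```

Selecting the lowest free-energy window and applying the conservation filter would sit around this check; both are left out of the sketch.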

2.2.3. Determination of hairpin conservation of novel microRNAs

We classified all the candidate miRNAs using the phastCons database at the University of California at Santa Cruz (10, 11). This database holds a score for each nucleotide in the human genome that reflects its degree of conservation when compared to the corresponding nucleotides in the armadillo, bush baby, cat, chicken, chimpanzee, cow, dog, elephant, frog, fugu, guinea pig, hedgehog, horse, lizard, medaka, mouse, opossum, platypus, rabbit, rat, rhesus monkey, shrew, stickleback, tenrec, tetraodon, tree shrew, and zebrafish. The algorithm is based on a phylogenetic hidden Markov model that uses best-in-genome pairwise alignment for each species (based on BLASTZ), followed by multiple alignment of the 28 genomes. A hairpin was defined as conserved if the average phastCons conservation score over the 28 species for any 15-nt sequence in the hairpin stem was at least 0.8 (12).

3. Other techniques for MicroRNAome

3.1. Real-time PCR analysis of known miRNAs

Real-time PCR was performed on an ABI7300 (Applied Biosystems, Foster City, CA, USA) using various mirVana qRT-PCR primer sets (Ambion, Austin, TX, USA) and a SYBR ExScript RT-PCR kit (Takara Bio), or with TaqMan miRNA assays (Applied Biosystems), a High capacity cDNA archive kit (Applied Biosystems), and Absolute QPCR ROX mix (Abgene, Rochester, NY, USA), according to the manufacturers’ instructions. 5S rRNA or U6 snRNA was used as an endogenous control.

3.2. Ago2-immunoprecipitation and PCR analysis of novel miRNAs

After bioinformatic analysis of the sequence data, we further validated novel miRNAs by Ago2-immunoprecipitation (13) followed by PCR-based miRNA detection (14). Briefly, 50 µl of Dynabeads protein G slurry (Invitrogen) was coupled with 20 µg of human P4 anti-mouse Ago2 monoclonal antibody (clone 2D4, Wako Pure Chemical Industries, Osaka, Japan). One hundred fifty micrograms of human tissue P4 were homogenized in 1.5 ml of a cell lysis solution (provided in the miRNA isolation kit, Wako) using a Polytron PT1200C homogenizer (Kinematica AG, Lucerne, Switzerland) for 10 s at 4 °C, and then a further 1.5 ml of the cell lysis solution was added to the homogenate. Following incubation for 15 min on ice, the lysate was centrifuged at 20,000 × g for 20 min at 4 °C and filtered through a 0.8 µm Supor Acrodisc syringe filter (Pall Corporation, Ann Arbor, MI, USA). One milliliter of the filtered lysate was incubated with 25 µl of the anti-Ago2-Dynabeads protein G for 60 min at 4 °C. After immunoprecipitation, Ago2-associated RNAs were isolated from the immunoprecipitate according to the manufacturer’s protocol (Wako). We confirmed by western blot that the immunoprecipitate contained human P5 Ago2 protein of ~100 kDa in size (data not shown). Non-immune human IgG (Sigma) was used as a control for Ago2-immunoprecipitation. Preparation of the cDNA library from the Ago2-associated RNAs and semi-quantitative PCR analysis of the above-mentioned novel miRNA candidates were performed as reported previously (14). A small RNA-specific primer and a universal reverse primer, RTQ-UNIr (14), were used for amplification of each of the small RNAs. The PCR products were analyzed on a 12% polyacrylamide gel. Primers for human GAPDH were used as a negative control.

3.3. PCR analysis of novel miRNAs (alternative method to 3.2)

After bioinformatic analysis of the sequence data, we further validated novel miRNAs by PCR-based miRNA detection (14). Briefly, small RNAs were isolated using the mirVana™ miRNA isolation kit (Ambion). Small RNA samples were polyadenylated with the Poly(A) Tailing Kit (Ambion) and purified with Acid-Phenol:Chloroform and with the filter cartridge provided in the mirVana Probe & Marker Kit (Ambion). To generate a small RNA cDNA library, the tailed RNAs were reverse-transcribed using the RTQ primer (14), and the samples were purified using the QIAquick spin PCR purification kit (QIAGEN). A small RNA-specific primer and a universal reverse primer, RTQ-UNIr (14), were used for amplification of each of the small RNAs. The PCR products were analyzed on a 12% polyacrylamide gel. Primers for human GAPDH were used as a negative control.

3.4. Real-time PCR-based miRNA expression profiling

Total miRNA (350 ng) was reverse-transcribed using Megaplex RT Primers (Applied Biosystems). The resulting cDNAs were pre-amplified using Megaplex PreAmp Primers (Applied Biosystems), and the pre-amplified products were applied to a TaqMan Human MicroRNA Array Panel (A and B, v2.0).

3.5. siRNA, Pre-miR and anti-miR transfections

Cultured cells were transfected with precursor hsa-miR-200c and hsa-miR-141 (ID: PM11714; PM10860) or with Anti-miR™ 200c and 141 inhibitors (ID: MH11714; MH10860) (Ambion, Austin, TX) for 8 hours in serum-free medium. Serum-supplemented medium was then added, and gene and protein expression were measured at the indicated time points.

4. Study designs

Using 454 sequencing and conventional cloning of samples from 22 pairs of hepatocellular carcinoma (HCC) and adjacent normal liver (ANL) tissues and 3 HCC cell lines, we identified more than 300,000 reliable reads from HCC and more than 270,000 from ANL for registered human miRNAs.

5. Detected novel microRNAs using cloning

Eleven novel opposite miRNAs (defined as miRNAs cloned from the other arm of precursors from which known miRNAs have been cloned) were identified from the annotated miRNAs (Figure 2A). Based on the above criteria, a scan of all the novel miRNA candidates identified 245 novel precursors, representing 210 putative novel mature miRNAs (Figure 2B).
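The grouping of novel sequences with overlapping genomic positions, mentioned in Section 2.2.2, can be sketched as interval merging. The Python example below is only an illustration under stated assumptions: each clone is represented by a hypothetical tuple of chromosome, strand, start, and end (end exclusive), and clones on opposite strands are kept in separate groups, which the text does not specify.

```python
from collections import defaultdict


def group_overlapping(hits: list[tuple[str, str, int, int]]) -> list[tuple[str, str, int, int]]:
    """Merge clones whose genomic intervals overlap on the same chromosome and strand.

    Each hit is (chrom, strand, start, end) with start < end, end exclusive.
    Returns one merged interval per group.
    """
    by_locus = defaultdict(list)
    for chrom, strand, start, end in hits:
        by_locus[(chrom, strand)].append((start, end))

    groups = []
    for (chrom, strand), spans in sorted(by_locus.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start < cur_end:  # overlaps the current group, so extend it
                cur_end = max(cur_end, end)
            else:                # gap found, so close the current group
                groups.append((chrom, strand, cur_start, cur_end))
                cur_start, cur_end = start, end
        groups.append((chrom, strand, cur_start, cur_end))
    return groups


# Hypothetical clone coordinates used only to exercise the function.
clones = [
    ("chr1", "+", 100, 122),
    ("chr1", "+", 110, 131),
    ("chr1", "-", 105, 127),
    ("chr2", "+", 500, 521),
    ("chr1", "+", 300, 322),
]
merged = group_overlapping(clones)
```

Keeping the strands apart leaves room for the antisense candidates described in Section 2.2.2 to be reported separately; merging across strands would only require dropping the strand from the grouping key.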


A: Chromosomal locations of our cloned mature miRNAs that correspond to previously identified pre-miRNAs. The bar graphs indicate the location of mature miRNAs (both our cloned miRNAs and previously identified miRNAs) based on known pre-miRNA sequences. Many of our cloned sequences were found in these regions. Arrowheads indicate novel “opposite” miRNAs cloned in our experiments. These sequences represent mature miRNAs that originate on the opposite DNA strand of the known precursor miRNA. B: The 222 novel mature miRNAs identified in our study were checked for species conservation using the phastCons database, and their corresponding hairpin conservation scores are charted in this figure.

Figure 2.

Qualitative analysis of miRNAs in the human liver.

| Accession Number | Most frequent sequence | Strand | Chr. | Precursor start | Precursor end | phastCons score | dG (kcal/mol) |

Table 2.

Summary of the predicted novel miRNAs. The most frequent cloned sequences, genome locations, clone counts, and conservation scores calculated with phastCons are listed. The stem-loop is a 110-nt sequence derived by computational prediction. The precursor structures are listed in Table 3.


Table 3.

Samples of the secondary structures of predicted novel miRNAs. The predicted novel miRNAs are listed with their ID numbers. The mature miRNAs are depicted in blue in the drawing of the precursor.

We cloned 210 novel microRNA candidates. Samples of the novel microRNAs detected in our study and their bioinformatics data are given in Tables 2 and 3. These novel miRNAs have been deposited with DDBJ under consecutive accession codes from AB372573 to AB372814.

6. Future direction

We have demonstrated the usefulness and accuracy of sequencing in genetic research of the liver. One of the main problems with applying sequencing to miRNA transcriptome research is that sequencing is a time-consuming procedure. An important consideration for the discovery of miRNAs by sequencing is the difficulty of identifying miRNAs that are expressed at low levels, at highly specific stages, or in rare cell types. Moreover, some miRNAs are difficult to profile precisely because of their physical properties or post-transcriptional modifications, such as RNA editing. In principle, these limitations can be overcome by extensive sequencing of small RNA libraries from a broad range of samples. For differential display, the sequencing-based method has the theoretical advantage that it can discover and detect novel miRNAs. Based on our sequence variability results, especially with regard to RNA modifications, the accuracy of the sequence-based method is expected to be superior to that of the hybridization-based method. For the prediction of novel miRNAs, methods that rely on phylogenetic 1 genes. To overcome this problem, we made use of a computational approach for structural conservation criteria using the thermodynamic stability and intrinsic structural features of miRNAs. In the clinic, pathologists often face difficult situations in which they cannot clearly tell whether the tissue specimens they are observing are malignant or benign. Thus, in our opinion, using some miRNAs as tumor markers would help clinicians to determine whether a tissue is cancerous. miRNA sequences followed by bioinformatics have greater power than individual miRNAs or other clinicopathological variables for the detection of high-risk patient groups with poor prognoses.

There is currently little data available on how each miRNA can be used to predict high-risk groups; however, future miRNA work and data accumulation should elucidate such criteria. Further investigation is also warranted to clarify the mechanism of aberrant miRNA expression in cancer and its participation in carcinogenesis. Nevertheless, these findings show that sequence-based miRNA profiling has potential for confirming precise miRNA dynamics in a specific disease. In addition, it will increase our understanding of the mechanisms and factors involved in human liver cancer.


1 - Bartel DP (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116: 281–297.
2 - Harfe BD (2005) MicroRNAs in vertebrate development. Curr Opin Genet Dev 15: 410–415.
3 - Bartel DP, Chen CZ (2004) Micromanagers of gene expression: the potentially widespread influence of metazoan microRNAs. Nat Rev Genet 5: 396–400.
4 - Rajewsky N (2006) microRNA target predictions in animals. Nat Genet 38 Suppl: S8–13.
5 - Meyers BC, Souret FF, Lu C, Green PJ (2006) Sweating the small stuff: microRNA discovery in plants. Curr Opin Biotechnol 17: 139–146.
6 - Lagos-Quintana M, Rauhut R, Yalcin A, Meyer J, Lendeckel W, et al. (2002) Identification of tissue-specific microRNAs from mouse. Curr Biol 12: 735–739.
7 - Kim VN (2006) Small RNAs just got bigger: Piwi-interacting RNAs (piRNAs) in mammalian testes. Genes Dev 20: 1993–1997.
8 - Mineno J, Okamoto S, Ando T, Sato M, Chono H, et al. (2006) The expression profile of microRNAs in mouse embryos. Nucleic Acids Res 34: 1765–1771.
9 - Hofacker IL (2003) Vienna RNA secondary structure server. Nucleic Acids Res 31: 3429–3431.
10 - Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, et al. (2003) Human-mouse alignments with BLASTZ. Genome Res 13: 103–107.
11 - Siepel A, Haussler D (2004) Combining phylogenetic and hidden Markov models in biosequence analysis. J Comput Biol 11: 413–428.
12 - Berezikov E, Guryev V, van de Belt J, Wienholds E, Plasterk RH, et al. (2005) Phylogenetic shadowing and computational identification of human microRNA genes. Cell 120: 21–24.
13 - Azuma-Mukai A, Oguri H, Mituyama T, Qian ZR, Asai K, Siomi H, Siomi MC (2008) Characterization of endogenous human Argonautes and their miRNA partners in RNA silencing. Proc Natl Acad Sci U S A 105: 7964–7969.
14 - Ro S, Park C, Jin J, Sanders KM, Yan W (2006) A PCR-based method for detection and quantification of small RNAs. Biochem Biophys Res Commun 351: 756–763.
15 - NCBI Entrez Nucleotide database; European ribosomal RNA database; Genomic tRNA database; RNAdb.
16 - NONCODE; NCBI Reference Sequence; UCSC Genome Bioinformatics Site; OncoDb HCC.