Accumulating evidence highlights that noncoding RNAs, especially long noncoding RNAs (lncRNAs), are critical regulators of gene expression in development, differentiation, and human diseases such as cancers and heart diseases. The regulatory mechanisms of lncRNAs have been categorized into four major archetypes: signals, decoys, scaffolds, and guides. Increasing evidence indicates that lncRNAs are able to regulate almost every cellular process by binding to proteins, mRNAs, miRNAs, and/or DNA. In this review, we present recent research advances on the regulatory mechanisms of lncRNAs in gene expression at various levels, including pretranscriptional, transcriptional, and posttranscriptional regulation. We also introduce the interactions of lncRNAs with DNA, RNA, and protein, and the applications of bioinformatics in lncRNA research.
- long noncoding RNAs
- gene regulation
- lncRNA binding
It was estimated that there are approximately 20,500 protein-coding genes that account for 2% of the genome, and the other 98% of the genome was previously considered "junk DNA" because of its inability to code for proteins. The application of high-throughput next generation sequencing (NGS) technology has changed our view of the genome: about 90% of the human genome can be transcribed into RNA transcripts [2, 3, 4, 5]. Except for the small portion of transcripts that encode proteins, the majority of RNA transcripts have been grouped into noncoding RNAs (ncRNAs), including transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small ncRNAs such as microRNAs (miRNAs), small interfering RNAs (siRNAs), and small nuclear RNAs (snRNAs), circular RNAs, and long ncRNAs (lncRNAs). The rRNAs and tRNAs are the two basic and most abundant RNAs and play important roles in mRNA translation. The miRNAs are short single-stranded RNAs of 20–22 bases that promote mRNA degradation. The siRNAs are a class of double-stranded RNAs with a length of 20–25 base pairs, which interfere with target gene expression by degrading mRNA or preventing translation. The snRNAs, with an average length of 150 bases, are a class of small RNAs located in nuclear speckles. The primary function of snRNAs is in pre-mRNA splicing. Circular RNAs are a very special class of ncRNAs. The 3′ and 5′ ends of a circular RNA join together to form a covalently closed continuous loop. The function of circular RNAs is not clear. LncRNAs are defined as noncoding linear transcripts longer than 200 nucleotides, and they have features that differ from those of the other ncRNAs listed above. In general, they share common characteristics with mRNAs. The majority of lncRNAs are transcribed by RNA polymerase II, capped at the 5′ end, and spliced; most of them are also polyadenylated at the 3′ end and have promoter regions.
Compared to the mRNAs of protein-coding genes, lncRNAs lack an open reading frame (ORF), contain fewer exons (~2.8 exons in lncRNAs compared to 11 exons for protein-coding genes), are expressed in low abundance, and are more tissue-specific. Some of them have no poly(A) tails. The lncRNAs account for the major class of ncRNAs in the genome. There are ~30,000 high-confidence lncRNA transcripts in human according to the GENCODE reference genome annotation, and more and more new lncRNAs are coming to light. The database LNCipedia has collected 127,802 transcripts from 56,946 long noncoding genes in human [8, 9]. Many ncRNAs have been confirmed to play crucial regulatory roles in diverse biological processes and in tumorigenesis. Increasing evidence indicates that lncRNAs play important roles in various cellular processes, such as DNA repair, proliferation, and epithelial-mesenchymal transition (EMT), by regulating various aspects of the expression of the related genes. LncRNAs have been associated with various diseases [13, 14, 15, 16] and identified as potential biomarkers in some diseases, such as cancers, cardiovascular diseases, and nervous system diseases [17, 18, 19]. LncRNAs can regulate gene expression by serving as molecular signals, guides, decoys, and/or scaffolds. In this review, we present recent research advances on the regulatory mechanisms of lncRNAs in gene expression at various levels, including pretranscriptional, transcriptional, and posttranscriptional regulation. We also discuss the interactions of lncRNAs with DNA, RNA, and protein, as well as the applications of bioinformatics in lncRNA-related research.
2. The role of lncRNA in pretranscription regulation
Gene expression is regulated at many levels, including the epigenetic, transcriptional, post-transcriptional, translational, and post-translational levels. For a gene to be transcribed, its chromatin structure must change so that the chromatin becomes open to polymerases and transcription factors (TFs). Modifications of chromatin DNA and histones affect gene accessibility and are associated with distinct transcription states. For example, H3 hyperacetylation or methylation at lysine 4 often makes a gene easily accessible and therefore actively transcribed. In contrast, histone methylation at lysine 9 results in the assembly of compact or closed chromatin around the DNA, leading to transcriptional silencing. A number of epigenetic control factors that modify histones have been identified. Some of them facilitate transcriptional activation, such as p300/CBP, Esa1, and TAF1, and others participate in transcriptional silencing, such as EZH2 and Ubc9. However, the majority of epigenetic factors, such as DNA methyltransferases/demethylases and histone modification enzymes, may not efficiently recognize specific DNA sequences. Emerging studies show that lncRNAs can act as signals, guides, or scaffolds at the chromatin level to regulate gene expression [21, 22, 23]. LncRNAs have been reported to participate in methylation processes. For example, Wang et al. found that lncRNA Dum (developmental pluripotency-associated 2 (Dppa2) Upstream binding Muscle lncRNA) regulated Dppa2 expression by affecting DNA methylation. Dnmt proteins are DNA methyltransferases. Dum promoted DNA methylation of the Dppa2 promoter by recruiting Dnmt1, Dnmt3a, and Dnmt3b to the promoter site, thereby silencing Dppa2 expression in cis and stimulating myogenic differentiation. Studies show that lncRNA HOTAIR (Hox transcript antisense intergenic RNA), transcribed from chromosome 12, can coordinate histone modification by binding to histone modifiers [25, 26]. Rinn et al. found that knock-down of HOTAIR led to transcriptional activation of HOXD locus genes on chromosome 2 and that HOTAIR binding is required to guide polycomb repressive complex 2 (PRC2) to the HOXD locus. PRC2 is an epigenetic factor that catalyzes methylation of lysine 27 of histone H3 (H3K27). PRC2 and LSD1 (lysine-specific demethylase 1) bind to the 5′ and 3′ domains of HOTAIR, respectively. The HOTAIR-PRC2-LSD1 complex then targets the HOXD locus on chromosome 2, silencing genes involved in the suppression of metastasis. LncRNA HOTTIP (HOXA transcript at the distal tip) binds to WDR5-MLL complexes and targets them to the 5′ HOXA locus, mediating the transcriptional activation of HOXA by driving H3K4 methylation. LncRNA Evf-2 interacts with methyl-CpG binding protein 2 (MECP2), inhibiting methylation at the DLX5/6 enhancer. Similarly, lncRNA GClnc1 (gastric cancer-associated lncRNA 1) promotes gastric carcinogenesis by acting as a modular scaffold for WDR5 and KAT2A complexes to specify the histone modification pattern on superoxide dismutase 2. Zhao et al. showed that lncRNA PAPAS (promoter and pre-rRNA antisense) guided CHD4/NuRD (nucleosome remodeling and deacetylation) to the rDNA promoter by forming a DNA-RNA triplex structure at the enhancer region of rDNA. Other studies have also explored the role of lncRNAs in the epigenetic regulation of transcription [31, 32].
3. The role of lncRNA in transcription regulation
Transcription begins with the binding of RNA polymerase II (RNA Pol II) to the promoter region of a gene with the support of general transcription factors (GTFs). Other transcription factors (TFs) bind to the enhancer region to accelerate transcription. Transcription is terminated when the polymerase reaches the terminator. LncRNAs can regulate gene expression by binding directly to TFs or RNA Pol II, or by interfering with the binding of the polymerase to the promoter. For example, lncMyoD is an lncRNA activated by myogenic differentiation (MyoD) during myogenesis. LncMyoD can directly bind to IGF2-mRNA-binding protein 2 (IMP2) and negatively regulate IMP2-mediated translation of proliferation genes such as N-Ras and c-Myc, which creates a permissive state for differentiation. LncRNA Gas5 (growth arrest-specific 5) attenuates the expression of some genes positively regulated by the glucocorticoid receptor (GR) by binding to GR. The level of LncHIFCAR (long noncoding HIF-1α co-activating RNA) is upregulated in oral carcinoma. Shih et al. found that LncHIFCAR acted as a HIF-1α co-activator driving oral cancer progression. LncHIFCAR formed a complex with HIF-1α via direct binding and facilitated the recruitment of HIF-1α and the p300 cofactor to the target promoters. In addition, lncRNAs can guide RNA Pol II to the promoters of specific genes. Miao et al. found that lncRNA LEENE guided and facilitated the recruitment of RNA Pol II to the eNOS promoter to upregulate eNOS RNA transcription. Recent studies also indicate that the promoter of an lncRNA gene can compete with the promoter of a protein-coding gene for enhancers. An enhancer is a cis-acting DNA sequence that can enhance the transcription of an associated gene when bound by specific transcription factors. Cho et al. found that the lncRNA PVT1 promoter has a tumor-suppressor function that is independent of the PVT1 gene.
The promoter of lncRNA PVT1 competes with the Myc promoter for engagement with four intragenic enhancers, thereby inhibiting the expression of the Myc gene.
4. The role of lncRNA in posttranscription regulation
After transcription, pre-mRNAs are regulated by various RNA-binding proteins (RBPs). The pre-mRNAs are capped, polyadenylated, spliced, edited, and transported from the nucleus to the cytoplasm. The stability of an mRNA is also an important factor in its translation. There is evidence that lncRNAs play roles in mRNA splicing, editing, transport, stability, and translation. In addition, lncRNAs can regulate mRNA expression indirectly by acting as competing endogenous RNAs.
4.1. lncRNA and alternative splicing
Alternative splicing is a regulatory process during gene expression that enables a single gene to code for multiple proteins. Recent studies indicate that lncRNAs can regulate alternative splicing through two main mechanisms: lncRNAs can interact with specific splicing factors, or they can form RNA-RNA duplexes with pre-mRNAs. SR (serine/arginine-rich) proteins, such as SRSF1, are a conserved family of splicing factors that regulate RNA splicing in a concentration- and phosphorylation-dependent manner. MALAT1 is an lncRNA that is highly conserved among mammals and predominantly localizes to nuclear speckles. Tripathi et al. showed that MALAT1 acts as a molecular sponge to titrate the cellular pool of SR splicing factors, affecting the distribution of splicing factors in nuclear speckles, where alternative splicing occurs, and ultimately controlling alternative splicing. The 5′ region of MALAT1 can also bind to the serine/arginine domain of SRSF1 and regulate the cellular levels of its phosphorylated forms. SORL1 (Sorting Protein-Related Receptor Containing LDLR Class A Repeats), a sorting receptor for amyloid precursor protein (APP), can interact with APP and affect its transport and processing in the brain. Downregulation of SORL1 expression increases APP secretion and subsequently Aβ formation. LncRNA 51A, an antisense transcript mapping to intron 1 of the SORL1 gene, masked canonical splicing sites by pairing with the SORL1 pre-mRNA, driving a splicing shift of SORL1 from the canonical long protein variant A to an alternatively spliced protein form B.
4.2. The regulation of mRNA stability by lncRNA
Different mRNAs have different lifespans, even within a single cell. The more stable an mRNA molecule is, the more protein may be produced from it. The steady-state level of an mRNA is determined by its rates of synthesis and degradation. Modulation of mRNA degradation is an important control point in gene expression that regulates protein synthesis in response to physiological needs and environmental signals. Studies have shown that lncRNAs play a role in the regulation of mRNA stability. For example, Cao et al. found that lncRNA LAST (LncRNA-Assisted Stabilization of Transcripts) acted as an mRNA stabilizer by cooperating with CNBP (CCHC-type zinc finger nucleic acid binding protein) to promote Cyclin D1 mRNA stability. Antisense lncRNAs are transcripts that arise from the strand opposite a coding-RNA region. β-site amyloid precursor protein cleaving enzyme 1 (BACE1) is involved in the production of the amyloid-β (Aβ) peptides that form plaques in the brains of individuals with Alzheimer's disease (AD). The expression of its antisense transcript, BACE1-AS, is elevated in the brain of an Alzheimer mouse model. Faghihi et al. found that BACE1-AS formed an RNA duplex with BACE1 mRNA, which masked the binding site for miR-485-5p, thereby increasing BACE1 mRNA stability and upregulating the BACE1 protein [40, 41]. Matsui et al. found that the iNOS antisense transcript stabilized iNOS mRNA through interaction with the AU-rich element-binding protein HuR. LncRNAs can also reduce the stability of an mRNA by making the transcript prone to degradation. aHIF is a natural antisense transcript of hypoxia-inducible factor 1α (HIF-1α). Rossignol et al. reported that aHIF could expose AU-rich elements present in the 3′ UTR of HIF-1α mRNA, thus accelerating the degradation of HIF-1α mRNA.
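The statement that the steady-state level of an mRNA is set by its synthesis and degradation rates can be illustrated with a minimal sketch. It assumes the simplest possible model, a constant synthesis rate and first-order decay (dm/dt = k_syn - k_deg * m); this model and the names `k_syn`, `k_deg`, and `m0` are illustrative assumptions and are not taken from the studies cited above.

```python
import math

def steady_state_level(k_syn, k_deg):
    """Steady-state mRNA level for the assumed model dm/dt = k_syn - k_deg * m."""
    if k_deg <= 0:
        raise ValueError("k_deg must be positive")
    return k_syn / k_deg

def half_life(k_deg):
    """Half-life of an mRNA that decays with first-order rate constant k_deg."""
    if k_deg <= 0:
        raise ValueError("k_deg must be positive")
    return math.log(2) / k_deg

def mrna_level(t, k_syn, k_deg, m0=0.0):
    """mRNA level at time t, starting from level m0 at t = 0 (closed-form solution)."""
    m_ss = steady_state_level(k_syn, k_deg)
    return m_ss + (m0 - m_ss) * math.exp(-k_deg * t)

# A stabilizing factor that halves k_deg doubles the steady-state level
# without any change in the synthesis rate.
baseline = steady_state_level(k_syn=10.0, k_deg=0.5)
stabilized = steady_state_level(k_syn=10.0, k_deg=0.25)
```

Under this assumed model, a stabilizing lncRNA would correspond to a lower `k_deg` and a destabilizing one to a higher `k_deg`; real transcripts may deviate from first-order kinetics.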
4.3. lncRNA and protein stability
LncRNAs can also interact directly with proteins and regulate their stability by retarding protein ubiquitination and degradation. Androgen receptor (AR) is a critical risk factor in castration-resistant prostate cancer. Zhang et al. showed that lncRNA HOTAIR bound to the AR protein and blocked its interaction with the E3 ubiquitin ligase MDM2, thereby preventing AR ubiquitination and AR protein degradation. Liu et al. identified lncRNA MT1JP as a critical factor in restraining cell transformation by modulating p53 translation through binding and stabilizing the RNA-binding protein TIAR. LincRNA-p21 is a hypoxia-responsive lncRNA. LincRNA-p21 can bind to HIF-1α at its VHL-binding region and attenuate VHL-mediated HIF-1α ubiquitination, leading to HIF-1α accumulation.
4.4. lncRNA regulates protein translation
Transcription and translation are the two main stages of gene expression. In translation, the ribosomal preinitiation complex, consisting of eukaryotic initiation factors (eIFs) and ribosomes, is positioned at the start codon of the target RNA. With the help of tRNAs, the mRNA is decoded to produce peptide chains. LncRNAs have been reported to participate in protein translation by interacting with rRNA, ribosomes, or eIFs. Li et al. reported that a nucleolar-specific lncRNA, LoNA, reduced rRNA production and ribosome biosynthesis. The 5′ region of LoNA bound to and sequestered nucleolin to suppress rRNA transcription, and the 3′ end recruited fibrillarin and diminished its activity to reduce rRNA methylation. Tran et al. found that lncRNA AS-RBM15, the antisense transcript of RNA binding motif protein 15 (RBM15), overlapped with the 5′ UTR of RBM15. AS-RBM15 enhanced RBM15 protein translation via incorporation into the RBM15 mRNA-containing polyribosome in a cap-dependent manner. The antisense transcript of UCHL1 binds the 5′ UTR of UCHL1 mRNA and promotes its association with active polysomes for UCHL1 translation. LncRNA Gas5 (growth arrest-specific 5) has been reported to interact and cooperate with eIF4E to regulate c-MYC translation.
4.5. lncRNAs act as microRNA sponges
MicroRNAs (miRNAs), a class of short ncRNAs, can bind to complementary regions of mRNAs to post-transcriptionally regulate the expression of target genes. The subcellular localization of lncRNAs is associated with their functions. Studies have shown that lncRNAs can act as miRNA sponges to inhibit the binding of miRNAs to their mRNA targets, thereby stabilizing the target mRNAs and regulating the expression of the corresponding proteins. Hu et al. found that the p53-responsive lncRNA GUARDIN is important for maintaining genomic integrity under exogenous genotoxic stress. GUARDIN contains eight regions complementary to the "seed" region of miR-23a. Telomeric repeat-binding factor 2 (TRF2), a critical component of the shelterin complex, is one of the targets of miR-23a. GUARDIN maintained the expression of TRF2 by sequestering miR-23a. Zhou et al. observed that lncRNA H19 acted as a sponge for the miRNAs miR-200b/c and let-7b to promote an epithelial or mesenchymal switch in tumor cells. In epithelial-like tumor cells, H19 inhibited the migration-related protein ARF by sequestering miR-200b/c. In contrast, H19 activated ARF by sequestering let-7b in mesenchymal-like tumor cells. H19 also stimulated HMGA2-mediated EMT by sequestering let-7 to promote pancreatic ductal adenocarcinoma cell invasion and migration [52, 53].
4.6. lncRNA sequesters proteins
LncRNAs can act as molecular decoys by binding and sequestering proteins, thereby inhibiting their functions. Lee et al. identified an lncRNA named noncoding RNA activated by DNA damage (NORAD) and found that NORAD played an important role in maintaining genomic stability by binding and sequestering PUMILIO proteins. In the absence of NORAD, PUMILIO proteins drove chromosomal instability by acting as negative regulators of the expression of genes involved in DNA damage repair and mitosis. S-adenosyl-L-homocysteine hydrolase (SAHH) is the only mammalian enzyme capable of catalyzing the hydrolysis of S-adenosyl-L-homocysteine (SAH), which is an inhibitor of S-adenosylmethionine (SAM)-dependent methyltransferases. Zhou et al. reported that lncRNA H19 bound to SAHH and suppressed its enzymatic activity, leading to increased accumulation of SAH. LncRNA Gas5 is induced by stresses such as starvation. The glucocorticoid receptor (GR) plays an important role in regulating genes associated with metabolism, development, and immune response. Kino et al. showed that lncRNA Gas5 bound to the DNA-binding domain of GR, thereby inhibiting the ability of GR to regulate the expression of the related genes.
5. The lncRNA interaction with other molecules
As shown above, lncRNAs regulate gene expression at the pretranscriptional, transcriptional, and posttranscriptional levels by interacting with DNA, mRNAs, miRNAs, and proteins. Here we give a brief introduction to the interactions of lncRNAs with other molecules.
LncRNAs regulate gene expression through various complicated mechanisms, one of which is binding to DNA or forming RNA-dsDNA triplexes by targeting specific DNA sequences. The lncRNA, as a third strand, can insert into the major groove of the DNA duplex through Hoogsteen hydrogen bonding. Hoogsteen hydrogen bonding by a third strand is usually weaker than Watson-Crick hydrogen bonding. Studies from Roberts et al. indicate that RNAs may form more stable triplexes than their DNA counterparts. Various methods have been used to investigate the interaction between lncRNAs and DNA. The formation of a triplex between lncRNA DHFR and dsDNA was demonstrated using the electrophoretic mobility shift assay, and the binding of lncRNA Fendrr to dsDNA was determined by in vitro pull-down experiments [61, 62]. High-throughput methods, such as capture hybridization analysis of RNA targets sequencing (CHART-seq), are also applied to investigate the genomic binding sites of lncRNAs. Although the mechanism of RNA-dsDNA triplex formation is still not well understood, it is clear that the lncRNA-DNA interaction offers a potent mechanism for gene regulation. LncRNAs can bind DNA and act as scaffolds that bring proteins to gene loci. When the proteins recruited by lncRNAs are methylation-related enzymes, these enzymes can induce methylation or demethylation of promoter CpG islands (Figure 1A). When the proteins recruited by lncRNAs are histone-modifying enzymes (Figure 1A), histone modifications can result in gene expression, transcriptional silencing, DNA repair, or genomic imprinting. LncRNAs can regulate either neighboring (cis) or distal (trans) protein-coding genes. LncRNAs derived from one chromosome can bind to another chromosome; for example, lncRNA HOTAIR is transcribed from the HOXC locus on chromosome 12 and represses transcription in trans across 40 kb of the HOXD locus.
The competitive endogenous RNA (ceRNA) hypothesis proposes that RNA transcripts with shared miRNA binding sites (i.e., miRNA response elements), including both coding and noncoding RNAs, compete for post-transcriptional regulation. A cytoplasmic lncRNA can act as a molecular sponge for miRNAs to regulate the rates of translation and degradation of mRNAs (Figure 1B). RNA interference is a powerful mechanism of gene silencing and is mediated by RNA-induced silencing complexes (RISCs). RISC is a ribonucleoprotein complex that incorporates one strand of a small RNA, such as an miRNA. Once an mRNA is transcribed and exported to the cytoplasm, it can be targeted by RISC, resulting in accelerated degradation of the mRNA or blocked translation. LncRNAs with miRNA response elements can compete for miRNA targeting and binding to RISC, leading to sequestration of the microRNA-RISC, preventing RISC-mediated degradation of mRNAs, and increasing mRNA expression. Using the high-throughput AGO-CLIP-seq method, thousands of miRNA-target interactions have been identified [66, 67]. Studies showing that lncRNAs act as miRNA sponges are described above. Besides lncRNAs, mRNAs, circular RNAs, and pseudogenes can also act as ceRNAs. The most widely used rule for miRNA target prediction is complementarity between the target and a 6-nucleotide stretch at the 5′ end of the microRNA, called the "seed region". Note that one microRNA can bind to hundreds of RNAs, and one RNA molecule may bind to diverse microRNAs with different affinities, which are difficult to quantify.
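The seed-match rule described above can be sketched in a few lines. This is a deliberately simplified illustration, not the algorithm of any published target-prediction tool: it assumes that the seed is nucleotides 2–7 of the miRNA (counted from the 5′ end), that only perfect Watson-Crick complementarity counts, and that site context, conservation, and binding energy are ignored. The function names and the example sequences are hypothetical.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna, start=1, length=6):
    """Return the target-site sequence (5'->3') that pairs with the miRNA seed.

    Assumption: the seed is miRNA nucleotides 2-7, i.e. the 0-based slice
    [start:start + length] with start=1 and length=6.
    """
    seed = mirna.upper().replace("T", "U")[start:start + length]
    # The target site is the reverse complement of the seed.
    return seed.translate(COMPLEMENT)[::-1]

def find_seed_sites(transcript, mirna):
    """Return 0-based start positions of perfect seed matches in a transcript."""
    site = seed_site(mirna)
    rna = transcript.upper().replace("T", "U")
    positions = []
    pos = rna.find(site)
    while pos != -1:
        positions.append(pos)
        pos = rna.find(site, pos + 1)  # step by 1 so overlapping sites are found
    return positions

# Illustrative sequences only (the transcript is made up for this example).
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
example_transcript = "AAUACCUCGGGUACCUCA"
sites = find_seed_sites(example_transcript, example_mirna)
```

Counting such sites along an lncRNA gives a crude indication of its potential to act as a sponge for a given miRNA, in the spirit of the eight miR-23a seed-complementary regions reported for GUARDIN; any such candidate would still need experimental validation.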
As described above, lncRNAs are able to regulate pre-mRNA splicing and mRNA stability. Some antisense lncRNAs can bind to the homologous mRNA at a splice site, thereby masking the splice site and blocking spliceosome assembly. LncRNAs can also mask the miRNA target sites of an mRNA by binding to the mRNA at the miRNA response elements (Figure 1C), resulting in elevated stability and expression of the mRNA. LncRNAs can bind to a protein and an mRNA simultaneously. Gong et al. reported that lncRNAs bound to the Alu region in the 3′ UTR of mRNAs and that the looped lncRNAs bound to the STAU1 protein (Double-Stranded RNA-Binding Protein Staufen Homolog 1), leading to activation of the Staufen-mediated decay pathway. Recently, computational and experimental methods have been developed to determine RNA-RNA interactions. For example, Sharma et al. developed a high-throughput sequencing method named "LIGation of interacting RNA followed by high-throughput sequencing" (LIGR-seq), which revealed a remarkable landscape of RNA-RNA interactions involving the major classes of ncRNA and mRNA. Nguyen et al. developed the MARIO (Mapping RNA interactome in vivo) approach to map tens of thousands of endogenous RNA-RNA interactions.
There is no doubt that RNA-protein interactions play a crucial role in fundamental cellular processes. LncRNAs can function as protein decoys that recruit or sequester proteins, or act as scaffolds linking different proteins, which may act coordinately or as a complex (Figure 1D left). Several models have been established to explain how an lncRNA regulates gene expression by protein binding. LncRNAs are able to recruit chromatin modifiers to achieve chromatin modification. An lncRNA can bind and stabilize a protein by masking its ubiquitination site, inducing the accumulation of the target protein. When an lncRNA binds to a protein, the binding may change the function of the bound domain (Figure 1D right). LncRNAs can bind to TFs to mask their DNA-binding sequences or to stabilize the TFs. LncRNAs are also able to bind to functional enzymes to inhibit their activities, resulting in elevated levels of the substrate proteins. High-throughput experimental methods, such as CLIP-seq and RIP-seq, have been used to study RNA-protein interactions.
5.5. The lncRNA location and its role
Most lncRNAs are located exclusively in the nucleus, but some are located in the cytoplasm or in both the nucleus and the cytoplasm. Increasing evidence reveals that subcellular location is a very important feature for understanding lncRNA functions. Nuclear lncRNAs tend to regulate gene expression in cis or in trans. In the nucleus, an lncRNA can accumulate at its transcription site and recruit transcription factors or chromatin modifiers. LncRNAs in the nucleus can also regulate gene expression in trans by binding to remote genomic sites. In addition, the effect of lncRNAs on alternative splicing usually occurs in the nucleus. The lncRNAs exported to the cytoplasm tend to interfere with translation or sequester proteins or miRNAs. Currently, the subcellular locations of known lncRNAs are mainly determined by biological experiments. Most recently, Zhang et al. built a web-accessible database (RNALocate) based on experimental data to provide a high-quality RNA subcellular localization resource and to facilitate future research on RNA functions and structures. Su et al. developed a sequence-based bioinformatics tool (iLoc-lncRNA) to annotate and predict the subcellular locations of lncRNAs by incorporating 8-tuple nucleotide signatures, selected by binomial distribution, into the general pseudo K-tuple nucleotide composition.
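To give a sense of what a K-tuple nucleotide composition is, the following minimal sketch computes the frequency of every possible k-tuple in a sequence. It is only an illustration of the basic feature vector: it is not the iLoc-lncRNA implementation, it uses k = 3 by default rather than 8-tuples, and it omits the binomial-distribution feature selection and the pseudo-component terms. The function name `ktuple_composition` is hypothetical.

```python
from collections import Counter
from itertools import product

def ktuple_composition(seq, k=3):
    """Return the normalized frequency of every k-tuple over the alphabet ACGT.

    RNA input is accepted (U is treated as T). Windows containing any other
    symbol (for example N) are ignored, so frequencies sum to 1 over valid windows.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}

# The resulting dictionary has 4**k entries and can be used as a fixed-length
# feature vector for a classifier of subcellular location.
features = ktuple_composition("ACGACG", k=2)
```

With k = 8 the vector has 4^8 = 65,536 entries, which is one reason a feature-selection step is needed before training a predictor on such signatures.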
6. The application of bioinformatics in lncRNA studies
Research on lncRNAs has increased rapidly in recent years. Bioinformaticians have developed various approaches for the identification of lncRNAs from genome and transcriptome data, the prediction of lncRNA structures, the investigation of lncRNA functions, and the construction of lncRNA-associated regulatory networks. LncRNA databases have been established from large-scale studies or by integrating different studies. Methods based on association analysis, such as gene set enrichment analysis (GSEA) and weighted gene co-expression network analysis (WGCNA), make it easier for biologists to study lncRNAs. As described above, an lncRNA can bind to one or more types of molecules to regulate gene expression at different levels, directly or indirectly. Therefore, databases and tools are important for investigating the binding of lncRNAs to other molecules. The binding pattern is usually determined by the molecular sequences (nucleotide or amino acid) and structures. A growing number of high-throughput experimental methods have been developed for the identification of lncRNA-DNA, lncRNA-RNA, and lncRNA-protein binding. Based on these existing data, bioinformatics approaches, such as deep learning, can be used to identify binding rules, which in turn can be used to predict the molecules that could bind to a given lncRNA based on the DNA/RNA sequences and/or structures. Although bioinformatics tools are useful, one should keep in mind that prediction and association analysis are only a screening process; the predicted results must be validated with experimental methods. Here, we briefly review the bioinformatics tools for investigating the interactions of lncRNAs with other molecules.
The formation of RNA-dsDNA triplexes is sequence-specific, and some motifs form triplexes more easily than others. Therefore, bioinformatics methods can be used to predict the binding sites. Triplexator and Longtarget are two bioinformatics tools that have been used to predict lncRNA-dsDNA binding. There are a number of bioinformatics tools for predicting RNA-RNA interactions, such as RNAup, IntaRNA, and RNAplex. RNAup calculates the thermodynamics of RNA-RNA interactions. IntaRNA is a fast approach for predicting RNA-RNA interactions that incorporates the accessibility and seeding of interaction sites. Recently, Gawronski et al. proposed a pipeline (MechRNA) that predicts lncRNA interactions with target RNAs using IntaRNA2 and predicts protein binding with other tools. The performance of various RNA-RNA interaction tools has been summarized previously.
Based on the ceRNA hypothesis and the seed-region match rule, many databases and bioinformatics tools have been developed. TargetScan, starBase, miRTarBase, PicTar, miRanda, and other bioinformatics tools have been used for the prediction of miRNA targets. Our group previously developed a novel network-based method to integrate the correlations between lncRNAs, protein-coding genes, and miRNAs. We also recently designed a model to assess the combined impact of mRNAs, lncRNAs, and miRNAs on cellular signaling transduction networks. Tong et al. developed tools and a server for validating predicted lncRNA-miRNA-mRNA regulations with TCGA RNA-seq data and for identifying miRNA-associated cancer signaling pathways and related lncRNA sponges [89, 90]. However, because of the complicated reticular networks, it is difficult to evaluate the effect of individual ceRNAs based solely on bioinformatics analysis [91, 92].
Bioinformaticians have developed various tools for predicting ncRNA-protein interactions. Most of the existing methods are based on the sequences of either proteins or RNAs. Some of them investigate the associations between proteins and RNAs, such as RPI-Pred, RPIseq, and lncPro. Some tools are used to predict the binding sites of RNAs or proteins, such as BindN, RNABindR, RNAProB, PPRint, PRINTR, PRBR, SRCPred, RNABindRPlus, and RBRIdent. The CatRAPID method is specially designed to determine residue-nucleotide interactions. We developed a three-step prediction model called RPI-Bind for the identification of RNA-protein binding regions based on both the sequences and the structures of proteins and RNAs. The three steps are: (1) prediction of RNA-binding regions on proteins, (2) prediction of protein-binding regions on RNA, and (3) simultaneous prediction of interaction regions on RNA and proteins. Suresh et al. developed RPI-Pred, an RNA-protein interaction predictor that uses a new support-vector machine-based method to predict protein-RNA interaction pairs based on both sequences and structures. The use of machine learning algorithms has improved prediction accuracy [93, 106].
Advances in high-throughput technologies have resulted in the rapid identification of a large number of lncRNAs and have expanded our understanding of the mechanisms by which lncRNAs function. Accumulating evidence indicates that lncRNAs play complicated roles. To gain deeper insight into the interactions of lncRNAs with other molecules, a number of bioinformatics tools have been developed. It is undeniable that our understanding of gene regulation by lncRNAs is still in its early stages. Given the growing number of lncRNA studies, we anticipate that the detailed roles of novel lncRNAs will be addressed in the near future.
This work was partially supported by National Institutes of Health grants [NIH R01GM123037, U01AR069395-01A1, and U01CA166886] to X. Zhou.