Open access peer-reviewed chapter

Swinger RNAs in the Human Mitochondrial Transcriptome

By Ganesh Warthi and Hervé Seligmann

Submitted: December 9th 2017; Reviewed: August 8th 2018; Published: November 5th 2018

DOI: 10.5772/intechopen.80805

Abstract

Transcriptomes include coding and non-coding RNAs and RNA fragments with no apparent homology to parent genomes. Non-canonical transcriptions systematically transforming template DNA sequences along precise rules explain some transcripts. Among these systematic transformations are 23 systematic exchanges between nucleotides, i.e. 9 symmetric (X ↔ Y, e.g. C ↔ T) and 14 asymmetric (X → Y → Z → X, e.g. A → T → G → A) exchanges. Here, comparisons between mitochondrial swinger RNAs previously detected in a complete human transcriptome dataset (including cytosolic RNAs) and swinger RNAs detected in purified mitochondrial transcriptomic data (not including cytosolic RNAs) show high reproducibility and exclude cytosolic contaminations. These results, based on next-generation Illumina sequencing technology, confirm detections of mitochondrial swinger RNAs in GenBank’s EST database sequenced by the classical Sanger method, supporting the existence of swinger polymerizations.

Keywords

  • swinger RNA
  • non-canonical transcription
  • mitogenome
  • systematic nucleotide exchange
  • blast analyses

1. Introduction

Transcription is an intracellular mechanism that produces RNA by DNA-dependent RNA polymerisation. RNAs coding for polypeptide chains are mRNAs translated by other transcription products, tRNAs and ribosomal RNAs. Some RNAs do not correspond to any DNA sequence in the genome, suggesting in some cases spontaneous emergence [1]. These RNAs usually remain unreported and are ignored. Similarly, proteomic data include numerous peptides that do not match canonical translation of predicted ORFs, but imply translation of stop codons [2, 3, 4, 5, 6, 7, 8] by tRNAs with anticodons matching stops [9, 10, 11] or by tRNAs with expanded anticodons [12, 13, 14]. Assuming fusion of different transcripts explains the origins of some of these non-canonical RNAs [15]. Some human RNAs matching exons differ from their DNA by specific changes, called RDDs (RNA-DNA differences) [16]. RDDs can be single nucleotide substitutions or deletions [17, 18, 19], presumably resulting from post-transcriptional editing [20, 21]. Some short transcripts correspond to mitochondrial DNA provided that one assumes mono- or dinucleotide deletions after each transcribed nucleotide triplet [22, 23]. Formation of secondary structures by del-transformed sequences apparently downregulates del-transcription itself or its products, delRNAs [24].

Another type of systematic transformation consists of 23 systematic exchanges between nucleotides, 9 symmetric (X ↔ Y, e.g. A ↔ C) [25, 26] and 14 asymmetric exchanges (X → Y → Z → X, e.g. A → C → G → A) [26, 27]. For example, in systematic transformation A ↔ C, nucleotide A is introduced in place of nucleotide C and vice versa. The two-headed arrow (↔) indicates that A and C replace each other during transcription. One-headed arrows (→) indicate asymmetric exchanges: in the example A → C → G → A, nucleotide A is systematically incorporated in place of every C; similarly, C replaces G and G replaces A during RNA polymerisation. Transcripts corresponding to systematic exchanges are called swinger RNAs. BLASTn analyses detect about 100 predicted swinger RNAs (longer than 100 nucleotides) in GenBank’s EST database in addition to the (approximately) 10,000 canonical human mitochondrial RNAs in that database. Hence, about 1% of the human mitochondrial transcripts in GenBank’s EST database correspond to one of the 23 systematic nucleotide exchanges [25, 26, 27, 28]. These systematic nucleotide exchanges (an expression that fits chemical contexts) are called bijective transformations in mathematical contexts [29, 30, 31]; swinger transcription fits biological contexts.
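The count of 23 exchanges can be checked by enumeration: they are the non-identity bijections of the four nucleotides onto themselves. The following is a minimal sketch, not the authors' software; the function names are hypothetical, and each mapping is read here as "template nucleotide → nucleotide written in its place", which is an assumption about arrow direction that does not affect the counts.

```python
from itertools import permutations

NUCLEOTIDES = "ACGT"

def swinger_maps():
    """Return every non-identity bijection of {A, C, G, T} as a dict."""
    maps = []
    for image in permutations(NUCLEOTIDES):
        mapping = dict(zip(NUCLEOTIDES, image))
        if any(src != dst for src, dst in mapping.items()):
            maps.append(mapping)
    return maps

def is_symmetric(mapping):
    """True for X <-> Y type exchanges: applying the map twice restores the sequence."""
    return all(mapping[mapping[n]] == n for n in NUCLEOTIDES)

def transform(seq, mapping):
    """Apply an exchange to a sequence; characters outside ACGT are left unchanged."""
    return "".join(mapping.get(base, base) for base in seq)

maps = swinger_maps()
symmetric = [m for m in maps if is_symmetric(m)]
asymmetric = [m for m in maps if not is_symmetric(m)]
print(len(maps), len(symmetric), len(asymmetric))  # 23 9 14

# Symmetric exchange A <-> C applied to a toy sequence (invented for illustration)
a_c = {"A": "C", "C": "A", "G": "G", "T": "T"}
print(transform("ACGTTCA", a_c))  # CAGTTAC
```

The 9 symmetric maps are the 6 single-pair swaps plus the 3 double swaps (e.g. A ↔ C + G ↔ T); the 14 asymmetric maps are the 8 three-nucleotide cycles plus the 6 four-nucleotide cycles.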

Mitogenomes are comparatively small, partly because of selection against multiple direct repeats [32, 33, 34, 35] and inverted repeats [15]: these form secondary structures that are frequently excised; such deletions are frequently deleterious. Vertebrate mitogenomes have densely packed coding and non-coding regions templating for RNAs. Non-canonical transformations greatly increase potential numbers of RNA products for single sequences: four and five RNA transcripts when assuming systematic deletions of mono- and dinucleotides for del-transcriptions, respectively, and 23 swinger RNAs when considering systematic nucleotide exchanges. Therefore, studies of swinger transformations focus on the human mitogenome: it is short (16,569 bp), which reduces potential false-positive detections due to sheer genome size, and ample sequence data are available from several sources for this organism.
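As an illustration of the del-transformation counts mentioned above, the sketch below keeps each nucleotide triplet and skips the one or two nucleotides that follow it. Reading the "four and five" transcripts as the possible starting offsets within a period of 4 (3 kept + 1 deleted) or 5 (3 kept + 2 deleted) is an assumption made here for illustration; the function name and toy sequence are hypothetical.

```python
def del_transform(seq, deleted, offset=0):
    """Keep 3 nucleotides, then skip `deleted` nucleotides, starting at `offset`."""
    period = 3 + deleted
    return "".join(seq[i:i + 3] for i in range(offset, len(seq), period))

toy = "ACGTACGTACGT"  # invented sequence, not mitogenomic data

# Mononucleotide deletion after each triplet, starting at position 0
print(del_transform(toy, deleted=1))  # ACGACGACG

# One variant per starting offset: 4 for mono-, 5 for dinucleotide deletions
mono_variants = [del_transform(toy, 1, off) for off in range(4)]
di_variants = [del_transform(toy, 2, off) for off in range(5)]
print(len(mono_variants), len(di_variants))  # 4 5
```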

Note that swinger DNA has been detected (mainly corresponding to rRNA genes) for mitochondrial and nuclear sequences [36, 37, 38]. Hence, swinger RNAs may result from canonical transcription of swinger-transformed DNA or from swinger transcription of regular DNA [22]. Some mass spectra match predicted peptides translated from del- and swinger-transformed RNA [39, 40, 41, 42]. Detection of chimeric RNAs, consisting of contiguous regular and swinger-transformed parts, suggests that regular canonical and swinger-transformed RNA result from single polymerisation events, probably by the same polymerase [43]. Peptides corresponding to such chimeric RNAs also occur [44].

Secondary structure formation by swinger-transformed sequences associates with swinger RNA detection [45], suggesting regulation of swinger RNA processing by secondary structures, as observed for canonical mitochondrial RNAs, i.e. tRNA punctuation [46].

Abundances of human mitochondrial swinger RNAs detected in GenBank’s EST database [25, 26], originating from various sources using Sanger sequencing, are proportional to those detected in transcriptomic data produced by next-generation sequencing, Illumina technology [47]. Similarly, abundances and lengths of swinger RNAs detected in Mimivirus’ transcriptome sequenced by 454 technology are proportional to those detected when using SOLID sequencing [01]. These analyses confirmed that swinger RNAs are not sequencing artefacts due to specific sequencing technologies, but data sources do not exclude contamination by cytosolic RNA. Here, we compare the previously described human mitochondrial swinger transcriptome [39] from a complete human transcriptome (including cytosolic RNAs) with the swinger transcriptome as detected in purified human mitochondrial lines [48]. Reproducibility of swinger RNA coverages of the human mitogenome would exclude sequencing artefacts and cytosolic contaminations as alternative explanations for hypothetical swinger RNAs. We predict (1) the detection of swinger RNAs from transcriptomic data extracted from purified mitochondrial lines and (2) high similarities between mitogenomic swinger RNA coverages described here and previously [39].

2. Materials and methods

2.1. Detection of swinger RNAs

We used GenBank’s BLASTn (‘somewhat similar sequences’ with default alignment parameters) [49] for in-silico alignment searches between each of the 23 swinger-transformed versions of the human mitogenome (NC_012920) and transcriptomic data in GenBank’s Sequence Read Archive (SRA) (SRX084350-SRX084355 and SRX087285), sequenced by RNA-Seq, Illumina HiSeq 2500 technology [48]. Alignments with more than 80% identity were recorded and retained as swinger RNA candidates for further analysis.
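The identity filter described above can be sketched as follows. This is a hypothetical workflow, not the authors' procedure (they used the BLASTn web interface): it assumes hits were exported in the 12-column tabular layout of BLAST+ (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), with the transformed mitogenome as query. The sample rows and identifiers are invented.

```python
def swinger_candidates(tabular_text, min_identity=80.0):
    """Keep alignments whose percent identity is strictly above the threshold."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        identity = float(fields[2])
        if identity > min_identity:
            hits.append({
                "query": fields[0],             # transformed mitogenome (assumed query)
                "read": fields[1],              # aligned read
                "identity": identity,
                "mito_start": int(fields[6]),   # alignment start on the query
                "mito_end": int(fields[7]),     # alignment end on the query
            })
    return hits

# Invented example rows; only the first exceeds 80% identity
sample = (
    "mito_A<->C\tread_1\t93.94\t50\t3\t0\t1201\t1250\t1\t50\t1e-10\t70.1\n"
    "mito_A<->C\tread_2\t78.00\t50\t11\t0\t3001\t3050\t1\t50\t0.002\t40.0\n"
)
candidates = swinger_candidates(sample)
print([hit["read"] for hit in candidates])  # ['read_1']
```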

2.2. Mitogenomic gene coverage by swinger RNAs

Locations of detected swinger RNAs were recorded by mapping these RNAs on the human mitogenome. We analysed 17 mitogenomic regions separately: the D-loop, 2 ribosomal RNAs (12S and 16S), 13 protein-coding genes involved in the electron transport chain and the WANCY region (intergenic region between ND2 and CO1 that templates for tRNAs with cognate amino acids W, A, N, C and Y). Percentage coverages by detected swinger RNAs were calculated for each swinger transformation in each selected mitogenomic region and used for further statistical analyses.
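One way to compute the percentage coverage of a region is to merge overlapping alignments and count the covered positions that fall inside the region. The sketch below assumes 1-based inclusive coordinates; the region boundaries and alignment positions are invented, and the authors' exact procedure may differ.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent (start, end) intervals, inclusive coordinates."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def percent_coverage(region, alignments):
    """Percentage of positions in `region` covered by at least one alignment."""
    region_start, region_end = region
    covered = 0
    for start, end in merge_intervals(alignments):
        low, high = max(start, region_start), min(end, region_end)
        if low <= high:
            covered += high - low + 1
    return 100.0 * covered / (region_end - region_start + 1)

# Hypothetical 100-nt region and three alignments, two of them overlapping
region = (101, 200)
alignments = [(90, 120), (110, 130), (180, 250)]
print(percent_coverage(region, alignments))  # 51.0
```

Merging first avoids counting positions twice when several reads or contigs overlap the same stretch.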

3. Results and discussion

3.1. Swinger RNAs in the human mitochondrial transcriptome

Table 1 summarises results from BLASTn analyses of the purified mitochondrial transcriptome [48] for the 23 swinger-transformed versions of the human mitogenome. In total, 4120 reads aligned with the 23 swinger-transformed versions of the human mitogenome, producing 841 contigs. The highest detected identity between a theoretical mitogenome swinger transformation and a read was 95.32% for transformation A → C → T → G → A, and the lowest identity was 88.94% for transformation A → T → C → G → A. The overall identity averaged 92.86%. A previous swinger analysis of other transcriptomic data [39] found swinger transformations A ↔ G and C ↔ T least and most frequent, respectively. Here, swinger transformation A ↔ G remains the least frequent; C ↔ T is the second most frequent, suggesting high reproducibility.

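| Transformation | Reads (2018) | Contigs (2018) | Id (2018) | Coverage (2018) | Reads (2016) | Contigs (2016) | Id (2016) | Coverage (2016) |
|---|---|---|---|---|---|---|---|---|
| Regular | 700 | 142 | 99.76 | 8479 | 400 | 69 | 100 | 7658 |
| A ↔ C | 181 | 42 | 93.94 | 1239 | 163 | 10 | 89.12 | 389 |
| A ↔ G | 448 | 5 | 92.20 | 184 | 6 | 2 | 87.26 | 97 |
| A ↔ T | 98 | 48 | 94.34 | 1376 | 186 | 17 | 90.63 | 593 |
| C ↔ G | 319 | 7 | 89.99 | 227 | 28 | 8 | 85.01 | 343 |
| C ↔ T | 338 | 51 | 90.59 | 1599 | 253 | 26 | 92.47 | 937 |
| G ↔ T | 435 | 35 | 92.43 | 1614 | 400 | 5 | 88.36 | 249 |
| A ↔ C + G ↔ T | 123 | 25 | 89.79 | 730 | 11 | 2 | 86.23 | 76 |
| A ↔ G + C ↔ T | 63 | 34 | 92.63 | 982 | 69 | 8 | 89.3 | 257 |
| A ↔ T + C ↔ G | 126 | 36 | 92.87 | 1112 | 97 | 11 | 94.8 | 395 |
| A → C → G → A | 80 | 25 | 93.81 | 778 | 21 | 12 | 90.59 | 439 |
| A → C → T → A | 160 | 52 | 92.99 | 1474 | 31 | 12 | 90.79 | 453 |
| A → G → C → A | 98 | 58 | 91.70 | 1729 | 43 | 17 | 93.13 | 573 |
| A → G → T → A | 98 | 39 | 91.29 | 1245 | 28 | 12 | 88.49 | 469 |
| A → T → C → A | 218 | 50 | 93.12 | 1578 | 363 | 20 | 92.13 | 810 |
| A → T → G → A | 84 | 29 | 94.08 | 873 | 12 | 5 | 90.21 | 190 |
| C → G → T → C | 122 | 43 | 94.15 | 1167 | 21 | 4 | 88.44 | 247 |
| C → T → G → C | 165 | 57 | 91.90 | 2058 | 126 | 17 | 86.42 | 791 |
| A → C → G → T → A | 140 | 45 | 92.33 | 1306 | 38 | 15 | 89.53 | 575 |
| A → C → T → G → A | 157 | 35 | 95.32 | 971 | 54 | 10 | 92.65 | 327 |
| A → G → C → T → A | 211 | 26 | 92.63 | 713 | 99 | 15 | 87.35 | 530 |
| A → G → T → C → A | 78 | 27 | 93.60 | 777 | 30 | 11 | 91.29 | 369 |
| A → T → C → G → A | 195 | 17 | 88.94 | 463 | 60 | 9 | 92.1 | 321 |
| A → T → G → C → A | 183 | 55 | 94.14 | 1874 | 115 | 16 | 94.78 | 601 |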

Table 1.

Total human mitogenome coverage by detected swinger RNAs from current (2018) and previous (2016) [39] analyses of two different datasets sequenced by Illumina.

Current data are from purified mitochondrial lines, previous data are from complete human transcriptome, including cytosolic and mitochondrial transcriptomes. Columns 2–5: current analyses. Columns are (1) swinger transformation (includes lack of transformation), (2) aligning read numbers, (3) contig numbers, (4) mean identity between reads and transformed mitogenome and (5) total mitogenomic coverage by all swinger contigs. Columns 6–9 indicate corresponding data in the same order for the previous study.

Total mitogenome coverages by swinger RNAs for each transformation were plotted as a function of corresponding coverages from a previous analysis published in 2016 [39]. Coverages are positively correlated (Pearson correlation coefficient r = 0.669, one-tailed p = 0.0002, Figure 1). Coverages for the purified mitochondrial line transcriptomes are systematically greater than those from the previous analyses (supplementary data and Table 1) [39].

Figure 1.

Total mitogenome coverages by swinger RNAs across the complete human mitogenome in previously analysed data [39] (y-axis) as a function of those obtained in current observations from purified mitochondrial lines.
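The correlation plotted in Figure 1 can be recomputed from the two coverage columns of Table 1. The sketch below uses the 23 swinger rows as read from that table (the Regular row is excluded) and a hand-written Pearson formula; the variable and function names are hypothetical. Under these assumptions the result should land close to the reported r = 0.669.

```python
import math

# Total mitogenome coverage per swinger transformation, in Table 1 row order
current = [1239, 184, 1376, 227, 1599, 1614, 730, 982, 1112, 778, 1474, 1729,
           1245, 1578, 873, 1167, 2058, 1306, 971, 713, 777, 463, 1874]
previous = [389, 97, 593, 343, 937, 249, 76, 257, 395, 439, 453, 573,
            469, 810, 190, 247, 791, 575, 327, 530, 369, 321, 601]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

r = pearson_r(current, previous)
print(round(r, 3))
```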

3.2. Gene-level comparisons of swinger RNA coverages

Swinger RNA coverages of each of the 17 mitogenomic regions (D-loop, 2 rRNAs, 13 CDs and the WANCY region) are in Tables 2 and 3, for analyses of current and previous Illumina data [39], respectively. Pearson correlation coefficients between swinger coverages were calculated considering (1) genes, i.e. for each gene across the 23 different transformations, and (2) for each swinger transformation, across the 17 different mitogenomic regions.

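| Transformation | D-loop | 12S | 16S | ND1 | ND2 | W-Y | CO1 | CO2 | ATP8 | ATP6 | CO3 | ND3 | ND4L | ND4 | ND5 | ND6 | Cytb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A ↔ C | 6.4 | 5.6 | 9.1 | 0.0 | 15.1 | 25.0 | 5.8 | 0.0 | 9.2 | 3.7 | 7.3 | 11.8 | 0.0 | 8.3 | 7.9 | 26.7 | 3.7 |
| A ↔ G | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.4 | 0.0 | 0.0 | 0.0 | 0.0 | 9.0 | 0.0 | 0.0 | 2.7 | 0.0 | 6.0 |
| A ↔ T | 9.4 | 2.8 | 8.3 | 3.2 | 15.9 | 11.5 | 1.7 | 3.1 | 22.2 | 0.0 | 7.7 | 7.5 | 8.4 | 20.4 | 10.6 | 20.2 | 2.5 |
| C ↔ G | 0.0 | 0.0 | 0.0 | 0.0 | 6.4 | 4.6 | 1.9 | 0.0 | 11.6 | 0.0 | 0.0 | 8.4 | 6.7 | 0.0 | 0.0 | 7.2 | 0.0 |
| C ↔ T | 13.9 | 2.5 | 9.9 | 11.3 | 8.8 | 0.0 | 0.1 | 1.5 | 28.5 | 13.5 | 6.8 | 9.5 | 0.0 | 5.4 | 17.9 | 41.5 | 11.4 |
| G ↔ T | 20.9 | 0.0 | 10.6 | 3.8 | 13.3 | 0.0 | 0.0 | 0.0 | 33.3 | 13.4 | 0.0 | 0.0 | 0.0 | 13.2 | 18.9 | 46.9 | 9.6 |
| A ↔ C + G ↔ T | 2.6 | 5.2 | 7.7 | 0.0 | 5.6 | 5.6 | 8.4 | 4.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.4 | 5.5 | 11.4 | 1.8 |
| A ↔ G + C ↔ T | 2.9 | 2.1 | 0.0 | 7.7 | 18.0 | 5.6 | 7.7 | 3.8 | 0.0 | 0.0 | 4.1 | 7.5 | 21.2 | 10.3 | 6.4 | 18.9 | 0.0 |
| A ↔ T + C ↔ G | 4.7 | 7.2 | 7.0 | 2.4 | 8.5 | 6.6 | 10.5 | 5.1 | 35.7 | 0.0 | 4.1 | 27.5 | 0.0 | 9.4 | 4.9 | 0.0 | 5.3 |
| A → C → G → A | 4.5 | 2.5 | 0.1 | 0.0 | 7.5 | 8.2 | 0.0 | 0.0 | 10.1 | 12.8 | 6.3 | 0.0 | 0.0 | 0.0 | 8.7 | 13.0 | 7.7 |
| A → C → T → A | 9.0 | 0.0 | 6.7 | 5.0 | 13.8 | 19.6 | 9.9 | 20.8 | 13.5 | 12.0 | 9.8 | 8.4 | 0.0 | 10.1 | 6.3 | 15.0 | 9.2 |
| A → G → C → A | 12.6 | 9.6 | 6.7 | 8.7 | 19.3 | 0.0 | 7.4 | 0.0 | 41.1 | 10.4 | 14.0 | 7.5 | 15.5 | 13.9 | 10.8 | 27.8 | 4.6 |
| A → G → T → A | 10.8 | 2.7 | 3.6 | 5.9 | 12.7 | 0.0 | 3.4 | 6.7 | 28.5 | 13.1 | 11.7 | 7.5 | 0.0 | 9.2 | 10.8 | 18.9 | 2.8 |
| A → T → C → A | 12.3 | 2.2 | 7.3 | 15.9 | 12.0 | 0.0 | 6.8 | 3.8 | 21.7 | 7.6 | 11.0 | 16.5 | 20.2 | 10.4 | 7.7 | 35.0 | 6.3 |
| A → T → G → A | 7.0 | 2.7 | 3.3 | 2.5 | 8.0 | 0.0 | 7.0 | 6.4 | 15.0 | 8.4 | 0.0 | 9.2 | 0.0 | 3.1 | 7.5 | 2.1 | 6.8 |
| C → G → T → C | 10.1 | 4.9 | 2.0 | 11.5 | 20.0 | 0.0 | 5.4 | 0.0 | 17.4 | 3.5 | 19.0 | 13.6 | 0.0 | 7.3 | 5.8 | 14.9 | 0.0 |
| C → T → G → C | 14.4 | 0.6 | 4.7 | 10.3 | 12.4 | 0.0 | 8.2 | 14.2 | 41.1 | 7.6 | 3.7 | 0.0 | 7.1 | 24.5 | 13.8 | 46.5 | 14.8 |
| A → C → G → T → A | 11.6 | 5.0 | 4.0 | 10.7 | 20.7 | 0.0 | 3.8 | 3.1 | 15.0 | 4.4 | 6.1 | 8.1 | 0.0 | 4.2 | 14.2 | 21.7 | 3.0 |
| A → C → T → G → A | 10.4 | 3.0 | 5.3 | 0.0 | 5.1 | 15.6 | 1.7 | 8.0 | 0.0 | 4.0 | 7.5 | 8.4 | 0.0 | 1.5 | 11.6 | 16.4 | 9.6 |
| A → G → C → T → A | 5.9 | 2.3 | 2.2 | 0.0 | 17.3 | 0.0 | 3.8 | 4.2 | 0.0 | 3.4 | 0.0 | 0.0 | 4.4 | 0.0 | 10.0 | 0.0 | 5.0 |
| A → G → T → C → A | 2.9 | 0.0 | 3.7 | 2.9 | 7.5 | 2.6 | 5.0 | 0.0 | 0.0 | 11.2 | 5.1 | 0.0 | 9.8 | 2.8 | 7.5 | 13.9 | 4.6 |
| A → T → C → G → A | 2.3 | 2.8 | 3.5 | 0.0 | 2.9 | 0.0 | 3.1 | 1.3 | 0.0 | 0.0 | 6.1 | 0.0 | 0.0 | 6.5 | 2.9 | 0.0 | 2.2 |
| A → T → G → C → A | 16.9 | 4.1 | 10.4 | 8.7 | 17.8 | 13.0 | 6.1 | 10.4 | 26.6 | 6.8 | 0.0 | 0.0 | 0.0 | 4.4 | 17.1 | 39.2 | 17.7 |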

Table 2.

Percentage coverage of mitogenomic regions by swinger RNAs in this study.

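| Transformation | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | r | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A ↔ C | 4.1 | 0.0 | 0.0 | 0.0 | 3.1 | 0.0 | 0.0 | 0.0 | 14.5 | 0.0 | 0.0 | 9.5 | 9.4 | 7.0 | 3.8 | 8.4 | 0.0 | 0.21 | 0.418 |
| A ↔ G | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.6 | 0.0 | 0.0 | 0.198 | 0.446 |
| A ↔ T | 8.9 | 4.1 | 4.2 | 0.3 | 7.7 | 0.0 | 1.6 | 0.0 | 17.9 | 13.0 | 0.0 | 0.0 | 0.0 | 7.5 | 4.4 | 0.0 | 0.0 | 0.366 | 0.148 |
| C ↔ G | 5.1 | 0.0 | 0.0 | 0.0 | 9.8 | 0.0 | 3.2 | 0.0 | 13.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6.7 | 0.0 | 0.66 | 0.004 |
| C ↔ T | 9.4 | 5.9 | 7.0 | 9.4 | 7.6 | 0.0 | 0.0 | 6.1 | 15.5 | 0.0 | 0.0 | 9.8 | 0.0 | 4.4 | 6.9 | 33.0 | 2.6 | 0.869 | 0.000 |
| G ↔ T | 3.7 | 0.0 | 0.0 | 5.0 | 4.7 | 0.0 | 0.0 | 0.0 | 24.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 10.1 | 4.3 | 0.696 | 0.002 |
| A ↔ C + G ↔ T | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.7 | 0.0 | 0.0 | 0.0 | 0.0 | 9.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | −0.168 | 0.520 |
| A ↔ G + C ↔ T | 3.1 | 3.0 | 0.0 | 0.0 | 2.8 | 0.0 | 0.0 | 0.0 | 0.0 | 16.9 | 0.0 | 9.0 | 9.4 | 0.0 | 1.9 | 6.7 | 0.0 | 0.212 | 0.414 |
| A ↔ T + C ↔ G | 0.0 | 0.0 | 4.9 | 4.3 | 9.8 | 0.0 | 2.1 | 0.0 | 17.9 | 0.0 | 0.0 | 0.0 | 0.0 | 2.4 | 2.5 | 0.0 | 0.0 | 0.646 | 0.005 |
| A → C → G → A | 6.2 | 0.0 | 0.0 | 0.0 | 14.3 | 0.0 | 1.9 | 6.7 | 0.0 | 0.0 | 0.0 | 11.3 | 0.0 | 0.0 | 1.5 | 8.4 | 0.0 | 0.045 | 0.863 |
| A → C → T → A | 8.2 | 2.8 | 0.0 | 0.0 | 7.7 | 0.0 | 1.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.9 | 1.7 | 9.3 | 1.1 | 0.117 | 0.654 |
| A → G → C → A | 10.7 | 3.0 | 0.0 | 0.0 | 6.5 | 0.0 | 0.0 | 0.0 | 17.4 | 46.9 | 0.0 | 0.0 | 9.8 | 2.1 | 6.8 | 7.8 | 0.0 | 0.32 | 0.210 |
| A → G → T → A | 11.1 | 0.0 | 2.2 | 2.2 | 3.9 | 0.0 | 0.0 | 0.0 | 19.3 | 0.0 | 0.0 | 0.0 | 0.0 | 5.3 | 2.5 | 14.3 | 0.0 | 0.825 | 0.000 |
| A → T → C → A | 5.0 | 0.0 | 5.6 | 0.0 | 7.5 | 0.0 | 7.3 | 0.0 | 33.8 | 0.0 | 5.1 | 10.1 | 0.7 | 5.9 | 5.7 | 21.1 | 0.0 | 0.67 | 0.003 |
| A → T → G → A | 0.0 | 2.9 | 0.0 | 3.9 | 0.0 | 0.0 | 0.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.4 | 8.2 | 0.0 | −0.263 | 0.308 |
| C → G → T → C | 4.5 | 4.5 | 1.8 | 3.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.6 | 0.0 | 0.0 | 0.0 | 0.0 | 6.9 | 0.0 | 0.335 | 0.189 |
| C → T → G → C | 7.9 | 0.0 | 2.2 | 4.8 | 3.8 | 12.0 | 2.3 | 0.0 | 30.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7.0 | 40.2 | 6.6 | 0.833 | 0.000 |
| A → C → G → T → A | 8.8 | 0.0 | 2.2 | 1.4 | 6.6 | 7.4 | 0.0 | 0.0 | 24.2 | 0.0 | 0.0 | 0.0 | 0.0 | 4.4 | 6.0 | 10.1 | 0.0 | 0.571 | 0.017 |
| A → C → T → G → A | 5.4 | 2.5 | 4.1 | 0.0 | 0.0 | 8.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.2 | 12.8 | 0.0 | 0.765 | 0.000 |
| A → G → C → T → A | 4.5 | 0.0 | 4.7 | 0.0 | 3.1 | 0.0 | 2.1 | 4.4 | 19.3 | 0.0 | 9.3 | 9.2 | 0.0 | 0.0 | 7.5 | 0.0 | 0.0 | −0.077 | 0.768 |
| A → G → T → C → A | 9.4 | 0.0 | 0.0 | 0.0 | 3.2 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.1 | 3.4 | 6.7 | 2.5 | 0.155 | 0.551 |
| A → T → C → G → A | 2.9 | 0.0 | 0.0 | 3.6 | 7.8 | 0.0 | 0.0 | 1.2 | 0.0 | 0.0 | 0.0 | 10.4 | 8.8 | 1.7 | 0.0 | 0.0 | 0.0 | −0.255 | 0.323 |
| A → T → G → C → A | 14.6 | 0.0 | 0.0 | 0.0 | 9.1 | 0.0 | 2.7 | 0.0 | 16.9 | 0.0 | 0.0 | 0.0 | 0.0 | 2.9 | 8.6 | 13.3 | 0.0 | 0.767 | 0.000 |
| r | 0.535 | 0.142 | 0.269 | 0.289 | 0.063 | −0.017 | 0.21 | −0.276 | 0.678 | 0.03 | 0.087 | −0.156 | 0.393 | 0.294 | 0.586 | 0.752 | 0.522 | | |
| P | 0.008 | 0.519 | 0.215 | 0.182 | 0.776 | 0.939 | 0.337 | 0.203 | 0.000 | 0.892 | 0.693 | 0.477 | 0.064 | 0.174 | 0.003 | 0.000 | 0.011 | | |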

Table 3.

Percentage coverage of mitogenomic regions by swinger RNAs from analyses of complete human transcriptomic data [39].

Columns 1–17 are as in Table 2. r and P (last two columns and last two rows, respectively) indicate linear Pearson correlation coefficients between coverages across mitogenomic region/swinger transformations, comparing data in Tables 2 and 3 by rows and columns, respectively.

Most correlations are positive along both genewise (columns) and transformation-wise (rows) analyses when comparing Tables 2 and 3 (last rows and last columns in Table 3). Focusing on transformations and comparing coverages across genes for each transformation, correlations are positive between Tables 2 and 3 for 19 among 23 transformations (P = 0.00065, one-tailed sign test) with 10 correlations statistically significant at P < 0.05. Analyses at the gene level across transformations detect 14 among 17 positive correlations (P = 0.003, one-tailed sign test), and six correlations at the gene level have P < 0.05 (one-tailed).

Across genes, at the transformation level, the strongest correlation was observed for transformation C ↔ T (Figure 2) with Pearson r = 0.869 and one-tailed P = 0.0000029. Across transformations, at the gene level, the strongest correlation was observed for the gene ND6 with Pearson r = 0.752 and one-tailed P = 0.000017 with highest coverage at C ↔ T transformation in both datasets (Tables 2 and 3). In order to test whether swinger coverage depends more on the transformation than on the gene, we calculated the combined P value using Fisher’s method to combine P values, for the 23 swinger transformations and, separately, for the 17 mitogenome regions. The method sums −2 × ln(Pi), where i ranges from 1 to k (k = 23 for transformations and k = 17 for genome regions/genes). This yields combined P = 5.7 × 10^−21 for transformations and combined P = 1.93 × 10^−7 for genes. This indicates a 3× stronger effect of transformation across genes on coverage than a gene-specific effect across transformations. Hence, the most important unknown factor determines transformations. The genome region that is swinger-transcribed is important but secondary.
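Fisher's method as described above can be sketched in a few lines. Under the null hypothesis the sum of −2 × ln(Pi) over k independent tests follows a chi-square distribution with 2k degrees of freedom; because 2k is even, its upper tail has a closed form and no statistics library is needed. The function name is hypothetical and the input P values below are invented. The unrounded P values behind the chapter's combined results are not listed in Table 3 (entries shown as 0.000 cannot be log-transformed), so this sketch does not attempt to reproduce them.

```python
import math

def fisher_combined_p(p_values):
    """Return (statistic, combined P) for independent P values, each in (0, 1]."""
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Chi-square upper tail with 2k degrees of freedom:
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total

# With a single test the combined P equals the input P
print(fisher_combined_p([0.05])[1])

# Invented example: three moderately small P values combine into a smaller one
stat, combined = fisher_combined_p([0.04, 0.03, 0.06])
print(stat, combined)
```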

Figure 2.

Percentage coverage of C ↔ T-transformed swinger RNAs across genes in this study as a function of their coverages in previously analysed data [39]. ND6 has the highest coverage among all transformations. Data from Tables 2 and 3.

4. General conclusion

We find high reproducibility in swinger RNA coverage for the human mitogenome when comparing two independent transcriptomic datasets produced by Illumina sequencing. Positive correlations occur at both the gene and transformation levels, reaffirming the reproducibility of the results, but are stronger at the transformation level than at the gene level. The reproducibility of the swinger transcriptome in the giant virus Mimivirus and the ability to predict swinger RNA abundances from mathematical symmetry and error-correcting principles [31, 50], together with present results from mitochondrial transcriptomes, hint that swinger polymerizations are a universal phenomenon.

Acknowledgments

This work has been carried out thanks to the support of the A*MIDEX project (no. ANR-11-IDEX-0001-02) funded by the ‘Investissements d’Avenir’ French government programme, managed by the French National Research Agency (ANR).

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Ganesh Warthi and Hervé Seligmann (November 5th 2018). Swinger RNAs in the Human Mitochondrial Transcriptome, Mitochondrial DNA - New Insights, Hervé Seligmann, IntechOpen, DOI: 10.5772/intechopen.80805. Available from:
