Swinger RNAs in the Human Mitochondrial Transcriptome

Transcriptomes include coding and non-coding RNAs and RNA fragments with no apparent homology to parent genomes. Non-canonical transcriptions that systematically transform template DNA sequences along precise rules explain some of these transcripts. These systematic transformations include 23 systematic exchanges between nucleotides: 9 symmetric (X ↔ Y, e.g. C ↔ T) and 14 asymmetric (X → Y → Z → X, e.g. A → T → G → A) exchanges. Here, comparisons between mitochondrial swinger RNAs previously detected in a complete human transcriptome dataset (including cytosolic RNAs) and swinger RNAs detected in purified mitochondrial transcriptomic data (not including cytosolic RNAs) show high reproducibility and exclude cytosolic contaminations. These results, based on next-generation Illumina sequencing, confirm detections of mitochondrial swinger RNAs in GenBank’s EST database sequenced by the classical Sanger method, supporting the existence of swinger polymerizations.


Introduction
Transcription is an intracellular mechanism that produces RNA by DNA-dependent RNA polymerisation. RNAs coding for polypeptide chains are mRNAs translated by other transcription products, tRNAs and ribosomal RNAs. Some RNAs do not correspond to any DNA sequence in the genome, suggesting in some cases spontaneous emergence [1]. These RNAs usually remain unreported and are ignored. Similarly, proteomic data include numerous peptides that do not match canonical translation of predicted ORFs, but imply translation of stop codons [2][3][4][5][6][7][8] by tRNAs with anticodons matching stops [9][10][11] or by tRNAs with expanded anticodons [12][13][14]. Fusion of different transcripts explains the origins of some of these non-canonical RNAs [15]. Some human RNAs matching exons differ from their DNA by specific changes, called RDDs (RNA-DNA differences) [16]. RDDs can be single nucleotide substitutions or deletions [17][18][19], presumably resulting from post-transcriptional editing [20,21]. Some short transcripts correspond to mitochondrial DNA provided that one assumes mono- or dinucleotide deletions after each transcribed nucleotide triplet [22,23]. Formation of secondary structures by del-transformed sequences apparently downregulates del-transcription itself or its products, delRNAs [24].
Another type of systematic transformation consists of 23 systematic exchanges between nucleotides: 9 symmetric (X ↔ Y, e.g. A ↔ C) [25,26] and 14 asymmetric exchanges (X → Y → Z → X, e.g. A → C → G → A) [26,27]. For example, in systematic transformation A ↔ C, nucleotide A is introduced in place of nucleotide C and vice versa. The two-headed arrow (↔) indicates that A and C replace each other during transcription. One-headed arrows (→) indicate asymmetric exchanges: in the example A → C → G → A, nucleotide A is systematically incorporated in place of every C; similarly, C replaces G and G replaces A during RNA polymerisation. Transcripts corresponding to systematic exchanges are called swinger RNAs. BLASTn analyses detect about 100 predicted swinger RNAs (longer than 100 nucleotides) in GenBank's EST database in addition to the approximately 10,000 canonical human mitochondrial RNAs in that database. Hence, about 1% of the human mitochondrial transcripts in GenBank's EST database correspond to one of the 23 systematic nucleotide exchanges [25][26][27][28]. These systematic nucleotide exchanges (an expression that fits chemical contexts) are called bijective transformations in mathematical contexts [29][30][31]; swinger transcription fits biological contexts.
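The counts of 9 symmetric and 14 asymmetric exchanges are consistent with enumerating all non-identity permutations of the four nucleotides (4! − 1 = 23). The following minimal Python sketch illustrates this reading. The function names are hypothetical, and each exchange is represented, as an assumption made here for illustration, as a map from template nucleotide to incorporated nucleotide.

```python
from itertools import permutations

NUCLEOTIDES = "ACGT"

def swinger_maps():
    """Return all non-identity permutations of the four nucleotides.

    Each map is a dict: template nucleotide -> incorporated nucleotide.
    """
    maps = []
    for perm in permutations(NUCLEOTIDES):
        mapping = dict(zip(NUCLEOTIDES, perm))
        if any(src != dst for src, dst in mapping.items()):
            maps.append(mapping)
    return maps

def is_symmetric(mapping):
    """True if applying the exchange twice restores every nucleotide (X <-> Y)."""
    return all(mapping[mapping[n]] == n for n in NUCLEOTIDES)

def swinger_transform(sequence, mapping):
    """Apply one exchange to a sequence; symbols other than A, C, G, T are kept."""
    return "".join(mapping.get(base, base) for base in sequence)

all_maps = swinger_maps()
symmetric = [m for m in all_maps if is_symmetric(m)]
asymmetric = [m for m in all_maps if not is_symmetric(m)]

# Symmetric exchange A <-> C applied to a short illustrative sequence.
a_c_swap = {"A": "C", "C": "A", "G": "G", "T": "T"}
example = swinger_transform("AACGT", a_c_swap)
```

Under this representation, the symmetric class contains both single exchanges (such as A ↔ C) and double exchanges (two pairs exchanged simultaneously), and the asymmetric class contains cycles over three or four nucleotides.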
Mitogenomes are comparatively small, partly because of selection against multiple direct repeats [32][33][34][35] and inverted repeats [15]: these form secondary structures that are frequently excised, and such deletions are frequently deleterious. Vertebrate mitogenomes have densely packed coding and non-coding regions templating for RNAs. Non-canonical transformations greatly increase potential numbers of RNA products for single sequences: four and five RNA transcripts when assuming systematic deletions of mono- and dinucleotides for del-transcriptions, respectively, and 23 swinger RNAs when considering systematic nucleotide exchanges. Therefore, studies of swinger transformations focus on the human mitogenome, which is short (16,569 bp), reducing potential false-positive detections due to sheer genome size, and for which ample sequence data are available from several sources.
Note that swinger DNA has been detected (mainly corresponding to rRNA genes) for mitochondrial and nuclear sequences [36][37][38]. Hence, swinger RNAs result from canonical transcription of swinger-transformed DNA or swinger transcription of regular DNA [22]. Some mass spectra match predicted peptides translated from del- and swinger-transformed RNA [39][40][41][42]. Detection of chimeric RNAs, consisting of contiguous regular and swinger-transformed parts, suggests that regular canonical and swinger-transformed RNA result from single polymerisation events, probably by the same polymerase [43]. Peptides corresponding to such chimeric RNAs also occur [44].
Secondary structure formation by swinger-transformed sequences associates with swinger RNA detection [45], suggesting regulation of swinger RNA processing by secondary structures, as observed for canonical mitochondrial RNAs, i.e. tRNA punctuation [46].
Abundances of human mitochondrial swinger RNAs detected in GenBank's EST database [25,26], originating from various sources using Sanger sequencing, are proportional to those detected in transcriptomic data produced by next-generation Illumina sequencing [47]. Similarly, abundances and lengths of swinger RNAs detected in the Mimivirus transcriptome sequenced by 454 technology are proportional to those detected when using SOLID sequencing [01]. These analyses confirmed that swinger RNAs are not sequencing artefacts due to specific sequencing technologies, but the data sources do not exclude contamination by cytosolic RNA. Here, we compare the previously described human mitochondrial swinger transcriptome [39] from a complete human transcriptome (including cytosolic RNAs) with the swinger transcriptome detected in purified human mitochondrial lines [48]. Reproducibility of swinger RNA coverages of the human mitogenome would exclude sequencing artefacts and cytosolic contaminations as alternative explanations for hypothetical swinger RNAs. We predict (1) the detection of swinger RNAs from transcriptomic data extracted from purified mitochondrial lines and (2) high similarities between mitogenomic swinger RNA coverages described here and previously [39].

Detection of swinger RNAs
We used GenBank's BLASTn ('somewhat similar sequences' with default alignment parameters) [49] for in-silico alignment searches between each of the 23 swinger-transformed versions of the human mitogenome (NC_012920) and transcriptomic data in GenBank's Sequence Read Archive (SRA) (SRX084350-SRX084355 and SRX087285), sequenced by RNA-Seq, Illumina HiSeq 2500 technology [48]. Alignments with more than 80% identity were recorded and retained as swinger RNA candidates for further analysis.
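The identity filter can be sketched as follows. This is an illustration only: it assumes that alignments are available in the 12-column tabular format produced by the BLAST+ command-line tools (-outfmt 6), whereas the searches described above were run through GenBank's BLASTn service. The function name and the sample rows are hypothetical.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6) with default fields.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def candidate_hits(tabular_text, min_identity=80.0):
    """Keep alignments whose percent identity exceeds min_identity."""
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=FIELDS, delimiter="\t")
    return [row for row in reader if float(row["pident"]) > min_identity]

# Hypothetical rows for illustration only (not data from this study).
sample = (
    "read1\tmito_swinger\t92.5\t100\t7\t0\t1\t100\t501\t600\t1e-30\t150\n"
    "read2\tmito_swinger\t78.0\t100\t22\t0\t1\t100\t901\t1000\t1e-5\t60\n"
)
hits = candidate_hits(sample)  # only read1 passes the 80% threshold
```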

Mitogenomic gene coverage by swinger RNAs
Locations of detected swinger RNAs were recorded by mapping these RNAs on the human mitogenome. We analysed 17 mitogenomic regions separately: the D-loop, 2 ribosomal RNAs (12S and 16S), 13 protein-coding genes involved in the electron transport chain and the WANCY region (intragenic region between ND2 and CO1 that templates for tRNAs with cognate amino acids W, A, N, C and Y). Percentage coverages by detected swinger RNAs were calculated for each swinger transformation in each selected mitogenomic region and used for further statistical analyses.
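A percentage coverage of this kind can be computed by clipping each alignment to the region and counting covered positions once, even where alignments overlap. The sketch below is one possible implementation, not necessarily the procedure used here. The function name is hypothetical and coordinates are assumed to be 1-based and inclusive.

```python
def percent_coverage(region_start, region_end, alignments):
    """Percentage of region [region_start, region_end] covered by alignments.

    Coordinates are 1-based and inclusive. Alignments are (start, end) pairs;
    start may exceed end for hits on the opposite strand.
    """
    clipped = []
    for start, end in alignments:
        lo, hi = min(start, end), max(start, end)
        lo, hi = max(lo, region_start), min(hi, region_end)
        if lo <= hi:  # alignment overlaps the region
            clipped.append((lo, hi))

    covered = 0
    current_end = region_start - 1  # last position already counted
    for lo, hi in sorted(clipped):
        if hi > current_end:
            covered += hi - max(lo, current_end + 1) + 1
            current_end = hi
    return 100.0 * covered / (region_end - region_start + 1)

# Hypothetical region of 100 positions and three overlapping alignments.
coverage = percent_coverage(1, 100, [(1, 20), (30, 11), (90, 120)])
```

In this example the first two alignments overlap on positions 11-20 and the third extends beyond the region, so 41 of 100 positions are covered.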

Swinger RNAs in the human mitochondrial transcriptome
Table 1 summarises results from BLASTn analyses of the purified mitochondrial transcriptome [48] for the 23 swinger-transformed versions of the human mitogenome. In total, 4120 reads aligned with the 23 swinger-transformed versions of the human mitogenome. Total mitogenome coverages by swinger RNAs for each transformation were plotted as a function of corresponding coverages from a previous analysis published in 2016 [39]. Coverages are positively correlated (Pearson correlation coefficient r = 0.669, one-tailed p = 0.0002, Figure 1). Coverages for the purified mitochondrial line transcriptomes are systematically greater than those for previous analyses (supplementary data and Table 1) [39].
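For reference, the Pearson coefficient and the t statistic commonly used to test it (with n − 2 degrees of freedom) can be sketched in plain Python. Function names are hypothetical and the paired values passed to pearson_r are illustrative, not data from Table 1. Converting t into a P value additionally requires the Student t distribution, which is omitted here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1.0 - r * r))

# Illustrative values only.
r_demo = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])

# t statistic for the reported r = 0.669 across n = 23 transformations.
t_reported = t_statistic(0.669, 23)
```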

Gene-level comparisons of swinger RNA coverages
Swinger RNA coverages of each of the 17 mitogenomic regions (D-loop, 2 rRNAs, 13 protein-coding genes and the WANCY region) are in Tables 2 and 3, for analyses of current and previous Illumina data [39], respectively. Pearson correlation coefficients between swinger coverages were calculated (1) for each gene, across the 23 different transformations, and (2) for each swinger transformation, across the 17 different mitogenomic regions.
Table 3. Percentage coverage of mitogenomic regions by swinger RNAs from analyses of complete human transcriptomic data [39].

Mitochondrial DNA -New Insights
Most correlations are positive in both gene-wise (columns) and transformation-wise (rows) analyses when comparing Tables 2 and 3 (last rows and last columns in Table 3). Focusing on transformations and comparing coverages across genes for each transformation, correlations are positive between Tables 2 and 3 for 19 among 23 transformations (P = 0.00065, one-tailed sign test), with 10 correlations statistically significant at P < 0.05. Analyses at the gene level across transformations detect 14 among 17 positive correlations (P = 0.003, one-tailed sign test), and six correlations at the gene level have P < 0.05 (one-tailed).
Swinger RNAs in the Human Mitochondrial Transcriptome http://dx.doi.org/10.5772/intechopen.80805
Across genes, at the transformation level, the strongest correlation was observed for transformation C ↔ T (Figure 2), with Pearson r = 0.869 and one-tailed P = 0.0000029. Across transformations, at the gene level, the strongest correlation was observed for the gene ND6, with Pearson r = 0.752 and one-tailed P = 0.000017, and with highest coverage for the C ↔ T transformation in both datasets (Tables 2 and 3). To test whether swinger coverage has a stronger transformation-specific than gene-specific effect, we combined P values using Fisher's method for the 23 swinger transformations and, separately, for the 17 mitogenome regions. The method sums −2 × ln(Pi), where i ranges from 1 to k (k = 23 for transformations and k = 17 for genome regions/genes). This yields combined P = 5.7 × 10^-21 for transformations and combined P = 1.93 × 10^-7 for genes. This indicates a 3× stronger effect of transformation across genes on coverage than a gene-specific effect across transformations. Hence, the most important unknown factor acts at the level of transformations. The genome region that is swinger-transcribed is important but secondary.
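Fisher's method compares the sum −2 × Σ ln(Pi) with a chi-square distribution with 2k degrees of freedom. Because the degrees of freedom are even, the upper-tail probability has a closed form, which the following minimal sketch uses. The function name is hypothetical and the input P values are illustrative, not the per-transformation or per-gene P values of this study.

```python
from math import exp, log

def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis. For even degrees of
    freedom the upper tail is exp(-X/2) * sum_{j<k} (X/2)^j / j!.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # X / 2
    term = 1.0   # (X/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return exp(-half_stat) * total

# Illustrative input: two independent tests, each with P = 0.05.
combined = fisher_combined_p([0.05, 0.05])
```

With a single P value the method returns that value unchanged, which provides a quick sanity check of the implementation.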

General conclusion
We find high reproducibility in swinger RNA coverage for the human mitogenome when comparing two independent transcriptomic datasets produced by Illumina sequencing. Positive correlations occur at both gene and transformation levels, reaffirming the reproducibility of the results, but are stronger at the transformation level than at the gene level. The reproducibility of the swinger transcriptome in the giant virus Mimivirus and the ability to predict swinger RNA abundances from mathematical symmetry and error-correcting principles [31,50], together with present results from mitochondrial transcriptomes, hint that swinger polymerizations are a universal phenomenon.
The 4120 aligned reads produced 841 contigs. The highest detected identity between a theoretical mitogenome swinger transformation and a read was 95.32% for transformation A → C → T → G → A, and the lowest identity was 88.94% for transformation A → T → C → G → A. Overall identity averaged 92.86%. A previous swinger analysis of other transcriptomic data [39] found swinger transformations A ↔ G and C ↔ T least and most frequent, respectively. Here, swinger transformation A ↔ G remains the least frequent and C ↔ T is the second most frequent, suggesting high reproducibility.

Figure 1. Total mitogenome coverages by swinger RNAs across the complete human mitogenome in previously analysed data [39] (y-axis) as a function of those obtained in current observations from purified mitochondrial lines.

Figure 2. Percentage coverage of C ↔ T-transformed swinger RNAs across genes in this study as a function of their coverages in previously analysed data [39]. ND6 has the highest coverage among all transformations. Data from Tables 2 and 3.

Table 1. Total human mitogenome coverage by detected swinger RNAs from current (2018) and previous (2016) [39] analyses of two different datasets sequenced by Illumina. [...] and transformed mitogenome and (5) total mitogenomic coverage by all swinger contigs. Columns 6-9 indicate corresponding data in the same order for the previous study.

Table 2. Percentage coverage of mitogenomic regions by swinger RNAs in this study. r and P (last two columns and last two rows, respectively) indicate linear Pearson correlation coefficients between coverages across mitogenomic region/swinger transformations, comparing data in Tables 2 and 3 by rows and columns, respectively.