Expanding the Coding Potential of Vertebrate Mitochondrial Genomes: Lesson Learned from the Atlantic Cod

Vertebrate mitochondrial genomes are highly conserved in structure, gene content, and function. Most sequenced mitochondrial genomes represent bony fishes, and that of the Atlantic cod (Gadus morhua) is the best characterized among the fishes. In addition to the well-characterized 37 canonical gene products encoded by vertebrate mitochondrial genomes, new classes of gene products representing peptides and noncoding RNAs have been discovered. The Atlantic cod encodes at least two peptides (MOTS-c and humanin (HN)), two long noncoding RNAs (lncCR-L and lncCR-H), and a number of small RNAs. Here, we review recent research in the Atlantic cod focusing on putative mitochondrial-derived peptides, the mitochondrial transcriptome, and noncoding RNAs.


Introduction
The mitochondrial genome (mitogenome) is highly conserved among vertebrates [1]. All species investigated to date contain mitogenomes encoding the same 37 canonical gene products, organized in a highly similar gene order in most species. Complete mitogenome sequences have been determined from almost 5000 vertebrate species, about 50% of which are bony fishes [2].
The Atlantic cod (Gadus morhua) is a benthopelagic fish in the Gadidae family, belonging to the order Gadiformes [3,4]. The 16.7 kb circular mitogenome was one of the first to be completely sequenced from a fish species [5][6][7]. Atlantic cod possesses the same mitogenome organization as most vertebrate species, including that of humans and vertebrate model systems like mouse, rat, Xenopus, and zebrafish (Figure 1A).
Figure 1. (A) … of heavy- and light-strand replication, respectively; HTR, heteroplasmic tandem repeat; T-P spacer, intergenic noncoding spacer. tRNA genes are indicated by the standard one-letter symbols for amino acids. All genes are H-strand encoded, except Q, A, N, C, Y, S1, E, P, and ND6 (L-strand encoded). mtSSU and mtLSU, mitochondrial small- and large-subunit rRNA genes; ND1-ND6, NADH dehydrogenase subunit 1-6; COI-COIII, cytochrome c oxidase subunit I-III; Cyt b, cytochrome b; ATP6 and ATP8, ATPase subunit 6 and 8. (B) Schematic view of the OxPhos complexes embedded in the inner mitochondrial membrane. ATP is generated by oxidative phosphorylation. The mitochondrial genome encodes 13 of the approximately 85 subunits, belonging to complex I (blue), complex III (orange), complex IV (green), and complex V (yellow).
Among the canonical gene products encoded by the Atlantic cod mitogenome, 13 represent hydrophobic proteins essential for oxidative phosphorylation (OxPhos), two are ribosomal RNAs (rRNAs) of the mitochondrial ribosome, and 22 are transfer RNAs (tRNAs) necessary for mitochondrial translation. The OxPhos system consists of five large protein complexes embedded in the inner mitochondrial membrane. However, only 13 of the approximately 85 OxPhos proteins are encoded by the mitogenome (Figure 1B) [8].
Both strands (H- and L-strands) have coding potential (Figure 1A). Most mitochondrial genes are encoded by the H-strand and include the small and large subunit rRNAs (mtSSU rRNA and mtLSU rRNA), 14 tRNAs, and 12 protein-coding genes. The L-strand, however, encodes only eight tRNAs and one protein. The control region (CR), located between the genes of tRNA Pro and tRNA Phe, is the major noncoding region in the mitogenome and constitutes approximately 1000 bp in Atlantic cod [7,9]. The CR harbors the genetic control elements for H-strand replication origin (OriH), the transcription initiation sites for H- and L-strands, as well as the displacement loop (D-loop) located between OriH and the termination associated sequence (TAS) [7,9,10]. Furthermore, a 30-bp spacer located between the genes of tRNA Asn and tRNA Cys contains the origin of L-strand synthesis (OriL). OriL appears functionally conserved in most vertebrates [11,12], including the Atlantic cod [5].
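The strand-wise gene counts given above can be cross-checked against the total of 37 canonical gene products with a short tally. The following Python sketch is illustrative only; the dictionary layout and the names `GENE_COUNTS` and `tally` are assumptions made for this example, and the numbers are taken directly from the text.

```python
from collections import Counter

# Canonical gene products per strand, as counted in the text
# (illustrative data structure, not part of any published pipeline).
GENE_COUNTS = {
    "H": {"rRNA": 2, "tRNA": 14, "protein": 12},
    "L": {"rRNA": 0, "tRNA": 8, "protein": 1},
}


def tally(counts):
    """Sum the gene products of each class across both strands."""
    totals = Counter()
    for per_strand in counts.values():
        totals.update(per_strand)
    return dict(totals)


totals = tally(GENE_COUNTS)
print(totals)                # rRNA, tRNA, and protein totals
print(sum(totals.values()))  # all canonical gene products
```

The per-class totals (2 rRNAs, 22 tRNAs, 13 proteins) agree with the canonical set described in the Introduction.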
Hallmarks of Atlantic cod mitogenomes are the noncoding intergenic T-P spacer and the heteroplasmic tandem repeat (HTR) array at the 5′ domain of CR (Figure 1A). The 74-bp Atlantic cod T-P spacer [5,13], located between the tRNA Thr and tRNA Pro genes, represents an evolutionarily preserved feature present in all gadiform species [10,13]. The T-P spacer is variable in sequence and size among gadiforms but still harbors two conserved 17-bp sequence motifs forming potential hairpin structures at the RNA level [10]. The HTR array consists of a 40-bp sequence motif usually present in two to five copies within an individual [5,14,15] and thus results in size heterogeneity and heteroplasmy of Atlantic cod mitogenomes. Here, we review recent developments in the characterization of Atlantic cod mitogenomes with focus on interindividual sequence variation, mitochondrial transcriptome, noncoding RNAs, and putative mitochondrial-derived peptides.

Sequence variation among Atlantic cod mitochondrial genomes
Complete mitogenome sequences have been obtained from approximately 200 specimens representing major ecotypes and geographic locations of Atlantic cod. In one study, based on SOLiD deep sequencing, we performed pooled sequencing of 44 specimens from each of the migratory northeast arctic cod (NA) and the stationary Norwegian coastal cod (NC) [16]. The sequencing represented more than 1100 times mitogenome coverage of each ecotype and 25 times coverage of each individual. We found a total of 365 SNP loci in the dataset, where 121 SNPs were shared between the ecotypes. One hundred fifty-one SNPs and ninety-three SNPs were specific to NA and NC cod, respectively. From the dataset we determined the mitochondrial substitution rate to be 14 times higher compared to that of the nuclear genome [16,17].
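The figures quoted for the pooled sequencing are internally consistent, which a small calculation makes explicit. This is a minimal Python sketch; the function names `snp_partition` and `per_individual_coverage` are hypothetical helpers introduced for illustration, and it assumes that reads are spread evenly over the specimens in a pool.

```python
def snp_partition(shared, only_a, only_b):
    """Combine shared and ecotype-specific SNP counts into a summary."""
    total = shared + only_a + only_b
    return {"total": total, "shared_fraction": shared / total}


def per_individual_coverage(pool_coverage, n_specimens):
    """Average coverage per specimen, assuming an evenly mixed pool."""
    return pool_coverage / n_specimens


summary = snp_partition(shared=121, only_a=151, only_b=93)
coverage = per_individual_coverage(pool_coverage=1100, n_specimens=44)

print(summary["total"])                      # total SNP loci
print(round(summary["shared_fraction"], 3))  # fraction shared by NA and NC
print(coverage)                              # coverage per individual
```

The three SNP classes sum to the 365 loci reported, and 1100 times pooled coverage over 44 specimens corresponds to the stated 25 times coverage per individual.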
More recently we analyzed 156 Atlantic cod mitogenomes at the individual level [18], including 32 specimens previously reported by Carr and Marshall [19]. We found 1034 SNPs in total among the sequences, which were not evenly distributed throughout the mitogenome. The ND2 gene (Complex I) and the COII gene (Complex IV) were the least and most conserved, respectively, among the protein-coding genes. Furthermore, rRNA and tRNA genes showed a significantly lower density of overall SNPs per site compared to protein-coding genes. Thus, the Atlantic cod mitogenome follows a similar pattern of conservation as seen for other vertebrates like zebrafish and human [20][21][22][23] and corroborates the observation that mutation rate constraints in vertebrate mitogenomes appear linked to the position of genes in relation to OriH and OriL [24,25].
The noncoding regions of the Atlantic cod mitogenome showed a mosaic pattern of sequence conservation. Whereas the OriL and the central domain of CR were almost invariant among specimens, the T-P spacer and 5′ domain of CR contain significant sequence variation [7,10,13,18]. The 74-bp T-P spacer was found to contain 16 variable sites and 26 haplotypes among 225 specimens assessed, including a 29-bp sequence duplication in three individuals [10]. Similarly, the 5′ domain of CR was the most variable region within the mitochondrial genome (more than three times the average substitution rate). The elevated sequence variation was due to hot-spot substitution sites, homopolymeric heterogeneity, and the HTR array [18].

Mitochondrial-derived peptides
Vertebrate mitogenomes have the potential of encoding several short peptides (mitochondrial-derived peptides (MDPs)) [26][27][28]. The best characterized peptides among the MDPs are MOTS-c and humanin (HN). Genes coding for MOTS-c and HN are found as small open reading frames within the mitochondrial small subunit (mtSSU) and large subunit (mtLSU) ribosomal DNA, respectively [29,30]. Studies in mammals indicate that MDPs are circulating signaling molecules with a number of proposed roles. While HN is involved in cellular stress resistance, apoptosis, and metabolism [29,31,32,33,34], MOTS-c apparently represents an MDP hormone that regulates metabolic homeostasis and insulin sensitivity [30,35].
The Atlantic cod open reading frames encoding MOTS-c and HN were identified at the exact same locations as in human, within domain 3′M of the mtSSU rRNA and domain IV of the mtLSU rRNA, respectively (Figure 2A and B). Comparative analysis revealed MOTS-c and HN to be invariant among Atlantic cod specimens [18] and well conserved between Atlantic cod and human (Figure 2C). Here, 8 of 16 amino acid residues in MOTS-c and 13 of 21 amino acid residues of HN were shared. Furthermore, when comparing gadiform species representing seven diverse families, we noted 10 of 16 and 15 of 21 amino acid residues to be shared in MOTS-c and HN, respectively (Figure 2C). The conserved features seen between gadiform species and human suggest related MDP functions.
Figure 2. … (Hs, Homo sapiens, NC_012920). Stars above and below the alignment represent conserved residues among gadiforms and between gadiforms and human, respectively.
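Shared-residue counts of the kind quoted above are obtained by comparing aligned peptides position by position. The Python sketch below illustrates the idea; `shared_residues` is a hypothetical helper, and the two toy sequences are invented placeholders, not the cod or human peptides. The reported fractions are taken from the text.

```python
from fractions import Fraction


def shared_residues(seq_a: str, seq_b: str) -> int:
    """Count identical, non-gap positions in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")


# Invented toy peptides, used only to demonstrate the function.
TOY_A = "MKWQEMGY"
TOY_B = "MRWQELGY"
toy_shared = shared_residues(TOY_A, TOY_B)
print(f"toy example: {toy_shared} of {len(TOY_A)} residues shared")

# Cod versus human counts as reported in the text.
reported = {"MOTS-c": Fraction(8, 16), "HN": Fraction(13, 21)}
for name, frac in reported.items():
    print(f"{name}: {float(frac):.1%} identical")
```

Expressed as percentages, the reported counts correspond to 50.0% identity for MOTS-c and about 61.9% for HN between Atlantic cod and human.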

Mitochondrial transcriptome
Vertebrate mitochondrial transcriptomes have mainly been studied in human cells and tissues [36,37]. Mature mitochondrial RNAs are generated from three polycistronic transcripts initiated within CR from two H-strand promoters (HSP1 and HSP2) and a single L-strand promoter (LSP) (Figure 3A) [36,38,39,40]. The HSP1-specific transcript is highly abundant and generates mtSSU rRNA, mtLSU rRNA, as well as tRNA Val and tRNA Phe [41,42]. HSP1-specific tRNAs have recently been proposed to perform a second role as a mitochondrial rRNA, substituting for the 5S rRNA that is lacking in vertebrate mitochondrial ribosomes [43,44]. While tRNA Val appears associated with the mitochondrial ribosomes in human and rat, tRNA Phe has been identified in porcine and bovine mitochondrial ribosomes [45].
Ten H-strand-specific mRNAs are posttranscriptionally processed from the HSP2 transcript, together with 13 tRNAs and the two rRNAs (Figure 3A) [46]. Most HSP2 mRNAs are monocistronic, but two of the mRNAs are bicistronic (ND4/4L mRNA and ATPase8/6 mRNA). Finally, the L-strand-specific transcript gives rise to the ND6 mRNA and eight tRNAs (Figure 3A).

Atlantic cod mitochondrial mRNAs
Similar to human cells, 11 mature mRNAs were readily expressed from the Atlantic cod mitogenome [47]. There are, however, some minor differences in mitochondrial mRNA maturation and modification between human and Atlantic cod. Mapping of the 5′ ends in mitochondrial mRNAs by pyrosequencing revealed that 10 of the 11 mRNAs contain no, or very short (1-2 nt), 5′ untranslated regions (UTRs) [47]. The only exception is the 5′ UTR of the COII mRNA, which contained a short hairpin structure. In Atlantic cod and all other Gadidae species, this hairpin structure is capped by a GAAA tetra-loop (Figure 3B) [47]. GAAA tetra-loops are known to frequently participate in long-range RNA:RNA tertiary interactions [48].
Most Atlantic cod mRNAs lack 3′ UTRs, but the COI mRNA has a 3′ UTR of 76 nt corresponding to the complete mirror sequence of tRNA Ser(UCN) (Figure 3B) [47]. A very similar 3′ UTR (72 nt) has been reported in the human COI mRNA [49], indicating a conserved role in vertebrates. The 3′ UTR of the ND5 mRNA is highly variable in length in vertebrates but is lacking completely in Atlantic cod [40,47]. However, the closely related Gadidae species Pollachius virens (saithe) contains an ND5 mRNA 3′ UTR of 16 nucleotides [47]. In humans, mitochondrial mRNAs contain short polyA tails of 40-50 adenosines at their 3′ ends [40,45]. PolyA tails were identified in all mRNAs, except for ND6 mRNA [40], and seven UAA termination codons were created in the human mitochondria by polyA posttranscriptional editing [50]. Similarly, all mitochondrial mRNAs (except the ND6 mRNA) were found to be polyadenylated in Atlantic cod, and six UAA termination codons were generated by polyA addition [47].
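The way polyA addition creates a UAA termination codon can be illustrated with a few lines of Python. This is a simplified sketch, not the analysis used in the cited studies: it assumes that the reading frame starts at the first nucleotide of the transcript (consistent with mRNAs lacking a 5′ UTR) and that the transcript ends in an incomplete stop codon (U or UA). The function name `completed_stop` and the toy transcripts are hypothetical.

```python
from typing import Optional


def completed_stop(mrna: str, poly_a_length: int = 40) -> Optional[str]:
    """Return the final codon formed once a polyA tail is added.

    Returns None when the transcript length is already a multiple of
    three, i.e., when no partial codon is left at the 3' end.
    """
    remainder = len(mrna) % 3
    if remainder == 0:
        return None
    partial = mrna[-remainder:]  # trailing 1 or 2 nt of the last codon
    return (partial + "A" * poly_a_length)[:3]


# Toy transcripts (invented), ending in U, UA, and a complete UAA codon.
print(completed_stop("AUGGCUU"))    # partial stop 'U'  + polyA
print(completed_stop("AUGGCUUA"))   # partial stop 'UA' + polyA
print(completed_stop("AUGGCUUAA"))  # already complete, nothing to finish
```

In both incomplete cases the first adenosines of the tail supply the missing positions, so the terminal codon becomes UAA.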

Atlantic cod mitochondrial structural RNAs
The 22 mitochondrial tRNAs were found to be highly conserved in Atlantic cod, both in structure and sequence [5,18], and some tRNAs (tRNA Ile, tRNA Ser(UCN), tRNA Ser(AGY), and tRNA Cys) were invariant in the 200 specimens investigated. SOLiD deep sequencing confirmed a nontemplate CCA addition at the 3′ ends of tRNAs (our unpublished results). Thus, mitochondrial tRNA processing and probably modification are highly similar in human and Atlantic cod [47]. The annotated mtSSU rRNA and mtLSU rRNA genes in Atlantic cod are 952 and 1664 bp, respectively [7]. The corresponding rRNAs are highly conserved within the species [18] and well conserved between different fish species [7,51]. The 5′ and 3′ ends of Atlantic cod mitochondrial rRNAs have been precisely mapped using different approaches. Primer extension and pyrosequencing confirmed the 5′ ends to correspond to the annotated features based on comparative sequence alignments [47,51]. The 3′ ends were mapped by pyrosequencing and by RNA ligation sequencing [51]. Interestingly, nontemplate adenosines were added to the 3′ ends of both rRNAs. Whereas the 3′ end of mtSSU rRNA was found to be homogeneous and mono-adenylated, the corresponding end of mtLSU rRNA was heterogeneous and oligo-adenylated [51]. The observed mtLSU rRNA heterogeneity is consistent with the notion that mitochondrial rRNAs are transcribed and processed from two different precursor RNAs, the HSP1 and HSP2 primary transcripts (Figure 3A).
None of the mitochondrial small RNAs (mitosRNAs) have been assigned a specific function based on experimental evidence. However, in a recent study by Riggs and Podrabsky [70], mitosRNAs were associated with a hypoxia stress response in killifish embryos.

Atlantic cod mitochondrial long noncoding RNAs
Two lncRNAs (lncCR-H and lncCR-L) have been identified and investigated in Atlantic cod mitochondria (Figure 4) [10,18,47]. Both lncRNAs were found to be polyadenylated but transcribed from opposite strands within the CR [10]. We showed that the Atlantic cod lncCR-L has a mutation rate and an expression level corresponding to that of Complex I mRNAs [10,18,47]. The lncCR-L apparently corresponds to the 7S RNA in human mitochondria [52], and recently we showed that lncCR-L is differentially expressed in a human cancer-matched cell line pair [56].
The lncCR-H was found to be highly variable in sequence and structure, both between and within Atlantic cod specimens [10,18]. A schematic overview of the lncCR-H RNA is presented in Figure 4. Here, the noncoding T-P spacer is present at the 5′ end and includes two potential RNA hairpin structures. The T-P spacer domain is followed by a mirror tRNA Pro , before entering the HTR array motifs. The HTR copy numbers vary between 2 (80 bp) and more than 8 (>320 bp) [5,14,15,18], rendering lncCR-H highly variable in size. Finally, lncCR-H terminates in a short polyA tail at TAS. Thus, lncCR-H has apparently no fixed length in Atlantic cod mitochondria and varies in size between approximately 300 and 500 nt. Interestingly, the TAS motif consists of a perfect palindromic sequence motif (TTTATACATATGTATAAA). We found lncCR-L to terminate with a polyA tail at the same site as lncCR-H but on the opposite strand [10].
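That the TAS motif is a perfect palindrome, in the sense that it equals its own reverse complement, can be verified directly. The following is a minimal Python sketch; the helper names are illustrative, and the sequence is the motif quoted in the text.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq == reverse_complement(seq)


TAS = "TTTATACATATGTATAAA"  # TAS motif as given in the text
print(len(TAS))             # 18
print(is_palindromic(TAS))  # True
```

Because the motif reads identically on both strands, the same sequence is present at the termination site of lncCR-H and, on the opposite strand, of lncCR-L.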

Atlantic cod mitochondrial small RNAs
The Atlantic cod mitogenomes express a number of small RNAs, revealed by SOLiD small RNA sequencing experiments (our unpublished results). Here, the majority of mitochondrial small RNAs (mitosRNAs) were identified as mitochondrial tRNA-derived fragments (tRFs; see [69,70]). Interestingly, most Atlantic cod mitochondrial tRFs correspond to H-strand tRNAs, and some tRFs were differentially expressed during early developmental stages (our unpublished results). Many of the same tRF species detected in Atlantic cod have recently been noted in rainbow trout egg cells [69] and in killifish embryos [70], suggesting a conserved feature at least among some bony fishes.
The SOLiD experiments also detected several abundant small RNAs mapping to the mitochondrial CR [17]. We found three small RNA candidates generated from lncCR-L, suggesting this lncRNA to be a precursor for mitosRNA (Figure 4). Similarly, two mitosRNAs were generated from lncCR-H; one corresponded to a pyrimidine-rich motif and the other to tRF-1 derived from tRNA Thr (Figure 4). What functions these small RNAs may serve in the mitochondria are not currently known, but we speculate that regulatory roles related to transcription elongation, mtDNA replication, or ribosome functions are likely.
Figure 4. … T-P spacer, HTR (heteroplasmic tandem repeat array), TAS (termination associated sequence), and CSB2 (conserved sequence box 2) are indicated. The H-strand-specific lncCR-H is located at the 5′ domain of CR and is the precursor of two enriched small RNAs (above CR map). The L-strand-specific lncCR-L is located at the central domain of CR and is the precursor of three enriched small RNAs (below CR map).
DOI: http://dx.doi.org/10.5772/intechopen.75883

Concluding remarks
The mitochondrial gene content and organization are highly conserved between Atlantic cod and human and strongly support a common functional platform. Similarly, the mitochondrial transcripts generating canonical mRNAs and structural RNAs are surprisingly similar. What about the newly proposed MDPs and noncoding RNAs? Are there any lineage-specific differences? Research is still in its infancy, but recent findings suggest conservation between fish and mammals. More experimental studies in Atlantic cod and model systems like zebrafish are highly encouraged, including investigations of the fascinating mitochondrial swinger RNAs [24,71,72]. Mitochondrial-derived noncoding RNAs need to be profiled and further investigated in adult tissue types during normal and stress conditions, as well as at various developmental stages. A first step could be to study the intracellular location by in situ RNA hybridization and then ask if the noncoding RNAs are confined to the mitochondrial compartment or exported to the cytoplasm or other cellular compartments. Our studies in Atlantic cod indicate that at least two of the mitochondrial lncRNAs may serve as precursors for small RNAs. We conclude that vertebrate mitogenomes encode a significant number of gene products in addition to the 37 canonical OxPhos proteins, rRNAs, and tRNAs.