The rat remains a major biomedical model system for common, complex diseases. The rat continues to gain importance as a model system with the completion of its full genomic sequence. Although the genomic sequence has generated much interest, only three complete sequences of the rat mitochondria exist. Therefore, to increase the knowledge of the rat genome, the entire mitochondrial genomes (16,307–16,315 bp) from 10 inbred rat strains (that are standard laboratory models around the world) and 2 wild rat strains were sequenced. We observed a total of 195 polymorphisms, 32 of which created an amino acid change (nonsynonymous substitutions) in 12 of the 13 protein coding genes within the mitochondrial genome. There were 11 single nucleotide polymorphisms within the tRNA genes, six in the 12S rRNA, and 12 in the 16S rRNA including 3 insertions/deletions. We found 14 single nucleotide polymorphisms and 2 insertion/deletion polymorphisms in the D-loop. The inbred rat strains cluster phylogenetically into three distinct groups. The wild rat from Tokyo grouped closely with five inbred strains in the phylogeny, whereas the wild rat from Milwaukee was not closely related to any inbred strain. These data will enable investigators to rapidly assess the potential impact of the mitochondria in these rats on the physiology and the pathophysiology of phenotypes studied in these strains. Moreover, these data provide information that may be useful as new animal models, which result in novel combinations of nuclear and mitochondrial genomes, are developed.
Mitochondria are the only organelles (other than the nucleus) with their own DNA, which is maternally inherited (31, 36). The mammalian mitochondrial DNA (mtDNA) is a circular, double-stranded DNA that lacks introns and has only ∼7% noncoding sequences (23) in contrast to the genomic DNA. The mtDNA encodes 37 genes, including 13 protein-coding genes that, in conjunction with subunits encoded by the nuclear genome, form the electron transport chain, the primary ATP producer for the cell. Also included within these 37 coding genes are 22 tRNA genes whose function is to transport amino acids to the ribosome and match them to the codons of the mRNAs thus facilitating incorporation of amino acids into the growing polypeptide during translation. The final 2 genes are rRNA genes. The D-loop or control region, although noncoding, contains binding sites for two transcription factors, three conserved sequence blocks (CSBs) associated with initiation of replication and the loop strand termination associated sequences (9, 21, 23, 61), all of which play an important role in the replication of the mitochondrial genome. Although mitochondria have a noted impact on cellular function, their genetic contribution to disease phenotypes is often not assessed.
mtDNA has a mutation rate 10–20 times that of nuclear DNA, probably because of a failure of proofreading by mtDNA polymerases and the lack of an effective repair system (19, 26, 41, 43). Hundreds of mutations in the tRNA, rRNA, and protein coding genes as well as in the D-loop of the human mitochondrial genome are associated with disease (8 and see http://www.mitomap.org). Interestingly, ∼15% of the known disease-related mutations in the human mitochondrial genome are found within the tRNA genes (http://www.mitomap.org). Because of its extensive physiological characterization along with the ability to be genetically manipulated (1, 12, 13, 51, 71), the rat is a useful model for many different complex diseases. However, only three complete rat mtDNA genome sequences are available: the Wistar strain, a wild rat from Copenhagen, and a Brown Norway strain (AC_000022; see Refs. 22 and 41). An updated compilation of available mtDNA sequences can be found at http://megasun.bch.umontreal.ca/gobase.
In this study, we report the complete mtDNA sequences of 10 inbred strains and 2 wild rats. The 10 inbred rat strains in this study were selected based on known genetic diversity assessed using >4,800 microsatellite markers in 48 commonly used inbred strains [ACP Strain Polymorphism Percentage Table, Rat Genome Database, Medical College of Wisconsin, Milwaukee, Wisconsin. World Wide Web (URL: http://rgd.mcw.edu/), April 2006; 67]. The initial strain selection represented a relatively equally spaced cross section of strains previously characterized (67). Moreover, they are commonly used as models of diabetes, hypertension, renal failure, and other complex traits. The two wild rat strains were incorporated into the study to examine their diversity compared with the inbred strains. The goals of our study were to determine the complete mtDNA sequence of each of 10 inbred strains and 2 wild rats, compare amounts and patterns of genetic diversity, and discover novel variants among strains, thereby providing a resource for investigators using these strains of rats.
Twelve different rat strains were studied in this project: August × Copenhagen Irish (ACI/Eur); Fawn-hooded hypertensive (FHH/Eur); Fischer 344 (F344/NHsd); three substrains of Goto-Kakizaki: GK/Swe (from Sweden), GK/Far (from Florida), and T2DN/Mcwi (Type II Diabetic Nephropathy) (53); Genetically Hypertensive (GH/OmrMcwi); Wistar Kyoto (WKY/NCrl); Brown Norway (BN/NHsdMcwi); Dahl salt-sensitive (SS/JrHsdMcwi); and two wild rats: one from Milwaukee, WI (Wild/Mcwi) and one from Tokyo (Wild/Tku). Three previously published complete mtDNA genome sequences from a Wistar rat (NC_001665; 29), a BN rat (AC_000022; denoted BNBCM for this paper), and a wild rat from Copenhagen, Denmark (denoted WCOP for this paper; AJ428514; 52) were also included in the analyses. It should be noted that the publicly available BNBCM sequence (AC_000022) was obtained from a BN rat supplied by the Medical College of Wisconsin and should yield the same sequence as the BN/NHsdMcwi. Even though the wild rats are not “strains” in the same sense that the inbred rats are, we will use the term “strain” hereinafter to refer to all of these rats. For more information on the origin and history of these rat strains see the Rat Genome Database (http://rgd.mcw.edu) or RatMap (http://ratmap.org).
All inbred “Mcwi” rats were housed in the Animal Resource Center of the Medical College of Wisconsin, an American Association of Laboratory Animal Care-approved facility. A local Animal Care and Use Committee set the husbandry and care guidelines used. The animals were housed in climate-controlled rooms that are maintained on a 12:12-h light-dark daily cycle. All inbred animals were studied at 6 wk of age.
Total genomic DNA was extracted from tissue obtained from a tail clip. The tissue was incubated in a standard lysis buffer with 100 μg Proteinase K/ml at 55°C overnight on a shaking platform (53). DNA was isopropanol precipitated and pelleted using centrifugation (14,000 g for 15 min at 4°C). The pellet was washed with 70% ethanol and was resuspended in Tris-EDTA (TE) buffer, pH 7.6. The concentration of the DNA was then determined using a spectrophotometer (Beckman model DU640) by determining the absorption at 260 and 280 nm.
Primers used to amplify overlapping fragments of the mitochondrial genome were designed using Primer3 (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi; 60). When this project began, the BNBCM sequence was not yet available. Therefore, the published mtDNA sequence from the Wistar rat (NC_001665; 29) was used to design 23 overlapping primer pairs covering the complete rat mitochondrial genome. BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start; 38) was used to ensure that the primers designed did not amplify nuclear DNA.
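To illustrate how 23 amplicons of roughly 1 kb can tile a circular genome of about 16.3 kb with overlap between neighbors, the following is a minimal sketch. It does not reproduce the Primer3 design; the genome length (16,313 bp), the fixed amplicon length (975 bp), and the function names are assumptions chosen for illustration, and real amplicon boundaries depend on where suitable primers can be placed.

```python
def tile_circular_genome(genome_len, n_amplicons, amplicon_len):
    """Place n_amplicons evenly spaced amplicons of fixed length on a circle.

    Returns 0-based (start, end) pairs; end is exclusive and wraps modulo
    genome_len, so the last amplicon can run past the origin.
    """
    amplicons = []
    for i in range(n_amplicons):
        start = (i * genome_len) // n_amplicons
        end = (start + amplicon_len) % genome_len
        amplicons.append((start, end))
    return amplicons


def neighbor_overlaps(genome_len, amplicons, amplicon_len):
    """Overlap in bp between each amplicon and the next one around the circle."""
    starts = [start for start, _ in amplicons]
    overlaps = []
    for i, start in enumerate(starts):
        next_start = starts[(i + 1) % len(starts)]
        step = (next_start - start) % genome_len  # distance to the next start
        overlaps.append(amplicon_len - step)
    return overlaps


GENOME_LEN = 16_313   # assumed; the text reports 16,307-16,315 bp
N_AMPLICONS = 23      # from the text
AMPLICON_LEN = 975    # assumed; the text reports products of 949-999 bp

amplicons = tile_circular_genome(GENOME_LEN, N_AMPLICONS, AMPLICON_LEN)
overlaps = neighbor_overlaps(GENOME_LEN, amplicons, AMPLICON_LEN)
print(min(overlaps), max(overlaps))
```

A positive overlap for every neighboring pair means the whole circle is covered; under these assumed lengths each junction is covered by both flanking amplicons for a few hundred base pairs.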
The product sizes of the resulting overlapping PCR amplicons ranged from 949 to 999 bp. PCR was performed in a total volume of 25 μl containing standard PCR buffer, 1.5 mM MgCl2, 18 pmol each primer, 0.5 units Taq DNA Polymerase (Promega, Madison, WI), 20 μM each nucleotide, and 25 ng of template DNA. PCR cycling was performed on an MJ Research (MJ Research/Bio-Rad, Hercules, CA) thermocycler for 1 min at 95°C; followed by 45 cycles of 1 min at 95°C, 1 min at 52°C, and 90 s at 72°C; followed by a final extension of 7 min at 72°C. All PCR products were confirmed with 1% agarose gel electrophoresis. Unpurified PCR products were diluted 1:3.5 with water, and 4 μl of the dilution were added to a 10-μl sequencing reaction containing 30 pmol primer (BigDye Terminator; Applied Biosystems, Foster City, CA) and 5× sequencing buffer (Applied Biosystems). The sequencing reaction was performed on a Perkin-Elmer 9700 thermocycler for 25 cycles at 96°C for 10 s, 50°C for 5 s, and 60°C for 4 min. Reactions were purified with the CleanSEQ dye-terminator removal system (Agencourt, Beverly, MA). DNA sequencing was performed using an ABI 3700 automated sequencer (Applied Biosystems). All fragments were sequenced from both strands.
Sequencing trace files were exported to a UNIX system and analyzed with the Phred/Phrap/Consed package (22, 32). All bases in all strains were called with a Phred quality score of 20 or greater. The resulting sequences, along with the publicly available sequences, were aligned with Clustal W (68). All single nucleotide substitutions and insertion/deletions were visually confirmed in the trace files using Consed. Because the wild Copenhagen rat was not sequenced in our laboratory, we did not include it in our polymorphism analysis, although we did include it in the phylogenetic tree. We observed a very large number of differences between the previously published complete mtDNA genome of a Wistar rat strain (NC_001665; 29) and all of the sequences we obtained. Because these differences may represent sequencing errors in this older sequence, which we were neither able to confirm nor deny, this sequence was not included in further analysis. A phylogenetic tree (neighbor-joining) and pairwise sequence divergence calculations were done using MEGA version 3.1 (40). Sliding window plots of rat-mouse and rat-human divergence, and of rat single nucleotide polymorphism (SNP) density were produced with in-house Perl scripts. The novel sequences have been deposited in GenBank (accession nos. DQ673907-DQ673917).
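Two of the quantities mentioned above can be sketched briefly. A Phred score Q corresponds to an estimated base-calling error probability of 10^(−Q/10), so the Q ≥ 20 threshold means at most about 1 error in 100 calls per base. The sliding-window function below is an illustrative Python stand-in for the in-house Perl scripts, which are not shown in the text; the window size, step, and function names are assumptions, and the windows are linear (they do not wrap around the circular genome).

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the estimated base-calling error probability."""
    return 10 ** (-q / 10)


def sliding_window_density(snp_positions, genome_len, window=500, step=100):
    """Count SNPs in successive windows and express them per kilobase.

    Positions are 1-based. Returns (window_start, window_end, snps_per_kb)
    tuples; windows that would extend past genome_len are not emitted.
    """
    positions = sorted(snp_positions)
    densities = []
    for start in range(1, genome_len - window + 2, step):
        end = start + window - 1
        count = sum(1 for p in positions if start <= p <= end)
        densities.append((start, end, count * 1000 / window))
    return densities


# Toy positions, not data from the study.
for row in sliding_window_density([5, 10, 600], genome_len=1000, window=500, step=250):
    print(row)
```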
The complete mitochondrial genome sequence was determined in 12 different rat strains. The sequence we obtained for the BN strain was identical to AC_000022. The rat mitochondrial genome varies between 16,307 and 16,315 bp in size. The size variation is because of insertions/deletions in the replication origin, D-loop, 16S rRNA (Rnr2), and NADH dehydrogenase subunit 2 (ND2).
We observed 195 polymorphic sites or ∼11.9 polymorphisms/kb (Fig. 1) among the 12 strains studied. Of these, 43 (22%) were found only in a wild rat, either from Tokyo or Milwaukee. The remaining 152 polymorphisms (78%) were found in at least one of the inbred strains. Neither the sequence divergence between species, nor the number of polymorphic sites within rats, was uniform across the mitochondrial genome (Fig. 1). Peaks of divergence and diversity are seen in the D-loop and portions of ND2, ND5, and COII (cytochrome c oxidase subunit II; Cox 2), whereas the ribosomal genes are the most conserved between species. There was no evidence of heteroplasmy in the samples studied. This may be because of the young age of the animals studied, or the tissues used.
About 70% of the mitochondrial genome codes for proteins of the electron transport chain. Within the protein coding genes, 149 polymorphisms were found (76% of the total mutations discovered). The polymorphisms were relatively uniformly distributed throughout this segment of the mitochondrial genome. Of these 149 polymorphisms, 32 created an amino acid change (i.e., a missense substitution). Sixteen of the 32 missense substitutions caused a change in charge of the amino acid structure affecting 12 of the 13 protein coding genes, the exception being cytochrome c oxidase subunit 3 (COIII). A 3-bp insertion/deletion of a histidine residue occurred in NADH dehydrogenase subunit 2 (ND2) in 6 of the 12 strains sequenced. ND2 and NADH dehydrogenase subunit 4 (ND4) exhibited the most missense substitutions with seven each (Table 1).
The tRNA genes make up 9% of the entire mitochondrial genome and yet possess only 5% of the total polymorphisms. The 10 polymorphisms found resided in 6 of the 22 tRNA genes (Table 2); 3 in tRNA-cysteine (C), 2 each in tRNA-aspartic acid (D) and tRNA-proline (P), and 1 each in tRNA-tyrosine (Y), tRNA-histidine (H), and tRNA-threonine (T). Their locations are shown on the inferred secondary structure of these tRNAs in Fig. 2. Four of these polymorphisms were found in the T-loop, 2 in the D-loop, 3 in the acceptor stem, and the final one in the spacer between the anticodon stem and the T-stem of the tRNA.
The two rRNA genes account for 15.5% of the mitochondrial genome with 21 polymorphic sites observed; 15 in the 16S rRNA (Rnr2) and 6 in the 12S rRNA (Rnr1). These polymorphisms contributed 11% of the total polymorphisms identified. The two rRNA genes show extensive sequence variations (Table 3). The 16S rRNA gene (Rnr2) had the highest level of nucleotide variation of any of the mtDNA genes and showed a high level of interspecies divergence. Within the group, we identified an interesting variation at position 1136 in a poly(C) tract. Seven strains had identical genotypes with a stretch of five cytosines, whereas the BN contained six cytosines. The remaining four strains contained large poly(C) tracts, making it difficult to determine the precise number of cytosines. We estimated the approximate number of cytosine residues based on visual inspection of the chromatogram to be 10 in GK/Far and 12 in GH, GK/Swe, and WKY.
The displacement loop, or D-loop, which makes up 5.5% of the mitochondrial genome, contained 14 polymorphisms, including two insertion/deletions. This region contained the highest rate of polymorphisms per kilobase, which was not unexpected since the D-loop is known to be the most highly mutable region of the mitochondrial genome (47, 65). The D-loop (not to be confused with the D-loop of the tRNA molecule) is the ∼1-kb noncoding region of mtDNA that contains the major regulatory elements for the replication and transcription of the mitochondrial genome. The D-loop has been classified into three different highly conserved regions among 26 species: the extended termination-associated sequence (ETAS), which is divided into ETAS1 and ETAS2; the central region; and the CSB domains (CSB1, CSB2, and CSB3; see Ref. 61). The rat central domain stretches 307 bp. We identified one polymorphism (C/T) located at position 15814. The ETAS has been implicated in the termination of heavy (H) strand synthesis, which is important in the termination of replication (21). The ETAS1 region represents 58 bp of sequence in the rat and contained one observed substitution. Four point substitutions occurred in ETAS2. One substitution was unique to the BN rat. An adenosine to guanine substitution occurred at position 15568, whereas the adenosine residue was conserved among 26 species. The CSB domain, defined as CSB1, CSB2, and CSB3 (72), contains elements important in the replication and transcription of mtDNA. We detected a cytosine insertion in CSB2 in FHH and T2DN at location 16093. This mutation occurred in a poly(C) tract, increasing the total number of cytosines to eight in FHH and T2DN (Table 2).
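The statement that the D-loop carries the highest rate of polymorphisms per kilobase can be checked roughly from the genome fractions and counts quoted in the text. The sketch below assumes a genome length of 16,311 bp (the midpoint of the reported range) and takes "about 70%" for the protein-coding fraction at face value, so the resulting densities are approximate.

```python
GENOME_LEN_BP = 16_311  # assumed midpoint of the reported 16,307-16,315 bp range

# name: (fraction of the genome, polymorphism count), both as quoted in the text
regions = {
    "protein-coding": (0.70, 149),
    "tRNA": (0.09, 10),
    "rRNA": (0.155, 21),
    "D-loop": (0.055, 14),
}


def density_per_kb(fraction, count, genome_len=GENOME_LEN_BP):
    """Polymorphisms per kilobase for a region covering `fraction` of the genome."""
    region_kb = fraction * genome_len / 1000
    return count / region_kb


densities = {name: density_per_kb(f, n) for name, (f, n) in regions.items()}

# Print regions from most to least polymorphic.
for name, value in sorted(densities.items(), key=lambda item: -item[1]):
    print(f"{name:15s} {value:5.1f} polymorphisms/kb")
```

Under these assumptions the ordering is D-loop, protein-coding genes, rRNA genes, and tRNA genes, which agrees with the description of the D-loop as the most variable region and the tRNA genes as underrepresented.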
A phylogenetic tree was constructed based on the sequence variation found among the complete mtDNA genomes of the 10 inbred rat strains, the 2 wild rats that we sequenced (Tokyo and Milwaukee), and a wild rat from Copenhagen, Denmark (Fig. 3). The 10 inbred strains grouped into 3 clusters. One of these clusters contains only the BN rat, which is nearly equidistant from all other rats. The wild rat from Tokyo grouped relatively closely with five inbred strains (SS, FHH, T2DN, ACI, and F344). Conversely, the other two wild rats are far removed from all other strains. The average pairwise distance between the inbred clusters was 0.48%, whereas the maximum divergence between any two inbred strains was 0.63% (calculated between the WKY and either the ACI or F344 strains). The average divergence among the wild strains is 0.52%. A similar divergence was noted between either the wild Milwaukee or wild Copenhagen rat and any of the inbred clusters.
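The pairwise divergences reported above were computed in MEGA; the exact distance model and gap treatment are not stated, so the following is only a sketch of the simplest option, the uncorrected proportion of differing sites (p-distance) with gapped columns skipped pair by pair. The toy alignment and strain labels are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def pairwise_distances(alignment):
    """Map each unordered pair of names to its p-distance."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, not sequence data from the study.
toy = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTACGTAT",
    "strain_C": "ACG-ACGTTT",
}
for pair, dist in pairwise_distances(toy).items():
    print(pair, f"{100 * dist:.2f}%")
```

A matrix of such distances is the input that a neighbor-joining method uses to build a tree.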
Quality of data.
The mtDNA genome sequence of the 12 strains reported herein appears to be complete and accurate. The sequence we obtained from a BN (BN/NHsdMcwi) rat is identical to that obtained independently by others using different methods (AC_000022; i.e., whole genome shotgun sequencing). A second publicly available sequence from 36-mo-old F344 × BN F1 animals (AY769440; possessing BN mtDNA) was 99% identical to our BN (54). The study assessed the accumulation of mitochondrial point mutations and deletions with age, which could account for the 1% of sequence dissimilarity (54). We concur with the Rat Genome Sequencing Project Consortium that their mtDNA sequence of the BN (AC_000022) be used as the reference sequence for the rat mitochondrial genome. Because our sequences were obtained by PCR amplifying from total genomic DNA, we were concerned about the possibility of inadvertently amplifying nuclear pseudogenes of mtDNA or nuclear mitochondrial-like sequences (Numts; see Refs. 5, 48, and 58). To determine if this was the case, we compared the complete mtDNA sequence with the rat genome assembly using BLAT (38). Only one stretch of sequence alignment >1 kb, and therefore a possible target for amplification from the nuclear genome, was detected. However, no PCR primer pair was encompassed within this region; therefore, we are confident that no nuclear pseudogenes or Numts were amplified and incorporated in our mitochondrial sequences. The vast majority of matches between mtDNA sequence and nuclear sequence were short fragments of low similarity.
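The Numt check described above reduces to an interval test: an amplicon could arise from the nuclear copy only if both of its primers fall inside the region of nuclear similarity. A minimal sketch of that test follows; the coordinates and amplicon names are hypothetical, and intervals are treated as linear (an amplicon spanning the origin would need to be split first).

```python
def amplicons_inside(region, amplicons):
    """Return names of amplicons whose whole span lies within `region`.

    `region` is a (start, end) pair and `amplicons` maps a name to a
    (start, end) pair, all in the same 1-based inclusive coordinates.
    """
    region_start, region_end = region
    return [
        name
        for name, (start, end) in amplicons.items()
        if region_start <= start and end <= region_end
    ]


# Hypothetical coordinates for illustration only.
nuclear_match = (4000, 5200)
example_amplicons = {
    "amp05": (3600, 4570),   # starts outside the matching region
    "amp06": (4300, 5280),   # ends outside the matching region
}
print(amplicons_inside(nuclear_match, example_amplicons))
```

An empty result corresponds to the situation reported in the text, in which no primer pair was encompassed by the region of nuclear similarity.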
Protein coding genes.
Within the protein coding genes, 149 polymorphisms were found. Of these, 32 created an amino acid change (Table 1). Ultimately, the goal will be to determine if any of the 32 missense substitutions observed could result in a phenotypic difference among rat strains, especially if it confers disease susceptibility. Without extensive breeding and backcrossing followed by detailed physiological phenotyping, both of which are beyond the scope of the present study, we were left to make predictions based on the presently available data. Amino acid replacements can be categorized into conservative or nonconservative based on biochemical properties or on the frequency of observed substitutions from large datasets. Alternatively, we can examine if any given amino acid position is highly conserved in a multiple species alignment to ascertain whether or not this position is allowed to vary over evolutionary time.
Most of the observed polymorphisms were conservative in the sense that the two allelic amino acids do not differ greatly in biochemical properties, such as polarity, charge, or size. Nearly one-quarter (8 of 30) involved either a valine/alanine or valine/isoleucine change. Nonconservative polymorphisms may be more likely to alter protein function. For example, the E35K polymorphism in the ATP synthase 6 gene (ATPase6), found in the GH, GK/Swe, WKY, and GK/Far strains (Table 1), involved a change between glutamic acid, an acidic residue, and lysine, a basic residue. Such a change could affect the secondary, tertiary, or quaternary structure of the protein, the Fo sector of the H+-transporting ATP synthase. In addition to potentially altering the three-dimensional structure, such a change in charge could affect the efficiency of transporting positively charged protons through channels in this protein (10, 20, 25).
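One simple way to flag substitutions that change side-chain charge, such as E35K, is a lookup of the nominal charge of each residue. The sketch below is an illustration and not the classification scheme used for Table 1, which is not described in the text; in particular, treating histidine as neutral is a simplifying assumption.

```python
import re

# Nominal side-chain charge near physiological pH. Histidine is treated as
# neutral here, which is a simplifying assumption.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def parse_substitution(label):
    """Split a label such as 'E35K' into (reference, position, alternate)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, position, alt = match.groups()
    return ref, int(position), alt


def changes_charge(label):
    """True if the two residues differ in nominal side-chain charge."""
    ref, _, alt = parse_substitution(label)
    return CHARGE.get(ref, 0) != CHARGE.get(alt, 0)


# Substitution labels mentioned in the text.
for label in ["E35K", "V453I", "N150S", "D101N"]:
    print(label, changes_charge(label))
```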
We may be able to predict whether any given nonsynonymous polymorphism is likely to be functionally or physiologically important by determining if that amino acid position is highly conserved through evolutionary time. The more dramatic a mutation is, the less likely it is to be retained by natural selection. For example, the Wild/Mcwi rat has a unique amino acid replacement, valine to isoleucine (V453I), in the cytochrome c oxidase subunit I (COI) protein. We can infer that this substitution is not likely to adversely affect the function of the oxidative phosphorylation pathway, since the homologous position varies substantially among other rodents. In contrast, an amino acid replacement of an asparagine by serine (N150S) in ND2, common to the four strains derived in Japan from an outbred Wistar colony (WKY, GH, GK/Far, and GK/Swe), occurs at a position that is much more highly conserved; all other murid rodents that have complete mtDNA sequence available (Mus, Volemys, and Nannospalax), and humans, possess an asparagine at the homologous position, suggesting a potentially important mutation to assess in more detail.
In determining whether or not there is a causative relationship between any observed amino acid polymorphism and a disease phenotype, we note here the advantage of looking at multiple strains, with and without susceptibility to a disease, rather than just one disease and one normal strain. It has been previously reported that a mutation (D101N) in ATPase6 seen in the BHE/cdb rat causes age-related impaired glucose tolerance, a hallmark of type 2 diabetes mellitus (44, 45). These authors reported that the BHE/cdb rat possesses an asparagine (N) at codon 101, whereas the nondiabetic Sprague-Dawley (SD) rat possesses an aspartic acid (D). Because the BHE/cdb rat shows maternal inheritance of type 2 diabetes, the authors concluded that this amino acid substitution explained the impaired glucose tolerance. However, the sequence in all of our 12 rat strains, both diabetic and nondiabetic, contained an asparagine residue at position 101. Thus it appears it is the “normal” SD rat that possesses a unique amino acid substitution, and therefore the D101N substitution in the ATPase6 protein does not likely explain the impaired glucose tolerance observed in the BHE/cdb rat.
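The reasoning in this paragraph, that a two-strain comparison cannot show which strain carries the unusual residue, can be made concrete by tabulating the residue at ATPase6 position 101 across all strains and reporting those that depart from the majority. The residue calls below follow the text (asparagine in all 12 strains sequenced here and in BHE/cdb, aspartic acid in SD); the abbreviated strain labels and the function name are illustrative.

```python
from collections import Counter

# Residue at ATPase6 position 101, as described in the text.
residue_101 = {
    "ACI": "N", "FHH": "N", "F344": "N", "GK/Swe": "N", "GK/Far": "N",
    "T2DN": "N", "GH": "N", "WKY": "N", "BN": "N", "SS": "N",
    "Wild/Mcwi": "N", "Wild/Tku": "N",
    "BHE/cdb": "N",  # diabetic strain from the earlier report
    "SD": "D",       # nondiabetic comparison strain from the earlier report
}


def minority_calls(calls):
    """Return the strains whose residue differs from the most common one."""
    majority, _ = Counter(calls.values()).most_common(1)[0]
    return {strain: aa for strain, aa in calls.items() if aa != majority}


print(minority_calls(residue_101))
```

With only BHE/cdb and SD in the table there is no majority to compare against, which is why the additional strains are informative.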
RNA genes and the D-loop.
We identified a total of 10 polymorphisms occurring in 6 of the 22 tRNA genes. Mutations in tRNAs play an important role in human disease as evidenced by the high incidence of tRNA mutations found in patients with mitochondrial disease (28, 64). These mutations can cause a major decrease in the level of aminoacylated tRNA, leading to a respiratory chain defect (14), and in turn either directly cause or confer susceptibility to disease in humans. For example, the A3243G mutation in the tRNA-Leu [L(UUA/G)] gene in humans has been shown to be associated with mitochondrial encephalopathy, lactic acidosis, and stroke-like episodes (MELAS; see Ref. 50). Because it is difficult to predict the effect any given nucleotide variant may have on the secondary and tertiary structure of a tRNA (27, 74), it is also difficult to hypothesize that the variant causes an observed phenotype (49). However, with some knowledge of the predicted secondary structure combined with an evolutionary perspective, we can make a reasonable hypothesis as to which observed polymorphisms are more or less likely to cause phenotypic variation, including disease. Of the 10 sequence variants identified, 6 are in one of the less well conserved loops of the predicted classic cloverleaf folding of the tRNA (Fig. 2) and are therefore less likely to cause any meaningful functional difference. Two nucleotide substitutions are found in the acceptor stem, the double-stranded segment of the tRNA molecule that carries the amino acid to the growing polypeptide. One of these, a T67C substitution found in the tRNA-aspartic acid (D) gene of the Wild/Mcwi rat, increased the number of Watson-Crick (W-C) pairs in the stem. However, in the A5G variant of the tRNA-histidine (H) gene, the BN strain possesses a guanine and all of the other strains have an adenosine, reducing the number of W-C pairs in the acceptor stem from six to five in the BN.
Examination of the sequence of all placental mammals for which complete mtDNA genome sequence is available in GenBank and use of the consensus folding of mammalian tRNAs (33) to infer the acceptor stem sequence revealed that 94 out of 95 of these mammals have either 6 or 7 W-C pairs out of a possible 7 pairings (see Fig. 2); the lone exception was the Amazonian pink river dolphin (Inia geoffrensis). This evolutionary conservation suggests that natural selection has strongly favored maintaining strong bonding between the two strands of this stem. The BN strain, with only five of the seven W-C pairs, may have reduced stability of this mitochondrial tRNA.
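Counting W-C pairs in a stem amounts to pairing the 5′ strand with the 3′ strand read in reverse and tallying A-T (or A-U) and G-C matches. The sketch below shows this for a seven-position stem; the two sequences are hypothetical and are not the rat tRNA-histidine acceptor stem, and G-T (G-U) wobble pairs are deliberately not counted as W-C pairs.

```python
WC_PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def count_wc_pairs(five_prime, three_prime):
    """Count Watson-Crick pairs in a stem.

    Both strands are given 5'->3'. Position i of the 5' strand is paired
    with position i of the reversed 3' strand. Wobble pairs are not counted.
    """
    if len(five_prime) != len(three_prime):
        raise ValueError("stem strands must have equal length")
    partners = reversed(three_prime.upper())
    return sum((a, b) in WC_PAIRS for a, b in zip(five_prime.upper(), partners))


# Hypothetical seven-position stem, fully paired.
print(count_wc_pairs("GTAAATA", "TATTTAC"))
# Same stem with an A-to-G change at the fifth position of the 5' strand,
# which replaces an A-T pair with a G-T wobble.
print(count_wc_pairs("GTAAGTA", "TATTTAC"))
```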
Although the two rRNA genes exhibited extensive sequence variation (11% of total polymorphisms), we are unable to speculate on whether they manifest any functional consequence. Many groups have associated mutations within the rRNA genes with disease, but a strong phenotypic correlation resulting from one of these mutations has yet to be determined (35, 62).
The D-loop contained 14 polymorphisms, including two insertion/deletions. This region contained the highest rate of polymorphism per kilobase, which was not unexpected since the D-loop is known to be highly mutable (47, 65). Of interest was a cytosine insertion found in FHH and T2DN. This mutation occurred in a poly(C) tract, increasing the total number of cytosines to eight in these strains (Table 1). Studies done in homopolymeric tracts in the D-loop by Malik et al. (42) indicated decreasing fidelity of mtDNA replication machinery in cytosine tracts longer than eight. Other groups have also reported a mutation within the D-loop of the human mtDNA genome (T16189C) associated with type 2 diabetes (55, 56) and dilated cardiomyopathy (39).
The FHH and T2DN rats have identical mtDNA genome sequences, as expected since the T2DN strain was created by repeated backcrossing of a female FHH to a GK/Swe male (53). The SS rat (SS/JrHsdMcwi) differs from these two strains by only a single base pair deletion in the poly(C) stretch of the D-loop, a region known to mutate rapidly (in humans) with respect to the number of cytosine residues (11). This sequence similarity was surprising when the history of the strains is considered. The SS rat was derived from an outbred colony of Sprague-Dawley rats in the early 1960s (16, 17, 57), whereas the FHH rat (FHH/EurMcwi) originated from the outbred Fawn-Hooded (FH) rat. The FH resulted from unplanned crossbreeding potentially involving any of the following strains: German-brown rat, white Lashley rat, Wistar, and Long-Evans (63, 70). Therefore it seemed unlikely, a priori, that these two strains would possess nearly identical mitochondrial genomes.
The largest sequence divergence seen between any two of the inbred rats was 0.63%, which is very similar to the largest divergence seen in a sample of 53 humans from around the world (0.68%; see Ref. 36). This sequence divergence between inbred rats was just slightly larger than the sequence divergence between several Mus musculus musculus strains (4) and a Mus musculus domesticus strain NZB/B1NJ (0.53–0.60%; GenBank accession no. L07095), although other M. musculus domesticus complete mitochondrial genome sequences available (AB042432) are nearly identical to the M. musculus musculus sequences. Although the largest divergence within the rat strains is similar to that seen in modern humans and between M. musculus musculus and M. musculus domesticus, it is about four times less than the divergence between Mus musculus molossinus (GenBank accession no. NC_006915) and several M. musculus musculus strains (2.4–2.5% divergence).
The 10 inbred strains group phylogenetically into 3 clusters (Fig. 3). One of these consists of only the BN rat. A second cluster contains the GH, GK/Far, GK/Swe, and WKY, all of which were derived from an outbred Wistar colony in Japan. Differences among the mtDNA sequences of these strains may originate from either diversity present in the ancestral outbred Wistar colony or new mutations arising after the individual strains were developed. The remaining five inbred strains are FHH, T2DN, SS, ACI, and F344. These latter two strains were both developed in the 1920s at Columbia University by Curtiss and Dunning (32a) and therefore may have had a common maternal ancestor ∼80 years ago. Although the wild Tokyo rat groups relatively closely with this cluster of five strains, the other two wild rats are not closely related to any inbred strain and are about as divergent from each other as they are from the inbreds. Two conclusions regarding the levels of diversity among inbred strains can be drawn from this. First, the sampled inbred strains contain substantially different mitochondrial haplotypes, as different as any two wild strains. Therefore, the degree of genetic and phenotypic differences seen between strains may be reflective of those between common variants within a natural population. Of course, the obvious caveat is that with only three wild rats sampled, we may be vastly underestimating the amount of variation in living, natural wild populations. Second, there remains a substantial amount of mitochondrial genetic variation in the wild rats that was not captured in the domesticated inbred strains. It may be worth considering sampling additional wild rats to identify novel genetic variants to develop new models of mitochondrial disease. We note, for example, that the wild rat from Copenhagen possesses two missense substitutions (H93Q, A94P) in the cytochrome b gene (Cytb) not observed in any other strain.
These residues are next to one of the iron-containing heme groups, as judged by the position of these residues in the highly conserved bovine homolog, whose three-dimensional structure has been deduced (30). This wild rat had missense substitutions in the three genes that did not contain any such substitutions in our sample (Cytb, ND1, and COIII), as well as substitutions in two tRNA genes (tRNA-glutamine and tRNA-serine-1) that otherwise had no polymorphisms in our 12 rats. Sampling a larger number of additional wild rats may also allow for identification of functionally important conserved regions (7), especially for the noncoding D-loop as has been done using cross-species comparisons (61).
The phylogenetic tree of the 10 inbred strains based on complete mitochondrial sequences presented herein is generally similar to a previously published phylogeny made using data from nuclear microsatellite markers (67). In both studies, the ACI, F344, FHH, and SS strains grouped together, although in the microsatellite tree the FHH and SS were not nearly so similar as seen from their mitochondrial genomes. The nuclear data also grouped the GK with the WKY (67), as did the mitochondrial data. However, the GH rat, also derived in Japan along with the GK and WKY, grouped far away from these latter two strains in the microsatellite-based tree, but the three strains were closely related in the mitochondrial tree. Such discrepancies are not surprising given the differences in inheritance between the mitochondrial and nuclear genomes. Data from the International SNP consortium, available in the future, will aid in comparisons between the nuclear and mitochondrial genomes.
Models of mitochondrial disease.
Rat models currently exist for human diseases and are continuing to be developed. The advent of congenic and consomic rat models has continued to further our understanding of the genetic basis of disease (15, 37, 46, 59). However, the development of rodent disease models has focused on the nuclear genome and largely ignored the mitochondrial genome.
Many human diseases, such as diabetes mellitus, Alzheimer's Disease, progressive encephalopathy, Leber's hereditary optic neuropathy, MELAS, myoclonus epilepsy and ragged-red fibers, and Leigh syndrome are linked to mitochondrial genomic alterations (http://www.mitomap.org/cgi-bin/mitomap/tbl8gen.pl; http://www.mitomap.org/cgi-bin/mitomap/tbl9gen.pl). A single mutation in a mitochondrial gene was recently linked to myriad symptoms of the metabolic syndrome (73). Aside from established mitochondrial disorders, it has also been suggested that mitochondrial myopathies be considered in patients with unexplained, slowly progressing multisystem disorders (18).
Also overlooked are the defects in signaling between the mitochondrial and nuclear genomes (34). Only 13 of the nearly 100 proteins needed to assemble the electron transport chain are coded by the mitochondrial genome. The remaining proteins are coded by the nuclear genome and subsequently transferred to the mitochondria. As genetic models of disease are developed (e.g., congenic and consomic strains), the consequences of mixing genomes are largely ignored. The sequence variation present among the strains studied reveals that mixing of the genomes may not produce inconsequential results. Knowing the sequence variants may help us to deconstruct these complex interactions.
One can speculate about potential effects of mitochondrial differences within rat strains that may, for example, contribute to the overall spectrum of aerobic running endurance and cardiac performance seen in inbred strains (2, 3). It is generally accepted that mtDNA mutations can be major contributors to human pathologies and possibly to aging. However, heteroplasmy (the mixture of wild-type and mutated mtDNA within a cell) can further complicate the detection and development of disease (49, 66).
In conclusion, we provide the DNA sequence of the complete mitochondrial genome for several of the commonly used rat strains that have not been reported so far. These strains are frequently used in studies of human disease and for testing drugs. We found substantial genetic variation in mtDNA among strains that can now be systematically assessed physiologically. New models may be discovered based on the genetic variants identified at specific genes. In addition, because different inbred strains are used by different groups and fields, providing the complete sequence of additional strains may prove useful. For example, the sequence of the mitochondrial genome of the BN rat, popularly used in physiological research, is available (AC_000022). That of the F344, commonly used in pharmacology and toxicology (24, 69), is shown to be substantially different from the BN. Finally, we note the utility in having complete sequences from multiple strains for comparisons between disease model strains and unaffected control strains, rather than just the sequences of two strains.
Current address for M. I. Jensen-Seaman: Dept. of Biology, Duquesne University, Pittsburgh, PA.
Copyright © 2006 the American Physiological Society