Introduction
Mitochondrial DNA is widely used in molecular studies, particularly in animals, due to its unique features including maternal inheritance without recombination, compact size, and high density in tissues (
Avise 1994;
Jae-Heup et al. 2001;
Schwartz et al. 2007). One of the most useful and commonly studied regions of the mammalian mitochondrial genome in population genetics is the control region or D-loop, as it is the largest non-coding region with a high mutation rate (
Fig. 1;
Hoelzel et al. 1991;
Brown et al. 1993;
Jae-Heup et al. 2001). However, the benefits of using the control region when investigating population structure, intraspecific variation, and phylogenetic relationships (
Avise 1994) can be complicated when two or more mitochondrial genomes exist in the same cell leading to variation within individuals, i.e., heteroplasmy (
Wilkinson and Chapman 1991).
The control region in felids is generally split into three domains: the left domain, which includes the first hyper-variable sequence (HVS-1) and repetitive sequence 2 (RS-2); the right domain, which includes the second hyper-variable sequence (HVS-2) and repetitive sequence 3 (RS-3); and the central conserved region (CCR), situated between the two, which has relatively low variation compared to the others (
Fig. 2A;
Jae-Heup et al. 2001). RS-2 comprises an array of ∼80 bp tandem repetitive units that have been identified in several mammals (
Fig. 2B;
Wilkinson and Chapman 1991;
Hoelzel et al. 1994), including domestic cat (
Felis catus) (
Lopez et al. 1996) and Eurasian lynx (
Hellborg et al. 2002). The number of complete copies of these repetitive units can range from 1 to 4 across several felid species including cheetah (
Freeman et al. 2001) and great cats in the genus
Panthera (
Jae-Heup et al. 2001), though strangely, the RS-2 repetitive sequence is either absent or highly reduced in the clouded leopard (
Neofelis nebulosa) (
Wu et al. 2007).
Jae-Heup et al. (2001) also found single nucleotide polymorphisms (SNPs) in the RS-2 repeat unit within individual cats, suggesting that repeat copy number is not the only source of heteroplasmic variation.
These repetitions in RS-2 and RS-3 are thought to be the result of misalignment and slippage due to the D-loop’s involvement in replication and are expected to be under stabilizing selection (
Buroker et al. 1990;
Hoelzel et al. 1994). For example, multiple repeats may be selected for if they aid in binding proteins required for replication or compensate for deleterious mutations, but a larger genome will also result in slower replication and likely be selected against (
Wilkinson et al. 1997;
Jae-Heup et al. 2001;
Clark et al. 2012). Additionally, the proportions of heteroplasmic mutations vary depending on the individual, age, and type of tissue, and may be influenced by tissue-specific selection for new variants or as a by-product of accumulated oxidative damage (
Matthews et al. 1994;
Michikawa et al. 1999;
Jae-Heup et al. 2001).
Mutations in the mitochondrial genome can cause severe disease in humans (
Kopinski et al. 2019;
Jackson et al. 2020), inbred lab mice (
Lechuga-Vieco et al. 2022), domestic dogs (
Li et al. 2006;
Baiker et al. 2009), and at least one domestic cat (
Dell'Era et al. 2021). Two mutations in the tRNA-Leu gene were linked to selective symmetrical necrotizing encephalopathy in a house cat resulting in a progressive reduction in cognitive and motile function (
Dell'Era et al. 2021). While this disease is not a direct result of heteroplasmy, the two causative mutations do present in a heteroplasmic fashion (
Dell'Era et al. 2021), which in turn, can affect the clinical presentation of the disease and make diagnosis challenging (
Jackson et al. 2020). Clinical disease in humans can either vary in severity according to the proportion of mutant DNA in the cell or present suddenly when a threshold of 60% is reached (
Jackson et al. 2020). Furthermore, wild felid species with low genetic diversity can suffer inbreeding depression from abnormal sperm morphology and impaired sperm metabolic function (
Pukazhenthi et al. 2006;
Terrell et al. 2010). Since impairment of mitochondrial metabolism or defects in the mitochondrial genome reduce sperm motility in humans (
Amaral et al. 2013;
Irigoyen et al. 2022), there may also be a vital link between mitochondrial heteroplasmy and evolutionary fitness in wild felids.
Here, we aim to investigate intra-individual variation in the mitochondrial control region of Canada lynx (
Lynx canadensis) which has not been described despite previous studies quantifying population-level variation (e.g.,
Rueness et al. 2003;
Prentice et al. 2019). The Canada lynx is part of a keystone predator–prey cycle in the boreal forest and undergoes dramatic fluctuations in population size (
Krebs et al. 2018). Thus, Canada lynx represents a valuable opportunity to study how mitochondrial heteroplasmy is maintained or influences fitness in small populations which will eventually assist in conservation and management decisions. With this in mind, we provide the first characterizations of heteroplasmic variation in copy number of the repetitive sequence RS-2 and high inter-individual variation in copy number of repetitive sequence RS-3 in this species and discuss the implications for stabilizing selection on genome evolution.
Discussion
The control region is commonly sequenced for population and phylogenetic studies, but in many mammals, particularly felids, variable copy numbers of the RS-2 repeat unit can indicate heteroplasmy and distort sequence quality (
Hoelzel et al. 1991;
Brown et al. 1993;
Jae-Heup et al. 2001). Although an electrophoretic signature of heteroplasmy was reported in Canada lynx (
Foran et al. 1997), this phenomenon has never been confirmed by genetic sequence data, nor has it been reported in subsequent molecular studies (
Rueness et al. 2003;
Prentice et al. 2019). Hence, our study is the first to sequence isolates with different copy numbers and describe this phenomenon in this species. We were able to isolate and sequence five length polymorphisms from the left domain corresponding to the copy number of RS-2 repeat units that can exist simultaneously within individual lynx. Furthermore, we documented high haplotype diversity in the right domain through variability in the RS-3 repeat array among individuals.
We easily separated and counted four or five bands per individual through initial gel electrophoresis, but band-stab PCR allowed us to detect up to seven different band sizes from the left domain. The bands in the middle of the size range (corresponding to two and three complete repeats) were the brightest and most common while the longest band was only detected in one individual, and we were unable to isolate this product. Furthermore, we found that sequencing original PCR products (without band-stab or cloning) consistently resulted in sequences with three complete repeat units across all samples, suggesting that three copies is the most common number of repeats within and among individuals (
Fig. 2B). Nanopore sequencing confirmed that amplicons with three complete repeat copies were the most frequent among individuals, and that frequency declined symmetrically as the number of repeat copies differed from three (
Fig. 5). Our success at consistently sequencing three complete repeat copies from the original product also potentially explains why several previous studies never reported variation in repeat units or evidence of heteroplasmy (e.g.,
Rueness et al. 2003;
Prentice et al. 2019). The pattern in our observed copy number frequencies is similar to Eurasian lynx which had two and three complete RS-2 repeats in 23% and 74% of individuals, respectively (
Sindičić et al. 2012). Our pattern of declining repeat frequency from a mode of three supports the idea that the copy number of repeat units is under stabilizing selection (
Buroker et al. 1990;
Hoelzel et al. 1994;
Lopez et al. 1996) and suggests that three complete repeats is the optimal copy number to balance replication efficacy and efficiency.
While we did not find SNPs within individuals when sequencing isolated fragments, we did find evidence of polymorphisms in the form of “heterozygous” chromatogram peaks when sequencing original PCR products without isolation, possibly as a result of SNPs among different mitochondrial genomes. Polymorphisms within the RS-2 repeat units are not uncommon and have been found in several species (
Freeman et al. 2001;
Jae-Heup et al. 2001; see raw data from
Rueness et al. 2003), supporting the notion that heteroplasmic SNPs are possible in this region. Oddly,
Sindičič et al. (2012) did not find any SNPs in the RS-2 repeat units of the closely related Eurasian lynx, though this may be a consequence of low genetic diversity across their sampling area. All inter-individual SNPs occurring outside of the RS-2 array were consistent across multiple band-stab isolates from the same individual, meaning heteroplasmic mutations seem to be minimal outside of the RS-2 region. However, two of the four clones introduced apparent SNPs outside of the RS-2 region, which were not present in band-stab or original PCR sequences for the same individual. While these SNPs are likely artifacts of the cloning procedure and not heteroplasmic mutations, we make note of any possibility of mitochondrial mutations as they can lead to serious mitochondrial disease in domestic cats (
Dell'Era et al. 2021), domesticated dogs (
Li et al. 2006;
Baiker et al. 2009), inbred lab mice (
Lechuga-Vieco et al. 2022), and humans (
Kopinski et al. 2019;
Jackson et al. 2020).
While mitogenome duplication has been described as mitochondrial heteroplasmy in Eurasian lynx and bobcat (
Croteau et al. 2012;
Sindičić et al. 2012), research on other species has suggested that intra-individual mitogenome sequence variants may also be attributed to NUMTs (
Lopez et al. 1994;
Antunes et al. 2007;
Calabrese et al. 2017). Several NUMTs that contain the control region have been discovered in the domestic cat genome including on chromosome D2 (
Lopez et al. 1994;
Antunes et al. 2007), leading to the possibility that NUMTs may also exist in the Canada lynx genome. When comparing to the Canada lynx reference genomes, our sequences aligned better with the mitochondrial genome than the nuclear genome providing confidence to reject NUMTs in our data. However, we cautiously point out that Canada lynx likely has NUMTs on more than one chromosome, as we also found several partial alignments between the mitochondrial and nuclear reference genomes themselves without including our sequence data.
While beyond the scope of this paper, additional approaches for elucidating heteroplasmy versus NUMTs include isolating mitochondria for sequencing, long-read sequencing to capture large overlapping portions of the mitogenome to exclude the possibility of NUMTs (
Patterson et al. 2023), or filtering tools that are designed to remove NUMTs from mitochondrial data, although, generally these programs require the usage of protein-coding genes, which is not the case for our data (
Andújar et al. 2021;
Graham et al. 2021;
Porter and Hajibabaei 2021). Ultimately, we cannot entirely exclude the possibility of NUMTs in our data, but regardless of whether mitochondrial heteroplasmy or NUMTs are the cause of genome duplication in Canada lynx, the potential for increased genetic diversity and influence of stabilizing selection remains.
The RS-3 repeat array in Canada lynx generally matches the patterns seen in many other felid species. The most dominant core motif was TACAC, which appears to be unanimous among felines. Furthermore, like tiger (
Panthera tigris), leopard (
P. pardus), clouded leopard, and bobcat, Canada lynx also showed the rarer core motif TACGC (
Jae-Heup et al. 2001;
Sindičič et al. 2012). Interestingly, unlike the aforementioned species, the placement and occurrence rate of TACGC was remarkably consistent among individual lynx, appearing exclusively as the first motif from the 5′ end in all but one sample. Canada lynx also showed similar fundamental patterns in the suffixes at the end of the repeat motifs but still differed from even closely related congeners (bobcat and Eurasian lynx) upon closer inspection. Canada lynx and Eurasian lynx both showed –G as the most common suffix, but Canada lynx had far more –GTA repeats (41% occurrence), which usually appeared only at the 3′ end in Eurasian lynx (
Sindičić et al. 2012). The most common suffix in bobcat was –ACG which was completely absent in Eurasian lynx and generally found only once in the second section for Canada lynx. Both bobcat and Eurasian lynx also had occurrences of –ACGTA which were not found at all in Canada lynx. Hence, the overall sequence repeat pattern of RS-3 appears to be unique in Canada lynx, as it is among all other felids investigated so far. While the full haplotype sequence of the RS-3 array was previously deemed uninformative for phylogenetic analysis (
Jae-Heup et al. 2001;
Ketmaier and Bernardini 2005), the general pattern of suffix repeats may be indicative of historical speciation among felids. One example could be the –ACG suffix commonly found in other felids, including bobcat, but is highly reduced or absent in other members of genus
Lynx (
Sindičić et al. 2012). Previous phylogenetic analyses have suggested that bobcat is the most basal member of the
Lynx lineage (
Johnson et al. 2004;
Li et al. 2016) therefore, a reduction of the –ACG suffix could be an indicator of speciation within
Lynx.
Here, we document heteroplasmy in the Canada lynx mitochondrial genome by isolating and sequencing products with heterogeneous repeat copy numbers within individuals, thereby adding Canada lynx to the list of felids exhibiting genome duplication. Notably, this research is the first description of intra-individual control region variation in this species, as this phenomenon has not been reported in previous population studies that have successfully sequenced across the left domain (
Rueness et al. 2003;
Prentice et al. 2019). Our results also demonstrate the efficacy of the band-stab method for isolating products from gels separated by as little as 80 bp. While the repeat sequence RS-3 has previously been deemed uninformative for phylogenetic analyses (
Jae-Heup et al. 2001;
Ketmaier and Bernardini 2005), sequencing different RS-2 length polymorphisms within individuals, particularly the first complete repeat from the 3′ end, may be of value to phylogenetic or population genetic analyses (
Wilkinson and Chapman 1991;
Jae-Heup et al. 2001). Ultimately, mitochondrial heteroplasmy provides evolutionary potential for increased genomic diversity which appears to be selected for among felids, possibly under stabilizing selection as a trade-off between increased transcription and slower replication (
Wilkinson et al. 1997;
Jae-Heup et al. 2001;
Clark et al. 2012). Insight into the evolutionary advantage of this phenomenon will likely require a greater understanding of feline mitochondrial function and disease, though currently, there is little evidence for severe mitochondrial disease other than a singular case of selective symmetrical necrotizing encephalopathy in domestic cat (
Dell'Era et al. 2021) and a potential link with teratospermia in wild felids (
Pukazhenthi et al. 2006;
Terrell et al. 2010). Perhaps this is because most heteroplasmic mutations in felids are restricted to non-coding regions, and thus have little detrimental effect. Further studies might investigate how individual factors known to influence heteroplasmy (i.e., age or tissue type) may affect the amplification and degree of heteroplasmy. Alongside this, a wider study documenting the extent of heteroplasmy across the species’ range may prove useful in further understanding this molecular phenomenon. Finally, we caution future studies to be aware of this genome duplication and to take care when inspecting sequence data.