bims-rednas Biomed News
on Repetitive DNA sequences
Issue of 2025–04–06
sixteen papers selected by
Anna Zawada, International Centre for Translational Eye Research



  1. medRxiv. 2025 Mar 19. pii: 2025.03.17.25323863. [Epub ahead of print]
       Background: Hereditary ataxias are genetically diverse, yet up to 75% remain undiagnosed due to technological and financial barriers. A pathogenic ZFHX3 GGC repeat expansion was recently linked to spinocerebellar ataxia type 4 (SCA4), characterized by progressive ataxia and sensory neuropathy, with all reported cases in individuals of Northern European ancestry.
    Methods: We performed Oxford Nanopore Technologies (ONT) genome long-read sequencing (>115 GB per sample) on a total of 15 individuals from Chile: 14 patients with suspected hereditary movement disorders and one unrelated family member. Variants were identified using PEPPER-Margin-DeepVariant 0.8 (SNVs), Sniffles 2.4 (SVs), and Vamos 2.1.3 (STRs). Ancestry was inferred using GenoTools with reference data from the 1000 Genomes Project, Human Genome Diversity Project, and an Ashkenazi Jewish panel. Haplotype analysis was conducted by phasing SNVs within ZFHX3, and methylation profiling was performed with modbamtools.
    Results: We identified ZFHX3 GGC repeat expansions (47-55 repeats) in four individuals with progressive ataxia, polyneuropathy, and vermis atrophy. One case presented with parkinsonism-ataxia, expanding the phenotype. Longer expansions correlated with earlier onset and greater severity. Hypermethylation was detected on the expanded allele, and haplotype analysis linked ultra-rare ZFHX3 variants to distant Swedish ancestry.
    Conclusion: This is the first report of SCA4 outside Northern Europe, confirming a shared founder haplotype and expansion instability. The presence of parkinsonism broadens the clinical spectrum. Comprehensive genetic testing across diverse populations is crucial, and long-read sequencing enhances diagnostic yield by detecting repeat expansions and SNVs in a single assay.
    DOI:  https://doi.org/10.1101/2025.03.17.25323863
  2. Plant J. 2025 Apr;122(1): e70123
      Variegation, a common phenomenon in plants, can be the result of several genetic, developmental, and physiological factors. Leaves of some lettuce cultivars exhibit dramatic red variegation; however, the genetic mechanisms underlying this variegation remain unknown. In this study, we cloned the causal gene for variegation in lettuce leaves and elucidated the underlying molecular mechanisms. Genetic analysis revealed that the polymorphism of variegated versus uniformly red leaves is caused by an "AT" repeat in the promoter of the RLL2A gene encoding a MYB transcription factor. Complementation tests demonstrated that the RLL2A allele (RLL2AV) with (AT)n repeat numbers other than five led to variegated leaves. RLL2AV was expressed in the red spots but not in neighboring green regions. This expression pattern was consistent with a relatively low level of methylation of a retrotransposon inserted at -761 bp of the gene in the red spots, compared with high methylation of the retrotransposon in the green region. The presence of (AT)5 in the promoter region, however, stabilized the expression of RLL2A, resulting in uniformly red leaves. In summary, we identified a novel promoter mechanism controlling variegation through inconsistent levels of methylation and showed that the presence of a simple sequence repeat of specific size could stabilize gene expression.
    Keywords:  DNA methylation; Lactuca sativa; MYB transcription factor; RLL2A; anthocyanin biosynthesis; retrotransposon; tandem repeat
    DOI:  https://doi.org/10.1111/tpj.70123
  3. Theor Appl Genet. 2025 Apr 03. 138(4): 91
       KEY MESSAGE: Pangenome graphs enable population-scale genotyping and improve expression analysis, revealing that structural variations (SVs), particularly transposable elements (TEs), significantly contribute to gene expression variation in winter oilseed rape. Structural variations (SVs) impact important traits, from yield to flowering behaviour and stress responses. Pangenome graphs capture population-level diversity, including SVs, within a single data structure and provide a robust framework for downstream applications. They have the potential to serve as unbiased references for SV genotyping, pan-transcriptomic analyses, and association studies, offering significant advantages over single reference genomes. However, their full potential for expression quantitative trait locus (eQTL) analysis is yet to be explored. We combined long- and short-read whole-genome sequencing data with expression profiling of Brassica napus (oilseed rape) to assess the impact of SVs on gene expression regulation and explored the utility of pangenome graphs for eQTL analysis. Over 90,000 SVs were discovered from 57 long-read datasets. A pangenome graph was evaluated as a reference and used for SV genotyping with short reads and for transcript expression quantification. Using SVs genotyped from the graph and 100 expression datasets, we identified 267 gene-proximal (cis) SV-eQTLs. Over 70% of eQTL-SVs had similarity to transposable elements (TEs), especially Helitrons. The highest proportion of cis-eQTL-SVs was found in promoter regions. About a third of the transcripts whose expression was associated with SVs had no associated SNPs, suggesting that including SVs captures relationships that would be missed in SNP-only analyses. This study demonstrated that pangenome graphs provide a unifying framework for eQTL analysis by allowing population-scale SV genotyping and gene expression quantification. We also showed that SVs make an appreciable contribution to gene expression variation in winter oilseed rape.
    DOI:  https://doi.org/10.1007/s00122-025-04867-2
  4. Mol Cell. 2025 Mar 26. pii: S1097-2765(25)00198-4. [Epub ahead of print]
      Microsatellites are essential genomic components increasingly linked to transcriptional regulation. FoxP3, a transcription factor critical for regulatory T cell (Treg) development, recognizes TTTG repeat microsatellites by forming multimers along DNA. However, FoxP3 also binds a broader range of TnG repeats (n = 2-5), often at the edges of accessible chromatin regions. This raises questions about how FoxP3 adapts to sequence variability and the potential role of nucleosomes. Using cryoelectron microscopy and single-molecule analyses, we show that murine FoxP3 assembles into various distinct supramolecular structures, depending on DNA sequence. This structural plasticity enables FoxP3 to bridge 2-4 DNA duplexes, forming ultrastable structures that coordinate multiple genomic loci. Nucleosomes further facilitate FoxP3 assembly by inducing local DNA bending, creating a nucleus that recruits distal DNA elements through multiway bridging. Our findings thus reveal FoxP3's unusual ability to shapeshift to accommodate evolutionarily dynamic microsatellites and its potential to reinforce chromatin boundaries and three-dimensional genomic architecture.
    Keywords:  DNA bridging; Foxp3; chromatin loops; microsatellites; multi-way bridging; nucleosome; regulatory T cells; short tandem repeats; supramolecular assemblies; transcription factor
    DOI:  https://doi.org/10.1016/j.molcel.2025.03.005
  5. Commun Biol. 2025 Mar 30. 8(1): 524
      The Neotropical armored catfish Harttia is a valuable model for studying sex chromosome evolution, featuring two independently evolved male-heterogametic systems. This study examined satellitomes (sets of satellite DNAs) from four Amazonian species: H. duriventris (X1X2Y), H. rondoni (XY), H. punctata (X1X2Y), and H. villasboas (X1X2Y). These species share homologous sex chromosomes, with their satellitomes showing a high number of homologous satellite DNAs (satDNAs), primarily located on centromeres or telomeres, and varying by species. Each species revealed a distinct satDNA profile, with independent amplification and homogenization events occurring, suggesting an important role of these repetitive sequences in sex chromosome differentiation in a short evolutionary time, especially in recently originated sex chromosomes. Whole chromosome painting and bioinformatics revealed that in Harttia species without heteromorphic sex chromosomes, a specific satDNA (HviSat08-4011) is amplified in the same linkage group associated with sex chromosomes, suggesting an ancestral system. This sequence (HviSat08-4011) has partial homology with the ZP4 gene, which is responsible for the formation of the egg envelope, and its possible role is discussed. This study indicates that these homologous sex chromosomes have diverged rapidly, recently, and independently in their satDNA content, with transposable elements playing a minor role compared with their role in autosomal chromosome evolution.
    DOI:  https://doi.org/10.1038/s42003-025-07891-6
  6. bioRxiv. 2025 Mar 19. pii: 2025.03.19.644148. [Epub ahead of print]
      Diseases vary in clinical presentation across individuals despite the same molecular diagnosis. In fragile X syndrome (FXS), mutation-length expansion of a CGG short tandem repeat (STR) in FMR1 causes reduced gene expression and FMRP loss. Nevertheless, FMR1 and FMRP are limited predictors of adaptive functioning and cognition in FXS patients, suggesting that molecular correlates of clinical measures would add diagnostic value. We recently uncovered megabase-scale domains of heterochromatin (BREACHes) in FXS patient-derived iPSCs. Here, we identify BREACHes that are present in FXS brain tissue (N=4) and absent from sex/age-matched neurotypical controls (N=4). BREACHes span >250 genes and exhibit patient-specific H3K9me3 variation. Using N=4 FXS iPSC lines and N=7 single-cell isogenic FXS iPSC subclones, we observe a strong correlation between inter-sample H3K9me3 variation and heterogeneous BREACH gene repression. We demonstrate improved prediction of cognitive metrics in FXS patients with an additive model of blood FMRP and mRNA levels of H3K9me3-mosaic, but not H3K9me3-invariant, BREACH genes. Our results highlight the utility of H3K9me3 variation at BREACHes for identifying genes associated with FXS clinical metrics.
    DOI:  https://doi.org/10.1101/2025.03.19.644148
  7. Cytogenet Genome Res. 2025 Mar 29. 1-12
       INTRODUCTION: Here we compare differences in the presence of telomeric signals (tDNA-FISH) among karyotypes of taxa having different whole-arm chromosomal rearrangements under the assumption of their participation in differentiation/integration processes during karyotype evolution. We analyzed cytogenetic peculiarities of Robertsonian-like (centromeric) and tandem (telomere-involving) rearrangements using examples of the authors' recent research on comparative cytogenetics of mammals. New data on intra- and interspecific karyotype variation helped to understand the nature of chromosomal rearrangements and their molecular features within and between species in two mammalian taxa: representatives of two genera from two orders (insectivores and rodents).
    METHODS: To detect telomeric repeats in karyotypes of representatives of the Eurasian genus Sorex and Ethiopian endemic Stenocephalemys, G-banded metaphase chromosomes were hybridized in situ with a fluorescein-conjugated peptide nucleic acid probe and 5-TAMRA-labeled (CCCTAA)4 oligonucleotides.
    RESULTS: We compared the location of a molecular chromosomal trait (telomeric sequences) among karyotypes of taxonomically distinct individuals having different types of whole-arm chromosomal rearrangements. Along with the regular terminal location of the telomeric signal on all chromosomes, interstitial telomeric sequences (ITSs) were detected. This pattern was typical for a studied shrew specimen whose karyotype corresponded to a natural interracial F1 hybrid. This finding doubles the number known to date of S. araneus race-specific metacentrics having an identified telomeric signal. In karyotypes of Stenocephalemys specimens, we revealed for the first time individual differences in autosomes corresponding to tandem fusion rearrangements, which are possibly species-specific. No intrachromosomal telomeric signal expected in this case was detectable in autosomes, whereas we registered ITSs in pericentromeric regions on X chromosomes near a short, completely heterochromatic (additional) arm.
    CONCLUSION: The new data indicate a heterogeneous distribution of the telomeric signal (tDNA-FISH) on mitotic chromosomes involved in the whole-arm chromosomal variation typical of mammals, thus representing two models of karyotype evolution: Robertsonian polymorphism and tandem fusions. In the analyzed examples of whole-arm chromosomal rearrangements, the centromeric ITS signal more likely represents an integral feature of cytogenetic relatedness within a species (chromosomal races) or between species (in a genus or group of genera) than differentiation of taxa.
    DOI:  https://doi.org/10.1159/000545600
  8. G3 (Bethesda). 2025 Apr 02. pii: jkaf062. [Epub ahead of print]
      Telomeres are eukaryotic chromosome end structures that guard against sequence loss and aberrant chromosome fusions. Telomeric repeat motifs (TRMs), the minimal repeating unit of a telomere, vary from species to species, with some evolutionary clades experiencing a rapid sequence divergence. To explore the full scope of this evolutionary divergence, many bioinformatic tools have been developed to infer novel TRMs using repetitive sequence search on short sequencing reads. However, novel telomeric motifs remain unidentified in up to half of the sequencing libraries assayed with these tools. A possible reason may be that short reads, derived from extensively sheared DNA, preserve little to no positional context of the repetitive sequences assayed. On the other hand, if a sequencing read is sufficiently long, telomeric sequences must appear at either end rather than in the middle. The TeloSearchLR algorithm relies on this to help identify novel TRMs on long reads, in many cases where short-read search tools have failed. In addition, we demonstrate that TeloSearchLR can reveal unusually long telomeric motifs not maintained by telomerase, and it can also be used to anchor terminal scaffolds in new genome assemblies.
    Keywords:  ALT; genome assembly; long-read sequencing; novel telomere detection; telomere; telomeric repeat motif (TRM)
    DOI:  https://doi.org/10.1093/g3journal/jkaf062
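The positional idea in this abstract (on a sufficiently long read, telomeric repeats sit at a read end rather than in the middle) can be illustrated with a minimal sketch. This is not the TeloSearchLR implementation: the function names, the fixed `window` size, and the add-one smoothing are assumptions chosen for illustration only. The sketch counts k-mers in the first and last `window` bases of each read, counts them again in the interior, and ranks k-mers by the ratio of the two counts.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def terminal_enrichment(reads, k=6, window=200):
    """Rank k-mers by how much more often they occur at read ends
    than in read interiors (hypothetical scoring, for illustration)."""
    terminal, interior = Counter(), Counter()
    for read in reads:
        if len(read) < 3 * window:
            continue  # too short to separate the ends from the middle
        terminal.update(kmer_counts(read[:window], k))
        terminal.update(kmer_counts(read[-window:], k))
        interior.update(kmer_counts(read[window:-window], k))
    # Add-one smoothing so k-mers absent from interiors do not divide by zero.
    scores = {kmer: n / (interior[kmer] + 1) for kmer, n in terminal.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

On a toy read that starts with a TTAGGG array followed by unrelated sequence, the top-ranked k-mer is a rotation of TTAGGG. Real data would also need reverse-complement handling and grouping of rotations of the same motif, which this sketch omits.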
  9. Mob DNA. 2025 Apr 03. 16(1): 16
       BACKGROUND: Interspersed repeats occupy a large part of many eukaryotic genomes, and thus their accurate annotation is essential for various genome analyses. Database-free de novo repeat detection approaches are powerful for annotating genomes that lack well-curated repeat databases. However, existing tools do not yet have sufficient repeat detection performance.
    RESULTS: In this study, we developed REPrise, a de novo interspersed repeat detection software program based on a seed-and-extension method. Although the algorithm of REPrise is similar to that of RepeatScout, which is currently the de facto standard tool, we incorporated three unique techniques into REPrise: inexact seeding, affine gap scoring and loose masking. Analyses of rice and simulation genome datasets showed that REPrise outperformed RepeatScout in terms of sensitivity, especially when the repeat sequences contained many mutations. Furthermore, when applied to the complete human genome dataset T2T-CHM13, REPrise demonstrated the potential to detect novel repeat sequence families.
    CONCLUSION: REPrise can detect interspersed repeats with high sensitivity even in long genomes. Our software enhances repeat annotation in diverse genomic studies, contributing to a deeper understanding of genomic structures.
    Keywords:  De novo repeat detection; Inexact seed; REPrise; Seed-and-extend
    DOI:  https://doi.org/10.1186/s13100-025-00353-0
  10. Analyst. 2025 Apr 01.
      As novel noncoding small RNA molecules, piRNAs play crucial roles in cancer development. However, due to their short sequences, easy degradation, and low abundance, developing specific detection methods is challenging. Rapid and early detection is important for the early clinical detection of tumours. Here, a novel one-step, dual-signal amplification piRNA detection system based on sliding replication and catalytic hairpin assembly (CHA), termed CTA, was developed for rapid, ultrasensitive and specific detection of piRNA-823. By utilizing the unique characteristics of tandem repeat sequences to improve amplification efficiency and fluorescence signal intensity, CTA achieved efficient target recognition and signal amplification by embedding tandem repeat sequences in one of the hairpin probes and utilizing chain displacement reactions to produce strong and detectable signals. CTA detected piRNA-823 with a low detection limit of 70 fM. Moreover, the whole detection process could be completed within 45 min. In addition, CTA performed excellently in the detection of cell and cancer samples, and its detection results were consistent with those of RT-qPCR. More importantly, CTA was successfully applied to effectively differentiate between healthy individuals and patients with colorectal cancer. These findings suggest its promising application in the diagnosis of cancer.
    DOI:  https://doi.org/10.1039/d5an00076a
  11. bioRxiv. 2025 Mar 16. pii: 2025.03.16.643550. [Epub ahead of print]
      Understanding how healthcare-associated pathogens adapt in clinical environments can inform strategies to reduce their burden. Here, we investigate the hypothesis that insertion sequences (IS), prokaryotic transposable elements, are a dominant mediator of rapid genomic evolution in healthcare-associated pathogens. Among 28,207 publicly available pathogen genomes, we find high copy numbers of the replicative ISL3 family in healthcare-associated Enterococcus faecium, Streptococcus pneumoniae and Staphylococcus aureus. In E. faecium, the ESKAPE pathogen with the highest IS density, we find that ISL3 proliferation has increased in the last 30 years. To enable better identification of structural variants, we long read-sequenced a new, single hospital collection of 282 Enterococcal infection isolates collected over three years. In these samples, we observed extensive, ongoing structural variation of the E. faecium genome, largely mediated by active replicative ISL3 elements. To determine if ISL3 is actively replicating in clinical timescales in its natural, gut microbiome reservoir, we long read-sequenced a collection of 28 longitudinal stool samples from patients undergoing hematopoietic cell transplantation, whose gut microbiomes were dominated by E. faecium. We found up to six structural variants of a given E. faecium strain within a single stool sample. Examining longitudinal samples from one individual in further detail, we find ISL3 elements can replicate and move to specific positions with profound regulatory effects on neighboring gene expression. In particular, we identify an ISL3 element that upon insertion replaces an imperfect -35 promoter sequence at a folT gene locus with a perfect -35 sequence, which leads to substantial upregulation of expression of folT, driving highly effective folate scavenging. As a known folate auxotroph, E. faecium depends on other members of the microbiota or diet to supply folate. Enhanced folate scavenging may enable E. faecium to thrive in the setting of microbiome collapse that is common in HCT and other critically ill patients. Together, ISL3 expansion has enabled E. faecium to rapidly evolve in healthcare settings, and this likely contributes to its metabolic fitness and may strongly influence its ongoing trajectory of genomic evolution.
    DOI:  https://doi.org/10.1101/2025.03.16.643550
  12. Pest Manag Sci. 2025 Apr 01.
       BACKGROUND: A nearly complete genome assembly consisting of 14 scaffolds, a total length of 969.6 Mb, and an N50 scaffold length of 99.88 Mb, was generated to better understand how transposable element activity has led to adaptive evolution in Bassia scoparia (kochia), an agronomically important weed.
    RESULTS: The nine largest scaffolds correspond to the nine chromosomes of the close relative, Beta vulgaris. From this assembly, 54 387 protein-coding gene loci were annotated. We determined that genes containing Far-Red Elongated Hypocotyl 3 (FHY3) or Far-Red Impaired Response 1 (FAR1) functional domains have undergone a large, kochia-specific gene family expansion. We discovered that putative Mutator Don-Robertson (MuDR) transposable elements with detectable FHY3/FAR1 domains were tightly associated with segmental duplications of 5-enolpyruvylshikimate-3-phosphate synthase subsequently conferring resistance to the herbicide glyphosate. Further, we characterized a new MuDR subtype, named here as 'Muntjac', which contributes to the evolution of herbicide resistance in kochia through the process of transduplication.
    CONCLUSION: Collectively, our study provides insights into the role of FHY3/FAR1 genes as active transposable elements and contributes new perspectives on the interaction between transposons and herbicide resistance evolution. © 2025 The Author(s). Pest Management Science published by John Wiley & Sons Ltd on behalf of Society of Chemical Industry.
    Keywords:  5‐enolpyruvylshikimate‐3‐phosphate synthase (EPSPS); MuDR; Muntjac; copy number variation; genome resequencing; glyphosate resistance; kochia; reference genome
    DOI:  https://doi.org/10.1002/ps.8798
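For readers unfamiliar with the N50 statistic quoted in this abstract, the following minimal sketch uses the usual definition: the length L such that scaffolds of length L or longer together cover at least half of the assembly. The function name and the toy lengths in the usage note are illustrative assumptions, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")
```

For example, for the toy lengths 2, 2, 2, 3, 3, 4, 8 and 8 (total 32), the two longest scaffolds already cover half of the total, so `n50` returns 8.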
  13. Mob DNA. 2025 Apr 02. 16(1): 15
       BACKGROUND: Plant Gypsy LTR-retrotransposons are classified into lineages according to the phylogenetic relationships of the reverse transcriptase. Retand is a lineage of non-chromovirus elements characterized by the presence of a long internal region compared to other lineages.
    RESULTS: This work focuses on the identification and characterization of Potentially Recently Active Retand Elements (PRAREs) in 617 genomic sequence assemblies of Viridiplantae species. The Retand elements were considered PRAREs if their LTRs and insertion sequences were identical, and the sizes of their internal regions and LTRs did not differ by more than 2% from the consensus. A total of 2,735 PRAREs were identified, distributed in 122 clusters corresponding to 34 species, with copy numbers per cluster varying between 1 and 180. They are present in Eudicotyledons and Liliopsida but not in other groups of plants. Some PRAREs are non-autonomous elements, lacking some of the typical LTR retrotransposon coding domains. The size of the POL-3'LTR regions varies between 2,933 and 6,566 bp, and in all cases, includes potential coding regions oriented antisense to the gag and pol genes. 97% of the clusters contain antisense ORFs encoding the TRP28 protein domain of unknown function. The analysis of the consensus TRP28 domain indicates that it probably can bind DNA. About half of the PRAREs contain arrays of tandem repeats in the POL-3'LTR region.
    CONCLUSIONS: The large internal region of the Retand elements is due to the presence of a long POL-3'LTR region. This region frequently contains arrays of tandem repeats that contribute to the expansion of this area. The presence of antisense ORFs in the POL-3'LTR region is also a common feature in these elements, many of which encode proteins with conserved domains, especially the TRP28 domain. The possible function of these TRP28-containing proteins is unknown, but their potential DNA binding capacity and the comparison with similar genes in some retroviruses suggest that they may play a regulatory role in the Retand transposition process.
    Keywords:  Additional ORF; Antisense; Domain annotation; LTR retrotransposons; Retrovirus; Tandem repeat
    DOI:  https://doi.org/10.1186/s13100-025-00354-z
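The PRARE criteria stated in the RESULTS (identical LTRs and insertion sequences, and internal-region and LTR sizes within 2% of the consensus) amount to a simple filter, sketched below. The record layout, field names, and function names are hypothetical, and "insertion sequences" is interpreted here as the two sequences flanking the insertion site, which is an assumption rather than something the abstract spells out.

```python
from dataclasses import dataclass


@dataclass
class RetandCopy:
    """One candidate element (hypothetical record layout)."""
    ltr5: str          # 5' LTR sequence
    ltr3: str          # 3' LTR sequence
    site5: str         # sequence flanking the insertion on the 5' side
    site3: str         # sequence flanking the insertion on the 3' side
    internal_len: int  # length of the internal region in bp


def within_tolerance(value, consensus, tol=0.02):
    """True if value differs from consensus by at most tol (as a fraction)."""
    return abs(value - consensus) <= tol * consensus


def is_prare(copy, consensus_ltr_len, consensus_internal_len, tol=0.02):
    """Apply the identity and 2% size criteria to one candidate copy."""
    return (
        copy.ltr5 == copy.ltr3
        and copy.site5 == copy.site3
        and within_tolerance(len(copy.ltr5), consensus_ltr_len, tol)
        and within_tolerance(copy.internal_len, consensus_internal_len, tol)
    )
```

A copy with identical 1,000 bp LTRs, matching flanking sites, and an internal region of 5,050 bp passes against a consensus of 1,000 bp and 5,000 bp, while an internal region of 5,200 bp (4% off) does not.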
  14. Front Genet. 2025;16: 1544330
       Background: Transposable elements (TEs, or transposons) are repetitive genomic sequences, accounting for half of a mammalian genome. Most TEs are transcriptionally silenced, whereas some TEs, especially endogenous retroviruses (ERVs, long terminal repeat retrotransposons), are physiologically expressed in certain conditions. However, the expression pattern of TEs in less-studied species, such as goat (Capra hircus), remains unclear. To obtain an overview of the genomic and transcriptomic features of TEs and ERVs in goat, an important farm species, we herein analyzed transcriptomes of ten C. hircus tissues and cells under various physiological and pathological conditions.
    Method: The distributions of classes, families, and subfamilies of TEs in the C. hircus genome were systematically annotated. The expression patterns of TE-derived transcripts in multiple tissues were investigated at subfamily and location levels. Differential expression of ERV-derived reads was measured under various physiological and pathological conditions, such as embryo development and virus infection challenges. Co-expression between ERV-reads and their proximal genes was also explored to decipher the expression regulation of ERV-derived transcripts.
    Results: There are around 800 TE subfamilies in the goat genome, accounting for 49.1% of the goat genome sequence. TE-derived reads account for 10% of the transcriptome and their abundance is comparable across goat tissues, whereas expression of ERVs is variable among tissues. We further characterized the expression pattern of ERV reads in various tissues. Differential expression analysis showed that ERVs are highly active in 16-cell embryos, when the genome of the zygote begins to transcribe its own genes. We also recognized numerous activated ERV reads in response to RNA virus infection in lung, spleen, caecum, and immune cells. CapAeg_1.233:ERVK elements on chromosomes 1 and 17 are dysregulated under endometrium development and infection conditions. They showed strong co-expression with their proximal genes OAS1 and TMPRSS2, indicating the impact of activated proximal gene expression on nearby ERVs.
    Conclusion: We generated ERV transcriptomes across goat tissues, and identified ERVs activated in response to different physiological and pathological conditions.
    Keywords:  Capra hircus; endogenous retrovirus; goat; transcriptome; transposable element
    DOI:  https://doi.org/10.3389/fgene.2025.1544330
  15. Ecol Evol. 2025 Apr;15(4): e71165
      The plant mitochondrial genome (mitogenome) has crucial functions underpinning the survival, development, and reproduction of organisms. However, complete mitogenomes have been assembled and annotated far less often than plastomes and even nuclear genomes in plants, owing to their highly frequent, long repeat sequences and genomic structural variations. These further hinder the understanding of mitogenome evolution and restrict potential applications in phylogenetic analyses. In this study, we sequenced, assembled, and annotated the complete mitogenome of Semiaquilegia guangxiensis and explored its evolution and usefulness in phylogenetics. The results showed that the mitogenome was composed of two independent molecules, which had a total length of 522,981 bp, a GC content of 45.69%, and 58 genes including 34 protein-coding genes (PCGs), 21 tRNA genes, and three rRNA genes. A generalized codon usage preference was observed among the mitochondrial PCGs, and a total of 665 potential RNA editing sites were identified across the 34 mitochondrial PCGs, all of which were C-to-U edits. Moreover, a large number of repetitive mitogenome sequences and chloroplast-sourced sequences transferred to the mitogenome were detected. The largest collinear block identified between S. guangxiensis and Paropyrum anemonoides was 4282 bp in length. The phylogenetic analyses based on the mitochondrial gene sequences resolved the phylogenetic relationships within Ranunculales, in which Semiaquilegia was close to Paropyrum. The 19 PCGs were ranked according to their efficiency in phylogenetic resolution based on several metrics, and the combined metric suggested that matR, rps3, and nad5 were the top three loci contributing most to phylogeny. As the first reported mitogenome in Semiaquilegia, our findings enrich the limited mitogenome library of plants, reaffirm the complex evolutionary dynamics of the mitogenome, and highlight the usefulness of mitochondrial gene sequences in phylogenetics.
    Keywords:  Semiaquilegia guangxiensis; genomic reorganization; intergenomic DNA transfer; mitogenome; phylogenetics
    DOI:  https://doi.org/10.1002/ece3.71165
  16. BMC Genomics. 2025 Mar 31. 26(1): 314
       BACKGROUND: The Orobanchaceae family is widely recognized as an exemplary model system for examining the evolutionary dynamics of parasitic plants. However, reports on the mitochondrial genome (mitogenome) of the hemiparasitic tribe Cymbarieae are currently lacking. Here, we sequenced, assembled and characterized the complete mitogenome of the genus Cymbaria L. sensu stricto (C. mongolica and C. daurica).
    RESULTS: A total of 51 unique mitochondrial genes, including 33 protein-coding genes, three rRNA genes, and 15 tRNA genes, are shared by the mitogenomes of the two hemiparasitic plants, exhibiting the gene content characteristic of autotrophic plants. The mitogenomes of C. mongolica and C. daurica are characterized by a pentacyclic chromosome structure (their major conformation), with lengths of 1,576,465 bp and 1,539,836 bp, respectively. Moreover, we identified and validated the presence of four minor conformations mediated by four pairs of large repeats (> 1000 bp in size) in C. mongolica and eight minor conformations mediated by six large repeats in C. daurica. We further explored codon usage, RNA editing sites, selective pressure, and nucleotide diversity in the two Cymbaria mitogenomes. Phylogenetic analyses of 26 species of Lamiales revealed that the two Cymbaria species form a sister clade to the other lineages of Orobanchaceae. Extensive mitogenomic rearrangements are also observed between Cymbaria and five closely related species. Although we identified mitochondrial plastid sequences (MTPTs) in the Cymbaria mitogenomes, they represent only 2.37% and 1.74% of the mitogenomes, respectively. Additionally, there is minimal evidence of intracellular and horizontal gene transfer, with only a few genes (rpl22, rps3, and ycf2) showing low bootstrap support (BS ≤ 70%) for the relationships with the potential host plants Allium mongolicum, Leymus chinensis, and Saposhnikovia divaricata, respectively.
    CONCLUSIONS: We report the mitochondrial genomes of hemiparasitic Cymbaria species for the first time; they are characterized by multiple repeat-mediated recombination and little to no intracellular and horizontal gene transfer. Our findings provide valuable genetic insights for further studies on the mitogenome evolution of hemiparasitic plants.
    Keywords:  Gene transfer; Hemiparasite; Orobanchaceae; Polycyclic molecules; Repeat-mediated recombination
    DOI:  https://doi.org/10.1186/s12864-025-11474-4