bims-rednas Biomed News
on Repetitive DNA sequences
Issue of 2025–04–13
nineteen papers selected by
Anna Zawada, International Centre for Translational Eye Research



  1. Hum Genet. 2025 Apr 10.
      Hereditary ataxia (HA) is a heterogeneous group of complex neurological disorders, which represent a diagnostic challenge due to their diverse phenotypes and genetic etiologies. Next-generation sequencing (NGS) has revolutionized the field of neurogenetics, improving the identification of ataxia-associated genes. Notwithstanding, repeat expansions analysis remains a cornerstone in the diagnostic workflow of these diseases. Here we describe the molecular characterization of a consecutive single-center series of 70 patients with genetically uncharacterized HA. Patients' samples were analyzed for known HA-associated repeat expansions as first tier and negative ones were analyzed by whole exome sequencing (WES) as second tier. Overall, we identified pathogenic/likely pathogenic variants in 40% (n = 28/70) and variants of unknown significance (VUS) in 20% (n = 14/70) of cases. In particular, 10 patients (14.3%, n = 10/70) presented pathogenic repeat expansions while 18 cases (30%, n = 18/60) harbored at least a single nucleotide variant (SNV) or a copy number variant (CNV) in HA or HSP-related genes. WES allowed assessing complex neurological diseases (i.e., leukodystrophies, cerebrotendinous xanthomatosis and atypical xeroderma pigmentosum), which are not usually referred as pure genetic ataxias. Our data suggests that the combined use of repeat expansion analysis and WES, coupled to detailed clinical phenotyping, is able to detect the molecular alteration underpinning ataxia in almost 50% cases, regardless of the hereditary pattern. Indeed, NGS-based tests are fundamental to acknowledge novel HA-associated genes useful to explain the remaining wide fraction of negative tests. Nowadays, this gap is problematic since these patients could not benefit from an etiological diagnosis of their disease that allows prognostic trajectories and prenatal/preimplantation diagnosis.
    DOI:  https://doi.org/10.1007/s00439-025-02744-y
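The two tiers in the abstract above use different denominators, which is easy to misread. A minimal sketch of the arithmetic, using only the counts stated in the abstract (the variable and function names are illustrative and not from the paper):

```python
# Counts taken from the abstract; names are illustrative assumptions.
total_patients = 70
repeat_expansion_positive = 10                              # first tier
wes_tested = total_patients - repeat_expansion_positive     # 60 negatives went on to WES
wes_positive = 18                                           # second tier (SNV or CNV)


def percent(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


tier1_yield = percent(repeat_expansion_positive, total_patients)
tier2_yield = percent(wes_positive, wes_tested)
overall_yield = percent(repeat_expansion_positive + wes_positive, total_patients)

print(tier1_yield, tier2_yield, overall_yield)
```

The second-tier figure is computed over the 60 patients who were negative for repeat expansions, whereas the overall yield is computed over all 70.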
  2. Mov Disord. 2025 Apr 09.
       BACKGROUND: Familial adult myoclonus epilepsy (FAME) is a rare autosomal dominant disorder caused by the same intronic TTTTA/TTTCA repeat expansion in seven distinct genes. TTTTA-only expansions are benign, whereas those containing TTTCA insertions are pathogenic.
    OBJECTIVE: We investigated the genetic basis of dominant cortical myoclonus without seizures in two unrelated families.
    METHODS: Repeat-primed polymerase chain reaction (PCR), long-range PCR, and nanopore sequencing were used to detect and characterize expansions at known FAME loci.
    RESULTS: We identified a novel repeat expansion in MARCHF6, comprising 388 to 454 elongated TTTTA repeats and 5 to 11 TTTCA repeats at the 3'-terminus, segregating with cortical myoclonus in 8 affected individuals. This configuration shows meiotic stability but low-level somatic variability in blood. We observed an inverse correlation between the number of TTTCA repeats and the age at myoclonus onset.
    CONCLUSIONS: These findings indicate that as little as five TTTCA repeats combined with expanded TTTTA repeats can cause cortical myoclonus without epilepsy, highlighting the potential mechanisms underlying FAME pathophysiology. © 2025 The Author(s). Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
    Keywords:  MARCHF6; epilepsy; familial adult myoclonus epilepsy (FAME); myoclonus; repeat expansion
    DOI:  https://doi.org/10.1002/mds.30192
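The allele configuration described above (a long TTTTA tract followed by a short TTTCA tract) can be summarised as runs of pentamer units. The sketch below is an illustration only, assuming an in-frame sequence already trimmed to the repeat tract; the function name and the toy allele are hypothetical, and the toy counts are far smaller than those reported in the abstract.

```python
def motif_runs(sequence, unit_length=5):
    """Run-length encode consecutive repeat units read in a fixed frame."""
    runs = []
    for start in range(0, len(sequence) - unit_length + 1, unit_length):
        unit = sequence[start:start + unit_length]
        if runs and runs[-1][0] == unit:
            runs[-1][1] += 1          # same unit as the previous one: extend the run
        else:
            runs.append([unit, 1])    # a different unit starts a new run
    return [(unit, count) for unit, count in runs]


# Toy allele: TTTTA tract with a short TTTCA tract at the 3' end.
toy_allele = "TTTTA" * 6 + "TTTCA" * 2
runs = motif_runs(toy_allele)
has_tttca = any(unit == "TTTCA" for unit, _ in runs)
print(runs, has_tttca)
```

Real long-read data contain sequencing errors and frame shifts, so a fixed-frame scan like this would need an alignment-based approach in practice.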
  3. Ann Clin Transl Neurol. 2025 Apr 07.
    Care4Rare Canada Consortium
       BACKGROUND AND OBJECTIVES: Spinocerebellar ataxias (SCA) represent a clinically and genetically heterogeneous group of progressive neurodegenerative diseases with prominent cerebellar atrophy. Recently, a novel pathogenic repeat expansion in intron 1 of FGF14 was identified, causing adult-onset SCA (SCA27B). We aimed to determine the proportion of our unsolved adult-onset ataxia cohort harboring this expansion using several technologies, and to characterize the phenotypic presentation within our population.
    METHODS: Individuals presenting with adult-onset ataxia (> 30 years old) and negative previous genetic testing were selected from the Care4Rare patient repository. Affected individuals were from all ethnicities, and 90% had a family history suggestive of dominant ataxia, representing 19 of the 23 families included. We used multiple tools (PCR, long-read genome sequencing and optical genome mapping (OGM)) to identify the pathogenic GAA repeat in FGF14.
    RESULTS: Of the 23 families included in this study, 65.2% harbored a pathogenic GAA expansion in FGF14. Individuals of French-Canadian descent (FC) represented most of our cohort and had a 64.7% diagnostic yield. Affected individuals presented with gaze-evoked nystagmus, gait ataxia, cerebellar dysarthria, and early episodic features. The GAA expansion in FGF14 was visible by OGM in all individuals tested.
    INTERPRETATION: Our diagnostic yield demonstrates this expansion may be the most common cause of adult-onset SCA in dominant families of FC ancestry. Our FC participants have a phenotype distinct from previously published FC patients, with gaze-evoked nystagmus being the most common eye anomaly. From a diagnostic standpoint, the pathogenic GAA repeat can be identified by OGM, but additional tests are required to complement the interpretation.
    DOI:  https://doi.org/10.1002/acn3.70016
  4. bioRxiv. 2025 Mar 26. pii: 2025.03.20.643775. [Epub ahead of print]
      The hexanucleotide (G4C2) repeat expansion in the promoter region of C9orf72 is the most frequent genetic cause of frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS). In this study, we conducted a genome-wide DNA methylation (DNAm) analysis using EPIC version 2 (EPICv2) arrays on an FTD cohort comprising 27 carriers and 250 non-carriers of the pathogenic C9orf72 repeat expansion from the Amsterdam Dementia Cohort. We identified differentially methylated CpG probes associated with the pathogenic C9orf72 expansion and used these findings to create a DNAm Least Absolute Shrinkage and Selection Operator (LASSO) predictor to identify repeat expansion carriers. Eight CpG sites at the C9orf72 locus were significantly differentially hypermethylated in repeat expansion carriers compared to non-carriers. The LASSO model predicted repeat expansion status with an average accuracy of 98.6%. The LASSO predictor was further validated in an independent cohort of 2,548 subjects with available EPICv2 data, identifying four C9orf72 repeat expansion carriers, subsequently confirmed by repeat-primed PCR. This result not only illustrates the accuracy of the DNAm predictor of C9orf72 repeat expansion carriers but also suggests that repeat expansion carriers may be more prevalent than expected. The identification of a highly accurate DNAm biomarker for a repeat expansion locus associated with neurodegenerative disorders may provide great value for studying this locus. The approach holds significant promise for investigating this and other repeat expansion loci, particularly given the growing interest in epigenetic epidemiological studies involving large cohorts with available DNAm data.
    DOI:  https://doi.org/10.1101/2025.03.20.643775
  5. Neuromuscul Disord. 2025 Mar 28. pii: S0960-8966(25)00073-2. [Epub ahead of print]50: 105346
      The Japanese patient registry for facioscapulohumeral muscular dystrophy (FSHD) was launched in September 2020, enrolling patients genetically confirmed to have FSHD. This study aimed to analyze clinical and genetic characteristics based on data from the Japanese FSHD registry. Core items were collected from the TREAT-NMD FSHD dataset, version 1.0. By the end of June 2024, over 200 patients were enrolled, with 161 successfully registered after confirmation. Among them, 156 had FSHD1 and 5 had FSHD2; 81 had affected family members; 116 were ambulatory; 73 had respiratory dysfunction; 22 required mechanical ventilation; 8 had cardiac dysfunction; 4 had retinopathy; and 22 had hearing loss. In patients with FSHD1, the median number of D4Z4 repeats was four, with a low proportion of long repeats. D4Z4 repeat counts influenced age at disease onset, site-specific muscle weakness onset, respiratory function, retinopathy, and hearing loss. Notably, female patients were more likely to have early facial weakness and hearing loss. Our data suggest population diversity in D4Z4 repeat numbers and sex differences. We aim to collaborate with patient groups to enroll more participants and gather more accurate epidemiological data, including cases of FSHD2. Additionally, we plan to investigate racial differences through international collaboration.
    Keywords:  D4Z4 repeat count; Ethnic difference; Facioscapulohumeral muscular dystrophy; Patient registry; Sex difference
    DOI:  https://doi.org/10.1016/j.nmd.2025.105346
  6. Arch Med Res. 2025 Apr 04. pii: S0188-4409(25)00028-1. [Epub ahead of print]56(5): 103208
       BACKGROUND: A trinucleotide repeat expansion of the JPH3 gene causes Huntington disease-like 2 (HDL2), clinically indistinguishable from Huntington's disease and is considered a disease unique to African and Afro-descendant populations. We identified five HDL2 families from the Costa Chica region of southern Pacific Mexico. Because the Mexican population is admixed, we aimed to determine the ancestral origin of the expansion and define the mutation-carrying haplotype using microarray genomic data.
    METHODS: Sixteen individuals (Nine symptomatic, three asymptomatic mutation carriers and four healthy non-carriers) were included. Global and local ancestry were estimated using whole-genome microarray data. Principal component and quadratic discriminant analysis (QDA) were used to infer the most likely origin of the haplotypes, complemented by the SMOTE-Tomek sampling strategy.
    RESULTS: Mean ancestry proportions were 16.26, 27.33, and 56.39% for African, European, and Native American components, respectively. A 1.1 Mb segment inferred as African flanking the JPH3 mutation locus was shared by at least one of the homologous chromosomes of all mutation carriers. Phased genotype analysis revealed a common 746 Kb haplotype containing the mutation that includes 412 SNPs. This shared haplotype was consistently inferred to be of African origin. QDA classified this haplotype as Yoruba in 78.3% of the resampling iterations.
    CONCLUSIONS: Ancestry analysis suggests that the JPH3 repeat expansion identified in our patients is a founder mutation of African origin. Other founder mutations causing rare genetic diseases in Mexico show how the admixture process in Latin America has contributed to the high prevalence of disease in certain geographical regions.
    Keywords:  Genetic ancestry; Huntington's disease-like 2; Huntington's phenocopy; JPH3 gene; Neurodegenerative disorder
    DOI:  https://doi.org/10.1016/j.arcmed.2025.103208
  7. J Hum Genet. 2025 Apr 09.
      Leukodystrophy presents a significant diagnostic challenge due to its varied clinical presentation and similarity to other myelin disorders, characterized by abnormalities in myelin and white matter. Hypomyelination disorders, including Pelizaeus-Merzbacher disease (PMD) and hereditary spastic paraplegias (SPG), are associated with variants in the proteolipid protein 1 (PLP1) gene, leading to symptoms ranging from severe dysmyelination in infancy to delayed dysmyelination and axonal degeneration in adulthood. Family history was taken, and pedigree was constructed. Recruitment included seven males and females with spastic paraplegia and nine healthy relatives, who were clinically investigated, and tested with molecular genetic assays including whole exome sequencing (WES), whole genome sequencing (WGS), and PCR amplification with fragment analysis on gel electrophoresis to identify and confirm the genetic cause. Family history was consistent with hereditary condition marked by progressive spastic paraplegia in 10 family members. Males had early onset and progressive paraplegia, and neurodegenerative conditions, resulting in a decline in the neurocognitive functions. However, in some females, the symptoms manifested later in their 30s-40s, leading to neurodegenerative conditions and spastic paraplegias. A total of 16 family members were available for genetic testing and segregation studies. Initial clinical WES in four members was negative. Next, WGS identified a novel copy number variant (CNV) loss (75.5 kb) involving the 3'UTR of the PLP1 gene in three members (the mother, affected son, but not in the unaffected son). Segregation studies in all 16 family members confirmed the presence of the CNV in five additional affected individuals and an asymptomatic female, but not in the eight asymptomatic individuals. Our study reports a novel 3'UTR CNV in PLP1 in a large family with several individuals affected with SPG. This finding expands the mutational landscape of the PLP1-related diseases to include CNV and, possibly, small sequence changes in the regulatory regions of PLP1, that would otherwise be overlooked during the interpretation of the next generation sequencing data.
    DOI:  https://doi.org/10.1038/s10038-025-01340-2
  8. Mol Genet Genomics. 2025 Apr 05. 300(1): 40
      The Neotropical electric fishes of the order Gymnotiformes, known for their unique electrogenic and electrosensory systems, provide exceptional models for investigating ecological and evolutionary questions. While their phylogenetic and cytogenetic diversity is well documented, information about the diversity and evolution of repetitive DNA in gymnotiform genomes remains limited. To understand how repetitive DNA has shaped genome evolution in this group, we conducted bioinformatic analyses on raw sequencing data and genome assemblies from multiple species representing three major families. Our analysis revealed that closely related species share similar patterns of repetitive DNA composition, with Gymnotidae and Apteronotidae exhibiting the highest (16.5%) and lowest (9.4%) proportions of repetitive DNA, respectively. We identified 40 satellite DNA families, with five exclusive to Gymnotidae, twenty-two to Hypopomidae, and six shared between Apteronotidae species. Only one satellite DNA (NEFSat3-18) was conserved across all analyzed species, suggesting rapid evolutionary turnover of these sequences. The evolutionary dynamics of transposable elements varied among families, with Gymnotidae showing recent expansion of LINE elements and DNA transposons, while Apteronotidae displayed more ancient patterns of transposon activity. Analysis of transposable element landscapes revealed that all species experienced at least one burst of transposition during their evolution. Our findings demonstrate that repetitive DNA diversification parallels the evolutionary history of gymnotiform species. We also present a comprehensive dataset of repetitive DNA sequences that can be used as cytogenomic markers in comparative and evolutionary studies.
    Keywords:  Gymnotiformes; Satellite DNA; Transposable elements
    DOI:  https://doi.org/10.1007/s00438-025-02248-4
  9. PLoS Comput Biol. 2025 Apr;21(4): e1012885
      Variable Number Tandem repeats (VNTRs) refer to repeating motifs of size greater than five bp. VNTRs are an important source of genetic variation, and have been associated with multiple Mendelian and complex phenotypes. However, the highly repetitive structures require reads to span the region for accurate genotyping. Pacific Biosciences HiFi sequencing spans large regions and is highly accurate but relatively expensive. Therefore, targeted sequencing approaches coupled with long-read sequencing have been proposed to improve efficiency and throughput. In this paper, we systematically explored the trade-off between targeted and whole genome HiFi sequencing for genotyping VNTRs. We curated a set of 10,787 gene-proximal (G-)VNTRs, and 48 phenotype-associated (P-)VNTRs of interest. Illumina reads only spanned 46% of the G-VNTRs and 71% of P-VNTRs, motivating the use of HiFi sequencing. We performed targeted sequencing with hybridization by designing custom probes for 9,999 VNTRs and sequenced 8 samples using HiFi and Illumina sequencing, followed by adVNTR genotyping. We compared these results against HiFi whole genome sequencing (WGS) data from 28 samples in the Human Pangenome Reference Consortium (HPRC). With the targeted approach only 4,091 (41%) G-VNTRs and only 4 (8%) of P-VNTRs were spanned with at least 15 reads. A smaller subset of 3,579 (36%) G-VNTRs had higher median coverage of at least 63 spanning reads. The spanning behavior was consistent across all 8 samples. Among 5,638 VNTRs with low-coverage (<15), 67% were located within GC-rich regions (>60%). In contrast, the 40X WGS HiFi dataset spanned 98% of all VNTRs and 49 (98%) of P-VNTRs with at least 15 spanning reads, albeit with lower coverage. Spanning reads were sufficient for accurate genotyping in both cases. Our findings demonstrate that targeted sequencing provides consistently high coverage for a small subset of low-GC VNTRs, but WGS is more effective for broad and sufficient sampling of a large number of VNTRs.
    DOI:  https://doi.org/10.1371/journal.pcbi.1012885
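The abstract above hinges on the notion of a "spanning read": one that covers the whole VNTR rather than ending inside it. A minimal sketch of how such reads might be counted from alignment coordinates follows. The 15-read threshold comes from the abstract; the flank size, function names, and toy coordinates are assumptions for illustration and do not reproduce the adVNTR pipeline.

```python
MIN_SPANNING_READS = 15  # threshold quoted in the abstract


def spans(read_start, read_end, vntr_start, vntr_end, flank=10):
    """True if the read covers the VNTR plus `flank` bases on each side.

    The flank requirement is an assumption: some anchoring sequence outside
    the repeat is needed to be sure the read covers the full tract.
    """
    return read_start <= vntr_start - flank and read_end >= vntr_end + flank


def count_spanning(reads, vntr_start, vntr_end, flank=10):
    """Count reads, given as (start, end) pairs, that span the VNTR."""
    return sum(spans(s, e, vntr_start, vntr_end, flank) for s, e in reads)


# Toy alignments against a VNTR at positions 200-300.
reads = [(100, 400), (150, 260), (90, 350), (205, 500)]
n_spanning = count_spanning(reads, vntr_start=200, vntr_end=300)
well_covered = n_spanning >= MIN_SPANNING_READS
print(n_spanning, well_covered)
```

In this toy case two of the four reads span the locus; the other two start or end inside the repeat and therefore cannot determine its full length.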
  10. Bioinformatics. 2025 Apr 09. pii: btaf155. [Epub ahead of print]
       MOTIVATION: Extended tandem repeats (TRs) have been associated with 60 or more diseases over the past 30 years. Although most TRs have single repeat units (or motifs), complex TRs with different units have recently been correlated with some brain disorders. Of note, a population-scale analysis shows that complex TRs at one locus can be divergent, and different units are often expanded between individuals. To understand the evolution of high TR diversity, it is informative to visualize a phylogenetic tree. To do this, we need to measure the edit distance between pairs of complex TRs by considering duplication and contraction of units created by replication slippage. However, traditional rigorous algorithms for this purpose are computationally expensive.
    RESULTS: We here propose an efficient heuristic algorithm to estimate the edit distance with duplication and contraction of units (EDDC, for short). We select a set of frequent units that occur in given complex TRs, encode each unit as a single symbol, compress a TR into an optimal series of unit symbols that partially matches the original TR with the minimum Levenshtein distance, and estimate the EDDC between a pair of complex TRs from their compressed forms. Using substantial synthetic benchmark datasets, we demonstrate that the estimated EDDC is highly correlated with the accurate EDDC, with a Pearson correlation coefficient of > 0.983, while the heuristic algorithm achieves orders of magnitude performance speedup.
    AVAILABILITY AND IMPLEMENTATION: The software program hEDDC that implements the proposed algorithm is available at https://github.com/Ricky-pon/hEDDC (DOI: 10.5281/zenodo.14732958).
    DOI:  https://doi.org/10.1093/bioinformatics/btaf155
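The core idea in the abstract above, encoding each repeat unit as a single symbol before measuring distance, can be illustrated with a toy example. This is not the hEDDC algorithm: the sketch assumes a hand-picked unit list, tiles each repeat greedily rather than finding an optimal compression, and uses plain Levenshtein distance on the symbol sequences as a stand-in for the edit distance with duplication and contraction. All names and sequences are hypothetical.

```python
def encode(tr, units):
    """Greedily tile `tr` with known units; unmatched bases stay as single symbols."""
    symbols, i = [], 0
    ordered = sorted(units, key=len, reverse=True)  # try longer units first
    while i < len(tr):
        for unit in ordered:
            if tr.startswith(unit, i):
                symbols.append(unit)
                i += len(unit)
                break
        else:
            symbols.append(tr[i])  # no unit matched at this position
            i += 1
    return symbols


def levenshtein(a, b):
    """Classic dynamic-programming edit distance over two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


units = ["CAG", "CCG"]            # toy unit set
tr_a = "CAG" * 3 + "CCG"
tr_b = "CAG" * 4 + "CCG" * 2
symbol_distance = levenshtein(encode(tr_a, units), encode(tr_b, units))
base_distance = levenshtein(tr_a, tr_b)
print(symbol_distance, base_distance)
```

Gaining one copy of each unit costs two operations at the symbol level but six at the base level, which shows why unit-level encoding gives a distance that better reflects slippage events.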
  11. bioRxiv. 2025 Mar 24. pii: 2025.03.24.644916. [Epub ahead of print]
      Kaposi's sarcoma-associated herpesvirus (KSHV) genome contains a terminal repeats (TR) sequence. Previous studies demonstrated that KSHV TR functions as a gene enhancer for inducible lytic gene promoters. Gene enhancers anchor bromodomain-containing protein 4 (BRD4) at specific genomic region, where BRD4 interacts flexibly with transcription-related proteins through its intrinsically disordered domain and exerts transcription regulatory function. Here, we generated recombinant KSHV with reduced TR copy numbers and studied BRD4 recruitment and its contributions to the inducible promoter activation. Reducing the TR copy numbers from 21 (TR21) to 5 (TR5) strongly attenuated viral gene expression during de novo infection and impaired reactivation. The EF1α promoter encoded in the KSHV BAC backbone also showed reduced promoter activity, suggesting a global attenuation of transcription activity within TR5 latent episomes. Isolation of reactivating cells confirmed that the reduced inducible gene transcription from the TR-shortened DNA template is mediated by decreased efficacies of BRD4 recruitment to viral gene promoters. Separating the reactivating iSLK cell population from non-responders showed that reactivatable iSLK cells harbored larger LANA nuclear bodies (NBs) compared to non-responders. The cells with larger LANA NBs, either due to prior transcription activation or TR copy number, supported KSHV reactivation more efficiently than those with smaller LANA NBs. With auxin-inducible LANA degradation, we confirmed that LANA is responsible for BRD4 occupancies on latent chromatin. Finally, with purified fluorescence-tagged proteins, we demonstrated that BRD4 is required for LANA to form liquid-liquid phase-separated dots. The inclusion of TR DNA fragments further facilitated the formation of larger BRD4-containing LLPS in the presence of LANA, similar to the "cellular enhancer dot" formed by transcription factor-DNA bindings. These results suggest that LANA binding to TR establishes an enhancer domain for infected KSHV episomes. The strength of this enhancer, regulated by TR length or transcription memory, determines the outcome of KSHV replication.
    Importance: Gene enhancers are genomic domains that regulate frequency and duration of transcription burst at gene promoters, with BRD4 playing a critical role in their enhancer functions. KSHV latent mini-chromosome also contains an enhancer domain made with multiple copies of 801 bp identical repeat DNA fragments, terminal repeats. Here, we utilized manipulable mini-scale chromatins with convenient inducible KSHV reactivation to systematically examine the association between enhancer strength and the outcome of inducible promoter activation. This study illustrated the amount of BRD4 recruitment at the enhancer associated with frequencies of BRD4 distribution to the inducible promoters during KSHV reactivation and, therefore, KSHV lytic replication. Recruitment of BRD4 to the TR is specifically regulated by KSHV latent protein, LANA. KSHV evolves clever enhancer elements designed to be regulated by the KSHV own latent protein, LANA.
    DOI:  https://doi.org/10.1101/2025.03.24.644916
  12. BMC Genomics. 2025 Apr 04. 26(1): 340
       BACKGROUND: Plastids have highly conserved genomes in most land plants. However, in several families, plastid genomes exhibit high rates of nucleotide substitution and structural rearrangements among species. This elevated rate of evolution has been posited to lead to increased rates of plastid-nuclear incompatibilities (PNI), potentially acting as a driver of speciation. However, the extent to which plastid structural variation exists within a species is unknown. This study investigates whether plastid structural variation, observed at the interspecific level in Campanulaceae, also occurs within Campanula americana, a species with strong intraspecific PNI. We assembled multiple plastid genomes from three lineages of C. americana that exhibit varying levels of PNI when crossed. We then investigated the structural variation and repetitive DNA content among these lineages and compared the repetitive DNA content with that of other species within the family.
    RESULTS: We found significant variation in plastid genome size among the lineages of C. americana (188,309-201,788 bp). This variation was due in part to multiple gene duplications in the inverted repeat region. Lineages also varied in their repetitive DNA content, with the Appalachian lineage displaying the highest proportion of tandem repeats (~ 10%) compared to the Eastern and Western lineages (~ 6%). In addition, genes involved in transcription and protein transport showed elevated sequence divergence between lineages, and a strong correlation was observed between genome size and repetitive DNA content. Campanula americana was found to have one of the most repetitive plastid genomes within Campanulaceae.
    CONCLUSIONS: These findings challenge the conventional view of plastid genome conservation within a species and suggest that structural variation, differences in repetitive DNA content, and divergence of key genes involved in transcription and protein transport may play a role in PNI. This study highlights the need for further research into the genetic mechanisms underlying PNI, a key process in the early stages of speciation.
    Keywords:  Campanulaceae; Chloroplast genome evolution; Comparative genomics; Cyto-nuclear incompatibility; Plastid-nuclear incompatibility; Repetitive DNA
    DOI:  https://doi.org/10.1186/s12864-025-11525-w
  13. Genome Biol Evol. 2025 Apr 11. pii: evaf062. [Epub ahead of print]
      Transposable elements (TEs) expansion and accumulation represent one of the main drivers of genomic gigantism. Different host genome silencing mechanisms have evolved to counteract TE amplification, leading to a genomic arms race between them. Nevertheless, the evolutionary relationship between TEs and host genome silencing pathways remains poorly understood. Here, we investigate the activity of TEs and TE silencing mechanisms in somatic and germline tissues of Bombina pachypus, a 10 Gb anuran genome. Our findings reveal a higher activity of TEs in the gonads compared to the brain, with retrotransposons as the most active class in both gonads (∼15% increased expression compared to brain) and DNA transposons showing a two-fold higher activity in ovaries. However, analysis of differentially expressed TEs between male and female gonads revealed a greater number of overexpressed TEs in testes (231 vs 169), with maximum fold changes up to 22 in testes versus 8 in ovaries. This suggests a more permissive environment for TE expression in male gonads. Accordingly, increased activity of TE silencing pathways was observed in ovaries compared to testes, with the KRAB-ZFP complex showing not only the highest overall expression levels but also a distinct ovary-specific expression pattern. Summarising, while the higher TE activity in the male gonad may result from the lower efficiency of the KRAB-ZFP complex, the elevated activity of KRAB-ZFPs in ovaries, along with growing evidence of the functional role of TEs in the germline, suggests the existence of a broad range of host-TE dynamics going beyond the arms race model.
    Keywords:  KRAB-ZFP; TE expression; TE silencing; genome size evolution; piRNA pathway; somatic and germline expression
    DOI:  https://doi.org/10.1093/gbe/evaf062
  14. Mob DNA. 2025 Apr 09. 16(1): 17
      The African clawed frog Xenopus laevis has an allotetraploid genome consisting of two subgenomes referred to as L relating to the Long chromosomes and S relating to the Short chromosomes. While the L subgenome presents conserved synteny with X. tropicalis chromosomes, the S subgenome has undergone rearrangements and deletions leading to differences in gene and transposable element (TE) content between the two subgenomes. The asymmetry in the evolution of the two subgenomes is also detectable in gene expression levels and TE mobility. TEs, also known as "jumping genes", are mobile genetic elements having a key role in genome evolution and gene regulation. However, due to their potential deleterious effects, TEs are controlled by host defense mechanisms such as the nucleosome remodeling and deacetylase (NuRD) complex and the Argonaute proteins that mainly modify the heterochromatin environment. In embryogenesis, TEs can escape the silencing mechanisms during the maternal-to-zygotic transition when a transcriptionally permissive environment is created. Moreover, further evidence highlighted that the reactivation of TEs during early developmental stages is not the result of this genome-wide reorganization of chromatin but it is class and stage-specific, suggesting a precise regulation. In line with these premises, we explored the impact of TE transcriptional contribution in six developmental stages of X. laevis. Overall, the expression pattern referred to the entire set of transcribed TEs was constant across the six developmental stages and in line with their abundance in the genome. However, focusing on subgenome-specific TEs, our analyses revealed a distinctive transcriptional pattern dominated by LTR retroelements in the L subgenome and LINE retroelements in the S subgenome attributable to young copies. Interestingly, genes encoding proteins involved in maintaining the repressive chromatin environment were active in both subgenomes highlighting that TE controlling systems were active in X. laevis embryogenesis and evolved symmetrically.
    DOI:  https://doi.org/10.1186/s13100-025-00350-3
  15. Chromosome Res. 2025 Apr 05. 33(1): 6
      Transposable elements (TEs) are widely present in eukaryotic genomes, where they can contribute to genome size and functional modifications. As new genomes are sequenced and annotated, more studies can be conducted regarding TE content, distribution, and genome evolution. TEs are extensively diversified in fish genomes resulting in an important role in genome and chromosome evolution. However, curated TE libraries are still scarce in non-model organisms, making it difficult to evaluate TE's impact on genomic modifications thoroughly. Here, we aimed to obtain a curated TE library from the neotropical fish Apareiodon sp. genome. The prospection and curation of the TE library resulted in 244 families from 18 superfamilies of DNA transposons and retrotransposons, which comprise about 10% of the genome, with most insertions fitting in one or a few families. A greater diversity of retrotransposon families is present, especially for Ty3 superfamily. Despite the greater number of retrotransposon families, DNA transposons are the most abundant in the genome, with 37% of all TE insertions belonging to the Tc1-Mariner superfamily. Complete TE copies were observed for almost all superfamilies, with most of the sequences on the Tc1-Mariner group. DNA transposons and SINEs presented older insertions in the genome, followed by LINEs and LTR retrotransposons. TE genome density is highest in the cs25 scaffold, and enriched for Helitron elements. With these data, allied to previous studies on chromosome evolution, we suggest that cs25 bears the W chromosome specific region of the Apareiodon sp. genome, with the presence of significant amount of Helitron insertions.
    Keywords:  DNA transposons; Genome evolution; Non-model organisms; Retrotransposons; Sex chromosomes
    DOI:  https://doi.org/10.1007/s10577-025-09765-3
  16. Genome Res. 2025 Apr 10.
      Pangenome methods have the potential to uncover hitherto undiscovered sequences missing from established reference genomes, making them useful to study evolutionary and speciation processes in diverse organisms. The cichlid fishes of the East African Rift Lakes represent one of nature's most phenotypically diverse vertebrate radiations, but single-nucleotide polymorphism (SNP)-based studies have revealed little sequence difference, with 0.1%-0.25% pairwise divergence between Lake Malawi species. These were based on aligning short reads to a single linear reference genome and ignored the contribution of larger-scale structural variants (SVs). We constructed a pangenome graph that integrates six new and two existing long-read genome assemblies of Lake Malawi haplochromine cichlids. This graph intuitively represents complex and nested variation between the genomes and reveals that the SV landscape is dominated by large insertions, many exclusive to individual assemblies. The graph incorporates a substantial amount of extra sequence across seven species, the total size of which is 33.1% longer than that of a single cichlid genome. Approximately 4.73% to 9.86% of the assembly lengths are estimated as interspecies structural variation between cichlids, suggesting substantial genomic diversity underappreciated in SNP studies. Although coding regions remain highly conserved, our analysis uncovers a significant proportion of SV sequences as transposable element (TE) insertions, especially DNA, LINE, and LTR TEs. These findings underscore that the cichlid genome is shaped both by small-nucleotide mutations and large, TE-derived sequence alterations, both of which merit study to understand their interplay in cichlid evolution.
    DOI:  https://doi.org/10.1101/gr.279674.124
  17. Mol Hortic. 2025 Apr 05. 5(1): 22
      Polyploidy occurs frequently in plants and is an important force in plant evolution and crop breeding. New polyploids face various challenges due to genome duplication and subsequent changes in epigenetic modifications, nucleus/cell size and gene expression. How polyploids produce evolutionary novelty remains to be understood. In this study, a transcriptome comparison between 21-day-old diploid and autotetraploid pak choi seedlings revealed that there are few differentially expressed genes (DEGs), with a greater proportion of DEGs downregulated in response to genome duplication. Genome-wide DNA methylation analysis indicated that the level of DNA methylation is markedly increased upon genome doubling, especially in transposable elements (TEs) and their 1 kb flanking regions. The differentially methylated regions between diploid and autotetraploid pak choi were related to 12,857 differentially hypermethylated genes and 8,451 hypomethylated genes, and expression of the DEGs was negatively correlated with differential methylation across those genes. Notably, TE methylation increases significantly in regions flanking neighboring non-DEGs rather than those flanking DEGs. These results shed light on the role of DNA methylation in the transcriptional regulation of genes in polyploids and the mechanism of coping with "genome shock" due to genome doubling in cruciferous plants.
    Keywords:  Autopolyploid; DNA methylation; Pak choi; Transcriptional regulation; Transposable element
    DOI:  https://doi.org/10.1186/s43897-025-00145-3
  18. BMC Genomics. 2025 Apr 05. 26(1): 343
       BACKGROUND: As a key genus in Zingiberaceae, Curcuma is widely studied for its taxonomic diversity, the presence of bioactive curcuminoids and volatile oils, and its extensive applications in traditional medicine and economic products such as spices and cosmetics. Although chloroplast genomes have been assembled and published for over 20 Curcuma species, mitochondrial genomic data remain limited.
    RESULTS: We successfully sequenced, assembled, and annotated the mitogenome of Curcuma amarissima (C. amarissima) using both Illumina short reads and Nanopore long reads, achieving the first complete mitogenome characterization in the Zingiberaceae family. The C. amarissima mitogenome features a unique multi-branched structure, spanning 6,505,655 bp and consisting of 39 distinct segments. It contains a total of 43 protein-coding genes, 63 tRNA genes, and 4 rRNA genes, with a GC content of 44.04%. Codon usage analysis indicated a weak bias, with neutrality plot analysis suggesting natural selection as a key factor shaping mitochondrial codon usage in C. amarissima. The mitogenome provides valuable insights into genome size, coding genes, structural features, RNA editing, repetitive sequences, and sequence migration, enhancing our understanding of the evolution and molecular biology of multi-branched mitochondria in Zingiberaceae. The high frequency of repeat sequences may contribute to the structural stability of the mitochondria. Alongside chloroplast genome comparisons, phylogenetic analysis based on the mitochondrial genome establishes a foundation for further exploration of evolutionary relationships within Zingiberaceae.
    CONCLUSIONS: In short, the mitochondrial genome characterized here advances our understanding of multi-branched mitogenome organization in Zingiberaceae and offers useful genomic resources that may support future breeding, germplasm conservation, and phylogenetic studies, though further research is necessary.
    Keywords:  Curcuma amarissima; Comparative genomics; Mitogenome; Phylogeny
    DOI:  https://doi.org/10.1186/s12864-025-11540-x
  19. Front Plant Sci. 2025; 16: 1568698
      Prunus subgenus Cerasus (Mill) A. Gray, commonly known as cherries and cherry blossoms, possesses significant edible and ornamental value. However, the mitochondrial genomes (mitogenomes) of cherry species remain largely unexplored. Here, we successfully assembled the mitogenomes of five cherry species (P. campanulata, P. fruticosa, P. mahaleb, P. pseudocerasus, and P. speciosa), revealing common circular structures. The assembled mitogenomes exhibited sizes ranging from 383,398 bp to 447,498 bp, with GC content varying between 45.54% and 45.76%. A total of 62 to 69 genes were annotated, revealing variability in the copy number of protein-coding genes (PCGs) and tRNA genes. Mitogenome collinearity analysis indicated genomic rearrangements across Prunus species, driven by repetitive sequences, particularly dispersed repeats. Additionally, the five cherry species displayed highly conserved codon usage and RNA editing patterns, highlighting the evolutionary conservation of the mitochondrial PCGs. Phylogenetic analyses confirmed the monophyly of subg. Cerasus, although notable phylogenetic incongruences were observed between the mitochondrial and plastid datasets. These results provide significant genomic resources for forthcoming studies on the evolution and molecular breeding of cherry mitogenomes, enhancing the overall comprehension of mitogenome structure and evolution within Prunus.
    Keywords:  cherry; comparative analysis; evolution; mitochondrial genome; phylogenetic analysis
    DOI:  https://doi.org/10.3389/fpls.2025.1568698