bims-rednas Biomed News
on Repetitive DNA sequences
Issue of 2025–04–27
23 papers selected by
Anna Zawada, International Centre for Translational Eye Research



  1. Mov Disord. 2025 Apr 23.
      Long-read sequencing methodologies provide powerful capacity to identify all types of genomic variations in a single test. Long-read platforms such as Oxford Nanopore and PacBio have the potential to revolutionize molecular diagnostics by reaching unparalleled accuracies in genetic discovery and long-range phasing. In the field of dystonia, promising results have come from recent pilot studies showing improved detection of disease-causing structural variants and repeat expansions. Increases in throughput and ongoing reductions in cost will facilitate the incorporation of long-read approaches into mainstream diagnostic practice. Although these developments are likely to transform clinical care, there is currently a discrepancy between the potential benefits of long-read sequencing and the application of this technique to dystonia. In this review we highlight current opportunities and limitations of adopting long-read sequencing methods for the investigation of patients with dystonia. We provide examples of long-read sequencing integration into diagnostic evaluation and the study of pathomechanisms in individuals with dystonic disorders. The goal of this article is to stimulate research into the application and optimization of long-read analysis strategies in dystonia, thus enabling more precise understanding of the underlying etiology in the future. © 2025 The Author(s). Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
    Keywords:  Oxford Nanopore; PacBio; dystonia; long‐range haplotype phasing; long‐read sequencing; repeat expansions; structural variants
    DOI:  https://doi.org/10.1002/mds.30208
  2. Hum Mol Genet. 2025 Apr 23. pii: ddaf052. [Epub ahead of print]
      Huntington's disease (HD) is a fatal neurodegenerative disease caused by CAG trinucleotide repeat expansion in the huntingtin gene (Htt) resulting in an expanded polyglutamine (polyQ) tract in the huntingtin (HTT) protein. The expanded polyQ alters the structure of HTT, making it susceptible to aggregation. The expression of mutant HTT (mHTT) causes dysregulation of several key cellular pathways in neuronal cells resulting in neurodegeneration. Recent studies have demonstrated phosphorylation of the N-terminal domain of the huntingtin (N-HTT) protein as an important regulator of its localization, structure, aggregation, clearance and toxicity. Most studies have focused on the effect of phosphorylation of Ser13 and Ser16 in N-HTT on protein aggregation and reported a drastic reduction in aggregation. However, the downstream impact of this phosphorylation status on key cellular pathways is largely unexplored. Utilizing an inducible cell line model for expression of Exon 1 fragment of mHTT bearing 150 polyglutamine repeats (HD150Q), we demonstrate that kinetin-induced phosphorylation at Ser13 and Ser16 of N-HTT resulted in prevention of aggregation as well as resolution of preformed aggregates. Furthermore, kinetin treatment led to rescue of ATP levels and transcription of key genes as well as significant reduction in mitochondrial ROS levels, restoring mitochondrial function. Notably, ER stress markers were significantly reduced at transcriptional, translational and post-translational levels. Restoration of mitochondrial function and mitigation of ER stress led to significant improvement in cell survival. These findings further strengthen the view that HTT N-terminal phosphorylation is a promising therapeutic target for HD.
    Keywords:  Huntingtin phosphorylation; Huntington’s disease; Kinetin; Mutant huntingtin
    DOI:  https://doi.org/10.1093/hmg/ddaf052
  3. Neurobiol Dis. 2025 Apr 19. pii: S0969-9961(25)00130-5. [Epub ahead of print] 106914
      Huntington's Disease (HD) is caused by a CAG repeat expansion in the gene encoding Huntingtin (HTT). While normal HTT function appears impacted by the mutation, the specific pathways unique to CAG repeat expansion versus loss of normal function are unclear. To understand the impact of the CAG repeat expansion, we evaluated biological signatures of HTT knockout (HTT KO) versus those that occur from the CAG repeat expansion by applying multi-omics, live cell imaging, survival analysis and a novel feature-based pipeline to study cortical neurons (eCNs) derived from an isogenic human embryonic stem cell series (RUES2). HTT KO and the CAG repeat expansion influence developmental trajectories of eCNs, with opposing effects on growth. Network analyses of differentially expressed genes and proteins associated with enriched epigenetic motifs identified subnetworks common to CAG repeat expansion and HTT KO that include neuronal differentiation, cell cycle regulation, and mechanisms related to transcriptional repression and may represent gain-of-function mechanisms that cannot be explained by HTT loss of function alone. A combination of dominant and loss-of-function mechanisms is likely involved in the aberrant neurodevelopmental and neurodegenerative features of HD that can help inform therapeutic strategies.
    Keywords:  Embryonic stem cells; Features; Huntington's disease; Machine learning; Multi-omics; Network analysis; RUES2; Robotic microscopy
    DOI:  https://doi.org/10.1016/j.nbd.2025.106914
  4. Nat Neurosci. 2025 Apr 21.
      Genome-wide enrichment of gene-specific tandem repeat expansions has been linked to autism spectrum disorder. One such mutation is the CTG tandem repeat expansion in the 3' untranslated region of the DMPK gene, which is known to cause myotonic muscular dystrophy type 1. Although there is a clear clinical association between autism and myotonic dystrophy, the molecular basis for this connection remains unknown. Here, we report that sequestration of MBNL splicing factors by mutant DMPK RNAs with expanded CUG repeats alters the RNA splicing patterns of autism-risk genes during brain development, particularly a class of autism-relevant microexons. We demonstrate that both DMPK-CTG expansion and Mbnl null mouse models recapitulate autism-relevant mis-splicing profiles, along with social behavioral deficits and altered responses to novelty. These findings support our model that myotonic dystrophy-associated autism arises from developmental mis-splicing of autism-risk genes.
    DOI:  https://doi.org/10.1038/s41593-025-01943-0
  5. Front Bioinform. 2025 ;5 1532981
       Aims: Autism spectrum disorder (ASD) is a brain developmental disability with a not-fully clarified etiogenesis. Current ASD research largely focuses on coding regions of the genome, but to date much less is known about the contribution of non-coding elements to ASD risk. The non-coding genome is largely made of DNA repetitive sequences (RS). Although RS were once considered little more than "junk DNA", today RS have a recognized role in almost every aspect of human biology, especially in the developing human brain. Our aim was to test if RS transcription may play a role in ASD.
    Methods: Global RS transcription was first investigated in postmortem dorsolateral prefrontal cortex of 13 ASD patients and 39 matched controls. Results were validated in independent datasets.
    Results: AmnSINE1 was the only RS significantly downregulated in ASD specimens. The role of AmnSINE1 in ASD has been investigated at multiple levels, showing that the 1,416 genes containing AmnSINE1 are associated with nervous system development and autism susceptibility. This has been confirmed in a different experimental setting, such as in organoid models of the human cerebral cortex, harboring different ASD causative mutations. AmnSINE1 related genes are transcriptionally co-regulated and are involved not only in brain formation but can specifically be involved in ASD development. Looking for a possible direct role of AmnSINE1 non-coding transcripts in ASD, we report that AmnSINE1 transcripts may alter the miRNA regulatory landscape for genes involved in neurogenesis.
    Conclusion: Our findings provide preliminary evidence supporting a role for AmnSINE1 in ASD development.
    Keywords:  autism spectrum disorder; autistic disorder; embryonic development; microRNA; nervous system; neurogenesis; repetitive sequences
    DOI:  https://doi.org/10.3389/fbinf.2025.1532981
  6. EBioMedicine. 2025 Apr 23. pii: S2352-3964(25)00159-8. [Epub ahead of print]115 105715
    SPORTAX consortium
       BACKGROUND: While most sporadic adult-onset neurodegenerative diseases have only a minor monogenic component, given several recently identified late adult-onset ataxia genes, the genetic burden may be substantial in sporadic adult-onset ataxias. We report systematic mapping of the genetic landscape of sporadic adult-onset ataxia in a well-characterised, multi-centre cohort, combining several multi-modal genetic screening techniques, plus longitudinal natural history data.
    METHODS: Systematic clinico-genetic analysis of a prospective longitudinal multi-centre cohort of 377 consecutive patients with sporadic adult-onset ataxia (SPORTAX cohort), including clinically defined sporadic adult-onset ataxia of unknown aetiology (SAOA) (n = 229) and 'clinically probable multiple system atrophy of cerebellar type' (MSA-Ccp) (n = 148). Combined GAA-FGF14 (SCA27B) and RFC1 repeat expansion screening with next-generation sequencing (NGS) was complemented by natural history and plasma neurofilament light chain analysis in key subgroups.
    FINDINGS: 85 out of 377 (22.5%) patients with sporadic adult-onset ataxia carried a pathogenic or likely pathogenic variant, thereof 67/229 (29.3%) patients with SAOA and 18/148 (12.2%) patients meeting the MSA-Ccp criteria. This included: 45/377 (11.9%) patients with GAA-FGF14≥250 repeat expansions (nine with MSA-Ccp), 17/377 (4.5%) patients with RFC1 repeat expansions (three with MSA-Ccp), and 24/377 (6.4%) patients with single nucleotide variants (SNVs) identified by NGS (six with MSA-Ccp). Five patients (1.3%) were found to have two relevant genetic variants simultaneously (dual diagnosis).
    INTERPRETATION: In this cohort of sporadic adult-onset ataxia, a cohort less likely to have a monogenic cause, a substantial burden of monogenic variants was identified, particularly GAA-FGF14 and RFC1 repeat expansions. This included a substantial share of patients meeting the MSA-Ccp criteria, suggesting a reduced specificity of this clinical diagnosis and potential co-occurrence of MSA-C plus a second, independent genetic condition. These findings have important implications for the genetic work-up and counselling of patients with sporadic ataxia, even when presenting with MSA-like features. With targeted treatments for genetic ataxias now on the horizon, these findings highlight their potential utility for these patients.
    FUNDING: This work was supported by the Clinician Scientist programme "PRECISE.net" funded by the Else Kröner-Fresenius-Stiftung (to DM, AT, CW, OR, and MS), by the Deutsche Forschungsgemeinschaft (as part of the PROSPAX project), and by the Canadian Institutes of Health Research and the Fondation Groupe Monaco. Support was also provided by Humboldt Research Fellowship for Postdocs and the Hertie-Network of Excellence in Clinical Neuroscience and a Fellowship award from the Canadian Institutes of Health Research.
    Keywords:  Adult-onset ataxia; CANVAS; Disease trajectories; Genetic testing; Genomics; Multiple system atrophy; Prospective cohort; SCA27B; Sporadic ataxia
    DOI:  https://doi.org/10.1016/j.ebiom.2025.105715
  7. Exp Eye Res. 2025 Apr 21. pii: S0014-4835(25)00169-1. [Epub ahead of print] 110398
      Fuchs endothelial corneal dystrophy (FECD), which is characterized by excessive extracellular matrix (ECM) accumulation and corneal endothelial cell degeneration, has trinucleotide repeat expansion in TCF4 as a major genetic risk factor. While aberrant splicing has been implicated in FECD pathogenesis, the mechanistic link between splicing abnormalities and disease-specific features remains unclear. Here, we investigated the intron retention (IR) patterns in corneal endothelial cells from FECD patients with TCF4 expansion. Initial RNA-Seq analysis using rMATS identified 486 upregulated and 89 downregulated IR events in expansion-positive FECD compared to controls. Subsequent analysis with the more stringent IRFinder algorithm revealed 10 upregulated IR events distributed across nine genes and, notably, 6 downregulated events exclusively localized within FN1, a major component of corneal guttae. While DEXSeq analysis showed reduced expression across FN1 gene regions in FECD samples, subsequent qPCR validation in an independent cohort demonstrated significantly elevated FN1 expression in both expansion-positive and expansion-negative FECD samples compared to controls. This paradoxical finding suggests that the loss of normal IR-mediated regulation may contribute to pathological FN1 overexpression in FECD. Gene ontology analysis of IR-associated genes revealed enrichment in RNA splicing and ECM-related pathways, supporting a role for IR in disease pathogenesis. Our findings reveal an association between TCF4 expansion and reduced FN1 intron retention, which correlates with ECM accumulation, suggesting a potential link between RNA processing alterations and hallmark features of FECD. These results suggest that targeting IR-mediated regulation could represent a therapeutic strategy for preventing disease progression.
    Keywords:  Fuchs endothelial corneal dystrophy; RNA-Seq; intron retention; splicing
    DOI:  https://doi.org/10.1016/j.exer.2025.110398
  8. Front Neurol. 2025 ;16 1564856
       Background: Dentatorubral-pallidoluysian atrophy (DRPLA) is a progressive neurodegenerative disorder caused by expanded CAG repeats in the ATN1 gene, characterized by cerebellar ataxia, seizures, tremors, and myoclonus. Although approximately 10% of patients with DRPLA reportedly develop schizophrenia-like psychosis (SLP), the distinct association between the clinical course of DRPLA and SLP remains unclear. This study aimed to elucidate the clinical features of SLP in patients with DRPLA.
    Methods: We reviewed 22 cases of pathologically or genetically confirmed DRPLA with SLP, including 21 from the literature and one from our institution. Patient data, including clinical features, treatment information, and disease course, were extracted and analyzed.
    Results: The age of onset was categorized as juvenile (n = 6), early adult (n = 8), and late adult (n = 8). Initially, 10 patients presented with motor symptoms, six with psychiatric symptoms, and six with both motor and psychiatric symptoms simultaneously. Furthermore, three patients were initially diagnosed with schizophrenia, while four experienced progressive worsening of psychiatric symptoms. The number of CAG repeats ranged from 57 to 76 (mean, 66.0) in the 10 patients with a genetic diagnosis. In total, 12 patients received psychotropic medications, with nine showing improvement in delusions and hallucinations.
    Conclusion: SLP can manifest across all DRPLA forms (juvenile-, early adult-, and late adult-onset) and may precede or follow motor symptoms. The clinical course and efficacy of psychotropic medications in patients with DRPLA and SLP suggest a shared pathogenesis between DRPLA and schizophrenia.
    Keywords:  dentatorubral-pallidoluysian atrophy; polyglutamine disease; psychosis; psychotropic medications; schizophrenia
    DOI:  https://doi.org/10.3389/fneur.2025.1564856
  9. J Neurol Sci. 2025 Apr 11. pii: S0022-510X(25)00119-4. [Epub ahead of print]473 123502
       BACKGROUND: C9orf72 is one of the leading causes of genetic FTD. Although neuropsychiatric symptoms are considered a distinctive hallmark, no studies into maladaptive premorbid personality traits have been performed in C9orf72-FTD thus far.
    METHODS: We investigated differences in self-reported and informant-reported brief version of the Dutch Personality Inventory for the DSM (PID-5-NL) in symptomatic (n = 7) and presymptomatic C9orf72 repeat expansion carriers (C9orf72-REC) (n = 27), and controls (n = 18). We also explored relationships with cognitive performance and neuropsychiatric questionnaires (BDI-II-NL and NPI-Q).
    RESULTS: Informants of symptomatic C9orf72-REC reported higher total, Disinhibition and Psychoticism scores on the PID-5-NL than presymptomatic C9orf72-REC and controls. No significant differences were found between presymptomatic C9orf72 repeat expansion carriers and controls. There were no significant correlations between the Mini-Mental State Examination and Frontal Assessment Battery total scores and the PID-5-NL scores. In presymptomatic C9orf72-REC, PID-5-BF-NL total, Negative Affect, and Antagonism scores correlated with the BDI-II-NL. PID-5-IBF-NL total and Negative Affect, Antagonism, and Disinhibition scores correlated with the NPI-Q, Disinhibition correlated with the Frontal-Behavioural cluster, and Negative Affect, Antagonism, and Disinhibition correlated with the Mood and Psychotic clusters.
    CONCLUSION: Our findings suggest that the PID-5-BF-NL offers a valuable insight into the presence of maladaptive premorbid personality traits in the symptomatic phase of C9orf72-FTD.
    Keywords:  C9orf72 repeat expansion; Genetic frontotemporal dementia; Neuropsychiatric symptoms; Premorbid personality traits; Presymptomatic
    DOI:  https://doi.org/10.1016/j.jns.2025.123502
  10. Sci Data. 2025 Apr 22. 12(1): 669
      This study presents a comprehensive transcriptomic analysis of feeder-free extended pluripotent stem cells (ffEPSCs) and their parental human embryonic stem cells (ESCs), providing new insights into human early development and the cellular heterogeneity of pluripotency. Leveraging Smart-seq2-based single-cell RNA sequencing (scRNA-seq), we have compared gene expression profiles between ESCs and ffEPSCs and uncovered distinct subpopulations within both groups. Through pseudotime analysis, we have mapped the transition process from ESCs to ffEPSCs, revealing critical molecular pathways involved in the shift from a primed pluripotency to an extended pluripotent state. Additionally, we have employed repeat sequence analysis based on the latest T2T database and identified the stage-specific repeat elements contributing to regulating pluripotency and developmental transitions. This dataset deepens our understanding of early pluripotency and highlights the role of repeat sequences in early embryonic development. Our findings thus offer valuable resources for researchers in stem cell biology, pluripotency, early embryonic development, and potential cell therapy and regenerative medical applications.
    DOI:  https://doi.org/10.1038/s41597-025-05024-6
  11. Pharmacogenet Genomics. 2025 Apr 11.
       OBJECTIVES: To explore the distribution of clinically relevant UGT1A1 polymorphisms and inferred UGT1A1 phenotypes in two Indigenous groups (Paiter-Suruí and Yanomami) from reservation areas in the Brazilian Amazon.
    METHODS: Ninety-two Yanomami and 88 Paiter-Suruí were genotyped with a validated panel of ancestry informative markers. Individuals with >90% Native ancestry were genotyped for the promoter TA repeat (rs8175347) polymorphism and UGT1A1*6 (rs4148323) by direct sequencing, and for UGT1A1*80 (rs887829) by TaqMan allele discrimination. The UGT1A1 metabolic phenotypes were inferred from UGT1A1 diplotypes.
    RESULTS: All Yanomami and 85 (96.6%) Paiter-Suruí had >92% Native ancestry. UGT1A1 genotype data from these individuals revealed: (i) the absence of both alleles with five and eight TA repeats [TA(5) and TA(8)]; (ii) TA(7) allele frequency of 0.470 in Yanomami and 0.441 in Paiter-Suruí; (iii) rs4148323 was absent in Paiter-Suruí and detected in two Yanomami (frequency 0.012); (iv) a perfect linkage disequilibrium (LD) between rs887829C>T and the promoter repeat polymorphisms in both cohorts: C allele with TA(6) and T allele with TA(7). The distribution of the inferred UGT1A1 metabolizer phenotypes did not differ between cohorts (Paiter-Suruí and Yanomami): the intermediate metabolizer was the most common (50.6-55.4%), followed by the normal (30.6-24.1%) and the slow (18.8-20.5%) phenotypes.
    CONCLUSION: This is the first report on the frequency distribution of clinically relevant UGT1A1 variants and inferred UGT1A1 metabolic phenotypes in two major Native populations from indigenous reservation areas in the Brazilian Amazon, namely the Paiter-Suruí and Yanomami. The TA(5) and TA(8) repeats were absent, whereas TA(7) was common (frequency >0.40) in both cohorts. The intronic single nucleotide variant rs887829 (UGT1A1*80) was found in perfect LD with the promoter TA repeats. The rs4148323 SNP was absent (Paiter-Suruí) or rare (Yanomami). The frequency of high-risk UGT1A1 poor metabolizer phenotype was 1.6- to 2-fold higher in the indigenous cohorts compared to nonindigenous Brazilians.
    Keywords:  Brazilian Amazon; Native populations; UGT1A1 metabolic phenotypes; pharmacogenetics
    DOI:  https://doi.org/10.1097/FPC.0000000000000566
  12. Chemistry. 2025 Apr 22. e202501377
      Telomeric DNA forms G-quadruplex (G4) structures. These G4 structures are crucial for genomic stability and therapeutic targeting. Using time-resolved NMR and CD spectroscopies, we investigated how the ligand Phen-DC3 modulates the folding of the human telomeric repeat 23TAG DNA. The kinetics are modulated by the ligand and by the presence of potassium cations (K+). Ligand binding to G4 occurs via a triphasic process with fast and slow phases. Notably, for the G4 structure in the presence of K+, the slow rate is ten times slower than without K+. These findings offer key insights into the modulation of the complex folding landscape of G4s by ligands, advancing our understanding of G4-ligand interactions for potential therapeutic applications.
    Keywords:  DNA ligand interaction; NMR spectroscopy; DNA structures; DNA folding kinetics; G-quadruplexes
    DOI:  https://doi.org/10.1002/chem.202501377
  13. Anal Biochem. 2025 Apr 19. pii: S0003-2697(25)00114-9. [Epub ahead of print]703 115876
      It has been recently shown that for Bst DNA polymerase, the side isothermal amplification reaction named multimerization (MM) proceeds under certain conditions. MM hinders interpretation of amplification results and reduces the accuracy and reliability of DNA/RNA diagnostics. Here, the mechanism of MM caused by strand-displacement DNA polymerases is reported. The mechanism includes the following key stages: 1) envelopment of the enzyme globule by the synthesized DNA strand, facilitated by DNA breathing, 2) convergence of the 3'-ends of the DNA strands and pseudo-cyclic trigger DNA structure formation, 3) synthesis of the products with repeated motifs resulting in their expansion due to DNA slippage. Initiation of MM reaction occurs with extremely low probability, however, the resulting few trigger DNA structures are efficiently amplified and ultimately lead to the accumulation of nonspecific amplicons (multimers). Molecular models with certain steric and thermodynamic characteristics were used to confirm the proposed mechanism. The highest MM efficiency was observed for DNA templates and reaction conditions that facilitated DNA breathing, complete envelopment of the enzyme globule with DNA strands and convergence of their 3'-ends.
    Keywords:  DNA multimerization; DNA polymerase; Isothermal nucleic acids amplification; Nonspecific DNA synthesis; Strand-displacement activity
    DOI:  https://doi.org/10.1016/j.ab.2025.115876
  14. Mob DNA. 2025 Apr 22. 16(1): 20
       BACKGROUND: Transposons are DNA sequences able to move or copy themselves to other genomic locations leading to insertional mutagenesis. Although transposon-derived sequences account for half of the human genome, most elements are no longer transposition competent. Moreover, transposons are normally repressed through epigenetic silencing in healthy adult tissues but become derepressed in several human cancers, with high activity detected in colorectal cancer. Their impact on non-malignant and malignant tissue as well as the differences between somatic and germline retrotransposition remain poorly understood. With new sequencing technologies, including long read sequencing, we can access intricacies of retrotransposition, such as insertion sequence details and nested repeats, that have been previously challenging to characterize.
    RESULTS: In this study, we investigate somatic and germline retrotransposition by analyzing long read sequencing from 56 colorectal cancers and 112 uterine leiomyomas. We identified 1495 somatic insertions in colorectal samples, while a striking lack of insertions was detected in uterine leiomyomas. Our findings highlight differences between somatic and germline events, such as transposon type distribution, insertion length, and target site preference. Leveraging long-read sequencing, we provide an in-depth analysis of the twin-priming phenomenon, detecting it across transposable element types that remain active in humans, including Alus. Additionally, we detect an abundance of germline transposons in repetitive DNA, along with a relationship between replication timing and insertion target site.
    CONCLUSIONS: Our study reveals a stark contrast in somatic transposon activity between colorectal cancers and uterine leiomyomas, and highlights differences between somatic and germline transposition. This suggests potentially different conditions in malignant and non-malignant tissues, as well as in germline and somatic tissues, which could be involved in the transposition process. Long-read sequencing provided important insights into transposon behavior, allowing detailed examination of structural features such as twin priming and nested elements.
    Keywords:  Colorectal cancer; L1; Long read sequencing; Retrotransposition; Transposable element; Uterine leiomyoma
    DOI:  https://doi.org/10.1186/s13100-025-00357-w
  15. Mol Biol Evol. 2025 Apr 25. pii: msaf097. [Epub ahead of print]
      Protein domains of transposable elements (TEs) and viruses increase the protein diversity of host genomes by recombining with other protein domains. By screening 10 million eukaryotic proteins, we identified several domains that define multi-copy gene families and frequently co-occur with TE/viral domains. Among these, a Tc1/Mariner transposase helix-turn-helix (HTH) domain was captured by F-box genes in the Caenorhabditis genus, creating a new class of F-box genes. For specific members of this class, like fbxa-215, we found that the HTH domain is required for diverse processes including germ granule localisation, fertility, and thermotolerance. Furthermore, we provide evidence that Heat Shock Factor 1 (HSF-1) mediates the transcriptional integration of fbxa-215 into the heat-shock response by binding to Helitron TEs directly upstream of the fbxa-215 locus. The interactome of HTH-bearing F-box factors suggests roles in post-translational regulation and proteostasis, consistent with established functions of F-box proteins. Based on AlphaFold2 multimer proteome-wide screens, we propose that the HTH domain may diversify the repertoire of protein substrates that F-box factors regulate post-translationally. We also describe an independent capture of a TE domain by F-box genes in zebrafish. In conclusion, we identify two independent TE domain captures by F-box genes in eukaryotes and provide insights into how these novel proteins are integrated within host gene regulatory networks.
    DOI:  https://doi.org/10.1093/molbev/msaf097
  16. Nucleic Acids Res. 2025 Apr 22. pii: gkaf329. [Epub ahead of print]53(8):
      Inverted repeats are repetitive elements that can form hairpin and cruciform structures. They are linked to genomic instability; however, they also have various biological functions. Their distribution differs markedly across taxonomic groups in the tree of life, and they exhibit high polymorphism due to their inherent genomic instability. Advances in sequencing technologies and declining costs have enabled the generation of an ever-growing number of complete genomes for organisms across taxonomic groups in the tree of life. However, a comprehensive database encompassing inverted repeats across diverse organismal genomes has been lacking. We present invertiaDB, the first comprehensive database of inverted repeats spanning multiple taxa, featuring repeats identified in the genomes of 118 101 organisms across all major taxonomic groups. For each organism, we derived inverted repeats with arm lengths of at least 10 bp, spacer lengths up to 8 bp, and no mismatches in the arms. The database currently hosts 34 330 450 inverted repeat sequences, serving as a centralized, user-friendly repository to perform searches and interactive visualizations, and download existing inverted repeat data for independent analysis. invertiaDB is implemented as a web portal for browsing, analyzing, and downloading inverted repeat data. invertiaDB is publicly available at https://invertiadb.netlify.app/homepage.html.
    DOI:  https://doi.org/10.1093/nar/gkaf329
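    Illustration (not from the paper): the selection criteria quoted above (arms of at least 10 bp, spacer of at most 8 bp, no mismatches) can be sketched as a naive scan. This is a minimal sketch under stated assumptions, not the invertiaDB pipeline; the function names, the output tuple layout, and the rule of reporting each repeat once at its smallest spacer are all assumptions made for this example.

```python
# Naive scan for perfect inverted repeats (hypothetical helper names).
# An inverted repeat here is a left arm, a short spacer, and a right arm
# that is the reverse complement of the left arm.

PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pairs(a: str, b: str) -> bool:
    """True if bases a and b are Watson-Crick complements."""
    return PAIR.get(a) == b


def find_inverted_repeats(seq: str, min_arm: int = 10, max_spacer: int = 8):
    """Return (left_arm_start, arm_length, spacer_length) tuples.

    Each repeat is extended outward from its centre as far as the bases
    pair, and is reported once, at its smallest possible spacer.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for spacer in range(max_spacer + 1):
        # i = end of left arm (exclusive); j = start of right arm.
        for i in range(min_arm, n - spacer - min_arm + 1):
            j = i + spacer
            # If the outermost spacer bases pair, the same repeat exists
            # with a smaller spacer and longer arms, so skip it here.
            if spacer >= 2 and pairs(seq[i], seq[j - 1]):
                continue
            arm = 0
            while i - arm - 1 >= 0 and j + arm < n and pairs(seq[i - arm - 1], seq[j + arm]):
                arm += 1
            if arm >= min_arm:
                hits.append((i - arm, arm, spacer))
    return hits
```

    The scan is quadratic in the worst case, which is fine for a toy sequence but would presumably need a faster method for whole genomes.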
  17. Physiol Mol Biol Plants. 2025 Mar;31(3): 357-373
      Ailanthus excelsa is a fast-growing, multipurpose agroforestry tree species of Indian Arid Regions (IAR). It is widely cultivated as trees outside forests (TOFs) on farm lands, roadsides, canal banks, etc., where the genetic stocks were randomly planted. To ensure the availability of quality planting materials (QPM) for industrial profitability, the germplasm must undergo a systematic genetic improvement program. Genetic variability in the base population is crucial for effective selection, but the lack of genomic resources and markers impedes this process. This study aimed to generate genome sequence information and de novo development of simple sequence repeats (SSRs) in A. excelsa. About 96 million raw reads were generated using Illumina platform, assembled into ~ 183,000 contigs with 33% GC content and an N50 value of 641 bp. A total of 7,667 microsatellite repeats were identified, with di-nucleotides being the most abundant. AT rich repeats were more prevalent than GC rich motifs. A total of 3,696 primer pairs were designed, and 150 of these were selected for validation. In PCR, 145 SSRs were positively amplified and 15 showed polymorphic banding patterns. These polymorphic SSRs were used to characterize 213 individuals from northern and central India. SSR analysis revealed high gene diversity (He = 0.71; Ar = 9.12) with negligible genetic differentiation in populations. The study presents a comprehensive set of de novo SSR markers and provides baseline knowledge of genetic structure of A. excelsa, essential for conservation and long-term genetic improvement programs.
    Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-025-01566-6.
    Keywords:  Ailanthus excelsa; Genetic diversity; Genetic structure; Genome sequencing; Indian Arid Regions; SSRs
    DOI:  https://doi.org/10.1007/s12298-025-01566-6
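    Illustration (not from the paper): microsatellite mining of the kind described above is often done by scanning contigs for perfect tandem repeats of short motifs. The sketch below uses regular expressions. The abstract does not state which tool or thresholds were used, so the minimum repeat counts and all function names here are assumptions chosen for the example.

```python
import re

# Assumed minimum number of tandem copies per motif length (di- to hexa-
# nucleotide); the study's actual thresholds are not given in the abstract.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a tandem repeat of a shorter unit
    (so 'AT' passes, but 'AA' or 'ATAT' do not)."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq: str):
    """Return sorted (start, motif, copy_number) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One captured motif of length k, followed by at least
        # (min_rep - 1) further exact copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

    Counting hits by `len(motif)` would give the kind of per-class tally (di-, tri-, tetra-nucleotide) that the abstract summarizes.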
  18. Cytogenet Genome Res. 2025 Apr 24. 1-22
       INTRODUCTION: Telomeric sequences are stable parts of the genome and are widely conserved among higher-level taxa (e.g., TTAGG in insects and other arthropods), although exceptions are known and their numbers are increasing with research. The true bug suborder Heteroptera (Hemiptera) includes more than 40,000 species in about 100 families, classified into seven infraorders. Four different telomeric motifs are currently known in Heteroptera, including (TTAGG)n, (TTAGGGATGG)n, (TTAGGGGTGG)n, and (TTAGGGTGGT)n. The canonical "insect" motif (TTAGG)n was found in representatives of two infraorders, Nepomorpha and Cimicomorpha. Derived motifs were found in a few species previously known as TTAGG-negative in the evolutionarily advanced sister infraorders Cimicomorpha and Pentatomomorpha (= Terheteroptera). Here, we studied telomeric motifs in 20 species of true bugs belonging to 10 families of Terheteroptera.
    METHODS: We used fluorescence in situ hybridization (FISH) with the "insect" telomeric probe (TTAGG)n and an alternative probe (TTAGGGATGG)n to map the distribution of telomeric sequences in the chromosomes of 8 species of Pentatomomorpha (from the families Pentatomidae, Rhopalidae, Lygaeidae, Geocoridae, and Blissidae). We also analyzed chromosome-level genome assemblies available in the NCBI database for another 4 species of Pentatomomorpha (from Alydidae, Coreidae, and Pentatomidae) and 8 species of Cimicomorpha (from Reduviidae, Miridae, and Anthocoridae).
    RESULTS: Overall, we identified telomeric sequences in all but one (Geocoris dispar; Geocoridae) species. The telomeric motif (TTAGGGATGG)n was detected in both Cimicomorpha (in the families Anthocoridae and Miridae) and Pentatomomorpha (in Blissidae, Lygaeidae, Pentatomidae, and Rhopalidae); the motif (TTAGGGGTGG)n was found only in Pentatomomorpha (in Alydidae, Coreidae, and Pentatomidae); and the canonical "insect" motif (TTAGG)n was found in the family Reduviidae (Cimicomorpha). With our new data, telomeric motifs are now known for 40 species of true bugs from 30 genera, 13 families and 3 infraorders, including Nepomorpha, Cimicomorpha, and Pentatomomorpha. Non-canonical motifs are found so far only in the Terheteroptera clade and are dominant in this group, with (TTAGGGATGG)n leading.
    CONCLUSIONS: Our new data have expanded the understanding of telomere composition and evolution in Cimicomorpha and Pentatomomorpha and suggested that (TTAGGGATGG)n telomeric sequences can be considered ancestral for the entire clade Terheteroptera.
    DOI:  https://doi.org/10.1159/000545902
  19. J Phys Chem B. 2025 Apr 22.
      G-quadruplexes (G4s) in the telomere region are important targets for cancer therapy. Molecules that can fold and stabilize the telomere DNA sequences, even in the absence of salt, can be an exciting prospect for therapy purposes. The anti-inflammatory drugs hydroxychloroquine (HCQ) and chloroquine (CQ) have shown promising effects in cancer therapy and are at different trial stages. In this study, we have investigated the structure and stability of several natural and mutated telomeric sequences with anti-inflammatory drugs and their analogues in the absence of salts, using biophysical and docking methods, to understand the role of the quartet and loop nucleobases of DNA along with the functional groups of the drugs responsible for triggering the folding of telomeric DNA sequences into G4. The findings indicate that hydrogen bonding between the charged side chain and the guanine repeating unit associated with the quartet and the thymine in the terminal loops of telomere DNA is the main driving force for the folding of telomere DNA sequences into G4 induced by anti-inflammatory drugs. The data indicate that the adenine nucleobase in the loop of the telomere does not play any role in its folding process induced by HCQ and CQ.
    DOI:  https://doi.org/10.1021/acs.jpcb.5c00526
  20. Plants (Basel). 2025 Mar 11. pii: 874. [Epub ahead of print]14(6):
      Solanum quitoense and S. betaceum, called naranjilla and tomate de arbol, respectively, are both tropical Andean fruits of growing interest in the region. Microsatellite primers (SSRs) identified by NGS technology in both species were screened for the development of SSR marker technology. In S. quitoense, 41 primers were successfully transferred to six closely related Lasiocarpa species. Using multiplex primer combinations with the M13-tailing technology in the LI-COR 4300s DNA analyzer, the variability of these primers in seven S. quitoense landraces was characterized. This SSR survey confirmed the narrow genetic base of S. quitoense cultivars, with polymorphism in 14 SSR markers. Moreover, transferability rates and genetic diversity analysis revealed a closer genetic relationship between the species S. candidum and S. hirtum among the Lasiocarpa germplasm screened. On the other hand, 110 SSR primers were screened in four cultivars, segregating plants and wild-related accessions of S. betaceum. Polymorphisms were found for only eight SSR primers, and only when the wild relative S. unilobum was included; within S. betaceum, no SSR showed polymorphism, confirming the high genetic homogeneity of the cultivars. The results of this study are potentially useful for S. quitoense and S. betaceum genomics, providing an initial set of SSR markers for molecular characterization in S. quitoense germplasm and perspectives for S. betaceum.
    Keywords:  Andean fruit crops; DNA genotyping; SSRs; Solanum betaceum; Solanum quitoense
    DOI:  https://doi.org/10.3390/plants14060874
  21. Mol Biol Evol. 2025 Apr 24. pii: msaf093. [Epub ahead of print]
      Purifying selection is expected to prevent the accumulation of transposable elements (TEs) within their host, especially when located in and around genes and if affected by epigenetic silencing. However, positive selection may favour the spread of TEs causing genomic imprinting under parental conflict, as genomic imprinting allows parent-specific influence over resource accumulation to the progeny. Concomitantly, the number and frequency of TE insertions in natural populations are conditioned by demographic events. In this study, we aimed to test how demography and selective forces interact to affect the accumulation of TEs around genes, depending on their epigenetic silencing and with a particular focus on imprinted genes. To this aim, we compared the frequency and distribution of TEs in A. lyrata from Europe and North America. Generally, we found that TE insertions showed a lower frequency when they were inserted in or near genes, especially TEs targeted by epigenetic silencing, suggesting purifying selection at work. We also found that many TEs were lost or got fixed in North American populations during the colonization and the postglacial range expansion from refugia of the species in North America, as well as during the transition to selfing, suggesting a potential "TE load". Finally, we found that silenced TEs increased in frequency and even tended to reach fixation when they were linked to imprinted genes. We conclude that in A. lyrata, genomic imprinting has spread in natural populations through demographic events and positive selection acting on silenced TEs, potentially under a parental conflict scenario.
    Keywords:  Arabidopsis lyrata; Transposable elements; demographic history; genomic imprinting; positive selection; transposon load
    DOI:  https://doi.org/10.1093/molbev/msaf093
  22. Plant Sci. 2025 Apr 21. pii: S0168-9452(25)00137-2. [Epub ahead of print] 112519
      Early senescence in plants significantly affects photosynthetic efficiency, crop yield, and overall plant vigor. In this study, we identified a spontaneous cucumber mutant, NW079, exhibiting premature leaf yellowing, reduced chlorophyll content, and impaired photosynthetic performance. To uncover the genetic basis of this phenotype, we generated F₂ mapping populations and employed bulked segregant analysis and fine mapping. These efforts led to the identification of a 5.5-kb long terminal repeat (LTR) retrotransposon insertion within the first exon of CsPHYB, a gene encoding phytochrome B. This insertion disrupted normal splicing and gave rise to two aberrant transcript variants: one containing a 261-bp LTR-derived sequence with premature stop codons, and the other harboring a 1,914-bp deletion due to exon skipping. Both variants are predicted to produce truncated, nonfunctional proteins. Functional analyses revealed that CsPHYB deficiency resulted in heightened sensitivity to varying light qualities and intensities, leading to pronounced leaf yellowing and reduced leaf area. RNA sequencing revealed widespread transcriptional reprogramming in NW079, with 580 differentially expressed genes (DEGs) implicated in heme metabolism, tetrapyrrole binding, and chloroplast development. These transcriptional disruptions were closely linked to the observed structural and functional abnormalities in chloroplasts. This study provides a molecular framework for understanding early senescence in cucumber, offering valuable insights for breeding strategies aimed at improving crop resilience and productivity. Key message: An LTR retrotransposon insertion in the first exon of CsPHYB disrupts its expression and splicing, leading to early leaf senescence in cucumber. This finding provides novel insights into the role of CsPHYB in chloroplast development and light signaling, offering valuable molecular markers and a target gene for cucumber breeding programs focused on enhancing yield and stress resilience.
    Keywords:  Chloroplast development; CsPHYB; Early senescence; LTR retrotransposon; cucumber
    DOI:  https://doi.org/10.1016/j.plantsci.2025.112519
  23. Sci Rep. 2025 Apr 23. 15(1): 14030
      The SABATH gene family in plants participates in metabolic processes such as methylation of various hormones and plays an essential role in plant response to abiotic stress. In this study, we identified and sequenced 28 SABATH genes in soybean and divided them into three groups. Genome mapping annotation suggested that tandem repeats were the cause of SABATH gene amplification in soybean. Phylogenetic and homology analyses show that the three groups may have originated from different ancestors. Transcriptome analysis was performed in six soybean tissues using public transcriptome data. In addition, transcriptome and gene expression analyses revealed their expression patterns in different soybean varieties and under various abiotic stresses. These results reveal the differential expression of GmSABATH genes under these stresses, indicating their potential role in the mechanism by which soybean adapts to environmental challenges. These results provide reference information for the evolutionary study of the SABATH family and the potential role of GmSABATHs in soybean resistance to abiotic stress.
    Keywords:  SABATH gene family; Abiotic stress; Gene expression; Soybean; Transcriptome analysis
    DOI:  https://doi.org/10.1038/s41598-025-98467-1