bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2024‒06‒16
twenty-one papers selected by
Connor Rogerson, University of Cambridge



  1. Mol Cell. 2024 Jun 05. pii: S1097-2765(24)00437-4. [Epub ahead of print]
      The mechanisms and timescales controlling de novo establishment of chromatin-mediated transcriptional silencing by Polycomb repressive complex 2 (PRC2) are unclear. Here, we investigate PRC2 silencing at Arabidopsis FLOWERING LOCUS C (FLC), known to involve co-transcriptional RNA processing, histone demethylation activity, and PRC2 function, but so far not mechanistically connected. We develop and test a computational model describing proximal polyadenylation/termination mediated by the RNA-binding protein FCA that induces H3K4me1 removal by the histone demethylase FLD. H3K4me1 removal feeds back to reduce RNA polymerase II (RNA Pol II) processivity and thus enhance early termination, thereby repressing productive transcription. The model predicts that this transcription-coupled repression controls the level of transcriptional antagonism to PRC2 action. Thus, the effectiveness of this repression dictates the timescale for establishment of PRC2/H3K27me3 silencing. We experimentally validate these mechanistic model predictions, revealing that co-transcriptional processing sets the level of productive transcription at the locus, which then determines the rate of the ON-to-OFF switch to PRC2 silencing.
    Keywords:  H3K27me3; H3K4me1; Polycomb silencing; alternative polyadenylation; analog and digital gene regulation; co-transcriptional processing; feedback interactions; mechanistic mathematical modeling; proximal termination; transcriptional antagonism
    DOI:  https://doi.org/10.1016/j.molcel.2024.05.014
  2. Nat Commun. 2024 Jun 08. 15(1): 4914
      FOXA family proteins act as pioneer factors by remodeling compact chromatin structures. FOXA1 is crucial for the chromatin binding of the androgen receptor (AR) in both normal prostate epithelial cells and the luminal subtype of prostate cancer (PCa). Recent studies have highlighted the emergence of FOXA2 as an adaptive response to AR signaling inhibition treatments. However, the role of the FOXA1 to FOXA2 transition in regulating cancer lineage plasticity remains unclear. Our study demonstrates that FOXA2 binds to distinct classes of developmental enhancers in multiple AR-independent PCa subtypes, with its binding depending on LSD1. Moreover, we reveal that FOXA2 collaborates with JUN at chromatin and promotes transcriptional reprogramming of AP-1 in lineage-plastic cancer cells, thereby facilitating cell state transitions to multiple lineages. Overall, our findings underscore the pivotal role of FOXA2 as a pan-plasticity driver that rewires AP-1 to induce the differential transcriptional reprogramming necessary for cancer cell lineage plasticity.
    DOI:  https://doi.org/10.1038/s41467-024-49234-9
  3. Commun Biol. 2024 Jun 11. 7(1): 719
      Estrogen Receptor α (ERα) is a major lineage determining transcription factor (TF) in mammary gland development. Dysregulation of ERα-mediated transcriptional program results in cancer. Transcriptomic and epigenomic profiling of breast cancer cell lines has revealed large numbers of enhancers involved in this regulatory program, but how these enhancers encode function in their sequence remains poorly understood. A subset of ERα-bound enhancers are transcribed into short bidirectional RNA (enhancer RNA or eRNA), and this property is believed to be a reliable marker of active enhancers. We therefore analyze thousands of ERα-bound enhancers and build quantitative, mechanism-aware models to discriminate eRNAs from non-transcribing enhancers based on their sequence. Our thermodynamics-based models provide insights into the roles of specific TFs in ERα-mediated transcriptional program, many of which are supported by the literature. We use in silico perturbations to predict TF-enhancer regulatory relationships and integrate these findings with experimentally determined enhancer-promoter interactions to construct a gene regulatory network. We also demonstrate that the model can prioritize breast cancer-related sequence variants while providing mechanistic explanations for their function. Finally, we experimentally validate the model-proposed mechanisms underlying three such variants.
    DOI:  https://doi.org/10.1038/s42003-024-06400-5
  4. Nucleic Acids Res. 2024 Jun 08. pii: gkae476. [Epub ahead of print]
      During early development, gene expression is tightly regulated. However, how genome organization controls gene expression during the transition from naïve embryonic stem cells to epiblast stem cells is still poorly understood. Using single-molecule microscopy approaches to reach nanoscale resolution, we show that genome remodeling affects gene transcription during pluripotency transition. Specifically, after exit from the naïve pluripotency state, chromatin becomes less compacted, and the OCT4 transcription factor has lower mobility and is more bound to its cognate sites. In epiblast cells, the active transcription hallmark, H3K9ac, decreases within the Oct4 locus, correlating with reduced accessibility of OCT4 and, in turn, with reduced expression of Oct4 nascent RNAs. Despite the high variability in the distances between active pluripotency genes, distances between Nodal and Oct4 decrease during epiblast specification. In particular, highly expressed Oct4 alleles are closer to nuclear speckles during all stages of the pluripotency transition, while only a distinct group of highly expressed Nodal alleles are in close proximity to Oct4 when associated with a nuclear speckle in epiblast cells. Overall, our results provide new insights into the role of the spatiotemporal genome remodeling during mouse pluripotency transition and its correlation with the expression of key pluripotency genes.
    DOI:  https://doi.org/10.1093/nar/gkae476
  5. Nat Commun. 2024 Jun 11. 15(1): 4962
      In all eukaryotes, acetylation of histone lysine residues correlates with transcription activation. Whether histone acetylation is a cause or consequence of transcription is debated. One model suggests that transcription promotes the recruitment and/or activation of acetyltransferases, and histone acetylation occurs as a consequence of ongoing transcription. However, the extent to which transcription shapes the global protein acetylation landscapes is not known. Here, we show that global protein acetylation remains virtually unaltered after acute transcription inhibition. Transcription inhibition ablates the co-transcriptionally occurring ubiquitylation of H2BK120 but does not reduce histone acetylation. The combined inhibition of transcription and CBP/p300 further demonstrates that acetyltransferases remain active and continue to acetylate histones independently of transcription. Together, these results show that histone acetylation is not a mere consequence of transcription; acetyltransferase recruitment and activation are uncoupled from the act of transcription, and histone and non-histone protein acetylation are sustained in the absence of ongoing transcription.
    DOI:  https://doi.org/10.1038/s41467-024-49370-2
  6. Genome Biol. 2024 Jun 13. 25(1): 156
      BACKGROUND: Genetic changes that modify the function of transcriptional enhancers have been linked to the evolution of biological diversity across species. Multiple studies have focused on the role of nucleotide substitutions, transposition, and insertions and deletions in altering enhancer function. CpG islands (CGIs) have recently been shown to influence enhancer activity, and here we test how their turnover across species contributes to enhancer evolution.
    RESULTS: We integrate maps of CGIs and enhancer activity-associated histone modifications obtained from multiple tissues in nine mammalian species and find that CGI content in enhancers is strongly associated with increased histone modification levels. CGIs show widespread turnover across species and species-specific CGIs are strongly enriched for enhancers exhibiting species-specific activity across all tissues and species. Genes associated with enhancers with species-specific CGIs show concordant biases in their expression, supporting that CGI turnover contributes to gene regulatory innovation. Our results also implicate CGI turnover in the evolution of Human Gain Enhancers (HGEs), which show increased activity in human embryonic development and may have contributed to the evolution of uniquely human traits. Using a humanized mouse model, we show that a highly conserved HGE with a large CGI absent from the mouse ortholog shows increased activity at the human CGI in the humanized mouse diencephalon.
    CONCLUSIONS: Collectively, our results point to CGI turnover as a mechanism driving gene regulatory changes potentially underlying trait evolution in mammals.
    Keywords:  Comparative genomics; Gene regulation; Orphan CpG islands; Transcriptional enhancer evolution
    DOI:  https://doi.org/10.1186/s13059-024-03300-z
  7. Elife. 2024 Jun 10. pii: e95856. [Epub ahead of print]13
      Once fertilized, mouse zygotes rapidly proceed to zygotic genome activation (ZGA), during which long terminal repeats (LTRs) of murine endogenous retroviruses with leucine tRNA primer (MERVL) are activated by a conserved homeodomain-containing transcription factor, DUX. However, Dux-knockout embryos produce fertile mice, suggesting that ZGA is redundantly driven by an unknown factor(s). Here we present multiple lines of evidence that the multicopy homeobox gene, Obox4, encodes a transcription factor that is highly expressed in mouse 2-cell embryos and redundantly drives ZGA. Genome-wide profiling revealed that OBOX4 specifically binds and activates MERVL LTRs as well as a subset of murine endogenous retroviruses with lysine tRNA primer (MERVK) LTRs. Depletion of Obox4 is tolerated by embryogenesis, whereas concomitant Obox4/Dux depletion markedly compromises embryonic development. Our study identified OBOX4 as a transcription factor that provides genetic redundancy to pre-implantation development.
    Keywords:  developmental biology; mouse
    DOI:  https://doi.org/10.7554/eLife.95856
  8. Mol Cell. 2024 Jun 07. pii: S1097-2765(24)00439-8. [Epub ahead of print]
      The interconnections between co-transcriptional regulation, chromatin environment, and transcriptional output remain poorly understood. Here, we investigate the mechanism underlying RNA 3' processing-mediated Polycomb silencing of Arabidopsis FLOWERING LOCUS C (FLC). We show a requirement for ANTHESIS PROMOTING FACTOR 1 (APRF1), a homolog of yeast Swd2 and human WDR82, known to regulate RNA polymerase II (RNA Pol II) during transcription termination. APRF1 interacts with TYPE ONE SERINE/THREONINE PROTEIN PHOSPHATASE 4 (TOPP4) (yeast Glc7/human PP1) and LUMINIDEPENDENS (LD), the latter showing structural features found in Ref2/PNUTS, all components of the yeast and human phosphatase module of the CPF 3' end-processing machinery. LD has been shown to co-associate in vivo with the histone H3 K4 demethylase FLOWERING LOCUS D (FLD). This work shows how the APRF1/LD-mediated polyadenylation/termination process influences subsequent rounds of transcription by changing the local chromatin environment at FLC.
    Keywords:  COOLAIR; ChIP; FLC; IP-MS; Quant-seq; RNA 3′ processing; chromatin silencing; co-transcriptional processing; plaNET-seq; transcription termination
    DOI:  https://doi.org/10.1016/j.molcel.2024.05.016
  9. Cell Rep. 2024 Jun 11. pii: S2211-1247(24)00673-9. [Epub ahead of print]43(6): 114345
      Ferroptosis is an iron-dependent cell death mechanism characterized by the accumulation of toxic lipid peroxides and cell membrane rupture. GPX4 (glutathione peroxidase 4) prevents ferroptosis by reducing these lipid peroxides into lipid alcohols. Ferroptosis induction by GPX4 inhibition has emerged as a vulnerability of cancer cells, highlighting the need to identify ferroptosis regulators that may be exploited therapeutically. Through genome-wide CRISPR activation screens, we identify the SWI/SNF (switch/sucrose non-fermentable) ATPases BRM (SMARCA2) and BRG1 (SMARCA4) as ferroptosis suppressors. Mechanistically, they bind to and increase chromatin accessibility at NRF2 target loci, thus boosting NRF2 transcriptional output to counter lipid peroxidation and confer resistance to GPX4 inhibition. We further demonstrate that the BRM/BRG1 ferroptosis connection can be leveraged to enhance the paralog dependency of BRG1 mutant cancer cells on BRM. Our data reveal ferroptosis induction as a potential avenue for broadening the efficacy of BRM degraders/inhibitors and define a specific genetic context for exploiting GPX4 dependency.
    Keywords:  CP: Cancer; CP: Molecular biology
    DOI:  https://doi.org/10.1016/j.celrep.2024.114345
  10. Cell Rep. 2024 Jun 12. pii: S2211-1247(24)00678-8. [Epub ahead of print]43(6): 114350
      Renal cell carcinoma with sarcomatoid differentiation (sRCC) is associated with poor survival and a heightened response to immune checkpoint inhibitors (ICIs). Two major barriers to improving outcomes for sRCC are the limited understanding of its gene regulatory programs and the low diagnostic yield of tumor biopsies due to spatial heterogeneity. Herein, we characterized the epigenomic landscape of sRCC by profiling 107 epigenomic libraries from tissue and plasma samples from 50 patients with RCC and healthy volunteers. By profiling histone modifications and DNA methylation, we identified highly recurrent epigenomic reprogramming enriched in sRCC. Furthermore, CRISPRa experiments implicated the transcription factor FOSL1 in activating sRCC-associated gene regulatory programs, and FOSL1 expression was associated with the response to ICIs in RCC in two randomized clinical trials. Finally, we established a blood-based diagnostic approach using detectable sRCC epigenomic signatures in patient plasma, providing a framework for discovering epigenomic correlates of tumor histology via liquid biopsy.
    Keywords:  CP: Cancer; CP: Genomics; FOSL1; epigenomics; histone modifications; immune checkpoint inhibitors; immunotherapy; kidney cancer; liquid biopsy; renal cell carcinoma; sarcomatoid; transcription factors
    DOI:  https://doi.org/10.1016/j.celrep.2024.114350
  11. Nucleic Acids Res. 2024 Jun 14. pii: gkae498. [Epub ahead of print]
      Long terminal repeats (LTRs), which often contain promoter and enhancer sequences of intact endogenous retroviruses (ERVs), are known to be co-opted as cis-regulatory elements for fine-tuning host-coding gene expression. Since LTRs are mainly silenced by the deposition of repressive epigenetic marks, substantial activation of LTRs has been found in human cells after treatment with epigenetic inhibitors. Although the LTR12C family makes up the majority of ERVs activated by epigenetic inhibitors, how these epigenetically and transcriptionally activated LTR12C elements can regulate the host-coding gene expression remains unclear due to genome-wide alteration of transcriptional changes after epigenetic inhibitor treatments. Here, we specifically transactivated >600 LTR12C elements by using single guide RNA-based dCas9-SunTag-VP64, a site-specific targeting CRISPR activation (CRISPRa) system, with minimal off-target events. Interestingly, most of the transactivated LTR12C elements acquired the H3K27ac-marked enhancer feature, while only 20% were co-marked with promoter-associated H3K4me3 modifications. The enrichment of the H3K4me3 signal was intricately associated with downstream regions of LTR12C, such as internal regions of intact ERV9 or other types of retrotransposons. Here, we leverage an optimized CRISPRa system to identify two distinct epigenetic signatures that define LTR12C transcriptional activation, which modulate the expression of proximal protein-coding genes.
    DOI:  https://doi.org/10.1093/nar/gkae498
  12. PLoS Comput Biol. 2024 Jun 10. 20(6): e1012194
      Transcription factors (TFs) regulate the process of transcription through the modulation of different kinetic steps. Although models can often describe the observed transcriptional output of a measured gene, predicting a TF's role on a given promoter requires an understanding of how the TF alters each step of the transcription process. In this work, we use a simple model of transcription to assess the role of promoter identity, and the degree to which TFs alter binding of RNAP (stabilization) and initiation of transcription (acceleration) on three primary characteristics: the range of steady-state regulation, cell-to-cell variability in expression, and the dynamic response time of a regulated gene. We find that steady-state regulation and the response time of a gene behave uniquely for TFs that regulate incoherently, i.e., that speed up one step but slow the other. We also find that incoherent TFs have dynamic implications, with one type of incoherent mode configuring the promoter to respond more slowly at intermediate TF concentrations. We also demonstrate that the noise of gene expression for these TFs is sensitive to promoter strength, with a distinct non-monotonic profile that is apparent under stronger promoters. Taken together, our work uncovers the coupling between promoters and TF regulatory modes with implications for understanding natural promoters and engineering synthetic gene circuits with desired expression properties.
    DOI:  https://doi.org/10.1371/journal.pcbi.1012194
  13. Mol Cell. 2024 Jun 10. pii: S1097-2765(24)00440-4. [Epub ahead of print]
      Transcriptional coregulators and transcription factors (TFs) contain intrinsically disordered regions (IDRs) that are critical for their association and function in gene regulation. More recently, IDRs have been shown to promote multivalent protein-protein interactions between coregulators and TFs to drive their association into condensates. By contrast, here we demonstrate how the IDR of the corepressor LSD1 excludes TF association, acting as a dynamic conformational switch that tunes repression of active cis-regulatory elements. Hydrogen-deuterium exchange shows that the LSD1 IDR interconverts between transient open and closed conformational states, the latter of which inhibits partitioning of the protein's structured domains with TF condensates. This autoinhibitory switch controls leukemic differentiation by modulating repression of active cis-regulatory elements bound by LSD1 and master hematopoietic TFs. Together, these studies unveil alternative mechanisms by which disordered regions and their dynamic crosstalk with structured regions can shape coregulator-TF interactions to control cis-regulatory landscapes and cell fate.
    Keywords:  autoinhibitory switch; condensate; intrinsically disordered region; leukemic differentiation; transcription factors; transcriptional corepressor
    DOI:  https://doi.org/10.1016/j.molcel.2024.05.017
  14. PLoS Genet. 2024 Jun 13. 20(6): e1011241
      Although introns are typically tens to thousands of nucleotides, there are notable exceptions. In flies as well as humans, a small number of genes contain introns that are more than 1000 times larger than typical introns, exceeding hundreds of kilobases (kb) to megabases (Mb). It remains unknown why gigantic introns exist and how cells overcome the challenges associated with their transcription and RNA processing. The Drosophila Y chromosome contains some of the largest genes identified to date: multiple genes exceed 4Mb, with introns accounting for over 99% of the gene span. Here we demonstrate that co-transcriptional splicing of these gigantic Y-linked genes is important to ensure successful transcription: perturbation of splicing led to the attenuation of transcription, leading to a failure to produce mature mRNA. Cytologically, defective splicing of the Y-linked gigantic genes resulted in disorganization of transcripts within the nucleus suggestive of entanglement of transcripts, likely resulting from unspliced long RNAs. We propose that co-transcriptional splicing maintains the length of nascent transcripts of gigantic genes under a critical threshold, preventing their entanglement and ensuring proper gene expression. Our study reveals a novel biological significance of co-transcriptional splicing.
    DOI:  https://doi.org/10.1371/journal.pgen.1011241
  15. Structure. 2024 May 31. pii: S0969-2126(24)00190-4. [Epub ahead of print]
      The SWI/SNF2 chromatin remodeling factor decreased DNA methylation 1 (DDM1) is essential for the silencing of transposable elements (TEs) in both euchromatic and heterochromatic regions. Here, we determined the cryo-EM structures of DDM1-nucleosome(H2A) and DDM1-nucleosome(H2A.W) complexes at near-atomic resolution in the presence of the ATP analog ADP-BeFx. The structures show that nucleosomal DNA is unwrapped more on the surface of the histone octamer containing histone H2A than that containing histone H2A.W. DDM1 embraces one DNA gyre of the nucleosome and interacts with the N-terminal tails of histone H4. Although we did not observe DDM1-H2A.W interactions in our structures, the results of the pull-down experiments suggest a direct interaction between DDM1 and the core region of histone H2A.W. Our work provides mechanistic insights into the heterochromatin remodeling process driven by DDM1 in plants.
    DOI:  https://doi.org/10.1016/j.str.2024.05.013
  16. Imeta. 2023 Nov;2(4): e152
      Chromatin accessibility sequencing has been widely used for uncovering genetic regulatory mechanisms and inferring gene regulatory networks. However, effectively integrating large-scale chromatin accessibility datasets has posed a significant challenge. This is due to the lack of a comprehensive end-to-end solution, as many existing tools primarily emphasize data preprocessing and overlook downstream analyses. To bridge this gap, we have introduced cisDynet, a holistic solution that combines streamlined data preprocessing using Snakemake and R functions with advanced downstream analysis capabilities. cisDynet excels in conventional data analyses, encompassing peak statistics, peak annotation, differential analysis, motif enrichment analysis, and more. Additionally, it allows users to perform sophisticated data exploration, such as tissue-specific peak identification, time course data modeling, integration of RNA-seq data to establish peak-to-gene associations, construction of regulatory networks, and enrichment analysis of genome-wide association study (GWAS) variants. As a proof of concept, we applied cisDynet to reanalyze comprehensive ATAC-seq datasets across various tissues from the Encyclopedia of DNA Elements (ENCODE) project. The analysis successfully delineated tissue-specific open chromatin regions (OCRs), established connections between OCRs and target genes, and effectively linked these discoveries with 1861 GWAS variants. Furthermore, cisDynet was instrumental in dissecting the time course open chromatin data of mouse embryonic development, revealing the dynamic behavior of OCRs over developmental stages and identifying key transcription factors governing differentiation trajectories. In summary, cisDynet offers researchers a user-friendly solution that minimizes the need for extensive coding, ensures the reproducibility of results, and greatly simplifies the exploration of epigenomic data.
    Keywords:  ATAC‐seq; GWAS; chromatin accessibility; gene expression; regulatory network; time course
    DOI:  https://doi.org/10.1002/imt2.152
  17. Nat Cell Biol. 2024 Jun 13.
    SU2C/PCF West Coast Prostate Cancer Dream Team
      Transcription factor (TF) proteins regulate gene activity by binding to regulatory regions, most importantly at gene promoters. Many genes have alternative promoters (APs) bound by distinct TFs. The role of differential TF activity at APs during tumour development is poorly understood. Here we show, using deep RNA sequencing in 274 biopsies of benign prostate tissue, localized prostate tumours and metastatic castration-resistant prostate cancer, that AP usage increases as tumours progress and APs are responsible for a disproportionate amount of tumour transcriptional activity. Expression of the androgen receptor (AR), the key driver of prostate tumour activity, is correlated with elevated AP usage. We identified AR, FOXA1 and MYC as potential drivers of AP activation. DNA methylation is a likely mechanism for AP activation during tumour progression and lineage plasticity. Our data suggest that prostate tumours activate APs to magnify the transcriptional impact of tumour drivers, including AR and MYC.
    DOI:  https://doi.org/10.1038/s41556-024-01438-3
  18. Nucleic Acids Res. 2024 Jun 14. pii: gkae522. [Epub ahead of print]
      The conserved Gsx homeodomain (HD) transcription factors specify neural cell fates in animals from flies to mammals. Like many HD proteins, Gsx factors bind A/T-rich DNA sequences prompting the following question: How do HD factors that bind similar DNA sequences in vitro regulate specific target genes in vivo? Prior studies revealed that Gsx factors bind DNA both as a monomer on individual A/T-rich sites and as a cooperative homodimer to two sites spaced precisely 7 bp apart. However, the mechanistic basis for Gsx-DNA binding and cooperativity is poorly understood. Here, we used biochemical, biophysical, structural and modeling approaches to (i) show that Gsx factors are monomers in solution and require DNA for cooperative complex formation, (ii) define the affinity and thermodynamic binding parameters of Gsx2/DNA interactions, (iii) solve a high-resolution monomer/DNA structure that reveals that Gsx2 induces a 20° bend in DNA, (iv) identify a Gsx2 protein-protein interface required for cooperative DNA binding and (v) determine that flexible spacer DNA sequences enhance Gsx2 cooperativity on dimer sites. Altogether, our results provide a mechanistic basis for understanding the protein and DNA structural determinants that underlie cooperative DNA binding by Gsx factors.
    DOI:  https://doi.org/10.1093/nar/gkae522
  19. Nat Commun. 2024 Jun 11. 15(1): 4995
      RNF214 is an understudied ubiquitin ligase with little knowledge of its biological functions or protein substrates. Here we show that the TEAD transcription factors in the Hippo pathway are substrates of RNF214. RNF214 induces non-proteolytic ubiquitylation at a conserved lysine residue of TEADs, enhances interactions between TEADs and YAP, and promotes transactivation of the downstream genes of the Hippo signaling. Moreover, YAP and TAZ could bind polyubiquitin chains, implying the underlying mechanisms by which RNF214 regulates the Hippo pathway. Furthermore, RNF214 is overexpressed in hepatocellular carcinoma (HCC) and inversely correlates with differentiation status and patient survival. Consistently, RNF214 promotes tumor cell proliferation, migration, and invasion, and HCC tumorigenesis in mice. Collectively, our data reveal RNF214 as a critical component in the Hippo pathway by forming a signaling axis of RNF214-TEAD-YAP and suggest that RNF214 is an oncogene of HCC and could be a potential drug target of HCC therapy.
    DOI:  https://doi.org/10.1038/s41467-024-49045-y
  20. NAR Genom Bioinform. 2024 Jun;6(2): lqae068
      Transcription factor (TF) binding is a key component of genomic regulation. There are numerous high-throughput experimental methods to characterize TF-DNA binding specificities. Their application, however, is both laborious and expensive, which makes profiling all TFs challenging. For instance, the binding preferences of ∼25% human TFs remain unknown; they neither have been determined experimentally nor inferred computationally. We introduce a structure-based learning approach to predict the binding preferences of TFs and the automated modelling of TF regulatory complexes. We show the advantage of using our approach over the classical nearest-neighbor prediction in the limits of remote homology. Starting from a TF sequence or structure, we predict binding preferences in the form of motifs that are then used to scan a DNA sequence for occurrences. The best matches are either profiled with a binding score or collected for their subsequent modeling into a higher-order regulatory complex with DNA. Co-operativity is modelled by: (i) the co-localization of TFs and (ii) the structural modeling of protein-protein interactions between TFs and with co-factors. We have applied our approach to automatically model the interferon-β enhanceosome and the pioneering complexes of OCT4, SOX2 (or SOX11) and KLF4 with a nucleosome, which are compared with the experimentally known structures.
    DOI:  https://doi.org/10.1093/nargab/lqae068
  21. JCI Insight. 2024 Jun 10. pii: e175278. [Epub ahead of print]9(11):
      Monogenic diabetes is a gateway to precision medicine through molecular mechanistic insight. Hepatocyte nuclear factor 1A (HNF-1A) and HNF-4A are transcription factors that engage in crossregulatory gene transcription networks to maintain glucose-stimulated insulin secretion in pancreatic β cells. Variants in the HNF1A and HNF4A genes are associated with maturity-onset diabetes of the young (MODY). Here, we explored 4 variants in the P2-HNF4A promoter region: 3 in the HNF-1A binding site and 1 close to the site, which were identified in 63 individuals from 21 families of different MODY disease registries across Europe. Our goal was to study the disease causality for these variants and to investigate diabetes mechanisms on the molecular level. We solved a crystal structure of HNF-1A bound to the P2-HNF4A promoter and established a set of techniques to probe HNF-1A binding and transcriptional activity toward different promoter variants. We used isothermal titration calorimetry, biolayer interferometry, x-ray crystallography, and transactivation assays, which revealed changes in HNF-1A binding or transcriptional activities for all 4 P2-HNF4A variants. Our results suggest distinct disease mechanisms of the promoter variants, which can be correlated with clinical phenotype, such as age of diagnosis of diabetes, and be important tools for clinical utility in precision medicine.
    Keywords:  Diabetes; Metabolism; Structural biology; Transcription
    DOI:  https://doi.org/10.1172/jci.insight.175278