bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2022‒12‒18
25 papers selected by
Connor Rogerson
University of Cambridge


  1. Proc Natl Acad Sci U S A. 2022 Dec 20. 119(51): e2212810119
      Chromatin accessibility assays are central to the genome-wide identification of gene regulatory elements associated with transcriptional regulation. However, the data have highly variable quality arising from several biological and technical factors. To surmount this problem, we developed a sequence-based machine learning method to evaluate and refine chromatin accessibility data. Our framework, gapped k-mer SVM quality check (gkmQC), provides the quality metrics for a sample based on the prediction accuracy of the trained models. We tested 886 DNase-seq samples from the ENCODE/Roadmap projects to demonstrate that gkmQC can effectively identify "high-quality" (HQ) samples with low conventional quality scores owing to marginal read depths. Peaks identified in HQ samples are more accurately aligned at functional regulatory elements, show greater enrichment of regulatory elements harboring functional variants, and explain greater heritability of phenotypes from their relevant tissues. Moreover, gkmQC can optimize the peak-calling threshold to identify additional peaks, especially for rare cell types in single-cell chromatin accessibility data.
    Keywords:  chromatin accessibility; gkmQC; quality control; sequence-based model
    DOI:  https://doi.org/10.1073/pnas.2212810119
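
To make the "gapped k-mer" features named in the abstract above more concrete, here is a minimal Python sketch that counts gapped k-mers in a DNA sequence. It is an illustration only, not the gkmQC or gkm-SVM implementation: the function name `gapped_kmer_counts` and the parameters `word_len` and `informative` are hypothetical, and the `-` gap symbol is an arbitrary choice.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, word_len=4, informative=3):
    """Count gapped k-mers: words of length `word_len` in which only
    `informative` positions are kept and the rest are masked with '-'.

    Hypothetical helper for illustration; not part of any published tool.
    """
    seq = seq.upper()
    counts = Counter()
    # Every way of choosing which positions of a word stay informative.
    masks = list(combinations(range(word_len), informative))
    for start in range(len(seq) - word_len + 1):
        word = seq[start:start + word_len]
        if set(word) - set("ACGT"):
            continue  # skip words containing N or other ambiguous bases
        for mask in masks:
            gapped = "".join(c if i in mask else "-" for i, c in enumerate(word))
            counts[gapped] += 1
    return counts
```

Count vectors of this kind could then be fed to any classifier; how gkmQC turns model accuracy into a quality score is described in the paper itself and is not reproduced here.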
  2. Cell Rep. 2022 Dec 13. pii: S2211-1247(22)01679-5. [Epub ahead of print]41(11): 111791
      Transposable elements (TEs) are the major sources of lineage-specific genomic innovation and comprise nearly half of the human genome, but most of their functions remain unclear. Here, we identify that a series of endogenous retroviruses (ERVs), a TE subclass, regulate the transcriptome at the definitive endoderm stage in an in vitro differentiation model from human embryonic stem cells. Notably, these ERVs perform as enhancers containing binding sites for critical transcription factors for endoderm lineage specification. Genome-wide methylation analysis shows most of these ERVs are derepressed by TET1-mediated DNA demethylation. LTR6B, a representative definitive endoderm activating ERV, contains binding sites for FOXA2 and GATA4 and governs the primate-specific expression of its neighboring developmental genes such as ERBB4 in definitive endoderm. Together, our study provides evidence that recently evolved ERVs represent potent de novo developmental regulatory elements, which, in turn, fine-tune species-specific transcriptomes during endoderm and embryonic development.
    Keywords:  CP: Developmental biology; CP: Molecular biology; DNA methylation; TET1; endogenous retroviruses; gene regulation; human development; transcriptional networks; transposable elements
    DOI:  https://doi.org/10.1016/j.celrep.2022.111791
  3. Nat Struct Mol Biol. 2022 Dec 15.
      The mechanism controlling the dynamic targeting of SWI/SNF has long been postulated to be coordinated by transcription factors (TFs), yet demonstrating a specific TF influence has proven difficult. Here we take a multi-omics approach to interrogate transient SWI/SNF interactors, chromatin targeting and the resulting three-dimensional epigenetic landscape. We utilize the labeling technique TurboID to map the SWI/SNF interactome and identify the activator protein-1 (AP-1) family members as critical interacting partners for SWI/SNF complexes. CUT&RUN profiling demonstrates SWI/SNF targeting enrichment at AP-1 bound loci, as well as SWI/SNF-AP-1 cooperation in chromatin targeting. HiChIP reveals AP-1-SWI/SNF-dependent restructuring of the three-dimensional promoter-enhancer architecture and generation of enhancer hubs. Through interrogation of the SWI/SNF-AP-1 interaction, we demonstrate an SWI/SNF dependency on AP-1-mediated chromatin localization. We propose that pioneer factors, such as AP-1, bind and target SWI/SNF to inactive chromatin, where it restructures the genomic landscape into an active state through epigenetic rewiring spanning multiple dimensions.
    DOI:  https://doi.org/10.1038/s41594-022-00880-x
  4. Nat Plants. 2022 Dec 15.
      Chromatin architecture and transcription factor (TF) binding underpin cell-fate specification during development, but their mutual regulatory relationships remain unclear. Here we report an atlas of dynamic chromatin landscapes during stomatal cell-lineage progression, in which sequential cell-state transitions are governed by lineage-specific bHLH TFs. Major reprogramming of chromatin accessibility occurs at the proliferation-to-differentiation transition. We discover novel co-cis regulatory elements (CREs) signifying the early precursor stage, BBR/BPC (GAGA) and bHLH (E-box) motifs, where master-regulatory bHLH TFs, SPEECHLESS and MUTE, consecutively bind to initiate and terminate the proliferative state, respectively. BPC TFs complex with MUTE to repress SPEECHLESS expression through a local deposition of repressive histone marks. We elucidate the mechanism by which cell-state-specific heterotypic TF complexes facilitate cell-fate commitment by recruiting chromatin modifiers via key co-CREs.
    DOI:  https://doi.org/10.1038/s41477-022-01304-w
  5. J Mol Biol. 2022 Dec 07. pii: S0022-2836(22)00542-3. [Epub ahead of print] 167916
      Pioneer transcription factors (pTFs) can bind directly to silent chromatin and promote vital transcriptional programs. Here, by integrating high-resolution nuclear magnetic resonance (NMR) spectroscopy with biochemistry, we reveal new structural and mechanistic insights into the interaction of pluripotency pTFs and functional partners Sox2 and Oct4 with nucleosomes. We find that the affinity and conformation of Sox2 for solvent-exposed nucleosome sites depends strongly on their position and DNA sequence. Sox2, which is partially disordered but becomes structured upon DNA binding and bending, forms a super-stable nucleosome complex at superhelical location +5 (SHL+5) with similar affinity and conformation to that with naked DNA. However, at suboptimal internal and end-positioned sites where DNA may be harder to deform, Sox2 favors partially unfolded and more dynamic states that are encoded in its intrinsic flexibility. Importantly, Sox2 structure and DNA bending can be stabilized by synergistic Oct4 binding, but only on adjacent motifs near the nucleosome edge and with the full Oct4 DNA-binding domain. Further mutational studies reveal that strategically impaired Sox2 folding is coupled to reduced DNA bending and inhibits nucleosome binding and Sox2-Oct4 cooperation, while increased nucleosomal DNA flexibility enhances Sox2 association. Together, our findings fit a model where the site-specific DNA bending propensity and structural plasticity of Sox2 govern distinct modes of nucleosome engagement and modulate Sox2-Oct4 synergism. The principles outlined here can potentially guide pTF site selection in the genome and facilitate interaction with other chromatin factors or chromatin opening in vivo.
    Keywords:  NMR; chromatin; crosslinking; gene regulation; pioneer factors
    DOI:  https://doi.org/10.1016/j.jmb.2022.167916
  6. Sci Rep. 2022 Dec 13. 12(1): 21506
      Changes in gene expression programs are intimately linked to cell fate decisions. Post-translational modifications of core histones contribute to control gene expression. Methylation of lysine 4 of histone H3 (H3K4) correlates with active promoters and gene transcription. This modification is catalyzed by KMT2 methyltransferases, which require interaction with 4 core subunits, WDR5, RBBP5, ASH2L and DPY30, for catalytic activity. Ash2l is necessary for organismal development and for tissue homeostasis. In mouse embryo fibroblasts (MEFs), Ash2l loss results in gene repression, provoking a senescence phenotype. We now find that upon knockout of Ash2l both H3K4 mono- and tri-methylation (H3K4me1 and me3, respectively) were deregulated. In particular, loss of H3K4me3 at promoters correlated with gene repression, especially at CpG island promoters. Ash2l loss resulted in increased loading of histone H3 and reduced chromatin accessibility at promoters, accompanied by an increase of repressing and a decrease of activating histone marks. Moreover, we observed altered binding of CTCF upon Ash2l loss. Lost and gained binding was noticed at promoter-associated and intergenic sites, respectively. Thus, Ash2l loss and reduction of H3K4me3 correlate with altered chromatin accessibility and transcription factor binding. These findings contribute to a more detailed understanding of mechanistic consequences of H3K4me3 loss and associated repression of gene transcription and thus of the observed cellular consequences.
    DOI:  https://doi.org/10.1038/s41598-022-25881-0
  7. Dev Cell. 2022 Dec 02. pii: S1534-5807(22)00811-5. [Epub ahead of print]
      Spinal cord development is precisely orchestrated by spatiotemporal gene regulatory programs. However, the underlying epigenetic mechanisms remain largely elusive. Here, we profiled single-cell chromatin accessibility landscapes in mouse neural tubes spanning embryonic days 9.5-13.5. We identified neuronal-cell-cluster-specific cis-regulatory elements in neural progenitors and neurons. Furthermore, we applied a novel computational method, eNet, to build enhancer networks by integrating single-cell chromatin accessibility and gene expression data and identify the hub enhancers within enhancer networks. It was experimentally validated in vivo for Atoh1 that knockout of the hub enhancers, but not the non-hub enhancers, markedly decreased Atoh1 expression and reduced dp1/dI1 cells. Together, our work provides insights into the epigenetic regulation of spinal cord development and a proof-of-concept demonstration of enhancer networks as a general mechanism in transcriptional regulation.
    Keywords:  Atoh1; chromatin accessibility; cis-regulatory elements; enhancer network; hub enhancer; scATAC-seq; spinal cord
    DOI:  https://doi.org/10.1016/j.devcel.2022.11.011
  8. Elife. 2022 Dec 12. pii: e73395. [Epub ahead of print]11
      A challenge in quantitative biology is to predict output patterns of gene expression from knowledge of input transcription factor patterns and from the arrangement of binding sites for these transcription factors on regulatory DNA. We tested whether widespread thermodynamic models could be used to infer parameters describing simple regulatory architectures that inform parameter-free predictions of more complex enhancers in the context of transcriptional repression by Runt in the early fruit fly embryo. By modulating the number and placement of Runt binding sites within an enhancer, and quantifying the resulting transcriptional activity using live imaging, we discovered that thermodynamic models call for higher-order cooperativity between multiple molecular players. This higher-order cooperativity captures the combinatorial complexity underlying eukaryotic transcriptional regulation and cannot be determined from simpler regulatory architectures, highlighting the challenges in reaching a predictive understanding of transcriptional regulation in eukaryotes and calling for approaches that quantitatively dissect their molecular nature.
    Keywords:  D. melanogaster; infectious disease; microbiology
    DOI:  https://doi.org/10.7554/eLife.73395
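
As a rough illustration of what a thermodynamic model of repression looks like, the Python sketch below sums Boltzmann weights over all occupancy states of a promoter with several repressor sites. It is a generic textbook-style model, not the model fitted in the paper: the function name `p_bound`, the parameters `p`, `r`, `w_rp`, and `w_rr`, and the assumption of identical sites with only pairwise repressor cooperativity are all hypothetical.

```python
from itertools import product


def p_bound(p, r, n_sites, w_rp=0.1, w_rr=1.0):
    """Probability that RNAP is bound, from a sum of Boltzmann weights.

    p: dimensionless RNAP weight (concentration / dissociation constant).
    r: dimensionless repressor weight, assumed identical for every site.
    w_rp: repressor-RNAP interaction term (< 1 means repression).
    w_rr: pairwise repressor-repressor cooperativity (1 means none).
    All parameter names are illustrative assumptions.
    """
    z_off = 0.0  # total weight of states without RNAP
    z_on = 0.0   # total weight of states with RNAP bound
    for occupancy in product((0, 1), repeat=n_sites):
        k = sum(occupancy)            # number of bound repressors
        pairs = k * (k - 1) // 2      # number of bound repressor pairs
        weight = (r ** k) * (w_rr ** pairs)
        z_off += weight
        z_on += weight * p * (w_rp ** k)
    return z_on / (z_off + z_on)
```

With the default values, `p_bound(1.0, 1.0, n)` decreases as `n` grows, which is the qualitative behaviour expected when repressor sites are added. A higher-order cooperativity of the kind the abstract refers to would need extra interaction terms beyond the pairwise `w_rr` used here.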
  9. Genes Dev. 2022 Dec 15.
      GATA4 is a transcription factor known for its crucial role in the development of many tissues, including the liver; however, its role in adult liver metabolism is unknown. Here, using high-throughput sequencing technologies, we identified GATA4 as a transcriptional regulator of metabolism in the liver. GATA4 expression is elevated in response to refeeding, and its occupancy is increased at enhancers of genes linked to fatty acid and lipoprotein metabolism. Knocking out GATA4 in the adult liver (Gata4LKO) decreased transcriptional activity at GATA4 binding sites, especially during feeding. Gata4LKO mice have reduced plasma HDL cholesterol and increased liver triglyceride levels. The expression of a panel of GATA4 binding genes involved in hepatic cholesterol export and triglyceride hydrolysis was down-regulated in Gata4LKO mice. We further demonstrate that GATA4 collaborates with LXR nuclear receptors in the liver. GATA4 and LXRs share a number of binding sites, and GATA4 was required for the full transcriptional response to LXR activation. Collectively, these results show that hepatic GATA4 contributes to the transcriptional control of hepatic and systemic lipid homeostasis.
    Keywords:  LXR; lipid metabolism; transcription factor
    DOI:  https://doi.org/10.1101/gad.350145.122
  10. Nat Struct Mol Biol. 2022 Dec;29(12): 1252-1265
      In mammalian embryos, DNA methylation is initialized to maximum levels in the epiblast by the de novo DNA methyltransferases DNMT3A and DNMT3B before gastrulation diversifies it across regulatory regions. Here we show that DNMT3A and DNMT3B are differentially regulated during endoderm and mesoderm bifurcation and study the implications in vivo and in meso-endoderm embryoid bodies. Loss of both Dnmt3a and Dnmt3b impairs exit from the epiblast state. More subtly, independent loss of Dnmt3a or Dnmt3b leads to small biases in mesoderm-endoderm bifurcation and transcriptional deregulation. Epigenetically, DNMT3A and DNMT3B drive distinct methylation kinetics in the epiblast, as can be predicted from their strand-specific sequence preferences. The enzymes compensate for each other in the epiblast, but can later facilitate lineage-specific methylation kinetics as their expression diverges. Single-cell analysis shows that differential activity of DNMT3A and DNMT3B combines with replication-linked methylation turnover to increase epigenetic plasticity in gastrulation. Together, these findings outline a dynamic model for the use of DNMT3A and DNMT3B sequence specificity during gastrulation.
    DOI:  https://doi.org/10.1038/s41594-022-00885-6
  11. Res Comput Mol Biol. 2022 May;13278: 36-51
      Recent efforts to sequence the genomes of thousands of matched normal-tumor samples have led to the identification of millions of somatic mutations, the majority of which are non-coding. Most of these mutations are believed to be passengers, but a small number of non-coding mutations could contribute to tumor initiation or progression, e.g. by leading to dysregulation of gene expression. Efforts to identify putative regulatory drivers rely primarily on information about the recurrence of mutations across tumor samples. However, in regulatory regions of the genome, individual mutations are rarely seen in more than one donor. Instead of using recurrence information, here we present a method to identify putative regulatory driver mutations based on the magnitude of their effects on transcription factor-DNA binding. For each gene, we integrate the effects of mutations across all its regulatory regions, and we ask whether these effects are larger than expected by chance, given the mutation spectra observed in regulatory DNA in the cohort of interest. We applied our approach to analyze mutations in a liver cancer data set with ample somatic mutation and gene expression data available. By combining the effects of mutations across all regulatory regions of each gene, we identified dozens of genes whose regulation in tumor cells is likely to be significantly perturbed by non-coding mutations. Overall, our results show that focusing on the functional effects of non-coding mutations, rather than their recurrence, has the potential to identify putative regulatory drivers and the genes they dysregulate in tumor cells.
    Keywords:  Combining p-values; DNA-binding specificity; Enhancers and promoters; Liptak’s method; Non-coding mutations; Regulatory drivers; Transcription factors
    DOI:  https://doi.org/10.1007/978-3-031-04749-7_3
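
The keywords above mention Liptak's method for combining p-values. A minimal Python sketch of that general technique (the weighted Z-method) follows. The function name `liptak_combine` and the default of equal weights are assumptions made for illustration; how the paper chooses weights or defines per-region p-values is not stated in the abstract.

```python
from math import sqrt
from statistics import NormalDist


def liptak_combine(p_values, weights=None):
    """Combine one-sided p-values with Liptak's weighted Z-method.

    Each p-value must lie strictly between 0 and 1; otherwise
    NormalDist.inv_cdf raises statistics.StatisticsError.
    """
    if weights is None:
        weights = [1.0] * len(p_values)  # assumption: equal weights
    if len(weights) != len(p_values) or not p_values:
        raise ValueError("need one weight per p-value and at least one p-value")
    std_normal = NormalDist()
    # Convert each p-value to a Z-score (small p gives a large positive Z).
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    return 1.0 - std_normal.cdf(z_combined)
```

For example, two independent regions that each give p = 0.05 combine to a p-value smaller than 0.05, which is how modest effects spread over several regulatory regions of one gene can add up to a significant gene-level signal.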
  12. Nat Immunol. 2022 Dec 15.
      The molecular regulation of human hematopoietic stem cell (HSC) maintenance is therapeutically important, but limitations in experimental systems and interspecies variation have constrained our knowledge of this process. Here, we have studied a rare genetic disorder due to MECOM haploinsufficiency, characterized by an early-onset absence of HSCs in vivo. By generating a faithful model of this disorder in primary human HSCs and coupling functional studies with integrative single-cell genomic analyses, we uncover a key transcriptional network involving hundreds of genes that is required for HSC maintenance. Through our analyses, we nominate cooperating transcriptional regulators and identify how MECOM prevents the CTCF-dependent genome reorganization that occurs as HSCs differentiate. We show that this transcriptional network is co-opted in high-risk leukemias, thereby enabling these cancers to acquire stem cell properties. Collectively, we illuminate a regulatory network necessary for HSC self-renewal through the study of a rare experiment of nature.
    DOI:  https://doi.org/10.1038/s41590-022-01370-4
  13. Nat Commun. 2022 Dec 10. 13(1): 7644
      BAF and PBAF are mammalian SWI/SNF family chromatin remodeling complexes that possess multiple histone/DNA-binding subunits and create nucleosome-depleted/free regions for transcription activation. Despite previous structural studies and recent advances on SWI/SNF family complexes, it remains incompletely understood how the PBAF-nucleosome complex is organized. Here we determined the structure of 13-subunit human PBAF in complex with an acetylated nucleosome in the ADP-BeF3-bound state. Four PBAF-specific subunits work together with nine BAF/PBAF-shared subunits to generate PBAF-specific modular organization, distinct from that of BAF at various regions. The PBAF-nucleosome structure reveals six histone-binding domains and four DNA-binding domains/modules, the majority of which directly bind histone/DNA. This multivalent nucleosome-binding pattern, not observed in previous studies, suggests that PBAF may integrate comprehensive chromatin information to target genomic loci for function. Our study reveals the molecular organization of subunits and histone/DNA-binding domains/modules in the PBAF-nucleosome complex and provides structural insights into PBAF-mediated nucleosome association complementary to the recently reported PBAF-nucleosome structure.
    DOI:  https://doi.org/10.1038/s41467-022-34859-5
  14. Nat Commun. 2022 Dec 13. 13(1): 7698
      The cohesin complex participates in many structural and functional aspects of genome organization. Cohesin recruitment onto chromosomes requires nucleosome-free DNA and the Scc2-Scc4 cohesin loader complex that catalyzes topological cohesin loading. Additionally, the cohesin loader facilitates promoter nucleosome clearance in a yet unknown way, and it recognizes chromatin receptors such as the RSC chromatin remodeler. Here, we explore the cohesin loader-RSC interaction. Amongst multi-pronged contacts by Scc2 and Scc4, we find that Scc4 contacts a conserved patch on the RSC ATPase motor module. The cohesin loader directly stimulates in vitro nucleosome sliding by RSC, providing an explanation how it facilitates promoter nucleosome clearance. Furthermore, we observe cohesin loader interactions with a wide range of chromatin remodelers. Our results provide mechanistic insight into how the cohesin loader recognizes, as well as influences, the chromatin landscape, with implications for our understanding of human developmental disorders including Cornelia de Lange and Coffin-Siris syndromes.
    DOI:  https://doi.org/10.1038/s41467-022-35444-6
  15. Cancer Sci. 2022 Dec 13.
      The molecular subtypes of pancreatic cancer (PC), either classical/progenitor-like or basal/squamous-like, are currently a major topic of research because of their direct association with clinical outcomes. Some transcription factors (TFs) have been reported to be associated with these subtypes. However, the mechanisms by which these molecular signatures of PCs are established remain unknown. Epigenetic regulatory processes, supported by dynamic changes in the chromatin structure, are essential for transcriptional profiles. Previously, we reported the importance of open chromatin profiles in the biological features and transcriptional status of PCs. Here, we aimed to analyze the relationships between three-dimensional (3D) genome structures and the molecular subtypes of human PCs using Hi-C analysis. We observed a correlation of the specific elements of 3D genome modules, including compartments, topologically associating domains, and enhancer-promoter loops, with the expression of related genes. We focused on HNF1B, a TF that is implicated in the progenitor subtype. Forced expression of HNF1B in squamous-type PC organoids induced the upregulation and downregulation of genes associated with progenitor and squamous subtypes, respectively. Long-range genomic interactions induced by HNF1B were accompanied by compartment modulation and H3K27ac redistribution. We also found that these HNF1B-induced changes in subtype-related gene expression required an intrinsically disordered region, suggesting a possible involvement of phase separation in compartment modulation. Thus, mapping of 3D structural changes induced by TFs, such as HNF1B, may become a useful resource for further understanding the molecular features of PCs.
    Keywords:  HNF1B; Hi-C; intrinsically disordered region; pancreatic cancer; patient-derived organoid
    DOI:  https://doi.org/10.1111/cas.15690
  16. Cell Rep. 2022 Dec 13. pii: S2211-1247(22)01693-X. [Epub ahead of print]41(11): 111805
      The lung exhibits a robust, multifaceted regenerative response to severe injuries such as influenza infection, during which quiescent lung-resident epithelial progenitors participate in two distinct reparative pathways: functionally beneficial regeneration via alveolar type 2 (AT2) cell proliferation and differentiation, and dysplastic tissue remodeling via intrapulmonary airway-resident basal p63+ progenitors. Here we show that the basal cell transcription factor ΔNp63 is required for intrapulmonary basal progenitors to participate in dysplastic alveolar remodeling following injury. We find that ΔNp63 restricts the plasticity of intrapulmonary basal progenitors by maintaining either active or repressive histone modifications at key differentiation gene loci. Following loss of ΔNp63, intrapulmonary basal progenitors are capable of either airway or alveolar differentiation depending on their surrounding environment both in vitro and in vivo. Uncovering these regulatory mechanisms of dysplastic repair and lung basal cell fate choice highlights potential therapeutic targets to promote functional alveolar regeneration following severe lung injuries.
    Keywords:  CP: Cell biology; CP: Stem cell research; cellular plasticity; dysplastic basal-like cells; influenza; injury repair; lung regeneration; lung stem cell biology
    DOI:  https://doi.org/10.1016/j.celrep.2022.111805
  17. PLoS Genet. 2022 Dec 12. 18(12): e1010535
      Noise in expression of individual genes gives rise to variations in activity of cellular pathways and generates heterogeneity in cellular phenotypes. Phenotypic heterogeneity has important implications for antibiotic persistence, mutation penetrance, cancer growth and therapy resistance. Specific molecular features such as the presence of the TATA box sequence and the promoter nucleosome occupancy have been associated with noise. However, the relative importance of these features in noise regulation is unclear and how well these features can predict noise has not yet been assessed. Here through an integrated statistical model of gene expression noise in yeast we found that the number of regulating transcription factors (TFs) of a gene was a key predictor of noise, whereas presence of the TATA box and the promoter nucleosome occupancy had poor predictive power. With an increase in the number of regulatory TFs, there was a rise in the number of cooperatively binding TFs. In addition, an increased number of regulatory TFs meant more overlaps in TF binding sites, resulting in competition between TFs for binding to the same region of the promoter. Through modeling of TF binding to promoter and application of stochastic simulations, we demonstrated that competition and cooperation among TFs could increase noise. Thus, our work uncovers a process of noise regulation that arises out of the dynamics of gene regulation and is not dependent on any specific transcription factor or specific promoter sequence.
    DOI:  https://doi.org/10.1371/journal.pgen.1010535
  18. PLoS Comput Biol. 2022 Dec 15. 18(12): e1010779
      Enhancers are short non-coding DNA sequences outside of the target promoter regions that can be bound by specific proteins to increase a gene's transcriptional activity, which has a crucial role in the spatiotemporal and quantitative regulation of gene expression. However, enhancers do not have specific sequence motifs or structures, and their scattered distribution in the genome makes the identification of enhancers from human cell lines particularly challenging. Here we present a novel, stacked multivariate fusion framework called SMFM, which enables a comprehensive identification and analysis of enhancers from regulatory DNA sequences as well as their interpretation. Specifically, to characterize the hierarchical relationships of enhancer sequences, multi-source biological information and dynamic semantic information are fused to represent regulatory DNA enhancer sequences. Then, we implement a deep learning-based sequence network to learn the feature representation of the enhancer sequences comprehensively and to extract the implicit relationships in the dynamic semantic information. Ultimately, an ensemble machine learning classifier is trained based on the refined multi-source features and dynamic implicit relations obtained from the deep learning-based sequence network. Benchmarking experiments demonstrated that SMFM significantly outperforms other existing methods using several evaluation metrics. In addition, an independent test set was used to validate the generalization performance of SMFM by comparing it to other state-of-the-art enhancer identification methods. Moreover, we performed motif analysis based on the contribution scores of different bases of enhancer sequences to the final identification results. Besides, we conducted interpretability analysis of the identified enhancer sequences based on attention weights of EnhancerBERT, a fine-tuned BERT model that provides new insights into exploring the gene semantic information likely to underlie the discovered enhancers in an interpretable manner. Finally, in a human placenta study with 4,562 active distal gene regulatory enhancers, SMFM successfully exposed tissue-related placental development and the differential mechanism, demonstrating the generalizability and stability of our proposed framework.
    DOI:  https://doi.org/10.1371/journal.pcbi.1010779
  19. Nat Commun. 2022 Dec 13. 13(1): 7728
      The acquisition of germination and post-embryonic developmental ability during seed maturation is vital for seed vigor, an important trait for plant propagation and crop production. How seed vigor is established in seeds is still poorly understood. Here, we report the crucial function of Arabidopsis histone variant H3.3 in endowing seeds with post-embryonic developmental potentials. H3.3 is not essential for seed formation, but loss of H3.3 results in severely impaired germination and post-embryonic development. H3.3 exhibits a seed-specific 5' gene end distribution and facilitates chromatin opening at regulatory regions in seeds. During germination, H3.3 is essential for proper gene transcriptional regulation. Moreover, H3.3 is constantly loaded at the 3' gene end, correlating with gene body DNA methylation and the restriction of chromatin accessibility and cryptic transcription at this region. Our results suggest a fundamental role of H3.3 in initiating chromatin accessibility at regulatory regions in seed and licensing the embryonic to post-embryonic transition.
    DOI:  https://doi.org/10.1038/s41467-022-35509-6
  20. Nat Commun. 2022 Dec 15. 13(1): 7759
      Histone modifications are deposited by chromatin modifying enzymes and read out by proteins that recognize the modified state. BRD4-NUT is an oncogenic fusion protein of the acetyl lysine reader BRD4 that binds to the acetylase p300 and enables formation of long-range intra- and interchromosomal interactions. We here examine how acetylation reading and writing enable formation of such interactions. We show that NUT contains an acidic transcriptional activation domain that binds to the TAZ2 domain of p300. We use NMR to investigate the structure of the complex and find that the TAZ2 domain has an autoinhibitory role for p300. NUT-TAZ2 interaction or mutations found in cancer that interfere with autoinhibition by TAZ2 allosterically activate p300. p300 activation results in a self-organizing, acetylation-dependent feed-forward reaction that enables long-range interactions by bromodomain multivalent acetyl-lysine binding. We discuss the implications for chromatin organisation, gene regulation and dysregulation in disease.
    DOI:  https://doi.org/10.1038/s41467-022-35375-2
  21. Cell Rep. 2022 Dec 13. pii: S2211-1247(22)01672-2. [Epub ahead of print]41(11): 111784
      Heat stress (HS) induces a cellular response leading to profound changes in gene expression. Here, we show that human YTHDC1, a reader of N6-methyladenosine (m6A) RNA modification, mostly associates to the chromatin fraction and that HS induces a redistribution of YTHDC1 across the genome, including to heat-induced heat shock protein (HSP) genes. YTHDC1 binding to m6A-modified HSP transcripts co-transcriptionally promotes expression of HSPs. In parallel, hundreds of the genes enriched in YTHDC1 during HS have their transcripts undergoing YTHDC1- and m6A-dependent intron retention. Later, YTHDC1 concentrates within nuclear stress bodies (nSBs) where it binds to m6A-modified SATIII non-coding RNAs, produced in an HSF1-dependent manner upon HS. These findings reveal that YTHDC1 plays a central role in a chromatin-associated m6A-based reprogramming of gene expression during HS. Furthermore, they support the model where the subsequent and temporary sequestration of YTHDC1 within nSBs calibrates the timing of this YTHDC1-dependent gene expression reprogramming.
    Keywords:  CP: Molecular biology; HSF1; HSP; HSPs; N(6)-methyladenosine; RNA; YTH; chromatin; heat shock proteins; heat stress; intron retention; m(6)A; m6A; nSBs; ncRNA; ncRNAs; non-coding RNAs; nuclear stress bodies; nuclear stress body; satellite RNA
    DOI:  https://doi.org/10.1016/j.celrep.2022.111784
  22. iScience. 2022 Dec 22. 25(12): 105490
      It is unclear how the activation of HIV-1 transcription affects chromatin structure. We interrogated chromatin organization both genome-wide and nearby HIV-1 integration sites using Hi-C and ATAC-seq. In conjunction, we analyzed the transcription of the HIV-1 genome and neighboring genes. We found that long-range chromatin contacts did not differ significantly between uninfected cells and those harboring an integrated HIV-1 genome, whether the HIV-1 genome was actively transcribed or inactive. Instead, the activation of HIV-1 transcription changes chromatin accessibility immediately downstream of the provirus, demonstrating that HIV-1 can alter local cellular chromatin structure. Finally, we examined HIV-1 and neighboring host gene transcripts with long-read sequencing and found populations of chimeric RNAs both virus-to-host and host-to-virus. Thus, multiomics profiling revealed that the activation of HIV-1 transcription led to local changes in chromatin organization and altered the expression of neighboring host genes.
    Keywords:  Biological sciences; Chromosome organization; Molecular biology; Molecular interaction
    DOI:  https://doi.org/10.1016/j.isci.2022.105490
  23. BMC Bioinformatics. 2022 Dec 12. 23(Suppl 2): 395
      BACKGROUND: The widespread usage of Cap Analysis of Gene Expression (CAGE) has led to numerous breakthroughs in understanding the transcription mechanisms. Recent evidence in the literature, however, suggests that CAGE suffers from transcriptional and technical noise. Regardless of the sample quality, there is a significant number of CAGE peaks that are not associated with transcription initiation events. This type of signal is typically attributed to technical noise and more frequently to random five-prime capping or transcription bioproducts. Thus, the need for computational methods emerges, that can accurately increase the signal-to-noise ratio in CAGE data, resulting in error-free transcription start site (TSS) annotation and quantification of regulatory region usage. In this study, we present DeepTSS, a novel computational method for processing CAGE samples, that combines genomic signal processing (GSP), structural DNA features, evolutionary conservation evidence and raw DNA sequence with Deep Learning (DL) to provide single-nucleotide TSS predictions with unprecedented levels of performance.
    RESULTS: To evaluate DeepTSS, we utilized experimental data, protein-coding gene annotations and computationally-derived genome segmentations by chromatin states. DeepTSS was found to outperform existing algorithms on all benchmarks, achieving 98% precision and 96% sensitivity (accuracy 95.4%) on the protein-coding gene strategy, with 96.66% of its positive predictions overlapping active chromatin, 98.27% and 92.04% co-localized with at least one transcription factor and H3K4me3 peak.
    CONCLUSIONS: CAGE is a key protocol in deciphering the language of transcription, however, as every experimental protocol, it suffers from biological and technical noise that can severely affect downstream analyses. DeepTSS is a novel DL-based method for effectively removing noisy CAGE signal. In contrast to existing software, DeepTSS does not require feature selection since the embedded convolutional layers can readily identify patterns and only utilize the important ones for the classification task. This study highlights the key role that DL can play in Molecular Biology, by removing the inherent flaws of experimental protocols, that form the backbone of contemporary research. Here, we show how DeepTSS can unleash the full potential of an already popular and mature method such as CAGE, and push the boundaries of coding and non-coding gene expression regulator research even further.
    Keywords:  Bioinformatics; CAGE; Deep Learning; GSP; Machine Learning; Promoter; TSS; Transcription
    DOI:  https://doi.org/10.1186/s12859-022-04945-y
  24. Development. 2022 Dec 01. pii: dev201251. [Epub ahead of print]149(23):
      There are fundamental differences in how neonatal and adult intestines absorb nutrients. In adults, macromolecules are broken down into simpler molecular components in the lumen of the small intestine, then absorbed. In contrast, neonates are thought to rely on internalization of whole macromolecules and subsequent degradation in the lysosome. Here, we identify the Maf family transcription factors MAFB and c-MAF as markers of terminally differentiated intestinal enterocytes throughout life. The expression of these factors is regulated by HNF4α and HNF4γ, master regulators of enterocyte cell fate. Loss of Maf factors results in a neonatal-specific failure to thrive and loss of macromolecular nutrient uptake. RNA-Seq and CUT&RUN analyses defined an endo-lysosomal program as being downstream of these transcription factors. We demonstrate major transcriptional changes in metabolic pathways, including fatty acid oxidation and increases in peroxisome number, in response to loss of Maf proteins. Finally, we show that loss of BLIMP1, a repressor of adult enterocyte genes, shows highly overlapping changes in gene expression and similar defects in macromolecular uptake. This work defines transcriptional regulators that are necessary for nutrient uptake in neonatal enterocytes.
    Keywords:  Enterocyte; Intestine; Lysosomes; Mouse; Neonatal; Uptake
    DOI:  https://doi.org/10.1242/dev.201251
  25. Genes Dev. 2022 Dec 15.
      The Hippo-YAP signaling pathway plays a critical role in development, homeostasis, regeneration, and tumorigenesis by converging on YAP, a coactivator for the TEAD family DNA-binding transcription factors, to regulate downstream transcription programs. Given its pivotal role as the nuclear effector of the Hippo pathway, YAP is indispensable in multiple developmental and tissue contexts. Here we report that the essentiality of YAP in liver and lung development can be genetically bypassed by simultaneous inactivation of the TEAD corepressor VGLL4. This striking antagonistic epistasis suggests that the major physiological function of YAP is to antagonize VGLL4. We further show that the YAP-VGLL4 antagonism plays a widespread role in regulating Hippo pathway output beyond normal development, as inactivation of Vgll4 dramatically enhanced intrahepatic cholangiocarcinoma formation in Nf2-deficient livers and ameliorated CCl4-induced damage in normal livers. Interestingly, Vgll4 expression is temporally regulated in development and regeneration and, in certain contexts, provides a better indication of overall Hippo pathway output than YAP phosphorylation. Together, these findings highlight the central importance of VGLL4-mediated transcriptional repression in Hippo pathway regulation and inform potential strategies to modulate Hippo signaling in cancer and regenerative medicine.
    Keywords:  Hippo signaling; TEAD; VGLL4; YAP; default repression; regeneration; tumor suppressor
    DOI:  https://doi.org/10.1101/gad.350127.122