bims-gerecp Biomed News
on Gene regulatory networks of epithelial cell plasticity
Issue of 2026-02-22
sixteen papers selected by
Xiao Qin, University of Oxford



  1. Cell Syst. 2026 Feb 18. pii: S2405-4712(26)00016-5. [Epub ahead of print]17(2): 101534
      Deriving principles governing cell biology from single-cell measurements across modalities, called multimodal modeling, can advance our understanding of cellular states in health and disease. Realizing the full potential of multimodal models requires learning generalizable representations across data types, diseases, and biological contexts. This perspective examines the potential of compositional AI as a modular design approach for constructing multimodal foundation models that unify biological modalities (such as chromatin accessibility, protein abundance, spatial transcriptomics, microscopy imaging, and textual annotations) into cohesive representations of cellular behavior. We present key deep learning modeling approaches, along with transformer-based attention strategies to implement them, while addressing challenges posed by limited data availability and structural differences between modality representations. We also discuss how to connect and align partially overlapping multimodal measurements to build a comprehensive representation space. By synthesizing these technical advancements, we chart a path toward agentic virtual cell models, offering insights into opportunities, limitations, and future directions for leveraging multimodal AI to decode the complexity of cellular systems.
    Keywords:  compositional AI; single-cell foundation models; single-cell multi-omics
    DOI:  https://doi.org/10.1016/j.cels.2026.101534
  2. J Exp Med. 2026 Mar 02. pii: e20241266. [Epub ahead of print]223(3):
      Mapping the causal circuits that shape the phenotypic and functional landscape of immune cells remains a formidable challenge. Recent advances in pooled CRISPR-based screens, coupled with multiplexed single-cell profiling and imaging-based spatial readouts, make this goal increasingly attainable. In this Perspective, we discuss how CRISPR-based genetic screens will fundamentally transform our understanding of immunobiology. We highlight the applications of state-of-the-art, high-throughput pooled perturbation approaches, including emerging methodologies for bulk, single-cell, and spatial CRISPR screens, to advance our understanding of immunity and in vivo biology. Additionally, we summarize new strategies to address the complexity of combinatorial perturbations to uncover genetic interactions and mechanistic drivers of immunity at unprecedented scale and resolution. By integrating CRISPR screening data with experimental insights, we advocate a new framework in immunology research that leverages perturbation-driven regulatory effects and networks to discover new therapeutic targets and establish causal systems biology and immunology for advancing immunological knowledge and therapeutic application.
    DOI:  https://doi.org/10.1084/jem.20241266
  3. Front Oncol. 2025;15: 1736140
      The paradigm of the Hallmarks of Cancer, updated by Douglas Hanahan in 2022, represents one of the most influential syntheses for understanding the functional capabilities that sustain neoplastic transformation. However, its traditional interpretation, often reductionist and fragmentary, does not capture the non-linear, emergent, and adaptive dynamics of tumor behavior. This review proposes a reinterpretation of the hallmarks through the lens of complexity theory, conceptualizing colorectal cancer (CRC) as a self-organizing, open system operating far from equilibrium. Using an integrative conceptual approach, we map the ten classical hallmarks and the new dimensions proposed in 2022 (phenotypic plasticity, non-mutational epigenetic reprogramming, polymorphic microbiomes, and senescence) onto the fundamental properties of complex systems: nonlinearity, emergence, feedback, openness, and historical dependence. We argue that CRC should not be understood as a simple sum of molecular alterations but as a dynamic network of interactions among cells, tissues, and microenvironments where global organization emerges from local rules. This systems-based perspective provides a conceptual foundation for translational models and integrative methodologies in oncology.
    Keywords:  colorectal cancer; complex systems; emergence; hallmarks of cancer; self-organization; systems biology
    DOI:  https://doi.org/10.3389/fonc.2025.1736140
  4. PLoS Comput Biol. 2026 Feb 17. 22(2): e1013973
      Tumor development and progression are affected not only by cancer cell-intrinsic factors comprising complex genetic variations, but also by cell-extrinsic factors such as cell-cell communication (CCC)-mediated immunosuppression. However, whether and how these two types of factors influence each other remains an open question. We present Driver2Comm, a general computational framework designed to systematically identify intrinsic-extrinsic (IE) pathways that functionally connect cancer cell driver genes with their associated CCC signatures in the tumor microenvironment (TME). By applying Driver2Comm to single-cell and spatial transcriptomic datasets of multiple cancer types, we find that driver gene-associated CCC signatures play critical roles in immune regulation, metastasis, and therapy response. These signatures not only illuminate mechanisms of TME remodeling but also demonstrate clinical value in predicting patient survival and response to immune checkpoint blockade. Furthermore, Driver2Comm captures higher-order, cell-type-pair-specific CCC functional modules and spatially coherent CCC patterns in tissue contexts. As a generalizable tool, Driver2Comm bridges cancer genomics and cellular ecosystems, offering insights into biomarker discovery and combination therapy strategies.
    DOI:  https://doi.org/10.1371/journal.pcbi.1013973
  5. Cell Syst. 2026 Feb 18. pii: S2405-4712(26)00015-3. [Epub ahead of print]17(2): 101533
      Synthetic biology aims to achieve predictable, programmable control over living systems by designing and engineering biological components and functions. Over the past 25 years, the field has advanced from foundational molecular tools to increasingly complex systems-level architectures. A new inflection point has emerged with the integration of generative artificial intelligence (AI), catalyzing a fundamental shift in how biological design is conceived and executed. Generative AI now enables the data-driven creation of novel designs with predictable functionality and context-aware precision. Here, we examine the convergence of synthetic biology and generative AI, highlighting key innovations at this emerging frontier of deep generative design across biological parts and systems. We discuss how design frameworks have evolved and outline the opportunities and challenges that lie ahead, spanning biomolecular elements, genetic circuits, and genomes. Finally, we propose a roadmap for how generative AI can unlock a new era of predictable, programmable synthetic biological systems.
    Keywords:  artificial intelligence; de novo design; deep generative models; deep learning; generative artificial intelligence; generative biology; machine learning; synthetic biological circuits; synthetic biology
    DOI:  https://doi.org/10.1016/j.cels.2026.101533
  6. Cell Syst. 2026 Feb 18. pii: S2405-4712(26)00020-7. [Epub ahead of print]17(2): 101538
      Biomedical research requires quantitative rigor, i.e., numeracy, a facility with numbers. The last decade has seen the broad adoption of statistical tools ("Numeracy 1.0"). To drive science forward, the expertise to quantitatively evaluate hypotheses and insights also needs to be broadly adopted ("Numeracy 2.0"). Systems biologists will be at the forefront of the transformation.
    DOI:  https://doi.org/10.1016/j.cels.2026.101538
  7. Cell Syst. 2026 Feb 18. pii: S2405-4712(25)00342-4. [Epub ahead of print]17(2): 101509
      Most diseases are not caused by large-effect single factors but by the cumulative impact of small, context-dependent perturbations arising from genetic variants, personal behavior, or environmental exposures, a phenomenon we term the "long tail" of biology. Early disease signals often differ from late-stage biomarkers and evolve across demographic, lifestyle, and environmental contexts. Shifting medicine from reactive treatment to proactive health requires detecting and interpreting these signals. This requires longitudinal, multimodal data collection; non-invasive, scalable biosensing platforms; new technologies for interrogating biological complexity; and AI models capable of contextual, mechanistic reasoning. We propose an "N-of-1 analyzer" framework to track divergence from personal baselines across analytes, relationships, networks, and trajectories, interpreted through digital-twin simulations and knowledge-grounded foundational models. This framework enables early, individualized insights into disease risk and system decline, offering a path toward scalable precision prevention. Regulatory innovations will have to evolve, embracing complexity instead of reducing it to the mean.
    Keywords:  AI in healthcare; N-of-1 medicine; biological aging; multi-omics; precision prevention; systems biology
    DOI:  https://doi.org/10.1016/j.cels.2025.101509
  8. Nat Rev Mol Cell Biol. 2026 Feb 18.
      Biological functions depend on the spatiotemporal distribution of proteins within cells. Key cellular activities such as signal transduction, metabolism, cell cycle and cell death are driven by the interactions of proteins that are localized in multiple cellular compartments. Such multilocalization can even allow proteins with identical sequences to display multifunctionality, a phenomenon known as moonlighting. Despite its biological importance, the relationship between protein localization and function remains underexplored. In this Review, we discuss the known mechanisms of protein localization (including RNA transport, role of proteoforms and molecular interactions) and how subcellular localization controls protein function. Proper regulation of protein localization is crucial for specialized cell and tissue functions, including cell differentiation, polarization and the epithelial-mesenchymal transition. Protein mislocalization can also have important roles in pathological processes, such as in cancer, neurodegeneration and autoimmunity. We end with a discussion of current technological and conceptual challenges in the field of subcellular proteomics and spatial biology. Addressing these challenges will allow us to link the dynamic nature of protein localization and function across biological scales and contexts, with great impact on fundamental cell biology and clinical applications.
    DOI:  https://doi.org/10.1038/s41580-026-00947-3
  9. Sci Rep. 2026 Feb 17.
      Colorectal cancer is thought to develop through the stepwise accumulation of somatic mutations. Recent years have seen the publication of several studies greatly advancing our understanding of the molecular events driving the disease. However, individual studies tend to be small and additional insights may be obtained through the combination of data from multiple sources. We performed targeted sequencing of 2172 colorectal cancers from Icelandic patients and combined these data with publicly available mutation calls from 9515 additional tumours collected from the literature. Analysing microsatellite stable (MSS) and unstable (MSI) tumours separately, we find evidence of positive selection of mutations in 112 genes that replicate across multiple studies of patients with diverse demographics. We carried out a meta-analysis of conditional selection, identifying 57 gene pairs where a mutation in one gene influences the selection of the other. We describe many associations with tumour phenotypes, including a strong association between mucinous histology and mutations in the transforming growth factor beta (TGFβ) pathway, only in MSS tumours. Our study demonstrates how combining evidence from multiple sources allows for new discoveries in cancer genomics.
    DOI:  https://doi.org/10.1038/s41598-026-39255-3
  10. Nat Rev Cancer. 2026 Feb 20.
      It is well established that malignant cells alter their metabolism to support proliferation, but the nutrients required to meet the anabolic demands of different cancers located at various anatomical sites throughout the body remain largely unknown. Moreover, the extent to which nutrients are supplied by neighbouring stromal cells or distant tissues, possibly due to metabolic reprogramming, is poorly understood. Metabolomics provides a unique biochemical approach to address these gaps in our knowledge, but cancer studies require careful consideration because it is challenging to identify appropriately matched control samples for comparison. Here, we detail a collection of metabolomics workflows designed to interrogate cancer across three discrete scales. First, we describe experiments to define the nutrient demands of cancer cells themselves. Second, we focus on identifying metabolic relationships between neighbouring cells in the tumour microenvironment. Finally, we highlight strategies to explore the metabolic crosstalk between cancer cells and distant tissues in the tumour macroenvironment. The approaches outlined span cells in culture, animal models and human specimens from patients with cancer. Special emphasis is dedicated to the application of emerging technologies and computational pipelines in the field of mass spectrometry that enable global profiling of metabolites and lipids.
    DOI:  https://doi.org/10.1038/s41568-026-00908-0
  11. Nat Rev Genet. 2026 Feb 17.
      Genome annotation captures the essence of a genome by cataloguing its genes, transcripts, proteins and other functional elements of the DNA sequence. Accurate annotation serves as the foundation for a wide range of downstream analyses and discoveries, ranging from basic biology to an understanding of the linkage between genes and disease. Over the past two decades, advances in high-throughput sequencing techniques have enabled faster and more accurate capture of diverse genomic features, generating data at an unprecedented scale. Concurrently, computational methods for translating these data into evidence for genome annotation have steadily improved, leading to better automated genome annotation systems. As such, the growing number of sequenced genomes provides a positive feedback loop, in which database searches become more effective and shared sequence patterns emerge more clearly. These advances are promising steps towards annotating the functions of many poorly understood genes, particularly non-coding RNA genes, for which more research is needed.
    DOI:  https://doi.org/10.1038/s41576-026-00937-3
  12. J Transl Med. 2026 Feb 19. 24(1): 281
       BACKGROUND: Colorectal cancer (CRC) remains a leading cause of global cancer mortality, highlighting the need for precise survival prediction to guide clinical decisions. Although tissue-level multi-omics is widely utilized for survival prediction, its limited resolution cannot capture tumor heterogeneity. Single-cell RNA sequencing (scRNA-seq) enables dissection of the tumor microenvironment (TME) at cellular resolution, supporting personalized prognostic assessment.
    METHODS: We collected 213 CRC scRNA-seq samples and established a CRC-specific TME atlas comprising 339,060 cells. Using this atlas as a reference, we deconvolved bulk RNA-seq data from TCGA-CRC cohort with the EcoTyper algorithm to reconstruct TME features. Clinical, genomic, and transcriptomic data were obtained from the Xena platform; microbial data were sourced from the BIC database. We integrated TME and multi-omics features through a self-normalizing neural network to construct a deep learning model (single-cell resolution TME ecosystem with multi-omics data [SCMO]) for survival prediction. To enhance interpretability, we utilized the Integrated Gradients algorithm and spatial transcriptomic data to analyze multi-omics and TME features. We performed anticancer drug screening with tumor necrosis factor receptor-associated protein 1 (TRAP1), a critical feature according to the Integrated Gradients algorithm, as a potential target.
    RESULTS: We identified 13 survival-related TME features from the CRC-specific atlas: 12 cell states and one multi-cellular ecosystem. SCMO, which combined TME and multi-omics features, improved survival prediction and outperformed existing methods, achieving a concordance index of 0.762. The SCMO demonstrated robust performance for long-term predictions, achieving areas under the curve (AUCs) of 0.752, 0.772, and 0.869 for 1-, 3-, and 5-year predictions in the training set, with corresponding test set AUCs of 0.639, 0.756, and 0.772. TME features from the SCMO model revealed that ecosystem density increased with CRC malignancy. Multi-omics features included TRAP1 as a potential drug target. Drug screening identified saikosaponin A as a novel TRAP1 inhibitor, and its anticancer activity was validated in vitro. We developed SCMO-Lite, a simplified model incorporating 12 high-attribution-weight multi-omics features, which demonstrated robust risk stratification.
    CONCLUSIONS: SCMO combines analytical precision with biological interpretability, offering novel insights for oncology survival prediction.
    Keywords:  Colorectal cancer; Deep learning; Drug screening; Multi-Omics; Single-cell analysis; Survival analysis; Tumor microenvironment
    DOI:  https://doi.org/10.1186/s12967-025-07417-y
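      For readers unfamiliar with the concordance index reported in the RESULTS of entry 12, the following is a minimal sketch of Harrell's C for right-censored survival data. It is not the authors' code: the function name, argument names, and the toy data in the usage note are illustrative assumptions, and pairs with tied survival times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs whose predicted
    risks are ordered consistently with their observed survival times.

    times       -- observed follow-up time per patient
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # event; tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

      With hypothetical inputs `times=[2, 4, 6, 8]`, `events=[1, 1, 0, 1]` and `risk_scores=[0.9, 0.7, 0.8, 0.1]`, five pairs are comparable and four are ordered correctly, giving 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.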
  13. Sci Adv. 2026 Feb 20. 12(8): eaeb2473
      Colorectal cancer (CRC) is a leading cause of cancer-related mortality worldwide, yet the functional impact of noncoding variants on enhancer activity remains largely unexplored. In this study, we adapted and applied two high-throughput techniques, SNP-STARR-seq and Methyl-STARR-seq, to systematically evaluate the influence of 30,790 noncoding SNPs and more than 134,000 CpG sites on enhancer activity in primary and metastatic CRC cells. We identified 922 SNPs and 487 CpG-containing elements modulating enhancer activity in primary cells and found 3136 SNPs and 3008 methylation-sensitive elements with metastasis-specific regulatory effects. Multi-omics integration linked these variants to target genes, and CRISPR editing validated their roles in driving tumorigenic and metastatic phenotypes. Furthermore, we identified two CRC-specific hypermethylated loci, cg08640619 and cg25982657, as exceptional tissue-based early detection biomarkers (AUC > 0.96). Mechanistically, hypermethylation at cg08640619 disrupts RUNX2 binding, leading to inhibition of KIRREL1 and ETV3. Our study provides a comprehensive platform for understanding how genetic and epigenetic variants disrupt transcriptional programs in CRC, offering insights into disease susceptibility and identifying potential diagnostic and therapeutic targets.
    DOI:  https://doi.org/10.1126/sciadv.aeb2473
  14. Nat Cancer. 2026 Feb 19.
      The presence of microbiota in human tumors has been reported widely based on bioinformatic analyses of DNA sequencing datasets; however, the source of microbial sequences in atypical anatomical sites is challenging to validate, as these could derive from sampling, storage, handling and processing of samples, similar to what has been described in studies of ancient DNA. Contamination of microbial reference genomes can also be a source of microbial signals, causing misclassification of human reads. Here, we overview the required quality controls and validation approaches and summarize optimal practices to improve the rigor and standards of tumor microbiome studies.
    DOI:  https://doi.org/10.1038/s43018-026-01121-6
  15. Genome Res. 2026 Feb 17.
      Transcriptional regulation lies at the heart of cellular identity and function, hinging on the precise binding of transcription factors (TFs) and cofactors to gene regulatory elements such as promoters and enhancers. Although it is relatively routine to profile genome-wide DNA binding landscapes of proteins, identifying the specific proteins that bind to, and regulate the transcription of, a particular gene of interest (GOI) remains a persistent experimental and conceptual challenge. This gene-centric question is complicated by the multilayered regulatory environment in which each gene resides, comprising 3D chromatin structure, enhancer-promoter looping, DNA accessibility, histone modifications, and cell state-dependent protein dynamics. In this review, we dissect the strengths, limitations, and biological relevance of current approaches for studying direct protein-DNA interactions, distinguishing between protein-centric and DNA-centric methodologies. We introduce a conceptual matrix of biological relevance, integrating the origin of DNA and protein elements (cis and trans) to evaluate false-positive and false-negative risks across experimental systems. Moreover, we explore how perturbation strategies (gain and loss of function) can complement steady-state profiling to establish causality in gene regulation. By critically examining both established tools and emerging techniques such as genome editing, synthetic chromosomes, and high-resolution imaging, we provide a practical framework for investigators seeking to uncover direct regulators of specific genes. Our goal is to guide the design of experiments that balance biological relevance, sensitivity, and interpretability to ultimately answer a deceptively simple question: What TFs directly regulate the expression of my GOI?
    DOI:  https://doi.org/10.1101/gr.281154.125
  16. Nat Rev Clin Oncol. 2026 Feb 20.
      Colorectal cancer (CRC) is a heterogeneous malignancy, with various alterations in molecular signalling pathways driving disease progression and resistance to therapy. Liquid biopsy, as a source of circulating tumour DNA (ctDNA), has been utilized to characterize tumour molecular heterogeneity, facilitating the identification of actionable targets for precision medicine-guided therapies and the detection of emerging genomic drivers of drug resistance in patients with metastatic CRC. In addition, liquid biopsy-based analysis of ctDNA has been validated as a tool for detecting minimal residual disease (MRD) following locoregional treatment in patients with localized colon or rectal cancer, offering improved prognostic stratification and supporting the tailoring of adjuvant systemic therapy. Methodological evolution from PCR analysis of a few known mutations in one gene or a small panel of genes to the assessment of hundreds of genes and pathogenic variants by next-generation sequencing has enabled comprehensive genomic profiling (CGP), thereby improving knowledge of cancer molecular complexity at the individual patient level. In this respect, liquid biopsy-based CGP is an easily repeatable and minimally invasive approach that can provide a dynamic portrait of CRC molecular heterogeneity to guide personalized and adaptive treatment based on biomarkers of response and resistance. In this Review, we discuss current and potential roles of liquid biopsy-based ctDNA analysis in the clinical management of metastatic CRC. We also discuss the evidence supporting implementation of liquid biopsy-based assessment of MRD to refine the management of locoregional CRC and potentially improve cure rates while reducing overtreatment of many patients.
    DOI:  https://doi.org/10.1038/s41571-026-01126-1