bims-lances 2025-05-11 papers

Biomed News

on Landscapes from Cryo-EM and Simulations

Issue of 2025–05–11
ten papers selected by
James M. Krieger, National Centre for Biotechnology

Enhanced Exploration of Protein Conformational Space through Integration of Ultra-Coarse-Grained Models to Multiscale Workflows.
Bridging Dimensionality Reduction and Stochastic Sampling: The DA2-MC Algorithm for Protein Dynamics.
Machine Learning for Protein Science and Engineering.
Rational design of 19F NMR labelling sites to probe protein structure and interactions.
Conformational Plasticity in dsRNA-Binding Domains Drives Functional Divergence in RNA Recognition.
Liquid-Electron Microscopy and the Real-Time Revolution.
Human coronavirus HKU1 spike structures reveal the basis for sialoglycan specificity and carbohydrate-promoted conformational changes.
Machine Learning of Molecular Dynamics Simulations Provides Insights into the Modulation of Viral Capsid Assembly.
Synergistic Drug Combination Prediction via Dual-level Feature Aggregation and Knowledge Graph-based Deep Neural Network.
Foldclass and Merizo-search: Scalable structural similarity search for single- and multi-domain proteins using geometric learning.

J Phys Chem B. 2025 May 08.

Enhanced Exploration of Protein Conformational Space through Integration of Ultra-Coarse-Grained Models to Multiscale Workflows.

Fikret Aydin, Konstantia Georgouli, Loïc Pottier, Tomas Oppelstrup, Timothy S Carpenter, Jeremy O B Tempkin, Peer-Timo Bremer, Dwight V Nissley, Frederick H Streitz, Felice C Lightstone, Helgi I Ingólfsson.

Computational techniques such as all-atom (AA) molecular dynamics (MD) simulations and coarse-grained (CG) models have been essential to study various biological problems over a wide range of scales. While AA simulations provide detailed insights, they are computationally expensive for capturing dynamics over longer length and time scales. CG approaches, particularly ultra-coarse-grained (UCG) models as considered in this study, have addressed this limitation by simplifying molecular representations, enabling the study of larger systems and longer time scales. This work focuses on the development of UCG models of proteins and their integration into the Multiscale Machine-Learned Modeling Infrastructure (MuMMI) to efficiently sample protein conformations, exemplified by the RAS-RBDCRD protein complex. By employing a combination of essential dynamics coarse graining (EDCG) and heterogeneous elastic network modeling (hENM) with anharmonic modifications, we developed UCG models based on the fluctuations observed in the higher resolution Martini CG simulations. These models allow the accurate sampling of protein configurations and long-range conformational changes. The incorporation of an implicit membrane model further enhanced the exploration of protein-membrane dynamics. Additionally, a novel machine-learning-based backmapping approach was developed to convert UCG structures to Martini CG representations, resulting in improved prediction accuracy. Finally, the integration of UCG models into MuMMI significantly enhances the exploration of protein configurations, offering critical insights into the role of protein dynamics in biological processes.

DOI: https://doi.org/10.1021/acs.jpcb.4c08622
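The heterogeneous elastic network idea named in the abstract above can be illustrated with a minimal sketch. It assumes a purely harmonic energy, a sum over connected bead pairs of 0.5 * k_ij * (r_ij - r0_ij)^2, with a separate spring constant for each pair. The anharmonic modifications and the fitting to Martini CG fluctuations described by the authors are not reproduced here, and the function name `enm_energy`, the bead coordinates, and the spring parameters are hypothetical.

```python
import math


def enm_energy(coords, springs):
    """Harmonic elastic-network energy for a set of coarse-grained beads.

    coords  : list of (x, y, z) bead positions
    springs : dict mapping (i, j) bead-index pairs to (k_ij, r0_ij),
              a per-pair spring constant and equilibrium distance
              (the "heterogeneous" part; the values are illustrative)
    """
    energy = 0.0
    for (i, j), (k, r0) in springs.items():
        r = math.dist(coords[i], coords[j])  # current bead-bead distance
        energy += 0.5 * k * (r - r0) ** 2    # harmonic penalty for this pair
    return energy


# Hypothetical three-bead model with two springs of different stiffness.
beads = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
springs = {(0, 1): (2.0, 1.0), (1, 2): (4.0, 3.0)}
print(enm_energy(beads, springs))
```

Only the first spring is stretched in this toy configuration, so it alone contributes to the energy. In a real model the per-pair constants would be fitted to fluctuations from higher-resolution simulations rather than chosen by hand.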
J Phys Chem Lett. 2025 May 07. 4788-4795

Bridging Dimensionality Reduction and Stochastic Sampling: The DA2-MC Algorithm for Protein Dynamics.

Ruizhe Shen, Qiang Zhu, Limu Hu, Jing Ma, Wei Wang, Hao Dong.

Elucidating protein dynamics and conformational changes is crucial for understanding their biological functions. This work introduces a data-driven accelerated conformational searching algorithm incorporating a Monte Carlo strategy, termed the DA2-MC method, which integrates dimensionality reduction techniques with Monte Carlo strategies to efficiently explore unknown protein conformations. The DA2-MC method was applied to investigate the folding mechanisms of two miniproteins, chignolin and WW domain, revealing their dynamic behavior in different conformational states at a reasonable computational cost. A Markov state model-based analysis of chignolin's folding pathway corroborated the dynamic insights obtained from the DA2-MC method. Moreover, free energy calculations initiated with the intermediate structures identified by DA2-MC yielded results consistent with published literature, affirming the method's reliability in accelerating conformational searches and reconstructing equilibrium properties. Collectively, the DA2-MC method emerges as an effective tool for efficiently exploring protein conformations, facilitating the identification of potential functional conformations on complex energy landscapes.

DOI: https://doi.org/10.1021/acs.jpclett.5c00921
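The abstract above does not spell out the DA2-MC procedure, so the following is only a generic illustration of the Monte Carlo ingredient: random-walk Metropolis sampling along a single reduced coordinate. The toy double-well energy, the function names, and the step size and temperature values are all assumptions made for this sketch and are not taken from the paper.

```python
import math
import random


def acceptance_probability(delta_e, kT):
    """Metropolis acceptance probability for an energy change delta_e."""
    if delta_e <= 0:
        return 1.0  # downhill or neutral moves are always accepted
    return math.exp(-delta_e / kT)


def double_well(x):
    """Toy 1D energy with minima at x = -1 and x = +1 (illustrative only)."""
    return (x * x - 1.0) ** 2


def metropolis_walk(energy, x0, n_steps, step=0.5, kT=0.3, seed=0):
    """Random-walk Metropolis sampling along one reduced coordinate."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = [x]
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # propose a local move
        e_new = energy(x_new)
        if rng.random() < acceptance_probability(e_new - e, kT):
            x, e = x_new, e_new               # accept the proposal
        samples.append(x)                     # a rejected move repeats x
    return samples


samples = metropolis_walk(double_well, x0=-1.0, n_steps=5000)
print(len(samples), min(samples), max(samples))
```

In a data-driven setting the reduced coordinate would come from a dimensionality reduction of simulation data rather than from an analytic toy function.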
Cold Spring Harb Perspect Biol. 2025 May 05. pii: a041877. [Epub ahead of print]

Machine Learning for Protein Science and Engineering.

Peter K Koo, Christian Dallago, Ananthan Nambiar, Kevin K Yang.

Recent years have seen significant breakthroughs at the intersection of machine learning and protein science. Tools such as AlphaFold have revolutionized protein structure prediction. They are also enabling variant effect prediction and functional annotation of proteins, as well as opening up new possibilities for protein design. However, these technological advances must be balanced with sustainable computing practices.

DOI: https://doi.org/10.1101/cshperspect.a041877
Nat Commun. 2025 May 08. 16(1): 4300

Rational design of 19F NMR labelling sites to probe protein structure and interactions.

Julian O Streit, Sammy H S Chan, Saifu Daya, John Christodoulou.

Proteins are investigated in increasingly complex biological systems, where 19F NMR is proving highly advantageous due to its high gyromagnetic ratio and background-free spectra. Its application has, however, been hindered by limited chemical shift dispersions and an incompletely understood relationship between chemical shifts and protein structure. Here, we exploit the sensitivity of 19F chemical shifts to ring currents by designing labels with direct contact to a native or engineered aromatic ring. Fifty protein variants predicted by AlphaFold and molecular dynamics simulations show 80-90% success rates and direct correlations of their experimental chemical shifts with the magnitude of the engineered ring current. Our method consequently improves the chemical shift dispersion and through simple 1D experiments enables structural analyses of alternative conformational states, including ribosome-bound folding intermediates, and in-cell measurements of protein-protein interactions and thermodynamics. Our strategy thus provides a simple and sensitive tool to extract residue contact restraints from chemical shifts for previously intractable systems.

DOI: https://doi.org/10.1038/s41467-025-59105-6
J Am Chem Soc. 2025 May 06.

Conformational Plasticity in dsRNA-Binding Domains Drives Functional Divergence in RNA Recognition.

Debadutta Patra, Jaydeep Paul, Upasana Rai, Aravind P S, Mandar V Deshmukh.

The functional specificity of proteins is often attributed to their sequence and structural homology while frequently neglecting the underlying conformational dynamics occurring at different time scales that can profoundly impact biological consequences. Using 15N-CEST NMR and RDC-corrected metainference molecular dynamics simulations, here, we reveal differential substrate recognition mechanisms in two dsRNA-binding domain (dsRBD) paralogs, DRB2D1 and DRB3D1. Despite their nearly identical solution structures and conserved dsRNA interaction interfaces, DRB3D1 demonstrates structural plasticity that enables it to recognize conformationally flexible dsRNA, a feature notably absent in the more rigid DRB2D1. We present the pivotal role of intrinsic structural dynamics in driving functional divergence and provide insights into the mechanisms that govern specificity in dsRBD:dsRNA interactions. Importantly, our combined experimental and computational approach captures a cluster of intermediate conformations, complementing conventional methods to resolve the dominant ground state and sparsely populated excited states.

DOI: https://doi.org/10.1021/jacs.5c02057
Annu Rev Biophys. 2025 May;54(1): 1-15

Liquid-Electron Microscopy and the Real-Time Revolution.

Deborah F Kelly.

Advances in imaging technology enable striking views of life's most minute details. A missing piece of the puzzle, however, is the direct atomic observation of biomolecules in action. Liquid-phase transmission electron microscopy (liquid-EM) is the room-temperature correlate to cryo-electron microscopy, which is leading the resolution revolution in biophysics. This article reviews current challenges and opportunities in the liquid-EM field while discussing technical considerations for specimen enclosures, devices and systems, and scientific data management. Since liquid-EM is gaining traction in the life sciences community, cross talk among the disciplines of materials and life sciences is needed to disseminate knowledge of best practices along with high-level user engagement. How liquid-EM technology is inspiring the real-time revolution in molecular microscopy is also discussed. Looking ahead, the new movement can be better supported through open resource sharing and partnerships among academic, industry, and federal organizations, which may benefit from the scientific equity foundational to the technique.

Keywords: electron microscopy; graphene; liquid-EM; liquid-phase transmission electron microscopy; microchip; microfluidics; real-time

DOI: https://doi.org/10.1146/annurev-biophys-071624-095107
Nat Commun. 2025 May 05. 16(1): 4158

Human coronavirus HKU1 spike structures reveal the basis for sialoglycan specificity and carbohydrate-promoted conformational changes.

Min Jin, Zaky Hassan, Zhijie Li, Ying Liu, Aleksandra Marakhovskaia, Alan H M Wong, Adam Forman, Mark Nitz, Michel Gilbert, Hai Yu, Xi Chen, James M Rini.

The human coronavirus HKU1 uses both sialoglycoconjugates and the protein transmembrane serine protease 2 (TMPRSS2) as receptors. Carbohydrate binding leads to the spike protein up conformation required for TMPRSS2 binding, an outcome suggesting a distinct mechanism for driving fusion of the viral and host cell membranes. Nevertheless, the conformational changes promoted by carbohydrate binding have not been fully elucidated and the basis for HKU1's carbohydrate binding specificity remains unknown. Reported here are high resolution cryo-EM structures of the HKU1 spike protein trimer in its apo form and in complex with the carbohydrate moiety of a candidate carbohydrate receptor, the 9-O-acetylated GD3 ganglioside. The structures show that the spike monomer can exist in four discrete conformational states and that progression through them would promote the up conformation upon carbohydrate binding. We also show that a six-amino-acid insert is a determinant of HKU1's specificity for gangliosides containing a 9-O-acetylated α2-8-linked disialic acid moiety and that HKU1 shows weak affinity for the 9-O-acetylated sialic acids found on decoy receptors such as mucins.

DOI: https://doi.org/10.1038/s41467-025-59137-y
J Chem Inf Model. 2025 May 08.

Machine Learning of Molecular Dynamics Simulations Provides Insights into the Modulation of Viral Capsid Assembly.

Anna Pavlova, Zixing Fan, Diane L Lynch, James C Gumbart.

An effective approach in the development of novel antivirals is to target the assembly of viral capsids by using capsid assembly modulators (CAMs). CAMs targeting hepatitis B virus (HBV) have two major modes of function: they can either accelerate nucleocapsid assembly, retaining its structure, or misdirect it into noncapsid-like particles. Previous molecular dynamics (MD) simulations of early capsid-assembly intermediates showed differences in protein conformations for the apo and bound states. Here, we have developed and tested several classification machine learning (ML) models to better distinguish between apo-tetramer intermediates and those bound to accelerating or misdirecting CAMs. Models based on tertiary structural properties of the Cp149 tetramers and their interdimer orientation, as well as models based on direct and inverse contact distances between protein residues, were tested. All models distinguished the apo states and the two CAM-bound states with high accuracy. Furthermore, tertiary structure models and residue-distance models highlighted different tetramer regions as being important for classification. Both models can be used to better understand structural transitions that govern the assembly of nucleocapsids and to assist in the development of more potent CAMs. Finally, we demonstrate the utility of classification ML methods in comparing MD trajectories and describe our ML approaches, which can be extended to other systems of interest.

DOI: https://doi.org/10.1021/acs.jcim.5c00274
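The residue-distance features mentioned in the abstract above can be sketched as follows. This assumes that "direct" means the pairwise distance d between residue positions and "inverse" means its reciprocal 1/d; that reading, the function name `distance_features`, and the toy coordinates are assumptions for illustration and do not reproduce the authors' feature pipeline or classifiers.

```python
import math
from itertools import combinations


def distance_features(coords, inverse=False):
    """Flatten all residue-residue distances of one frame into a feature list.

    coords  : list of (x, y, z) positions, one per residue (assumed distinct,
              since a zero distance has no reciprocal)
    inverse : if True, return 1/d instead of d for every pair
    """
    features = []
    for a, b in combinations(coords, 2):  # pairs (0,1), (0,2), (1,2), ...
        d = math.dist(a, b)
        features.append(1.0 / d if inverse else d)
    return features


# Hypothetical frame with three residue positions.
frame = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
print(distance_features(frame))
print(distance_features(frame, inverse=True))
```

Reciprocal distances weight close contacts more strongly than distant pairs, which is one common motivation for using them alongside plain distances. One feature vector per trajectory frame, labelled by simulation condition, would then be passed to a classifier of choice.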
IEEE J Biomed Health Inform. 2025 May 06. PP

Synergistic Drug Combination Prediction via Dual-level Feature Aggregation and Knowledge Graph-based Deep Neural Network.

Ying Zuo, Yan Zhang, Li Wang, Jianping Yu, Jiawei Luo, Qiu Xiao.

Identifying synergistic drug combinations is a critical but difficult challenge in cancer treatment, owing to the sheer complexity and enormous number of possible drug combinations. However, most existing computational methods rely on a single data perspective and often overlook the complexity of interactions between different biological entities. Furthermore, they fail to fully integrate the intrinsic properties of drugs and cell lines with the broader biological relationships that play a crucial role in drug synergy. To address these challenges, we propose a novel framework called LGSyn that integrates two types of information: local features, including molecular fingerprints, descriptors, and gene expression profiles, as well as global features that encompass broader biological interactions, including drug-protein, protein-cell line, protein-protein, and cell line-tissue interactions. By combining these two types of features, LGSyn leverages the full spectrum of biological knowledge to predict drug synergy. In LGSyn, we developed three fusion strategies to effectively integrate local and global information and identify the most suitable strategy. The resulting fused feature vectors are then fed into a deep neural network for training and synergy prediction. Experimental results demonstrate that the proposed method outperforms current state-of-the-art models, achieving superior accuracy and stability in drug synergy prediction. The source code of LGSyn is publicly available at https://github.com/1zuoying/LGSyn.

DOI: https://doi.org/10.1109/JBHI.2025.3567108
Bioinformatics. 2025 May 06. pii: btaf277. [Epub ahead of print]

Foldclass and Merizo-search: Scalable structural similarity search for single- and multi-domain proteins using geometric learning.

Shaun M Kandathil, Andy M Lau, Daniel W A Buchan, David T Jones.

MOTIVATION: The availability of very large numbers of protein structures from accurate computational methods poses new challenges in storing, searching and detecting relationships between these structures. In particular, the new-found abundance of multi-domain structures in the AlphaFold structure database introduces challenges for traditional structure comparison methods.
RESULTS: We address these challenges using a fast, embedding-based structure comparison method called Foldclass which detects structural similarity between protein domains. We demonstrate the accuracy of Foldclass embeddings for homology detection. In combination with a recently developed deep learning-based automatic domain segmentation tool Merizo, we develop Merizo-search, which first segments multi-domain query structures into domains, and then searches a Foldclass embedding database to determine the top matches for each constituent domain. Combining the ability of Merizo to accurately segment complete chains into domains, and Foldclass to embed and detect similar domains, the Merizo-search tool can be used to rapidly detect per-domain similarities for complete chains, taking as little as 2 minutes to search all 365 million domains from the Encyclopedia of Domains. We anticipate that these tools will enable many analyses using the wealth of predicted structural data now available.
AVAILABILITY: Foldclass and Merizo-search are available at https://github.com/psipred/merizo_search. The version used in this publication is archived at https://doi.org/10.5281/zenodo.15120830. Merizo-search is also available on the PSIPRED web server at http://bioinf.cs.ucl.ac.uk/psipred.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

DOI: https://doi.org/10.1093/bioinformatics/btaf277
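The embedding-based search described in the abstract above can be illustrated with a minimal nearest-neighbour sketch. It assumes that each domain is represented by a fixed-length vector and that similarity is scored by cosine similarity; the abstract does not state the metric, and this sketch does not use the Foldclass or Merizo-search code. The function names, the three-dimensional toy embeddings, and the domain labels are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_matches(query, database, k=2):
    """Return the k database entries most similar to the query embedding."""
    scored = [(name, cosine_similarity(query, emb))
              for name, emb in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)  # best match first
    return scored[:k]


# Hypothetical database of domain embeddings (real ones are much longer).
database = {
    "domain_A": [1.0, 0.0, 0.0],
    "domain_B": [0.0, 1.0, 0.0],
    "domain_C": [0.9, 0.1, 0.0],
}
hits = top_matches([1.0, 0.0, 0.0], database)
print(hits)
```

For a multi-domain query, a search of this kind would be run once per segmented domain. A brute-force scan like this one is only practical for small databases; searching hundreds of millions of domains requires an indexed or batched implementation.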