bioRxiv. 2024 Sep 19. pii: 2024.09.16.613176. [Epub ahead of print]
Single-particle analysis by cryo-electron microscopy (CryoEM) provides direct access to the conformation of each macromolecule. However, the signal-to-noise ratio of the images is low, and some form of classification is usually performed at the image processing level to allow structural modeling. Classical classification methods assume the existence of a discrete number of structural conformations. In contrast, new heterogeneity algorithms introduce a reconstruction paradigm in which every state is represented by a smaller number of particles, potentially just one, allowing the estimation of conformational landscapes that represent the different structural states a biomolecule explores. In this work, we present a novel deep learning-based method called HetSIREN. HetSIREN can fully reconstruct or refine a CryoEM volume in real space based on the structural information summarized in a conformational latent space. HetSIREN is set apart, first, by its definition as a purely real-space method, which allows spatially focused analysis, and second, by a novel network architecture specifically designed to make use of meta-sinusoidal activations, which have proven high analytical capacity. In addition, HetSIREN can refine the pose parameters of the images while conditioning the network with prior information or constraints on the maps, such as Total Variation and L1 denoising, ultimately yielding cleaner volumes with high-quality structural features.
Finally, and importantly, HetSIREN addresses one of the most confusing issues in heterogeneity analysis: the estimation of real structural heterogeneity is entangled with pose estimation (and, to a lesser extent, with CTF estimation). To address this, HetSIREN introduces a novel encoding architecture able to decouple pose and CTF information from the conformational landscape, resulting in more accurate and interpretable conformational latent spaces. We present results on computer-simulated data, public data from EMPIAR, and data from experimental systems currently being studied in our laboratories. An important finding is the sensitivity of the structure and dynamics of the SARS-CoV-2 Spike protein to the storage temperature.
Keywords: Cryo-Electron Microscopy (CryoEM); Deep Learning; Heterogeneous Reconstruction; Principal Component Analysis (PCA); SARS-CoV-2 Spike protein; Sinusoidal Representation Network (SIREN); Uniform Manifold Approximation and Projection (UMAP); temperature dependence
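
As a rough illustration of two ingredients named in the abstract, a sinusoidal activation and Total Variation plus L1 priors on a map, the following NumPy sketch may help. It is not the HetSIREN implementation. A plain SIREN-style layer, sin(w0 * (xW + b)), stands in for the meta-sinusoidal activations, whose exact form the abstract does not give. All function names, the default w0 = 30, and the penalty weights are assumptions chosen only for illustration.

```python
import numpy as np


def sine_layer(x, weight, bias, w0=30.0):
    """SIREN-style layer (assumed form): sin(w0 * (x @ weight + bias))."""
    return np.sin(w0 * (x @ weight + bias))


def total_variation(volume):
    """Anisotropic total variation: sum of absolute forward
    differences between neighboring voxels along every axis."""
    return sum(np.abs(np.diff(volume, axis=ax)).sum()
               for ax in range(volume.ndim))


def l1_norm(volume):
    """L1 penalty: sum of absolute voxel values (favors sparse maps)."""
    return np.abs(volume).sum()


def regularized_loss(data_term, volume, tv_weight=0.1, l1_weight=0.01):
    """Data-fidelity term plus weighted TV and L1 penalties.
    The weights are hypothetical; they are not taken from the paper."""
    return (data_term
            + tv_weight * total_variation(volume)
            + l1_weight * l1_norm(volume))


# Toy volume: a 2x2x2 cube of ones inside a 4x4x4 box of zeros.
volume = np.zeros((4, 4, 4))
volume[1:3, 1:3, 1:3] = 1.0

tv_value = total_variation(volume)   # counts the jumps across the cube faces
l1_value = l1_norm(volume)           # counts the occupied voxels
loss_value = regularized_loss(1.0, volume)

# Toy coordinates passed through one sine layer.
coords = np.zeros((5, 3))
features = sine_layer(coords, np.ones((3, 2)), np.zeros(2))
```

In this toy case the TV term grows with the surface area of the cube and the L1 term with its occupied volume, which is one way to see why such priors would tend to suppress isolated noisy voxels while preserving compact density.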