Biophys J. 2021 Oct 07. pii: S0006-3495(21)00828-6. [Epub ahead of print]
Intrinsically disordered proteins and flexible regions in multi-domain proteins display substantial conformational heterogeneity. Characterizing the conformational ensembles of these proteins in solution typically requires combining one or more biophysical techniques with computational modelling or simulations. Experimental data can be used either to assess the accuracy of a computational model or to refine the model to improve its agreement with the experimental data. In both cases, one generally needs a so-called forward model, i.e. an algorithm to calculate experimental observables from individual conformations or ensembles. In many cases, this involves one or more parameters that need to be set, and it is not always trivial to determine the optimal values or to understand the impact of the choice of parameters. For example, in the case of small-angle X-ray scattering (SAXS) experiments, many forward models include parameters that describe the contribution of the hydration layer and displaced solvent to the background-subtracted experimental data. Often, one also needs to fit a scale factor and a constant background for the SAXS data, but these are fitted across the entire ensemble. Here, we present a protocol to dissect the effect of free parameters on the calculated SAXS intensities, and to identify a reliable set of values. We have implemented this procedure in our Bayesian/Maximum Entropy framework for ensemble refinement, and demonstrate the results on four intrinsically disordered proteins and a protein with three domains connected by flexible linkers. Our results show that the resulting ensembles can depend on the parameters used for solvent effects, and suggest that these should be chosen carefully. We also find a set of parameters that works robustly across all proteins.
SIGNIFICANCE: The flexibility of a protein is often key to its biological function, yet understanding and characterizing its conformational heterogeneity is difficult. Here, we describe a robust protocol for combining small-angle X-ray scattering experiments with computational modelling to obtain a conformational ensemble. In particular, we focus on the contribution of protein hydration to the experiments and how this is included in modelling the data. Our resulting algorithm and software should make modelling intrinsically disordered proteins and multi-domain proteins more robust, thus aiding in understanding the relationship between protein dynamics and biological function.
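The step of fitting a scale factor and a constant background across the entire ensemble can be illustrated with a minimal sketch. It is not the authors' software: the function names `ensemble_average` and `fit_scale_and_background`, the array layout (conformations by q-values), and the use of weighted linear least squares are all assumptions made for illustration. The idea shown is that the per-conformation intensities are first averaged with the ensemble weights, and only then is a single scale and background fitted to the experimental curve.

```python
import numpy as np


def ensemble_average(intensities, weights=None):
    """Weighted average of per-conformation SAXS curves.

    intensities: array of shape (n_conformations, n_q), hypothetical layout.
    weights: optional array of shape (n_conformations,); uniform if omitted.
    """
    intensities = np.asarray(intensities, dtype=float)
    n_conf = intensities.shape[0]
    if weights is None:
        weights = np.full(n_conf, 1.0 / n_conf)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return weights @ intensities


def fit_scale_and_background(i_calc, i_exp, sigma):
    """Fit i_exp ~ scale * i_calc + background by weighted least squares.

    Returns (scale, background, chi2), where chi2 is the mean squared
    error-normalized residual over all q-values.
    """
    i_calc = np.asarray(i_calc, dtype=float)
    i_exp = np.asarray(i_exp, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Dividing each row by sigma turns the weighted problem into an
    # ordinary least-squares problem.
    design = np.column_stack([i_calc / sigma, 1.0 / sigma])
    target = i_exp / sigma
    (scale, background), *_ = np.linalg.lstsq(design, target, rcond=None)

    residuals = (scale * i_calc + background - i_exp) / sigma
    chi2 = float(np.mean(residuals ** 2))
    return float(scale), float(background), chi2
```

In a refinement loop of the kind the abstract describes, the ensemble weights would change between iterations, so under these assumptions the average and the fit would be recomputed each time the weights are updated, rather than fitting each conformation separately.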