Integrative structure determination

Structures of several large protein complexes and assemblies are difficult to obtain using traditional experimental methods. Integrative structure determination fills this gap: various types of experimental data are combined with principles from physics, statistical inference, and prior models to obtain the structure. The different sources of input information may span multiple scales (for example, X-ray data are at the atomic scale, while FRET distances are at the domain scale) and can provide complementary information (for example, EM maps may provide the shape of a complex, while chemical crosslinks may provide the orientation of binding interfaces). We rigorously incorporate each type of input information, while accounting for its uncertainty, in a Bayesian inference framework. This allows us to incorporate data that are sparse, noisy, ambiguous, and from heterogeneous samples.
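As a rough illustration of how accounting for uncertainty lets data of different quality be combined, the sketch below fuses two independent, noisy estimates of a single inter-domain distance under a flat prior. The Gaussian noise model, the function name, and all numbers are assumptions made for this example; real integrative modeling uses far richer likelihoods and scoring functions.

```python
import math

def combine_gaussian_measurements(measurements):
    """Posterior mean and standard deviation of one distance, given
    independent Gaussian measurements (value, sigma) and a flat prior."""
    # Each measurement is weighted by its precision, 1 / sigma^2.
    precisions = [1.0 / (sigma ** 2) for _, sigma in measurements]
    total_precision = sum(precisions)
    mean = sum(value * p for (value, _), p in zip(measurements, precisions)) / total_precision
    return mean, math.sqrt(1.0 / total_precision)

# Hypothetical inputs, in angstroms: a loose crosslink-derived estimate
# and a tighter FRET-derived estimate of the same distance.
crosslink_estimate = (25.0, 10.0)
fret_estimate = (40.0, 5.0)

posterior_mean, posterior_sigma = combine_gaussian_measurements(
    [crosslink_estimate, fret_estimate]
)
print(f"posterior distance: {posterior_mean:.1f} +/- {posterior_sigma:.2f}")
```

The less certain measurement still contributes, but it pulls the estimate less, and the combined uncertainty is smaller than that of either input alone.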

Integrative structures of chromatin-modifying assemblies

We recently determined the structures of sub-complexes of the Nucleosome Remodeling and Deacetylase (NuRD) complex, a chromatin-modifying assembly that regulates gene expression and DNA damage repair. It is conserved across plant and animal species and expressed in most metazoan tissues. However, its structure is hard to characterize experimentally. Using Bayesian integrative structure determination, we combined information from published SEC-MALLS, DIA-MS, XLMS, negative-stain EM, X-ray crystallography, and NMR spectroscopy data with secondary structure and homology predictions. The integrative structures were corroborated by independent cryo-EM maps, biochemical assays, and known cancer-associated mutations.

 

Integrative structure of the nucleosome deacetylase (NuDe) complex.

We are applying similar methods to study cell-cell junction, cytoskeletal, and centriolar assemblies.

Improving integrative modeling methods

We developed StOP (Stochastic Optimization of Parameters), a method to optimize the sampling-related parameters for modeling assemblies in IMP (https://integrativemodeling.org). StOP automates the tuning of MCMC parameters such as rigid body and bead move sizes, restraint weights, and replica exchange temperatures.

Optimizing sampling parameters for integrative modeling.
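To make the idea of automated MCMC parameter tuning concrete, here is a minimal sketch that adjusts a single move size until a toy Metropolis sampler reaches a target acceptance rate. It is not StOP's algorithm and does not use IMP: the one-dimensional Gaussian target, the stochastic-approximation update, the target rate of 0.4, and all function names are assumptions chosen for illustration.

```python
import math
import random

def acceptance_rate(step_size, n_steps, rng):
    """Run Metropolis sampling on a standard normal target and
    return the fraction of proposed moves that were accepted."""
    x = 0.0
    accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        # Log of the target density ratio pi(proposal) / pi(x).
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
            accepted += 1
    return accepted / n_steps

def tune_step_size(target=0.4, initial=0.1, rounds=60, n_steps=2000, seed=0):
    """Stochastically adjust the move size toward a target acceptance rate."""
    rng = random.Random(seed)
    log_step = math.log(initial)
    for k in range(1, rounds + 1):
        rate = acceptance_rate(math.exp(log_step), n_steps, rng)
        # Too many acceptances: enlarge the move. Too few: shrink it.
        # The gain decays so that the estimate settles down.
        log_step += (rate - target) / math.sqrt(k)
    return math.exp(log_step)

tuned_step = tune_step_size()
check_rate = acceptance_rate(tuned_step, 20000, random.Random(1))
print(f"tuned move size: {tuned_step:.2f}, acceptance rate: {check_rate:.2f}")
```

A real modeling run has many coupled parameters and a much more expensive sampler, which is why a dedicated optimizer is useful; this sketch only shows the feedback loop for one parameter.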

PrISM is our recently developed method to identify high- and low-precision regions in an ensemble of integrative models of large macromolecular assemblies.

Annotating precision for integrative models.

This is part of our effort to develop protocols for validating integrative models deposited in the wwPDB (worldwide Protein Data Bank) [Viswanath, Chemmama, et al., Biophysical Journal 2017].
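The notion of per-region precision in a model ensemble can be sketched as follows. This is a simplified stand-in, not PrISM's actual procedure: it assumes the models are already superposed, measures each bead's root-mean-square spread about its mean position, and labels beads with a hypothetical cutoff. The function names, the toy coordinates, and the cutoff value are all illustrative assumptions.

```python
import math

def per_bead_spread(ensemble):
    """Root-mean-square deviation of each bead from its mean position,
    computed across an ensemble of already superposed models."""
    n_models = len(ensemble)
    n_beads = len(ensemble[0])
    spreads = []
    for b in range(n_beads):
        coords = [model[b] for model in ensemble]
        centroid = [sum(c[d] for c in coords) / n_models for d in range(3)]
        mean_sq_dev = sum(
            sum((c[d] - centroid[d]) ** 2 for d in range(3)) for c in coords
        ) / n_models
        spreads.append(math.sqrt(mean_sq_dev))
    return spreads

def annotate_precision(spreads, cutoff):
    """Label each bead as high precision (small spread) or low precision."""
    return ["high" if s <= cutoff else "low" for s in spreads]

# Toy ensemble: three models with two beads each, coordinates in angstroms.
ensemble = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (10.0, 8.0, 0.0)],
    [(0.0, -1.0, 0.0), (10.0, -8.0, 0.0)],
]
spreads = per_bead_spread(ensemble)
labels = annotate_precision(spreads, cutoff=2.0)
print(list(zip(labels, [round(s, 2) for s in spreads])))
```

Here the first bead barely moves between models and is labeled high precision, while the second varies widely and is labeled low precision.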