Integrative structure determination

Structures of many large protein complexes and assemblies are difficult to obtain using traditional experimental methods alone. Integrative structure determination fills this gap: various types of experimental data are combined with principles from physics, statistical inference, and prior models to obtain the structure. The different sources of input information may span multiple scales (for example, X-ray crystallography data are at the atomic scale, while FRET distances are at the domain scale). These sources can also provide complementary information (for example, EM maps may provide the shape of a complex, while chemical crosslinks may provide the orientation of binding interfaces). We have used structural, biochemical, biophysical, cell biological, genetic, and bioinformatic information to deduce the structures of assemblies. Our research follows two synergistic tracks: integrative modeling of specific biological systems and method development.

We recently determined the integrative structures of the Nucleosome Remodeling and Deacetylase (NuRD) complex, a chromatin-modifying assembly that regulates gene expression and DNA damage repair (PDB 9A8C), and the epithelial desmosomal outer dense plaque, an assembly that mediates cell-cell adhesion and signaling (PDB 9A8U). Both were determined by combining diverse biophysical and biochemical data at various scales. Together, these structures reveal mechanisms by which these complex molecular machines function and assemble; they also help rationalize disease mutations.

 

Schematic describing integrative structure determination for the NuRD complex (orange box) and the desmosomal outer dense plaque (green box), combining data from multiple sources. Low-resolution cryo-EM and cryo-ET maps and intrinsically disordered regions (both highlighted in yellow) are emerging areas for method development in both complexes.

 

Two modeling challenges recurred across a range of studies, including the two mentioned above. The first was the need for methods to incorporate disordered regions in these assemblies. The second was to make better use of information from cryo-electron tomography (cryo-ET), a timely challenge as structural biology moves towards in situ characterization. Recent methods we developed to address these challenges include Disobind, a sequence-based deep learning method for identifying binding sites of intrinsically disordered regions (IDRs), and PickET, an unsupervised method for localizing macromolecules in cryo-ET data.

Disobind is a deep learning method for predicting interface residues and inter-protein contact maps for an IDR and its partner, given their sequences.
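To illustrate how the two kinds of output relate, here is a minimal sketch (not Disobind's code or API) of reading interface residues off an inter-protein contact map. It assumes the map is a matrix of contact probabilities with one row per IDR residue and one column per partner residue. The function name `interface_residues`, the 0.5 threshold, and the toy values are all hypothetical, chosen only for illustration.

```python
def interface_residues(contact_map, threshold=0.5):
    """Return residue indices with at least one contact at or above threshold.

    contact_map: list of rows (IDR residues) by columns (partner residues),
    holding contact probabilities. This layout is an assumption of the sketch.
    """
    n_cols = len(contact_map[0]) if contact_map else 0
    # An IDR residue is at the interface if any entry in its row passes.
    idr = [i for i, row in enumerate(contact_map)
           if any(p >= threshold for p in row)]
    # A partner residue is at the interface if any entry in its column passes.
    partner = [j for j in range(n_cols)
               if any(row[j] >= threshold for row in contact_map)]
    return idr, partner


# Toy 4 x 3 contact map with made-up probabilities (not real predictions).
toy_map = [
    [0.1, 0.8, 0.2],
    [0.0, 0.1, 0.3],
    [0.6, 0.9, 0.0],
    [0.2, 0.4, 0.1],
]
idr_sites, partner_sites = interface_residues(toy_map)
print("IDR interface residues:", idr_sites)
print("Partner interface residues:", partner_sites)
```

In this toy example the interface residues are a coarser summary of the contact map: they record which residues take part in binding, but not which residue pairs are in contact.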

 

PickET workflow for unsupervised localization in tomograms (left), along with representative segmentations (right) for two real tomograms (top row: CZI-DS-10001, a gallium FIB-milled S. pombe lamella; bottom row: CZI-DS-10301, a plasma FIB-milled C. reinhardtii lamella).

 

Please visit our lab website for further details.