SMpred: A support vector machine approach to identify structural motifs in protein structure without using evolutionary information

Knowledge of the three-dimensional structure is essential to understand the function of a protein. Although the overall fold is determined by the full sequence, a small group of residues, often called structural motifs, plays a crucial role in determining the protein fold and its stability. Identifying such structural motifs requires a sufficient number of sequence and structural homologs to define conservation and evolutionary information. Unfortunately, many structures in the protein structure databases have no homologous structures or sequences. In this work, we report an SVM method, SMpred, that identifies structural motifs from a single protein structure without using sequence or structural homologs. SMpred was trained and tested using 132 protein domains containing 581 motifs, and achieved 78.79% accuracy with 79.06% sensitivity and 78.53% specificity. The performance of SMpred was evaluated with MegaMotifBase using 188 proteins containing 1161 motifs. Out of 1161 motifs, SMpred correctly identified 1503 structural motifs reported in MegaMotifBase. Further, we showed that SMpred is a useful approach for length-deviant superfamilies and single-member superfamilies. These results suggest that our approach can facilitate the identification of structural motifs in protein structures in the absence of sequence and structural homologs.
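As an illustration of how the accuracy, sensitivity and specificity figures quoted above are conventionally defined for a binary classifier, the sketch below computes them from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the SMpred evaluation.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: motif residues predicted correctly / missed
    tn/fp: non-motif residues predicted correctly / wrongly flagged
    """
    sensitivity = tp / (tp + fn)               # fraction of true motifs recovered
    specificity = tn / (tn + fp)               # fraction of non-motifs rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall fraction correct
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts, for illustration only (not SMpred results).
metrics = classification_metrics(tp=80, fn=20, tn=70, fp=30)
print(metrics)
```

When the positive and negative classes are of similar size, accuracy falls between sensitivity and specificity, which is the pattern seen in the figures reported for SMpred.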

PeptideMine - a webserver for the design of peptides for protein-peptide binding studies derived from protein-protein interactomes

Here we describe an integrated approach called "PeptideMine" for the identification of peptides based on specific functional patterns present in the sequence of an interacting protein. This approach based on sequence searches in the interacting sequence space has been developed into a webserver, which can be used for the identification and analysis of peptides, peptide homologues or functional patterns from the interacting sequence space of a protein. To further facilitate experimental validation, the PeptideMine webserver also provides a list of physico-chemical parameters corresponding to the peptide to determine the feasibility of using the peptide for in vitro biochemical or biophysical studies.

HORI: a web server to compute Higher Order Residue Interactions in protein structures

Folding of a protein into its three-dimensional structure is influenced by both local and global interactions within the protein. Higher order residue interactions, such as pairwise, triplet and quadruplet ones, play a vital role in attaining the stable conformation of the protein structure. It is generally agreed that higher order interactions make a significant contribution to the potential energy landscape of folded proteins, and it is therefore important to identify them in order to estimate their contribution to the overall stability of a protein structure. We developed HORI [Higher order residue interactions in proteins], a web server for the calculation of global and local higher order interactions in protein structures. The basic algorithm of HORI is based on the classical concept of four-body nearest-neighbour propensities of amino-acid residues. Higher order residue interactions up to the level of quadruplet interactions have been shown to play a major role in the three-dimensional structure of proteins and are an important feature that can be used in protein structure analysis. HORI is a highly interactive web server, organised in three modules, that enables the user to analyse higher order residue interactions in protein structures, and it will be a useful resource for the structural bioinformatics community.
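To make the notion of pairwise, triplet and quadruplet interactions concrete, the sketch below enumerates groups of residues whose representative atoms all lie within a distance cutoff of one another. This is a simple distance-cutoff stand-in and not HORI's own algorithm; the function name `higher_order_contacts`, the use of one coordinate per residue, and the 7.0 Angstrom cutoff are all assumptions made for illustration.

```python
import math
from itertools import combinations


def higher_order_contacts(coords, order, cutoff=7.0):
    """Return residue groups of size `order` that are mutually in contact.

    coords: dict mapping residue number -> (x, y, z) of one representative atom
    order:  2 for pairs, 3 for triplets, 4 for quadruplets
    cutoff: assumed contact distance in Angstrom (hypothetical value)
    """
    groups = []
    for group in combinations(sorted(coords), order):
        # A group counts only if every pair inside it is within the cutoff.
        if all(math.dist(coords[a], coords[b]) <= cutoff
               for a, b in combinations(group, 2)):
            groups.append(group)
    return groups


# Toy coordinates: residues 1-4 are close together, residue 5 is far away.
toy = {
    1: (0.0, 0.0, 0.0),
    2: (4.0, 0.0, 0.0),
    3: (0.0, 4.0, 0.0),
    4: (0.0, 0.0, 4.0),
    5: (30.0, 0.0, 0.0),
}
for order in (2, 3, 4):
    print(order, higher_order_contacts(toy, order))
```

The brute-force enumeration is adequate for small examples only; for whole proteins a neighbour list or a tessellation-based approach would avoid testing every combination.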

IWS - Integrated Web Server for protein sequence and structure analysis

The rapid increase in protein sequence information from genome sequencing projects demands bioinformatics tools that can recognize interesting gene products and their associated functions. Often, multiple algorithms need to be employed to improve the accuracy of predictions, and several structure prediction algorithms are in the public domain. Here, we report the availability of an Integrated Web Server (IWS), an online bioinformatics package dedicated to in-silico analysis of protein sequence and structure data. IWS provides a web interface to both in-house and widely accepted programs from major bioinformatics groups, organized as 10 different modules. IWS also provides interactive images of the analysis workflow, which make the process transparent to users and allow them to move across modules seamlessly and perform their predictions rapidly.

PURE - Prediction of Unassigned REgions

PURE - Prediction of Unassigned REgions is a bioinformatics protocol to identify putative domains in the unassigned regions of protein sequences. The PURE protocol is now available as a web server. PURE Server is a web server implementation of the multi-step algorithm based on the PURE method for the further examination of unassigned linker regions. Initially, the submitted sequence undergoes automated filtering steps based on length, coiled-coil content, transmembrane content and secondary structural content. In the next phase, the filtered sequences are fed to PSI-BLAST, CD-HIT and hmmpfam. We then integrate all the information and present the predicted domain(s) in the sequence.

COILCHECK - Web Server for validation of coiled coils

COILCHECK is a webserver for validation of coiled coils. COILCHECK requires a PDB file (containing only the coiled-coil region) and the identifiers of the two PDB chains as input. The COILCHECK server reports the strength of interactions between the two helices in terms of energy per residue. Details of different types of interactions (hydrogen bonds, hydrophobic interactions, salt bridges, favourable electrostatic interactions, unfavourable electrostatic interactions, and short contacts) can also be obtained.

SMotif - Structural Motifs in Proteins

SMotif is a server to identify a set of structural motifs from protein structures. Such motifs among structurally aligned proteins are recognized by the conservation of amino acid preference and solvent inaccessibility, and are examined for the conservation of other important structural features such as secondary structural content, hydrogen bonding pattern and residue packing.

Harmony - Web Server for Protein Structure Assessment

Harmony is a server to assess the compatibility of an amino acid sequence with a proposed three-dimensional structure. Structural descriptors such as backbone conformation, solvent accessibility and hydrogen bonding are used to characterise the structural environment of each residue position. Propensity and Substitution values are used together to predict the occurrence of an amino acid at each position in the sequence on the basis of the local structural environment. We demonstrate that the information from amino acid substitutions among homologous sequences (in the form of environment-dependent amino acid substitution tables) is a powerful tool for identifying errors that may be present in the protein structure.


This server aligns two or more sequences by fixing sequentially conserved regions or motifs within the aligned sequences. Fixing the conserved regions gives the alignment both flexibility and accuracy. The server also provides conserved regions for a set of sequences, which can be used for FMALIGN.

iMOT: Interacting Motifs

iMOT is an automated method for identifying conserved spatially interacting regions across proteins. Signatures of proteins are derived based on spatial interactions among sequentially conserved regions. Interactions of the conserved stretches are evaluated based on pseudo-energies.

DIAL: Domain Identification Algorithm

DIAL is an algorithm for domain identification in proteins. The program works by segmenting the protein into secondary structures and loops, and then clustering them according to their proximity indices. Each cluster thus derived is a potential structural domain. A disjoint factor is calculated for each such cluster, and a cluster is considered a structural domain if its disjoint factor is greater than one.

MODIP [MOdelling of DIsulphide bonds in Proteins]

This computer modelling program requires N, C alpha and C beta coordinates as input. It considers all possible residue pairs and calculates the C alpha-C alpha and C beta-C beta distances. It selects the residue pairs with a C alpha-C alpha distance of less than or equal to 6.5 Angstrom and a C beta-C beta distance of less than or equal to 4.5 Angstrom, geometrically fixes the sulfur atoms, and grades the modelled disulphide bonds based on their stereochemical quality.
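The distance screen described above can be sketched as follows. This is a minimal illustration of the two cutoffs only (C alpha-C alpha of at most 6.5 Angstrom and C beta-C beta of at most 4.5 Angstrom); it does not place the sulfur atoms or grade stereochemistry, and the function name `candidate_pairs`, the input layout and the toy coordinates are assumptions rather than MODIP's actual interface.

```python
import math
from itertools import combinations

CA_CUTOFF = 6.5  # Angstrom, C alpha-C alpha limit stated in the text
CB_CUTOFF = 4.5  # Angstrom, C beta-C beta limit stated in the text


def candidate_pairs(residues):
    """Return residue pairs that pass both distance cutoffs.

    residues: dict mapping residue number -> {"CA": (x, y, z), "CB": (x, y, z)}
              (hypothetical layout; residues without a CB atom must be
              left out by the caller)
    """
    pairs = []
    for i, j in combinations(sorted(residues), 2):
        ca = math.dist(residues[i]["CA"], residues[j]["CA"])
        cb = math.dist(residues[i]["CB"], residues[j]["CB"])
        if ca <= CA_CUTOFF and cb <= CB_CUTOFF:
            pairs.append((i, j, round(ca, 2), round(cb, 2)))
    return pairs


# Toy coordinates, for illustration only.
toy = {
    1: {"CA": (0.0, 0.0, 0.0), "CB": (1.5, 0.0, 0.0)},
    2: {"CA": (5.5, 0.0, 0.0), "CB": (4.0, 0.0, 0.0)},
    3: {"CA": (20.0, 0.0, 0.0), "CB": (21.5, 0.0, 0.0)},
}
print(candidate_pairs(toy))
```

Only residues 1 and 2 pass both cutoffs in this toy input; residue 3 is too far from either of them.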