Protein modeling

Modeling – in the World of Proteins

Computational protein structure prediction uses in-silico techniques to predict the three-dimensional structures of proteins. Such protein modeling relies on principles derived from known protein structures, obtained via X-ray crystallography and NMR spectroscopy, as well as on physical energy functions.

There are three main methods of modeling:

  1. The first and favorite method is Homology Modeling [1, 2],
  2. Followed by the Threading/Fold Recognition method,
  3. And last but not least the Ab-initio Method.

Homology Modeling

Use this method when you have a protein sequence of unknown structure and a similar protein of known structure (over 30% sequence identity). It relies on programs like BLAST to search for similar proteins in protein structure databases, such as the PDB (Protein Data Bank). Another term for this method is comparative modeling, because you compare the target sequence with known template structures.
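
To make the 30% figure concrete, here is a minimal sketch of how percent identity can be computed from two sequences that are already aligned. The function name and the example fragments are made up for illustration, and tools differ in what they use as the denominator; this version assumes you count only columns where neither sequence has a gap.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical, already-aligned fragments ("-" marks a gap).
target = "MKT-AYIAKQR"
template = "MKSLAYLA-QR"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity")
```

A value above the 30% mark suggests the pair is worth trying as target and template.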

The main software you need for homology modeling is MODELLER. Note that MODELLER is driven by Python scripts and is free for academic use, although it is not open source. A GUI front end called EasyModeller is also available, which lets you make a model of the protein with the ease of snapping your fingers.

Threading/Fold Recognition

With this method, you predict the structure of your target protein by fitting its sequence onto known protein folds found in structural databases. This is useful when no template with clear sequence similarity is available. You can do this easily through online web servers, such as I-TASSER and others.

Ab-initio Method

This method predicts protein structures when no structural information from similar proteins is available. The structures are built from scratch by searching for the most energetically favorable conformations. Because it is computationally demanding and generally less accurate, use this method only as a last resort.
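
To get a feel for what searching for the most favorable energy conformation means, here is a toy sketch. It is not how real ab-initio tools work: it swaps in the highly simplified 2D HP lattice model, where each residue is either hydrophobic (H) or polar (P) and every pair of H residues that touch without being chain neighbors lowers the energy by 1. The function names and the example sequence are made up for illustration.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def contact_energy(sequence, coords):
    """-1 for each pair of H residues adjacent on the lattice but not in the chain."""
    position = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES:
            j = position.get((x + dx, y + dy))
            # j > i + 1 counts each pair once and skips chain neighbors
            if j is not None and j > i + 1 and sequence[j] == "H":
                energy -= 1
    return energy


def best_fold(sequence):
    """Enumerate every self-avoiding walk and return (lowest energy, coordinates)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    best = (1, None)  # any real energy is <= 0, so the first fold replaces this

    def extend(coords, occupied):
        nonlocal best
        if len(coords) == len(sequence):
            energy = contact_energy(sequence, coords)
            if energy < best[0]:
                best = (energy, list(coords))
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                coords.append(nxt)
                occupied.add(nxt)
                extend(coords, occupied)
                coords.pop()
                occupied.remove(nxt)

    extend([(0, 0)], {(0, 0)})
    return best


# Hypothetical 10-residue HP sequence.
energy, coords = best_fold("HPHPPHHPHH")
print("lowest energy:", energy)
print("coordinates:", coords)
```

Even in this toy model the number of conformations grows exponentially with chain length, which hints at why building a structure from scratch is so expensive.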

Of the three methods, homology modeling is the star. This is due, in part, to the current availability of a large number of experimentally determined protein structures. Although many tools and servers are available for homology modeling, the main steps you need to follow for these programs are almost the same.

The Rungs of Homology Modeling

Think of homology modeling as a ladder. Each rung, or step, is important in its own way, and we cannot skip one and jump up. If we do, our final structure may fumble or turn out to be a disaster.

1. Target Sequence Selection (Optional)

This step depends on your needs. The protein sequence you wish to model is termed the “target sequence.” Pick a length of target sequence that reduces background interference. In some cases, you don’t need the entire protein structure, and sticking with the essential regions (domains) makes the process of modeling easy-peasy.
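
As a small illustration of trimming a target down to one domain, the sketch below reads a FASTA record and cuts out a residue range. The sequence, the header, and the domain boundaries (residues 5 to 14) are all made up; in practice you would take the boundaries from a domain annotation of your own protein.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {header: sequence}."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end, numbered from 1 and inclusive."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("region lies outside the sequence")
    return sequence[start - 1:end]


# Hypothetical FASTA record.
fasta_text = """>example_protein hypothetical sequence
MKTAYIAKQRQISFVKSHFSRQ
LEERLGLIEVQAPILSRVGDGT
"""
records = parse_fasta(fasta_text)
full_length = records["example_protein hypothetical sequence"]
domain = extract_region(full_length, 5, 14)
print(domain)
```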

2. Template Protein Recognition

The “template protein” is the reference protein structure. To find it, you compare the target sequence against the sequences of all known structures in the protein structure databases, using simple sequence alignment software. Select the protein(s) with the highest identity to the target sequence as your template(s).
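
If you run the search with BLAST and save the hits in its tabular format (`-outfmt 6`), picking templates can be sketched as below. The hit IDs and numbers are invented for illustration, the 30% cutoff comes from the homology modeling rule of thumb above, and the E-value cutoff of 1e-5 is an assumption you should adjust to your own case.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject_id: str
    percent_identity: float
    alignment_length: int
    evalue: float


def parse_blast_tabular(text: str) -> list[Hit]:
    """Parse BLAST tabular output (default 12 tab-separated columns)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # columns: 1 = subject id, 2 = % identity, 3 = alignment length, 10 = E-value
        hits.append(Hit(cols[1], float(cols[2]), int(cols[3]), float(cols[10])))
    return hits


def pick_templates(hits, min_identity=30.0, max_evalue=1e-5):
    """Keep hits that pass both cutoffs, best identity first."""
    kept = [h for h in hits
            if h.percent_identity >= min_identity and h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: h.percent_identity, reverse=True)


# Hypothetical hits; IDs and values are made up.
rows = [
    ("target", "tmplA_A", "45.2", "210", "110", "3", "1", "208", "5", "214", "2e-50", "180"),
    ("target", "tmplB_A", "62.8", "205", "74", "1", "3", "206", "1", "205", "1e-80", "260"),
    ("target", "tmplC_B", "27.5", "120", "80", "4", "10", "125", "40", "160", "0.003", "38.1"),
]
blast_output = "\n".join("\t".join(row) for row in rows)
templates = pick_templates(parse_blast_tabular(blast_output))
for hit in templates:
    print(hit.subject_id, hit.percent_identity, hit.evalue)
```

Identity alone is not the whole story: a hit that covers only a short stretch of the target may be a worse template than a slightly less identical hit that covers all of it, so it is worth looking at the alignment length too.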

3. Preparation of Template Protein (Optional)

Now, you have to decide which parts of the template structure to use, because the database entry you retrieve may contain many constituents that are irrelevant to your model. These may be multiple chains, water molecules, ligands, and so on. Which of them you remove depends on what you need.
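
A very simple version of this clean-up can be sketched in a few lines: keep only the ATOM records of the chain you want and drop everything else, including HETATM records (waters and ligands). The coordinates below are made up, and this sketch ignores details such as alternate locations or cofactors you might actually want to keep.

```python
def clean_template(pdb_text: str, chain_id: str) -> str:
    """Keep only ATOM records of one chain; drop HETATM records and other chains."""
    kept = []
    for line in pdb_text.splitlines():
        # In the PDB format the chain ID sits in column 22 (index 21).
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"


# Hypothetical PDB fragment: two chains plus one water molecule.
pdb_text = """\
ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N
ATOM      2  CA  MET A   1      12.560  13.329   2.250  1.00 20.00           C
ATOM      3  N   GLY B   1      30.120   8.450  -4.310  1.00 25.00           N
HETATM    4  O   HOH A 101      15.000  10.000   5.000  1.00 30.00           O
"""
cleaned = clean_template(pdb_text, "A")
print(cleaned)
```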

4. Sequence Alignment

Next, you need to align your target and template protein sequences using a sequence alignment algorithm. This is a very important step in protein modeling, because the model is built from this alignment, so an appropriate alignment algorithm is a requisite for bagging the right model of your protein. The alignment shows which residues of the target correspond to which residues of the template and highlights the identical regions.
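
To show what a sequence alignment algorithm does under the hood, here is a minimal sketch of Needleman-Wunsch global alignment. It assumes simple match/mismatch scores and a linear gap penalty, all chosen arbitrarily for illustration; real alignment tools use substitution matrices and more elaborate gap penalties, so treat this as a teaching example rather than something to build models from.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment of a and b; returns (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[len(a)][len(b)]


# Hypothetical target and template fragments.
aligned_target, aligned_template, total = needleman_wunsch("ACDEFG", "ACEFG")
print(aligned_target)
print(aligned_template)
print("score:", total)
```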

5. Prediction of Secondary Structure

The secondary structure of the target protein is predicted with secondary structure prediction tools (e.g., tools available through the ExPASy portal). You then compare the predicted secondary structure of the target with that of the template and check that the two agree.

6. Building Homology Models

With a template protein and an alignment, you can build a model of the target protein. Inspect the resulting models by visualizing their 3D structures. Then, if any loops are present in the structure, you can further optimize them with loop modeling.

7. Loop Modeling

To optimize loops present in the model, use loop modeling software, such as OMIC server, ModLoop, or others. This will improve the accuracy of the structure.

8. Model Optimization and Validation

Finally, once you have a tentative model, you will need to refine it toward its near-native structure via energy minimization. You can then check the result with protein model validation tools and verification servers. Such validation tests show whether your protein model is energetically satisfactory [3].
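
One check behind many validation tools is the Ramachandran plot (the subject of reference 3), which is based on the backbone dihedral angles phi and psi of each residue. The sketch below shows how such an angle can be computed from four atom positions. The function names and the coordinates are made up for illustration, and a real validation server does far more than this.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees around the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central axis.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def phi_psi(prev_c, n, ca, c, next_n):
    """phi uses C(i-1), N, CA, C; psi uses N, CA, C, N(i+1)."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)


# Hypothetical backbone atom coordinates (in angstroms) around one residue.
phi, psi = phi_psi(
    prev_c=(1.0, 1.3, 0.2),
    n=(0.0, 0.5, 0.0),
    ca=(0.3, -0.9, 0.1),
    c=(1.7, -1.2, -0.4),
    next_n=(2.1, -2.4, -0.1),
)
print(f"phi = {phi:.1f}, psi = {psi:.1f}")
```

Plotting phi against psi for every residue gives the Ramachandran plot; residues that land far from the usual helix and sheet regions are worth a closer look.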

Hmm… What to Do with Your Protein Model?

  1. Molecular Interaction Studies – Molecular Docking.
  2. Computer-aided drug discovery.
  3. Improve the results of Quantitative Structure-Activity Relationship (QSAR) techniques.

Have any questions? Comment below!


  1. Krieger E, Nabuurs SB, Vriend G. (2003) Homology modeling. Methods of Biochemical Analysis. 44:509–24.
  2. Rodriguez R, Chinea G, Lopez N, Pons T, Vriend G. (1998) Homology modeling, model, and software evaluation: three related resources. Bioinformatics. 14(6):523–8.
  3. Hooft RW, Sander C, Vriend G. (1997) Objectively judging the quality of a protein structure from a Ramachandran plot. Bioinformatics. 13(4):425–30.
Image credit: User:A2-33
