Setting yourself up for success with recombinant protein expression is important if you plan to use your protein of interest in downstream assays with any regularity. While the time investment up front can seem large, it will pale in comparison to the extra time you’ll spend purifying (and re-purifying and re-purifying…) your protein of interest with a sub-optimal protocol.

Put in the time now to optimize protein solubility so you don’t have to suffer later!

With this in mind, here are some considerations and tips for selecting the optimal conditions for recombinant protein expression and purification.

What’s in a tag?

Choosing the “right” tag can be a daunting task. The ideal tag will help you obtain soluble protein, simplify your purification scheme, and not interfere with downstream applications. However, given the plethora of vectors available with protease cleavage sites (Factor Xa, TEV protease, etc.), your main concern should be which tag will aid your protein’s solubility, expression, and purification. You can always cleave the tag later if you’re worried about how it might interfere with your protein’s behavior.

Certain tags, such as glutathione-S-transferase (GST) or maltose binding protein (MBP), are thought to enhance protein solubility. A possible drawback to these tags is their size: about 26 kDa for GST and 41 kDa for MBP.

Other tags (hexahistidine, FLAG, Strep-tag) are comparatively small and are often chosen for the ease with which they can be captured and purified.
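
To get a feel for how much a large tag adds, a rough estimate of the fusion size can help when you later look for the band on a gel. The sketch below is only an illustration and not part of the original protocol: it uses the tag sizes quoted above and assumes an average of about 0.11 kDa per amino acid residue, a common rule of thumb. The function name and the 350-residue example protein are hypothetical.

```python
# Approximate tag masses in kDa, taken from the text above.
TAG_KDA = {"GST": 26.0, "MBP": 41.0}

# Rule-of-thumb average mass per amino acid residue (assumption, ~110 Da).
KDA_PER_RESIDUE = 0.11


def fusion_size_kda(n_residues, tag):
    """Estimate the mass (kDa) of a tagged protein from its length in residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    if tag not in TAG_KDA:
        raise ValueError(f"unknown tag: {tag}")
    return n_residues * KDA_PER_RESIDUE + TAG_KDA[tag]


# Hypothetical 350-residue protein of interest.
for tag in TAG_KDA:
    size = fusion_size_kda(350, tag)
    print(f"{tag} fusion: roughly {size:.1f} kDa")
```

The estimate is coarse (real proteins deviate from the average residue mass and can run anomalously on SDS-PAGE), so treat it as a guide for where to look, not as a measurement.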

Pick a few different tags and then get cloning!

A word to the wise: if at all possible, use a C-terminal tag. Because the tag is translated last, anything you capture through it has been translated to completion, so you avoid the prematurely terminated products that can co-purify with an N-terminal tag.

Obtaining soluble protein

A common problem with recombinant protein expression is that eukaryotic proteins tend to be insoluble when expressed in bacteria. Denatured, aggregated, and/or truncated forms of your protein accumulate in insoluble inclusion bodies that require further purification and refolding, which can be inefficient and incomplete.

Here are three critical parameters for obtaining soluble, native protein:

1.  IPTG concentration

Chances are, protein expression in your host is controlled by the lac operon system. Detailed explanations of this system can be found elsewhere, but suffice it to say, you add isopropyl β-D-1-thiogalactopyranoside (IPTG), a lactose analog, to control when your protein is expressed. Induce expression during the log phase of growth, i.e. when the optical density of the culture at a wavelength of 600 nm (OD600) reaches 0.4-0.6.

If your goal is soluble protein, it is best not to overwhelm the protein folding machinery of the cell. It has been posited that insoluble protein may be partly due to the limited availability of chaperones (1). Thus, using high concentrations of IPTG (>1 mM) may not always produce the best results. Try a few different concentrations and see which one gives you the highest yield of soluble protein.
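
When you test several IPTG concentrations, the volume of stock to add follows from the usual dilution relation C1 x V1 = C2 x V2. The sketch below is a minimal helper under stated assumptions: a 1 M IPTG stock, a 10 mL test culture, and an illustrative set of final concentrations that you should adjust to your own system. The function name is hypothetical.

```python
def iptg_stock_volume_ul(culture_ml, final_mm, stock_mm=1000.0):
    """Volume of IPTG stock (in microliters) needed to reach final_mm in the culture.

    Uses C1 * V1 = C2 * V2 and ignores the small volume added by the stock itself.
    stock_mm defaults to 1000 mM (a 1 M stock), which is an assumption.
    """
    if culture_ml <= 0 or stock_mm <= 0 or final_mm < 0:
        raise ValueError("volumes and concentrations must be positive")
    culture_ul = culture_ml * 1000.0
    return culture_ul * final_mm / stock_mm


# Illustrative final concentrations in mM (assumed, not prescribed by the text).
for final_mm in (0.1, 0.25, 0.5, 1.0):
    volume = iptg_stock_volume_ul(culture_ml=10, final_mm=final_mm)
    print(f"{final_mm} mM final: add {volume:.2f} uL of 1 M IPTG to 10 mL")
```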

2.  Temperature

While you can occasionally isolate proteins in soluble form through growth and induction at the typical temperature of 37°C, more often than not, solubility is enhanced at lower temperatures. Set up some test inductions at 30°C, 25°C, and 18°C.

Note: If first growing at 37°C, let the culture cool to the induction temperature before adding IPTG—heat shock proteins may be induced at higher temperatures, decreasing your yields of soluble, correctly folded protein.

3.  Induction time

Longer is not always better. This is especially true for proteins that are toxic to the host. During your test inductions, take a sample every few hours and check expression of your protein of interest. Typical induction times vary by temperature: at 37°C, try 4 hours; at 30°C, try 5-6 hours; anything below 25°C should be allowed to grow overnight.
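
The temperature and time guidance above can be written down as a small lookup so that every test induction gets a consistent plan. This is a rough sketch and not a validated protocol: the text does not say what to do between 25°C and 30°C, so treating that range as overnight is an assumption, as is the 2-hour sampling interval. Both function names are hypothetical.

```python
def suggested_induction(temp_c):
    """Return the suggested induction duration for a temperature in degrees C."""
    if temp_c >= 37:
        return "4 h"
    if temp_c >= 30:
        # Temperatures between 30 and 37 are grouped with 30 (assumption).
        return "5-6 h"
    # The text gives overnight for anything below 25; grouping 25 up to
    # (but not including) 30 here as well is an assumption of this sketch.
    return "overnight"


def sampling_times(total_hours, interval_hours=2):
    """Time points (hours after induction) at which to pull a sample."""
    if total_hours <= 0 or interval_hours <= 0:
        raise ValueError("hours must be positive integers")
    times = list(range(0, total_hours + 1, interval_hours))
    if times[-1] != total_hours:
        times.append(total_hours)  # always include the final time point
    return times


for temp_c in (37, 30, 25, 18):
    print(f"{temp_c} C: induce for {suggested_induction(temp_c)}")
print("Sample a 6 h induction at:", sampling_times(6))
```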

Pilot it

Combining all of these factors in a pilot experiment is a great idea.

Clone the coding sequence of your protein of interest into 2-3 vectors with different tags (for example, hexahistidine, GST, and MBP). Then pick three different induction times and temperatures and a few IPTG concentrations. Test everything in small (10-20 mL) cultures.
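
A pilot like this quickly becomes a grid of conditions, and it helps to enumerate the grid before you start labeling tubes. The sketch below is one way to do that, assuming three tags, the three test temperatures suggested earlier, and three IPTG concentrations chosen purely for illustration. The condition names and values are hypothetical placeholders.

```python
from itertools import product

# Assumed pilot parameters; swap in your own constructs and conditions.
tags = ["His6", "GST", "MBP"]
temperatures_c = [30, 25, 18]
iptg_mm = [0.1, 0.5, 1.0]

# Every combination of tag, temperature, and IPTG concentration.
conditions = [
    {"tag": tag, "temp_c": temp, "iptg_mm": conc}
    for tag, temp, conc in product(tags, temperatures_c, iptg_mm)
]

print(f"{len(conditions)} small-scale cultures to set up")
for number, condition in enumerate(conditions, start=1):
    print(f"tube {number:02d}: {condition['tag']}, "
          f"{condition['temp_c']} C, {condition['iptg_mm']} mM IPTG")
```

Writing the grid out makes the cost of each added variable obvious: three values each for three variables already means 27 cultures, so trim the list to what you can realistically process in one day.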

After induction, prepare cleared lysates from each sample and analyze the (1) uninduced sample, (2) induced sample(s) at different time points, (3) lysate, (4) soluble fraction, and (5) insoluble fraction by SDS-PAGE. Make sure you equalize the loading of each lane, taking into account the growth of the culture over time (for example, by normalizing each sample to its OD600).
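
One common way to equalize loading is to harvest a culture volume that is inversely proportional to the OD600 at each time point, so every sample contains roughly the same amount of cells. The sketch below assumes that OD600 scales linearly with cell density (reasonable only within the linear range of your spectrophotometer) and uses made-up OD600 readings. The function name and the reference of 1 mL at OD600 = 1.0 are hypothetical choices.

```python
def normalized_sample_ml(od600, reference_od=1.0, reference_ml=1.0):
    """Culture volume (mL) holding as many cells as reference_ml at reference_od."""
    if od600 <= 0:
        raise ValueError("od600 must be positive")
    return reference_od * reference_ml / od600


# Hypothetical OD600 readings at 0, 2, and 4 hours after induction.
readings = {0: 0.5, 2: 1.2, 4: 2.0}

for hours, od600 in readings.items():
    volume = normalized_sample_ml(od600)
    print(f"t = {hours} h, OD600 = {od600}: harvest {volume:.2f} mL")
```

Pellet each normalized sample and resuspend it in the same volume of loading buffer, so equal volumes on the gel correspond to roughly equal cell amounts.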

Nothing worked: What now??

While the steps outlined above should give you a good chance at success, you might find that you are still unable to obtain soluble protein.

A full treatment is beyond the scope of this article, but some general solutions are to:

  • Express only a portion of your protein; smaller domains are often soluble even if the full-length protein is not
  • Use a different expression system for your protein (e.g. baculovirus or cell culture). This may even be desirable to retain post-translational modifications that may be important in your protein of interest.
  • Purify under denaturing conditions. While not always the best way, purifying under denaturing conditions (e.g. in high concentrations of chaotropic agents such as guanidine salts or urea) is sometimes the only solution.

Often these conditions can actually give you a purer sample as the soluble contaminating proteins are removed. The only issue is refolding; however, there are many protocols for refolding of denatured proteins isolated from inclusion bodies.

I hope the strategies described here can help you out with your protein purifications—feel free to comment below with your own techniques/tips!


  1. Thomas, J.G., Baneyx, F. (1996). Protein misfolding and inclusion body formation in recombinant Escherichia coli cells overexpressing heat-shock proteins. J. Biol. Chem. 271:11141-11147.





  1. The Expresso Solubility and Expression Screening System from Lucigen is a cool product that enables the construction of 8 different fusion proteins from a single PCR product. The Expresso cloning trick works really well. You can easily create and test eight fusion proteins in just a few days.

    The kit includes GST, MBP, His as well as 5 less commonly used solubility tags. There is no substitute for empirical data when trying to determine the best fusion partner for your protein of interest.

  2. Another useful trick is to use codon optimization tables and express your protein of interest from E. coli-friendly codons. This needs to be done before the first cloning step, and it entails the extra cost of obtaining the codon-optimized gene through gene blocks or gene synthesis. That cost means this trick is useful only for domains and small proteins (15-20 kDa at most).

    While typically marketed as a tool to enhance protein _production_, I have seen this method work for getting some soluble protein compared to zero protein from the original cDNA.
