Variations on the ChIP-seq Theme and Challenges of Befriending Large Datasets

ChIP-seq has proved to be a remarkably powerful technique. With it and its related methods, we can generate large datasets in a matter of days, making our lives in the lab easier and more efficient.

ChIP-seq combines chromatin immunoprecipitation (ChIP) assays with massively parallel DNA sequencing. This makes it possible to map, across the whole genome, where proteins bind to DNA and where epigenetic modifications are found. Humans are not only their genome but also their epigenome, after all. Unlike arrays and other approaches used to investigate the epigenome, which are inherently biased because they require probes derived from known sequences, ChIP-seq does not require prior knowledge of the sequences involved!

The ChIP-Seq Variations

Several innovative techniques are now available for analyzing the epigenome, each giving us different insights and information.

Classic ChIP-seq reveals the binding sites of specific transcription factors (TFs). In ChIP-seq, you use specific antibodies to extract DNA fragments bound to the target protein, either directly or through other proteins in a complex containing the target factor.
DNase-seq, Assay for Transposase-Accessible Chromatin-seq (ATAC-seq), and Formaldehyde-Assisted Isolation of Regulatory Elements-seq (FAIRE-seq) reveal regions of open chromatin rather than the binding sites of one specific protein.
- In DNase-seq, the DNase I endonuclease fragments the chromatin. Then, these fragments are selected by size and enriched.
- ATAC-seq is an alternative method to DNase-seq that uses an engineered Tn5 transposase to cleave DNA and tag it with sequencing adapters, a process called tagmentation.
MNase-seq identifies specifically positioned nucleosomes (1-4). Micrococcal nuclease (MNase) is an endo-exonuclease that progressively digests DNA until it reaches an obstruction, such as a nucleosome.

All of these techniques are useful in their own way, and each can produce millions of reads. Analyzing such a large dataset can be a big headache!

The Interpretations of Large Datasets

To obtain good results in these assays, keep several important factors in mind:

- the quality of your reagents (be it the antibody, the enzyme, or whatever else you are using)
- the amount of sample you input, as too much or too little may bias your experiment (and can even render it useless)
- the depth of sequencing
- the number of samples and the number of replicates

And, of course, a control should always be present in your experiment. Believe me, you will thank me for this when you are analyzing your data.

A ChIP-seq experiment may produce millions of short reads (depending on the organism and the experiment itself). So, after the bench work, you will need to analyze this massive amount of information. After data quality checks with a tool like FastQC, the first step is to align your reads to the reference genome using a standard aligner like Bowtie or BWA. You will obtain a “profile” of your reads, which you can then load into a genome browser such as the UCSC Genome Browser or IGV.

Your data should have “peaks.” These peaks are your signal: the regions where the enriched sequences pile up. The challenge is to call peaks correctly. Could a peak be an artifact? A product of a repetitive sequence? Derived from a GC-rich region? Or is it some other bias specific to the experiment itself?
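To make the idea of a read "profile" and its peaks more concrete, here is a minimal Python sketch. It is a toy illustration, not how real peak callers work: the read coordinates, the bin size, the `min_reads` cutoff, and the function names `coverage_profile` and `naive_peaks` are all assumptions made up for this example. It counts reads per fixed-width bin on one chromosome and merges adjacent bins that pass the cutoff.

```python
from collections import Counter


def coverage_profile(reads, bin_size):
    """Count reads per fixed-width bin, keyed by bin index.

    `reads` is a list of (start, end) alignment coordinates on one chromosome.
    Each read is assigned to the bin that contains its midpoint.
    """
    counts = Counter()
    for start, end in reads:
        midpoint = (start + end) // 2
        counts[midpoint // bin_size] += 1
    return counts


def naive_peaks(counts, bin_size, min_reads):
    """Merge adjacent bins holding at least `min_reads` reads into regions."""
    peaks = []
    for bin_index in sorted(counts):
        if counts[bin_index] < min_reads:
            continue
        start = bin_index * bin_size
        end = start + bin_size
        if peaks and peaks[-1][1] == start:
            # This bin touches the previous region, so extend that region.
            peaks[-1] = (peaks[-1][0], end)
        else:
            peaks.append((start, end))
    return peaks


# Hypothetical aligned reads: two dense clusters and one stray read.
reads = [
    (100, 150), (110, 160), (120, 170),
    (210, 260), (220, 270), (230, 280),
    (900, 950),
]
counts = coverage_profile(reads, bin_size=100)
peaks = naive_peaks(counts, bin_size=100, min_reads=3)
print(peaks)
```

A fixed cutoff like this cannot tell a true binding site from a repetitive or GC-rich region that simply attracts more reads, which is exactly why controls and dedicated peak-calling algorithms matter.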

Normalize Your Results

We need to normalize our results to find the real peaks. You can use different algorithms to detect peaks, but these methods often require careful tuning of several parameters to give good results. The choice of algorithm must be well thought out, so keep in mind the question you are trying to answer. Remember, different algorithms can give different results, even when applied to the same data. The lack of benchmark datasets makes it even harder to judge your results. So it is sometimes essential to apply several methods to your data: peaks that show up regardless of the method applied are more likely to be real signals.
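The idea of keeping only peaks that survive more than one method can be sketched as a simple interval intersection. This is a minimal illustration under stated assumptions: the two peak lists are invented, `overlap_intervals` is a hypothetical helper name, and each list is assumed to be sorted, non-overlapping, and restricted to a single chromosome.

```python
def overlap_intervals(peaks_a, peaks_b):
    """Return the overlapping parts of two sorted, non-overlapping peak lists.

    Each peak is a (start, end) tuple on the same chromosome.
    """
    i = j = 0
    shared = []
    while i < len(peaks_a) and j < len(peaks_b):
        start = max(peaks_a[i][0], peaks_b[j][0])
        end = min(peaks_a[i][1], peaks_b[j][1])
        if start < end:
            shared.append((start, end))
        # Move past whichever peak ends first; it cannot overlap anything later.
        if peaks_a[i][1] < peaks_b[j][1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical output of two different peak callers on the same data.
caller_a = [(100, 300), (500, 650), (900, 1000)]
caller_b = [(150, 320), (600, 700), (1200, 1300)]

consensus = overlap_intervals(caller_a, caller_b)
print(consensus)
```

In this toy case, the peaks at 900-1000 and 1200-1300 are each reported by only one caller, so they drop out of the consensus and would deserve a closer look in the genome browser before you trust them.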

Tips for ChIP-Seq

Here are some takeaway tips to make your ChIP-seq life easier:

- Use control groups in each experiment. Controls go through the same procedure as your samples, so you can compare their profiles and use the comparison to account for the fact that reads are not uniformly distributed across the genome.
- Sequence deeper. Deeper sequencing will also improve ChIP-seq performance, as long as you have a control to compare it with.
- Visualize your data. No matter how technological everything gets, analysis becomes much easier when you can see your alignments!
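As a rough illustration of why the control matters, the sketch below scales binned read counts from a ChIP sample and its control to reads per million, then takes their ratio. Everything here is an assumption for the example: the counts are invented, `fold_enrichment` is a hypothetical helper, and the pseudocount of 1.0 is an arbitrary choice to avoid dividing by zero. Real peak callers use more careful statistical models.

```python
def fold_enrichment(chip_counts, control_counts, pseudocount=1.0):
    """Per-bin ChIP/control ratio after scaling each library to reads per million.

    Both inputs are lists of read counts over the same genomic bins.
    """
    chip_total = sum(chip_counts)
    control_total = sum(control_counts)
    ratios = []
    for chip, control in zip(chip_counts, control_counts):
        chip_rpm = chip / chip_total * 1e6
        control_rpm = control / control_total * 1e6
        # The pseudocount keeps the ratio defined when the control bin is empty.
        ratios.append((chip_rpm + pseudocount) / (control_rpm + pseudocount))
    return ratios


# Hypothetical counts over four bins; the control was sequenced twice as deep.
chip_counts = [5, 50, 40, 5]
control_counts = [20, 20, 140, 20]

ratios = fold_enrichment(chip_counts, control_counts)
for bin_index, ratio in enumerate(ratios):
    print(bin_index, round(ratio, 2))
```

In this toy data, the third bin has a high raw count in the ChIP sample, but the control is high there too, so its ratio falls below 1. Only the second bin stands out once both libraries are on the same scale.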

References

1. Jason D Buenrostro, Paul G Giresi, Lisa C Zaba, Howard Y Chang, and William J Greenleaf (2013). Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nature Methods 10: 1213-1218. doi:10.1038/nmeth.2688
2. Jeremy M Simon, Paul G Giresi, Ian J Davis, and Jason D Lieb (2012). Using formaldehyde-assisted isolation of regulatory elements (FAIRE) to isolate active regulatory DNA. Nature Protocols 7: 256-267. doi:10.1038/nprot.2011.444
3. Kairong Cui and Keji Zhao (2012). Genome-wide approaches to determining nucleosome occupancy in metazoans using MNase-Seq. Methods Mol. Biol. 833: 413-419. doi:10.1007/978-1-61779-477-3_24
4. Lingyun Song and Gregory E. Crawford (2010). DNase-seq: A High-Resolution Technique for Mapping Active Gene Regulatory Elements across the Genome from Mammalian Cells. Cold Spring Harb. Protoc. doi:10.1101/pdb.prot5384

Cindy Duarte Castelão

Cindy gained a Master's degree in Molecular Biology and Genetics from the Universidade de Lisboa.
