Next-generation sequencing (NGS) is a powerful technique, one that now lies at the heart of many scientific projects. This power comes with some special challenges, however, and by recognizing them you can ensure that your NGS results are robust. No one wants to publish findings that other scientists fail to replicate, but unfortunately it happens all the time. If you want to avoid disseminating shaky findings, check out my five tips for making sure that your NGS results stand the test of time.
1. Start With a Good Sample
Although NGS is fairly forgiving, using high-quality DNA or RNA yields better results. So when designing your study, make sure to preserve your samples in a manner that is compatible with sequencing. Degradation may bias your results, complicating things down the road.
When constructing your sequencing library, keep an eye on DNA/RNA quality. RNA can be especially tricky to extract; if you are new to nucleic acid work, it is a good idea to practice extractions on samples that don’t really matter, until the quality of your products looks great.
In some cases you may be stuck with using non-ideal samples, such as formalin-preserved tissue blocks. This isn’t a deal breaker; however, it is important to be aware of the problems you are likely to run into so that you can address them. There are special methods that allow researchers to obtain good NGS results from degraded or non-optimal samples (check out this article, for example, or this one), so it is well worth your while to do a little research to uncover the techniques that will work best for your samples.
2. Replicate, Replicate, Replicate
Once upon a time, sequencing was so expensive that using software to try to filter out bad results was preferable to using sequencing replicates. Now that the costs of sequencing have gone down, however, replication deserves serious thought.
It may make sense to include biological replicates in your study. (Many researchers feel that technical replicates are less important, as most popular NGS platforms have good reproducibility.) Or, if you are investigating genetic variants linked to a health condition, including an entire replication sample is probably wise. Increasingly, high-impact journals are requiring replication samples for association studies, so incorporating replication into your study during the planning stages is ideal.
Using replicates wisely is a great way to increase your confidence in your data. Note, though, that replicates are not a substitute for sequencing depth.
3. Control for Multiple Comparisons
Often, analyzing an NGS dataset involves simultaneously testing thousands of hypotheses. Simply by chance, you are likely to obtain some statistically significant results from such a large pool when using conventional statistical tests. Therefore, controlling for multiple comparisons is essential. Many techniques for doing just this have been developed, so explore them and determine what makes the most sense for your study: potential solutions range from Bonferroni-adjusted p-values to false discovery rate (FDR) control. Proper analysis can prevent you from investing lots of time and money in following up on results that appear intriguing but turn out to be statistical noise.
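To make the difference between these two approaches concrete, here is a minimal Python sketch that adjusts a small set of p-values with both the Bonferroni correction and the Benjamini-Hochberg procedure (one common way of controlling the false discovery rate). The p-values, the 0.05 cutoff, and the function names are made up for illustration; a real analysis would normally rely on an established statistics package rather than hand-rolled code.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values never
    # increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative p-values only (not real data).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
alpha = 0.05  # assumed significance cutoff

bonf_hits = sum(q <= alpha for q in bonferroni(pvals))
bh_hits = sum(q <= alpha for q in benjamini_hochberg(pvals))
print("Significant after Bonferroni:", bonf_hits)
print("Significant after Benjamini-Hochberg:", bh_hits)
```

In this toy example the Bonferroni correction is the stricter of the two, which is the usual trade-off: fewer false positives at the cost of missing more true effects.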
4. Be Aware of the Errors to Which Your System Is Prone and Accommodate Them
Each sequencing platform has unique strengths and weaknesses. Whenever possible, choose the platform that is best-suited to your project, rather than the platform readily available in the lab next door. Once you have chosen a platform, you can then work to minimize the effects of its weaknesses on your results.
Looking for rare mutations, for example? If you are using Illumina’s HiSeq, keep in mind that A to T miscalls are most common, and that error rates of 0.5-1% in sequencing reads have been reported (and are “within the specified operating parameters” of the machine). Thus, it’s helpful to use software that filters out bad calls using quality scores, to make sure that a mutation of interest appears in multiple reads, to validate rare mutations with Sanger sequencing, and perhaps even to employ cross-platform replicates. Impress your future reviewers by being proactive about addressing potential problems!
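As a rough illustration of the first two safeguards, the Python sketch below converts a Phred quality score into an error probability and then checks whether a candidate base is supported by enough high-quality calls at a single position. The pileup data, the function names, and the thresholds (a minimum quality of 30 and at least three supporting reads) are assumptions chosen for the example, not recommendations for any particular platform; dedicated variant callers do considerably more than this.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability, 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def supported_variant(calls, alt_base, min_quality=30, min_reads=3):
    """Return True if alt_base appears in at least min_reads high-quality calls.

    calls is a list of (base, phred_quality) pairs covering one position.
    The default thresholds are illustrative assumptions.
    """
    # Discard base calls whose quality score falls below the cutoff.
    good = [base for base, q in calls if q >= min_quality]
    return good.count(alt_base) >= min_reads


# Hypothetical base calls at one position: (called base, Phred quality).
pileup = [("A", 38), ("T", 12), ("A", 35), ("T", 36), ("A", 40), ("T", 9), ("T", 33)]

print("Error probability at Q30:", phred_to_error_prob(30))
print("T supported by 3+ good reads:", supported_variant(pileup, "T"))
print("T supported by 2+ good reads:", supported_variant(pileup, "T", min_reads=2))
```

Here four reads call a T, but only two of those calls pass the quality cutoff, so the candidate fails the stricter three-read requirement and would be a natural target for Sanger validation.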
5. Be Skeptical
The best safeguard against results that don’t hold up over the long term is a healthy dose of skepticism toward your own results. Be your own harshest critic. Make sure that you understand the techniques and methods being used to generate and analyze your data. If you don’t, ask questions until you are satisfied with your knowledge. Before publishing your findings, discuss them with colleagues who have expertise in the area. Conferences and presentations are also a great way to get feedback on potential problems. By addressing any nagging doubts early, you can proudly present your data to the rest of the world.
Do you have any other tips for ensuring your NGS results are robust? Let us know in the comments below.