Thousands upon thousands of genetic variants are now associated with almost every disease and trait you can think of, from cancers to blood pressure, intelligence, height, weight… and many more! This is largely thanks to the advent of genome-wide association studies (GWAS). However, the vast majority of genetic loci associated with these traits lie in non-coding regions, so they do not alter protein structure or function directly. Instead, many of these variants are thought to exert their influence by modulating gene expression, and variants that do so are known as expression quantitative trait loci (eQTLs). Here are five ways in which you can investigate whether or not your genetic region acts as an eQTL:
Check GTEx Portal
The Genotype-Tissue Expression (GTEx) database contains eQTL data from dozens of different human tissues, drawn from over 7,000 samples to date. It gives you an idea of which genes are expressed in your tissue of interest and whether there are eQTLs operating on your gene. The database isn’t exhaustive, so don’t be disheartened if your search draws a blank. Your eQTL might be found only in a particular cell type or disease state. If your search is fruitful and you do come across a hit, the website gives you a box plot of expression by genotype, which you can incorporate into your presentation or manuscript. Voilà!
HaploReg is useful over and above GTEx because it predicts the functional activity of your selected SNPs and their surrounding region. So whether or not your eQTL search on GTEx is successful, it’s wise to investigate the region further with HaploReg. This will give you a better idea of how likely the SNPs are to modulate gene expression by altering transcription factor binding. It also gives you a better handle on the nature of the surrounding chromatin, such as its accessibility, and on the wider genetic and epigenetic landscape. Crucially, the website presents all of the SNPs that are in linkage disequilibrium with your trait-associated SNP. Any one of these linked SNPs could be the functional one!
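If you have genotype data of your own, you can get a rough feel for which SNPs travel together before (or after) looking them up. The sketch below is only an illustration, not how HaploReg computes linkage disequilibrium: it assumes unphased genotypes coded as alternate-allele dosages (0, 1 or 2) and uses the squared Pearson correlation of those dosages as an approximation of r². The function names and the 0.8 threshold are hypothetical choices.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a SNP with no variation has undefined correlation")
    return sxy / sqrt(sxx * syy)


def ld_r2(dosages_1, dosages_2):
    """Approximate LD r^2 as the squared correlation of genotype dosages."""
    return pearson_r(dosages_1, dosages_2) ** 2


def proxies(lead_dosages, candidates, threshold=0.8):
    """Names of candidate SNPs whose approximate r^2 with the lead SNP
    is at or above the (assumed) threshold, in alphabetical order."""
    return sorted(
        name
        for name, dosages in candidates.items()
        if ld_r2(lead_dosages, dosages) >= threshold
    )


if __name__ == "__main__":
    # Made-up dosages for eight samples; the SNP names are placeholders.
    lead = [0, 1, 2, 1, 0, 2, 1, 0]
    others = {
        "snp_a": [0, 1, 2, 1, 0, 2, 1, 0],  # identical to the lead SNP
        "snp_b": [2, 1, 0, 1, 2, 0, 1, 2],  # same pattern, alleles flipped
        "snp_c": [0, 0, 1, 2, 1, 0, 2, 1],  # unrelated pattern
    }
    print(proxies(lead, others))
```

Squaring the correlation means a SNP whose alleles are simply coded the other way round still counts as a proxy, which is usually what you want when assembling a list of linked SNPs to follow up.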
RegulomeDB provides information on transcriptional regulatory activity associated with specific SNPs in a range of different cell lines and cell types. Search here using all of the SNPs within your linkage disequilibrium block to further investigate their functionality. After searching, each SNP is allocated a RegulomeDB score, which can range from 1 to 6 (or No Data). A score of 1 means the SNP is most likely to affect transcription factor binding, and a score of 6 means it is least likely to. After checking the scores, look to see whether transcription factors are predicted to bind the region in relevant cell types. You should also consider the chromatin structure and histone modifications in those cell types. This is very useful for building up a body of evidence.
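Once you have looked up every SNP in your linkage disequilibrium block, it helps to rank them so that the most promising candidates come first. Here is a minimal sketch, assuming you have copied the scores into a dictionary by hand. The SNP names are placeholders, and the handling of letter suffixes (such as "1a") is an assumption about how the scores are written rather than part of the description above.

```python
def score_key(score):
    """Turn a score string into a sortable key; lower sorts first.

    "No Data" (or an empty string) is pushed to the end of the ranking.
    """
    text = score.strip().lower()
    if text in ("", "no data"):
        return (99, "")
    digits = "".join(ch for ch in text if ch.isdigit())
    suffix = "".join(ch for ch in text if ch.isalpha())
    if not digits:
        raise ValueError(f"unrecognised score: {score!r}")
    return (int(digits), suffix)


def rank_snps(scores):
    """Return SNP names ordered from most to least likely to be functional.

    Ties are broken alphabetically by SNP name.
    """
    return sorted(scores, key=lambda snp: (score_key(scores[snp]), snp))


if __name__ == "__main__":
    # Placeholder names and made-up scores, for illustration only.
    my_block = {
        "snp_a": "4",
        "snp_b": "1b",
        "snp_c": "No Data",
        "snp_d": "2a",
        "snp_e": "1a",
    }
    for snp in rank_snps(my_block):
        print(snp, my_block[snp])
```

The ranking is only a starting point: a low score still needs to be checked against transcription factor binding and chromatin marks in a cell type relevant to your trait.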
For information on microRNA binding sites, visit the MicroRNA Target Prediction and Functional Study Database (miRDB). Search this database for your gene of interest and you will get a list of all the miRNAs predicted to bind somewhere within the 3’ UTR of the mRNA. The miRNAs are ranked by a target score from 0 to 100, where 100 means binding is most likely. Follow the links to find the predicted binding sites within the sequence, then check whether any of your SNPs fall within them and follow those up experimentally.
Trawl the Literature
Every trait or disease is different and affects its own set or subset of cells or tissues. To determine whether an eQTL operates at your locus and in your cell or tissue type, search PubMed for published papers. Alternatively, use the Expression Atlas at the European Bioinformatics Institute to find datasets from groups that have carried out large-scale genotyping and RNA-sequencing studies. As a condition of publishing in the big journals, these large datasets must be made publicly available. Great news if you can find a relevant one!
Measure Your eQTL Yourself!
Of course, it is lovely when you find data elsewhere that supports your hypothesis or characterizes your eQTL. This is good in the sense that you have ready-made evidence in the bag, but a scientist should always endeavour to replicate findings themselves. Doing so carries more weight when it comes to sharing your data at conferences or in papers. It also helps ensure that previous findings are not artefactual. DIY eQTL analysis is relatively easy: genotype your samples, then measure expression of your gene by qPCR or pyrosequencing and compare it across genotypes. Remember, the underlying mechanism could be pre- or post-transcriptional, so make sure you consider this as well.
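Once you have genotypes and expression values for the same samples, the simplest test is to regress expression on genotype. The sketch below is one minimal way to do this, assuming an additive model in which genotypes are coded as the number of alternate alleles (0, 1 or 2) and expression is a normalized value such as log2 relative expression from qPCR. The function names and the data are made up for illustration, and the code reports only the effect size and variance explained, so a proper analysis would add a significance test and relevant covariates.

```python
def eqtl_regression(dosages, expression):
    """Least-squares fit of expression = intercept + slope * dosage.

    Returns (slope, intercept, r_squared). The slope is the estimated
    change in expression per additional copy of the alternate allele.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched dosages and expression for 3+ samples")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    if sxx == 0:
        raise ValueError("all samples share one genotype; nothing to test")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def mean_by_genotype(dosages, expression):
    """Mean expression per genotype group, handy for a quick box-plot check."""
    groups = {}
    for dosage, value in zip(dosages, expression):
        groups.setdefault(dosage, []).append(value)
    return {d: sum(v) / len(v) for d, v in sorted(groups.items())}


if __name__ == "__main__":
    # Made-up data: nine samples, three per genotype group.
    dosages = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    expression = [1.1, 0.9, 1.0, 1.6, 1.4, 1.5, 2.1, 1.9, 2.0]
    slope, intercept, r_squared = eqtl_regression(dosages, expression)
    print("effect per allele:", round(slope, 3))
    print("variance explained:", round(r_squared, 3))
    print("group means:", mean_by_genotype(dosages, expression))
```

Looking at the group means alongside the slope is a useful sanity check, since an additive model can hide a dominant or recessive pattern.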
Measuring eQTLs and identifying functional SNPs is currently a bottleneck in the genetics field. As a result, GWAS has not yet reached its full translational potential. These genetic effects are highly tissue-, cell type-, and disease-specific, so it is imperative that they are assessed in a relevant model. Identifying eQTLs and the mechanisms controlling them can be of huge benefit to drug target experts, so this is a very worthwhile pursuit. Happy eQTL characterizing!