Getting the most out of your human DNA methylation studies

Written by: Kirsten Hogg

last updated: January 31, 2022

The field of epigenetics is growing rapidly, and given the strong links between epigenetic state and disease, there is a clear need to study markers such as DNA methylation in humans. This article outlines some of the main factors you should take into account when studying DNA methylation in human tissues. Here goes:

Biological considerations

1. Choice of tissue. In humans, tissue availability is limited. The options include placenta, amnion, epithelial cells from a cheek swab, whole blood or cord blood. The major consideration here is heterogeneous cell populations. DNA methylation can vary vastly by cell or tissue type; therefore altered methylation in these mixed populations must be interpreted with caution. Blood samples contain a mix of haematopoietic and immunological cells that may be influenced by physiological events, and those events could themselves be related to the outcome studied. For example, elevated stress may lead to altered immunological cell profiles. Cord blood contains varying amounts of progenitor cells and nucleated red blood cells. Altered DNA methylation levels may therefore represent changed proportions of cell populations rather than a change within any one cell type. Also, DNA methylation in blood might not necessarily extrapolate to tissues of interest such as the brain or metabolic tissue. However, if the goal is to identify a ‘biomarker’ for disease risk, then development of such assays has clear clinical application.

2. Developmental age. DNA methylation levels can vary throughout life. This is particularly true in early development. Cases should be closely matched by gestational age. For example, comparing DNA methylation between first or second trimester and term placentas is likely to impart more information about development-specific changes than about any pathology being investigated.

3. Variation within the sample. There can be considerable interindividual variation in DNA methylation levels, which may be influenced by ethnicity or sex. Stratification of cases by these variables is an obvious consideration.
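To make the cell-mixture point in item 1 concrete, here is a minimal sketch of how a shift in cell proportions alone can move a bulk methylation measurement. The cell types, proportions and methylation values are hypothetical numbers chosen for illustration, and the bulk signal is assumed to be a simple proportion-weighted average.

```python
def bulk_methylation(proportions, methylation):
    """Proportion-weighted average of per-cell-type methylation (%)."""
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("cell-type proportions must sum to 1")
    return sum(proportions[cell] * methylation[cell] for cell in proportions)


# Hypothetical methylation (%) at one CpG site. These values are the SAME
# in both samples: nothing changes within either cell type.
methylation = {"lymphocytes": 80.0, "neutrophils": 20.0}

# Hypothetical cell-type proportions in two blood samples.
sample_a = {"lymphocytes": 0.30, "neutrophils": 0.70}
sample_b = {"lymphocytes": 0.40, "neutrophils": 0.60}

bulk_a = bulk_methylation(sample_a, methylation)
bulk_b = bulk_methylation(sample_b, methylation)

# The measured difference comes entirely from the change in proportions.
print(f"Sample A: {bulk_a:.1f}%  Sample B: {bulk_b:.1f}%  "
      f"difference: {bulk_b - bulk_a:.1f} percentage points")
```

In this toy case a ten-point shift in lymphocyte proportion produces a six-point difference in measured methylation, even though neither cell type has changed.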

Interpreting the results

4. Genomic region of interest. DNA methylation occurs at CG dinucleotides (CpG sites) and is measured on a scale of 0-100%, where <20% is low-methylated and >80% is highly methylated. Typically gene promoters are unmethylated and gene bodies and intergenic regions are methylated. Increasingly, the relevance of DNA methylation outside the promoter region is recognised. Differential DNA methylation observed within exons and introns may be mapped to regions of activating histone marks, transcription factor binding sites or enhancers, and could reflect important regulatory regions. Whether you are identifying regions for follow-up based on large-scale genome-wide DNA methylation analyses or selecting candidate regions based on the literature, online bioinformatic tools such as Ensembl or the UCSC Genome Browser are useful for predicting regions that may be of significance.

5. Degree of DNA methylation change. The degree of gene promoter methylation can be inversely correlated with mRNA expression, and a change in methylation is therefore often interpreted to reflect a change in gene expression. However, it is not clear whether modest changes in DNA methylation are biologically relevant. For example, a 1-3% shift in DNA methylation at a gene promoter that is normally <10% methylated may not represent a biologically significant finding. It is important that such findings are interpreted cautiously in the absence of direct functional evidence pertaining to the change. Additionally, reporting raw DNA methylation values is most informative, as fold changes can be misleading.

6. Technical noise. While there are methods to precisely measure DNA methylation at individual CpG sites (such as pyrosequencing), depending on the assay, there can be technical background noise of up to 5%. Therefore it is crucial to randomise samples between runs to reduce technical variance. Furthermore, if selecting follow-up gene regions from a wider array-based platform, a DNA methylation difference threshold of at least 5% between cases and controls should be set.

7. Analysing the data. Given the regulatory significance of the CpG-rich gene promoter, this region is often assessed, resulting in assays containing multiple CpG sites. While it is beneficial to have a broad picture of DNA methylation patterns in your region of interest, these individual CpG sites should not be classed as independent entities, and multiple comparison testing should be employed to detect ‘real’ changes between groups rather than differences that are bound to arise if you assess enough sites. If the DNA methylation pattern is similar across your region, it is likely that the CpG methylation values are correlated. In these instances it is just as informative to average the sites contained within this region to give an overall picture, and one that doesn’t require additional multiple comparison testing.

From this it is clear that careful selection of biological samples, methodological criteria and interpretation of data is needed to acquire meaningful and potentially useful data to add to the ever-growing field of epigenetic research. This is particularly true of DNA methylation, and its apparent role in, or as a biomarker of, health and disease.
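As a rough illustration of the two options described in item 7, the sketch below first adjusts per-site p-values for multiple comparisons and then collapses correlated CpG sites into a regional average. Bonferroni correction is used only because it is the simplest adjustment to show; the article does not prescribe a particular method. All p-values and methylation values are hypothetical, and the 5% cut-off is the technical-noise threshold from item 6.

```python
from statistics import mean


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Option 1: test each CpG site separately, then correct for multiple comparisons.
# Hypothetical unadjusted p-values for 5 CpG sites in one promoter assay.
site_p_values = [0.04, 0.03, 0.20, 0.004, 0.45]
adjusted = bonferroni(site_p_values)
significant_sites = [i for i, p in enumerate(adjusted) if p < 0.05]

# Option 2: if the sites are correlated, average them per sample and compare
# a single regional value, which needs no further multiple-comparison step.
# Hypothetical methylation (%) at the same 5 sites, one list per sample.
cases = [[12, 14, 13, 15, 12], [11, 13, 14, 14, 13]]
controls = [[6, 7, 8, 7, 6], [7, 8, 7, 8, 6]]

case_region = mean(mean(sample) for sample in cases)
control_region = mean(mean(sample) for sample in controls)
region_diff = case_region - control_region

# Compare against the 5% technical-noise threshold discussed in item 6.
exceeds_noise = abs(region_diff) >= 5.0

print("Sites significant after Bonferroni:", significant_sites)
print(f"Regional difference: {region_diff:.1f} percentage points "
      f"(above 5% threshold: {exceeds_noise})")
```

Three of the five sites have unadjusted p-values below 0.05, but only one survives correction, which is the kind of difference the multiple-comparison warning in item 7 is about. The regional average is reported in raw percentage points rather than as a fold change, in line with item 5.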

Kirsten gained a PhD in Reproductive Biology from the University of Edinburgh.
