Downloaded from http://jmg.bmj.com/ on November 3, 2016 - Published by group.bmj.com
Methods
REVIEW
Studying the epigenome using next
generation sequencing
Chee Seng Ku,1,2 Nasheen Naidoo,2 Mengchu Wu,1 Richie Soong1
1
Cancer Science Institute of
Singapore, National University
of Singapore, Singapore
2
Centre for Molecular
Epidemiology, Department of
Epidemiology and Public Health,
Yong Loo Lin School of
Medicine, National University of
Singapore, Singapore
Correspondence to
Chee Seng Ku, Cancer Science
Institute of Singapore, National
University of Singapore,
Singapore;
[email protected]
Received 5 June 2011
Revised 24 June 2011
Accepted 25 June 2011
Published Online First
8 August 2011
ABSTRACT
The advances in next generation sequencing (NGS)
technologies have had a significant impact on
epigenomic research. The arrival of NGS technologies
has enabled a more powerful sequencing based
method (that is, ChIP-Seq) to interrogate whole
genome histone modifications, improving on the
conventional microarray based method (ChIP-chip).
Similarly, the first human DNA methylome was mapped
using NGS technologies. More importantly, studies of
DNA methylation and histone modification using NGS
technologies have yielded new discoveries and improved
our knowledge of human biology and diseases. The
concept that cytosine methylation was restricted to CpG
dinucleotides has only been recently challenged by new
data generated from sequencing the DNA methylome.
Approximately 25% of all cytosine methylation identified
in stem cells was in a non-CG context. The non-CG
methylation was more enriched in gene bodies and
depleted in protein binding sites and enhancers. The
recent development of third generation sequencing
technologies has shown promise for directly
sequencing methylated nucleotides and for
differentiating between 5-methylcytosine and
5-hydroxymethylcytosine. The importance of
5-hydroxymethylcytosine remains largely unknown, but
it has been found in various tissues.
5-hydroxymethylcytosine was particularly enriched at
promoters and in intragenic regions (gene bodies) but
was largely absent from non-gene regions in DNA from
human brain frontal lobe tissue. The presence of
5-hydroxymethylcytosine in gene bodies was more
positively correlated with gene expression levels. The
importance of studying 5-methylcytosine and
5-hydroxymethylcytosine separately for their biological
roles will become clearer when more efficient methods
to distinguish them are available.
INTRODUCTION
Epigenetics refers to the mechanisms that regulate
the cell type or tissue specific transcription or gene
expression levels without altering the DNA
sequences, through biochemical modifications such
as the addition of a methyl group to cytosines, and
post-translational modifications of histone
proteins. These epigenetic mechanisms play a critical role in the normal stages of cellular development and in processes such as embryogenesis, cell
differentiation (cell lineage specification), inactivation of the X chromosome and genomic imprinting
through modulation of transcriptional regulation in
a tissue specific manner. Abnormalities in these
epigenetic mechanisms have been linked to a wide
range of diseases.1-6
J Med Genet 2011;48:721-730. doi:10.1136/jmedgenet-2011-100242
The importance of exploring the epigenetics of
human complex diseases and traits is now being
increasingly recognised. Despite the success of
genome-wide association studies (GWAS) in identifying thousands of new genetic variants or loci for
complex diseases and traits, most of these identified
genetic variants confer modest effect sizes with an
OR <1.5. Therefore, these genetic variants collectively account for only a small fraction of the heritability of complex phenotypes.7-9 Epigenetic
mechanisms are a potential source of the missing
heritability. Theoretically, only the germline heritable
epigenetic events (inherited through meiosis) may
contribute towards the missing heritability
as opposed to the non-germline heritable components
(inherited through mitosis or somatic epigenetic
events). However, it is challenging to determine
which epigenetic events are germline heritable.
Comparing the epigenetic profile of the
‘disease epigenome’, such as the cancer epigenome,
with that of the epigenome from constitutional DNA
from the same individual, used as a reference, is required
to distinguish between germline and somatic
epigenetic events. This approach has also been applied
previously in cancer genome sequencing studies for
a proper assessment of somatic mutations.10 11
In addition, twin studies are a powerful
study design for investigating epigenetic
heritability.12-14 Molecular mechanisms of heritability may not be limited to DNA sequence differences.12 The finding that monozygotic twins have
very similar epigenetic profiles has indicated a high
epigenetic heritability.15 In addition, twin studies in
epigenetics would be able to: (1) address the extent
of and variation in epigenetic heritability across the
genome; and (2) investigate whether non-germline
heritable (or somatic) epigenetic events contribute
to complex phenotypes in monozygotic twins
discordant for the phenotypes. Epigenetic studies of
disease discordant monozygotic twins offer several
advantages in detecting disease related epigenetic
differences over a study design involving unrelated
disease cases and controls.16-21
The arrival of next generation sequencing (NGS)
technologies in 2005 led to a paradigm shift in the
approaches to investigating functional genomics
in the human genome.22-24 Functional genomics
aims to interrogate the functional elements and
regulatory mechanisms in the genome including
DNA methylation and histone modifications.
Before 2005, the genome-wide epigenetic (epigenomic) studies were dependent mainly on DNA
microarray methods, such as the ChIP-chip method (chromatin
immunoprecipitation based on microarray hybridisation of the
immunoprecipitated DNA fragments), to study histone modifications and ‘DNA methylation microarrays’. However, NGS
technologies were swiftly integrated into epigenomic studies
and several new and innovative sequencing based methods have
been developed together with bioinformatic and analytical
tools.25 26 For example, sequencing based approaches such as
ChIP-Seq (chromatin immunoprecipitation based on sequencing
of the immunoprecipitated DNA fragments) have replaced
microarray experiments in the study of histone modifications,
and the sequencing of the human whole genome DNA methylation (the DNA methylome) after bisulfite conversion has also
become feasible and replaced the microarrays targeting preselected CpG sites.27-30 Over the past several years, NGS technologies have contributed significantly to advances in
epigenomics and provided new alternatives to study the human
epigenome at a much greater resolution.31-35
The aim of this paper is to review the recent advances in
epigenomics using NGS technologies and the impact on our
understanding of human biology and diseases. We focus mainly
on DNA methylation and histone modification, which are
relatively well studied epigenetic mechanisms. We also discuss
the extent to which germline heritable epigenetic mechanisms
explain the missing heritability, and the challenges faced by
studies attempting to examine this largely unexplored epigenetic
component in complex diseases and traits. Finally, we also
highlight the new opportunities offered by third generation
sequencing (TGS) technologies in studying epigenomics.
DNA METHYLATION AND HISTONE MODIFICATIONS
Outlining genome-wide DNA methylation and histone modifications has important implications for our understanding of
normal biology, as well as how their molecular aberrations
contribute to the development of human diseases. The most
extensively studied epigenetic mechanism is DNA methylation
or, more specifically, cytosine methylation, which is the addition
of a methyl group at the carbon 5 position of cytosine through
DNA methyltransferase (DNMT) enzymes in the human genome.
These enzymes (DNMT3A and DNMT3B) catalyse the de novo
covalent addition of a methyl group to cytosine in newly
synthesised DNA. Cytosine methylation plays an important role
in transcriptional regulation, and hypermethylation of the CpG
islands (regions with a high density of CpG dinucleotides) in
promoter regions is frequently associated with gene
silencing. In differentiated cells, cytosine methylation occurs
almost exclusively in CpG dinucleotides, and most CpG sites in
the genome are methylated. However, CpG islands in the
promoter regions in the majority of human genes are not
methylated, indicating a transcriptionally active state (except in
imprinted genes, which are silenced by DNA methylation).27 35
Aberrations in DNA methylation can result in disease. For
example, the cancer genome is usually characterised by global
hypomethylation leading to genomic instability and focal
hypermethylation, such as in the promoter CpG islands of
tumour suppressor genes leading to gene silencing.5 6 36 37 The
concept that cytosine methylation was restricted to CpG
dinucleotides has only been recently challenged by new data
generated from sequencing the DNA methylome.38
Histone modifications are another important epigenetic
mechanism in transcriptional regulation. The nucleosome, the
fundamental unit of chromatin, is composed of two copies of
each of the four core histones (ie, H2A, H2B, H3, and H4),
around which 146 bp of DNA is wrapped. The N-terminal tails of
histone polypeptides can be modified by more than 100 different
post-translational modifications including methylation, acetylation, phosphorylation, and ubiquitination (collectively known
as histone modifications). Similarly to DNA methylation, the
histone modifications can regulate transcription through modification of the chromatin structure or through chromatin
condensation. Although the role, function, and relationship to
transcriptional regulation of most of these histone modifications remain poorly understood, considerable progress has been
achieved in recent years through studies applying the ChIP-chip
and ChIP-Seq approaches. For example, methylation of histone
H3 lysine 4 (H3K4) and H3 lysine 36 is associated with transcription activation. In contrast, methylation of H3 lysine 9
(H3K9), H3 lysine 27 (H3K27), and H4 lysine 20 (H4K20) is
correlated with repression of transcription.31-34 Figure 1 illustrates the epigenetic mechanisms.
An important prerequisite for further understanding of the
role and function of epigenetics in normal biology and the
development of disease is the ability to investigate comprehensively the pattern and distribution of epigenetic markers in the
whole genome from multiple tissue types, which requires newer
and more powerful methods. In the following sections, we
briefly discuss the traditional methods used to interrogate
epigenetic mechanisms in order to appreciate the improvements
made by sequencing based approaches enabled by NGS and TGS
technologies.
TRADITIONAL METHODS IN STUDYING EPIGENOMICS
Methods developed to study DNA methylation include the use
of methylation sensitive restriction enzymes, affinity enrichment using antibodies specific to 5-methylcytosine (5mC), and
bisulfite conversion. However, a number of limitations are
associated with each of these methods. For example, enzyme
digestion methods are restricted to restriction enzyme recognition sites, and as a result only a very small subset of all methylation sites can be interrogated. In comparison, methods that
rely on affinity enrichment using antibodies are biased towards
enrichment of sites containing relatively high levels of cytosine
methylation, namely, CpG islands. These methods were coupled
with microarray based methods to enable genome-wide analysis
of DNA methylation and histone modifications based on the
ChIP-chip method.27 31 39 40
A limitation of microarray based methods is that they do not
allow a truly ‘comprehensive’ interrogation of DNA methylation and histone modifications throughout the whole genome,
as synthesising the probes for microarrays requires prior
knowledge of the regions to be targeted. Thus only those
genomic regions that are probed by the microarrays will be
interrogated. This is evident in conventional
ChIP-chip studies, where the immunoprecipitated DNA fragments that are associated with a specific histone modification
cannot be detected unless probes cover the corresponding
genomic regions. In contrast, sequencing based approaches such
as ChIP-Seq, in theory, are able to capture all the DNA fragments
that are isolated by immunoprecipitation if the sequencing
depth or coverage is sufficient.24 28 29 Similarly, the current DNA
methylation microarrays are only able to interrogate a small
fraction of the entire DNA methylome. High density microarrays such as the Infinium Human Methylation 450 BeadChip
allow researchers to investigate >450 000 CpG sites out of the
approximately 28 million CpG sites in the human genome
(<2%).41 42 There is also ascertainment bias in selecting these
>450 000 CpG sites to be interrogated. This type of microarray
is widely accepted as a genome-wide tool for DNA methylation
yet it is still restricted to preselected CpG sites. This incomplete
interrogation of DNA methylation patterns in the human
genome will reduce the power of discoveries. For example,
methylation in the non-CG context, revealed through
sequencing of the entire DNA methylome, would be overlooked
when using microarrays.38 Table 1 summarises the pros and cons
of the common approaches to mapping DNA methylation and
histone modifications and figure 2 illustrates the ChIP-chip and
ChIP-seq methods.
Readers should refer to these references for a more comprehensive review of different methodologies in studying DNA
methylation and histone modifications.27-29
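The share of the methylome reached by a fixed-content array follows from simple division. The short Python sketch below works through it using only the two figures quoted in the text (more than 450 000 probed sites and approximately 28 million CpG sites); the function and variable names are illustrative and not part of any published tool.

```python
# Worked arithmetic for array coverage of genomic CpG sites.
# Both constants are the approximate figures quoted in the text.
TOTAL_CPG_SITES = 28_000_000   # approximate CpG sites in the human genome
ARRAY_CPG_SITES = 450_000      # lower bound for the sites probed by the array


def fraction_covered(sites_assayed, total_sites=TOTAL_CPG_SITES):
    """Return the proportion of CpG sites interrogated by an assay."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    return sites_assayed / total_sites


array_fraction = fraction_covered(ARRAY_CPG_SITES)
print(f"Array coverage of CpG sites: {array_fraction:.1%}")
```

With these inputs the array reaches roughly 1.6% of CpG sites, that is, under 2%, so the large majority of CpG sites and all non-CG cytosines fall outside its content.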
HIGH THROUGHPUT SEQUENCING TECHNOLOGIES
Next generation sequencing technologies (ie, Roche 454 GS FLX,
Illumina GA and HiSeq and Life Technologies SOLiD) have
become important tools in epigenomic research. These platforms
are characterised by the ability to sequence a very large number
of sequence reads in parallel, that is, massively parallel
sequencing. However, the Roche 454 GS FLX can only generate
approximately one million longer sequence reads (~400 bp) per
instrument run, in comparison to the Illumina and Life Technologies sequencing machines where several hundred million
shorter sequence reads (<150 bp) are produced. Thus, the
Illumina and Life Technologies sequencing platforms offer an
advantage for the ChIP-Seq experiments that require a high
coverage of sequence reads to detect the enrichment of DNA
fragments specific to a particular histone modification after
immunoprecipitation.43-45
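Because ChIP-Seq signal is quantified by counting sequence reads, the need for deep coverage can be illustrated with a toy calculation. The Python sketch below bins read start positions and compares an immunoprecipitated library with a control library after scaling for sequencing depth. It is a deliberate simplification and not a peak caller; the bin size, pseudocount, function names, and read positions are all assumptions made for illustration.

```python
from collections import Counter


def bin_counts(read_starts, bin_size):
    """Count reads per fixed-width bin (bin index = start // bin_size)."""
    return Counter(pos // bin_size for pos in read_starts)


def fold_enrichment(chip_starts, control_starts, bin_size=200, pseudocount=1.0):
    """Per-bin ChIP/control ratio, with control scaled to the ChIP depth."""
    if not chip_starts or not control_starts:
        raise ValueError("both libraries must contain at least one read")
    chip = bin_counts(chip_starts, bin_size)
    control = bin_counts(control_starts, bin_size)
    scale = len(chip_starts) / len(control_starts)  # depth normalisation
    return {
        b: (chip[b] + pseudocount) / (control[b] * scale + pseudocount)
        for b in set(chip) | set(control)
    }


# Toy read start positions (hypothetical, single chromosome).
chip_reads = [100, 120, 150, 160, 170, 900]
control_reads = [110, 500, 910]

enrichment = fold_enrichment(chip_reads, control_reads)
most_enriched_bin = max(enrichment, key=enrichment.get)
print(most_enriched_bin, enrichment[most_enriched_bin])
```

With only a handful of reads per bin the ratios are dominated by the pseudocount, which is one way to see why platforms producing several hundred million reads per run suit this application.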
NGS technologies have enabled an unprecedented scale of
sequencing success in epigenomics. For example, Lister et al
(2009) sequenced the shotgun libraries prepared from bisulfite
treated genomic DNA using the Illumina sequencing platform
for the genomes of stem cells and fibroblasts. They were able to
cover 94% of all cytosines in the genome and generated 87.5 and
91.0 gigabases of sequencing data, respectively. This amount of
data is well beyond the capacity of Sanger sequencing.38 This
clearly shows that the throughput of NGS technologies has
enabled a much larger scale of DNA methylation studies to be
performed. However, sequencing of DNA methylation still uses
the conventional approach of bisulfite conversion to differentiate methylated from unmethylated cytosines, which
has several disadvantages (please see ‘New opportunities
from TGS technologies’). Third generation sequencing technologies, such as SMRT sequencing, offer new opportunities to
sequence directly the methylated cytosines without bisulfite
conversion.46 47
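The principle behind bisulfite sequencing can be sketched in a few lines. Bisulfite treatment deaminates unmethylated cytosine to uracil, which is read as thymine after amplification, whereas methylated cytosine is protected and still reads as cytosine. The Python sketch below simulates this on a single strand and then recovers the methylation calls by comparison with the reference. It assumes complete conversion, no sequencing errors, and a single strand; the function names and the example sequence are hypothetical.

```python
from __future__ import annotations


def bisulfite_convert(sequence: str, methylated_positions: set[int]) -> str:
    """Simulate bisulfite treatment: unmethylated C reads as T, methylated C stays C."""
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference: str, converted_read: str) -> dict[int, bool]:
    """For each reference cytosine, report True if the read still shows C."""
    if len(reference) != len(converted_read):
        raise ValueError("reference and read must be the same length")
    calls = {}
    for index, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper())):
        if ref == "C" and obs in ("C", "T"):
            calls[index] = obs == "C"
    return calls


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
print(read, calls)
```

A cytosine that survives conversion is called methylated here regardless of the underlying modification, so a sketch like this cannot separate 5-methylcytosine from 5-hydroxymethylcytosine, and incomplete conversion would be miscalled as methylation.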
NEW BIOLOGICAL INSIGHTS
DNA methylation
The arrival of NGS technologies has led to the completion of
a number of DNA methylome studies at a single base resolution.38 48 The prevailing view that DNA methylation occurs
predominantly at CpG dinucleotides in the human genome has
been challenged with new findings from recent studies harnessing
the power of NGS technologies. The comparison of the DNA
methylome between human stem cells and fetal fibroblasts has
revealed substantial differences in terms of the composition and
pattern of cytosine methylation between the two different
genomes. Approximately 25% of all cytosine methylation identified in stem cells was in a non-CG context compared to fibroblasts where almost all of the cytosine methylation was in the
CG context. The substantial fraction of non-CG methylation was
first revealed through this study. This suggests that embryonic
stem cells may use different DNA methylation mechanisms in
transcriptional regulation to maintain their pluripotency
compared to differentiated cells. The methylation in the non-CG
context also showed different patterns; that is, non-CG methylation was more enriched in gene bodies and depleted in protein
binding sites and enhancers. This, coupled with the finding that
non-CG methylation in gene bodies was positively correlated
with gene expression, offers insights into the regulation of the
mechanisms of gene expression.38
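Separating CG from non-CG methylation amounts to inspecting the bases immediately downstream of each methylated cytosine. The Python sketch below classifies a cytosine as CG or non-CG and, following a common convention that is not spelled out in the text above, subdivides non-CG sites into CHG and CHH (where H is A, C, or T). It considers the forward strand only, and the function names and example sequence are assumptions for illustration.

```python
def cytosine_context(sequence, position):
    """Classify the cytosine at `position` as 'CG', 'CHG', 'CHH', or 'unknown'."""
    seq = sequence.upper()
    if seq[position] != "C":
        raise ValueError("position does not point at a cytosine")
    downstream = seq[position + 1:position + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2 or any(base not in "ACGT" for base in downstream):
        return "unknown"  # too close to the end, or ambiguous base
    return "CHG" if downstream[1] == "G" else "CHH"


def non_cg_fraction(sequence, methylated_positions):
    """Fraction of methylated cytosines with a classifiable non-CG context."""
    contexts = [cytosine_context(sequence, p) for p in methylated_positions]
    known = [c for c in contexts if c != "unknown"]
    if not known:
        raise ValueError("no classifiable methylated cytosines")
    return sum(c != "CG" for c in known) / len(known)


sequence = "ACGTCAGTCATT"      # hypothetical forward-strand sequence
methylated = [1, 4, 8]         # hypothetical methylated cytosine positions
contexts = [cytosine_context(sequence, p) for p in methylated]
fraction = non_cg_fraction(sequence, methylated)
print(contexts, fraction)
```

Applied genome-wide to single base methylation calls, a tally of this kind is what yields figures such as the approximately 25% non-CG share reported for stem cells.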
More interestingly, the non-CG methylation disappeared after
differentiation of the stem cells, clearly suggesting that non-CG
methylation is an important epigenetic mechanism occurring
specifically in embryonic stem cells as well as induced pluripotent stem cells to maintain pluripotency. This also suggests that
changes in epigenetic mechanisms are taking place during the
cell differentiation stages. The non-CG methylation was
restored in induced pluripotent stem cells derived from differentiated
cells.38 Dynamic changes in the human methylome during
differentiation were also demonstrated by another study
through whole genome bisulfite sequencing, where the developmental stage was reflected in both the level of global methylation and extent of non-CpG methylation. The total level of
global methylation and the degree of non-CpG methylation are
inversely correlated with the level of differentiation.49
Figure 1 The well studied epigenetic
mechanisms are DNA methylation and
histone modification. DNA methylation
or, more specifically, cytosine
methylation is an addition of a methyl
group at the carbon 5 position of
cytosine. The N-terminal tails of histone
polypeptides can be modified by more
than 100 different post-translational
modifications (collectively known as
histone modifications). Both DNA
methylation and histone modification
are important in transcriptional
regulation, and aberrations of these
epigenetic mechanisms are associated
with various diseases such as cancer
and autoimmune disease.
Table 1 Common approaches to mapping DNA methylation and histone modifications
Method: Bisulfite treatment
  Pros: Effectively converts an epigenetic difference into a genetic difference, easily detectable by sequencing; single base pair resolution
  Cons: Incomplete conversion; degradation of DNA; not easily adapted to array hybridisation techniques
  Most suitable application: High resolution study of DNA methylation at small or large scale
Method: Methylation sensitive restriction enzymes digestion
  Pros: Highly sensitive, simple
  Cons: Limited by methylation-sensitive restriction enzyme cutting sites
  Most suitable application: Targeted, site specific study of DNA methylation
Method: Affinity enrichment by 5-methylcytosine antibodies
  Pros: Powerful tool for comprehensive profiling of DNA methylation in complex genomes; rapid and efficient genome-wide assessment of DNA methylation
  Cons: No information on individual CpG dinucleotides; varying CpG density at different regions of the genome requires computational adjustments; possibility of antibody cross-reactivity
  Most suitable application: Rapid, large scale, low resolution study of DNA methylation
Method: ChIP-chip
  Pros: Productive for genome-wide mapping of histone modifications; bioinformatically and analytically less challenging
  Cons: Requires a priori knowledge of the regions to be probed; high signal-to-noise ratios; limited dynamic range; cross-hybridisation between similar sequences
  Most suitable application: Histone modification and other DNA-protein interactions
Method: ChIP-seq
  Pros: Quantification of the signal from ChIP is based on counting the sequence reads; wider dynamic range; bar coding allows sample multiplexing for NGS
  Cons: Difficulty in the ambiguous aligning of short reads in repetitive regions; bioinformatically and analytically more challenging
  Most suitable application: Histone modification and other DNA-protein interactions
The information in this table is summarised from three excellent review papers by Laird, Hurd and Nelson, and Hirst and Marra.27-29
NGS, next generation sequencing.
Figure 2 Genome-wide mapping of
histone modifications and other
DNA-protein interactions has relied on
(A) chromatin immunoprecipitation
(ChIP) coupled with microarrays (ie,
ChIP-chip) and (B) next generation
sequencing (NGS) technologies (ie,
ChIP-Seq) which have provided more
precise and comprehensive landscapes
of histone modifications in the entire
genome. Reprinted by permission from
Macmillan Publishers Ltd: Nat Rev
Genet (9:179-91), copyright (2008).
Figure 3 Cytosine methylation (5-methylcytosine) is catalysed by DNA
methyltransferase (DNMT) enzymes. These enzymes (DNMT3A and
DNMT3B) catalyse the de novo covalent addition of a methyl group to
cytosine in newly synthesised DNA as well as maintenance of 5-methylcytosine (DNMT1) during mitosis. 5-hydroxymethylcytosine is
generated by TET (Tet) proteins through oxidation of 5-methylcytosine.
Reprinted from Clin Chim Acta, 412, Dahl C, Grønbæk K, Guldberg P,
Advances in DNA methylation: 5-hydroxymethylcytosine revisited,
831-6, Copyright (2011), with permission from Elsevier.
It is expected that some differences in the composition and
pattern of cytosine methylation are likely to be found between
the genomes of cells from different developmental stages;
however, these differences remain poorly understood. Most of
the targeted sequencing approaches or microarrays have focused
on CG methylation, so the discoveries described above could
not have been made without sequencing the entire DNA
methylome. These discoveries will also provide new directions
to study the prevalence and pattern of non-CG methylation in
different adult stem cells that are found in various tissue types
and differentiated cell types. These studies will also elucidate
whether non-CG methylation is exclusively confined to
embryonic stem cells, or if it also occurs in adult stem cells. The
knowledge of the roles of non-CG methylation in maintaining
the pluripotency and inducing the differentiation of adult stem
cells will have important implications for regenerative medicine.
Although Lister et al (2009) completed sequencing the DNA
methylome of stem cells and differentiated fibroblasts, their
study used bisulfite sequencing and was thus unable to distinguish between 5mC and 5-hydroxymethylcytosine (5hmC).
Therefore it is unclear whether and to what extent there were
differences in the composition of 5mC and 5hmC between the
genomes of stem cells and fibroblasts. Furthermore, methylation
beyond the CG and non-CG contexts, such as 5-methyladenine
or methylation of other nucleotides, also cannot be studied.
These are the major limitations of the current methods using
bisulfite conversion and NGS to study DNA methylation. The
ability to investigate 5hmC and methylation of other nucleotides will be important and is anticipated to provide further
biological insights into human biology and diseases.50
The absence of non-CG methylation in differentiated cells
was also corroborated by sequencing the entire DNA methylome
of peripheral blood mononuclear cells. The investigators found
that <0.2% of non-CG sites were methylated, suggesting that
non-CG methylation is minor in human peripheral blood
mononuclear cells.48 In addition, this study also investigated
allele specific methylation (ie, paternal and maternal alleles can
exhibit different methylation patterns) between the two haploid
methylomes by integrating the methylome data with the whole
genome sequencing data that were generated previously.51 This
led to identification of 599 haploid differentially methylated
regions covering 287 genes and demonstrated that allele specific
methylation is highly correlated with allele specific expression.48
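Allele specific methylation can be illustrated by grouping methylation calls according to the parental allele each read derives from. The Python sketch below assumes that reads covering one CpG site have already been assigned to an allele (for example through a heterozygous variant known from whole genome sequencing) and computes a methylation level per allele. It is not the cited study's pipeline; the allele labels, counts, and function name are hypothetical, and a real analysis would add a statistical test and coverage thresholds.

```python
from collections import defaultdict


def allele_methylation_levels(observations):
    """Return the methylated fraction per allele.

    `observations` is an iterable of (allele_label, is_methylated) pairs,
    one pair per read covering the CpG site of interest.
    """
    counts = defaultdict(lambda: [0, 0])  # allele -> [methylated, total]
    for allele, is_methylated in observations:
        counts[allele][0] += int(is_methylated)
        counts[allele][1] += 1
    return {allele: meth / total for allele, (meth, total) in counts.items()}


# Hypothetical read-level calls at a single CpG site.
observations = (
    [("maternal", True)] * 4 + [("maternal", False)]
    + [("paternal", True)] + [("paternal", False)] * 4
)

levels = allele_methylation_levels(observations)
difference = abs(levels["maternal"] - levels["paternal"])
print(levels, round(difference, 2))
```

Sites or regions where the two alleles differ markedly in methylation level are candidates for the haploid differentially methylated regions described above.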
The assumption that DNA methylation regulates gene
expression mainly through its effects at 5′ promoters has also
been challenged by recent studies applying NGS based
approaches.52 A recent study has shown that most tissue specific
gene regulation mediated by DNA methylation occurs at alternative promoters located in gene bodies instead of 5′ promoters.
More specifically, the majority of methylated CpG islands were
shown to be in intragenic and intergenic regions. In contrast,
<3% of CpG islands in 5′ promoters were methylated. This was
found through generating a map of DNA methylation from the
human brain encompassing 24.7 million of the 28 million CpG
sites. Similarly, these distribution patterns across the genome
would not have been unravelled if the studies were restricted to
interrogating a proportion of CpG sites. The study showed that
intragenic DNA methylation plays a role in regulating transcription from alternative promoters, and this is consistent with
transcriptomic studies which have shown that many genes have
alternative promoters within gene bodies.52
The NGS and sequencing based methods have contributed to
the advances in the epigenomics of stem cell research and the
knowledge of stem cell molecular biology. For example, whole
genome profiling of DNA methylation at single base resolution in
five human induced pluripotent stem cell lines, together with
embryonic stem cells, somatic cells, and differentiated induced
pluripotent stem cells, showed that induced pluripotent stem cells
have significant reprogramming variability, including aberrant
reprogramming of DNA methylation.53 Comparisons of the chromatin
modification profiles and DNA methylomes in stem cells and primary
fibroblasts also revealed that the epigenomic landscapes in
embryonic stem cells and lineage committed cells are vastly
different. This has provided new insights into the epigenetic
mechanisms modulating the properties of pluripotency and cell
fate commitment.54 However, the bioethical issues surrounding
the use of human embryonic stem cells in research have to be
given serious consideration. Adult stem cells and induced
pluripotent stem cells generated from somatic cells provide an
alternative source and should be considered for use in stem cell
research.55 56
In addition to the ‘conventional’ DNA methylation studies
(ie, 5mC), research on 5hmC is also gaining impetus. 5hmC is
generated by TET proteins through oxidation of 5mC and is
present at low levels in diverse cell types in mammals
(figure 3).57 58 Currently, information on the genome-wide
distribution of 5hmC is limited. However, two novel and specific
approaches to profile the whole genome localisation of 5hmC
have been developed recently. These two methods were developed to enable the precipitation of 5hmC in genomic DNA. The
first approach, termed GLIB (glucosylation, periodate oxidation,
biotinylation), uses a combination of enzymatic and chemical
steps to isolate DNA fragments containing a few down to
a single 5hmC and entails the addition of a glucose molecule to
each 5hmC. The second approach involves the conversion of
5hmC to cytosine 5-methylenesulphonate (CMS) by treatment
of genomic DNA with sodium bisulfite, followed by immunoprecipitation of CMS-containing DNA with a specific antiserum
to CMS. Both methods are specific to DNA containing 5hmC.59
Applying these methods to 5hmC-containing DNA from mouse
embryonic stem cells showed strong enrichment within exons
and near transcriptional start sites. The enrichment of 5hmC at
the transcriptional start sites suggested a role for 5hmC in
transcriptional regulation. Additionally, 5hmC was especially
enriched at the start sites of genes whose promoters bear dual
histone 3 lysine 27 trimethylation (H3K27me3) and histone 3
lysine 4 trimethylation (H3K4me3) marks. Genes with 5hmC at
their start sites were disproportionately likely to contain bivalent H3K27 and H3K4 trimethylation at their promoters.
Likewise, a majority (~60%) of genes reported to contain
bivalent H3K27 and H3K4 trimethylation have 5hmC at their
start sites. Collectively, these findings indicate that 5hmC,
similar to 5mC, has a probable role in transcriptional regulation
but support a model in which 5mC and 5hmC have different
roles in transcription.59
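The two overlap statements above run in opposite directions: one asks what share of genes with 5hmC at their start sites are bivalent, the other what share of bivalent genes carry 5hmC. The Python sketch below makes that asymmetry explicit with a set overlap calculation. The gene identifiers are placeholders rather than real data, and the function name is an assumption for illustration.

```python
def overlap_fraction(reference_genes, other_genes):
    """Fraction of `reference_genes` that also appear in `other_genes`."""
    reference = set(reference_genes)
    if not reference:
        raise ValueError("reference_genes must not be empty")
    return len(reference & set(other_genes)) / len(reference)


# Placeholder identifiers only; these are not real gene lists.
bivalent_genes = {"gene_a", "gene_b", "gene_c", "gene_d", "gene_e"}
genes_with_5hmc = {"gene_a", "gene_b", "gene_c", "gene_x"}

share_of_bivalent_with_5hmc = overlap_fraction(bivalent_genes, genes_with_5hmc)
share_of_5hmc_that_is_bivalent = overlap_fraction(genes_with_5hmc, bivalent_genes)
print(share_of_bivalent_with_5hmc, share_of_5hmc_that_is_bivalent)
```

Because the denominators differ, the two fractions need not agree, which is why the text reports the two comparisons separately.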
New insights into 5hmC were also gained from other studies.
5hmC is mostly associated with euchromatin, and whereas 5mC
is under-represented at gene promoters and CpG islands, 5hmC
is enriched and is associated with increased transcriptional
levels. Most, if not all, 5hmC in the genome depends on pre-existing 5mC and the balance between these two modifications
is different between genomic regions.60 Other studies have
provided data to support further the important roles of TET
proteins. The binding of TET1 throughout the genome of
embryonic stem cells, with the majority of binding sites located
at transcription start sites of CpG-rich promoters and within
genes, was recently shown by Williams et al (2011). The
hydroxymethylcytosine modification is found in gene bodies
and, in contrast to methylcytosine, is also enriched at CpG-rich
transcription start sites.61 Wu et al also showed in mouse
embryonic stem cells that Tet1 is preferentially bound to CpG-rich sequences at promoters of both transcriptionally active and
Polycomb-repressed genes.62
Histone modifications
In addition, the first genome-wide mapping of histone modifications using NGS technologies has identified a number of
activation marks such as mono-methylations of H3K27, H3K9,
H4K20, H3K79, and H2BK5, and histone marks that are
linked to transcriptional repression such as trimethylations
of H3K27, H3K9, and H3K79. This has provided new insights
into the function of histone modifications in transcriptional
regulation.63
Cellular differentiation involves the gradual loss of pluripotency and acquisition of cell type specific features. This
process is precisely controlled by cell type specific epigenetic
programmes. Understanding these processes requires genome-wide analysis of epigenetic and gene expression profiles. New
discoveries have been made through applications of NGS technologies for profiling histone and DNA methylation, as well as
gene expression patterns of normal human mammary progenitor enriched and luminal lineage committed cells.64 Integrative
analysis of the gene expression, DNA methylation, and histone
H3 K4 and K27 trimethylation profiles of progenitor enriched
and more differentiated luminal epithelial cell populations from
multiple individuals were performed to understand better the
regulation of human mammary epithelial cell type specification.
Significant differences in histone H3 lysine 27 trimethylation
(H3K27me3) enrichment and DNA methylation of genes
expressed in a cell type specific manner were observed,
suggesting their regulation by epigenetic mechanisms and
a dynamic interplay between the two processes that together
define developmental potential. The analysis further identified
key regulators of mammary epithelial and luminal lineage
commitment. The epigenetically regulated genes identified will
accelerate the dissection of human mammary epithelial lineage
commitment and luminal differentiation. Additionally, the list
of genes epigenetically regulated in a cell type specific manner
provides a rich resource for the further analysis of human breast
development and the role of epigenetic mechanisms in breast
tumorigenesis.64
This study has generated the first comprehensive epigenomic
profile of human mammary epithelial cells by analysing gene
expression and DNA and histone methylation patterns of
progenitor and luminal lineage enriched cells through applications of NGS.64 However, a major limitation associated with the
study was that the DNA methylation data were limited to
a fraction of the genome, as only the methylation status of the
recognition site of the BssHII enzyme was evaluated (an
inherent limitation of using restriction enzymes for DNA
methylation analysis). The other limitation was that the cell fractions
used in the study were not homogeneously pure. However, these
limitations could be overcome through whole genome
sequencing of bisulfite treated genomic DNA isolated from
single cells (see ‘New opportunities from TGS technologies’ for analysis of DNA methylation patterns in single cells).
A recent study used ChIP-Seq to map and analyse chromatin state dynamics in nine
human cell types.65 More specifically, nine chromatin
marks across nine cell types were mapped to systematically
characterise regulatory elements, their cell type specificities, and
their functional interactions. Chromatin states showed distinct
associations with transcriptional start sites, transcripts, evolutionarily conserved non-coding regions, DNase hypersensitive
sites, binding sites for the regulators c-Myc (MYC) and NF-κB,
and inactive genomic regions associated with the nuclear lamina.
Chromatin states drastically reduced the large combinatorial
space of chromatin datasets to a manageable set of biologically
interpretable annotations, thus providing an efficient and robust
way to track coordinated changes across cell types. This allowed
the systematic identification and comparison of more than
100 000 promoter and enhancer elements.
This study is of significance because chromatin profiling has
emerged as a powerful means of genome annotation and
detection of regulatory activity as it provides a systematic
means of detecting cis-regulatory elements. Therefore, chromatin profiling is especially well suited to the characterisation of
non-coding portions of the genome, which remain largely
uncharacterised. The results also have implications for the
interpretation of GWASs. Disease variants were frequently
found to coincide with enhancer elements specific to a relevant
cell type. For example, 33 enhancers were found in the 9p21
region where multiple single nucleotide polymorphisms (SNPs)
within the region were associated with coronary artery disease
and type 2 diabetes. More specifically, the coronary artery
disease risk alleles of SNPs rs10811656 and rs10757278 are
located in one of the enhancers and disrupt a binding site for
STAT1.66 In addition, the 8q24 cancer risk allele (rs6983267)
was shown to increase prostate enhancer activity in vivo relative to the non-risk allele.67 Intersecting the chromatin states with non-coding
SNPs from GWAS datasets has suggested potential mechanistic
explanations for disease variants, that is, how disease variants
lead to the observed disease phenotypes, either through their
presence within cell type specific enhancer states or by their
effect on binding motifs for predicted regulators.65
HERITABLE EPIGENETICS AND COMPLEX DISEASES AND TRAITS
‘Heritable epigenetics’ refers to germline inherited epigenetic
markers. It has been proposed that, in addition to the germline
genetic component, heritable epigenetics also plays an important role and contributes to the heritability of complex diseases
and traits. Epigenetic heritability has been well demonstrated
in twin studies.12–15 Monozygotic twins are epigenetically
indistinguishable during the early years of life, whereas older
monozygotic twins exhibit large differences in their overall
content and genomic distribution of 5mC and histone acetylation.15 This indicates that monozygotic twins inherit identical
epigenetic profiles from their parents and acquire different
somatic epigenetic markers throughout their lifetime.
The role of epigenetic changes in the mechanism of complex
phenotype loci remains largely unexplored. However, in the
context of GWAS there is growing evidence supporting the
importance of epigenetics in complex phenotypes. For example,
the finding of SNPs associated with diseases located within
500 kb of known imprinted genes has provided some evidence to
support the importance of epigenetics in complex diseases. A
total of five SNPs associated with breast cancer, basal cell
carcinoma, and type 2 diabetes were found to have parental
origin specific associations. These SNPs were located in two
genomic regions, 11p15 and 7q32, each harbouring a cluster of
imprinted genes.68 An SNP association in the imprinted region of
chromosome 14q32.2 was also found for type 1 diabetes.69
However, the importance of parental origin of sequence variants
in association with complex diseases has been largely understudied in GWAS because unrelated samples were investigated.
Pedigree or family information would be needed to investigate
the parent-of-origin effects.
In addition, a study of monozygotic twins who
were discordant for autoimmune disease (specifically systemic
lupus erythematosus) identified widespread changes in the DNA
methylation status of a significant number of genes. Subsequent
gene ontology analysis revealed enrichment in categories associated with immune function. Thus, this study also supports the
concept that epigenetic changes may be critical in the manifestation of autoimmune disease.20 Data are accumulating to
support the roles of epigenetic differences in monozygotic twins
discordant for diseases.16–19 By contrast, Baranzini et al (2010)
did not find evidence for epigenetic (as well as genetic and
transcriptomic) differences that explained disease discordance in monozygotic twin pairs
in which only one twin had multiple sclerosis.21
The biological effect of SNPs associated with epigenetic
markers is mediated through a variant specific epigenetic
change. The associations of these SNPs with complex phenotypes can be identified by GWAS.48,70,71 DNA methylation
associated with genetic variation in HapMap cell lines was
studied and association analyses of methylation levels with
more than 3 million SNPs identified 180 CpG-sites in 173 genes
that were associated with nearby SNPs (usually within 5 kb).70
Integration of whole genome sequencing data with whole
genome DNA methylation48 and histone modification data
should facilitate the identification of DNA sequence genetic
variations associated with the epigenetic markers, where the
contribution of these epigenetic markers to complex phenotypes
can be captured indirectly through SNPs by conventional genetic
association studies or GWAS. These epigenetic markers may be
germline heritable and thus account for heritability. Mapping
allele specific DNA methylation or SNPs associated with DNA
methylation can help to extract maximum information from
GWAS.72–75
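As an illustration of the kind of association analysis described above, the following minimal Python sketch regresses the methylation level at a single CpG site on the allele dosage (0, 1 or 2 copies) of a nearby SNP. The function name, the additive dosage coding and the example values are assumptions made for illustration; they are not taken from the cited studies, which used their own statistical pipelines.

```python
from statistics import mean


def methylation_on_genotype(dosages, methylation):
    """Ordinary least-squares fit of methylation level on SNP allele dosage.

    Returns (slope, intercept); the slope is the estimated change in
    methylation per additional copy of the counted allele.
    """
    if len(dosages) != len(methylation):
        raise ValueError("one methylation value is needed per genotype")
    g_bar = mean(dosages)
    m_bar = mean(methylation)
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - g_bar) * (m - m_bar) for g, m in zip(dosages, methylation))
    slope = sxy / sxx
    return slope, m_bar - slope * g_bar


# Hypothetical data: six individuals, methylation as a fraction between 0 and 1.
dosages = [0, 0, 1, 1, 2, 2]
methylation = [0.2, 0.2, 0.5, 0.5, 0.8, 0.8]
slope, intercept = methylation_on_genotype(dosages, methylation)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

In a genome-wide setting this test would be repeated for every SNP and CpG pair of interest, with appropriate correction for multiple testing, which this sketch does not attempt.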
If epigenetic markers (or similarly copy number variants
(CNVs)) are tagged by SNPs as the surrogate markers in GWAS,
then they would not, in theory, account for the missing heritability. However, the effect sizes of the SNPs (surrogate markers)
found to be associated with complex phenotypes in GWAS could
be underestimates of the effects of the true causal variants (epigenetic
markers or CNVs); if these were identified, their larger effects could then account
for the missing heritability. However, present data are still
rudimentary and it is unclear to what extent epigenetic markers
can be tagged by the SNPs genotyped in GWAS and to what
extent the effect size will be found to be higher when the true
causal variant is identified.
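The underestimation argument can be made concrete with a small Python sketch. It assumes a simple additive model for a quantitative trait, under which the trait variance captured by a surrogate SNP is the variance explained by the causal variant multiplied by the squared correlation (r²) between the two. The function names and the numbers used are hypothetical, and the model is a simplification rather than a description of any particular study.

```python
def tagged_variance(causal_variance, r_squared):
    """Trait variance captured by a surrogate SNP under an additive model.

    causal_variance: fraction of trait variance explained by the causal variant.
    r_squared: squared correlation between surrogate SNP and causal variant.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie between 0 and 1")
    return causal_variance * r_squared


def implied_causal_variance(tag_variance, r_squared):
    """Invert the relationship: variance the causal variant would explain."""
    if not 0.0 < r_squared <= 1.0:
        raise ValueError("r_squared must be greater than 0 and at most 1")
    return tag_variance / r_squared


# Hypothetical example: a causal epigenetic marker explaining 2% of trait
# variance, tagged by a SNP with r^2 = 0.5, appears to explain only 1%.
print(tagged_variance(0.02, 0.5))
print(implied_causal_variance(0.01, 0.5))
```

The weaker the tagging, the larger the gap between the variance attributed to the surrogate SNP and that attributable to the true causal variant.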
By contrast, germline heritable epigenetic markers that cannot
be tracked by DNA sequence variations would be missed by
GWAS. It is currently unclear to what extent these heritable
epigenetic markers can be detected by DNA sequence variations
and through the genotyping of surrogate markers. This implies
that these heritable epigenetic markers have to be studied
directly. Further studies would be needed to address these
uncertainties.
In summary, studies of germline heritable epigenetic effects on
complex phenotypes are still rudimentary; however, high
throughput sequencing technologies will undoubtedly
contribute directly to the advances in epigenomic studies of
complex phenotypes or indirectly through identification of DNA
sequence variations tagging epigenetic markers. These technological developments have overcome the technical hurdles in
mapping and characterising epigenetic markers in the whole
genome. However, there are issues and challenges in studying
somatic epigenetic events in complex phenotypes, for example,
the need for DNA samples to be extracted from the specific
tissue related to the disease.
ISSUES AND CHALLENGES IN STUDYING EPIGENETICS OF
COMPLEX DISEASES AND TRAITS
The study of somatic epigenetics of complex diseases and traits
is challenging, despite the availability of advanced sequencing
technologies, as somatic epigenetic alterations or aberrations are
tissue specific and undergo dynamic changes in response to the
cellular environment and various other stimuli.76–78 These
characteristics have created two substantial challenges when
studying complex diseases: (1) the need for a specific tissue
(related to the particular disease) to investigate; and (2) the need
for multiple tissue types to be collected at different time points
to interrogate the temporal changes in order to distinguish
‘cause’ from ‘consequence’. Ideally, the relevant tissue should be
collected before the onset of the disease in order to establish
a causal relationship. However, tissue samples are usually
collected after the disease has occurred and thus it is unclear
whether the aberrant epigenetic changes are the cause or
a consequence of the disease. Currently, experiments to profile
epigenomic changes are technologically feasible. Therefore, the
greatest challenge is rather in determining and collecting the
most appropriate tissue to be studied for each complex disease
and trait. Animal models would be very useful for collecting
tissues to investigate diseases; however, not all human diseases
have a suitable animal model.
Epigenetic studies are more feasible for certain diseases such as
cancer, because cancer tissue is accessible after surgical resection
or from diagnostic biopsies, from which the DNA can then be
extracted. However, tissue heterogeneity is still a significant
problem in epigenetic studies of primary cancer tissue.5,36,37
In other diseases such as type 2 diabetes, the pancreatic
β-cell is likely the most appropriate cell type to be studied to
interrogate how aberrant epigenetics contribute to the impairment in insulin production. Nonetheless, this is only feasible in
animal models at present. Type 2 diabetes is a systemic disease
caused by impairment in insulin production and also insulin
resistance in peripheral tissue, such as muscle, which leads to
impairment in blood glucose uptake. As a result, multiple
peripheral tissues, in addition to pancreatic β-cells, are also
required to provide a more complete picture of how epigenetic
aberrations contribute to the pathogenesis of type 2 diabetes.
This is in contrast to studying germline inherited epigenetic
events, where DNA samples derived from any tissue can be used.
Further adding to the complexity is the uncertainty of which
tissue to prioritise when multiple cell types or tissues are involved
in a complex disease, and whether it is feasible in terms of
sample collection, cost effectiveness and time to study all tissue
types.76–78 Furthermore, it is also unclear whether the disease
associated epigenetic changes can be revealed by a simple case-control
study design (like conventional genetic association
studies) or by other more robust study designs which need
to be developed. It is also unclear what sample size is required
to achieve adequate statistical power to detect the disease
associated epigenetic changes.
NEW OPPORTUNITIES FROM TGS TECHNOLOGIES
The ‘gold standard’ for detection of cytosine methylation is
sodium bisulfite conversion of DNA followed by sequencing.
However, several shortcomings of this method must be noted.
First, this method cannot distinguish between 5mC and 5hmC.
This also means that previous studies have grouped both types
of cytosine methylation into one group for analysis.79 Second, it
is unable to detect methyladenine. Third, sodium bisulfite causes
damage to DNA, resulting in fragmentation of DNA molecules
which will limit this method to sequencing only shorter DNA
sequences and subsequently hinder the ability to study the
haplotype (a combination of multiple methylcytosines at adjacent loci) or specific patterns (or differences in patterns) of DNA
methylation in the parental chromosomes.27 In addition to
bisulfite conversion, methods based on methylated DNA
immunoprecipitation (MeDIP) with anti-5mC antibody or
methods based on proteins that bind to methylated CpG
sequences do not detect 5hmC.80
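The first shortcoming can be illustrated with a minimal in silico sketch in Python. It assumes the simplified read-out in which unmodified cytosine is read as thymine after bisulfite treatment and amplification, whereas both 5mC and 5hmC are read as cytosine. The function name, the modification labels and the example sequence are hypothetical and chosen only for illustration.

```python
def bisulfite_read(sequence, modifications=None):
    """Return the sequence as read after bisulfite conversion.

    sequence: DNA string over A, C, G, T.
    modifications: dict mapping a 0-based position of a cytosine to
        "5mC" or "5hmC"; cytosines not listed are treated as unmodified.
    """
    modifications = modifications or {}
    read = []
    for position, base in enumerate(sequence):
        protected = modifications.get(position) in ("5mC", "5hmC")
        # Unmodified C is converted and read as T; protected C stays C.
        read.append("T" if base == "C" and not protected else base)
    return "".join(read)


template = "ACGCCGT"  # hypothetical fragment
read_5mc = bisulfite_read(template, {1: "5mC"})
read_5hmc = bisulfite_read(template, {1: "5hmC"})
read_plain = bisulfite_read(template)

# The two modified templates give the same read, so the modification
# type cannot be recovered from bisulfite sequencing data alone.
print(read_5mc, read_5hmc, read_plain)
```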
The importance of 5hmC remains largely unknown, but it has
been found in various tissues.57,58,81 Jin et al (2011) examined
the distribution of 5hmC in DNA from human brain frontal lobe
tissue and found that 5hmC was particularly enriched at
promoters and in intragenic regions (gene bodies), but was
largely absent from non-gene regions. Moreover, the presence of
5hmC in gene bodies was more positively correlated with gene
expression levels.81
Single molecule real time (SMRT) sequencing technologies overcome
these problems by directly sequencing the methylated nucleotides without the need for bisulfite conversion. This is accomplished through monitoring the kinetics of incorporation of
nucleotides into newly synthesised DNA strands by polymerase.47 Nanopore sequencing technologies also have the
ability to detect methylated cytosine directly.82 However, in
comparison with SMRT sequencing, nanopore sequencing
technologies may take several years to become commercially
available to end users.
In addition to directly sequencing the methylated nucleotides
(which overcomes the limitations of bisulfite conversion of
DNA), TGS technologies also hold promise for studying the
single cell epigenome. DNA samples are usually derived from
a collection of the same cells (a homogeneous collection) or
different cells (a heterogeneous collection such as primary
cancer tissue) in which the DNA methylation patterns (ie,
distributions and levels) may vary between the cells. The ability
to interrogate the epigenome of a single cell will directly overcome the problem of tissue heterogeneity, especially for primary
cancer tissue. Experimental data have shown that tumour
content significantly influences the interpretation of methylation levels.83 The rationale for single cell analysis is that
DNA methylation profiles could be highly variable across individual cells, even within the same tissue. Methods have been
previously developed for the analysis of DNA methylation
patterns in single cells, thus addressing the problems of cell
heterogeneity in epigenetics research.84 More powerful
sequencing technologies in the future will accelerate research
studies by providing greater resolution at the single cell level,84
single molecule DNA sequencing level,85 and the single nucleotide level.
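The influence of tumour content on measured methylation levels, noted above, can be sketched with a simple two-component mixture in Python. The sketch assumes that the methylation level measured in a bulk sample is the average of the tumour and normal cell levels weighted by the tumour fraction; this linear model, the function names and the example values are assumptions for illustration and are not taken from the cited experimental data.

```python
def observed_methylation(tumour_level, normal_level, tumour_fraction):
    """Bulk methylation level for a mixture of tumour and normal cells."""
    if not 0.0 <= tumour_fraction <= 1.0:
        raise ValueError("tumour_fraction must lie between 0 and 1")
    return tumour_fraction * tumour_level + (1.0 - tumour_fraction) * normal_level


def tumour_methylation(observed_level, normal_level, tumour_fraction):
    """Back-calculate the tumour cell level from a bulk measurement."""
    if not 0.0 < tumour_fraction <= 1.0:
        raise ValueError("tumour_fraction must be greater than 0 and at most 1")
    return (observed_level - (1.0 - tumour_fraction) * normal_level) / tumour_fraction


# Hypothetical locus: 90% methylated in tumour cells, 10% in normal cells.
# With only half of the cells being tumour cells, the bulk level is much lower.
bulk = observed_methylation(0.9, 0.1, 0.5)
print(bulk)
print(tumour_methylation(bulk, 0.1, 0.5))
```

Single cell analysis sidesteps this correction altogether, since each measurement then comes from one cell rather than from a mixture.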
Third generation sequencing technologies are characterised by
single molecule DNA sequencing without amplification. As
a result, this has simplified the steps in library preparation for
sequencing with minimal sample manipulation in experiments.
For example, the Helicos sequencing platform was used to
sequence chromatin immunoprecipitated DNA directly.86,87 A
further advantage is that only a small quantity of immunoprecipitated DNA (approximately 50 pg) is required, in contrast
to the nanogrammes of DNA required for NGS platforms. The
requirement of a small amount of DNA is particularly crucial for
single cell sequencing and for clinical samples with a limited
amount of DNA. As a proof-of-principle study using TGS for
ChIP-Seq experiments, a good agreement was obtained for the
ChIP-Seq data produced by the Helicos and Illumina sequencing
platforms,87 suggesting that TGS yields comparable results
and is able to overcome the limitations associated with NGS in
ChIP-Seq experiments.
FUTURE DIRECTIONS AND CONCLUSIONS
The tissue specificity of the epigenome creates substantial
challenges for researchers. Furthermore, epigenetic events are
dynamic and responsive to cellular environments and subject to
changes, as opposed to the human genome which has ‘static’
genetic variations in the DNA sequences. Thus, generating
a comprehensive human epigenomic map will involve multiple
tissue samples and ‘time points’ (to study the temporal changes)
requiring large scale international initiatives for this undertaking. The US National Institutes of Health Roadmap Epigenomics Mapping Consortium was established in response to
this.88
The primary aim of the Consortium is to provide a publicly
accessible resource of epigenomic maps in stem cells and primary
ex vivo tissue. The Consortium will leverage new
experimental approaches that use NGS technologies to explore
and map the epigenome, covering not only DNA methylation and histone modifications but also chromatin
accessibility and small RNA transcripts in stem cells and
primary ex vivo tissue. The somatic epigenomic profile varies
from one tissue to another, and it is important to study the
‘correct or appropriate’ tissue for the specific diseases. Therefore,
the tissue studied by the Consortium will be chosen to represent
the normal counterparts of tissue and organ systems frequently
involved in human disease.
The arrival of NGS and TGS technologies has caused a paradigm shift in the approaches to epigenomic studies and also
improved our knowledge of their impacts on human biology and
diseases. Comprehensive studies of epigenetics will also enhance
our understanding of the interactions and relationships between
DNA methylation and histone modification, with the importance and implications of these relationships on normal biology
and diseases being increasingly recognised.89 These technologies
and methods will continue to be used to explore the presently unanswered
questions in epigenomics.
Competing interests None.
Contributors CSK wrote this review paper and did the literature search. NN, WM and
RS were involved in editing and critical review of the manuscript. CSK and RS had final
responsibility for the decision to submit the paper for publication.
Provenance and peer review Not commissioned; externally peer reviewed.
REFERENCES
1. Feinberg AP. Phenotypic plasticity and the epigenetics of human disease. Nature 2007;447:433–40.
2. Portela A, Esteller M. Epigenetic modifications and human disease. Nat Biotechnol 2010;28:1057–68.
3. Urdinguio RG, Sanchez-Mut JV, Esteller M. Epigenetic mechanisms in neurological diseases: genes, syndromes, and therapies. Lancet Neurol 2009;8:1056–72.
4. Ballestar E. Epigenetic alterations in autoimmune rheumatic diseases. Nat Rev Rheumatol 2011;7:263–71.
5. Esteller M. Epigenetics in cancer. N Engl J Med 2008;358:1148–59.
6. Rodríguez-Paredes M, Esteller M. Cancer epigenetics reaches mainstream oncology. Nat Med 2011;17:330–9.
7. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TF, McCarroll SA, Visscher PM. Finding the missing heritability of complex diseases. Nature 2009;461:747–53.
8. Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, Nadeau JH. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet 2010;11:446–50.
9. Clarke AJ, Cooper DN. GWAS: heritability missing in action? Eur J Hum Genet 2010;18:859–61.
10. Ley TJ, Mardis ER, Ding L, Fulton B, McLellan MD, Chen K, Dooling D, Dunford-Shore BH, McGrath S, Hickenbotham M, Cook L, Abbott R, Larson DE, Koboldt DC, Pohl C, Smith S, Hawkins A, Abbott S, Locke D, Hillier LW, Miner T, Fulton L, Magrini V, Wylie T, Glasscock J, Conyers J, Sander N, Shi X, Osborne JR, Minx P, Gordon D, Chinwalla A, Zhao Y, Ries RE, Payton JE, Westervelt P, Tomasson MH, Watson M, Baty J, Ivanovich J, Heath S, Shannon WD, Nagarajan R, Walter MJ, Link DC, Graubert TA, DiPersio JF, Wilson RK. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 2008;456:66–72.
11. Lee W, Jiang Z, Liu J, Haverty PM, Guan Y, Stinson J, Yue P, Zhang Y, Pant KP, Bhatt D, Ha C, Johnson S, Kennemer MI, Mohan S, Nazarenko I, Watanabe C, Sparks AB, Shames DS, Gentleman R, de Sauvage FJ, Stern H, Pandita A, Ballinger DG, Drmanac R, Modrusan Z, Seshagiri S, Zhang Z. The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 2010;465:473–7.
12. Kaminsky ZA, Tang T, Wang SC, Ptak C, Oh GH, Wong AH, Feldcamp LA, Virtanen C, Halfvarson J, Tysk C, McRae AF, Visscher PM, Montgomery GW, Gottesman II, Martin NG, Petronis A. DNA methylation profiles in monozygotic and dizygotic twins. Nat Genet 2009;41:240–5.
13. Petronis A. Epigenetics and twins: three variations on the theme. Trends Genet 2006;22:347–50.
14. Bell JT, Spector TD. A twin approach to unraveling epigenetics. Trends Genet 2011;27:116–25.
15. Fraga MF, Ballestar E, Paz MF, Ropero S, Setien F, Ballestar ML, Heine-Suñer D, Cigudosa JC, Urioste M, Benitez J, Boix-Chornet M, Sanchez-Aguilera A, Ling C, Carlsson E, Poulsen P, Vaag A, Stephan Z, Spector TD, Wu YZ, Plass C, Esteller M. Epigenetic differences arise during the lifetime of monozygotic twins. Proc Natl Acad Sci USA 2005;102:10604–9.
16. Singh SM, Murphy B, O’Reilly R. Epigenetic contributors to the discordance of monozygotic twins. Clin Genet 2002;62:97–103.
17. Poulsen P, Esteller M, Vaag A, Fraga MF. The epigenetic basis of twin discordance in age-related diseases. Pediatr Res 2007;61:38R–42.
18. Haque FN, Gottesman II, Wong AH. Not really identical: epigenetic differences in monozygotic twins and implications for twin studies in psychiatry. Am J Med Genet C Semin Med Genet 2009;151C:136–41.
19. Ballestar E. Epigenetics lessons from twins: prospects for autoimmune disease. Clin Rev Allergy Immunol 2010;39:30–41.
20. Javierre BM, Fernandez AF, Richter J, Al-Shahrour F, Martin-Subero JI, Rodriguez-Ubreva J, Berdasco M, Fraga MF, O’Hanlon TP, Rider LG, Jacinto FV, Lopez-Longo FJ, Dopazo J, Forn M, Peinado MA, Carreño L, Sawalha AH, Harley JB, Siebert R, Esteller M, Miller FW, Ballestar E. Changes in the pattern of DNA methylation associate with twin discordance in systemic lupus erythematosus. Genome Res 2010;20:170–9.
21. Baranzini SE, Mudge J, van Velkinburgh JC, Khankhanian P, Khrebtukova I, Miller NA, Zhang L, Farmer AD, Bell CJ, Kim RW, May GD, Woodward JE, Caillier SJ, McElroy JP, Gomez R, Pando MJ, Clendenen LE, Ganusova EE, Schilkey FD, Ramaraj T, Khan OA, Huntley JJ, Luo S, Kwok PY, Wu TD, Schroth GP, Oksenberg JR, Hauser SL, Kingsmore SF. Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis. Nature 2010;464:1351–6.
22. Mardis ER. The impact of next-generation sequencing technology on genetics. Trends Genet 2008;24:133–41.
23. Morozova O, Marra MA. Applications of next generation sequencing technologies in functional genomics. Genomics 2008;92:255–64.
24. Werner T. Next generation sequencing in functional genomics. Brief Bioinform 2010;11:499–511.
25. Horner DS, Pavesi G, Castrignanò T, De Meo PD, Liuni S, Sammeth M, Picardi E, Pesole G. Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing. Brief Bioinform 2010;11:181–97.
26. Huss M. Introduction into the analysis of high-throughput-sequencing based epigenome data. Brief Bioinform 2010;11:512–23.
27. Laird PW. Principles and challenges of genome-wide DNA methylation analysis. Nat Rev Genet 2010;11:191–203.
28. Hurd PJ, Nelson CJ. Advantages of next-generation sequencing versus the microarray in epigenetic research. Brief Funct Genomic Proteomic 2009;8:174–83.
29. Hirst M, Marra MA. Next generation sequencing based approaches to epigenomics. Brief Funct Genomics 2010;9:455–65.
30. Park PJ. ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet 2009;10:669–80.
31. Schones DE, Zhao K. Genome-wide approaches to studying chromatin modifications. Nat Rev Genet 2008;9:179–91.
32. Zhou VW, Goren A, Bernstein BE. Charting histone modifications and the functional organization of mammalian genomes. Nat Rev Genet 2011;12:7–18.
33. Izzo A, Schneider R. Chatting histone modifications in mammals. Brief Funct Genomics 2010;9:429–43.
34. Wang Z, Schones DE, Zhao K. Characterization of human epigenomes. Curr Opin Genet Dev 2009;19:127–34.
35. Suzuki MM, Bird A. DNA methylation landscapes: provocative insights from epigenomics. Nat Rev Genet 2008;9:465–76.
36. Esteller M. Cancer epigenomics: DNA methylomes and histone-modification maps. Nat Rev Genet 2007;8:286–98.
37. Jones PA, Baylin SB. The epigenomics of cancer. Cell 2007;128:683–92.
38. Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo QM, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 2009;462:315–22.
39. Schumacher A, Weinhäusl A, Petronis A. Application of microarrays for DNA methylation profiling. Methods Mol Biol 2008;439:109–29.
40. Lister R, Ecker JR. Finding the fifth base: genome-wide sequencing of cytosine methylation. Genome Res 2009;19:959–66.
41. Bibikova M, Le J, Barnes B, Saedinia-Melnyk S, Zhou L, Shen R, Gunderson KL. Genome-wide DNA methylation profiling using Infinium assay. Epigenomics 2009;1:177–200.
42. Bibikova M, Fan JB. Genome-wide DNA methylation profiling. Wiley Interdiscip Rev Syst Biol Med 2010;2:210–23.
43. Metzker ML. Sequencing technologies – the next generation. Nat Rev Genet 2010;11:31–46.
44. Mardis ER. A decade’s perspective on DNA sequencing technology. Nature 2011;470:198–203.
45. Schadt EE, Turner S, Kasarskis A. A window into third-generation sequencing. Hum Mol Genet 2010;19:R227–40.
46. Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, Bibillo A, Bjornson K, Chaudhuri B, Christians F, Cicero R, Clark S, Dalal R, Dewinter A, Dixon J, Foquet M, Gaertner A, Hardenbol P, Heiner C, Hester K, Holden D, Kearns G, Kong X, Kuse R, Lacroix Y, Lin S, Lundquist P, Ma C, Marks P, Maxham M, Murphy D, Park I, Pham T, Phillips M, Roy J, Sebra R, Shen G, Sorenson J, Tomaney A, Travers K, Trulson M, Vieceli J, Wegener J, Wu D, Yang A, Zaccarin D, Zhao P, Zhong F, Korlach J, Turner S. Real-time DNA sequencing from single polymerase molecules. Science 2009;323:133–8.
47. Flusberg BA, Webster DR, Lee JH, Travers KJ, Olivares EC, Clark TA, Korlach J, Turner SW. Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat Methods 2010;7:461–5.
48. Li Y, Zhu J, Tian G, Li N, Li Q, Ye M, Zheng H, Yu J, Wu H, Sun J, Zhang H, Chen Q, Luo R, Chen M, He Y, Jin X, Zhang Q, Yu C, Zhou G, Sun J, Huang Y, Zheng H, Cao H, Zhou X, Guo S, Hu X, Li X, Kristiansen K, Bolund L, Xu J, Wang W, Yang H, Wang J, Li R, Beck S, Wang J, Zhang X. The DNA methylome of human peripheral blood mononuclear cells. PLoS Biol 2010;8:e1000533.
49. Laurent L, Wong E, Li G, Huynh T, Tsirigos A, Ong CT, Low HM, Kin Sung KW, Rigoutsos I, Loring J, Wei CL. Dynamic changes in the human methylome during differentiation. Genome Res 2010;20:320–31.
50. Dahl C, Grønbæk K, Guldberg P. Advances in DNA methylation: 5-hydroxymethylcytosine revisited. Clin Chim Acta 2011;412:831–6.
51. Wang J, Wang W, Li R, Li Y, Tian G, Goodman L, Fan W, Zhang J, Li J, Zhang J, Guo Y, Feng B, Li H, Lu Y, Fang X, Liang H, Du Z, Li D, Zhao Y, Hu Y, Yang Z, Zheng H, Hellmann I, Inouye M, Pool J, Yi X, Zhao J, Duan J, Zhou Y, Qin J, Ma L, Li G, Yang Z, Zhang G, Yang B, Yu C, Liang F, Li W, Li S, Li D, Ni P, Ruan J, Li Q, Zhu H, Liu D, Lu Z, Li N, Guo G, Zhang J, Ye J, Fang L, Hao Q, Chen Q, Liang Y, Su Y, San A, Ping C, Yang S, Chen F, Li L, Zhou K, Zheng H, Ren Y, Yang L, Gao Y, Yang G, Li Z, Feng X, Kristiansen K, Wong GK, Nielsen R, Durbin R, Bolund L, Zhang X, Li S, Yang H, Wang J. The diploid genome sequence of an Asian individual. Nature 2008;456:60–5.
52. Maunakea AK, Nagarajan RP, Bilenky M, Ballinger TJ, D’Souza C, Fouse SD, Johnson BE, Hong C, Nielsen C, Zhao Y, Turecki G, Delaney A, Varhol R, Thiessen N, Shchors K, Heine VM, Rowitch DH, Xing X, Fiore C, Schillebeeckx M, Jones SJ, Haussler D, Marra MA, Hirst M, Wang T, Costello JF. Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature 2010;466:253–7.
53. Lister R, Pelizzola M, Kida YS, Hawkins RD, Nery JR, Hon G, Antosiewicz-Bourget J, O’Malley R, Castanon R, Klugman S, Downes M, Yu R, Stewart R, Ren B, Thomson JA, Evans RM, Ecker JR. Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 2011;471:68–73.
54. Hawkins RD, Hon GC, Lee LK, Ngo Q, Lister R, Pelizzola M, Edsall LE, Kuan S, Luu Y, Klugman S, Antosiewicz-Bourget J, Ye Z, Espinoza C, Agarwahl S, Shen L, Ruotti V, Wang W, Stewart R, Thomson JA, Ecker JR, Ren B. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 2010;6:479–91.
55. Condic ML, Rao M. Alternative sources of pluripotent stem cells: ethical and scientific issues revisited. Stem Cells Dev 2010;19:1121–9.
56. Hyun I. The bioethics of stem cell research and therapy. J Clin Invest 2010;120:71–5.
57. Tahiliani M, Koh KP, Shen Y, Pastor WA, Bandukwala H, Brudno Y, Agarwal S, Iyer LM, Liu DR, Aravind L, Rao A. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 2009;324:930–5.
58. Kriaucionis S, Heintz N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 2009;324:929–30.
59. Pastor WA, Pape UJ, Huang Y, Henderson HR, Lister R, Ko M, McLoughlin EM, Brudno Y, Mahapatra S, Kapranov P, Tahiliani M, Daley GQ, Liu XS, Ecker JR, Milos PM, Agarwal S, Rao A. Genome-wide mapping of 5-hydroxymethylcytosine in embryonic stem cells. Nature 2011;473:394–7.
60. Ficz G, Branco MR, Seisenberger S, Santos F, Krueger F, Hore TA, Marques CJ, Andrews S, Reik W. Dynamic regulation of 5-hydroxymethylcytosine in mouse ES cells and during differentiation. Nature 2011;473:398–402.
61. Williams K, Christensen J, Pedersen MT, Johansen JV, Cloos PA, Rappsilber J, Helin K. TET1 and hydroxymethylcytosine in transcription and DNA methylation fidelity. Nature 2011;473:343–8.
62. Wu H, D’Alessio AC, Ito S, Xia K, Wang Z, Cui K, Zhao K, Eve Sun Y, Zhang Y. Dual functions of Tet1 in transcriptional regulation in mouse embryonic stem cells. Nature 2011;473:389–93.
63. Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K. High-resolution profiling of histone methylations in the human genome. Cell 2007;129:823–37.
64. Maruyama R, Choudhury S, Kowalczyk A, Bessarabova M, Beresford-Smith B, Conway T, Kaspi A, Wu Z, Nikolskaya T, Merino VF, Lo PK, Liu XS, Nikolsky Y, Sukumar S, Haviv I, Polyak K. Epigenetic regulation of cell type-specific expression patterns in the human mammary epithelium. PLoS Genet 2011;7:e1001369.
65. Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, Zhang X, Wang L, Issner R, Coyne M, Ku M, Durham T, Kellis M, Bernstein BE. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 2011;473:43–9.
66. Harismendy O, Notani D, Song X, Rahim NG, Tanasa B, Heintzman N, Ren B, Fu XD, Topol EJ, Rosenfeld MG, Frazer KA. 9p21 DNA variants associated with coronary artery disease impair interferon-γ signalling response. Nature 2011;470:264–8.
67. Wasserman NF, Aneas I, Nobrega MA. An 8q24 gene desert variant associated with prostate cancer risk confers differential in vivo activity to a MYC enhancer. Genome Res 2010;20:1191–7.
68.
69.
70.
Kong A, Steinthorsdottir V, Masson G, Thorleifsson G, Sulem P, Besenbacher S,
Jonasdottir A, Sigurdsson A, Kristinsson KT, Jonasdottir A, Frigge ML, Gylfason A,
Olason PI, Gudjonsson SA, Sverrisson S, Stacey SN, Sigurgeirsson B, Benediktsdottir
KR, Sigurdsson H, Jonsson T, Benediktsson R, Olafsson JH, Johannsson OT,
Hreidarsson AB, Sigurdsson G, Ferguson-Smith AC, Gudbjartsson DF, Thorsteinsdottir
U, Stefansson K; DIAGRAM Consortium. Parental origin of sequence variants
associated with complex diseases. Nature 2009;462:868e74.
Wallace C, Smyth DJ, Maisuria-Armer M, Walker NM, Todd JA, Clayton DG. The
imprinted DLK1-MEG3 gene region on chromosome 14q32.2 alters susceptibility to
type 1 diabetes. Nat Genet 2010;42:68e71.
Bell JT, Pai AA, Pickrell JK, Gaffney DJ, Pique-Regi R, Degner JF, Gilad Y, Pritchard
JK. DNA methylation patterns associate with genetic and gene expression variation
in HapMap cell lines. Genome Biol 2011;12:R10.
71. Verlaan DJ, Berlivet S, Hunninghake GM, Madore AM, Larivière M, Moussette S,
Grundberg E, Kwan T, Ouimet M, Ge B, Hoberman R, Swiatek M, Dias J, Lam KC,
Koka V, Harmsen E, Soto-Quiros M, Avila L, Celedón JC, Weiss ST, Dewar K, Sinnett
D, Laprise C, Raby BA, Pastinen T, Naumova AK. Allele-specific chromatin remodeling
in the ZPBP2/GSDMB/ORMDL3 locus associated with the risk of asthma and
autoimmune disease. Am J Hum Genet 2009;85:377-93.
72. Tycko B. Mapping allele-specific DNA methylation: a new tool for maximizing
information from GWAS. Am J Hum Genet 2010;86:109-12.
73. Kerkel K, Spadola A, Yuan E, Kosek J, Jiang L, Hod E, Li K, Murty VV, Schupf N,
Vilain E, Morris M, Haghighi F, Tycko B. Genomic surveys by methylation-sensitive
SNP analysis identify sequence-dependent allele-specific DNA methylation. Nat
Genet 2008;40:904-8.
74. Zhang Y, Rohde C, Reinhardt R, Voelcker-Rehage C, Jeltsch A. Non-imprinted allele
specific DNA methylation on human autosomes. Genome Biol 2009;10:R138.
75. Shoemaker R, Deng J, Wang W, Zhang K. Allele-specific methylation is prevalent
and is contributed by CpG-SNPs in the human genome. Genome Res 2010;20:883-9.
76. Petronis A. Epigenetics as a unifying principle in the aetiology of complex traits and
diseases. Nature 2010;465:721-7.
77. Bell CG, Beck S. The epigenomic interface between genome and environment in
common complex diseases. Brief Funct Genomics 2010;9:477-85.
78. Maunakea AK, Chepelev I, Zhao K. Epigenome mapping in normal and disease
states. Circ Res 2010;107:327-39.
79. Huang Y, Pastor WA, Shen Y, Tahiliani M, Liu DR, Rao A. The behaviour of
5-hydroxymethylcytosine in bisulfite sequencing. PLoS One 2010;5:e8888.
80. Jin SG, Kadam S, Pfeifer GP. Examination of the specificity of DNA methylation
profiling techniques towards 5-methylcytosine and 5-hydroxymethylcytosine. Nucleic
Acids Res 2010;38:e125.
81. Jin SG, Wu X, Li AX, Pfeifer GP. Genomic mapping of 5-hydroxymethylcytosine in the
human brain. Nucleic Acids Res 2011;39:5015-24.
82. Clarke J, Wu HC, Jayasinghe L, Patel A, Reid S, Bayley H. Continuous base
identification for single-molecule nanopore DNA sequencing. Nat Nanotechnol
2009;4:265-70.
83. Loh M, Liem N, Lim PL, Vaithilingam A, Cheng CL, Salto-Tellez M, Yong WP, Soong
R. Impact of sample heterogeneity on methylation analysis. Diagn Mol Pathol
2010;19:243-7.
84. Kantlehner M, Kirchner R, Hartmann P, Ellwart JW, Alunni-Fabbroni M, Schumacher
A. A high-throughput DNA methylation analysis of a single cell. Nucleic Acids Res
2011;39:e44.
85. Thompson JF, Milos PM. The properties and applications of single-molecule DNA
sequencing. Genome Biol 2011;12:217.
86. Harris TD, Buzby PR, Babcock H, Beer E, Bowers J, Braslavsky I, Causey M, Colonell
J, Dimeo J, Efcavitch JW, Giladi E, Gill J, Healy J, Jarosz M, Lapen D, Moulton K,
Quake SR, Steinmann K, Thayer E, Tyurina A, Ward R, Weiss H, Xie Z.
Single-molecule DNA sequencing of a viral genome. Science 2008;320:106-9.
87. Goren A, Ozsolak F, Shoresh N, Ku M, Adli M, Hart C, Gymrek M, Zuk O, Regev A,
Milos PM, Bernstein BE. Chromatin profiling by directly sequencing small quantities of
immunoprecipitated DNA. Nat Methods 2010;7:47-9.
88. Bernstein BE, Stamatoyannopoulos JA, Costello JF, Ren B, Milosavljevic A,
Meissner A, Kellis M, Marra MA, Beaudet AL, Ecker JR, Farnham PJ, Hirst M, Lander
ES, Mikkelsen TS, Thomson JA. The NIH Roadmap Epigenomics Mapping
Consortium. Nat Biotechnol 2010;28:1045-8.
89. Cedar H, Bergman Y. Linking DNA methylation and histone modification: patterns
and paradigms. Nat Rev Genet 2009;10:295-304.
J Med Genet 2011;48:721-730. doi:10.1136/jmedgenet-2011-100242
Studying the epigenome using next
generation sequencing
Chee Seng Ku, Nasheen Naidoo, Mengchu Wu and Richie Soong
J Med Genet 2011 48: 721-730 originally published online August 8, 2011
doi: 10.1136/jmedgenet-2011-100242
Updated information and services can be found at:
http://jmg.bmj.com/content/48/11/721
This article cites 89 articles, 22 of which you can access for free at:
http://jmg.bmj.com/content/48/11/721#BIBL