Cmb2003 lecture 12 2013
1. DNA microarrays and high-throughput sequencing approaches
for analysing patterns of gene
expression
Dianne Ford
2. Functional genomics
• Experimental methods of identifying the function and
expression pattern of genomic sequences
– Bioinformatics (CMB2005)
– Genome projects – Professor Morgan
– Mouse knockout models (MMed lecture 11)
– DNA microarrays and high-throughput (“next generation”)
sequencing
• Detect patterns of gene expression
– E.g. compare different tissues (see Mortazavi A et al (2008) Mapping and
quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8)
– E.g. compare normal and abnormal states (e.g. cancer)
– E.g. compare effects in specific tissue of pharmaceutical/dietary intervention
(see Lagouge et al (2006) Resveratrol improves mitochondrial function and
protects against metabolic disease by activating Sirt1 and PGC-1alpha. Cell
127: 1109-1122)
– E.g. compare same tissue/cell line before and after a specific treatment (e.g.
nutrient starvation; oxidative stress)
3. So what is a DNA microarray?
• A glass slide (like a microscope slide) spotted
at high density with individual DNA sequences
(up to approximately 40,000), each of which
corresponds to a known gene product
– Oligonucleotides (50-70 mers)
– PCR products
4. Why is that useful?
• Incubate microarray with fluorescently labelled cDNA from the tissue/cell line of
interest.
• Labelled cDNA will hybridise (stick) only to
DNA spots on the array to which it is
complementary.
• Detect which (known) spots the labelled cDNA has
hybridised to, to determine which genes are
expressed.
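The idea that labelled cDNA sticks only to complementary spots can be sketched in a few lines of Python. This is a toy model under stated assumptions: the probe and cDNA sequences are invented for illustration, hybridisation is treated as a perfect-match test (real hybridisation tolerates some mismatches), and the helper names `reverse_complement` and `hybridises` are hypothetical.

```python
# Toy model: a labelled cDNA "sticks" to a spot only if it contains the
# reverse complement of the probe sequence printed at that spot.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hybridises(probe: str, cdna: str) -> bool:
    """Perfect-match assumption: the cDNA must contain the probe's reverse complement."""
    return reverse_complement(probe) in cdna.upper()


# Hypothetical probes (real array oligonucleotides are 50-70 mers).
probes = {"geneA": "ATGGCCTTA", "geneB": "GGGTTTCCC"}
# Hypothetical labelled cDNA prepared from the tissue of interest.
labelled_cdna = ["CCTAAGGCCATGG"]

# A gene is scored as expressed if any labelled cDNA binds its spot.
expressed = sorted(
    name
    for name, probe in probes.items()
    if any(hybridises(probe, cdna) for cdna in labelled_cdna)
)
print(expressed)
```

In this made-up example only the geneA spot is bound, so only geneA would be scored as expressed.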
5. Sounds too easy!
• True – that is an over-simplification
– It is more usual to hybridise cDNA samples prepared
separately from two different tissues/cell types or from
the same tissue in two different states (e.g. normal and
diseased) or from the same tissue/cell type before and
after a specific treatment.
– By incorporating a different coloured fluorescent dye into
each sample, the relative level of expression of each gene
on the array in each of the two samples can be compared.
• Other platforms (e.g. Affymetrix) use just a single dye and samples
are hybridised to separate arrays, which are then compared.
6. Principle of analysis of relative levels of gene
expression by DNA microarray hybridisation
• Sample A: RNA isolated; reverse transcribe (to produce cDNA), incorporating green fluorescent dye
• Sample B: RNA isolated; reverse transcribe (to produce cDNA), incorporating red fluorescent dye
• Mix
• Hybridise to microarray and scan
• Possible results for each spot:
– Expressed in neither sample
– Expressed only in sample A
– Expressed only in sample B
– Expressed equally in both samples
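The four outcomes listed above can be read off the scanned green and red signals of each spot. The sketch below is illustrative only: the `background` and `fold` thresholds are hypothetical values rather than figures from the lecture, and real analyses normalise the two channels before comparing them. On a merged image, a spot with similar green and red signal appears yellow.

```python
import math


def classify_spot(green: float, red: float,
                  background: float = 100.0, fold: float = 2.0) -> str:
    """Classify one spot from its green (sample A) and red (sample B) signal.

    background and fold are hypothetical thresholds chosen for illustration.
    """
    a_on = green > background
    b_on = red > background
    if not a_on and not b_on:
        return "expressed in neither sample"
    if a_on and not b_on:
        return "expressed only in sample A"
    if b_on and not a_on:
        return "expressed only in sample B"
    # Both channels are above background: compare them on a log2 scale.
    log_ratio = math.log2(red / green)
    if abs(log_ratio) < math.log2(fold):
        return "expressed equally in both samples"
    return "higher in sample B" if log_ratio > 0 else "higher in sample A"


# Hypothetical (green, red) intensities for five spots.
spots = [(50, 60), (5000, 40), (30, 4000), (3000, 3100), (1000, 8000)]
for green, red in spots:
    print(green, red, classify_spot(green, red))
```

The last two return values go beyond the four categories on the slide: they cover the common case where a gene is expressed in both samples but at different levels.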
8. An example of a microarray experiment
• Middle-aged (1 y) male mice provided with standard diet or
high fat (60% energy) diet
• Resveratrol added to the diet of half of the mice on each diet
• Resveratrol shifted physiology of mice on high-fat diet
towards mice on standard diet and increased survival
significantly
• Microarray analysis of gene expression in liver showed
resveratrol opposed effects of high-calorie diet on 144 out of
153 significantly-altered pathways
Lagouge et al (2006) Cell 127: 1109-1122
9. An example of a microarray experiment
Parametric analysis of gene-set
enrichment (PAGE) comparing
every pathway significantly
upregulated (red) or downregulated
(blue) by either the HC diet or
resveratrol (153 in total, with 144
showing opposing effects).
• Microarray analysis of gene expression in liver showed
resveratrol opposed effects of high-calorie diet on 144 out of
153 significantly-altered pathways
Lagouge et al (2006) Cell 127: 1109-1122
10. Next generation DNA sequencing: extension of
the approach to “counting” transcript numbers
in an mRNA sample
• Also known as “massively parallel” DNA sequencing
• Different commercial platforms are available
– E.g. Illumina Solexa Genome Analyzer
• Achieves parallel (simultaneous) short-read (35-75 bp) sequencing of
hundreds of millions of random fragments of DNA (or of cDNA, for
determining transcript (mRNA) numbers).
– Fragments for sequencing are arrayed randomly in clusters of around 10^3-10^6 copies,
produced by “bridge amplification” of single fragments that bind to a solid
support (flow cell) covered with oligonucleotides that pair with adapter
oligonucleotides ligated to each end of the fragmented DNA (or cDNA copies of
short (e.g. approx. 200 base) mRNA fragments).
– Then uses “DNA sequencing by synthesis” technology
» All 4 nucleotides are added together, with DNA polymerase; each carries a base-unique fluorescent
label and a chemically blocked 3’OH group, so only one base is incorporated at a time.
» Flow cell imaged by sophisticated optics after laser excitation.
» Large number of copies sequenced in each cluster is required to generate a sufficiently-strong
signal for detection.
» Then 3’ blocking group removed chemically and next round proceeds
• Sequence reads aligned against a reference genome.
• Depending on initial sample preparation gives information on
genomic sequence variations, splice variants or transcript (mRNA)
numbers.
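The cycle of incorporate, image, unblock can be illustrated with a toy simulation of a single cluster. This is a sketch under stated assumptions, not a model of any real instrument: the dye colours in `DYE` are invented, strand orientation and sequencing errors are ignored, and the function name is hypothetical.

```python
# Base pairing used by the polymerase when copying the template.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Hypothetical colour for each labelled nucleotide (one unique label per base).
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
# Lookup used at the imaging step to turn a colour back into a base call.
BASE_FOR_COLOUR = {colour: base for base, colour in DYE.items()}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Read one base per cycle from a cluster of identical template copies."""
    read = []
    for position in range(min(cycles, len(template))):
        # The blocked 3'OH group means exactly one nucleotide is added per cycle.
        incorporated = COMPLEMENT[template[position]]
        # Imaging: the colour detected identifies the incorporated base.
        colour = DYE[incorporated]
        read.append(BASE_FOR_COLOUR[colour])
        # The blocking group is then removed and the next cycle proceeds.
    return "".join(read)


# Five cycles on a hypothetical template give a five-base read.
print(sequence_by_synthesis("ACGTTGCA", cycles=5))
```

The read length equals the number of cycles, which is why reads on this kind of platform are short (35-75 bp in the figures given above).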
11. New generation (Solexa) sequencing:
step 1
[Figure: the starting material is a DNA sample OR a cDNA sample (copy of mRNA, so
representative of the number of copies
of each mRNA in the sample)]
18. New generation (Solexa) sequencing:
Use for RNA-seq to determine transcript (mRNA) copy
number
- mRNA is fragmented by Mg-catalysed hydrolysis, to give fragments of approximately 200 bases
- fragments are copied to cDNA by random-primed reverse transcription; then Solexa sequencing
RPKM = Reads Per Kilobase of transcript per Million mapped reads
Mortazavi A et al (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8
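The RPKM definition above translates directly into a short calculation. The sketch below follows that definition; the function name, parameter names and the example numbers are invented for illustration.

```python
def rpkm(reads_mapped_to_transcript: int,
         transcript_length_bases: int,
         total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads."""
    kilobases = transcript_length_bases / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return reads_mapped_to_transcript / (kilobases * millions_of_reads)


# Hypothetical example: 500 reads map to a 2,000-base transcript
# in a run with 10 million mapped reads.
value = rpkm(500, 2_000, 10_000_000)
print(value)  # 500 / (2 kb * 10 million) = 25.0
```

Dividing by transcript length corrects for long transcripts yielding more fragments, and dividing by the total number of mapped reads makes values comparable between sequencing runs of different depth.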
19. Example of the use of sequence data from
multiple genomes: deducing gene (protein)
function
– Functionally-linked proteins should have
homologues in all organisms with that function
• E.g. Flagella proteins should be only in bacteria with
flagella
[Figure: organisms labelled by presence of flagella (Flagella? No, Yes, Yes, No, Yes, No)]
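The reasoning on this slide can be sketched as a comparison of presence/absence patterns across genomes. The Yes/No flagella labels are taken from the figure above in the order they were extracted; the gene names and their presence patterns are hypothetical, made up only to show the comparison.

```python
# Flagella present in each of six organisms (No, Yes, Yes, No, Yes, No).
has_flagella = [False, True, True, False, True, False]

# Hypothetical data: is a homologue of each gene found in each genome?
gene_presence = {
    "geneX": [False, True, True, False, True, False],  # same pattern as flagella
    "geneY": [True, True, True, True, True, True],     # found in every genome
    "geneZ": [False, True, False, False, True, False], # missing from one flagellate
}

# A gene is a candidate flagella gene if its homologues occur in exactly
# the organisms that have flagella.
candidates = [
    gene for gene, pattern in gene_presence.items()
    if pattern == has_flagella
]
print(candidates)
```

Here only geneX matches the flagella pattern. In practice a match is treated as a hint about function rather than proof, and some tolerance for imperfect matches is usually needed.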
20. References
Mardis ER (2008) Next-generation DNA sequencing methods. Annu Rev
Genomics Hum Genet. 9:387-402
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and
quantifying mammalian transcriptomes by RNA-Seq.
Nat Methods. 5:621-8
Noordewier M & Warren P (2001) Gene expression microarrays and the
integration of biological knowledge. Trends in Biotechnology
19:412-415
Lagouge et al (2006) Resveratrol improves mitochondrial function and
protects against metabolic disease by activating Sirt1 and PGC-1alpha.
Cell 127: 1109-1122