Detecting single molecules and
sequencing DNA
Rohan T. Ranasinghe
University Chemical Laboratories, Lensfield Road, Cambridge CB2 1EW
Locations and timeline
(map scale: 1 mile)
1953: Discovery of the structure of DNA, Old Cavendish Laboratory (image: http://www.cambridge2000.com)
1977: Sanger method for sequencing invented, LMB (image: http://www2.mrc-lmb.cam.ac.uk)
1993: Work on Human genome project at the Sanger starts, Sanger Institute (image: Genome Research Ltd.)
1997: Work on Solexa method for sequencing started, Chemistry department (image: http://www.flickr.com/photos/shai-bl/5584629687/sizes/m/in/photostream/)
Structure of DNA
Solved in Cambridge in 1953 by James Watson and Francis Crick using data collected by Rosalind Franklin and Maurice Wilkins at King’s College London
The key to the structure was base pairing
The fidelity of the Watson-Crick base pairs and the double helix structure are the cornerstones of DNA sequencing and modern forensic science
Images: http://www.themicrobiologist.com; http://www.flickr.com/photos/grahams__flickr/504365411/sizes/l/in/photostream/; http://www.flickr.com/photos/major_clanger/5881631482/sizes/o/in/photostream/
DNA Sequencing
Why would you want to sequence DNA?
A genome contains the information required to build an organism
It’s a long book...
~3,000,000,000 (3 × 10^9) letters in each of the ~10^14 cells in a human
Distance between base pairs = 0.34 nm (0.34 × 10^-9 m)
The DNA in one of your cells would be 2 m long in the B-form structure
Images: http://www.sikeston.k12.mo.us; © Invitrogen; Wikipedia
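The 2 m figure can be checked from the two numbers on the slide. The short sketch below multiplies the letter count by the base-pair spacing; it assumes (this is not stated on the slide) that a typical human cell is diploid and so carries two copies of the ~3 × 10^9-letter genome, which is what brings the total to about 2 m. The constant and function names are illustrative.

```python
# Figures quoted on the slide
BASE_PAIRS_PER_GENOME_COPY = 3e9   # ~3,000,000,000 letters
RISE_PER_BASE_PAIR_M = 0.34e-9     # 0.34 nm between base pairs (B-form)

# Assumption (not on the slide): a typical cell is diploid,
# i.e. it holds two copies of the genome.
COPIES_PER_CELL = 2


def dna_length_m(base_pairs, copies=1):
    """Contour length in metres of `copies` stretched-out genome copies."""
    return base_pairs * copies * RISE_PER_BASE_PAIR_M


one_copy_m = dna_length_m(BASE_PAIRS_PER_GENOME_COPY)
per_cell_m = dna_length_m(BASE_PAIRS_PER_GENOME_COPY, COPIES_PER_CELL)

print(f"one genome copy:  {one_copy_m:.2f} m")
print(f"per diploid cell: {per_cell_m:.2f} m")
```

One copy comes to roughly 1 m, so the slide's 2 m is consistent with counting both copies in a cell.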
Sanger sequencing
Based on copying of DNA:
[Animation: a 10-letter strand, CAGTCAGTCA, is copied one letter at a time; image: Genome Research Ltd.]
Incorporation of fluorescent nucleotide terminates the copying process
Repeat ~1030 times
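The idea that a terminating fluorescent letter reveals the sequence can be illustrated with a toy model. The sketch below is not the laboratory protocol: it assumes that each copy stops at a random position, that the last letter of every fragment is the labelled one, and that sorting fragments by length stands in for the size separation used in practice. The strand CAGTCAGTCA is taken from the slide; treating it as the template, and all function names, are assumptions made for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def copy_strand(template):
    """Full-length copy built by Watson-Crick pairing (strand direction ignored)."""
    return "".join(COMPLEMENT[base] for base in template)


def terminated_fragments(template, n_molecules, rng):
    """Copy many molecules; each copy stops at a random position.

    The final letter of each fragment plays the role of the fluorescent,
    chain-terminating nucleotide.
    """
    full_copy = copy_strand(template)
    fragments = []
    for _ in range(n_molecules):
        stop = rng.randint(1, len(full_copy))  # inclusive on both ends
        fragments.append(full_copy[:stop])
    return fragments


def read_sequence(fragments):
    """Order fragments by length and read the terminal letter at each length."""
    terminal_by_length = {len(f): f[-1] for f in fragments}
    return "".join(terminal_by_length[n] for n in sorted(terminal_by_length))


template = "CAGTCAGTCA"  # strand shown on the slide, assumed to be the template
rng = random.Random(0)   # fixed seed so the run is repeatable
fragments = terminated_fragments(template, n_molecules=1000, rng=rng)

print("copied strand:", copy_strand(template))
print("read back:    ", read_sequence(fragments))
```

Because many molecules are copied, every fragment length is represented, which is why the method needs a large population of molecules rather than one.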
Sanger sequencing
Copied sequence / Original sequence
[Figure: letters as extracted: G C T A C G A T G C T A C G A T G C T A]
Repeat 3 × 10^8 times to read genome
(would take another 190 years at this speed!*)
*Note: original animation took ~20 seconds
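The 190-year estimate follows from the numbers already quoted. The sketch below assumes that each run of the animation copies 10 letters (the length of the strand shown), which is what makes 3 × 10^9 letters equal 3 × 10^8 repeats; the variable names are illustrative.

```python
LETTERS_IN_GENOME = 3e9        # from the "DNA Sequencing" slide
LETTERS_PER_ANIMATION = 10     # assumption: one run copies a 10-letter strand
SECONDS_PER_ANIMATION = 20     # "original animation took ~20 seconds"
SECONDS_PER_YEAR = 365.25 * 24 * 3600

repeats = LETTERS_IN_GENOME / LETTERS_PER_ANIMATION
years = repeats * SECONDS_PER_ANIMATION / SECONDS_PER_YEAR

print(f"repeats needed: {repeats:.0e}")
print(f"time at animation speed: {years:.0f} years")
```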
The human genome project
Started: 1989 (in the USA)
First draft completed: 2000
‘Finished’: 2003
Cost: $3,000,000,000
UK effort on the Human Genome Project largely carried out in this building in the Sanger Centre
9 chromosomes were sequenced here (about a third of the genome)
Images: http://www.c-spanvideo.org/program/157909-1; http://www.flickr.com/photos/93425126@N00/4394834217/in/set-72157623515077498/
What does it mean to detect a single molecule?
Looking for a needle in a haystack?
How many blades of grass on a football pitch?
About 200,000,000 or 2 × 10^8
How many molecules in a vial of water?
18 mL (1 mole) of water contains Avogadro’s number of molecules: 6.02 × 10^23
So 1 mole of grass blades would cover 6.02 × 10^23 ÷ (2 × 10^8) = 3 × 10^15 football pitches
That’s a lot of haystacks...
What does it mean to detect a single molecule?
1 mole of grass blades = 3 × 10^15 football pitches = 15 × 10^12 km^2
Surface area of Earth = 5 × 10^8 km^2 (10^11 football pitches!)
Surface area of Jupiter = 6 × 10^10 km^2
*Lab demonstration: 180 µL (15 × 10^10 km^2 of grass blades)
Surface area of the Sun = 6 × 10^12 km^2
1 mole of grass blades would cover the surface area of about 2.5 Suns!
All images: nasa.gov
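The chain of comparisons above can be reproduced in a few lines. The sketch below uses only the figures quoted on the slides; the pitch area is not stated there, so it is back-calculated from the slides' own totals (15 × 10^12 km^2 for 3 × 10^15 pitches), and the variable names are illustrative.

```python
AVOGADRO = 6.02e23            # molecules per mole
BLADES_PER_PITCH = 2e8        # blades of grass on a football pitch
MOLE_OF_GRASS_KM2 = 15e12     # area covered by 1 mole of grass blades
SURFACE_KM2 = {"Earth": 5e8, "Jupiter": 6e10, "Sun": 6e12}

# Pitches needed to hold one mole of grass blades (slide rounds to 3e15)
pitches_per_mole = AVOGADRO / BLADES_PER_PITCH

# Pitch area implied by the slide's rounded figures (not stated directly)
implied_pitch_km2 = MOLE_OF_GRASS_KM2 / 3e15

# How many of each body one mole of grass blades would cover
coverage = {body: MOLE_OF_GRASS_KM2 / area for body, area in SURFACE_KM2.items()}

# Lab demonstration: 180 µL of water, with 18 mL per mole
demo_moles = 180e-6 / 18e-3
demo_km2 = demo_moles * MOLE_OF_GRASS_KM2

print(f"pitches per mole: {pitches_per_mole:.2e}")
print(f"implied pitch area: {implied_pitch_km2 * 1e6:.0f} m^2")
print(f"Earth in pitches: {SURFACE_KM2['Earth'] / implied_pitch_km2:.0e}")
for body, n in coverage.items():
    print(f"1 mole of grass covers {n:g} x {body}")
print(f"180 µL demo: {demo_moles:.2f} mol, {demo_km2:.1e} km^2")
```

The 180 µL demonstration sample is a hundredth of a mole, so its grass-blade equivalent is about 2.5 times the surface area of Jupiter, the same ratio that a full mole bears to the Sun.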
Sanger sequencing
Sanger sequencing uses about 2 × 10^10 molecules per 100 letters
Solexa sequencing
Invented in 1997 in this
department
Developed by a spin-out
company in Saffron Walden
Sold for $650,000,000 in 2006
Solexa sequencing
Solexa sequencing uses about 10^3 molecules to read 100 letters
About as many blades of grass as on the penalty spot
Imaging technology: lab demonstration
http://thesportboys.wordpress.com/category/international/page/2/
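To put the two molecule counts side by side, the sketch below compares the quoted figures for Sanger and Solexa sequencing and expresses the Sanger figure in the talk's football-pitch units. All inputs come from the slides; the variable names are illustrative.

```python
SANGER_MOLECULES_PER_100_LETTERS = 2e10
SOLEXA_MOLECULES_PER_100_LETTERS = 1e3
BLADES_PER_PITCH = 2e8   # blades of grass on a football pitch

# How many times fewer molecules Solexa needs for the same 100 letters
fold_reduction = SANGER_MOLECULES_PER_100_LETTERS / SOLEXA_MOLECULES_PER_100_LETTERS

# The Sanger molecule count expressed as football pitches of grass
sanger_in_pitches = SANGER_MOLECULES_PER_100_LETTERS / BLADES_PER_PITCH

print(f"Solexa uses {fold_reduction:.0e} times fewer molecules")
print(f"Sanger: grass blades on {sanger_in_pitches:.0f} football pitches")
```

In the grass analogy, Sanger sequencing needs the blades from about 100 pitches to read 100 letters, whereas the Solexa figure corresponds to the penalty spot.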
Solexa sequencing
© Royal Society of Chemistry publishing
Densely packed microscopic “islands” of
DNA generate information very quickly
Solexa sequencing
© Royal Society of Chemistry publishing
• “Recycled” template molecules are ready for incorporation of the next fluorescent letter
• Possible to read about 100 letters from each DNA strand, rather than 1
Solexa sequencing
© Royal Society of Chemistry publishing; Genome Research Ltd.
Cost to sequence a human genome: around $10,000
Time to sequence a human genome: less than a week
First African, Asian and giant panda genomes sequenced
Sanger Institute owns 37 instruments
Summary
The structure of DNA, discovered in 1953, has been crucial to sequencing the human genome
The first human genome was sequenced using Fred Sanger’s method, invented in 1977. The project ran for 14 years and cost $3 billion
New methods for sequencing use single molecule detection to dramatically accelerate the decoding process
One approach using single molecule techniques, invented by Shankar Balasubramanian and David Klenerman in our department in 1997, is now widely used for sequencing worldwide
Sequencing a human genome now costs around $10,000 and takes less than a week

Detecting single molecules and sequencing DNA

  • 1. Detecting single molecules and sequencing DNA Rohan T. Ranasinghe University Chemical Laboratories, Lensfield Road, Cambridge CB2 1EW
  • 3. Locations and timeline 1 mile http://www.cambridge 2000.com Old Cavendish Laboratory 1953: Discovery of the structure of DNA
  • 4. Locations and timeline 1 mile http://www.cambridge 2000.com Old Cavendish Laboratory 1953: Discovery of the structure of DNA LMB 1977: Sanger method for sequencing invented http://www2.mrc-lmb.cam.ac.uk
  • 5. Locations and timeline 1 mile http://www.cambridge 2000.com Old Cavendish Laboratory 1953: Discovery of the structure of DNA Sanger Institute 1993: Work on Human genome project at the Sanger starts Genome Research Ltd. LMB 1977: Sanger method for sequencing invented http://www2.mrc-lmb.cam.ac.uk
  • 6. Locations and timeline 1 mile http://www.cambridge 2000.com Old Cavendish Laboratory 1953: Discovery of the structure of DNA Chemistry department http://www.flickr.com/ photos/shai- bl/5584629687/sizes/ m/in/photostream/ 1997: Work on Solexa method for sequencing started Sanger Institute 1993: Work on Human genome project at the Sanger starts Genome Research Ltd. LMB 1977: Sanger method for sequencing invented http://www2.mrc-lmb.cam.ac.uk
  • 7. Structure of DNA http://www.themicrobiologist.com Solved in Cambridge in 1953 by James Watson and Francis Crick using data collected by Rosalind Franklin and Maurice Wilkins at King’s College London The key to the structure was base pairing
  • 8. Structure of DNA http://www.themicrobiologist.com Solved in Cambridge in 1953 by James Watson and Francis Crick using data collected by Rosalind Franklin and Maurice Wilkins at King’s College London The key to the structure was base pairing
  • 9. Structure of DNA http://www.flickr.com/photos/grahams__flickr /504365411/sizes/l/in/photostream/ Solved in Cambridge in 1953 by James Watson and Francis Crick using data collected by Rosalind Franklin and Maurice Wilkins at King’s College London The key to the structure was base pairing
  • 10. Structure of DNA http://www.flickr.com/photos/major_clanger/ 5881631482/sizes/o/in/photostream/ http://www.flickr.com/photos/grahams__flickr /504365411/sizes/l/in/photostream/ Solved in Cambridge in 1953 by James Watson and Francis Crick using data collected by Rosalind Franklin and Maurice Wilkins at King’s College London The key to the structure was base pairing
  • 11. Structure of DNA http://www.flickr.com/photos/major_clanger/ 5881631482/sizes/o/in/photostream/ http://www.flickr.com/photos/grahams__flickr /504365411/sizes/l/in/photostream/ Solved in Cambridge in 1953 by James Watson and Francis Crick using data collected by Rosalind Franklin and Maurice Wilkins at King’s College London The key to the structure was base pairing The fidelity of the Watson-Crick base pairs and the double helix structure are the cornerstones of DNA sequencing and modern forensic science
  • 12. DNA Sequencing Why would you want to sequence DNA? http://www.sikeston.k12.mo.us
  • 13. DNA Sequencing Why would you want to sequence DNA? http://www.sikeston.k12.mo.us © Invitrogen
  • 14. DNA Sequencing Why would you want to sequence DNA? A genome contains the information required to build an organism http://www.sikeston.k12.mo.us © Invitrogen
  • 15. DNA Sequencing Why would you want to sequence DNA? A genome contains the information required to build an organism http://www.sikeston.k12.mo.us It’s a long book... © InvitrogenWikipedia
  • 16. DNA Sequencing Why would you want to sequence DNA? A genome contains the information required to build an organism http://www.sikeston.k12.mo.us It’s a long book... © Invitrogen ~3,000,000,000 (3 ×109) letters in each of the ~1014 cells in a human Wikipedia
  • 17. DNA Sequencing Why would you want to sequence DNA? A genome contains the information required to build an organism http://www.sikeston.k12.mo.us It’s a long book... © Invitrogen ~3,000,000,000 (3 ×109) letters in each of the ~1014 cells in a human Distance between base pairs = 0.34 nm (0.34 ×10-9 m) Wikipedia
  • 18. DNA Sequencing Why would you want to sequence DNA? A genome contains the information required to build an organism http://www.sikeston.k12.mo.us It’s a long book... © Invitrogen ~3,000,000,000 (3 ×109) letters in each of the ~1014 cells in a human The DNA in one of your cells would be 2 m long in the B-form structure Distance between base pairs = 0.34 nm (0.34 ×10-9 m) Wikipedia
  • 20. Sanger sequencing CAGTCAGTCA GA C G A C TA G T C Based on copying of DNA: Genome Research Ltd.
  • 21. Sanger sequencing CAGTCAGTCA GA C G T A C TA G T C Based on copying of DNA: Genome Research Ltd.
  • 22. Sanger sequencing CAGTCAGTCA GA C G T A C TA CG T C Based on copying of DNA: Genome Research Ltd.
  • 25. T Sanger sequencing CAGTCAGTCA GA C G T A C G TA CG T A C Based on copying of DNA: Genome Research Ltd. Incorporation of fluorescent nucleotide terminates the copying process
  • 26. T Sanger sequencing CAGTCAGTCA GA C G T A C G TA CG T A C Based on copying of DNA: Repeat ~1030 times Genome Research Ltd.
  • 27. T Sanger sequencing CAGTCAGTCA GA C G T A C G TA CG T A C Based on copying of DNA: Repeat ~1030 times Genome Research Ltd.
  • 28. Sanger sequencing Copied sequence G C T A C G A T G C T A C G A T G C T A Original sequence Repeat 3 × 108 times to read genome (would take another 190 years at this speed!*) *Note: original animation took ~20 seconds)
  • 29. The human genome project http://www.c-spanvideo.org/program/157909-1 Started: 1989 (in the USA)
  • 30. The human genome project First draft completed: 2000 ‘Finished’: 2003 http://www.c-spanvideo.org/program/157909-1 Started: 1989 (in the USA)
  • 32. The human genome project Cost: $3,000,000,000 First draft completed: 2000 ‘Finished’: 2003 http://www.c-spanvideo.org/program/157909-1 Started: 1989 (in the USA)
  • 33. The human genome project Cost: $3,000,000,000 First draft completed: 2000 ‘Finished’: 2003 http://www.flickr.com/photos/93425126@N00/43948 34217/in/set-72157623515077498/ http://www.c-spanvideo.org/program/157909-1 Started: 1989 (in the USA)
  • 34. The human genome project Cost: $3,000,000,000 First draft completed: 2000 ‘Finished’: 2003 http://www.flickr.com/photos/93425126@N00/43948 34217/in/set-72157623515077498/ http://www.c-spanvideo.org/program/157909-1 Started: 1989 (in the USA) UK effort on the Human Genome Project largely carried out in this building in the Sanger Centre
  • 35. The human genome project Cost: $3,000,000,000 First draft completed: 2000 ‘Finished’: 2003 http://www.flickr.com/photos/93425126@N00/43948 34217/in/set-72157623515077498/ http://www.c-spanvideo.org/program/157909-1 Started: 1989 (in the USA) UK effort on the Human Genome Project largely carried out in this building in the Sanger Centre 9 Chromosomes were sequenced here (about a third of the genome)
  • 36. What does it mean to detect a single molecule? Looking for a needle in a haystack?
  • 37. What does it mean to detect a single molecule? Looking for a needle in a haystack? How many blades of grass on a football pitch?
  • 38. What does it mean to detect a single molecule? Looking for a needle in a haystack? About 200,000,000 or 2 × 10⁸ How many blades of grass on a football pitch?
  • 39. What does it mean to detect a single molecule? How many molecules in a vial of water? Looking for a needle in a haystack? About 200,000,000 or 2 × 10⁸ How many blades of grass on a football pitch?
  • 40. What does it mean to detect a single molecule? 18 mL (1 mole) of water contains Avogadro’s number of molecules: 6.02 × 10²³ How many molecules in a vial of water? Looking for a needle in a haystack? About 200,000,000 or 2 × 10⁸ How many blades of grass on a football pitch?
  • 41. What does it mean to detect a single molecule? 18 mL (1 mole) of water contains Avogadro’s number of molecules: 6.02 × 10²³ How many molecules in a vial of water? Looking for a needle in a haystack? About 200,000,000 or 2 × 10⁸ How many blades of grass on a football pitch? So 1 mole of grass blades would cover 6.02 × 10²³ ÷ 2 × 10⁸ = 3 × 10¹⁵ football pitches
  • 42. What does it mean to detect a single molecule? 18 mL (1 mole) of water contains Avogadro’s number of molecules: 6.02 × 10²³ How many molecules in a vial of water? Looking for a needle in a haystack? About 200,000,000 or 2 × 10⁸ How many blades of grass on a football pitch? So 1 mole of grass blades would cover 6.02 × 10²³ ÷ 2 × 10⁸ = 3 × 10¹⁵ football pitches That’s a lot of haystacks...
  • 43. What does it mean to detect a single molecule? 1 mole of grass blades = 3 × 10¹⁵ football pitches = 15 × 10¹² km²
  • 44. What does it mean to detect a single molecule? 1 mole of grass blades = 3 × 10¹⁵ football pitches = 15 × 10¹² km² Surface area of Earth = 5 × 10⁸ km² (10¹¹ football pitches!)
  • 45. What does it mean to detect a single molecule? 1 mole of grass blades = 3 × 10¹⁵ football pitches = 15 × 10¹² km² Surface area of Jupiter = 6 × 10¹⁰ km² *Lab demonstration: 180 µL (15 × 10¹⁰ km² of grass blades)
  • 46. What does it mean to detect a single molecule? 1 mole of grass blades = 3 × 10¹⁵ football pitches = 15 × 10¹² km² Surface area of the Sun = 6 × 10¹² km²
  • 47. What does it mean to detect a single molecule? 1 mole of grass blades = 3 × 10¹⁵ football pitches = 15 × 10¹² km² Surface area of the Sun = 6 × 10¹² km² 1 mole of grass blades would cover the surface area of about 2.5 Suns!
  • 48. What does it mean to detect a single molecule? 1 mole of grass blades = 3 × 10¹⁵ football pitches = 15 × 10¹² km² Surface area of the Sun = 6 × 10¹² km² 1 mole of grass blades would cover the surface area of about 2.5 Suns! All images: nasa.gov
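The grass-blade comparison on slides 41 to 48 is a chain of three divisions and multiplications, sketched below. The pitch area of 0.005 km² is not stated in the talk; it is an assumption back-calculated from the slides' own figures (15 × 10¹² km² ÷ 3 × 10¹⁵ pitches). All names are illustrative.

```python
# Reproducing the "mole of grass blades" comparison from the slides.
AVOGADRO = 6.02e23                 # molecules per mole (slide 40)
BLADES_PER_PITCH = 2e8             # blades of grass on a football pitch (slide 38)
PITCH_AREA_KM2 = 0.005             # assumption: implied by slide 43's numbers
SUN_SURFACE_KM2 = 6e12             # slide 46
JUPITER_SURFACE_KM2 = 6e10         # slide 45


def grass_area_km2(moles: float) -> float:
    """Area covered if each molecule in `moles` were one blade of grass."""
    pitches = moles * AVOGADRO / BLADES_PER_PITCH
    return pitches * PITCH_AREA_KM2


pitches_per_mole = AVOGADRO / BLADES_PER_PITCH
area_per_mole_km2 = grass_area_km2(1.0)
suns_covered = area_per_mole_km2 / SUN_SURFACE_KM2

# Lab demonstration on slide 45: 180 uL of water out of 18 mL per mole.
demo_moles = 180e-6 / 18e-3
demo_area_km2 = grass_area_km2(demo_moles)
jupiters_covered = demo_area_km2 / JUPITER_SURFACE_KM2

print(f"Pitches per mole: {pitches_per_mole:.2e}")
print(f"Area per mole: {area_per_mole_km2:.2e} km^2 ({suns_covered:.1f} Suns)")
print(f"180 uL demo: {demo_area_km2:.2e} km^2 ({jupiters_covered:.1f} Jupiters)")
```

Under these assumptions the 180 µL demonstration sample corresponds to about 0.01 mole, which matches the 15 × 10¹⁰ km² quoted next to the Jupiter figure.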
  • 49. Sanger sequencing Sanger sequencing uses about 2 × 10¹⁰ molecules per 100 letters
  • 50. Solexa sequencing Invented in 1997 in this department Developed by a spin-out company in Saffron Walden Sold for $650,000,000 in 2006
  • 51. Solexa sequencing Solexa sequencing uses about 10³ molecules to read 100 letters About as many blades of grass as on the penalty spot Imaging technology: lab demonstration http://thesportboys.wordpress.com/category/international/page/2/ Invented in 1997 in this department Developed by a spin-out company in Saffron Walden Sold for $650,000,000 in 2006
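To put the two molecule counts side by side (about 2 × 10¹⁰ per 100 letters for Sanger, about 10³ for Solexa), here is a short sketch. It also estimates how much grass 10³ blades would occupy, using a pitch area of 5,000 m²; that area is an assumption implied by the earlier slides' figures rather than a number given in the talk, and the variable names are illustrative.

```python
# Comparing molecules needed per 100 letters, and the "penalty spot" analogy.
SANGER_MOLECULES_PER_100_LETTERS = 2e10   # slide 49
SOLEXA_MOLECULES_PER_100_LETTERS = 1e3    # slide 51
BLADES_PER_PITCH = 2e8                    # slide 38
PITCH_AREA_M2 = 5000.0                    # assumption: implied by slide 43

reduction_factor = SANGER_MOLECULES_PER_100_LETTERS / SOLEXA_MOLECULES_PER_100_LETTERS

blades_per_m2 = BLADES_PER_PITCH / PITCH_AREA_M2
area_for_solexa_blades_m2 = SOLEXA_MOLECULES_PER_100_LETTERS / blades_per_m2

print(f"Solexa needs about {reduction_factor:.0e} times fewer molecules")
print(f"10^3 blades of grass cover about {area_for_solexa_blades_m2:.3f} m^2")
```

A patch of a few hundredths of a square metre is in line with the slide's penalty-spot comparison.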
  • 53. Solexa sequencing © Royal Society of Chemistry publishing Densely packed microscopic “islands” of DNA generate information very quickly
  • 54. Solexa sequencing © Royal Society of Chemistry publishing • “Recycled” template molecules are ready for incorporation of the next fluorescent letter • Possible to read about 100 letters from each DNA strand, rather than 1
  • 55. Solexa sequencing © Royal Society of Chemistry publishing Genome Research Ltd.
  • 56. Solexa sequencing © Royal Society of Chemistry publishing Genome Research Ltd. Cost to sequence a human genome: around $10,000 Time to sequence a human genome: less than a week First African, Asian and giant panda genomes sequenced Sanger Institute owns 37 instruments
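The scale of the improvement can be read off the figures quoted in the talk: $3,000,000,000 and 14 years for the Human Genome Project, against around $10,000 and less than a week with Solexa sequencing. A rough sketch, treating "less than a week" as one week (so the speed-up is a lower bound) and using illustrative names:

```python
# Cost and time per human genome: Human Genome Project vs Solexa figures.
HGP_COST_USD = 3e9          # slide 32
HGP_DURATION_YEARS = 14     # slide 58 (summary)
SOLEXA_COST_USD = 1e4       # slide 56: "around $10,000"
SOLEXA_DURATION_WEEKS = 1   # assumption: upper bound for "less than a week"

WEEKS_PER_YEAR = 365.25 / 7

cost_reduction = HGP_COST_USD / SOLEXA_COST_USD
hgp_duration_weeks = HGP_DURATION_YEARS * WEEKS_PER_YEAR
speed_up = hgp_duration_weeks / SOLEXA_DURATION_WEEKS

print(f"Cost fell by a factor of about {cost_reduction:.0e}")
print(f"Time fell by a factor of at least {speed_up:.0f}")
```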
  • 58. Summary: The structure of DNA, discovered in 1953, has been crucial to sequencing the human genome. The first human genome was sequenced using Fred Sanger’s method, invented in 1977; the project ran for 14 years and cost $3 billion. New methods for sequencing use single-molecule detection to dramatically accelerate the decoding process. One approach using single-molecule techniques, invented by Shankar Balasubramanian and David Klenerman in our department in 1997, is now widely used for sequencing worldwide. Sequencing a human genome now costs around $10,000 and takes less than a week.