An Introduction to NGS
(Next Generation Sequencing)
        François Paillier - 22/02/2011
Plan
  [ Reminder about Sanger Sequencing ]



• NGS Definition
• Overview of NGS technologies
• NGS Applications & examples
• Conclusion

 NOT discussed here : Sequence accuracy, assembly and sampling ; NGS
 data Analysis & BioInformatics tools
A word about Sanger Sequencing
  (First generation sequencing machine  Video)
                                                                         3730xl
Principle (only the tube G + dideoxyG)




                                                                               From gel to
                                                                               capillary




         Still a gold standard, but capillary sequencing has reached its technical
         limits (costs and performance will remain unchanged)
Short Reminder about « Classical » Assembly
                 projects

     [Diagram, left : Sample → Libraries → n sequencing sub-projects → Assembly →
      Finishing: Draft (Q40) → Annotation → Annotated genome]
     [Diagram, right : Target genome → Cloning → Sub-targets (BACs, cosmids, ..) →
      Clone selection & sequencing → Assembly]
     Other strategy : WGS
Sequencing, what for ?
                          Assembly projects for example

           In bioinformatics, sequence assembly refers to aligning and merging fragments of
           a much longer DNA sequence in order to reconstruct the original sequence. This
           is needed as DNA sequencing technology cannot read whole genomes in one go,
           but rather small pieces between 20 and 1000 bases, depending on the technology
           used. Typically the short fragments, called reads, result from shotgun sequencing
           of genomic DNA or gene transcripts (ESTs).
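
The merging step described above can be sketched with a toy greedy overlap assembler. This is only an illustration under simplifying assumptions (error-free reads, a single strand, no repeats). The function names `overlap` and `greedy_assemble` and the example reads are hypothetical, and real assemblers use far more elaborate methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences sharing the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # No remaining overlap: what is left are the contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads drawn from the made-up target ATGGCGTACCTTAGA
reads = ["ATGGCGTA", "GCGTACCT", "ACCTTAGA"]
print(greedy_assemble(reads))
```

Reads that share no sufficiently long overlap stay separate, which is how gaps between contigs arise.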



[Figure : Target genome → Sequencing → reads → Assembly → assembled reads.
 The consensus shows 4X local coverage and gaps ; contigs separated by gaps form a scaffold.]
Vocabulary that should be kept in mind
                  in the sequencing field

•   Assembly : the result of clustering and merging sequences based on their local
    similarity
•   Contig : a set of overlapping DNA segments that together form one contiguous sequence
•   Coverage (in sequencing) : the mean number of times a nucleotide is
    sequenced in a genome (example: 10X coverage)

•   Scaffold : a series of contigs that are in the right order but not necessarily
    connected in one contiguous stretch
•   Mate pairs : two reads sequenced from the 5′ and 3′ ends of a single clone, so that
    their relative orientation and approximate distance are known




•   WGS = Whole genome shotgun sequencing strategy
•   ESS = Environmental Shotgun Sequencing
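
The coverage definition above amounts to simple arithmetic: mean coverage = (number of reads × read length) / genome size. Below is a minimal sketch. The function names and the 5 Mb example genome are assumptions chosen for illustration, and the uncovered-fraction estimate assumes reads land uniformly at random (Poisson model), which real libraries only approximate.

```python
import math


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean number of times each nucleotide is sequenced."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


def fraction_uncovered(coverage: float) -> float:
    """Expected fraction of bases never sequenced, under a Poisson model."""
    return math.exp(-coverage)


# Hypothetical example: a 5 Mb genome sequenced with 100 nt reads
genome_size = 5_000_000
print(reads_needed(10, 100, genome_size))          # reads for 10X coverage
print(mean_coverage(500_000, 100, genome_size))    # coverage from 500,000 reads
print(fraction_uncovered(10))                      # expected uncovered fraction at 10X
```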
NGS = Next Generation
         Sequencing



    After PCR,
THE new revolution
   in Biology ?
NGS Synonym is : High-throughput Sequencing
                     (HTS)




                                    Third Generation :
                                    NGS = HTS, Single
                                    Molecule Sequencing

                     Second Generation :
                     NGS = Massively
                     Parallel Sequencing
First Generation :
SANGER Sequencing
Overview of current NGS technologies
                 (Second generation sequencing machines)

Year 2005* : Roche, 454 GS-FLX (Titanium protocol a must)
     2006  : Illumina, GA1 then GA2
     2007  : Applied Bio., SOLiD v3

Each machine has different :
- Throughput
- Sequence accuracy
- Data formats (and programs)


*NGS “proof of principle” was done in 2000 by Lynx Therapeutics, which published and marketed "MPSS", a parallelized,
adapter/ligation-mediated, bead-based sequencing technology, launching "next-generation" sequencing.
Throughput per
Illumina Channel
HOW is it
Possible ? 
NGS Principle

Building sequencing devices at nanoscale

 Polony : Discrete clonal amplifications of a single DNA molecule,
  grown in a gel matrix. The clusters can then be individually
  sequenced, producing short reads. Polony-based sequencing is
  the basis of most second generation sequencers


A typical NGS Workflow is:
1) Library construction
2) Template CLONAL amplification
3) Massively PARALLEL sequencing
High Parallelism is Achieved in
     Polony Sequencing

Sanger                   Polony
Generation of Polony array: DNA
       Beads (454, SOLiD)




DNA Beads are generated using Emulsion PCR
Generation of Polony array: DNA
     Beads (454, SOLiD)




   DNA Beads are placed in wells
Sequencing: Pyrosequencing (454)

                                          DNA Polymerase




« pyrogram » / « Flowgram »
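
How a flowgram becomes a sequence can be sketched as follows. Nucleotides are flowed one at a time in a fixed cycle, and the light signal of each flow is roughly proportional to the number of bases incorporated. The flow order "TACG", the example signal values and the function name `decode_flowgram` are assumptions for illustration. Real base callers model signal noise rather than simply rounding.

```python
def decode_flowgram(signals: list[float], flow_order: str = "TACG") -> str:
    """Turn per-flow light intensities into a nucleotide sequence.

    Each signal is rounded to the nearest whole number of incorporated
    bases: ~0 means no incorporation, ~2 means a homopolymer of length 2.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        run_length = int(signal + 0.5)  # round half up (signals are non-negative)
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical intensities for two cycles of T, A, C, G flows
signals = [1.02, 0.05, 2.10, 0.96, 0.03, 3.08, 0.00, 1.10]
print(decode_flowgram(signals))
```

Because a long homopolymer yields a signal such as 4.5 that sits between two integers, rounding can add or drop a base, which is consistent with indels being the typical error of this chemistry.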
454 Process : Emulsion PCR &
       Pyrosequencing




              Titanium =
              Read lengths approx. 400 nt
              1 million reads / Run
               400 Mb / day


              VIDEOs
              About Pyrosequencing 1’53’’: <here>

              Summary about GS Flex 4’34’’: <click
              here>
454 GS FLX titanium



Pros :
- No more cloning step
- From purified DNA to sequencing
- Fits the laboratory bench top / small
- LONG sequences (400 nt)
- GS Junior system not so expensive
- Capabilities : multiplexing & paired-ends
- Well fitted to : prokaryotic genome sequencing, RNA-seq

Cons :
- Sequence accuracy not so high (especially in homopolymers) ; main error type is indel
- Cost : approx. 20K€ / Gb ; cost per base is cheaper than Sanger but still high
  compared with other next-gen machines
Illumina* : Bridge PCR




                GA2x Version =
                Read lengths
                approx. 100 nt
                240 million reads
                 1500 Mb / day
                 30000 Mb / Run
Generation of Polony array: Bridge-
          PCR (Solexa)




DNA fragments are attached to array and
        used as PCR templates

<Watch VIDEO : Related Links  Video : Genome
    Analyzer workflow  Panel technology>
Illumina Chemistry : 4-color DNA sequencing-by-synthesis using reversible
              terminators with removable fluorescent dyes




[Figure : a flow cell with 8 lanes]
Illumina seq. Accuracy
Illumina Throughput
Illumina



Pros :
- No more cloning step
- From purified DNA to sequencing
- Fits the laboratory bench top / small
- Good sequence accuracy
- Capabilities : multiplexing & paired-ends
- Cost : approx. 2K€ / Gb ; cost per base is cheaper than 454
- Most widely used NGS platform
- Requires least DNA
- Well fitted to : prokaryotic genome sequencing, RNA-seq, ChIP-Seq, Methyl-Seq

Cons :
- Machine is very expensive
- Main error type is mismatch
- Read lengths are still too short : not fitted to big genomes (repeats)
- Poor coverage of AT-rich regions
SOLiD system : 4-color DNA Sequencing by
                 Ligation




                         SOLiD V3 =
                         Read lengths
                         approx. 50 nt
                         400 million reads
                          1500 Mb / day
                          20000 Mb / Run
                          1500€ / Gb

                         <Watch Video> 4’46’’
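
The per-run figures quoted for the three platforms can be checked with one line of arithmetic: yield = read length × number of reads. Below is a small sketch using only the read lengths and read counts quoted on the slides. The dictionary layout and the function name `yield_mb` are illustrative assumptions.

```python
# (read length in nt, reads per run), as quoted on the platform slides
PLATFORMS = {
    "454 Titanium": (400, 1_000_000),
    "Illumina GA2x": (100, 240_000_000),
    "SOLiD v3": (50, 400_000_000),
}


def yield_mb(read_length_nt: int, n_reads: int) -> float:
    """Raw sequence yield per run, in megabases."""
    return read_length_nt * n_reads / 1_000_000


for name, (read_length, n_reads) in PLATFORMS.items():
    print(f"{name}: {yield_mb(read_length, n_reads):,.0f} Mb per run")
```

The products come out at 400 Mb, 24,000 Mb and 20,000 Mb. The SOLiD value matches the slide. The Illumina slide quotes 30,000 Mb per run, so its read length and read count are best taken as rounded figures.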
Sequencing by ligation rxn: Fluorescently Labeled
             Nucleotides (ABI SOLiD)




Complementary strand elongation: DNA Ligase
Sequencing by ligation ABI SOLiD
Sequencing: Fluorescently Labeled Nucleotides
                (ABI SOLiD)




            5 reading frames, each
             position is read twice
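
"Each position is read twice" can be illustrated with the two-base (colour-space) encoding commonly described for SOLiD: every colour reports on a pair of adjacent bases, so each base contributes to two colour calls. The sketch below assumes the usual scheme in which bases are numbered A=0, C=1, G=2, T=3 and the colour is the XOR of the two numbers. That numbering, the function names and the example sequence are assumptions not stated on the slide.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3


def encode_colorspace(seq: str) -> str:
    """First base followed by one colour (0-3) per adjacent base pair."""
    colors = (str(BASES.index(a) ^ BASES.index(b)) for a, b in zip(seq, seq[1:]))
    return seq[0] + "".join(colors)


def decode_colorspace(cs: str) -> str:
    """Rebuild the bases from the known first base and the colour calls."""
    base = cs[0]
    out = [base]
    for color in cs[1:]:
        base = BASES[BASES.index(base) ^ int(color)]
        out.append(base)
    return "".join(out)


read = "ATGGCA"
colors = encode_colorspace(read)
print(colors)                        # colour-space form of the read
print(decode_colorspace(colors))     # decodes back to the original bases
print(decode_colorspace("A31131"))   # same colours with one miscalled colour
```

A single miscalled colour changes every base decoded after it, whereas a true single-base variant changes two adjacent colours. That difference is what lets colour-space analysis tell sequencing errors from real variants, and it is also why the data are less intuitive to handle.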
Sequencing: Fluorescently Labeled
    Nucleotides (ABI SOLiD)
SOLiD



Pros :
- No more cloning step
- From purified DNA or RNA to sequencing
- Fits the laboratory bench top / small
- Good sequence accuracy
- Capabilities : multiplexing & paired-ends
- Cost : approx. 1.5K€ / Gb ; cost per base is cheaper than Illumina
- Well fitted to : resequencing, RNA-seq, ChIP-Seq, Methyl-Seq

Cons :
- This technology is NOT intuitive
- Machine is VERY expensive
- HUGE amount of data produced (1500 Gb !!)
- Long run times
- It has been demonstrated that certain reads don’t match the reference !
Focusing NGS effort on predefined targets :
« Target Enrichment » Technology (Capture Array)
Focusing NGS effort on predefined targets :
« Target Enrichment » Technology (Capture Beads)
Summary : NGS Workflows




   +/- Target Enrichment Strategy

                                    Source: BCG
Prokaryotic Genome Sequencing
 Project as a mix of NGS technologies




                                         Conclusion :
  - High quality drafts can be produced for small genomes without any Sanger data input.
- We found that 454 GSFLX and Solexa/Illumina show great complementarity in producing
                     large contigs and supercontigs with a low error rate.
NGS Applications
DEEPER insight into biological processes
BROADER sampling of populations (cells, viruses,
Ecosystems…)



   • In different fields…
      – Metagenomics
      – Genomics
      – Transcriptomics
      – proteomics
Genome
  * De Novo Sequencing
  * Targeted Resequencing (SNP, Indel, CNV)
  * Whole Genome Resequencing
  * Metagenome analyses
Transcriptome
  * Gene Expression Profiling
  * Small RNA Analysis
  * Whole Transcriptome Analysis
Epigenome
  * Chromatin Immunoprecipitation Sequencing (ChIP-Seq)
  * Methylation Analysis

…for different purposes…
- Towards Personalized Medicine
- Biodiversity assessment
- De Novo Sequencing of prokaryotic or eukaryotic genomes (or re-sequencing)
- RNA-Seq → Annotation of eukaryotic genomes
- SNP calling : identification of mutations
- ChIP-Seq : identification of DNA/protein interactions
What is the current impact of
                NGS on Biology ?



• Both transcriptomics and genomics can now be
  addressed using one technology with higher
  accuracy and robustness (e.g. instead of Sanger
  sequencing + µarrays) → Example: RNA-Seq
• SNP calling can rely on ultra-deep assemblies
• Whole genome overview of transcription factors
  binding sites
• Biodiversity assessment ( Metagenomics projects)
• And so much more…
About whole-exome sequencing :
 « For the First Time, DNA Sequencing Technology
                Saves A Child's Life »




« Proponents of genetic medicine say DNA sequencing is the future of
medicine and that soon every truly sick person will have his or her genome
sequenced. Critics cite privacy concerns and note that genetic mutations and
variations don’t necessarily lead to medical outcomes. Whatever the
position, it’s hard to argue that this isn’t good news: the first child – plagued
by undiagnosable illness – has been saved by DNA sequencing.
That may be a bit of a strong statement – six-year-old Nicholas Volker is
doing well, though complications could soon arise. But it’s highly likely that
the sequencing of young Nicholas’s genome saved his life. »
<Link> <Article>
                     Mayer & Al. Genetics IN Medicine • Volume xx, Number xx, 01 2011
What’s Next ?


IonTorrent, PacBio

Third Generation :
- Single Molecule Sequencing (no bias)
- Faster
- Cheaper (or not)
- 1000€ Human genome ?

Second Generation : NGS = Massively Parallel Sequencing (polony sequencing)
- Roche, 454 GS-FLX Titanium
- Illumina, GA2
- Applied BioSys, SOLiD v3
Conclusion : impact of NGS
               Global Shift to sequencing-based technologies

 Great improvements on-going : Higher throughput, longer reads
 Is it the end of µarrays ? A sub-part of NGS workflows restricted to target-
enrichment ?
 Is it the end of forward genetics ? Reverse genetics only ?
 Biologists’ education should integrate NGS knowledge
 Is it the end of « Big sequencing centers »? change in their mission ?


Next bottleneck : BioInformatics


- Storing data is a problem (SRA soon down ?) AND IT network speed is
FAR too low → Very difficult to share NGS data → Fridges instead of
disks !?
- Analyzing data is a problem → great improvements, but a lot of work
remains to be done
Thanks
for your attention !
Technology Summary

                     Read length   Sequencing        Throughput   Cost
                                   technology        (per run)    (per 1 Mbp)*
   Sanger            ~800 bp       Sanger            400 kbp      $500
   454               ~400 bp       Polony            500 Mbp      $60
   Solexa/Illumina   75 bp         Polony            20 Gbp       $2
   SOLiD             75 bp         Polony            60 Gbp       $2
   Helicos           30-35 bp      Single molecule   25 Gbp       $1


*Source: Shendure & Ji, Nat Biotech, 2008
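
To give a feel for what the per-Mbp costs in this table mean for a project, here is a rough sketch: cost ≈ genome size (Mbp) × coverage × cost per Mbp. The 5 Mbp genome and 10X coverage are hypothetical values chosen for illustration, and the estimate ignores library preparation, instrument and analysis costs.

```python
# Cost per sequenced Mbp in US dollars, taken from the table above
COST_PER_MBP = {
    "Sanger": 500,
    "454": 60,
    "Solexa/Illumina": 2,
    "SOLiD": 2,
    "Helicos": 1,
}


def project_cost(genome_mbp: float, coverage: float, cost_per_mbp: float) -> float:
    """Raw sequencing cost: total Mbp to sequence times the cost per Mbp."""
    return genome_mbp * coverage * cost_per_mbp


# Hypothetical project: 5 Mbp prokaryotic genome at 10X coverage
for platform, cost in COST_PER_MBP.items():
    print(f"{platform}: ${project_cost(5, 10, cost):,.0f}")
```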
NGS Technology Comparison
           ABI SOLiD               Illumina GA               454 Roche FLX
Cost       SOLiD 4: $495k          IIe: $470k                Titanium: $500k
           SOLiD PI: $240k         IIx: $250k
                                   HiSeq: $690k
Quantity   SOLiD 4: 100Gb          IIe: 20 - 38 Gb           450 Mb
of Data    SOLiD PI: 50Gb          IIx: 50 – 95 Gb
per run                            HiSeq: 200Gb +

Run Time   7 Days                  4 Days                    9 Hours

Pros       Low error rate due to   Most widely used          Short run time. Long
           dibase probes           NGS platform.             reads better for de
                                   Requires least DNA        novo sequencing
Cons       Long run times. Has     Least multiplexing        Expensive reagent
           been demonstrated       capability of the 3.      cost. Difficulty
           certain reads don’t     Poor coverage of AT       reading
           match reference         rich regions              homopolymer
                                                             regions
                                                     Source: The University of Western Ontario

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Ngs intro_v6_public

  • 1. An Introduction to NGS (Next Generation Sequencing) François Paillier - 22/02/2011
  • 2. Plan [ Reminder about Sanger Sequencing ] • NGS Definition • Overview of NGS technologies • NGS Applications & examples • Conclusion NOT discussed here : Sequence accuracy, assembly and sampling ; NGS data Analysis & BioInformatics tools
  • 3. A word about Sanger Sequencing (First generation sequencing machine  Video) 3730xl Principle (only the tube G + dideoxyG) From gel to capillary Still a gold standard but capillary sequencing has reached its technical limitation (costs and performance will remain unchanged)
  • 4. Short Reminder about « Classical » Assembly projects. Diagram labels, project level: Sample → Libraries; n Sequencing sub-projects; Assembly; Finishing: Draft (Q40); Annotation; Annotated Genome. Diagram labels, sub-project level: Target genome; Cloning; SubTargets (BACs, cosmids, ..); Clone selection & Sequencing; Assembly. Other strategy: WGS
  • 5. Sequencing, what for? Assembly projects, for example. In bioinformatics, sequence assembly refers to aligning and merging fragments of a much longer DNA sequence in order to reconstruct the original sequence. This is needed as DNA sequencing technology cannot read whole genomes in one go, but rather small pieces between 20 and 1000 bases, depending on the technology used. Typically the short fragments, called reads, result from shotgun sequencing genomic DNA, or gene transcripts (ESTs). Diagram labels: Target genome; Sequencing reads; Assembly; Assembled reads (with gaps); 4X local coverage; Consensus; Scaffold
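
The "aligning and merging" step described on this slide can be illustrated with a toy greedy overlap merger. This is only a sketch under simplifying assumptions (error-free reads, a single strand, exact suffix/prefix overlaps); the function names and the `min_len` parameter are hypothetical, and real assemblers use far more elaborate methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge: the remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up reads that tile a 13-base target.
print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```

Reads that share no overlap of at least `min_len` bases are returned as separate contigs, which corresponds to the gaps shown between assembled reads in the slide's diagram.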
  • 6. Vocabulary that should be kept in mind in the sequencing field
      • Assembly: the result of clustering sequences based on their local similarity
      • Contig: a set of overlapping DNA segments
      • Coverage (in sequencing): the mean number of times a nucleotide is sequenced in a genome (example: 10X coverage)
      • Scaffold: a series of contigs that are in the right order but not necessarily connected in one contiguous stretch
      • Mate pairs: sequences known to be at the 3′ and 5′ ends of a contig, from a single clone
      • WGS: Whole Genome Shotgun sequencing strategy
      • ESS: Environmental Shotgun Sequencing
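
The coverage definition above reduces to simple arithmetic: total sequenced bases divided by genome size. A minimal sketch, assuming reads of uniform length; the function names and the 4 Mb genome size are illustrative assumptions, not figures from the slides.

```python
import math


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean number of times each nucleotide is sequenced (e.g. 10.0 means 10X)."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical 4 Mb genome sequenced with 400 nt reads.
print(mean_coverage(1_000_000, 400, 4_000_000))  # mean coverage of the run
print(reads_needed(10, 400, 4_000_000))          # reads required for 10X
```

Mean coverage says nothing about how evenly the reads are spread, so local coverage (such as the 4X shown on the assembly slide) can differ from the genome-wide mean.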
  • 7. NGS = Next Generation Sequencing After PCR, THE new revolution in Biology ?
  • 8. NGS synonym: High-throughput Sequencing (HTS)
      - Third Generation: NGS = HTS, Single Molecule Sequencing
      - Second Generation: NGS = Massively Parallel Sequencing
      - First Generation: Sanger Sequencing
  • 9. Overview of current NGS technologies (second generation sequencing machines)
      - 2005*: Roche, 454 GS-FLX Titanium
      - 2006: Illumina, GA1 then GA2
      - 2007: Applied Biosystems, SOLiD v3
      Each machine differs in throughput, sequence accuracy, and data formats (and programs). Protocol a must.
      *The NGS “proof of principle” was done in 2000 by Lynx Therapeutics, which published and marketed "MPSS", a parallelized, adapter/ligation-mediated, bead-based sequencing technology, launching "next-generation" sequencing.
  • 12. NGS Principle: building sequencing devices at nanoscale. Polony: discrete clonal amplification of a single DNA molecule, grown in a gel matrix. The clusters can then be individually sequenced, producing short reads. Polony-based sequencing is the basis of most second generation sequencers. A typical NGS workflow is: 1) Library construction 2) Template CLONAL amplification 3) Massively PARALLEL sequencing
  • 13. High Parallelism is Achieved in Polony Sequencing Sanger Polony
  • 14. Generation of Polony array: DNA Beads (454, SOLiD) DNA Beads are generated using Emulsion PCR
  • 15. Generation of Polony array: DNA Beads (454, SOLiD) DNA Beads are placed in wells
  • 16. Sequencing: Pyrosequencing (454) DNA Polymerase « pyrogram » / « Flowgram »
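
The flowgram named on this slide can be read as follows: nucleotides are flowed over the wells one at a time in a fixed cycle, and the light signal recorded at each flow is roughly proportional to the number of bases incorporated. A minimal sketch of that decoding, assuming a TACG flow cycle and made-up signal values; the function name and the data are illustrative and are not taken from the 454 software.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Turn per-flow light intensities into a base sequence.

    Each signal is rounded to the nearest whole number of incorporated
    bases; a signal near 0 means the flowed nucleotide was not incorporated.
    """
    bases = []
    for i, signal in enumerate(signals):
        n = int(signal + 0.5)  # nearest homopolymer length (signals are >= 0)
        bases.append(flow_order[i % len(flow_order)] * n)
    return "".join(bases)


# Two flow cycles (T, A, C, G, T, A, C, G) with invented intensities.
print(decode_flowgram([1.02, 0.05, 0.97, 2.10, 0.03, 1.04, 0.0, 0.0]))

# A long homopolymer: a small change in signal flips the call from 4 to 5 bases.
print(decode_flowgram([4.45], flow_order="G"), decode_flowgram([4.55], flow_order="G"))
```

Because a run of identical bases is called from a single intensity value, a small measurement error on a long homopolymer adds or drops a base, which is one way to see why insertions and deletions dominate the error profile of this chemistry.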
  • 17. 454 Process: Emulsion PCR & Pyrosequencing. Titanium = read lengths approx. 400 nt; 1 million reads / run; 400 Mb / day. Videos: about pyrosequencing (1’53’’); summary about GS FLX (4’34’’)
  • 19. 454 GS FLX Titanium
      - No more cloning step: from purified DNA to sequencing
      - Fits the laboratory bench top / small
      - LONG sequences (400 nt)
      - GS Junior system not so expensive
      - Capabilities: multiplexing & paired-ends
      - Well fitted to: prokaryotic genome sequencing, RNA-seq
      - Sequence accuracy not so high (especially in the case of homopolymers)
      - Main error type is indel
      - Cost: approx. 20K€ / Gb
      - Cost per base is cheaper than Sanger but still high compared with other NextGen machines
  • 20. Illumina*: Bridge PCR. GA2x version = read lengths approx. 100 nt; 240 million reads; 1500 Mb / day; 30000 Mb / run
  • 21. Generation of Polony array: Bridge PCR (Solexa). DNA fragments are attached to the array and used as PCR templates
  • 22. Illumina Chemistry: 4-color DNA sequencing-by-synthesis using reversible terminators with removable fluorescent dyes. A flow cell: 8 lanes
  • 25. Illumina
      - No more cloning step: from purified DNA to sequencing
      - Fits the laboratory bench top / small
      - Good sequence accuracy
      - Capabilities: multiplexing & paired-ends
      - Cost: approx. 2K€ / Gb; cost per base is cheaper than 454
      - Well fitted to: prokaryotic genome sequencing, RNA-seq, ChIP-Seq, Methyl-Seq
      - Machine is very expensive
      - Main error type is mismatch
      - Read lengths are still too short: not fitted to big genomes (repeats)
      - Poor coverage of AT-rich regions
      - Most widely used NGS platform
      - Requires least DNA
  • 26. SOLiD system: 4-color DNA Sequencing by Ligation. SOLiD V3 = read lengths approx. 50 nt; 400 million reads; 1500 Mb / day; 20000 Mb / run; 1500€ / Gb
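
The per-run figures quoted on the three platform slides can be sanity-checked by multiplying the number of reads by the read length. A small sketch using only the numbers given on those slides; the function name and the `platforms` dictionary are illustrative.

```python
def run_yield_mb(n_reads: int, read_length_nt: int) -> float:
    """Raw yield of one run in megabases: number of reads times read length."""
    return n_reads * read_length_nt / 1_000_000


# Reads per run and read length (nt) as quoted on the platform slides.
platforms = {
    "454 Titanium": (1_000_000, 400),
    "Illumina GA2x": (240_000_000, 100),
    "SOLiD v3": (400_000_000, 50),
}

for name, (n_reads, read_length) in platforms.items():
    print(f"{name}: {run_yield_mb(n_reads, read_length):,.0f} Mb per run")
```

The products for 454 (400 Mb) and SOLiD (20,000 Mb) match the quoted figures. For Illumina the product is 24,000 Mb against the 30,000 Mb quoted per run; the slides do not say how that figure was derived.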
  • 27. Sequencing by ligation reaction: Fluorescently Labeled Nucleotides (ABI SOLiD). Complementary strand elongation: DNA Ligase
  • 29. Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD) 5 reading frames, each position is read twice
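
"Each position is read twice" refers to two-base (color-space) encoding: every color reports on a pair of adjacent bases, so each base contributes to two consecutive colors. A minimal sketch of the commonly described four-color scheme, assuming the bases are coded A=0, C=1, G=2, T=3 and a color is the XOR of the two codes; the function names are hypothetical, and the known first base stands in for the last base of the adapter.

```python
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASE = "ACGT"


def encode_colors(seq: str) -> list[int]:
    """One color (0-3) per pair of adjacent bases."""
    return [_CODE[a] ^ _CODE[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base: str, colors: list[int]) -> str:
    """Rebuild the base sequence from a known first base and the colors."""
    bases = [first_base]
    for color in colors:
        bases.append(_BASE[_CODE[bases[-1]] ^ color])
    return "".join(bases)


print(encode_colors("ATGGC"))              # colors for a made-up read
print(decode_colors("A", [3, 1, 0, 3]))    # decodes back to the same read
print(encode_colors("ATCGC"))              # a one-base change alters two adjacent colors
print(decode_colors("A", [3, 2, 0, 3]))    # a single wrong color alters every later base
```

A true single-base variant changes two adjacent colors, whereas an isolated color change is more likely a measurement error; this is the property behind the low error rate attributed to dibase probes. The same property makes the raw data unintuitive, since one wrong color corrupts every downstream base when decoding naively.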
  • 30. Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD)
  • 31. SOLiD
      - No more cloning step: from purified DNA or RNA to sequencing
      - Fits the laboratory bench top / small
      - Good sequence accuracy
      - Capabilities: multiplexing & paired-ends
      - Cost: approx. 1.5K€ / Gb; cost per base is cheaper than Illumina
      - Well fitted to: resequencing, RNA-seq, ChIP-Seq, Methyl-Seq
      - This technology is NOT intuitive
      - Machine is VERY expensive
      - HUGE amount of data produced (1500 Gb !!)
      - Long run times
      - It has been demonstrated that certain reads don’t match the reference!
  • 32. Focusing NGS effort on predefined targets : « Target Enrichment » Technology (Capture Array)
  • 33. Focusing NGS effort on predefined targets : « Target Enrichment » Technology (Capture Beads)
  • 34. Summary : NGS Workflows +/- Target Enrichment Strategy Source: BCG
  • 35. Prokaryotic Genome Sequencing Project as a mix of NGS technologies Conclusion : - High quality drafts can be produced for small genomes without any Sanger data input. - We found that 454 GSFLX and Solexa/Illumina show great complementarity in producing large contigs and supercontigs with a low error rate.
  • 36. NGS Applications DEEPER insight into biological processes BROADER sampling of populations (cells, viruses, Ecosystems…) • In different fields… – Metagenomics – Genomics – Transcriptomics – proteomics
  • 37.
      - Genome: De Novo Sequencing; Targeted Resequencing (SNP, Indel, CNV); Whole Genome Resequencing; Metagenome analyses
      - Transcriptome: Gene Expression Profiling; Small RNA Analysis; Whole Transcriptome Analysis
      - Epigenome: Chromatin Immunoprecipitation Sequencing (ChIP-Seq); Methylation Analysis
      …for different purposes…
      - Towards Personalized Medicine
      - Biodiversity assessment
      - De Novo Sequencing of prokaryotic or eukaryotic genomes (or re-sequencing)
      - RNA-Seq: annotation of eukaryotic genomes
      - SNP calling: identification of mutations
      - ChIP-Seq: identification of DNA/protein interactions
  • 39. What is the current impact of NGS on Biology?
      • Both transcriptomics and genomics can now be addressed using one technology with higher accuracy and robustness (instead of Sanger sequencing + µarrays, for example). Example: RNA-Seq
      • SNP calling can rely on ultra-deep assemblies
      • Whole-genome overview of transcription factor binding sites
      • Biodiversity assessment (metagenomics projects)
      • And so much more…
  • 40. About whole-exome sequencing : « For the First Time, DNA Sequencing Technology Saves A Child's Life » « Proponents of genetic medicine say DNA sequencing is the future of medicine and that soon every truly sick person will have his or her genome sequenced. Critics cite privacy concerns and note that genetic mutations and variations don’t necessarily lead to medical outcomes. Whatever the position, it’s hard to argue that this isn’t good news: the first child – plagued by undiagnosable illness – has been saved by DNA sequencing. That may be a bit of a strong statement – six-year-old Nicholas Volker is doing well, though complications could soon arise. But it’s highly likely that the sequencing of young Nicholas’s genome saved his life. » Article: Mayer et al., Genetics in Medicine, Volume xx, Number xx, 01 2011
  • 41. What’s Next?
      - Next: IonTorrent, PacBio
      - Third Generation: single molecule sequencing (no bias); faster; cheaper (or not); 1000€ human genome?
      - Second Generation: NGS = Massively Parallel Sequencing (polony sequencing): Roche 454 GS-FLX Titanium; Illumina GA2; Applied Biosystems SOLiD v3
  • 42. Conclusion: impact of NGS
      - Global shift to sequencing-based technologies
      - Great improvements ongoing: higher throughput, longer reads
      - Is it the end of µarrays? A sub-part of NGS workflows restricted to target enrichment?
      - Is it the end of forward genetics? Reverse genetics only?
      - Biologists' education should integrate NGS knowledge
      - Is it the end of « Big sequencing centers »? A change in their mission?
      Next bottleneck: BioInformatics
      - Storing data is a problem (SRA soon down?) AND IT network speed is FAR too low: very difficult to share NGS data. Fridges instead of disks!?
      - Analyzing data is a problem: great improvements, but a lot of work still remains to be done
  • 45. Technology Summary

      Technology        Read length   Sequencing technology   Throughput (per run)   Cost (1 Mbp)*
      Sanger            ~800 bp       Sanger                  400 kbp                $500
      454               ~400 bp       Polony                  500 Mbp                $60
      Solexa/Illumina   75 bp         Polony                  20 Gbp                 $2
      SOLiD             75 bp         Polony                  60 Gbp                 $2
      Helicos           30-35 bp      Single molecule         25 Gbp                 $1

      *Source: Shendure & Ji, Nat Biotech, 2008
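
The cost column of this summary can be scaled to any project size with one multiplication. A small sketch that uses only the per-Mbp costs listed above; the dictionary and function names are illustrative, and the 1 Gbp target is an arbitrary example rather than a figure from the slides.

```python
# Cost in US dollars per megabase pair, as listed in the technology summary.
COST_PER_MBP_USD = {
    "Sanger": 500,
    "454": 60,
    "Solexa/Illumina": 2,
    "SOLiD": 2,
    "Helicos": 1,
}


def cost_usd(technology: str, n_bases: int) -> float:
    """Estimated cost of sequencing `n_bases` raw bases with one technology."""
    return COST_PER_MBP_USD[technology] * n_bases / 1_000_000


# Arbitrary example: 1 Gbp of raw sequence on each platform.
for technology in COST_PER_MBP_USD:
    print(f"{technology}: ${cost_usd(technology, 1_000_000_000):,.0f}")
```

These are raw-sequence costs only; they ignore instrument purchase, library preparation, and the coverage depth a given project needs.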
  • 46. NGS Technology Comparison
      Cost
        - ABI SOLiD: SOLiD 4: $495k; SOLiD PI: $240k
        - Illumina GA: IIe: $470k; IIx: $250k; HiSeq: $690k
        - 454 Roche FLX: Titanium: $500k
      Quantity of data per run
        - ABI SOLiD: SOLiD 4: 100 Gb; SOLiD PI: 50 Gb
        - Illumina GA: IIe: 20-38 Gb; IIx: 50-95 Gb; HiSeq: 200 Gb+
        - 454 Roche FLX: 450 Mb
      Run time
        - ABI SOLiD: 7 days
        - Illumina GA: 4 days
        - 454 Roche FLX: 9 hours
      Pros
        - ABI SOLiD: low error rate due to dibase probes
        - Illumina GA: most widely used NGS platform; requires least DNA
        - 454 Roche FLX: short run time; long reads better for de novo sequencing
      Cons
        - ABI SOLiD: long run times; it has been demonstrated that certain reads don’t match the reference
        - Illumina GA: least multiplexing capability of the 3; poor coverage of AT-rich regions
        - 454 Roche FLX: expensive reagent cost; difficulty reading homopolymer regions
      Source: The University of Western Ontario