VAAST
Deciphering Genetic Disease with Next-Generation
Sequencing
Barry Moore, M.S.
Research Scientist
Department of Human Genetics
Department of Biomedical Informatics
Outline

• The VAAST Analysis Pipeline
• Ogden Syndrome: Application of VAAST to a Genetic Disease
  of Unknown Cause
• The Future of VAAST Development
Venter Genome: $10,000,000
Watson: $1,000,000
You?: $5,000
Next Generation Sequencing
Figure: variants in Disease versus Healthy genomes across geneA, geneB, geneX, geneY, geneZ
VAT: Variant Annotation Tool
VST: Variant Selection Tool
VAAST: Variant Annotation, Analysis and Search Tool
VAAST Pipeline
• Inputs: variants in GVF (3.5 million variants), reference genome (FASTA), reference genes (GFF3)
• VAT (Variant Annotation Tool) produces annotated variants (GVF), one file per genome
• VST (Variant Selection Tool) merges the annotated variant sets into a CDR file
VAAST Pipeline: GVF Variant Types and Variant Effects

Variant Type
• sequence_alteration
• deletion
• insertion
• duplication
• inversion
• substitution
• SNV
• MNP
• complex substitution
• translocation

Variant Effect
• sequence_variant
• gene_variant
• five_prime_UTR_variant
• three_prime_UTR_variant
• exon_variant
• splice_region_variant
• splice_donor_variant
• splice_acceptor_variant
• intron_variant
• coding_sequence_variant
• stop_retained
• stop_lost
• stop_gained
• synonymous_codon
• non_synonymous_codon
• amino_acid_substitution
• frameshift_variant
• inframe_variant
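The GVF input to the pipeline can be read with very little code, since GVF builds on the nine tab-delimited columns of GFF3. The sketch below is not part of the VAAST distribution: the `GvfRecord` class, the `parse_gvf_line` function, and the example line are all made up for illustration. It assumes a simple record whose attribute values contain no escaped characters.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GvfRecord:
    """Minimal view of one GVF feature line (hypothetical helper type)."""
    seqid: str
    source: str
    type: str          # Sequence Ontology variant type, e.g. "SNV"
    start: int
    end: int
    strand: str
    attributes: Dict[str, List[str]]


def parse_gvf_line(line: str) -> GvfRecord:
    """Parse one tab-delimited GVF feature line into a GvfRecord."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, source, so_type, start, end, _score, strand, _phase, attr_text = cols
    attributes: Dict[str, List[str]] = {}
    # Attributes are semicolon-separated key=value pairs; values may be
    # comma-separated lists (for example, two alleles in Variant_seq).
    for pair in attr_text.strip(";").split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = value.split(",")
    return GvfRecord(seqid, source, so_type, int(start), int(end), strand, attributes)


# Hypothetical record, invented for illustration only.
example = (
    "chrX\texample_caller\tSNV\t1000\t1000\t.\t+\t.\t"
    "ID=var1;Reference_seq=A;Variant_seq=G;Zygosity=hemizygous"
)
record = parse_gvf_line(example)
print(record.type, record.attributes["Variant_seq"])
```

A production parser would also need to handle header pragmas (lines starting with `##`) and percent-encoded attribute values, which this sketch ignores.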
• Inputs: Background Genomes (CDR) and Target Genomes (CDR)
• VAAST compares target against background
• Output: Prioritized Candidate Genes (VAAST Report)
Key Features of VAAST

• Probabilistic
• Feature Based
• Both Allele and AAS Frequencies
• Considers Inheritance Model
• Fast
• Standardized Ontology Based Format
• Modular and Flexible in Design
VAAST Uses Variant Frequencies in a
Probabilistic Fashion

Likelihood Ratio Test

    Maximum likelihood of the null model (no difference)
    divided by
    Maximum likelihood of the alternate model (there is a difference)
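A likelihood ratio test of this kind can be illustrated at a single variant site. The sketch below is a deliberate simplification and not VAAST's actual scoring, which is computed per feature and also uses amino acid substitution frequencies. It assumes binomial sampling of alleles, and the function names and example counts are hypothetical.

```python
import math


def _log_lik(k: int, n: int, p: float) -> float:
    """Binomial log-likelihood of k variant alleles among n, up to a constant."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll


def lrt_statistic(k_target: int, n_target: int,
                  k_background: int, n_background: int) -> float:
    """Return -2 ln(L_null / L_alt) for one site.

    Null model: target and background share one allele frequency.
    Alternate model: each group has its own allele frequency.
    """
    p_null = (k_target + k_background) / (n_target + n_background)
    ln_null = (_log_lik(k_target, n_target, p_null)
               + _log_lik(k_background, n_background, p_null))
    ln_alt = (_log_lik(k_target, n_target, k_target / n_target)
              + _log_lik(k_background, n_background, k_background / n_background))
    return -2.0 * (ln_null - ln_alt)


# Hypothetical counts: 4 of 4 target chromosomes carry the variant,
# 0 of 400 background chromosomes do.
print(lrt_statistic(4, 4, 0, 400))
```

The statistic is zero when both groups have the same observed frequency and grows as the target diverges from the background.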
•   VAAST gives us the likelihood of the composite genotype
    at GENE X in the target given the background.

•   Do allele frequencies differ between Background and
    Target genomes within a given gene or feature?

•   Composite likelihood calculation assumes independence
    across sites. To control for LD, statistical significance is
    estimated by permutation test.

•   Correcting for multiple tests across features (~20,000) is
    roughly two orders of magnitude less stringent than
    correcting across individual variants (~3,500,000).
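The last two points can be sketched in a few lines. The code below is a generic label-shuffling permutation test plus a Bonferroni threshold calculation, not VAAST's own implementation. The score function, the per-genome allele counts, and the 0.05 significance level are assumptions chosen for illustration.

```python
import random
from statistics import mean
from typing import Callable, Sequence

Score = Callable[[Sequence[float], Sequence[float]], float]


def mean_difference(target: Sequence[float], background: Sequence[float]) -> float:
    """Toy score: difference in mean variant-allele count per genome."""
    return mean(target) - mean(background)


def permutation_p_value(target: Sequence[float], background: Sequence[float],
                        score: Score, n_permutations: int = 1000,
                        seed: int = 0) -> float:
    """Estimate significance by shuffling target/background labels."""
    rng = random.Random(seed)
    observed = score(target, background)
    pooled = list(target) + list(background)
    n_target = len(target)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if score(pooled[:n_target], pooled[n_target:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical data: two target genomes homozygous for a variant that is
# absent from 50 background genomes.
p_value = permutation_p_value([2, 2], [0] * 50, mean_difference)
gene_level = bonferroni_threshold(0.05, 20_000)
variant_level = bonferroni_threshold(0.05, 3_500_000)
print(p_value, gene_level, variant_level, gene_level / variant_level)
```

With these numbers the per-feature threshold is 175 times larger than the per-variant threshold, which is the "two orders of magnitude" referred to above.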
Noise Decreases Dramatically with
Increasing Number of Genomes
• 1 genome target, 1 genome background
• 1 genome target, 10 genome background
• 1 genome target, 250 genome background
• 1 genome target, 250 genome background, trio data
Alleles Responsible for Miller
    Syndrome in Utah Kindred
CHR 16: DHODH
    Mom: G:R                 Dad: G:A
    Son: G:R, G:A            Daughter: G:R, G:A

CHR 5: DNAH5
    Mom: R:Q                 Dad: R:*
    Son: R:Q, R:*            Daughter: R:Q, R:*

•Ng et al., Nature Genetics 42, 30–35 (2010) doi:10.1038/ng.499
•Roach et al., Science 328, 636 (2010)
Schematic of VAAST Analysis of Utah
Miller Kindred Using a Single Quartet
Genes labeled in the figure: DHODH, DNAH5
Average Rank for 100 Dominant and
Recessive Diseases
Average rank genome-wide by size of case cohort (443 genomes in background)

                       DOMINANT    RECESSIVE
2 allele copies           156         132
4 allele copies            21           8
6 allele copies             9           3
Impact of Missing Data
Average rank genome-wide (443 genomes in background)

                          DOMINANT    RECESSIVE
2 of 6 allele copies         639         373
4 of 6 allele copies          61          21
6 of 6 allele copies           9           3
Outline

• The VAAST Analysis Pipeline
• Ogden Syndrome: Application of VAAST to a Genetic
  Disease of Unknown Cause
• The Future of VAAST Development
A Rare X-linked Mendelian Disorder

•   A Utah family coming to the
    University Hospital for 20+
    years
•   About half of the male offspring
    die around 1 year of age
•   Aged appearance
•   Craniofacial anomalies
•   Hypotonia
•   Global developmental delays
•   Cardiac arrhythmias
Four Affected Boys over Two Generations
Figure: pedigree spanning generations I, II and III
Exome Sequencing
 •   Agilent SureSelect In-Solution X Chromosome Capture
 •   Covaris S series Sonication (150-200 bp)
 •   76 bp single-end reads on one lane each of the
     Illumina GAIIx



Variant Calling
 •    Sequence alignment with BWA
 •    Remove duplicate reads with Picard
 •    Realign indel regions with GATK
 •    Variant calling with SAMtools and GATK
Identifying Candidate Genes

 VAAST Identifies NAA10 as Candidate Gene
    •   About 20 min. run time
    •   Proband only: 3 candidate genes (NAA10 ranked second)
    •   With pedigree: 1 candidate gene (NAA10)
Additional Analyses

 •   Microarray-based CNV analysis
     •   No likely causal variants found
 •   Sanger sequencing confirmation
     •   Variant segregates perfectly with disease in 13
         family members
 •   Haplotype sharing (STR genotyping)
     •   ~11 Mb shared between two affected boys
 •   A second family discovered with the same mutation
 •   IBD relatedness analysis indicates independent mutational
     events
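The segregation check described above can be expressed as a simple rule over pedigree members. The sketch below assumes an X-linked recessive model in which every affected male carries the variant, no unaffected male carries it, and females may be unaffected carriers. The `Member` type and the family data are hypothetical and do not represent the actual Ogden syndrome pedigree.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Member:
    name: str
    sex: str               # "M" or "F"
    affected: bool
    carries_variant: bool


def segregates_x_linked_recessive(members: Iterable[Member]) -> bool:
    """Return True if the variant is consistent with X-linked recessive disease."""
    for m in members:
        # Males are hemizygous: carrying the variant must coincide with disease.
        if m.sex == "M" and m.affected != m.carries_variant:
            return False
        # An affected female who lacks the variant contradicts the model.
        if m.sex == "F" and m.affected and not m.carries_variant:
            return False
    return True


# Hypothetical family, invented for illustration only.
family = [
    Member("mother", "F", affected=False, carries_variant=True),
    Member("father", "M", affected=False, carries_variant=False),
    Member("son_1", "M", affected=True, carries_variant=True),
    Member("son_2", "M", affected=False, carries_variant=False),
]
print(segregates_x_linked_recessive(family))
```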
N(alpha)-acetyltransferase
 • N-alpha-acetylation is one of the most common protein
      modifications that occurs during protein synthesis.
  •   NatA catalytic subunit: NAA10 (hARD1)
  •   Eight exons, Crick strand, highly conserved
  •   A:G transition causes p.Ser37Pro
Functional Analyses
 •   Quantitative in vitro N-terminal acetylation assay (RP-HPLC).
 •   Four peptide substrates previously shown to be
     acetylated by NatA (NAA10)
 •   Assays indicate loss-of-function allele.
VAAST in Summary

•   Probabilistic Disease Gene Finder
•   Feature Based not Variant Based
•   Both Allele and AAS Frequencies
•   Considers Inheritance Model
•   As few as two target genomes can be sufficient to
    identify causative gene.
•   Background Genomes are “Reusable”
•   Not Limited to Human Analyses
VAAST: Future Directions

 •   Indel support
 •   Splice-site
 •   No-call support
 •   Pedigree support
 •   Phylogenetic conservation
Acknowledgements

VAAST Development
•Chad Huff
•Hao Hu
•Lynn Jorde
•Barry Moore
•Martin Reese
•Marc Singleton
•Jinchuan Xing
•Mark Yandell

Yandell Lab
•Michael Campbell
•Daniel Ence
•Guozhen Fan
•Steven Flygare
•Hao Hu
•Zev Kronenberg
•Barry Moore
•Marc Singleton
•Robert Ross
•Mark Yandell

Ogden Syndrome
•John Carey
•Steven Chin
•Heidi Deborah Fain
•Gholson Lyon
•John Opitz
•Theodore J. Pysher
•Alan Rope
•Reid Robison
•Sarah T. South
•Chad Huff
•Evan Johnson
•Barry Moore
•Christa Schank
•Kai Wang
•Jinchuan Xing
•Thomas Arnesen
•Rune Evjenth
•Johan R. Lillehaug
•Leslie G. Biesecker
•Jennifer J. Johnston
•Cathy A. Stevens
•Brian Dalley
•Tao Jiang
•Jeffrey Swensen
•Hakon Hakonarson
•Lynn B. Jorde
•Mark Yandell

VAAST: Deciphering Genetic Disease with Next-Generation Sequencing

  • 1. VAAST Deciphering Genetic Disease with Next-Generation Sequencing Barry Moore, M.S. Research Scientist Department of Human Genetics Department of Biomedical Informatics
  • 2. Outline  The VAAST Analysis Pipeline  Ogden Syndrome: Application of VAAST to a Genetic Disease of Unknown Cause  The Future of VAAST Development
  • 3. $10,000,000 Venter Genome $1,000,000 Watson $5,000 You?
  • 4. Next Generation Sequencing Disease Healthy geneA geneB geneX geneY geneZ
  • 5. Variant Variant Annotation Annotation Tool Variant Variant Selection Selection Tool Variant Variant Annotation Analysis Analysis Search Tool
  • 6. GVF VAAST Pipeline 3.5 Million Variants Reference VAT Reference Genome (Variant Annotation Tool) Genes Fasta GFF3 Annotated Annotated Annotated GVF Variants Variants Variants VST (Variant Selection Tool) CDR Merged Variant Sets
  • 7. GVF VAAST Pipeline Variant Effect 3.5 Million •sequence_variant Variants •gene_variant Reference VAT Reference •five_prime_UTR_variant Genome Type Variant Genes •three_prime_UTR_variant (Variant Annotation Tool) •sequence_alteration Fasta •exon_variant GFF3 •deletion •splice_region_variant •insertion •splice_donor_variant •duplication Annotated Annotated •splice_acceptor_variant Annotated •inversion GVF •intron_variant Variants •substitution Variants Variants •coding_sequence_variant •SNV •stop_retained •MNP •stop_lost •complex substitution •stop_gained •translocation VST •synonymous_codon •non_synonymous_codon (Variant Selection Tool) •amino_acid_substitution •frameshift_variant •inframe_variant CDR Merged Variant Sets
  • 8. GVF VAAST Pipeline Variant Effect 3.5 Million •sequence_variant Variants •gene_variant Reference VAT Reference •five_prime_UTR_variant Genome Type Variant Genes •three_prime_UTR_variant (Variant Annotation Tool) •sequence_alteration Fasta •exon_variant GFF3 •deletion •splice_region_variant •insertion •splice_donor_variant •duplication Annotated Annotated •splice_acceptor_variant Annotated •inversion GVF •intron_variant Variants •substitution Variants Variants •coding_sequence_variant •SNV •stop_retained •MNP •stop_lost •complex substitution •stop_gained •translocation VST •synonymous_codon •non_synonymous_codon (Variant Selection Tool) •amino_acid_substitution •frameshift_variant •inframe_variant CDR Merged Variant Sets
  • 9. CDR (background genomes) + CDR (target genomes) → VAAST → prioritized candidate genes (VAAST report)
  • 10. Key Features of VAAST • Probabilistic • Feature Based • Both Allele and AAS Frequencies • Considers Inheritance Model • Fast • Standardized Ontology Based Format • Modular and Flexible in Design
  • 11. VAAST Uses Variant Frequencies in a Probabilistic Fashion • Likelihood ratio test: the maximum likelihood of the null model (no difference) divided by the maximum likelihood of the alternate model (there is a difference)
  • 12. VAAST Uses Variant Frequencies in a Probabilistic Fashion
  • 13. VAAST Uses Variant Frequencies in a Probabilistic Fashion • VAAST gives us the likelihood of the composite genotype at GENE X in the target given the background. • Do allele frequencies differ between Background and Target genomes within a given gene or feature? • Composite likelihood calculation assumes independence across sites. To control for LD, statistical significance is estimated by permutation test. • Multiple test correction for number of features (~20,000) is two orders of magnitude better than for the number of variants (~3,500,000).
  • 14. Noise Decreases Dramatically with Increasing Number of Genomes 1 genome target 1 genome background
  • 15. 1 genome target 10 genome background
  • 16. 1 genome target 250 genome background
  • 17. 1 genome target 250 genome background Trio Data
  • 18.
  • 19. Alleles Responsible for Miller Syndrome in Utah Kindred • CHR 16: DHODH • CHR 5: DNAH5 • [Pedigree figure for each gene showing Mom, Dad, Son, and Daughter; allele labels G:R, G:A, R:Q, R:*] • Ng et al., Nature Genetics 42, 30–35 (2010) doi:10.1038/ng.499 • Roach et al., Science 328, 636 (2010)
  • 20. Schematic of VAAST Analysis of Utah Miller Kindred Using a Single Quartet DHODH DNAH5
  • 21. Average Rank for 100 Dominant and Recessive Diseases • [Bar chart: average genome-wide rank for dominant and recessive diseases by size of case cohort (2, 4, or 6 allele copies); bar labels 156, 132, 21, 9, 8, 3] • 443 genomes in background
  • 22. Impact of Missing Data • [Bar chart: average genome-wide rank for dominant and recessive diseases with 2 of 6, 4 of 6, or 6 of 6 allele copies; bar labels 639, 373, 61, 21, 9, 3] • 443 genomes in background
  • 23. Outline  The VAAST Analysis Pipeline  Ogden Syndrome: Application of VAAST to a Genetic Disease of Unknown Cause  The Future of VAAST Development
  • 24. A Rare X-linked Mendelian Disorder • A Utah family coming to the University Hospital for 20+ years • About half of the male offspring die around 1 year of age • Aged appearance • Craniofacial anomalies • Hypotonia • Global developmental delays • Cardiac arrhythmias
  • 25. Four Affected Boys over Two Generations I II III
  • 26. Exome Sequencing • Agilent SureSelect In-Solution X Chromosome Capture • Covaris S series Sonication (150-200 bp) • 76 bp single-end reads on one lane each of the Illumina GAIIx. Variant Calling • Sequence alignment with BWA • Remove duplicate reads with Picard • Realign indel regions with GATK • Variant calling with SAMtools, GATK
  • 27. Identifying Candidate Genes: VAAST Identifies NAA10 as Candidate Gene • About 20 min. run time • 3 candidate genes (NAA10 ranked 2nd) with proband only • 1 candidate gene (NAA10) with pedigree
  • 28. Additional Analyses • Microarray-based CNV analysis: no likely causal variants found • Sanger sequencing confirmation: variant segregates perfectly with disease in 13 family members • Haplotype sharing (STR genotyping): ~11 Mb shared between two affected boys • A second family discovered with the same mutation • IBD relatedness analysis: independent mutational events
  • 29. N(alpha)-acetyltransferase • N-alpha-acetylation is one of the most common protein modifications that occurs during protein synthesis. • NatA (catalytic subunit NAA10 (hARD1)) • Eight exons, Crick strand, highly conserved • A:G transition causes p.Ser37Pro
  • 30. Functional Analyses • Quantitative in vitro N-terminal acetylation assay (RP-HPLC). • Four peptide substrates previously shown to be acetylated by NatA (NAA10) • Assays indicate loss-of-function allele.
  • 32.
  • 33. VAAST in Summary • Probabilistic Disease Gene Finder • Feature Based not Variant Based • Both Allele and AAS Frequencies • Considers Inheritance Model • As few as two target genomes can be sufficient to identify causative gene. • Background Genomes are “Reusable” • Not Limited to Human Analyses
  • 34. VAAST: Future Directions • Indel support • Splice-site • No-call support • Pedigree support • Phylogenetic conservation
  • 35.
  • 36. Acknowledgements • VAAST Development: Chad Huff, Hao Hu, Lynn Jorde, Barry Moore, Martin Reese, Marc Singleton, Jinchuan Xing, Mark Yandell • Yandell Lab: Michael Campbell, Daniel Ence, Guozhen Fan, Steven Flygare, Hao Hu, Zev Kronenberg, Barry Moore, Marc Singleton, Robert Ross, Mark Yandell • Ogden Syndrome: Thomas Arnesen, John Carey, Rune Evjenth, Steven Chin, Johan R. Lillehaug, Heidi Deborah Fain, Gholson Lyon, Leslie G. Biesecker, John Opitz, Jennifer J. Johnston, Theodore J. Pysher, Alan Rope, Cathy A. Stevens, Reid Robison, Sarah T. South, Brian Dalley, Tao Jiang, Jeffrey Swensen, Chad Huff, Evan Johnson, Hakon Hakonarson, Barry Moore, Lynn B. Jorde, Christa Schank, Mark Yandell, Kai Wang, Jinchuan Xing
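
Slides 11 to 13 describe a likelihood ratio test on allele frequencies within a feature, a composite score that assumes independence across sites, and a permutation test to control for LD. The following is a minimal sketch of that idea, not VAAST's actual implementation: the function names, the 0/1 encoding of allele copies, and the default of 1000 permutations are all assumptions made for illustration, and the AAS weighting mentioned in the slides is left out.

```python
import math
import random


def log_lik(minor, total):
    """Maximized Bernoulli log-likelihood for `minor` variant copies out of `total`."""
    if total == 0 or minor == 0 or minor == total:
        return 0.0  # estimated frequency is 0 or 1, so the likelihood is 1
    q = minor / total
    return minor * math.log(q) + (total - minor) * math.log(1 - q)


def feature_score(target, background):
    """Composite -2 ln(L_null / L_alt), summed over the sites of one feature.

    `target` and `background` are lists of allele copies; each copy is a
    tuple holding one 0/1 entry per site (hypothetical encoding).
    """
    score = 0.0
    for site in range(len(target[0])):
        x_t = sum(copy[site] for copy in target)
        x_b = sum(copy[site] for copy in background)
        ll_null = log_lik(x_t + x_b, len(target) + len(background))
        ll_alt = log_lik(x_t, len(target)) + log_lik(x_b, len(background))
        score += -2.0 * (ll_null - ll_alt)
    return score


def permutation_p(target, background, n_perm=1000, seed=0):
    """Shuffle whole allele copies between groups so that LD within a copy is kept."""
    rng = random.Random(seed)
    observed = feature_score(target, background)
    pooled = list(target) + list(background)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = feature_score(pooled[:len(target)], pooled[len(target):])
        if permuted >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests
```

With the counts quoted on slide 13, `bonferroni_threshold(0.05, 20_000)` is about 175 times less stringent than `bonferroni_threshold(0.05, 3_500_000)`, which is the "two orders of magnitude" the slide refers to.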

Editor's notes

  1. I’m going to begin the discussion of VAAST with a simple description of how the pipeline runs
  2. Numerator = null model (no difference). Denominator = alternate model (difference).
  3. The maximum likelihood of the null model over the maximum likelihood of the alternate model, weighted by the frequency of the AAS in the healthy dataset over the frequency of that AAS in a disease dataset. n = frequency of that AAS in the background; p = estimated probability of...; B= T= Y= X= a = frequency of this AAS in OMIM.
  4. The maximum likelihood of the null model over the maximum likelihood of the alternate model, weighted by the frequency of the AAS in the healthy dataset over the frequency of that AAS in a disease dataset. n = frequency of that AAS in the background; p = estimated probability of...; B= T= Y= X= a = frequency of this AAS in OMIM.
  5. Miller Syndrome
  6. A-G transition: serine to proline
  7. Thank the family
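
Notes 2 to 4 describe the score as the maximum likelihood of the null model over that of the alternate model, weighted by AAS frequencies. One way to write this out for a single site is sketched below. Only n (frequency of the AAS in the background) and a (frequency of the AAS in OMIM) come from the notes. The counts x_T, N_T, x_B, N_B (variant copies and total allele copies in target and background) and the frequency estimates written with q are illustrative symbols introduced here, and the Bernoulli form of the likelihoods is an assumption.

```latex
% Sketch only: x_T, N_T, x_B, N_B and \hat{q} are illustrative, not the slides' notation.
\[
\lambda = \ln\frac{L_{\text{null}}}{L_{\text{alt}}}
\]
\[
L_{\text{null}} = \hat{q}^{\,x_T + x_B}\,(1-\hat{q})^{(N_T + N_B) - (x_T + x_B)},
\qquad \hat{q} = \frac{x_T + x_B}{N_T + N_B}
\]
\[
L_{\text{alt}} = \hat{q}_T^{\,x_T}(1-\hat{q}_T)^{N_T - x_T}\;
                 \hat{q}_B^{\,x_B}(1-\hat{q}_B)^{N_B - x_B},
\qquad \hat{q}_T = \frac{x_T}{N_T},\quad \hat{q}_B = \frac{x_B}{N_B}
\]
% AAS-weighted form: n = AAS frequency in the background, a = AAS frequency in OMIM.
\[
\lambda_{\text{AAS}} = \ln\frac{n\,L_{\text{null}}}{a\,L_{\text{alt}}}
\]
```

Under the independence assumption stated on slide 13, the per-site values would be summed across the sites of a feature to give its composite score.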