The 100,000
Genomes
Project
David Montaner
Bioinformatics Department
david.montaner@genomicsengland.co.uk
Valencia University, October 6th
2016
Talk Outline
1. Introduction & Background
2. Pipelines
3. Systems and Databases
4. Cancer
5. Rare Diseases
The 100,000 Genomes Project
Genomics England & Partners
Genomics England
• Owned by the Department of Health, UK
• Set up to deliver the 100,000 Genomes Project: 
 100,000 whole genome sequences of National Health Service (NHS)
patients with:
• Rare Diseases (and family members)
• Cancer
Aims:
 Create an ethical and transparent programme based on consent
 Establish the infrastructure, human capacity & capability to set up a
genomic medicine service for the NHS and bring benefit to patients.
 Enable new scientific discovery and medical insights, and add to
the already extensive databases on human variation
 Working with the National Health Service (NHS), academics and industry
to make the UK a world leader in Genomic Medicine
Who are we & what are we doing?
Generate health & wealth
• Sequence 100,000 genomes
• Cancer and rare genetic disease
• Capture data delivered
electronically, store it securely
and analyse it
• within an English data centre
(reading library)
• Combine genomes with
extracted clinical information for
analysis, interpretation, and
aggregation
• Create capacity, capability and
legacy in personalised medicine
for the UK
Goals of Genomics England
1. To bring
benefit to
NHS patients
2. To enable
new scientific
discovery
and medical
insights
3. To create
an ethical
and
transparent
programme
based on
consent
4. To kickstart
the
development
of a UK
genomics
industry
Inception of the 100,000
genomes project (2012, 2014)
“If we get this right, we could
transform how we diagnose and
treat our most complex diseases
not only here but across the world”
(December 2012)
“I am determined to do all I can to
support the health and scientific
sector to unlock the power of DNA,
turning an important scientific
breakthrough into something that will
help deliver better tests, better drugs
and above all better care for
patients.”
(August 2014)
Schedule

2012 -2014: consortium creation

2014-2015: pilot studies

2015-2017: main project
Where are we?


London:

Management

All data storage


Cambridge:

Software for genomic data storage


Oxford:

Software for clinical data storage and collection
Recruitment and clinical interface
13 “GMCs”, Scotland and Northern Ireland
• Genomic Medicine Centres
• Networks of NHS hospitals
including genomics labs
• 13 “Lead organisation” plus
71 “Local Delivery Partners”
• Contracted by NHS England
• Cover recruitment, data and
return of results
• Scotland
• Doing own sequencing
• Northern Ireland
• Similar to a GMC
• Contracted by NI payer
+
The Journey of a Genome
Consent & sample collection → DNA extraction → Biorepository → Sequencing → Variant calling → Interpretation → Feedback to clinician → Validation → Treatment
The Journey of a Genome:
Partners
Genomic Medicine Centres (GMCs): 13 NHS organisations
Sequencing: HiSeq X Ten
Genomics England Clinical Interpretation Partnerships (GeCIPs): collaborations of clinicians & academics, > 2,000 researchers
Clinical interpretation companies:
• Omicia
• Congenica
• Nextcode
GENE Consortium
• Working together on a year-long
Industry Trial involving a
selection of whole genome
sequences across cancer and rare
diseases
• Aims to identify most effective and
secure way to accelerate
development of new
diagnostics and treatments for
patients 
• Working in a pre-competitive
environment
AbbVie
Alexion Pharmaceuticals
AstraZeneca
Berg Health
Biogen
Dimension Therapeutics
GSK
Helomics
NGM Biopharmaceuticals
Roche
Takeda
Genomics Expert Network for
Enterprises
BAM file
From Illumina
Variant Calling
pipelines: VCF file
QC1 QC2
Variant
Annotation
Tiering of variants
Dispatch
Clinical Interpretation
QC Portal Reporting portal
Medical
review
Validation
Simplified Workflow
Genomic Medicine Centre (GMC)
Bioinformatics Team Role
Genomics Education
Health Education England
• MSc in Genomic Medicine
• 10 Universities across the UK
• Online training courses and resources
• The fundamentals of genomics
• Sample handling and DNA
extraction
• Bioinformatics
• How to support patients through
the consent process
Genomics England Communications
Team
Update on numbers:
at about 10%
• >10,000 genomes
received
• >1PB of primary data
• >1.3M files received or
generated and indexed
• 200M germline variants
databased
• 48M somatic variants
databased
• 70,000 HPO terms asserted
• >450,000 hospital episodes
100,000 Genomes
• Rare Disease
• Each genome: 100 GB
• Trio is preferred, so 300 GB per participant
• x 50,000 participants = 15,000,000 GB total
• Cancer
• Germline: 100 GB
• Tumour: 200 GB
• 300 GB per patient
• x 25,000 participants = 7,500,000 GB total
• 10,000,000 GB = 10 Petabytes
• Expecting around 30 Petabytes
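The storage arithmetic on this slide can be rechecked with a few lines of Python. This is only a back-of-envelope sketch: it assumes the slide's "Gb" means gigabytes, that 1 PB = 1,000,000 GB, and that the per-genome sizes and participant counts are exactly those quoted above. The function name is hypothetical.

```python
# Back-of-envelope storage estimate from the per-genome sizes on the slide.
# Assumption: decimal units, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def programme_total_gb(gb_per_participant: int, participants: int) -> int:
    """Raw data volume for one programme, in GB."""
    return gb_per_participant * participants


# Rare disease: a trio is three genomes of 100 GB each.
rare_gb = programme_total_gb(3 * 100, 50_000)
# Cancer: one germline genome (100 GB) plus one tumour genome (200 GB).
cancer_gb = programme_total_gb(100 + 200, 25_000)
total_pb = (rare_gb + cancer_gb) / GB_PER_PB

print(f"Rare disease: {rare_gb:,} GB")
print(f"Cancer:       {cancer_gb:,} GB")
print(f"Total:        {total_pb} PB")
```

Under these assumptions the raw per-sample figures add up to less than the roughly 30 PB the slide expects. The slide does not say what accounts for the difference.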
Huge Amount of Data
10 Billion Photos = 1.5 Petabytes
Data Processed in 1 day = 20 Petabytes
Pipelines
bertha_default 1.1.0
Single Sample QC & Processing
Analysis
Intake QC
Multi Sample QC
Cross Sample Contamination
Single-Sample QC Check Point
Identity by Descent
Mendelian Inconsistency Rate
Sex Check
Somatic VCF re-headering
Tumour Cross Sample Contamination
Cross Species Contamination
Depth of Coverage
Concordance check
Intake QC Check Point
Merge Array Genotypes
Multi-Sample QC Check Point
Consent Check Point
Variant Calling
Variant Normalisation
Tumour Ploidy
Tumour Purity
Tumour Clonality
Mutation Signature
Viral Insertions
Actionable Mutation Coverage
SNV & Indel Refinement
Mutation Burden
Inbreeding Coefficient
Homozygosity Runs
Variant Annotation
Variant Tiering
Interpretation Dispatch Exomiser
Delivery API
Integrity Check
MD5 Check
Validate BAM Picard
Filtered Bamstats Unfiltered Bamstats Q30 Bamstats VCF QC
Fix Permissions
Plot Filtered Bamstats Generate Filtered Metrics Bamstats Plot Unfiltered Bamstats Generate Q30 Metrics Bamstats
QC Stats Post-processing
Workflow
diagramme
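The intake steps listed above include an integrity check and an MD5 check. A minimal sketch of what such a check could look like, assuming each delivered file arrives with an expected MD5 hex digest. The function names are hypothetical and this is not Bertha's actual code.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks
    so that very large BAM files never have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def passes_md5_check(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the one supplied at delivery."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Reading in fixed-size chunks matters here because a single whole-genome BAM is on the order of 100 GB.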
Data intake
Single Sample QC & Processing
Multi-sample QC
Analysis
Interpretation Request Dispatched
Interpretation API
Bertha
Distributed Workflow Management System
Interpretation Dispatch
Message Broker
Tracking DB
Job Scheduler
Dashboard
Delivery API
Auditor
Orchestrator
Grid
Consumer
Oxford Bus
6 node Hadoop cluster:
• Transform: 97 min
• Load: 80 sec
• Merge: 84 sec
• Millisecond response
times for regional queries
• Whole genome filtering
queries for all individuals
within seconds
OpenCGA: storage
Extensive capabilities to query across genotype and phenotype relationships
https://github.com/opencb/opencga
To be fully GA4GH compatible from v1.0
global data standards for Genomics - http://ga4gh.org/
Clinical data
+ 150 tables (+2000 variables)
Administrative & Consent
Clinical / medical reviews
Imaging, blood & non genetic tests
Disease status and phenotype
Family & pedigree
Treatments and clinical history
Security and logs:
GMCs access here
Catalog
Bioinformatics
Oxford
OpenCGA - Catalog
Metadata store and A&A for
OpenCGA
• Manages roles, groups,
acls
• Audit log
• LDAP integration
• Arbitrary schemas
(annotation sets)
Cellbase: annotation
Reference Genomic data warehouse
• Compared in testing against VEP
• More than 99.999% similarity in Consequence
types
• Phased annotation implemented for
MNVs
• Initial structural variation annotation
• Can annotate 4-5 families per hour
(>8000 variants/s) on a single
database instance
• Will have (very soon) an R package
similar to biomaRt
PanelApp
https://panelapp.extge.co.uk/crowdsourcing/PanelApp
Panel list
Platform for interpretation
● Filter and classify variants
● Well-defined rules, stable across the project
● General, it works for any family configuration
● Implemented using VCF/Cellbase or OpenCGA
● Based on GA4GH variant model
● Uses pedigrees as defined at Genomics England (based on phenotips format)
● Uses PanelApp as source of gene panels
Variant Tiering
Filters applied to every variant:
• The variant allele is not commonly found in the general healthy population (set criteria for allele frequency filter)
• Familial segregation
• Allelic state matches known mode of inheritance for the gene and disorder (moi required)
Impact of the variant?
• Known pathogenic (not implemented)
• Expected pathogenic (set criteria: transcript_ablation, splice_donor_variant, splice_acceptor_variant, stop_gained, frameshift_variant, stop_lost, initiator_codon_variant)
  Is the variant in a gene in the Virtual Gene Panel (green list) for that disorder? Yes: Tier 1. No: Tier 3.
• Other coding impact (set criteria: inframe_insertion, inframe_deletion, missense_variant, transcript_amplification, splice_region_variant, incomplete_terminal_codon_variant)
  Is the variant in a gene in the Virtual Gene Panel (green list) for that disorder? Yes: Tier 2. No: Tier 3.
• Other: does not fit any of the other criteria
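One reading of the tiering flowchart on this slide can be written as a small function. This is a sketch under stated assumptions, not the Genomics England implementation: the `Variant` class, the `assign_tier` name and the three boolean filter flags are hypothetical, variants failing any filter are assumed to be left untiered, and the "known pathogenic" branch is omitted because the slide marks it as not implemented. The consequence terms are the ones listed on the slide.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Consequence terms copied from the slide.
EXPECTED_PATHOGENIC = {
    "transcript_ablation", "splice_donor_variant", "splice_acceptor_variant",
    "stop_gained", "frameshift_variant", "stop_lost", "initiator_codon_variant",
}
OTHER_CODING_IMPACT = {
    "inframe_insertion", "inframe_deletion", "missense_variant",
    "transcript_amplification", "splice_region_variant",
    "incomplete_terminal_codon_variant",
}


@dataclass(frozen=True)
class Variant:  # hypothetical record, for illustration only
    gene: str
    consequence: str    # Sequence Ontology term
    is_rare: bool       # passed the allele frequency filter
    segregates: bool    # passed the familial segregation filter
    matches_moi: bool   # allelic state fits the mode of inheritance


def assign_tier(variant: Variant, green_genes: Set[str]) -> Optional[str]:
    """Return a tier label, or None if the variant fails a filter."""
    if not (variant.is_rare and variant.segregates and variant.matches_moi):
        return None  # assumption: filtered variants are not tiered
    in_panel = variant.gene in green_genes
    if variant.consequence in EXPECTED_PATHOGENIC:
        return "Tier 1" if in_panel else "Tier 3"
    if variant.consequence in OTHER_CODING_IMPACT:
        return "Tier 2" if in_panel else "Tier 3"
    return "Other"
```

Here `green_genes` stands for the green list of the virtual gene panel for the participant's disorder, as it would be obtained from PanelApp.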
The Cancer Programme
Cancer
Which cancers?
• Lung
• Breast
• Colon
• Prostate
• Ovary
• Hematological
malignancies (CLL)
• Pediatric Cancers
Dr Matthew Parker, Lead Analyst for Cancer (Bioinformatics)
Why sequence?
• Disease of disordered
genomes
• >200 driver genes known
• Stratified
Management/targeted
therapy
• Complications:
Heterogeneity
Sequencing cancer genomes
Tumour genome → Tumour variants
Germline genome → Germline variants
Somatic variation = Tumour variants − Germline variants
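The idea on this slide, that somatic variation is what remains after removing germline variants from tumour variants, can be illustrated as a set difference. This is a deliberately naive sketch: production somatic callers model the tumour and normal reads jointly rather than subtracting call sets, and the `VariantKey` type and function name here are hypothetical.

```python
from typing import NamedTuple, Set


class VariantKey(NamedTuple):
    """Minimal identity of a variant; real pipelines normalise
    representation before comparing."""
    chrom: str
    pos: int
    ref: str
    alt: str


def naive_somatic(tumour: Set[VariantKey],
                  germline: Set[VariantKey]) -> Set[VariantKey]:
    """Variants called in the tumour but not in the matched germline sample."""
    return tumour - germline


# Illustrative, made-up coordinates.
germline_calls = {VariantKey("1", 1000, "A", "G")}
tumour_calls = {VariantKey("1", 1000, "A", "G"), VariantKey("7", 5000, "C", "T")}
print(naive_somatic(tumour_calls, germline_calls))
```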
Coverage
High Depth
ATGCGTTCGATGAGTGATGAAACCCATGATGGATGCCGATGAGATGATG
Coverage
Germline Samples
35x Coverage
• Rare Disease
Participants
• Cancer “Normal”
Cancer Samples
75x Coverage
• Cancer “Tumour”
Samples
Normal
Contamination
Coverage
Why Higher Depth for Cancer?
Clonality/Heterogeneity
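A rough calculation shows why normal contamination and clonality call for deeper tumour sequencing. The sketch below assumes a heterozygous mutation at a diploid locus with no copy-number change, so the expected variant allele fraction is purity × clonal fraction / 2. The function names and the purity and clonal-fraction values are illustrative assumptions. Only the 35x and 75x depths come from the slides.

```python
def expected_vaf(purity: float, clonal_fraction: float) -> float:
    """Expected variant allele fraction for a heterozygous mutation at a
    diploid locus (assumption: no copy-number change)."""
    return purity * clonal_fraction / 2


def expected_alt_reads(depth: float, purity: float, clonal_fraction: float) -> float:
    """Average number of reads expected to carry the mutant allele."""
    return depth * expected_vaf(purity, clonal_fraction)


# Illustrative case: 50% tumour purity, mutation present in 40% of tumour cells.
for depth in (35, 75):
    reads = expected_alt_reads(depth, purity=0.5, clonal_fraction=0.4)
    print(f"{depth}x coverage: about {reads:.1f} supporting reads expected")
```

With only a handful of supporting reads expected, a subclonal variant is hard to tell apart from sequencing error, which is the motivation for the higher tumour depth.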
Cancer Pilot
• Resections/Biopsies are
routinely fixed in formalin and
embedded in paraffin
• Causes DNA damage
• Difficult to extract DNA
• Fresh frozen logistically
difficult & not trusted to
maintain morphology
Fresh Frozen vs Formalin-fixed, paraffin-
embedded (FFPE) tumour samples
Cancer Pilot
• Difficulty in obtaining long
fragments
• “Random” DNA damage
• “Cross-links” DNA which can be
reversed – but currently at high
temperatures
• Chimeric fragments in library
preparation
Problems with FFPE
Heat
A T
Repetitive
Regions Re-
anneal causing
Chimeric
Reads
GC Rich
regions are
more robust
FFPE = Formalin-fixed, paraffin-embedded tumour samples
Read Alignment
GC Content
FF Copy Number Data
FFPE Copy Number Data
Fraction of overlapping SNVs
in FF and FFPE samples from 5 trios
Improving FFPE Sequencing
What can we do?
Procedure → Fixation → DNA Extraction → Library Preparation
Cold Ischaemic Time
Storage Conditions
Time of Fixation
Size of Sample
pH of Fixative
Temperature of De-crosslinking
Addition of Salt
Cancer reports
• Quality metrics pre- and post-sequencing
• A small number of clinically actionable mutations
• Germline results which affect cancer development
• Remainder of results are mostly of research interest
for now, but in future may assist:
• Drug development
• Targeted treatment selection
• Prediction of prognosis
• Monitoring of disease progression
Rare Disease Programme
The case for whole genomes
• Severe intellectual disability occurs in 0.5% of newborns
• Whole-genome sequencing at 80x in 50 parent-offspring trios with no
diagnosis for the child's severe intellectual disability.
• Overall 62% increase in diagnostic yield with WGS.
• Most diagnoses were for de-novo dominant mutations, roughly
equally divided in SNVs and CNVs.
Gilissen et al (2014), Nature PMID: 24896178
Why make a genetic diagnosis?
49
For a patient with
rare disease
• Understand why their
condition happened
• More accurate knowledge of
how it might develop in
future
• Possible treatment avenues
• Early intervention may help
avoid disability
• Contact with others with the
same condition
For the family
• Predict whether family
members will get the
condition
• Offer screening/treatment to
prevent it
• Reproductive decisions
For medical research
• Further our understanding of
disease mechanisms
• Novel drug development or
drug repurposing
Rare disease programme
• Over 200 disorders so far
Data model: describes the clinical
information to be collected for each
disorder
Disorders nominated by the NHS and
academia
Eligibility & Exclusion criteria for
recruitment; rare, mendelian, unmet
clinical diagnostic need, prior genetic
testing
Virtual Gene panel to aid analysis
Challenges
• Equity of diseases for
inclusion
• Tightness of criteria
for patient inclusion
• Equity of WGS
consumption per
phenotype
The biggest challenge?
51
Interpretation
• ~5-10 million variants in our
genome
• ~3.5 million “known” SNPs
• ~0.5 million “novel” SNPs
• ~0.5 million small indels
• ~1000 large (>500bp) CNVs
• ~20,000-25,000 coding variants
• ~9,000-11,000 non-synonymous
• 92 rare missense variants (MAF
<0.1%)
• 5 rare truncating variants (MAF
<0.1%)
• 0-2 de novo variants
What information is needed?
52
To aid interpretation of variants
• Allele frequency: How common is the variant in the ‘healthy’
population?
• Familial segregation: Is the variant present in the family
members with the disorder, and not in those without it?
• Mode of inheritance: Does the pattern fit with the
inheritance within the family and what is known about the
gene?
• Likely consequence: Does the variant cause a change in the
protein sequence likely to affect function?
• Gene panel: Is the variant in a gene associated with causing
the disorder?
• Known pathogenicity? Has the variant been seen before in
people with the same disease?
Rare Diseases
Gender
• X chromosome homozygosity, Y chromosome genotyping
rate
• Copy number for X and Y chromosomes
Relatedness
• Mendelian error checking for parent-child pairs
• IBD sharing estimation for all participants
Inbreeding/ excess homozygosity
• Observed vs expected homozygosity
Ancestry
• Multidimensional scaling
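The Mendelian error check listed above can be sketched for a single trio as follows. The sketch assumes autosomal sites, complete genotypes given as pairs of alleles, and hypothetical function names. A genuine de novo variant would also be counted as an error here, so in practice the check is read as a rate across many sites rather than site by site.

```python
from itertools import product
from typing import Iterable, Tuple

Genotype = Tuple[str, str]  # two alleles, e.g. ("A", "G")


def is_mendelian_consistent(child: Genotype, mother: Genotype,
                            father: Genotype) -> bool:
    """True if the child could have inherited one allele from each parent."""
    child_sorted = sorted(child)
    return any(sorted((m, f)) == child_sorted
               for m, f in product(mother, father))


def mendelian_error_rate(
        sites: Iterable[Tuple[Genotype, Genotype, Genotype]]) -> float:
    """Fraction of (child, mother, father) sites that violate inheritance."""
    site_list = list(sites)
    if not site_list:
        return 0.0
    errors = sum(not is_mendelian_consistent(c, m, f) for c, m, f in site_list)
    return errors / len(site_list)
```

A high error rate for a declared parent-child pair suggests a sample swap or a mis-specified pedigree rather than many independent genotyping errors.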
53
Genetic data checks and analyses
herine Smith, Lead Analyst for Rare Disorders (Bioinformatics)
Rare Disease Pilot
54
4800 people
Primary Data
• 4,128 participants
data cleansed
• (15,065 including
family members),
• 149 different
conditions. 
• 56,004 HPO terms
used
• 12,966 terms present
• 43,088 terms absent
Secondary Data
• Hospital Episodes
• 250,000 records
• 11,910 - Accident
Dept
• 37,479 - Inpatient
• 199,418 - Outpatients
Rare disease pilot – 4,919 samples
55
Relatedness checking
56
Georgia
57
Georgia and her family
Image courtesy of Great Ormond
Street Hospital
• Undiagnosed condition that
included physical and mental
developmental delay, a rare eye
condition affecting sight, impaired
kidney function, verbal dyspraxia.
• Through enrolling in the project, a
mutation in a single gene was
found in Georgia’s genome which
is likely to be the cause of her
condition.
• Provides a molecular diagnosis for
her condition for the first time.
Maria Bitner-Glindzicz –
Great Ormond Street Hospital
http://www.genomicsengland.co.uk/first-children-recieve-diagnoses-through-100000-genomes-project/
Jessica
58
Jessica and her family.
Image courtesy of Great Ormond
Street Hospital.
“Now that we have this diagnosis there are
things that we can do differently almost
straight away. Her condition is one that has a
high chance of improvement on a special
diet, which means that her medication
dose is likely to decrease and her epilepsy
may be more easily controlled. Hopefully she
might have better balance so she can be
more stable and walk more…”
“…More than anything the outcome of the
project has taken the uncertainty out of life
for us and the worry of not knowing what was
wrong. It has allowed us to feel like we can
take control of things and make positive
changes for Jessica. It may also open doors to
other research projects that we can go on.
These could be more specific to her condition
and we are hopeful that they could one
day find a cure.”
http://www.genomicsengland.co.uk/first-children-recieve-diagnoses-through-100000-genomes-project/
Mum, Kate Palmer:
Thank you!

Más contenido relacionado

La actualidad más candente

Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation SequencingAtifa Ambreen
 
Gene knockout in mice
Gene knockout in miceGene knockout in mice
Gene knockout in miceAbuKarulai
 
Gene prediction methods vijay
Gene prediction methods  vijayGene prediction methods  vijay
Gene prediction methods vijayVijay Hemmadi
 
genome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sirgenome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sirKAUSHAL SAHU
 
Whole genome sequence.
Whole genome sequence.Whole genome sequence.
Whole genome sequence.jayalakshmi311
 
Systems biology & Approaches of genomics and proteomics
 Systems biology & Approaches of genomics and proteomics Systems biology & Approaches of genomics and proteomics
Systems biology & Approaches of genomics and proteomicssonam786
 
Whole genome sequence
Whole genome sequenceWhole genome sequence
Whole genome sequencesababibi
 
Assembly and finishing
Assembly and finishingAssembly and finishing
Assembly and finishingNikolay Vyahhi
 
Prokka - rapid bacterial genome annotation - ABPHM 2013
Prokka - rapid bacterial genome annotation - ABPHM 2013Prokka - rapid bacterial genome annotation - ABPHM 2013
Prokka - rapid bacterial genome annotation - ABPHM 2013Torsten Seemann
 
2 whole genome sequencing and analysis
2 whole genome sequencing and analysis2 whole genome sequencing and analysis
2 whole genome sequencing and analysissaberhussain9
 
Expressed sequence tag (EST), molecular marker
Expressed sequence tag (EST), molecular markerExpressed sequence tag (EST), molecular marker
Expressed sequence tag (EST), molecular markerKAUSHAL SAHU
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencingTapish Goel
 

La actualidad más candente (20)

Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
 
Gene knockout in mice
Gene knockout in miceGene knockout in mice
Gene knockout in mice
 
Gene prediction methods vijay
Gene prediction methods  vijayGene prediction methods  vijay
Gene prediction methods vijay
 
genome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sirgenome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sir
 
Whole genome sequence.
Whole genome sequence.Whole genome sequence.
Whole genome sequence.
 
Systems biology & Approaches of genomics and proteomics
 Systems biology & Approaches of genomics and proteomics Systems biology & Approaches of genomics and proteomics
Systems biology & Approaches of genomics and proteomics
 
Ngs introduction
Ngs introductionNgs introduction
Ngs introduction
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Whole genome sequence
Whole genome sequenceWhole genome sequence
Whole genome sequence
 
PRIDE-ProteomeXchange
PRIDE-ProteomeXchangePRIDE-ProteomeXchange
PRIDE-ProteomeXchange
 
Assembly and finishing
Assembly and finishingAssembly and finishing
Assembly and finishing
 
Sequence assembly
Sequence assemblySequence assembly
Sequence assembly
 
Mutational analysis
Mutational analysisMutational analysis
Mutational analysis
 
Prokka - rapid bacterial genome annotation - ABPHM 2013
Prokka - rapid bacterial genome annotation - ABPHM 2013Prokka - rapid bacterial genome annotation - ABPHM 2013
Prokka - rapid bacterial genome annotation - ABPHM 2013
 
Swiss PROT
Swiss PROT Swiss PROT
Swiss PROT
 
Genome Big Data
Genome Big DataGenome Big Data
Genome Big Data
 
2 whole genome sequencing and analysis
2 whole genome sequencing and analysis2 whole genome sequencing and analysis
2 whole genome sequencing and analysis
 
Expressed sequence tag (EST), molecular marker
Expressed sequence tag (EST), molecular markerExpressed sequence tag (EST), molecular marker
Expressed sequence tag (EST), molecular marker
 
Mouse genome
Mouse genomeMouse genome
Mouse genome
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 

Destacado

GeneTalk engl. analyze human sequence variants
GeneTalk engl.   analyze human sequence variantsGeneTalk engl.   analyze human sequence variants
GeneTalk engl. analyze human sequence variantsAlexej Knaus
 
Bioinformatics Institute of India
Bioinformatics Institute of IndiaBioinformatics Institute of India
Bioinformatics Institute of Indiabiinoida
 
Applications Of Bioinformatics In Drug Discovery And Process
Applications Of Bioinformatics In Drug Discovery And ProcessApplications Of Bioinformatics In Drug Discovery And Process
Applications Of Bioinformatics In Drug Discovery And ProcessProf. Dr. Basavaraj Nanjwade
 
Bioinformatics Final Presentation
Bioinformatics Final PresentationBioinformatics Final Presentation
Bioinformatics Final PresentationShruthi Choudary
 
Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016
Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016
Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016onthewight
 
Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...
Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...
Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...onthewight
 

Destacado (6)

GeneTalk engl. analyze human sequence variants
GeneTalk engl.   analyze human sequence variantsGeneTalk engl.   analyze human sequence variants
GeneTalk engl. analyze human sequence variants
 
Bioinformatics Institute of India
Bioinformatics Institute of IndiaBioinformatics Institute of India
Bioinformatics Institute of India
 
Applications Of Bioinformatics In Drug Discovery And Process
Applications Of Bioinformatics In Drug Discovery And ProcessApplications Of Bioinformatics In Drug Discovery And Process
Applications Of Bioinformatics In Drug Discovery And Process
 
Bioinformatics Final Presentation
Bioinformatics Final PresentationBioinformatics Final Presentation
Bioinformatics Final Presentation
 
Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016
Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016
Dr. Jon Whitehurst - Bats, Maths and Maps - Isle of Wight Cafe Sci - Nov 2016
 
Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...
Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...
Dr Catherine Mercer and Dr Frank Ratcliff - The 100,000 Genome Project - Jan ...
 

Similar a 100,000 Genomes Project.

VarSeq 2.6.0: Advancing Pharmacogenomics and Genomic Analysis
VarSeq 2.6.0: Advancing Pharmacogenomics and Genomic AnalysisVarSeq 2.6.0: Advancing Pharmacogenomics and Genomic Analysis
VarSeq 2.6.0: Advancing Pharmacogenomics and Genomic AnalysisGolden Helix
 
Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Malachi Griffith
 
Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)
Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)
Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)NHSNWRD
 
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...QIAGEN
 
3b. Biotechnolgies & Genomics - Jane Theaker
3b. Biotechnolgies & Genomics - Jane Theaker3b. Biotechnolgies & Genomics - Jane Theaker
3b. Biotechnolgies & Genomics - Jane TheakerIventus
 
Kevin Dean Digital Health Assembly 2015
Kevin Dean Digital Health Assembly 2015 Kevin Dean Digital Health Assembly 2015
Kevin Dean Digital Health Assembly 2015 DHA2015
 
Canopy BioSciences August 2017
Canopy BioSciences August 2017Canopy BioSciences August 2017
Canopy BioSciences August 2017Jens-Ole Bock
 
2013-07-17: Incorporating Personalized Medicine in Community Hospital Systems
2013-07-17: Incorporating Personalized Medicine in Community Hospital Systems2013-07-17: Incorporating Personalized Medicine in Community Hospital Systems
2013-07-17: Incorporating Personalized Medicine in Community Hospital SystemsBaltimore Lean Startup
 
A Step to the Clouded Solution of Scalable Clinical Genome Sequencing (BDT308...
A Step to the Clouded Solution of Scalable Clinical Genome Sequencing (BDT308...A Step to the Clouded Solution of Scalable Clinical Genome Sequencing (BDT308...
A Step to the Clouded Solution of Scalable Clinical Genome Sequencing (BDT308...Amazon Web Services
 
VarSeq 2.4.0: VSClinical ACMG Workflow from the User Perspective
VarSeq 2.4.0: VSClinical ACMG Workflow from the User PerspectiveVarSeq 2.4.0: VSClinical ACMG Workflow from the User Perspective
VarSeq 2.4.0: VSClinical ACMG Workflow from the User PerspectiveGolden Helix
 
VarSeq 2.4.0: VSClinical ACMG Workflow from the User Perspective
VarSeq 2.4.0: VSClinical ACMG Workflow from the User PerspectiveVarSeq 2.4.0: VSClinical ACMG Workflow from the User Perspective
VarSeq 2.4.0: VSClinical ACMG Workflow from the User PerspectiveGolden Helix
 
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference DatabaseDevelopment of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference DatabaseNathan Olson
 
GIAB update for GRC GIAB workshop 191015
GIAB update for GRC GIAB workshop 191015GIAB update for GRC GIAB workshop 191015
GIAB update for GRC GIAB workshop 191015GenomeInABottle
 
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference DatabaseDevelopment of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Databasenist-spin
 
Best Practices for Validating a Next-Gen Sequencing Workflow
Best Practices for Validating a Next-Gen Sequencing WorkflowBest Practices for Validating a Next-Gen Sequencing Workflow
Best Practices for Validating a Next-Gen Sequencing WorkflowGolden Helix
 
The BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
The BRCA Challenge & Exchange: Progress and Plans - Gunnar RätschThe BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
The BRCA Challenge & Exchange: Progress and Plans - Gunnar RätschHuman Variome Project
 

Similar a 100,000 Genomes Project. (20)

Axt microarrays
Axt microarraysAxt microarrays
Axt microarrays
 
VarSeq 2.6.0: Advancing Pharmacogenomics and Genomic Analysis
VarSeq 2.6.0: Advancing Pharmacogenomics and Genomic AnalysisVarSeq 2.6.0: Advancing Pharmacogenomics and Genomic Analysis
VarSeq 2.6.0: Advancing Pharmacogenomics and Genomic Analysis
 
Oncogenomics 2013
Oncogenomics 2013Oncogenomics 2013
Oncogenomics 2013
 
Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...
 
Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)
Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)
Let's Talk Research Annual Conference - 24th-25th September 2014 (Jane Rogan)
 
Open data genomics_palermo_2017_ver03
Open data genomics_palermo_2017_ver03Open data genomics_palermo_2017_ver03
Open data genomics_palermo_2017_ver03
 
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
 
3b. Biotechnolgies & Genomics - Jane Theaker
3b. Biotechnolgies & Genomics - Jane Theaker3b. Biotechnolgies & Genomics - Jane Theaker
3b. Biotechnolgies & Genomics - Jane Theaker
 
Kevin Dean Digital Health Assembly 2015
Kevin Dean Digital Health Assembly 2015 Kevin Dean Digital Health Assembly 2015
Kevin Dean Digital Health Assembly 2015
 
100,000 Genomes Project.

  • 1. The 100,000 Genomes Project David Montaner Bioinformatics Department david.montaner@genomicsengland.co.uk Valencia University, October 6th 2016
  • 2. Talk Outline 1. Introduction & Background 2. Pipelines 3. Systems and Databases 4. Cancer 5. Rare Diseases
  • 3. The 100,000 Genomes Project Genomics England & Partners
  • 4. Who are we & what are we doing? Generate health & wealth. Genomics England • Owned by the Department of Health, UK • Set up to deliver the 100,000 Genomes Project: 100,000 whole genome sequences of National Health Service (NHS) patients with: • Rare Diseases (and family members) • Cancer. Aims: • Create an ethical and transparent programme based on consent • Establish the infrastructure, human capacity & capability to set up a genomic medicine service for the NHS and bring benefit to patients • Enable new scientific discovery and medical insights, and add to the already extensive databases on human variation • Working with the National Health Service (NHS), academics and industry to make the UK a world leader in Genomic Medicine
  • 5. • Sequence 100,000 genomes • Cancer and rare genetic disease • Capture data delivered electronically, store it securely and analyse it • within an English data centre (reading library) • Combine genomes with extracted clinical information for analysis, interpretation, and aggregation • Create capacity, capability and legacy in personalised medicine for the UK Goals of Genomics England 1. To bring benefit to NHS patients 2. To enable new scientific discovery and medical insights 3. To create an ethical and transparent programme based on consent 4. To kickstart the development of a UK genomics industry
  • 6. Inception of the 100,000 genomes project (2012, 2014) “If we get this right, we could transform how we diagnose and treat our most complex diseases not only here but across the world” (December 2012) “I am determined to do all I can to support the health and scientific sector to unlock the power of DNA, turning an important scientific breakthrough into something that will help deliver better tests, better drugs and above all better care for patients.” (August 2014)
  • 7. Schedule  2012 -2014: consortium creation  2014-2015: pilot studies  2016-2015: main project
  • 9. Where are we? London: • Management • All data storage. Cambridge: • Software for genomic data storage. Oxford: • Software for clinical data storage and collection
  • 10. Recruitment and clinical interface: 13 “GMCs”, Scotland and Northern Ireland • Genomic Medicine Centres • Networks of NHS hospitals including genomics labs • 13 “Lead organisations” plus 71 “Local Delivery Partners” • Contracted by NHS England • Cover recruitment, data and return of results • Scotland • Doing own sequencing • Northern Ireland • Similar to a GMC • Contracted by NI payer
  • 11. The Journey of a Genome • Consent & Sample collection • DNA extraction • Biorepository • Sequencing • Variant Calling • Interpretation • Feedback to clinician • Validation • Treatment
  • 12. The Journey of a Genome: Partners • Consent & Sample collection • DNA extraction • Biorepository • Sequencing • Variant Calling • Interpretation • Feedback to clinician • Validation • Treatment. Genomic Medicine Centres (GMCs): 13 NHS organisations. Genomics England Clinical Interpretation Partnerships (GeCIPs): collaborations of clinicians & academics, > 2,000 researchers. Clinical interpretation companies: • Omicia • Congenica • Nextcode. HiSeq X Ten
  • 13. GENE Consortium • Working together on a year-long Industry Trial involving a selection of whole genome sequences across cancer and rare diseases • Aims to identify most effective and secure way to accelerate development of new diagnostics and treatments for patients  • Working in a pre-competitive environment AbbVie Alexion Pharmaceuticals AstraZeneca Berg Health Biogen Dimension Therapeutics GSK Helomics NGM Biopharmaceuticals Roche Takeda Genomics Expert Network for Enterprises
  • 14. Simplified Workflow • BAM file from Illumina • Variant Calling pipelines: VCF file • QC1 • QC2 • Variant Annotation • Tiering of variants • Dispatch • Clinical Interpretation • QC Portal • Reporting portal • Medical review • Validation • Genomic Medicine Centre (GMC)
  • 15. Bioinformatics Team Role • Consent & Sample collection • DNA extraction • Biorepository • Sequencing • Variant Calling • Interpretation • Feedback to clinician • Validation • Treatment
  • 16. Genomics Education Health Education England • MSc in Genomic Medicine • 10 Universities across the UK • Online training courses and resources • The fundamentals of genomics • Sample handling and DNA extraction • Bioinformatics • How to support patients through the consent process Genomics England Communications Team
  • 17. Update on numbers: at about 10% • >10,000 genomes received • >1PB of primary data • >1.3M files received or generated and indexed • 200M germline variants databased • 48M somatic variants databased • 70,000 HPO terms asserted • >450,000 hospital episodes
  • 18. 100,000 Genomes: Huge Amount of Data • Rare Disease • Each Genome: 100Gb • Trio is preferred so 300Gb per participant • x 50,000 participants = 15,000,000Gb total • Cancer • Germline: 100Gb • Tumour: 200Gb • 300Gb per patient • x 25,000 participants = 7,500,000Gb total • 10,000,000Gb = 10 Petabytes • Expecting around 30 Petabytes. 10 Billion Photos = 1.5 Petabytes. Data Processed in 1 day = 20 Petabytes
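The per-cohort storage arithmetic on the data-volume slide can be checked with a few lines of Python. This is a minimal sketch, assuming the slide's "Gb" means gigabytes and using decimal units (1 PB = 1,000,000 GB). The function name `cohort_storage_gb` is illustrative, and the products are computed here from the per-genome sizes and participant counts rather than copied from the slide.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 1,000,000 GB


def cohort_storage_gb(gb_per_participant: int, participants: int) -> int:
    """Raw storage for a cohort, in gigabytes."""
    return gb_per_participant * participants


# Rare disease: a trio of 100 GB genomes per participant, 50,000 participants.
rare_gb = cohort_storage_gb(3 * 100, 50_000)

# Cancer: 100 GB germline + 200 GB tumour per patient, 25,000 patients.
cancer_gb = cohort_storage_gb(100 + 200, 25_000)

total_pb = (rare_gb + cancer_gb) / GB_PER_PB

print(f"Rare disease: {rare_gb:,} GB")
print(f"Cancer:       {cancer_gb:,} GB")
print(f"Total:        {total_pb} PB")
```

Under these assumptions the raw sequence data alone comes to 22.5 PB, before any intermediate or derived files are counted.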
  • 20. bertha_default 1.1.0 Single Sample QC & Processing Analysis Intake QC Multi Sample QC Cross Sample Contamination Single-Sample QC Check Point Identity by Descent Mendelian Inconsistency Rate Sex Check Somatic VCF re-headering Tumour Cross Sample Contamination Cross Species Contamination Depth of Coverage Concordance check Intake QC Check Point Merge Array Genotypes Multi-Sample QC Check Point Consent Check Point Variant Calling Variant Normalisation Tumour Ploidy Tumour Purity Tumour Clonality Mutation Signature Viral Insertions Actionable Mutation Coverage SNV & Indel Refinement Mutation Burden Inbreeding Coefficient Homozygosity Runs Variant Annotation Variant Tiering Interpretation Dispatch Exomiser Delivery API Integrity Check MD5 Check Validate BAM Picard Filtered Bamstats Unfiltered Bamstats Q30 Bamstats VCF QC Fix Permissions Plot Filtered Bamstats Generate Filtered Metrics Bamstats Plot Unfiltered Bamstats Generate Q30 Metrics Bamstats QC Stats Post-processing Workflow diagram Data intake Single Sample QC & Processing Multi-sample QC Analysis Interpretation Request Dispatched Interpretation API
  • 21. Bertha Distributed Workflow Management System Interpretation Dispatch Message Broker Tracking DB Job Scheduler Dashboard Delivery API Auditor Orchestrator Grid Consumer Oxford Bus
  • 22. 6 node Hadoop cluster: • Transform: 97 min • Load: 80 sec • Merge: 84 sec • Millisecond response times for regional queries • Whole genome filtering queries for all individuals within seconds OpenCGA: storage Extensive capabilities to query across genotype and phenotype relationships https://github.com/opencb/opencga
  • 23. To be fully GA4GH compatible from v1.0 global data standards for Genomics - http://ga4gh.org/
  • 24. Clinical data + 150 tables (+2000 variables) Administrative & Consent Clinical / medical reviews Imaging, blood & non genetic tests Disease status and phenotype Family & pedigree Treatments and clinical history Security and logs: CMCs access here Catalog Bioinformatics Oxford
  • 25. OpenCGA - Catalog Metadata store and A&A for OpenCGA • Manages roles, groups, ACLs • Audit log • LDAP integration • Arbitrary schemas (annotation sets)
  • 26. Cellbase: annotation Reference Genomic data warehouse • Compared in testing against VEP • More than 99.999% similarity in Consequence types • Phased annotation implemented for MNVs • Initial structural variation annotation • Can annotate 4-5 families per hour (>8000 variants/s) on a single database instance • Will have (very soon) an R package similar to biomaRt
  • 30. Variant Tiering ● Filter and classify variants ● Well-defined rules, stable across the project ● General, it works for any family configuration ● Implemented using VCF/Cellbase or OpenCGA ● Based on GA4GH variant model ● Uses pedigrees as defined at Genomics England (based on PhenoTips format) ● Uses PanelApp as source of gene panels
  • 31. Variant Tiering (flowchart). Filters applied to each variant: the variant allele is not commonly found in the general healthy population (set criteria for allele frequency filter); familial segregation; allelic state matches known mode of inheritance for the gene and disorder (moi required). Impact of the variant? Expected pathogenic (set criteria: transcript_ablation, splice_donor_variant, splice_acceptor_variant, stop_gained, frameshift_variant, stop_lost, initiator_codon_variant); Known pathogenic (not implemented); Other coding impact (set criteria: inframe_insertion, inframe_deletion, missense_variant, transcript_amplification, splice_region_variant, incomplete_terminal_codon_variant); Other (does not fit any of the other criteria). Is the variant in a gene in the Virtual Gene Panel (green list) for that disorder? Yes / No. Outcomes: Tier 1, Tier 2, Tier 3
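A minimal sketch of how the tiering flowchart might be expressed in code. The two consequence lists are taken from the slide. The `Variant` class, the `tier` function, the placeholder gene names, and the exact mapping of branches to tiers (expected-pathogenic impact in a green-list gene gives Tier 1, other coding impact in a green-list gene gives Tier 2, either impact outside the panel gives Tier 3, anything else is left untiered) are assumptions made for illustration, not the Genomics England implementation.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Consequence types listed on the slide.
EXPECTED_PATHOGENIC = {
    "transcript_ablation", "splice_donor_variant", "splice_acceptor_variant",
    "stop_gained", "frameshift_variant", "stop_lost", "initiator_codon_variant",
}
OTHER_CODING = {
    "inframe_insertion", "inframe_deletion", "missense_variant",
    "transcript_amplification", "splice_region_variant",
    "incomplete_terminal_codon_variant",
}


@dataclass
class Variant:  # hypothetical record, not a real API type
    consequence: str
    gene: str
    is_rare: bool          # passes the allele frequency filter
    segregates: bool       # consistent with familial segregation
    matches_moi: bool      # allelic state matches the mode of inheritance


def tier(variant: Variant, green_genes: Set[str]) -> Optional[int]:
    """Return 1, 2 or 3, or None when the variant is not tiered."""
    # All three upstream filters must pass before impact is considered.
    if not (variant.is_rare and variant.segregates and variant.matches_moi):
        return None
    in_panel = variant.gene in green_genes
    if variant.consequence in EXPECTED_PATHOGENIC:
        return 1 if in_panel else 3
    if variant.consequence in OTHER_CODING:
        return 2 if in_panel else 3
    return None  # assumption: "Other" consequences are left untiered


panel = {"GENE_A"}  # placeholder green-list gene
print(tier(Variant("stop_gained", "GENE_A", True, True, True), panel))
print(tier(Variant("missense_variant", "GENE_A", True, True, True), panel))
print(tier(Variant("stop_gained", "GENE_B", True, True, True), panel))
```

Keeping the consequence sets as plain data makes the rules easy to audit against the slide, which fits the stated goal of well-defined rules that stay stable across the project.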
  • 33. Cancer. Which cancers? • Lung • Breast • Colon • Prostate • Ovary • Hematological malignancies (CLL) • Pediatric Cancers. Why sequence? • Disease of disordered genomes • >200 driver genes known • Stratified Management/targeted therapy • Complications: Heterogeneity. Dr Matthew Parker, Lead Analyst for Cancer (Bioinformatics)
  • 35. Coverage: High Depth Coverage. Germline Samples: 35x Coverage • Rare Disease Participants • Cancer “Normal”. Cancer Samples: 75x Coverage • Cancer “Tumour” Samples. Dr Matthew Parker, Lead Analyst for Cancer (Bioinformatics)
  • 36. Coverage: Why Higher Depth for Cancer? • Normal Contamination • Clonality/Heterogeneity
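One way to see why normal contamination and clonality call for deeper tumour sequencing is to estimate how many reads would carry a somatic variant at each depth. The sketch below uses a deliberately simplified model, assuming a heterozygous somatic SNV in a diploid region with no copy-number change, so that the expected variant allele fraction is purity × clonal fraction / 2. The function name and the example purity and clonal fraction values are hypothetical.

```python
def expected_alt_reads(depth: float, purity: float, clonal_fraction: float) -> float:
    """Expected number of reads supporting a heterozygous somatic SNV.

    depth:           mean sequencing depth at the site
    purity:          fraction of cells in the sample that are tumour cells
    clonal_fraction: fraction of tumour cells carrying the variant
    """
    # One of two alleles is mutated in the affected cells (diploid assumption).
    vaf = purity * clonal_fraction / 2
    return depth * vaf


# A subclonal variant (half of the tumour cells) in a sample that is 50% tumour.
for depth in (35, 75):
    print(depth, expected_alt_reads(depth, purity=0.5, clonal_fraction=0.5))
```

In this example the variant is expected in about 4 reads at 35x and about 9 reads at 75x, which illustrates how the extra depth helps separate low-fraction variants from sequencing noise.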
  • 37. Cancer Pilot: Fresh Frozen vs Formalin-fixed, paraffin-embedded (FFPE) tumour samples • Resections/Biopsies are routinely fixed in formalin and embedded in paraffin • Causes DNA damage • Difficult to extract DNA • Fresh frozen logistically difficult & not trusted to maintain morphology
  • 38. Cancer Pilot: Problems with FFPE • Difficulty in obtaining long fragments • “Random” DNA damage • “Cross-links” DNA which can be reversed, but currently at high temperatures • Chimeric fragments in library preparation. Figure: Heat • AT Repetitive Regions Re-anneal causing Chimeric Reads • GC Rich regions are more robust. FFPE = Formalin-fixed, paraffin-embedded tumour samples
  • 41. FF Copy Number Data
  • 42. FFPE Copy Number Data
  • 43. Fraction of overlapping SNVs in FF and FFPE samples from 5 trios
  • 44. Improving FFPE Sequencing: What can we do? Stages: Procedure • Fixation • DNA Extraction • Library Preparation. Factors: Cold Ischaemic Time • Storage Conditions • Time of Fixation • Size of Sample • pH of Fixative • Temperature of De-crosslinking • Addition of Salt. FFPE = Formalin-fixed, paraffin-embedded tumour samples
  • 45. Cancer reports • Quality metrics pre- and post-sequencing • A small number of clinically actionable mutations • Germline results which affect cancer development • Remainder of results are mostly of research interest for now, but in future may assist: • Drug development • Targeted treatment selection • Prediction of prognosis • Monitoring of disease progression
  • 48. The case for whole genomes • Severe intellectual disability occurs in 0.5% of newborns • Whole-genome sequencing at 80x in 50 parent-offspring trios with no diagnosis for their severe intellectual disability • Overall 62% increase in diagnostic yield with WGS • Most diagnoses were for de novo dominant mutations, roughly equally divided between SNVs and CNVs. Gilissen et al (2014), Nature PMID: 24896178
  • 49. Why make a genetic diagnosis? For a patient with rare disease • Understand why their condition happened • More accurate knowledge of how it might develop in future • Possible treatment avenues • Early intervention may help avoid disability • Contact with others with the same condition. For the family • Predict whether family members will get the condition • Offer screening/treatment to prevent it • Reproductive decisions. For medical research • Further our understanding of disease mechanisms • Novel drug development or drug repurposing
  • 50. Rare disease programme • Over 200 disorders so far • Data model: describes the clinical information to be collected for each disorder • Disorders nominated by the NHS and academia • Eligibility & Exclusion criteria for recruitment: rare, mendelian, unmet clinical diagnostic need, prior genetic testing • Virtual Gene panel to aid analysis. Challenges • Equity of diseases for inclusion • Tightness of criteria for patient inclusion • Equity of WGS consumption per phenotype
  • 51. The biggest challenge? Interpretation • ~5-10 million variants in our genome • ~3.5 million “known” SNPs • ~0.5 million “novel” SNPs • ~0.5 million small indels • ~1000 large (>500bp) CNVs • ~20,000-25,000 coding variants • ~9,000-11,000 non-synonymous • 92 rare missense variants (MAF <0.1%) • 5 rare truncating variants (MAF <0.1%) • 0-2 de novo variants
  • 52. What information is needed to aid interpretation of variants? • Allele frequency: How common is the variant in the ‘healthy’ population? • Familial segregation: Is the variant present in the family members with the disorder, and not in those without it? • Mode of inheritance: Does the pattern fit with the inheritance within the family and what is known about the gene? • Likely consequence: Does the variant cause a change in the protein sequence likely to affect function? • Gene panel: Is the variant in a gene associated with causing the disorder? • Known pathogenicity: Has the variant been seen before in people with the same disease?
  • 53. Rare Diseases: Genetic data checks and analyses. Gender • X chromosome homozygosity, Y chromosome genotyping rate • Copy number for X and Y chromosomes. Relatedness • Mendelian error checking for parent-child pairs • IBD sharing estimation for all participants. Inbreeding/excess homozygosity • Observed vs expected homozygosity. Ancestry • Multidimensional scaling. herine Smith, Lead Analyst for Rare Disorders (Bioinformatics)
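The Mendelian error check listed among the relatedness analyses can be illustrated with a short sketch. It assumes autosomal, biallelic sites with no missing calls, and genotypes encoded as pairs of allele indices (0 = reference, 1 = alternate). The function names and the example genotypes are hypothetical and this is not the project's pipeline code.

```python
from typing import List, Tuple

Genotype = Tuple[int, int]


def is_mendelian_consistent(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """True if the child could have inherited one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def mendelian_error_rate(sites: List[Tuple[Genotype, Genotype, Genotype]]) -> float:
    """Fraction of (child, father, mother) sites that violate Mendelian inheritance."""
    if not sites:
        raise ValueError("at least one site is required")
    errors = sum(not is_mendelian_consistent(c, f, m) for c, f, m in sites)
    return errors / len(sites)


# Four example sites, each given as (child, father, mother).
sites = [
    ((0, 1), (0, 0), (0, 1)),  # consistent: alternate allele from mother
    ((1, 1), (0, 1), (0, 0)),  # error: mother has no alternate allele
    ((0, 0), (0, 1), (0, 1)),  # consistent
    ((0, 1), (0, 0), (0, 0)),  # error, or a candidate de novo variant
]
print(mendelian_error_rate(sites))
```

A rate far above what genotyping error and rare de novo events would explain suggests a sample swap or a misreported relationship, which is why a check of this kind sits in quality control before interpretation.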
  • 54. Rare Disease Pilot: 4,800 people. Primary Data • 4,128 participants data cleansed • (15,065 including family members) • 149 different conditions • 56,004 HPO terms used • 12,966 terms present • 43,088 terms absent. Secondary Data • Hospital Episodes • 250,000 records • 11,910 - Accident Dept • 37,479 - Inpatient • 199,418 - Outpatients
  • 55. Rare disease pilot: 4,919 samples
  • 57. Georgia • Undiagnosed condition that included physical and mental developmental delay, a rare eye condition affecting sight, impaired kidney function, verbal dyspraxia. • Through enrolling in the project, a mutation in a single gene was found in Georgia’s genome which is likely to be the cause of her condition. • Provides a molecular diagnosis for her condition for the first time. Maria Bitner-Glindzicz, Great Ormond Street Hospital. Georgia and her family. Image courtesy of Great Ormond Street Hospital. http://www.genomicsengland.co.uk/first-children-recieve-diagnoses-through-100000-genomes-project/
  • 58. Jessica. Mum, Kate Palmer: “Now that we have this diagnosis there are things that we can do differently almost straight away. Her condition is one that has a high chance of improvement on a special diet, which means that her medication dose is likely to decrease and her epilepsy may be more easily controlled. Hopefully she might have better balance so she can be more stable and walk more…” “…More than anything the outcome of the project has taken the uncertainty out of life for us and the worry of not knowing what was wrong. It has allowed us to feel like we can take control of things and make positive changes for Jessica. It may also open doors to other research projects that we can go on. These could be more specific to her condition and we are hopeful that they could one day find a cure.” Jessica and her family. Image courtesy of Great Ormond Street Hospital. http://www.genomicsengland.co.uk/first-children-recieve-diagnoses-through-100000-genomes-project/