Status and Update of the
International Cancer Genome
Consortium (ICGC)
June 1st 2015
B.F. Francis Ouellette francis@oicr.on.ca
• Senior Scientist & Associate Director,
Informatics and Biocomputing, Ontario Institute for
Cancer Research, Toronto, ON
• Associate Professor, Department of Cell and Systems Biology,
University of Toronto, Toronto, ON.
ONTARIO INSTITUTE FOR CANCER RESEARCH
You are free to:
Copy, share, adapt, or re-mix;
Photograph, film, or broadcast;
Blog, live-blog, or post video of;
This presentation. Provided that:
You attribute the work to its author and respect the rights
and licenses associated with its components.
Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero.
Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at;
http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
Disclaimer
I am on the SAB of many NIH-funded projects (SGD,
Galaxy, GenomeSpace, and HMP2), as well as on the
Science and Industry Advisory Committee of Genome
Canada.
I do not (and will not) profit in any way, shape or form,
from any of the brands, products or companies I may
mention.
@bffo
E-mail: francis@oicr.on.ca
International Cancer Genome Consortium
http://www.csb.utoronto.ca/
http://bioinformatics.ca/
http://bioinformatics.ca/workshops/2014
E-mail: course_info@bioinformatics.ca
Web: http://bioinformatics.ca
Cancer
A Disease of the Genome
Challenge in Treating Cancer:
• Every tumor is different
• Every cancer patient is different
Large-Scale Studies of Cancer Genomes
• Johns Hopkins
    > 18,000 genes analyzed for mutations
    11 breast and 11 colon tumors
    L.D. Wood et al., Science, Oct. 2007
• Wellcome Trust Sanger Institute
    518 genes analyzed for mutations
    210 tumors of various types
    C. Greenman et al., Nature, Mar. 2007
• TCGA (NIH)
    Multiple technologies
    Brain (glioblastoma multiforme), lung (squamous carcinoma),
    and ovarian (serous cystadenocarcinoma)
    F.S. Collins & A.D. Barker, Sci. Am., Mar. 2007
Lessons learned
• Heterogeneity within and across tumor types
• High rate of abnormalities (driver vs passenger)
• Sample quality matters
• Consent and controlled data access are complicated
International Cancer Genome Consortium
Collect ~500 tumour/normal pairs from each of 50 different
major cancer types;
Comprehensive genome analysis of each T/N pair:
Genome
Transcriptome
Methylome
Clinical data
Make the data available to the research community & public.
Identify genome changes:
…GATTATTCCAGGTAT… …GATTATTGCAGGTAT… …GATTATTGCAGGTAT…
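The sequence fragments above differ at a single base. A minimal sketch of how such a substitution could be located, assuming (the slide does not say which is which) that the first fragment is the reference and the second comes from a tumour sample. The function name `find_substitutions` is hypothetical, and the sketch only handles equal-length sequences, not insertions or deletions.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Fragments copied from the slide; which one is "normal" is an assumption.
reference = "GATTATTCCAGGTAT"
sample = "GATTATTGCAGGTAT"
print(find_substitutions(reference, sample))  # → [(7, 'C', 'G')]
```

Real somatic mutation calling works on aligned reads from tumour/normal pairs and has to deal with sequencing error and coverage; this only illustrates the idea of a single-base difference.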
Rationale for the ICGC
The scope is huge, such that no country can do it all.
Coordinated cancer genome initiatives will reduce
duplication of effort for common and easy-to-acquire
tumor samples and ensure complete studies for
many less frequent forms of cancer.
Standardization and uniform quality measures across
studies will enable the merging of datasets,
increasing power to detect additional targets.
The spectrum of many cancers varies across the
world, because of environmental, genetic and
other causes.
The ICGC will accelerate the dissemination of
genomic and analytical methods across participating
sites and the user community.
ICGC
Goals, Structure,
Policies & Guidelines
http://goo.gl/sPGLQN
Primary Goal: coordinate efforts
to reach goals (50 tumour types)
http://docs.icgc.org/dcc-data-element-specifications
Primary Goal: be comprehensive
http://goo.gl/BE7KH1
Analysis Data Types
Germline variants (SNPs)
Simple Somatic Mutations (SSM)
Copy Number Alterations (CNA)
Structural Variants (SV)
Gene Expression (micro-arrays and RNASeq)
miRNA Expression (RNASeq)
Epigenomics (Arrays and Methylation)
Splicing Variation (RNASeq)
Protein Expression (Arrays)
Primary Goal: generate highest quality
http://goo.gl/FXCvi9
Primary Goal: available to all
ICGC Controlled Access Datasets
• Detailed Phenotype and Outcome data
    Region of residence
    Risk factors
    Examination
    Surgery
    Radiation
    Sample
    Slide
    Specific histological features
    Analyte
    Aliquot
    Donor notes
• Gene Expression (probe-level data)
• Raw genotype calls
• Gene-sample identifier links
• Genome sequence files
ICGC Open Access (OA) Datasets
• Cancer Pathology
    Histologic type or subtype
    Histologic nuclear grade
• Patient/Person
    Gender, Age range,
    Vital status, Survival time
    Relapse type, Status at follow-up
• Gene Expression (normalized)
• DNA methylation
• Computed Copy Number and Loss of Heterozygosity
• Newly discovered somatic variants
http://goo.gl/w4mrV
Secondary Goal: coordinate
work to benefit productivity
http://goo.gl/K5mHC3
https://icgc.org/icgc/committees-and-working-groups
Secondary Goal: disseminate knowledge
http://goo.gl/ObcZXy
ICGC
Goals, Structure,
Policies & Guidelines
http://goo.gl/sPGLQN
Policy
ICGC membership implies compliance with Core Bioethical
Elements for samples used in ICGC Cancer Projects:
http://goo.gl/TFrCmK
http://goo.gl/nYx6YG
POLICY:
The members of the International Cancer Genome
Consortium (ICGC) are committed to the principle of rapid
data release to the scientific community.
http://goo.gl/TFrCmK
Publication Policy
The individual research groups in
the ICGC are free to publish the
results of their own efforts in
independent publications at any
time (subject, of course, to any
policies of any collaborations in
which they may be participating).
Moratorium:
http://www.icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy
Where do you find that information?
We actually make it hard to find, but we are working on
that! (this is an example of where ICGC would like to do
what TCGA does!)
http://cancergenome.nih.gov/publications/publicationguidelines
Policy on Intellectual Property
All ICGC members agree not to make claims to
possible IP derived from primary data (including somatic
mutations) and to not pursue IP protections that would
prevent or block access to or use of any element of
ICGC data or conclusions drawn directly from those
data.
http://goo.gl/TCMXCl
85 Projects 18 Jurisdictions 42 Cancer types
Over 12,000 Cancer Genomes
International Cancer Genome Consortium: February 2015
DCC Activities
DCC activities are split between two groups:
• Software Development
    DCC portal
    Submission tool
• Biocuration (which also includes Content Management)
    Data level management
    Submitter “handling”
    Coordination with secretariat
    User support
http://dcc.icgc.org/team
[Figure: Data; Validation (dictionary); Validation (across fields); indexing; Happy Users]
http://goo.gl/1EcyR
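The two kinds of validation named above, a dictionary check and a check across fields, can be sketched as follows. This is an illustration only, not the DCC submission software: the field names (`donor_id`, `donor_sex`, `donor_vital_status`, `donor_survival_time`), the allowed values, the cross-field rule and the choice to run the stages in this order are all hypothetical assumptions.

```python
# Hypothetical data dictionary: allowed values for controlled-vocabulary fields.
ALLOWED_VALUES = {
    "donor_sex": {"male", "female"},
    "donor_vital_status": {"alive", "deceased"},
}
# Hypothetical list of fields every record must fill in.
REQUIRED_FIELDS = ("donor_id", "donor_sex", "donor_vital_status")


def validate_dictionary(record: dict[str, str]) -> list[str]:
    """Stage 1: required fields are present and values come from the dictionary."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"{field}: missing")
    for field, allowed in ALLOWED_VALUES.items():
        value = record.get(field)
        if value and value not in allowed:
            errors.append(f"{field}: '{value}' is not an allowed value")
    return errors


def validate_across_fields(record: dict[str, str]) -> list[str]:
    """Stage 2: related fields must agree with each other (hypothetical rule)."""
    errors = []
    if record.get("donor_vital_status") == "deceased":
        if not record.get("donor_survival_time", "").isdigit():
            errors.append("donor_survival_time: required when donor is deceased")
    return errors


def validate(record: dict[str, str]) -> list[str]:
    """Run the dictionary stage first; only clean records reach the cross-field stage."""
    errors = validate_dictionary(record)
    if not errors:
        errors = validate_across_fields(record)
    return errors
```

A record that passes both stages returns an empty list and would move on to indexing. Stopping after the first failing stage keeps the error report short, at the cost of needing more than one submission round to surface every problem.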
http://docs.icgc.org/methods
http://docs.icgc.org/dcc-data-element-specifications
ICGC Biocuration
• Helping submitters get their data to ICGC
    Progress reporting (data audit)
    Quality checks (coverage, correctness, etc.)
• Helping users get to the data
    Validate and check (and recheck) metadata on public repositories
    Test and integrate with other public repositories via standard data formats, ontologies.
• Documentation, documentation, and more documentation
• Training
ICGC datasets to date:
https://dcc.icgc.org/projects/history
http://goo.gl/CekF6y
Missing Clinical Data?
DACO
Data Portal Info/help
Login
http://dcc.icgc.org/
55 projects
Access to all data files
(and more with DACO access)
Faceted searches
https://dcc.icgc.org/projects
https://dcc.icgc.org/search
https://dcc.icgc.org/repository
ICGC DCC community http://goo.gl/wfxRqJ
https://goo.gl/M1vch1
[Figure: ICGC BAM/FASTQ; TCGA BAM/FASTQ; ICGC Open Data (includes TCGA Open Data)]
Differences between ICGC & TCGA
• Different tumour types
• Different geographic rules
• Many countries vs one jurisdiction
• Different definitions of what is controlled
• Different data access rules
ICGC Controlled Access Datasets
• Detailed Phenotype and Outcome data
• Gene Expression (probe-level data)
• Raw genotype calls
• Gene-sample identifier links
• Genome sequence files
• Germ line variants
ICGC Open Access Datasets
• Cancer Pathology
    Histologic type or subtype
    Histologic nuclear grade
• Patient/Person
    Gender, Age range,
    Vital status, Survival time
    Relapse type, Status at follow-up
• Gene Expression (normalized)
• DNA methylation
• Computed Copy Number and Loss of Heterozygosity
• Somatic variants from Exome or WGS
http://goo.gl/w4mrV
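The open/controlled split listed above can be treated as a simple lookup table. A minimal sketch, assuming the category names from this slide are used verbatim as keys; the `AccessTier` enum and the `is_controlled` helper are hypothetical names and are not part of any ICGC software.

```python
from enum import Enum


class AccessTier(Enum):
    OPEN = "open"
    CONTROLLED = "controlled"


# Category names copied from the slide; the mapping structure is illustrative.
ACCESS_TIER = {
    "Cancer Pathology": AccessTier.OPEN,
    "Patient/Person": AccessTier.OPEN,
    "Gene Expression (normalized)": AccessTier.OPEN,
    "DNA methylation": AccessTier.OPEN,
    "Computed Copy Number and Loss of Heterozygosity": AccessTier.OPEN,
    "Somatic variants from Exome or WGS": AccessTier.OPEN,
    "Detailed Phenotype and Outcome data": AccessTier.CONTROLLED,
    "Gene Expression (probe-level data)": AccessTier.CONTROLLED,
    "Raw genotype calls": AccessTier.CONTROLLED,
    "Gene-sample identifier links": AccessTier.CONTROLLED,
    "Genome sequence files": AccessTier.CONTROLLED,
    "Germ line variants": AccessTier.CONTROLLED,
}


def is_controlled(category: str) -> bool:
    """True if the category is controlled access; unknown categories raise KeyError."""
    return ACCESS_TIER[category] is AccessTier.CONTROLLED
```

Raising `KeyError` for an unlisted category, rather than defaulting to open, is a deliberately cautious choice: a data type nobody has classified should not be released by accident.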
TCGA Controlled Access Datasets
• Primary sequence data (BAM and FASTQ files)
• SNP6 array level 1 and level 2 data
• Exon array level 1 and level 2 data
• Somatic variants from whole genome sequencing
• Certain information in MAFs
• A full list of controlled-access data types can be found at: http://goo.gl/K1h7zu
TCGA Open Access Datasets
• De-identified clinical and demographic data
• Gene expression data
• Copy number alterations in regions of the genome
• Epigenetic data
• Summaries of data compiled across individuals
• Anonymized single amplicon DNA sequence data
• Somatic variants from scrubbed exome sequencing
http://goo.gl/A1rMRB
TCGA/ICGC users agreed:
… to keep all computer systems on which controlled
access data reside, or which provide access to such
data, up to date with respect to software and security
patches.
… to protect Controlled Access Data against disclosure
to unauthorized individuals.
… to monitor and control which individuals have access
to Controlled Access Data.
… to destroy all copies of controlled access data after
controlled access privileges expire.
… to only use secure transfer protocols:
e.g. https and sftp
… to encrypt Controlled Access data in transfers and
storage
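The transfer-protocol commitment above can be enforced with a small guard before any download or upload is attempted. A minimal sketch using only the Python standard library; the helper name `uses_secure_transfer` is hypothetical, and the allowed set contains just the two examples named on the slide, so a real policy may permit other protocols.

```python
from urllib.parse import urlparse

# The two protocols given as examples on the slide.
SECURE_SCHEMES = {"https", "sftp"}


def uses_secure_transfer(url: str) -> bool:
    """Return True if the URL scheme is one of the allowed secure protocols."""
    return urlparse(url).scheme.lower() in SECURE_SCHEMES
```

A check like this only covers data in transit; encryption of stored copies has to be handled separately.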
What does it mean for this file?
simple_somatic_mutation.aggregated.vcf.gz
https://dcc.icgc.org/repository/release_18/Summary
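A file like the one named above is a gzip-compressed VCF, which can be read line by line. A minimal sketch using only the Python standard library, assuming the file has already been downloaded to a local path (`vcf_path` is a placeholder) and follows the standard VCF layout: `##` meta lines, one `#CHROM` header line, then tab-separated records. The function name `iter_vcf_records` is hypothetical.

```python
import gzip
from typing import Iterator

# The eight fixed columns defined by the VCF format.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def iter_vcf_records(vcf_path: str) -> Iterator[dict[str, str]]:
    """Yield the fixed VCF columns of each record as a dict of strings."""
    with gzip.open(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip "##" meta lines and the "#CHROM" header line
            fields = line.rstrip("\n").split("\t")
            yield dict(zip(VCF_COLUMNS, fields[:8]))
```

Streaming the file this way avoids loading an aggregated, multi-project file into memory at once. Values are left as strings; converting `POS` to an integer or splitting `INFO` into key/value pairs is left to the caller.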
Identify yourself
Fill out a detailed form, which includes:
• Contact and Project Information
• Information Technology details and procedures for keeping data secure
• Data Access Agreement
All of these documents are put into a PDF file that you print and get your institution to sign off on your behalf.
https://icgc.org/daco/approved-projects
173 groups
977 people
[Figure: data access routes for ICGC and TCGA. Labels: DACO, ICGC, dbGaP, cgHUB, EGA, ERA, TCGA, BAM, Open, EGA id & password, WGS, Germ Line]
Making sense of it all
1 project == 1 pipeline
55 projects == 55 pipelines
55 projects == 1 pipeline
PanCancer Analysis of Whole Genomes
(PCAWG)
2,400 T/N pairs with clinical data
analyzed over 6 Academic clouds
16 working groups, > 1000 scientists
1 alignment pipeline (10 months)
Data freeze 2 months ago
3 somatic mutation pipelines (2 more months?)
2 RNA-Seq pipelines (done)
Start writing papers in January 2016
From PCAWG we will have:
1st PANCANCER analysis on > 2,400 cancer tumours
from a WGS perspective
RNA, SSM, CNV, Methylation analysis
Published (executable) pipelines
Docker https://github.com/docker/docker
Galaxy galaxyproject.org
Seqware http://seqware.github.io/
Method papers
Multiple cloud access to data
Multiple portal access to data
Other projects in planning
ICGC to finish in Spring of 2018
Planning for ICGC2
ICGC1: 25,000 tumours (DNA, RNA, Epigenome,
Clinical data)
ICGC2: (planning) 250,000 tumours (DNA, RNA,
Epigenome, Clinical trial) (1/2 million genomes)
ICGC1 was the picture, ICGC2 will be the movie (before
and after treatment).
Trailers to come out in December, before Christmas
Submission system with one place for data and metadata
Tools/links directory portal
Acknowledgments
ICGC/OICR Project leaders: Tom Hudson, John McPherson, Lincoln Stein, Jared Simpson, Paul Boutros, Vincent Ferretti, Francis Ouellette, Jennifer Jennings
DCC Software Developer: Vincent Ferretti, Daniel Chang, Anthony Cros, Jerry Lam, Brian O'Connor, Bob Tiernay, Stuart Watt, Shane Wilson, Junjun Zhang
Ouellette Lab: Michelle Brazas, Emilie Chautard, Nina Palikuca, Zhibin Lu
Web Dev: Joseph Yamada, Kamen Wu, Kim Cullion, Miyuki Fukuma
ICGC DCC Biocuration: Hardeep Nahal, Marc Perry, Kevin Chen
Research IT/Systems: David Sutton, Bob Gibson, Sam Maclennan, David Magda, Rob Naccarato, Brian Ott, Gino Yearwood
EGA: Justin Paschall, Jeff Almeida-King, Ilkka Lappalainen, Jordi Rambla De Argila, Marc Sitges Puy
Genome Sequence Informatics (GSI): Lars Jorgensen, Tim Beck, Tony DeBat, Larry Heissler, Xuemei (Mei) Luo, Michael Moorhouse, Yogi Sundaravadanam, Morgan Taschuk, Michael Laszloffy, Peter Ruzanov
http://oicr.on.ca http://icgc.org
… and all the patients and their families that are putting their hopes into our work!
Informatics and Biocomputing at the OICR
http://icgc.org
http://dcc.icgc.org
http://docs.icgc.org
info@icgc.org
francis@oicr.on.ca

Más contenido relacionado

La actualidad más candente

(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...
(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...
(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...
Scintica Instrumentation
 

La actualidad más candente (20)

Integrative Everything, Deep Learning and Streaming Data
Integrative Everything, Deep Learning and Streaming DataIntegrative Everything, Deep Learning and Streaming Data
Integrative Everything, Deep Learning and Streaming Data
 
Enriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentationEnriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentation
 
JALANov2000
JALANov2000JALANov2000
JALANov2000
 
cBioPortal Webinar Slides (2/3)
cBioPortal Webinar Slides (2/3)cBioPortal Webinar Slides (2/3)
cBioPortal Webinar Slides (2/3)
 
UNMSymposium2014
UNMSymposium2014UNMSymposium2014
UNMSymposium2014
 
Brief introduction to Bioinformatics
Brief introduction to BioinformaticsBrief introduction to Bioinformatics
Brief introduction to Bioinformatics
 
Bioinformatics: What, Why and Where?
Bioinformatics: What, Why and Where?Bioinformatics: What, Why and Where?
Bioinformatics: What, Why and Where?
 
Digital Pathology: Precision Medicine, Deep Learning and Computer Aided Inter...
Digital Pathology: Precision Medicine, Deep Learning and Computer Aided Inter...Digital Pathology: Precision Medicine, Deep Learning and Computer Aided Inter...
Digital Pathology: Precision Medicine, Deep Learning and Computer Aided Inter...
 
Pathomics Based Biomarkers and Precision Medicine
Pathomics Based Biomarkers and Precision MedicinePathomics Based Biomarkers and Precision Medicine
Pathomics Based Biomarkers and Precision Medicine
 
tranSMART Community Meeting 5-7 Nov 13 - Session 3: Characterization of the c...
tranSMART Community Meeting 5-7 Nov 13 - Session 3: Characterization of the c...tranSMART Community Meeting 5-7 Nov 13 - Session 3: Characterization of the c...
tranSMART Community Meeting 5-7 Nov 13 - Session 3: Characterization of the c...
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Bioinformatics in a Nutshell
Bioinformatics in a NutshellBioinformatics in a Nutshell
Bioinformatics in a Nutshell
 
Final cecr presenation
Final cecr presenationFinal cecr presenation
Final cecr presenation
 
Personal Genomes: what can I do with my data?
Personal Genomes: what can I do with my data?Personal Genomes: what can I do with my data?
Personal Genomes: what can I do with my data?
 
Learning, Training,  Classification,  Common Sense and Exascale Computing
Learning, Training,  Classification,  Common Sense and Exascale ComputingLearning, Training,  Classification,  Common Sense and Exascale Computing
Learning, Training,  Classification,  Common Sense and Exascale Computing
 
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
 
(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...
(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...
(December 2, 2021) The Bench to Bedside Series: Preclinical Cancer Research w...
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Common languages in genomic epidemiology: from ontologies to algorithms
Common languages in genomic epidemiology: from ontologies to algorithmsCommon languages in genomic epidemiology: from ontologies to algorithms
Common languages in genomic epidemiology: from ontologies to algorithms
 
Prof. Mohamed Labib Salem's students
Prof. Mohamed Labib Salem's studentsProf. Mohamed Labib Salem's students
Prof. Mohamed Labib Salem's students
 

Similar a Genentech icgc 2015

Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014
Joel Saltz
 
Cancer genome repository_berkeley
Cancer genome repository_berkeleyCancer genome repository_berkeley
Cancer genome repository_berkeley
Shyam Sarkar
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
Ian Foster
 

Similar a Genentech icgc 2015 (20)

Pistoia Alliance US Conference 2015 - 1.5.4 New data - Nikolaus Schultz
Pistoia Alliance US Conference 2015 - 1.5.4 New data - Nikolaus SchultzPistoia Alliance US Conference 2015 - 1.5.4 New data - Nikolaus Schultz
Pistoia Alliance US Conference 2015 - 1.5.4 New data - Nikolaus Schultz
 
Trans disciplinary research is a must for excellence in science by Prof. Moha...
Trans disciplinary research is a must for excellence in science by Prof. Moha...Trans disciplinary research is a must for excellence in science by Prof. Moha...
Trans disciplinary research is a must for excellence in science by Prof. Moha...
 
Nci clinical genomics data sharing ncra sept 2016
Nci clinical genomics data sharing ncra sept 2016Nci clinical genomics data sharing ncra sept 2016
Nci clinical genomics data sharing ncra sept 2016
 
Genomics and Computation in Precision Medicine March 2017
Genomics and Computation in Precision Medicine March 2017Genomics and Computation in Precision Medicine March 2017
Genomics and Computation in Precision Medicine March 2017
 
Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...
Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...
Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...
 
Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-shareRozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
 
Advancing Convergence and Innovation in Cancer Research: Seminar at Universit...
Advancing Convergence and Innovation in Cancer Research: Seminar at Universit...Advancing Convergence and Innovation in Cancer Research: Seminar at Universit...
Advancing Convergence and Innovation in Cancer Research: Seminar at Universit...
 
Icenco2015- 30 Dec. 2015 (invited talk)
Icenco2015- 30 Dec. 2015 (invited talk)Icenco2015- 30 Dec. 2015 (invited talk)
Icenco2015- 30 Dec. 2015 (invited talk)
 
Keynote at NVIDIA GPU Technology Conference in D.C.
Keynote at NVIDIA GPU Technology Conference in D.C.Keynote at NVIDIA GPU Technology Conference in D.C.
Keynote at NVIDIA GPU Technology Conference in D.C.
 
The BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
The BRCA Challenge & Exchange: Progress and Plans - Gunnar RätschThe BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
The BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
 
Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014
 
Precision Oncology - using Genomics, Proteomics and Imaging to inform biology...
Precision Oncology - using Genomics, Proteomics and Imaging to inform biology...Precision Oncology - using Genomics, Proteomics and Imaging to inform biology...
Precision Oncology - using Genomics, Proteomics and Imaging to inform biology...
 
Bioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesisBioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesis
 
Cancer genome repository_berkeley
Cancer genome repository_berkeleyCancer genome repository_berkeley
Cancer genome repository_berkeley
 
ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
Dalton
DaltonDalton
Dalton
 
Dalton presentation
Dalton presentationDalton presentation
Dalton presentation
 
Cancer moonshot and data sharing
Cancer moonshot and data sharingCancer moonshot and data sharing
Cancer moonshot and data sharing
 
Web cast cancer gene panels march 11 2015
Web cast cancer gene panels march 11 2015Web cast cancer gene panels march 11 2015
Web cast cancer gene panels march 11 2015
 

Último

Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
ssuser79fe74
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
Lokesh Kothari
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
AlMamun560346
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
Areesha Ahmad
 

Último (20)

Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 

Genentech icgc 2015

  • 1. Status and Update of the International Cancer Genomics Consortium (ICGC) June 1st 2015 B.F. Francis Ouellette francis@oicr.on.ca • Senior Scientists & Associate Director, Informatics and Biocomputing, Ontario Institute for Cancer Research, Toronto, ON • Associate Professor, Department of Cell and Systems Biology, University of Toronto, Toronto, ON.
  • 2. ONTARIO INSTITUTE FOR CANCER RESEARCH You are free to: Copy, share, adapt, or re-mix; Photograph, film, or broadcast; Blog, live-blog, or post video of; This presentation. Provided that: You attribute the work to its author and respect the rights and licenses associated with its components. Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero. Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at; http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
  • 5. Disclaimer: I am on the SAB of many NIH-funded projects (SGD, Galaxy, GenomeSpace, and HMP2), as well as on the Science and Industry Advisory Committee of Genome Canada. I do not (and will not) profit in any way, shape or form, from any of the brands, products or companies I may mention.
  • 6. @bffo; E-mail: francis@oicr.on.ca
  • 8. International Cancer Genome Consortium
  • 9. http://www.csb.utoronto.ca/
  • 10. http://bioinformatics.ca/
  • 13. http://bioinformatics.ca/workshops/2014
  • 14. E-mail: course_info@bioinformatics.ca Web: http://bioinformatics.ca
  • 15. Cancer: A Disease of the Genome. Challenge in treating cancer: every tumor is different; every cancer patient is different.
  • 16. Large-Scale Studies of Cancer Genomes. Johns Hopkins: > 18,000 genes analyzed for mutations in 11 breast and 11 colon tumors (L.D. Wood et al., Science, Oct. 2007). Wellcome Trust Sanger Institute: 518 genes analyzed for mutations in 210 tumors of various types (C. Greenman et al., Nature, Mar. 2007). TCGA (NIH): multiple technologies applied to brain (glioblastoma multiforme), lung (squamous carcinoma), and ovarian (serous cystadenocarcinoma) tumors (F.S. Collins & A.D. Barker, Sci. Am., Mar. 2007).
  • 17. Lessons learned: heterogeneity within and across tumor types; high rate of abnormalities (driver vs passenger); sample quality matters; consent and controlled data access are complicated.
  • 18. International Cancer Genome Consortium: Collect ~500 tumour/normal pairs from each of 50 different major cancer types. Comprehensive genome analysis of each T/N pair: genome, transcriptome, methylome, clinical data. Make the data available to the research community & public. Identify genome changes: …GATTATTCCAGGTAT… …GATTATTGCAGGTAT… …GATTATTGCAGGTAT…
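The "identify genome changes" idea on slide 18 can be illustrated by comparing two of the sequence fragments shown there. This is only a minimal sketch: the slide does not label its fragments, so treating the first as the normal and the second as the tumour is an assumption, and `find_substitutions` is a hypothetical helper that handles only equal-length, already-aligned fragments.

```python
def find_substitutions(normal: str, tumour: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, normal base, tumour base) for every mismatch."""
    if len(normal) != len(tumour):
        raise ValueError("fragments must be aligned and of equal length")
    return [
        (position, n_base, t_base)
        for position, (n_base, t_base) in enumerate(zip(normal, tumour))
        if n_base != t_base
    ]


# Fragments copied from the slide; which one is "normal" is assumed here.
normal_fragment = "GATTATTCCAGGTAT"
tumour_fragment = "GATTATTGCAGGTAT"

substitutions = find_substitutions(normal_fragment, tumour_fragment)
print(substitutions)  # [(7, 'C', 'G')]
```

Real somatic mutation calling works from aligned sequencing reads with quality and depth information; a direct string comparison like this only conveys the tumour-versus-normal principle.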
  • 19. Rationale for the ICGC: The scope is huge, such that no country can do it all. Coordinated cancer genome initiatives will reduce duplication of effort for common and easy-to-acquire tumor samples and ensure complete studies for many less frequent forms of cancer. Standardization and uniform quality measures across studies will enable the merging of datasets, increasing power to detect additional targets. The spectrum of many cancers varies across the world for many tumor types, because of environmental, genetic and other causes. The ICGC will accelerate the dissemination of genomic and analytical methods across participating sites and the user community.
  • 20. ICGC Goals, Structure, Policies & Guidelines http://goo.gl/sPGLQN
  • 21. Primary Goal: coordinate efforts to reach goals (50 tumours)
  • 22. http://docs.icgc.org/dcc-data-element-specifications
  • 23. Primary Goal: be comprehensive http://goo.gl/BE7KH1
  • 24. Analysis Data Types: Germline variants (SNPs); Simple Somatic Mutations (SSM); Copy Number Alterations (CNA); Structural Variants (SV); Gene Expression (micro-arrays and RNASeq); miRNA Expression (RNASeq); Epigenomics (Arrays and Methylation); Splicing Variation (RNASeq); Protein Expression (Arrays)
  • 25. Primary Goal: generate highest quality http://goo.gl/FXCvi9
  • 27. Primary Goal: available to all
  • 28. Primary Goal: available to all
  • 29. ICGC Controlled Access Datasets: Detailed Phenotype and Outcome data (region of residence, risk factors, examination, surgery, radiation, sample, slide, specific histological features, analyte, aliquot, donor notes); Gene Expression (probe-level data); Raw genotype calls; Gene-sample identifier links; Genome sequence files. ICGC Open Access Datasets: Cancer Pathology (histologic type or subtype, histologic nuclear grade); Patient/Person (gender, age range, vital status, survival time, relapse type, status at follow-up); Gene Expression (normalized); DNA methylation; Computed Copy Number and Loss of Heterozygosity; Newly discovered somatic variants. http://goo.gl/w4mrV
  • 30. Secondary Goal: coordinate work to benefit productivity http://goo.gl/K5mHC3
  • 31. https://icgc.org/icgc/committees-and-working-groups
  • 32. Secondary Goal: disseminate knowledge http://goo.gl/ObcZXy
  • 33. ICGC Goals, Structure, Policies & Guidelines http://goo.gl/sPGLQN
  • 34. Policy: ICGC membership implies compliance with Core Bioethical Elements for samples used in ICGC Cancer Projects: http://goo.gl/TFrCmK http://goo.gl/nYx6YG
  • 35. Policy: The members of the International Cancer Genome Consortium (ICGC) are committed to the principle of rapid data release to the scientific community. http://goo.gl/TFrCmK
  • 36. Publication Policy: The individual research groups in the ICGC are free to publish the results of their own efforts in independent publications at any time (subject, of course, to any policies of any collaborations in which they may be participating).
  • 37. Moratorium: http://www.icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy
  • 38. Publication Policy
  • 39. Where do you find that information? We actually make it hard to find, but we are working on that! (This is an example of where ICGC would like to do what TCGA does!) http://cancergenome.nih.gov/publications/publicationguidelines
  • 40. Policy on Intellectual Property: All ICGC members agree not to make claims to possible IP derived from primary data (including somatic mutations) and not to pursue IP protections that would prevent or block access to or use of any element of ICGC data or conclusions drawn directly from those data. http://goo.gl/TCMXCl
  • 41. International Cancer Genome Consortium, February 2015: 85 projects; 18 jurisdictions; 42 cancer types; over 12,000 cancer genomes
  • 42. DCC Activities: DCC activities are split between two groups. Software Development: DCC portal, submission tool. Biocuration (which also includes Content Management): data level management, submitter “handling”, coordination with secretariat, user support. http://dcc.icgc.org/team
  • 43. Data Validation: validation (dictionary); validation (across fields); indexing; Happy Users http://goo.gl/1EcyR
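The two validation stages named on slide 43 (dictionary, then across fields) can be sketched as follows. This is an illustration only: the field names, allowed values, and the cross-field rule are hypothetical and are not taken from the real ICGC submission dictionary.

```python
# Hypothetical controlled vocabulary; NOT the real ICGC data dictionary.
DICTIONARY = {
    "donor_sex": {"male", "female"},
    "donor_vital_status": {"alive", "deceased"},
}


def validate_dictionary(record: dict) -> list[str]:
    """Dictionary validation: each coded field must hold an allowed value."""
    errors = []
    for field, allowed in DICTIONARY.items():
        value = record.get(field)
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


def validate_across_fields(record: dict) -> list[str]:
    """Cross-field validation: related fields must agree with each other."""
    errors = []
    # Hypothetical rule: a deceased donor must carry a survival time.
    if (record.get("donor_vital_status") == "deceased"
            and record.get("donor_survival_time") is None):
        errors.append("donor_survival_time is required for deceased donors")
    return errors


def validate(record: dict) -> list[str]:
    """Run both stages and return every error found."""
    return validate_dictionary(record) + validate_across_fields(record)


good = {"donor_sex": "female", "donor_vital_status": "alive",
        "donor_survival_time": None}
bad = {"donor_sex": "unknown", "donor_vital_status": "deceased",
       "donor_survival_time": None}

print(validate(good))  # []
for message in validate(bad):
    print(message)
```

Collecting all errors in one pass, rather than stopping at the first, lets a submitter fix a whole file in a single round.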
  • 44. http://docs.icgc.org/methods
  • 45. http://docs.icgc.org/dcc-data-element-specifications
  • 46. ICGC Biocuration. Helping submitters get their data to ICGC: progress reporting (data audit); quality checks (coverage, correctness, etc.). Helping users get to the data: validate and check (and recheck) metadata on public repositories; test and integrate with other public repositories via standard data formats and ontologies; documentation, documentation, and more documentation; training.
  • 47. ICGC datasets to date: https://dcc.icgc.org/projects/history
  • 48. Missing Clinical Data? http://goo.gl/CekF6y
  • 49. http://goo.gl/CekF6y
  • 51. DACO; Data Portal; Info/help; Login
  • 52. http://dcc.icgc.org/
  • 53. http://dcc.icgc.org/ 55 projects; access to all data files (and more with DACO access); faceted searches
  • 54. https://dcc.icgc.org/projects
  • 55. https://dcc.icgc.org/search
  • 57. https://dcc.icgc.org/repository
  • 58. ICGC DCC community http://goo.gl/wfxRqJ https://goo.gl/M1vch1
  • 59. ICGC BAM/FASTQ; TCGA BAM/FASTQ; ICGC Open Data (includes TCGA Open Data)
  • 60. ICGC; TCGA
  • 61. Differences between ICGC & TCGA: different tumour types; different geographic rules; many countries vs one jurisdiction; different definitions of what is controlled; different data access rules
  • 62. ICGC Controlled Access Datasets: Detailed Phenotype and Outcome data; Gene Expression (probe-level data); Raw genotype calls; Gene-sample identifier links; Genome sequence files; Germ line variants. ICGC Open Access Datasets: Cancer Pathology (histologic type or subtype, histologic nuclear grade); Patient/Person (gender, age range, vital status, survival time, relapse type, status at follow-up); Gene Expression (normalized); DNA methylation; Computed Copy Number and Loss of Heterozygosity; Somatic variants from Exome or WGS. http://goo.gl/w4mrV
  • 63. TCGA Controlled Access Datasets: Primary sequence data (BAM and FASTQ files); SNP6 array level 1 and level 2 data; Exon array level 1 and level 2 data; Somatic variants from whole genome sequencing; Certain information in MAFs. A full list of controlled-access data types can be found at: http://goo.gl/K1h7zu TCGA Open Access Datasets: De-identified clinical and demographic data; Gene expression data; Copy number alterations in regions of the genome; Epigenetic data; Summaries of data compiled across individuals; Anonymized single amplicon DNA sequence data; Somatic variants from scrubbed exome sequencing. http://goo.gl/A1rMRB
  • 64. TCGA/ICGC users agreed: … to keep all computer systems on which controlled access data reside, or which provide access to such data, up to date with respect to software and security patches; … to protect Controlled Access Data against disclosure to unauthorized individuals; … to monitor and control which individuals have access to Controlled Access Data.
  • 65. TCGA/ICGC users agreed: … to destroy all copies of controlled access data after controlled access privileges expire; … to only use secure transfer protocols, e.g. https and sftp; … to encrypt Controlled Access data in transfers and storage.
  • 66. What does it mean for this file? simple_somatic_mutation.aggregated.vcf.gz https://dcc.icgc.org/repository/release_18/Summary
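The file named on slide 66 is a gzip-compressed VCF. As a rough sketch of how such a file might be read once downloaded, the Python below streams records and counts them per chromosome. It assumes the standard tab-separated VCF column order (CHROM, POS, ID, REF, ALT, ...); the helper names are hypothetical and the demo records are made up, not taken from the ICGC release.

```python
import gzip
import os
import tempfile
from collections import Counter


def iter_vcf_records(path):
    """Yield (chrom, pos, ref, alt) from a gzipped VCF, skipping header lines."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _record_id, ref, alt = line.rstrip("\n").split("\t")[:5]
            yield chrom, int(pos), ref, alt


def count_by_chromosome(path):
    """Count data records per chromosome without loading the file into memory."""
    return Counter(chrom for chrom, _pos, _ref, _alt in iter_vcf_records(path))


# Build a tiny, made-up VCF so the sketch runs without any download.
demo_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\tMU1\tC\tG\t.\t.\t.",
    "1\t250\tMU2\tA\tT\t.\t.\t.",
    "2\t42\tMU3\tG\tA\t.\t.\t.",
]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.vcf.gz")
with gzip.open(demo_path, "wt") as out:
    out.write("\n".join(demo_lines) + "\n")

counts = count_by_chromosome(demo_path)
print(counts)
```

Streaming line by line keeps memory use flat, which matters for an aggregated file covering many donors. Whatever access terms apply to the real file also apply to any local copy.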
  • 68. Identify yourself. Fill out a detailed form which includes: contact and project information; information technology details and procedures for keeping data secure; data access agreement. All of these documents are put into a PDF file that you print and get your institution to sign off on your behalf.
  • 75. https://icgc.org/daco/approved-projects 173 groups; 977 people
  • 76. (Data access diagram; labels only) DACO; ICGC; dbGaP; cgHUB; EGA; TCGA; BAM; Open; ERA; EGA id & password; WGS; Germ Line
  • 77. Making sense of it all: 1 project == 1 pipeline
  • 78. Making sense of it all: 55 projects == 55 pipelines
  • 79. Making sense of it all: 55 projects == 1 pipeline
  • 80. PanCancer Analysis of Whole Genomes (PCAWG): 2,400 T/N pairs with clinical data, analyzed over 6 academic clouds; 16 working groups, > 1000 scientists; 1 alignment pipeline (10 months); data freeze 2 months ago; 3 somatic mutation pipelines (2 more months?); 2 RNA-Seq pipelines (done); start writing papers in January 2016.
  • 81. From PCAWG we will have: 1st PANCANCER analysis on > 2,400 cancer tumours from a WGS perspective; RNA, SSM, CNV, Methylation analysis; published (executable) pipelines; Docker https://github.com/docker/docker; Galaxy galaxyproject.org; Seqware http://seqware.github.io/; method papers; multiple cloud access to data; multiple portal access to data.
  • 82. Other projects in planning: ICGC to finish in Spring of 2018. Planning for ICGC2. ICGC 1: 25,000 tumours (DNA, RNA, Epigenome, Clinical data). ICGC2 (planning): 250,000 tumours (DNA, RNA, Epigenome, Clinical trial) (1/2 million genomes). ICGC1 was the picture, ICGC2 will be the movie (before and after treatment). Trailers to come out in December, before Christmas. Submission system with one place for data and metadata. Tools/links directory portal.
  • 83. Acknowledgments. ICGC/OICR Project leaders: Tom Hudson, John McPherson, Lincoln Stein, Jared Simpson, Paul Boutros, Vincent Ferretti, Francis Ouellette, Jennifer Jennings. DCC Software Developer: Vincent Ferretti, Daniel Chang, Anthony Cros, Jerry Lam, Brian O'Connor, Bob Tiernay, Stuart Watt, Shane Wilson, Junjun Zhang. Ouellette Lab: Michelle Brazas, Emilie Chautard, Nina Palikuca, Zhibin Lu. Web Dev: Joseph Yamada, Kamen Wu, Kim Cullion, Miyuki Fukuma. ICGC DCC Biocuration: Hardeep Nahal, Marc Perry, Kevin Chen. Research IT/Systems: David Sutton, Bob Gibson, Sam Maclennan, David Magda, Rob Naccarato, Brian Ott, Gino Yearwood. EGA: Justin Paschall, Jeff Almeida-King, Ilkka Lappalainen, Jordi Rambla De Argila, Marc Sitges Puy. Genome Sequence Informatics (GSI): Lars Jorgensen, Tim Beck, Tony DeBat, Larry Heissler, Xuemei (Mei) Luo, Michael Moorhouse, Yogi Sundaravadanam, Morgan Taschuk, Michael Laszloffy, Peter Ruzanov. http://oicr.on.ca http://icgc.org … and all the patients and their families that are putting their hopes into our work!
  • 84. Informatics and Biocomputing at the OICR
  • 85. http://icgc.org http://dcc.icgc.org http://docs.icgc.org info@icgc.org francis@oicr.on.ca