Canadian Cancer
Research Conference
November 3-6, 2013

Canadian Bioinformatics Workshops
www.bioinformatics.ca
You are free to:
Copy, share, adapt, or re-mix;
Photograph, film, or broadcast;
Blog, live-blog, or post video of;

This presentation. Provided that:
You attribute the work to its author and
respect the rights and licenses associated
with its components.
Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero.
Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at:
http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites

Module 1: Cancer Genomic Databases

bioinformatics.ca
Module 1
Cancer Genomic Databases
E-mail: francis@oicr.on.ca
@bffo

Schedule for Module 1
Cancer Genomic Databases
• The Databases:
– The International Cancer Genome Consortium (ICGC)
– The Cancer Genome Atlas (TCGA)
– The Catalogue of Somatic Mutations in Cancer (COSMIC)

• Data Access: human genomes, security and privacy issues, Open vs. Controlled Access data

http://bioinformatics.ca/

Workshops planned for 2014:
http://bioinformatics.ca/workshops

1. Exploratory Analysis of Biological Data using R
2. Bioinformatics for Cancer Genomics
3. Informatics for RNA-sequence Analysis
4. Informatics on High Throughput Sequencing Data
5. Pathway and Network Analysis of -omics Data
6. Flow Cytometry Data Analysis using R
7. Microarray Data Analysis
8. Informatics and Statistics for Metabolomics

http://bioinformatics.ca/workshops/2013

E-mail: course_info@bioinformatics.ca
Web: http://bioinformatics.ca
Workshop announcement mailing list:
http://bioinformatics.ca/mailman/listinfo/announce

Soap-Box time!
• Open Access, Open Data and Open Source are essential for good science.
• Openness is a responsibility, an obligation, and something that comes with the privilege of doing publicly funded work.

Open Source
Open Access
Open Data
Opencourseware
Cancer therapy is like
beating the dog with
a stick to get rid of
his fleas.
- Anna Deavere Smith,
Let Me Down Easy

http://goo.gl/Yhbsj

The revolution in cancer
research can be summed up
in a single sentence:
cancer is, in essence,
a genetic disease.
- Bert Vogelstein

Cancer: a Disease of the Genome

Challenge in Treating Cancer:
• Every tumour is different
• Every cancer patient is different
Cancer Genomic Databases

Chin et al., Genes Dev. 2011 March 15; 25(6): 534-555
http://www.ncbi.nlm.nih.gov/pubmed/?term=21406553

TCGA
The Cancer Genome Atlas is a
comprehensive and coordinated
effort to accelerate our
understanding of the molecular
basis of cancer through the
application of genome analysis
technologies, including large-scale genome sequencing.

About the TCGA
• National Cancer Institute (NCI)
• National Human Genome Research Institute (NHGRI)
• Phased Structure:
– Three-year pilot in 2006 with an investment of $50 million from each
– TCGA will collect and characterize more than 20 additional tumour types (now at 16)

Where to start with the TCGA?
Wiki: https://wiki.nci.nih.gov/display/TCGA/About+TCGA

Division of Labour
• Biospecimen Core Resource (BCR)
– centre where samples are carefully catalogued, processed, quality-checked and stored along with participant clinical information

• Genome Sequencing Centre (GSC)
– uses high-throughput methods to identify changes to DNA sequences that are associated with specific cancer types

• Genome Characterization Centre (GCC)
– uses high-throughput technologies to analyze genomic changes involved in cancer

• Genome Data Analysis Centre (GDAC)
– provides novel informatics tools to the research community
– provides analysis results using TCGA data

• Data Coordinating Centre (DCC)
– central provider of TCGA data
– standardizes data formats and validates submitted data

TCGA Data
• Sequence reads from newer sequencing
technologies are available at the Cancer Genome
Hub: https://cghub.ucsc.edu/
• Higher level sequence data (variation calls and
abundance measures) are available at the TCGA
Portal: http://cancergenome.nih.gov/

TCGA data flow

http://goo.gl/b5nojx

Data Coordinating Centre
• Plays a central role
– Receiving data from BCR, GSC and GCC sites
– Providing access to users
– Performing analysis of data

• Responsibilities:
– Protecting participant privacy and confidentiality
– Developing data standards and controlled vocabularies
– Establishing informatics pipelines for data flow
– Developing new analytical and visualization technologies to facilitate data analysis, for all audiences

TCGA DCC Data Portal
• Provides a platform to search, download and
analyze TCGA data sets
• Two data access tiers: Open and Controlled
• Analytic tools include: Cancer Molecular Analysis
and Cancer Genome Workbench (NCBIB),
Integrative Genomics Viewer (Broad) and
CancerGenomics Analysis (MSKCC).

TCGA Data Browser
https://tcga-data.nci.nih.gov/tcga/
Query TCGA data online using the TCGA Data Browser

The International Cancer Genome Consortium
(ICGC)

• http://www.icgc.org/
• “ICGC was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe”

[Diagram labels: ICGC BAM/FASTQ; ICGC Open Data (includes TCGA Open Data); COSMIC Open Data; TCGA BAM/FASTQ]

ICGC Map – November 2013
67 projects launched

Hardeep Nahal

ICGC datasets to date
[Chart: ICGC Data Portal cumulative donor count for member projects, Release 7 to Release 14, December 2011 to September 2013; y-axis: number of donors, 1,000 to 10,000]

ICGC dataset version 14
September 2013

Hardeep Nahal

• Cancer types: 41
• Donors: 8,532 (18,056 specimens)
• Simple somatic mutations: 1,995,134
• Copy number mutations: 18,526,593
• Structural rearrangements: 18,614
• Genes affected* by simple somatic mutations: 22,074
• Genes affected* by non-synonymous coding mutations: 19,150
• Genes affected* by copy number mutations: 20,341
• Genes affected* by structural rearrangements: 1,884

*out of 22,259 protein coding genes annotated in Ensembl Human release 69

• Open tier and controlled data currently available
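
To put the release 14 counts in proportion, the figures above can be turned into percentages of the 22,259 annotated protein-coding genes. This is a minimal sketch: the variable and function names are illustrative choices, and only the numbers come from the slide.

```python
# Counts copied from the ICGC release 14 slide (September 2013).
TOTAL_PROTEIN_CODING_GENES = 22_259  # Ensembl Human release 69

genes_affected = {
    "simple somatic mutations": 22_074,
    "non-synonymous coding mutations": 19_150,
    "copy number mutations": 20_341,
    "structural rearrangements": 1_884,
}
donors = 8_532
specimens = 18_056


def percent_of_genes(count, total=TOTAL_PROTEIN_CODING_GENES):
    """Share of annotated protein-coding genes, as a percentage."""
    return 100.0 * count / total


for kind, count in genes_affected.items():
    print(f"{kind}: {percent_of_genes(count):.1f}% of protein-coding genes")

# Average number of specimens collected per donor.
print(f"specimens per donor: {specimens / donors:.2f}")
```

Nearly every annotated gene carries at least one simple somatic mutation somewhere in the dataset, which is why per-gene counts alone say little about which genes matter.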
Select “Pancreatic cancer – Canada”

… But where is the data?

http://dcc.icgc.org/

Can do bulk download of the data …

[Diagram labels: ERA, TCGA, DACO, ICGC, dbGaP, EGA, BAM files, + EGA id]

http://icgc.org/daco

ICGC Controlled Access Datasets
• Detailed Phenotype and Outcome data
– Region of residence
– Risk factors
– Examination
– Surgery
– Radiation
– Sample
– Slide
– Specific histological features
– Analyte
– Aliquot
– Donor notes
• Gene Expression (probe-level data)
• Raw genotype calls
• Gene-sample identifier links
• Genome sequence files

ICGC Open Access (OA) Datasets
• Cancer Pathology
– Histologic type or subtype
– Histologic nuclear grade
• Patient/Person
– Gender, Age range, Vital status, Survival time
– Relapse type, Status at follow-up
• Gene Expression (normalized)
• DNA methylation
• Computed Copy Number and Loss of Heterozygosity
• Newly discovered somatic variants
http://goo.gl/w4mrV
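
The two-tier split above can be written down as a simple lookup, which is handy when planning which downloads need an approved application first. This is only a sketch: the category labels paraphrase the slide and are not official ICGC field names, and `Tier`, `DATA_TIERS` and `requires_daco_approval` are hypothetical names introduced for illustration.

```python
from enum import Enum


class Tier(Enum):
    OPEN = "open"
    CONTROLLED = "controlled"


# Category labels paraphrase the slide; they are illustrative keys,
# not official ICGC field names.
DATA_TIERS = {
    "cancer pathology": Tier.OPEN,
    "patient summary": Tier.OPEN,
    "normalized gene expression": Tier.OPEN,
    "dna methylation": Tier.OPEN,
    "computed copy number and loh": Tier.OPEN,
    "newly discovered somatic variants": Tier.OPEN,
    "detailed phenotype and outcome": Tier.CONTROLLED,
    "probe-level gene expression": Tier.CONTROLLED,
    "raw genotype calls": Tier.CONTROLLED,
    "gene-sample identifier links": Tier.CONTROLLED,
    "genome sequence files": Tier.CONTROLLED,
}


def requires_daco_approval(category: str) -> bool:
    """Return True when the category sits in the controlled tier."""
    return DATA_TIERS[category.strip().lower()] is Tier.CONTROLLED


# Everything that can be downloaded without an access application.
open_categories = sorted(k for k, v in DATA_TIERS.items() if v is Tier.OPEN)
```

For example, `requires_daco_approval("Genome sequence files")` is true, while normalized expression and methylation data fall in the open tier.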

Identify yourself

Fill out a detailed form, which includes:
• Contact and Project Information
• Information Technology details and procedures for keeping data secure
• Data Access Agreement

All of these documents are put into a PDF file that you print and get your institution to sign off on your behalf
DACO approved projects

DACO/DCC User Data Access Process
• Users approved through DACO are now automatically granted access to ICGC controlled access datasets available through the ICGC Data Portal and the EBI’s EGA repository

[Diagram labels: DACO Web Application; application approved by DACO; user accounts activated; DCC Data Portal; DCC User Registry; EBI EGA]
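
The access flow on this slide (one DACO approval, accounts activated at both repositories) can be modelled in a few lines. This is a toy sketch of the process as described, not a real ICGC or EGA API; `Applicant`, `approve` and `can_download` are hypothetical names.

```python
from dataclasses import dataclass, field

# Services named on the slide as holding controlled access data.
CONTROLLED_ACCESS_SERVICES = ("DCC Data Portal", "EBI EGA")


@dataclass
class Applicant:
    name: str
    daco_approved: bool = False
    active_accounts: set = field(default_factory=set)


def approve(applicant: Applicant) -> Applicant:
    """Record DACO approval and activate an account on every service."""
    applicant.daco_approved = True
    applicant.active_accounts.update(CONTROLLED_ACCESS_SERVICES)
    return applicant


def can_download(applicant: Applicant, service: str) -> bool:
    """Controlled data needs both DACO approval and an active account."""
    return applicant.daco_approved and service in applicant.active_accounts
```

The point of the design is that a single approval step fans out to both repositories, so an approved user does not apply separately to the Data Portal and the EGA.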

Catalogue of Somatic Mutations in Cancer
(COSMIC)
• http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/

• COSMIC is designed to store and display somatic mutation information and related details and contains information relating to human cancers.

COSMIC
• Somatic Mutations Only
• Diverse sources
– Literature (Arrays, Next-Gen, PCR...)
– TCGA
– ICGC

• Diverse ways to look at data
– Gene
– Variation
– Tumour type
– Cell line
– Experiment

FAQ

Looking up your favorite gene

In closing
• Remember that all these sites have extensive documentation
• The field is changing quickly, and so are the portals.
• New features are planned as we speak, so you need to use the sites and keep coming back.
• Don’t be afraid to explore
• Interested in learning more after today? Consider
one of the bioinformatics.ca workshops!

Acknowledgements: the CBW gang
Michael Brudno
Michael Stromberg
Michelle Brazas
Marc Fiume

Introduction to Cancer Genomics Databases

  • 1. Canadian Cancer Research Conference November 3-6, 2013 Canadian Bioinformatics Workshops www.bioinformatics.ca
  • 2. Module #: Title of Module 2
  • 3. You are free to: Copy, share, adapt, or re-mix; Photograph, film, or broadcast; Blog, live-blog, or post video of; This presentation. Provided that: You attribute the work to its author and respect the rights and licenses associated with its components. Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero. Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at; http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites Module 1: Cancer Genomic Databases bioinformatics.ca
  • 5. E-mail E-mail francis@oicr.on.ca @bffo Module 1: Cancer Genomic Databases bioinformatics.ca
  • 6. Schedule for Module 1 Cancer Genomic Databases •The Databases: – The International Cancer Genome Consortium (ICGC) – The Cancer Genome Atlas (TCGA) – The Catalogue of Somatic Mutations in Cancer (COSMIC) •Data Access: human genomes and security and privacy issues, Open vs. Controlled Access data Module 1: Cancer Genomic Databases bioinformatics.ca
  • 7. Module 1: Cancer Genomic Databases bioinformatics.ca
  • 8. http://bioinformatics.ca/ Module 1: Cancer Genomic Databases bioinformatics.ca
  • 9. Module 1: Cancer Genomic Databases bioinformatics.ca
  • 10. Workshops planned for 2014: http://bioinformatics.ca/workshops 1. 2. 3. 4. 5. 6. 7. 8. Exploratory Analysis of Biological Data using R Bioinformatics for Cancer Genomics Informatics for RNA-sequence Analysis Informatics on High Throughput Sequencing Data Pathway and Network Analysis of -omics Data Flow Cytometry Data Analysis using R Microarray Data Analysis Informatics and Statistics for Metabolomics Module 1: Cancer Genomic Databases bioinformatics.ca
  • 11. http://bioinformatics.ca/workshops/2013 Module 1: Cancer Genomic Databases bioinformatics.ca
  • 12. E-mail: course_info@bioinformatics.ca Web: http://bioinformatics.ca Workshop announcement mailing list: http://bioinformatics.ca/mailman/listinfo/announce Module 1: Cancer Genomic Databases bioinformatics.ca
  • 13. Soap-Box time! • • Open Access, Open Data and Open Source are essential for good Science. Openness is a responsibility, an obligation, and something that comes with the privilege of doing publicly funded work. Open Source Open Access Open Data Opencourseware Module 1: Cancer Genomic Databases bioinformatics.ca
  • 14. Module 1: Cancer Genomic Databases bioinformatics.ca
  • 15. Cancer therapy is like beating the dog with a stick to get rid of his fleas. - Anna Deavere Smith, Let me down easy Module 1: Cancer Genomic Databases bioinformatics.ca
  • 16. http://goo.gl/Yhbsj Module 1: Cancer Genomic Databases bioinformatics.ca
  • 17. The revolution in cancer research can summed up in a single sentence: cancer is in essence, a genetic disease. - Bert Vogelstein Module 1: Cancer Genomic Databases bioinformatics.ca
  • 18. Cancer: a Disease of the Genome Challenge in Treating Cancer:  Every tumour is different  Every cancer patient is different Module 1: Cancer Genomic Databases bioinformatics.ca
  • 19. Cancer Genomic Databases Chin et al, Genes. Dev. 2011 March 15; 25(6): 534-555 http://www.ncbi.nlm.nih.gov/pubmed/?term=21406553 Module 1: Cancer Genomic Databases bioinformatics.ca
  • 20. TCGA The Cancer Genome Atlas is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including largescale genome sequencing. Module 1: Cancer Genomic Databases bioinformatics.ca
  • 21. About the TCGA • • • • National Cancer Institute (NCI) National Human Genome Research Institute (NHGRI) Phased Structure: – Three-year pilot in 2006 with an investment of $50 million from each – TCGA will collect and characterize more than 20 additional tumour types (now at 16) Module 1: Cancer Genomic Databases bioinformatics.ca
  • 22. Where to start with the TCGA? Wiki: https://wiki.nci.nih.gov/display/TCGA/About+TCGA Module 1: Cancer Genomic Databases bioinformatics.ca
  • 23. Division of Labour • Biospecimen Core Resource (BCR) – centre where samples are carefully catalogued, processed, qualitychecked and stored along with participant clinical information • Genome Sequencing Centre (GSC) – uses high-throughput methods to identify changes to DNA sequences that are associated with specific cancer types • Genome Characterization Centre (GCC) – uses high-throughput technologies to analyze genomic changes involved in cancer • Genome Data Analysis Centre (GDAC) – provides novel informatics tools to the research community • – provides analysis results using TCGA data. Data Coordinating Centre (DCC) – Central provider of TCGA data. – Standardizes data formats and validates submitted data. Module 1: Cancer Genomic Databases bioinformatics.ca
  • 24. TCGA Data • Sequence reads from newer sequencing technologies are available at the Cancer Genome Hub: https://cghub.ucsc.edu/ • Higher level sequence data (variation calls and abundance measures) are available at the TCGA Portal: http://cancergenome.nih.gov/
  • 25. TCGA data flow http://goo.gl/b5nojx
  • 26. Data Coordinating Centre • Plays a central role – Receiving data from BCR, GSC and GCC sites – Providing access to users – Performing analysis of data • Responsibilities: – Protecting participant privacy and confidentiality – Developing data standards and controlled vocabularies – Establishing informatics pipelines for data flow – Developing new analytical and visualization technologies to facilitate data analysis, for all audiences
  • 27. TCGA DCC Data Portal • Provides a platform to search, download and analyze TCGA data sets • Two data access tiers: Open and Controlled • Analytic tools include: Cancer Molecular Analysis and Cancer Genome Workbench (NCBIB), Integrative Genomics Viewer (Broad) and Cancer Genomics Analysis (MSKCC).
  • 28. TCGA Data Browser https://tcga-data.nci.nih.gov/tcga/ Query TCGA data online using the TCGA Data Browser
  • 29. The International Cancer Genome Consortium (ICGC) • http://www.icgc.org/ • “ICGC was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe”
  • 31. ICGC Map – November 2013: 67 projects launched
  • 32. ICGC datasets to date (Hardeep Nahal) [Chart: ICGC Data Portal cumulative donor count for member projects; x-axis: Dec 2011 to Sept 2013; y-axis: Number of Donors (1,000 to 10,000); data points labelled Release 7 through Release 14]
  • 33. ICGC dataset version 14, September 2013 (Hardeep Nahal) • Cancer types: 41 • Donors: 8,532 (18,056 specimens) • Simple somatic mutations: 1,995,134 • Copy number mutations: 18,526,593 • Structural rearrangements: 18,614 • Genes affected* by simple somatic mutations: 22,074 • Genes affected* by non-synonymous coding mutations: 19,150 • Genes affected* by copy number mutations: 20,341 • Genes affected* by structural rearrangements: 1,884 • *out of 22,259 protein-coding genes annotated in Ensembl Human release 69 • Open tier and controlled data currently available
  • 36. Select “Pancreatic cancer – Canada”
  • 37. … But where is the data?
  • 39. http://dcc.icgc.org/
  • 42. Can do bulk download of the data …
  • 45. http://icgc.org/daco
  • 46. ICGC Controlled Access Datasets • Detailed Phenotype and Outcome data: region of residence, risk factors, examination, surgery, radiation, sample, slide, specific histological features, analyte, aliquot, donor notes • Gene Expression (probe-level data) • Raw genotype calls • Gene-sample identifier links • Genome sequence files. ICGC Open Access (OA) Datasets • Cancer Pathology: histologic type or subtype, histologic nuclear grade • Patient/Person: gender, age range, vital status, survival time, relapse type, status at follow-up • Gene Expression (normalized) • DNA methylation • Computed Copy Number and Loss of Heterozygosity • Newly discovered somatic variants. http://goo.gl/w4mrV
  • 47. Identify yourself. Fill out the detail form, which includes: • Contact and Project Information • Information Technology details and procedures for keeping data secure • Data Access Agreement. All of these documents are put into a PDF file that you print and get your institution to sign off on your behalf.
  • 54. DACO approved projects
  • 55. DACO/DCC User Data Access Process • Users approved through DACO are now automatically granted access to ICGC controlled access datasets available through the ICGC Data Portal and the EBI’s EGA repository. [Diagram labels: DACO Web Application; application approved by DACO; user accounts activated; DCC User Registry; DCC Data Portal; EBI EGA]
  • 56. Catalogue of Somatic Mutations in Cancer (COSMIC) • http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/ • COSMIC is designed to store and display somatic mutation information and related details and contains information relating to human cancers.
  • 57. COSMIC • Somatic Mutations Only • Diverse sources – Literature (Arrays, Next-Gen, PCR...) – TCGA – ICGC • Diverse ways to look at data – Gene – Variation – Tumour type – Cell line – Experiment
  • 58. FAQ
  • 59. Looking up your favorite gene (screenshot with callouts 1 to 3)
  • 62. In closing • Remember all these sites have great amounts of documentation • The field is changing quickly, and so are the portals. • New features are planned as we speak, and so you need to use the sites, and keep coming back. • Don’t be afraid to explore • Interested in learning more after today? Consider one of the bioinformatics.ca workshops!
  • 63. Acknowledgements: the CBW gang: Michael Brudno, Michael Stromberg, Michelle Brazas, Marc Fiume
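
As a recap of slide 46, the split between ICGC open access and controlled access data can be sketched as a small lookup table. The category strings are taken from that slide; the dictionary layout and the function name `access_tier` are illustrative assumptions and are not part of any ICGC API.

```python
# Data categories per access tier, as listed on slide 46.
# TIERS and access_tier are hypothetical names used for illustration only.
TIERS = {
    "open": {
        "cancer pathology",
        "patient/person",
        "gene expression (normalized)",
        "dna methylation",
        "computed copy number and loss of heterozygosity",
        "newly discovered somatic variants",
    },
    "controlled": {
        "detailed phenotype and outcome data",
        "gene expression (probe-level data)",
        "raw genotype calls",
        "gene-sample identifier links",
        "genome sequence files",
    },
}


def access_tier(category: str) -> str:
    """Return 'open' or 'controlled' for a data category named on slide 46."""
    key = category.strip().lower()
    for tier, categories in TIERS.items():
        if key in categories:
            return tier
    raise KeyError(f"category not listed on slide 46: {category!r}")


if __name__ == "__main__":
    for name in ("DNA methylation", "Raw genotype calls"):
        print(f"{name}: {access_tier(name)}")
```

Anything that resolves to the controlled tier here would, per slides 45 to 55, require an approved DACO application before download.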

Editor's notes

  1. {"33":"Ensembl 61 Hs has 53,515 gene loci annotated, which explains the high numbers of affected genes for SSMs (I’ve double-checked these numbers)\n","29":"A few notes on ICGC\n","19":"Consecutive basepairs\n","59":"Summary page with basic gene description and list of curated pubs. Click on Histogram to view the distribution of mutations. \n"}
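
To put the slide 33 gene counts in proportion, the sketch below divides each count by the slide's stated denominator of 22,259 protein-coding genes. All numbers are copied from slide 33; the variable and function names are illustrative assumptions.

```python
# Gene counts from slide 33 (ICGC release 14, September 2013).
TOTAL_PROTEIN_CODING = 22_259  # Ensembl Human release 69, as stated on the slide

GENES_AFFECTED = {
    "simple somatic mutations": 22_074,
    "non-synonymous coding mutations": 19_150,
    "copy number mutations": 20_341,
    "structural rearrangements": 1_884,
}


def percent_affected(count: int, total: int = TOTAL_PROTEIN_CODING) -> float:
    """Percentage of annotated protein-coding genes affected."""
    return 100.0 * count / total


PERCENTAGES = {kind: percent_affected(n) for kind, n in GENES_AFFECTED.items()}

if __name__ == "__main__":
    for kind, pct in PERCENTAGES.items():
        print(f"{kind}: {pct:.1f}%")
```

On these figures, roughly 99% of the annotated protein-coding genes carry at least one simple somatic mutation, which is the high number that the note on slide 33 remarks on.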