Michael Reich
Broad Institute of Harvard and MIT
           July 14, 2012
Need: Insights through integrative studies

• Gene mutation causing Leigh Syndrome French Canadian Type (LSFC) and 8 other mitochondrial diseases
   o Integrate: candidate genomic region, mitochondrial proteomic data, and cancer expression compendium
   o Authors: Mootha et al. 2003, Calvo et al. 2006
• Subtle repression of oxphos genes in diabetic muscle: role for mitochondrial dysregulation in diabetes pathology; new computational approach, Gene Set Enrichment Analysis
   o Integrate: gene sets/pathways & processes with expression profiles
   o Authors: Mootha et al. 2003, Subramanian & Tamayo et al. 2005
• IKBKE as a new breast cancer oncogene
   o Integrate: RNAi screens, transformation of activated kinases, and copy number from SNP arrays of cell lines
   o Authors: Boehm et al. 2007
• Discovery of 3 new genes involved in Glioblastoma Multiforme (NF1, ERBB2, PIK3R1); confirmation of TP53, PTEN, EGFR, RB1, PIK3CA
   o Integrate: DNA sequence, copy number, methylation aberrations, and expression profiles in 206 glioblastomas
   o Authors: TCGA Research Network 2008
• ~3000 novel, large non-coding RNAs with functions in development, the immune response, and cancer
   o Integrate: genome sequences from 21 mammals, epigenomic maps, and expression profiles
   o Authors: Guttman, Rinn et al. 2009
• Characterization of disease subtypes and improved risk stratification for medulloblastoma patients
   o Integrate: copy number, expression, and clinical data for 96 medulloblastoma patients
   o Authors: Tamayo et al., Cho et al. 2011
Translational Research Example
[Figure: workflow diagram across GenePattern, Cytoscape, IGV/UCSC, Genomica, and CMAP]
• Inputs: compendium, network, alterations, expression
• Analysis steps shown: differentially expressed genes; GSEA test enrichment; show network; expand +1 (include neighbors); show chromosome gene list; add transcription factor track from UCSC; load compendium; show module map; extract module; learn p53 site/score on promoter; test for similarity of p53 and gene location
• Conclusions shown: arrests G2/M; looks close to p53 site; pathway activation
• File formats passed between steps: GCT, GXP, SIF, ODF, GMT, GXC, GRP, GFF, GXA, HTML
12 steps, 6 tools, 7 transitions


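[Figure: the same workflow drawn as a graph of numbered analysis steps and roman-numeral analysis conclusions]
• Tools: GenePattern, Cytoscape, IGV, Genomica, CMAP, UCSC Browser
• Node types: analysis step, analysis conclusion
• Edge types: within tool, across tools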
The Need

• A lightweight “connection layer” for a wide variety of
  integrative genomic analyses
   o Support for all types of resource: Web-based,
     desktop, etc.
   o Automatic conversion of data formats between tools
   o Easy access to data from any location
   o Any tool that joins is automatically connected to the
     community of tools
   o Ease of entry into the environment
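One way to picture "automatic conversion of data formats between tools" is a registry of pairwise converters that is searched for a chain from the format one tool writes to the format another tool reads. The sketch below is only an illustration under stated assumptions: the format names (GCT, GXP, ODF, GMT, GRP, GXA) are borrowed from the deck's workflow slide, but the converter pairs in `CONVERTERS` and the function `conversion_path` are hypothetical and do not describe GenomeSpace's actual converters or API.

```python
from collections import deque

# Hypothetical converter table: each pair means "a converter from A to B exists".
# The format names come from the slides; the pairs themselves are assumptions.
CONVERTERS = {
    ("GCT", "GXP"),
    ("GXP", "GXA"),
    ("ODF", "GMT"),
    ("GMT", "GRP"),
}


def conversion_path(source, target, converters=CONVERTERS):
    """Return the shortest chain of formats from source to target, or None."""
    if source == target:
        return [source]
    # Build an adjacency list from the converter pairs.
    graph = {}
    for src, dst in converters:
        graph.setdefault(src, []).append(dst)
    # Breadth-first search guarantees the first path found is the shortest.
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        for nxt in sorted(graph.get(path[-1], [])):
            if nxt in seen:
                continue
            if nxt == target:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


print(conversion_path("GCT", "GXA"))  # chain through an intermediate format
print(conversion_path("GCT", "GMT"))  # no chain registered in this toy table
```

With a design like this, a newly contributed converter only has to register one pair; any chain it completes would become available without further changes to the tools.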
Cloud-based storage
API connectivity layer
Online community to share diverse computational tools

• 6 Seed Tools: Cytoscape, Galaxy, GenePattern, Genomica, IGV, UCSC Browser
• 3 Driving Biological Projects: lincRNAs, cancer stem cells, patient stratification
• Open to new tools and new biological projects

www.genomespace.org
GenomeSpace Principles
• Aimed towards non-programming users

• Support interoperability through
  automatic cross-tool data transfer

• Requires minimal changes to tools




GenomeSpace Components
[Diagram: GenomeSpace components]
• GS Enabled Tools: GenePattern, Galaxy, Cytoscape, Integrative Genomics Viewer
• GenomeSpace Server
   o 1: Authentication and Authorization
   o 2: Analysis and Tool Manager
   o 3: Data Manager
• GenomeSpace Project Data
• External Data Sources & Tools: geWorkbench
Seed Tools

• Cytoscape
• Galaxy
• GenePattern
• Genomica
• IGV
• UCSC Genome Browser
New Tools

• Recently added
   o InSilicoDB (University of Brussels)
   o Cistrome (Dana-Farber Cancer Institute)
• In development
   o geWorkbench (Columbia University)
   o Reactome (Ontario Institute for Cancer Research)
   o ArrayExpress (EBI)
Using GenomeSpace

[Screenshot callouts]
• Tools and data sources
• Actions
• Cloud-based filesystem
GenomeSpace Actions
GenomeSpace Tool Enablement: IGV
GenomeSpace Tool Enablement: GenePattern
GenomeSpace Data Source Integration: InSilico DB
Other collaborating projects

• Taverna/MyExperiment (University of Manchester)
• National Center for Biomedical Ontology (Stanford University)
DBP3: Studying the regulatory control of human hematopoiesis
DBP3: Studying the regulatory control of human hematopoiesis – Overview

[Figure: overview of the analysis, drawn as five linked parts]
• Part 1: Data pre-processing and quality control (leads to part 2)
• Part 2: Basic analysis (from part 1; leads to parts 3 and 4)
• Part 3: Studying the transcriptional program (from part 2)
• Part 4: cis-regulatory site analysis (from parts 2 and 3; leads to part 5)
• Part 5: Finding new transcription factor regulators (from part 4)

Legend
• Tools: Genomica, GenePattern, IGV, Cytoscape, Galaxy, manual step
• Shapes: analysis section, analysis step (# steps), analysis conclusion, optional choices
• Status: currently integrated, not yet integrated
Part 5: Finding new transcription factor regulators

[Figure: Part 5 workflow, entered from part 4, in four steps (Step 1 to Step 4)]
Step 1: Create transcription factor dataset in Genomica and save to GenomeSpace
Step 2: Send transcription factor datasets into GenePattern
Step 2: Perform differential expression analysis in GenePattern
Step 3: Send differentially expressed genes to Genomica
Step 3: Perform module network analysis in Genomica
Step 4: Visualize regulators with known SNPs and linkage regions
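The Part 5 steps can be tallied the same way the earlier example was summarized as "12 steps, 6 tools, 7 transitions". The snippet below is a minimal sketch, not part of GenomeSpace: the list `PART5_STEPS` and the function `summarize` are hypothetical names, each step is paired with the tool named in its slide title, and Step 4 is left out because its title does not name a tool.

```python
# Steps as titled on the slides: (label, tool the data ends up in).
# Step 4 is omitted because its slide title does not name a tool.
PART5_STEPS = [
    ("Create transcription factor dataset", "Genomica"),
    ("Send transcription factor datasets", "GenePattern"),
    ("Perform differential expression analysis", "GenePattern"),
    ("Send differentially expressed genes", "Genomica"),
    ("Perform module network analysis", "Genomica"),
]


def summarize(steps):
    """Count steps, distinct tools, and hand-offs between different tools."""
    tools = [tool for _, tool in steps]
    # A transition is any pair of consecutive steps that use different tools.
    transitions = sum(1 for a, b in zip(tools, tools[1:]) if a != b)
    return {
        "steps": len(steps),
        "tools": len(set(tools)),
        "transitions": transitions,
    }


print(summarize(PART5_STEPS))
# {'steps': 5, 'tools': 2, 'transitions': 2}
```

Each counted transition is a point where data has to move between tools, which is the hand-off the deck proposes to automate.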
Deployment Architecture and APIs
Deployment Architecture

[Diagram: deployment architecture]
• GS Clients: GS UI, IGV, Genomica, GenePattern, Galaxy, Cytoscape, UCSC
• Client connections: REST or CDK
• Hosted on Amazon
   o Identity Service (OpenID)
   o Analysis Task Manager (ATM), reached over REST
   o Data Manager (DM), reached over REST
   o Provenance
   o SimpleDB
   o S3 for file transfers
• External Data Sources (e.g., ArrayExpress)
Join the GenomeSpace community

• Researchers with biological projects
• Developers
  – Add your tools
  – Contribute format converters
  – Build new infrastructure
• Data portals and repositories
  – Link your resources
Acknowledgements

GenomeSpace Collaborators
• Cytoscape: Trey Ideker Lab, UCSD
• Galaxy: Anton Nekrutenko Lab, Penn State University
• Genomica: Eran Segal Lab, Weizmann Institute
• UCSC Browser Team
• GenePattern Team
• IGV Team

Broad Institute
• Ted Liefeld
• Helga Thorvaldsdottir
• Jim Robinson
• Marco Ocana
• Eliot Polk
• Jill Mesirov, PI

Driving Biological Projects
• Howard Chang Lab – Stanford University
• Aviv Regev Lab – Broad Institute

Funding: [logo]

gs-help@broadinstitute.org
www.genomespace.org

Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photosnarwatsonia7
 
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service JaipurHigh Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipurparulsinha
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Servicesonalikaur4
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptxDr.Nusrat Tariq
 
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service LucknowVIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknownarwatsonia7
 
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...narwatsonia7
 
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking ModelsMumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Modelssonalikaur4
 
97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAAjennyeacort
 
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service MumbaiVIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbaisonalikaur4
 
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...narwatsonia7
 

Último (20)

Call Girl Lucknow Mallika 7001305949 Independent Escort Service Lucknow
Call Girl Lucknow Mallika 7001305949 Independent Escort Service LucknowCall Girl Lucknow Mallika 7001305949 Independent Escort Service Lucknow
Call Girl Lucknow Mallika 7001305949 Independent Escort Service Lucknow
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
 
Ahmedabad Call Girls CG Road 🔝9907093804 Short 1500 💋 Night 6000
Ahmedabad Call Girls CG Road 🔝9907093804  Short 1500  💋 Night 6000Ahmedabad Call Girls CG Road 🔝9907093804  Short 1500  💋 Night 6000
Ahmedabad Call Girls CG Road 🔝9907093804 Short 1500 💋 Night 6000
 
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
 
See the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformSee the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy Platform
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
 
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
 
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service BangaloreCall Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
 
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdf
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdfHemostasis Physiology and Clinical correlations by Dr Faiza.pdf
Hemostasis Physiology and Clinical correlations by Dr Faiza.pdf
 
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original PhotosBook Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
Book Call Girls in Yelahanka - For 7001305949 Cheap & Best with original Photos
 
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service JaipurHigh Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptx
 
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service LucknowVIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
 
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
 
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking ModelsMumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
 
97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA
 
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service MumbaiVIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
 
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
 

M Reich - GenomeSpace

  • 1. Michael Reich Broad Institute of Harvard and MIT July 14, 2012
  • 2. Need: Insights through integrative studies
      o Gene mutation causing Leigh Syndrome French Canadian Type (LSFC) and 8 other mitochondrial diseases. Integrate: candidate genomic region, mitochondrial proteomic data, and cancer expression compendium. Authors: Mootha et al. 2003, Calvo et al. 2006
      o Subtle repression of oxphos genes in diabetic muscle: role for mitochondrial dysregulation in diabetes pathology; new computational approach, Gene Set Enrichment Analysis. Integrate: gene sets/pathways and processes with expression profiles. Authors: Mootha et al. 2003, Subramanian & Tamayo et al. 2005
      o IKBKE as a new breast cancer oncogene. Integrate: RNAi screens, transformation of activated kinases, and copy number from SNP arrays of cell lines. Authors: Boehm et al. 2007
      o Discovery of 3 new genes involved in Glioblastoma Multiforme (NF1, ERBB2, PIK3R1); confirmation of TP53, PTEN, EGFR, RB1, PIK3CA. Integrate: DNA sequence, copy number, methylation aberrations, and expression profiles in 206 glioblastomas. Authors: TCGA Research Network 2008
      o ~3000 novel, large non-coding RNAs with functions in development, the immune response, and cancer. Integrate: genome sequences from 21 mammals, epigenomic maps, and expression profiles. Authors: Guttman, Rinn et al. 2009
      o Characterization of disease subtypes and improved risk stratification for medulloblastoma patients. Integrate: copy number, expression, and clinical data for 96 medulloblastoma patients. Authors: Tamayo et al., Cho et al. 2011
  • 3. Translational Research Example (Workflow diagram across GenePattern, Cytoscape, IGV/UCSC, Genomica, and CMAP. Numbered analysis steps 1 to 8 and conclusions i to vi lead from a compendium and differentially expressed genes, through module maps, GSEA enrichment, and network views, to a test of similarity between a learned promoter site and the p53 site. Format pairs shown on the transitions include GCT/GXP, ODF/GMT, GXC/GRP, and GFF/GXA.)
  • 4. 12 steps, 6 tools, 7 transitions (Diagram of the same workflow. Tools: GenePattern, Cytoscape, IGV, Genomica, CMAP, UCSC Browser. Legend: analysis step; analysis conclusion; transition within a tool; transition across tools.)
  • 5. The Need: A lightweight “connection layer” for a wide variety of integrative genomic analyses
      o Support for all types of resource: Web-based, desktop, etc.
      o Automatic conversion of data formats between tools
      o Easy access to data from any location
      o Any tool that joins is automatically connected to the community of tools
      o Ease of entry into the environment
  • 7. Online community to share diverse computational tools
      o 6 Seed Tools: Cytoscape, Galaxy, GenePattern, Genomica, IGV, UCSC Browser
      o 3 Driving Biological Projects: lincRNAs, cancer stem cells, patient stratification
      o New tools and new biological projects
      o www.genomespace.org
  • 8. GenomeSpace Principles
      o Aimed towards non-programming users
      o Support interoperability through automatic cross-tool data transfer
      o Requires minimal changes to tools
      o www.genomespace.org
  • 9. GenomeSpace Components (Diagram. GS-enabled tools: GenePattern, Galaxy, Cytoscape, Integrative Genomics Viewer. GenomeSpace Server: 1. Authentication and Authorization; 2. Analysis and Data Manager; 3. Tool Manager. Other labels: GenomeSpace Project Data; geWorkbench; External Data Sources & Tools.)
  • 10. Seed Tools: Cytoscape, Galaxy, GenePattern, Genomica, IGV, UCSC Genome Browser
  • 11. New Tools
      o Recently added: InSilicoDB (University of Brussels), Cistrome (Dana-Farber Cancer Institute)
      o In development: geWorkbench (Columbia University), Reactome (Ontario Institute of Cancer Research), ArrayExpress (EBI)
  • 12. Using GenomeSpace (Screenshot labels: Tools and Data Sources; Actions; Cloud-based filesystem.)
  • 17. GenomeSpace Data Source Integration: InSilico DB
  • 18. Other collaborating projects • Taverna/MyExperiment (University of Manchester) • National Center for Biomedical Ontology (Stanford University)
  • 19. DBP3: Studying the regulatory control of human hematopoiesis
  • 20. DBP3: Studying the regulatory control of human hematopoiesis – Overview (Workflow diagram. Part 1: Data pre-processing and quality control; Part 2: Basic analysis; Part 3: Studying the transcriptional program; Part 4: cis-regulatory site analysis; Part 5: Finding new transcription factor regulators. Tools: Genomica, GenePattern, IGV, Cytoscape, Galaxy. Legend: manual step; analysis section; analysis step (# steps); analysis conclusion; optional choices; currently integrated; not yet integrated.)
  • 21. Part 5: Finding new transcription factor regulators (Diagram: Steps 1 to 4, continuing from part 4.)
  • 22. Step 1: Create transcription factor dataset in Genomica and save to GenomeSpace
  • 23. Step 2: Send transcription factor datasets into GenePattern
  • 24. Step 2: Perform differential expression analysis in GenePattern
  • 25. Step 3: Send differentially expressed genes to Genomica
  • 26. Step 3: Perform module network analysis in Genomica
  • 27. Step 4: Visualize regulators with known SNPs and linkage regions
  • 28. Step 4: Visualize regulators with known SNPs and linkage regions
  • 29. Deployment Architecture and APIs (Diagram. GS clients: GS UI, IGV, Genomica, GenePattern, Galaxy, Cytoscape, UCSC, connecting via REST or the CDK. Services on Amazon: Identity Service (OpenID); Analysis Task Manager (ATM); Data Manager (DM); Provenance; SimpleDB; S3 file transfers. External data sources, e.g., ArrayExpress, connect via REST.)
  • 30. Join the GenomeSpace community
      o Researchers with biological projects
      o Developers: add your tools, contribute format converters, build new infrastructure
      o Data portals and repositories: link your resources
  • 31. Acknowledgements
      o GenomeSpace Collaborators: Cytoscape: Trey Ideker Lab, UCSD; Galaxy: Anton Nekrutenko Lab, Penn State University; Genomica: Eran Segal Lab, Weizmann Institute; UCSC Browser Team; GenePattern Team; IGV Team
      o Broad Institute: Ted Liefeld, Helga Thorvaldsdottir, Jim Robinson, Marco Ocana, Eliot Polk, Jill Mesirov (PI)
      o Driving Biological Projects: Howard Chang Lab, Stanford University; Aviv Regev Lab, Broad Institute
      o Funding
      o gs-help@broadinstitute.org, www.genomespace.org
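
Slides 5 and 8 call for automatic conversion of data formats between tools. A minimal Python sketch of how a connection layer might chain converters follows. Everything in it is an assumption made for illustration: the `register`, `find_path`, and `convert` names, the `csv_line` format, and the registry design are hypothetical and are not GenomeSpace's actual API. Only the GMT layout (set name, description, then genes, tab-separated) and the GRP layout (one gene per line) follow the published file formats.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: (source format, target format) -> converter function.
Converter = Callable[[str], str]
CONVERTERS: Dict[Tuple[str, str], Converter] = {}


def register(src: str, dst: str) -> Callable[[Converter], Converter]:
    """Decorator that records a converter for one format pair."""
    def wrap(fn: Converter) -> Converter:
        CONVERTERS[(src, dst)] = fn
        return fn
    return wrap


@register("gmt", "grp")
def gmt_to_grp(text: str) -> str:
    """Flatten GMT gene sets into a GRP list, keeping first-seen order."""
    genes: List[str] = []
    for line in text.splitlines():
        # GMT columns: set name, description, then one gene per column.
        for gene in line.split("\t")[2:]:
            if gene and gene not in genes:
                genes.append(gene)
    return "\n".join(genes) + "\n"


@register("grp", "csv_line")
def grp_to_csv_line(text: str) -> str:
    """Join a GRP list into one comma-separated line (hypothetical format)."""
    genes = [ln.strip() for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    return ",".join(genes)


def find_path(src: str, dst: str) -> List[Tuple[str, str]]:
    """Breadth-first search for the shortest chain of registered converters."""
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        fmt, path = queue.popleft()
        if fmt == dst:
            return path
        for (a, b) in CONVERTERS:
            if a == fmt and b not in seen:
                seen.add(b)
                queue.append((b, path + [(a, b)]))
    raise ValueError(f"no conversion path from {src} to {dst}")


def convert(text: str, src: str, dst: str) -> str:
    """Apply each converter on the path in turn."""
    for step in find_path(src, dst):
        text = CONVERTERS[step](text)
    return text


# Illustrative input only; the set names and descriptions are made up.
SAMPLE_GMT = "SET1\tdesc\tTP53\tPTEN\nSET2\tdesc\tPTEN\tEGFR\n"

if __name__ == "__main__":
    print(convert(SAMPLE_GMT, "gmt", "csv_line"))
```

With a registry of this kind, a newly added tool would only need to register converters to or from one existing format in order to reach every other format on a path, which is one way to read the slide's point that a tool that joins is automatically connected to the rest.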

Editor's notes

  1. The key observation here is that if you can focus on making the transitions between tools easier, you can greatly accelerate the way you can use those tools in concert
  2. The key challenge is the growing gap between the need to use a variety of different analysis and visualization tools, and the difficulty of getting tools from different sources to work together. There are 7-10K bioinformatics tools. The Broad alone lists ~60 tools on its external website, and that doesn’t include internal-use tools, LIMS, or data processing/management pipelines. There are also 5K public databases.
  3. Data storage in the cloud is being used in a huge number of applications for its cost, scalability, and accessibility. And Web-based APIs have been used to connect disparate Web resources in a very easy way. These were the principles we used when we developed GenomeSpace.
  4. The GenomeSpace project aims to build an online community to share, find, and interoperate diverse computational tools. Tools retain their identity and use as stand-alone software, and GenomeSpace maintains their native look and feel. Our goal is to bring the ever-changing wealth of genomic analysis methods, and whatever data is required, to the fingertips of any biologist. Seeded with 6 popular genomics tools representing diverse architectures (Cytoscape, Galaxy, GenePattern, Genomica, IGV, UCSC Browser). Support interoperability through frictionless data transfer, with reproducibility, analytic workflows, and comprehensive documentation. Development driven by 3 driving biological projects (cancer, lincRNAs, stem cell circuits, and patient stratification). Live in the cloud. Next phase: starting to engage new tools and new biomedical projects. Current participating institutions.
  5. Seed tools span a wide variety of genomic analysis areas: network analysis and visualization; sequence analysis; functional genomics analysis; module map analysis; integrated genomics and sequence visualization.
  6. Seed tools span a wide variety of genomic analysis areas: network analysis and visualization; sequence analysis; functional genomics analysis; module map analysis; integrated genomics and sequence visualization.
  7. View GenomeSpace files from within GenePattern, and use a file from GenomeSpace in a GenePattern analysis. Go to the GenomeSpace UI.
  8. This is a way to get GEO datasets into any tool that will accept gene expression files. You also don’t need to convert from MAGE-TAB. We are also working with the developers of ArrayExpress to GenomeSpace-enable their repository.
  9. Let’s show you how it works. One of the scientific collaborators, Aviv Regev, recently published work on the regulatory control of human hematopoiesis. For this GenomeSpace demonstration, we will show a simplified version of part of this work.
  10. This is an overview of the analytic workflow of the work done by Aviv and colleagues. This is months of research condensed to one slide. The analysis can be divided into 5 major parts. The first part comprises data preprocessing and quality control, the second part is a basic analysis, the third part comprises studying the transcriptional program, part four is a cis-regulatory analysis, and part five aims at finding new transcription factor regulators. Each circle represents work done in one of the seed tools, which you can see here, colored by the identity of the tool. Circles colored in grey represent manual steps. They are manual because the functionality doesn't exist in any of the tools. During this demo, we will mention when we have to do such a manual step. The number inside each circle represents the number of steps that are performed within the tool. And the arrows represent the different paths that can be followed. At certain points, indicated by the blue boxes, there are alternative optional paths that can be followed. We will now zoom in on part 5 in this demo, which is on finding new transcription factor regulators.
  11. What we’re actually doing in this demo is the following. We start from a very broad set of transcription factors. We then find those transcription factors that are significantly differentially expressed in a particular lineage as compared to the other lineages. These we’ll call our lineage-dependent transcription factors. We then infer regulatory programs using Module Networks from Genomica. Regulatory programs consist of modules of coexpressed genes that themselves are coregulated. These regulators we then visualize and validate using previously published GWAS SNPs and linkage regions. We’re now going to walk you through the steps in this demo. We first start in Genomica, and there we load the full expression data, containing more than 200 samples and about 8000 genes, together with a gene set that contains the GO transcription factors. The blue squares next to the genes indicate which genes are GO transcription factors. In Genomica we can then save the expression data from only the GO transcription factors into a text file. In GenePattern we want to refine the GO transcription factors to a set of lineage-specific transcription factors. We do this using the ComparativeMarkerSelection and ExtractComparativeMarkerResults modules in GenePattern. So we basically use a t-test to assess differential expression of transcription factors in each lineage versus the rest of the lineages, and we can use criteria such as an FDR < 0.05 to actually select significantly differentially expressed transcription factors. As an example, we continue with the transcription factors that are specific to the hematopoietic stem cell (HSC) lineage. Back in Genomica, we run Module Networks on the full expression data, while also loading the lineage-specific transcription factors we just generated in GenePattern. As running Module Networks usually takes a while, we now immediately load our previously saved results. In this case, we have 80 modules of coexpressed genes for which we can generate a list of the potential regulators.
      This list allows us to browse very easily through the results. Here you see a module that has SMAD4 as its top regulator. SMAD4 divides the sample space into 2 partitions, while other regulators further subdivide the samples. As a last step, we want to visualize and validate our list of potential regulators. To do that, we follow 2 approaches in parallel. Both approaches require us to manually create a .bed annotation track that contains the coordinates of the regulators. To validate our regulators, we want to find overlaps between our regulators and previously published GWAS SNPs and linkage regions. We download these SNPs and linkage regions and we manually create .bed annotation tracks for both of them. The first approach is to visualize our regulators in IGV. This requires us to upload the 3 .bed annotation tracks in IGV. In parallel, we submit the same 3 .bed annotation tracks to Galaxy, where we can do overlap analysis between regulators, SNPs, and linkage regions in several steps. From this analysis, we get an Excel file that displays the overlaps in table format. Interesting also is that this table contains the PubMed IDs of the papers in which a particular GWAS SNP has been published in relation to a particular disease.
  12. To launch desktop-based tools, we first get the prompt that we’re downloading the application in jnlp format
  13. NOTE: The file sent to GenePattern is still a Genomica-formatted file. We specify in the URL what format we want to convert the file to
  14. We send this file directly from the GenePattern interface to Genomica, now converting it back to a Genomica formatted file
  15. We now see Genomica open with the data file, and we perform a module map analysis on this data to determine regulators of module networks, and since we’ll need some other data for the next step, we save the results to GenomeSpace
  16. From the GenomeSpace UI, we’re going to send these files now to IGV
  17. We now see IGV open with our tracks. What’s under the covers…
  18. The beta release is available for use! A few brave men and women (explorers)
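
The overlap analysis described in note 11 (regulators against GWAS SNPs and linkage regions, each held in a .bed track) can be sketched in plain Python. This is only an illustration of the interval logic, not the Galaxy workflow used in the demo. The function names, track contents, feature names, and coordinates are all made up for the example. BED coordinates are 0-based and half-open, so two features on the same chromosome overlap when each one starts before the other ends.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive
    name: str


def parse_bed(text: str) -> List[Interval]:
    """Read the first four BED columns, skipping blank and header lines."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name = line.split()[:4]
        intervals.append(Interval(chrom, int(start), int(end), name))
    return intervals


def find_overlaps(regulators: List[Interval],
                  features: List[Interval]) -> List[Tuple[str, str]]:
    """Return (regulator name, feature name) for every overlapping pair."""
    by_chrom: Dict[str, List[Interval]] = defaultdict(list)
    for feat in features:
        by_chrom[feat.chrom].append(feat)
    hits = []
    for reg in regulators:
        for feat in by_chrom.get(reg.chrom, []):
            # Half-open intervals overlap when each starts before the other ends.
            if reg.start < feat.end and feat.start < reg.end:
                hits.append((reg.name, feat.name))
    return hits


# Hypothetical tracks; names and coordinates are placeholders, not real data.
REGULATORS_BED = """track name=regulators
chr1\t100\t200\tregulator_A
chr2\t500\t900\tregulator_B
"""
SNPS_BED = """chr1\t150\t151\tsnp_1
chr1\t200\t201\tsnp_2
chr2\t899\t900\tsnp_3
chr3\t10\t11\tsnp_4
"""

if __name__ == "__main__":
    for reg_name, snp_name in find_overlaps(parse_bed(REGULATORS_BED),
                                            parse_bed(SNPS_BED)):
        print(reg_name, snp_name)
```

The nested loop is quadratic per chromosome, which is fine for a few hundred regulators and SNPs. Larger tracks would call for sorted intervals or an interval tree.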