Using Galaxy for Metabolomics
ICG8
2013
Rob L Davidson PhD

Copyright NBAF-B 2013
Overview
• Metabolomics
• Galaxy
• Birmingham Workflow
• Galaxy Implementation

Metabolomics - data

Metabolomics - tools


Verberk WCEP, Int&Comp Biol. 2013
Metabolomics - future

• Identify metabolite profile
• Link to pathways etc
• Correlate with other ‘omes
• Trans-omics!

Galaxy

Galaxy
Open source
Over 20,000 main Galaxy server users

Over 1,200 papers citing Galaxy use

45+ public Galaxy servers

http://galaxyproject.org
Galaxy - interface

Tool list

Tool parameterisation

Results panel
Galaxy - toolshed

• Lots of tools ->
• Good for trans-omics
• But no metabolomics

…yet

http://toolshed.g2.bx.psu.edu/

Galaxy - workflows

Metabolomics Workflow

Metabolomics Workflow

Met ID: coming soon

Metabolomics Workflow
• SimStitch
– FTICR-DIMS data
– RAW format; NB: requires MS Windows!
– Stitches together short m/z ranges for greater accuracy
– Picks peaks/removes noise, aligns samples
– Applies filters to technical replicates, blanks and samples for greater robustness
• Southam AD, Anal Chem. 2007;79(12):4595-602.
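
The replicate filtering step can be illustrated with a small sketch. This is not SimStitch code: the function name, the 1.5 ppm tolerance and the "at least 2 replicates" rule are assumptions chosen for illustration only.

```python
def replicate_filter(replicates, ppm=1.5, min_count=2):
    """Keep m/z values seen in at least `min_count` replicate peak lists.

    replicates: list of lists of m/z values, one list per technical replicate.
    Returns the mean m/z of each retained peak group, in ascending order.
    The ppm tolerance and min_count defaults are illustrative assumptions.
    """
    # Tag every peak with the replicate it came from and sort by m/z.
    tagged = sorted((mz, idx) for idx, peaks in enumerate(replicates) for mz in peaks)

    groups, current = [], []
    for mz, idx in tagged:
        # Start a new group when the gap to the previous peak exceeds the tolerance.
        if current and (mz - current[-1][0]) / current[-1][0] * 1e6 > ppm:
            groups.append(current)
            current = []
        current.append((mz, idx))
    if current:
        groups.append(current)

    kept = []
    for group in groups:
        # Count distinct replicates, so two peaks from one replicate count once.
        if len({idx for _, idx in group}) >= min_count:
            kept.append(sum(mz for mz, _ in group) / len(group))
    return kept
```

Peaks that appear in only one replicate are treated as noise and dropped; the same pattern could be reused for blank and sample filters with different thresholds.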
Metabolomics Workflow
• XCMS
– LC-MS (also does GC-MS etc)
– netCDF format (also does mzML etc)
– Picks peaks/removes noise, aligns samples

• Smith CA, Anal. Chem. 2006 78:779-787
• http://www.bioconductor.org/packages/2.12/bioc/html/xcms.html

• In our pipeline, we call XCMS (in R) using Matlab...
– To make use of our FileManager Structure
– Because we also use Matlab for post-XCMS processing
Metabolomics Workflow
• Matrix Prep:
– PQN Normalisation
• Dieterle F, Anal Chem. 2006;78(13):4281-90.

– KNN-Missing Value Imputation
• Hrydziuszko O, Metabolomics 2012 8(1):161-174

– G-Log scaling and variance stabilisation:
• Parsons HM, BMC Bioinf. 2007;8:234

– All done using PLS Toolbox data structures in Matlab
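
As a rough illustration of two of the matrix preparation steps, the sketch below implements PQN normalisation and the generalised log (glog) transform in plain Python. It is not the Matlab/PLS Toolbox code used in the pipeline. The function names are hypothetical, the usual preliminary total-intensity normalisation is omitted, and the glog parameter `lam` is passed in by the caller rather than estimated from the data.

```python
import math
from statistics import median

def pqn_normalise(matrix):
    """Probabilistic quotient normalisation of a samples x peaks matrix."""
    # Reference spectrum: per-peak median across all samples.
    reference = [median(col) for col in zip(*matrix)]
    normalised = []
    for row in matrix:
        # Quotients against the reference, skipping peaks where either value is zero.
        quotients = [x / r for x, r in zip(row, reference) if r > 0 and x > 0]
        # The median quotient is taken as the sample's dilution factor.
        factor = median(quotients)
        normalised.append([x / factor for x in row])
    return normalised

def glog(x, lam):
    """Generalised log transform; lam is the variance-stabilising parameter."""
    return math.log(x + math.sqrt(x * x + lam))
```

For example, three samples that differ only by dilution, `[[1, 2, 4], [2, 4, 8], [3, 6, 12]]`, all normalise to `[2, 4, 8]`.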
Metabolomics Workflow
• Multivariate stats
– PCA with automatic selection of PCs
– 2 classes: T-Test on scores on each PC
– 3+ classes: ANOVA and Tukey-Kramer
– Output = text file containing these statistics
– A PLS Toolbox ‘model’ is also created; scores plots etc can be viewed in Matlab

Metabolomics Workflow
• Univariate stats
– 2 classes: T-Test or Mann-Whitney U for each peak
– 3+ classes: ANOVA or Kruskal-Wallis
– False Discovery Rate correction (Benjamini-Hochberg)
– Output = csv file containing these statistics
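
The false discovery rate step can be sketched as follows. This is a minimal plain-Python version of the standard Benjamini-Hochberg adjustment, not the pipeline's own Matlab code; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank m for the largest p-value, 1 for the smallest
        # Scale by m / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each peak's adjusted p-value can then be compared directly with the chosen FDR threshold (for example 0.05) when writing the csv output.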

Metabolomics Workflow
• Done
– First end-to-end metabolomics pipeline in Galaxy
– FTICR-DIMS and LC-MS data

• To Do
– Add in MI-Pack (underway)
– Add more stats e.g. PLS-DA
• possibly merging with Netherlands Metabolomics http://galaxy.nmcdsp.org/ (stats only)

– Replace input file structure with ISA-Tab
• http://www.ebi.ac.uk/metabolights/
Metabolomics Workflow
• Requires 32-bit MS Windows
• Data is large (100s of GB per study)
• Lots of processing power
• Multiple licenses

Galaxy Implementation

Standard Galaxy

Standard Galaxy
BUT!

Metabo - Galaxy
• Requirements
– Allow Galaxy access to MS Windows
• for FTICR-DIMS RAW file processing

– Avoid passing large data over LAN
• slow

– Minimise cost of Galaxy implementation
• make use of existing processors, storage and licences

Metabo - Galaxy
• Solution
– Use Galaxy’s Light Weight Runner (LWR)
– Install LWR client on user’s desktop
– Adjust Python wrappers to send tools via LWR
– Run all tools on User’s desktop (MS Windows)
– No need for
• extra licenses
• central storage/file transfer
• powerful server
Metabo - Galaxy

• New Input type
• Pre-fills user’s IP
• Uses the REMOTE_ADDR request variable
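
A minimal sketch of how such a pre-filled value could be derived in a Python (WSGI) web application. The function name is hypothetical and the port 8913 is an assumption for illustration; this is not the actual Galaxy input-type code.

```python
def prefill_client_address(environ, port=8913):
    """Build a default runner URL from the WSGI environ of the current request.

    The default port is an illustrative assumption, not a confirmed setting.
    """
    # REMOTE_ADDR is set by the web server to the address of the connecting client.
    ip = environ.get("REMOTE_ADDR", "")
    if not ip:
        return ""
    # Bracket IPv6 literals so the port separator stays unambiguous.
    host = "[%s]" % ip if ":" in ip else ip
    return "http://%s:%d/" % (host, port)
```

Note that behind a reverse proxy REMOTE_ADDR holds the proxy's address rather than the user's, so the pre-filled value should stay editable.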

Metabo - Galaxy
• Makes use of
– existing processing power
– licenses
– user’s MS Windows (RAW)
• Still acts as
– version control
– workflow manager
– GUI

Metabo - Galaxy
That said...
• Working with GigaScience
– http://www.gigasciencejournal.com/

• To be hosted on GigaGalaxy
– http://galaxy.cbiit.cuhk.edu.hk/

• Using normal setup
– (all processing/licenses on Galaxy server)

• Downloadable version to include both options
Summary

First RAW -> stats Galaxy Pipeline

Summary
• Metabolomics has entered Galaxy!
• Can be expanded BY COMMUNITY!
• Can merge more easily with other ‘omics

• Have developed Galaxy in a new way that allows
– Use of existing hardware
– Use of existing licenses
– Fewer slow transfers of large data

Acknowledgements
• University of Birmingham
– Ralf Weber, Ulf Sommer, Mark Viant

• GigaScience
– Pete Li

• NERC Discipline Hop scheme

End

Questions?

r.l.davidson@bham.ac.uk

Metabolomics in Galaxy - ICG8 Shenzhen 2013
