Status of ProteomeXchange
Dr. Juan Antonio Vizcaíno
EMBL-EBI
Hinxton, Cambridge, UK
PSI meeting 2017
Ghent, 18 April 2016
Overview
• Introduction and status
• Submission and citation statistics
• New prospective members: jPOST and iProX
• OmicsDI interface
ProteomeXchange Consortium
• Goal: Development of a framework to allow standard
data submission and dissemination pipelines
between the main existing proteomics repositories.
• Includes PeptideAtlas (ISB, Seattle), PRIDE
(Cambridge, UK) and (very recently) MassIVE (UCSD,
San Diego).
• Common identifier space (PXD identifiers)
• Two supported data workflows: MS/MS and SRM.
• Main objective: Make life easier for researchers
http://www.proteomexchange.org
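A shared identifier space is easiest to appreciate with a small check. The sketch below assumes that a PXD accession is the prefix "PXD" followed by six digits (for example PXD000001); that pattern and the helper names are illustrative assumptions, not something stated on the slide.

```python
import re

# Assumption: a PXD accession is "PXD" followed by exactly six digits.
PXD_PATTERN = re.compile(r"PXD\d{6}")


def is_pxd_accession(text: str) -> bool:
    """Return True if the whole string looks like a PXD accession."""
    return PXD_PATTERN.fullmatch(text.strip()) is not None


def find_pxd_accessions(text: str) -> list:
    """Return every PXD-style accession mentioned in free text, in order."""
    return re.findall(r"\bPXD\d{6}\b", text)
```

A helper like this could be used, for instance, to pull dataset accessions out of the data availability statement of a manuscript.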
ProteomeXchange data workflow (Vizcaíno et al., Nat Biotechnol, 2014)
[Diagram; labels as they appear on the slide]
• Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data)
• Submitted content: Researcher’s results, Raw data*, Metadata
• Other labels: ProteomeCentral; Metadata / Manuscript; Raw Data*; Results; Reprocessed results; Journals; UniProt/neXtProt; PeptideAtlas; GPMDB; Other DBs
Complete vs Partial submissions: processed results
[Panels labelled Complete and Partial]
For complete submissions, the spectra can be connected with the processed
identification results and visualized.
Complete vs Partial submissions: experimental metadata
[Panels labelled Complete and Partial]
General experimental metadata about the projects is similar.
However, assay-level information in partial submissions is less detailed.
Complete submissions
Search engine results + MS files
Search engines and tools exporting mzIdentML:
- Mascot
- MSGF+
- MyriMatch and related tools from D. Tabb’s lab
- OpenMS
- PEAKS
- PeptideShaker
- ProCon (ProteomeDiscoverer, Sequest)
- Scaffold
- TPP via the idConvert tool (ProteoWizard)
- ProteinPilot (from version 5.0)
- X!Tandem native conversion (Beta, PILEDRIVER)
- Others: library for X!Tandem conversion, lab
internal pipelines, …
- Crux
An increasing number of tools support export to mzIdentML 1.1
- Referenced spectral files need to be submitted as well
(all open formats are supported).
Updated list: http://www.psidev.info/tools-implementing-mzIdentML#.
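Because the referenced spectrum files must accompany the mzIdentML file, it can help to list them before submitting. The sketch below reads the `location` attribute of each `SpectraData` element in an mzIdentML 1.1 document using only the Python standard library. The embedded XML is a hypothetical, heavily trimmed fragment for illustration (it is not schema-valid), and the function name is an assumption.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by mzIdentML 1.1 documents.
NS = {"mzid": "http://psidev.info/psi/pi/mzIdentML/1.1"}

# Hypothetical, trimmed fragment for illustration only (not schema-valid).
EXAMPLE_MZID = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1" version="1.1.0">
  <DataCollection>
    <Inputs>
      <SpectraData id="SD_1" location="file:///data/run01.mgf"/>
      <SpectraData id="SD_2" location="file:///data/run02.mzML"/>
    </Inputs>
  </DataCollection>
</MzIdentML>"""


def referenced_spectra_files(mzid_xml: str) -> list:
    """Return the file names of all spectrum files referenced by the document."""
    root = ET.fromstring(mzid_xml)
    names = []
    for spectra_data in root.iterfind(".//mzid:SpectraData", NS):
        location = spectra_data.get("location", "")
        # Keep only the file name part of the location URI or path.
        names.append(posixpath.basename(urlparse(location).path))
    return names


print(referenced_spectra_files(EXAMPLE_MZID))
```

Comparing this list against the files staged for upload is one simple way to catch a missing spectrum file early.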
Status of ProteomeXchange
• No changes in the Consortium during 2015.
• Grant ‘ProteomeXchange 2’ was refined and resubmitted to
the joint NSF/BBSRC call, but it was not successful.
• Prospective members:
• jPOST (Japan). Dedicated funding for 3 years.
• iProX (China).
• BBSRC Partnering grants with China and Japan obtained to help with
the process.
• No further contacts with other proteomics resources.
ProteomeXchange: 3,802 datasets up until 1st April, 2016
Origin:
885 USA
465 Germany
342 United Kingdom
264 China
194 France
158 Netherlands
136 Canada
126 Switzerland
107 Denmark
104 Spain
99 Australia
95 Japan
72 Belgium
68 Austria
63 Sweden
61 India
51 Norway
43 Taiwan
30 Italy
29 Brazil
28 Singapore
28 Finland
27 Ireland
27 Russia
26 Israel …
Type:
2429 PRIDE partial
1016 PRIDE complete
250 MassIVE
84 PeptideAtlas/PASSEL complete
23 Reprocessed
Publicly Accessible:
1973 datasets, 52% of all
91% PRIDE
5% MassIVE
4% PASSEL
Data volume:
Total: ~220 TB
Number of all files: ~560,000
Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 1758
2016: 452
Top Species studied by at least 20 datasets:
1526 Homo sapiens
485 Mus musculus
150 Saccharomyces cerevisiae
121 Arabidopsis thaliana
102 Rattus norvegicus
86 Escherichia coli
44 Bos taurus
35 Drosophila melanogaster
32 Glycine max
~ 700 species in total
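The counts on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the slide; the variable and function names are illustrative. Note that the 2016 figure covers the year only up to 1st April.

```python
TOTAL_DATASETS = 3802
PUBLIC_DATASETS = 1973

DATASETS_BY_TYPE = {
    "PRIDE partial": 2429,
    "PRIDE complete": 1016,
    "MassIVE": 250,
    "PeptideAtlas/PASSEL complete": 84,
    "Reprocessed": 23,
}

DATASETS_BY_YEAR = {2012: 102, 2013: 527, 2014: 963, 2015: 1758, 2016: 452}


def percent(part: int, whole: int) -> int:
    """Return part as a percentage of whole, rounded to the nearest integer."""
    return round(100 * part / whole)


type_total = sum(DATASETS_BY_TYPE.values())
year_total = sum(DATASETS_BY_YEAR.values())
public_share = percent(PUBLIC_DATASETS, TOTAL_DATASETS)

print(type_total, year_total, public_share)  # prints: 3802 3802 52
```

Both breakdowns add up to the stated total of 3,802 datasets, and the public share rounds to the 52% quoted on the slide.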
PRIDE Archive submitted datasets up until 1st April, 2016
• In the last year: ~150 submitted datasets per month
• Size: ~ 210TB
PRIDE Archive: Size comparison with other EBI resources (May 2015)
[Chart: Data accumulation by resource. Y axis: bytes, log scale from 1.E+07 to 1.E+17; x axis: date, 2004 to 2016. Series: Metabolites, PRIDE, EGA, ENA (less AE), AE.]
Chart generated by Guy Cochrane
Data reuse is increasing
Data download volume in 2015: ~ 200 TB
Which are the most accessed datasets? (total number of hits)
Citation statistics
Top cited paper (citations/year) in proteomics in NBT
jPOST Features
(Slice)
Slides from Y. Ishihama
jPOST Project (April 2015 – March 2018)
The jPOST project is supported by National Bioscience
Database Center, Japan Science and Technology Agency
(NBDC-JST).
• Set the main servers (Dec, 2015)
• Use the PSI terminology for data registration
• Preparation of demo-site for jPOST repository (until this meeting)
jPOST Repository Ready for PX partnership
• jPOST Repository Start (May 2, 2016)
• jPOST Database Start (2017)
(www.jpost.org)
jPOST Repository site
(from May 2: www.jpost.org)
iProX: integrated proteome resources in China
At present, iProX contains:
• 225 projects
• 834 subprojects
• 15398 data files
• Most of the data comes from
the CNHPP
http://www.iprox.org
Slides from Y. Zhu
iProX diagram
[Diagram; labels as they appear on the slide]
• iProX
• iProX submission system
• iProX proteome database
• MS/MS data processing pipeline
• Dataset import and management
• User information
• Experiment raw files and metadata
• Information on datasets and identifications
• Providing stable service to users
Updates
• Two full time curators
• Chunyuan Yang, Ph.D. in medical genetics
• Xue Wang, M.Sc. in bioinformatics
• Aspera license upgraded from 100 Mbps to 500 Mbps
• High availability: hot standby
• Will be deployed on a cloud platform in May 2016
• Move to the Network Information Center, Chinese Academy of
Sciences
• Internet connection for the service will exceed 1 Gbps
• Remote backup in Shanghai, China
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
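As a minimal sketch of how a dataset page on the portal might be addressed, the helper below appends an accession to the GetDataset URL shown above. The `ID` query parameter and the function name are assumptions made for illustration; the slide only gives the base URL. No network request is made.

```python
from urllib.parse import urlencode

# Base URL as shown on the slide.
GET_DATASET_URL = "http://proteomecentral.proteomexchange.org/cgi/GetDataset"


def dataset_url(accession: str) -> str:
    """Build a link to one dataset on ProteomeCentral.

    Assumption: the dataset is selected with an "ID" query parameter.
    """
    return f"{GET_DATASET_URL}?{urlencode({'ID': accession})}"


print(dataset_url("PXD000001"))
```

Building the query string with `urlencode` keeps the link valid even if an accession ever contains characters that need escaping.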
OmicsDI: Portal for omics datasets
http://www.ebi.ac.uk/Tools/omicsdi/
• Aims to integrate ‘omics’ datasets (genomics, proteomics and
metabolomics at present). Not only EBI resources are included.
PRIDE Archive
MassIVE
PASSEL
GPMDB
MetaboLights
Metabolomics Workbench
GNPS
EGA
Acknowledgements: People
PRIDE team
Attila Csordas
Tobias Ternent
Noemi del Toro
Gerhard Mayer (Bochum, de.NBI)
Johannes Griss
Yasset Perez-Riverol
Henning Hermjakob
Former team members: Rui Wang,
Florian Reisinger and Jose A. Dianes
Acknowledgements:
PX partners
Eric Deutsch
Nuno Bandeira
Yasushi Ishihama (jPOST team)
Yunping Zhu (iProX team)

Más contenido relacionado

La actualidad más candente

ICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxonICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxonDr. Haxel Consult
 
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 2019042501 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425ariadnenetwork
 
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallDr. Haxel Consult
 
ICIC 2013 New Product Introductions Minesoft
ICIC 2013 New Product Introductions MinesoftICIC 2013 New Product Introductions Minesoft
ICIC 2013 New Product Introductions MinesoftDr. Haxel Consult
 
Cogapp Open Studios 2012 - Adventures with Linked Data
Cogapp Open Studios 2012 - Adventures with Linked DataCogapp Open Studios 2012 - Adventures with Linked Data
Cogapp Open Studios 2012 - Adventures with Linked DataCogapp
 
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...LIBER Europe
 
ICIC 2014 New Product Presentations ChemAxon
ICIC 2014 New Product Presentations ChemAxon ICIC 2014 New Product Presentations ChemAxon
ICIC 2014 New Product Presentations ChemAxon Dr. Haxel Consult
 
ICIC 2017: New Poduct presentations InfoChem
ICIC 2017: New Poduct presentations InfoChemICIC 2017: New Poduct presentations InfoChem
ICIC 2017: New Poduct presentations InfoChemDr. Haxel Consult
 
ICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChemICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChemDr. Haxel Consult
 
Discovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data PortalsDiscovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data PortalsPeter Haase
 
Adoption and Integration of Persistent Identifiers in European Research Infor...
Adoption and Integration of Persistent Identifiers in European Research Infor...Adoption and Integration of Persistent Identifiers in European Research Infor...
Adoption and Integration of Persistent Identifiers in European Research Infor...LIBER Europe
 
UKSG 2018 Plenary - National license negotiations advancing the OA transition...
UKSG 2018 Plenary - National license negotiations advancing the OA transition...UKSG 2018 Plenary - National license negotiations advancing the OA transition...
UKSG 2018 Plenary - National license negotiations advancing the OA transition...UKSG: connecting the knowledge community
 
Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...
Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...
Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...OpenAIRE
 
Making Data FAIR on WikiData - Andra Waagmeester
Making Data FAIR on WikiData - Andra WaagmeesterMaking Data FAIR on WikiData - Andra Waagmeester
Making Data FAIR on WikiData - Andra WaagmeesterOpenAIRE
 
Zenodo and linking Open Science - Nielsen Lars Holm
Zenodo and linking Open Science - Nielsen Lars HolmZenodo and linking Open Science - Nielsen Lars Holm
Zenodo and linking Open Science - Nielsen Lars HolmOpenAIRE
 
ICIC 2014 New Product Introduction InfoChem
ICIC 2014 New Product Introduction InfoChemICIC 2014 New Product Introduction InfoChem
ICIC 2014 New Product Introduction InfoChemDr. Haxel Consult
 
Research data: what can libraries do?
Research data: what can libraries do?Research data: what can libraries do?
Research data: what can libraries do?Zaven Hakopov
 
Smart Data Applications powered by the Wikidata Knowledge Graph
Smart Data Applications powered by the Wikidata Knowledge GraphSmart Data Applications powered by the Wikidata Knowledge Graph
Smart Data Applications powered by the Wikidata Knowledge GraphPeter Haase
 
Illuminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportIlluminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportPascal-Nicolas Becker
 

La actualidad más candente (19)

ICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxonICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxon
 
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 2019042501 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425
 
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
 
ICIC 2013 New Product Introductions Minesoft
ICIC 2013 New Product Introductions MinesoftICIC 2013 New Product Introductions Minesoft
ICIC 2013 New Product Introductions Minesoft
 
Cogapp Open Studios 2012 - Adventures with Linked Data
Cogapp Open Studios 2012 - Adventures with Linked DataCogapp Open Studios 2012 - Adventures with Linked Data
Cogapp Open Studios 2012 - Adventures with Linked Data
 
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
Digital Humanities Clinics – Leading Dutch Librarians into DH. Lotte Wilms, N...
 
ICIC 2014 New Product Presentations ChemAxon
ICIC 2014 New Product Presentations ChemAxon ICIC 2014 New Product Presentations ChemAxon
ICIC 2014 New Product Presentations ChemAxon
 
ICIC 2017: New Poduct presentations InfoChem
ICIC 2017: New Poduct presentations InfoChemICIC 2017: New Poduct presentations InfoChem
ICIC 2017: New Poduct presentations InfoChem
 
ICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChemICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChem
 
Discovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data PortalsDiscovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data Portals
 
Adoption and Integration of Persistent Identifiers in European Research Infor...
Adoption and Integration of Persistent Identifiers in European Research Infor...Adoption and Integration of Persistent Identifiers in European Research Infor...
Adoption and Integration of Persistent Identifiers in European Research Infor...
 
UKSG 2018 Plenary - National license negotiations advancing the OA transition...
UKSG 2018 Plenary - National license negotiations advancing the OA transition...UKSG 2018 Plenary - National license negotiations advancing the OA transition...
UKSG 2018 Plenary - National license negotiations advancing the OA transition...
 
Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...
Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...
Scaling Usage Statistics across Repositories as an OpenAIRE Analytics Service...
 
Making Data FAIR on WikiData - Andra Waagmeester
Making Data FAIR on WikiData - Andra WaagmeesterMaking Data FAIR on WikiData - Andra Waagmeester
Making Data FAIR on WikiData - Andra Waagmeester
 
Zenodo and linking Open Science - Nielsen Lars Holm
Zenodo and linking Open Science - Nielsen Lars HolmZenodo and linking Open Science - Nielsen Lars Holm
Zenodo and linking Open Science - Nielsen Lars Holm
 
ICIC 2014 New Product Introduction InfoChem
ICIC 2014 New Product Introduction InfoChemICIC 2014 New Product Introduction InfoChem
ICIC 2014 New Product Introduction InfoChem
 
Research data: what can libraries do?
Research data: what can libraries do?Research data: what can libraries do?
Research data: what can libraries do?
 
Smart Data Applications powered by the Wikidata Knowledge Graph
Smart Data Applications powered by the Wikidata Knowledge GraphSmart Data Applications powered by the Wikidata Knowledge Graph
Smart Data Applications powered by the Wikidata Knowledge Graph
 
Illuminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportIlluminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data Support
 

Similar a ProteomeXchange update

PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataJuan Antonio Vizcaino
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateJuan Antonio Vizcaino
 
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Juan Antonio Vizcaino
 
Mass spectrometry resources at the EBI
Mass spectrometry resources at the EBIMass spectrometry resources at the EBI
Mass spectrometry resources at the EBIJuan Antonio Vizcaino
 
An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...Juan Antonio Vizcaino
 
Big Data and its Role in Biomedical Research
Big Data and its Role in Biomedical ResearchBig Data and its Role in Biomedical Research
Big Data and its Role in Biomedical ResearchPhilip Bourne
 
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...Juan Antonio Vizcaino
 
AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014Juan Antonio Vizcaino
 
Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Juan Antonio Vizcaino
 

Similar a ProteomeXchange update (20)

ProteomeXchange update
ProteomeXchange updateProteomeXchange update
ProteomeXchange update
 
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
 
PRIDE and ProteomeXchange
PRIDE and ProteomeXchangePRIDE and ProteomeXchange
PRIDE and ProteomeXchange
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 update
 
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
 
Mass spectrometry resources at the EBI
Mass spectrometry resources at the EBIMass spectrometry resources at the EBI
Mass spectrometry resources at the EBI
 
Proteomexchange
ProteomexchangeProteomexchange
Proteomexchange
 
Pride and ProteomeXchange
Pride and ProteomeXchangePride and ProteomeXchange
Pride and ProteomeXchange
 
An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
ProteomeXchange update HUPO 2016
ProteomeXchange update HUPO 2016ProteomeXchange update HUPO 2016
ProteomeXchange update HUPO 2016
 
PRIDE resources and ProteomeXchange
PRIDE resources and ProteomeXchangePRIDE resources and ProteomeXchange
PRIDE resources and ProteomeXchange
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
Big Data and its Role in Biomedical Research
Big Data and its Role in Biomedical ResearchBig Data and its Role in Biomedical Research
Big Data and its Role in Biomedical Research
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
PRIDE-ProteomeXchange
PRIDE-ProteomeXchangePRIDE-ProteomeXchange
PRIDE-ProteomeXchange
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
 
AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014
 
Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...
 

Más de Juan Antonio Vizcaino

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Juan Antonio Vizcaino
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formatsJuan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Juan Antonio Vizcaino
 
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...Juan Antonio Vizcaino
 
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...Juan Antonio Vizcaino
 
How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Juan Antonio Vizcaino
 
Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Juan Antonio Vizcaino
 
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...Juan Antonio Vizcaino
 
Enabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataEnabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataJuan Antonio Vizcaino
 
Introduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRIntroduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRJuan Antonio Vizcaino
 
The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Juan Antonio Vizcaino
 

Más de Juan Antonio Vizcaino (19)

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formats
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018
 
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
 
PSI-Proteome Informatics update
PSI-Proteome Informatics updatePSI-Proteome Informatics update
PSI-Proteome Informatics update
 
The ELIXIR Proteomics community
The ELIXIR Proteomics community The ELIXIR Proteomics community
The ELIXIR Proteomics community
 
The ELIXIR Proteomics Community
The ELIXIR Proteomics CommunityThe ELIXIR Proteomics Community
The ELIXIR Proteomics Community
 
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
 
How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017
 
Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?
 
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
 
Enabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataEnabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics data
 
Introduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRIntroduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIR
 
The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)
 
Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016
 
Reuse of public data in proteomics
Reuse of public data in proteomicsReuse of public data in proteomics
Reuse of public data in proteomics
 

ProteomeXchange update

  • 1. Status of ProteomeXchange Dr. Juan Antonio Vizcaíno EMBL-EBI Hinxton, Cambridge, UK
  • 2. PSI meeting 2017 Ghent, 18 April 2016 Overview • Introduction and status • Submission and citation statistics • New prospective members: jPOST and iPROX • OmicsDI interface
  • 3. ProteomeXchange Consortium • Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories. • Includes PeptideAtlas (ISB, Seattle), PRIDE (Cambridge, UK) and (very recently) MassIVE (UCSD, San Diego). • Common identifier space (PXD identifiers) • Two supported data workflows: MS/MS and SRM. • Main objective: Make life easier for researchers http://www.proteomexchange.org
  • 4. ProteomeXchange data workflow (figure; Vizcaíno et al., Nat Biotechnol, 2014). Figure labels: Metadata / Manuscript, Raw Data*, Results; ProteomeCentral (Metadata); Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data), Other DBs; Researcher’s results, Reprocessed results, Raw data*; Journals, UniProt/neXtProt, PeptideAtlas, GPMDB, Other DBs.
  • 5. Complete vs Partial submissions: processed results (panels: Complete, Partial). For complete submissions, the spectra can be connected with the processed identification results and visualized.
  • 6. Complete vs Partial submissions: experimental metadata (panels: Complete, Partial). General experimental metadata about the projects is similar. However, assay-level information in partial submissions is less detailed.
  • 7. Complete submissions: search engine results + MS files. Search engines and tools that produce mzIdentML:
      - Mascot
      - MSGF+
      - MyriMatch and related tools from D. Tabb’s lab
      - OpenMS
      - PEAKS
      - PeptideShaker
      - ProCon (ProteomeDiscoverer, Sequest)
      - Scaffold
      - TPP via the idConvert tool (ProteoWizard)
      - ProteinPilot (from version 5.0)
      - X!Tandem native conversion (Beta, PILEDRIVER)
      - Others: library for X!Tandem conversion, lab internal pipelines, …
      - Crux
    An increasing number of tools support export to mzIdentML 1.1. Referenced spectral files need to be submitted as well (all open formats are supported). Updated list: http://www.psidev.info/tools-implementing-mzIdentML#.
  • 8. Status of ProteomeXchange • No changes in the Consortium during 2015. • Grant ‘ProteomeXchange 2’ was refined and resubmitted to the joint NSF/BBSRC call, but it was not successful. • Prospective members: • jPOST (Japan). Dedicated funding for 3 years. • iProX (China). • BBSRC Partnering grants with China and Japan obtained to help with the process. • No further contacts with other proteomics resources.
  • 9. Overview • Introduction and status • Submission and citation statistics • New prospective members: jPOST and iPROX • OmicsDI interface
  • 10. ProteomeXchange: 3,802 datasets up until 1st April, 2016
      Origin: 885 USA, 465 Germany, 342 United Kingdom, 264 China, 194 France, 158 Netherlands, 136 Canada, 126 Switzerland, 107 Denmark, 104 Spain, 99 Australia, 95 Japan, 72 Belgium, 68 Austria, 63 Sweden, 61 India, 51 Norway, 43 Taiwan, 30 Italy, 29 Brazil, 28 Singapore, 28 Finland, 27 Ireland, 27 Russia, 26 Israel, …
      Type: 2429 PRIDE partial, 1016 PRIDE complete, 250 MassIVE, 84 PeptideAtlas/PASSEL complete, 23 Reprocessed
      Publicly accessible: 1973 datasets (52% of all): 91% PRIDE, 5% MassIVE, 4% PASSEL
      Data volume: total ~220 TB; number of all files: ~560,000
      Datasets/year: 2012: 102, 2013: 527, 2014: 963, 2015: 1758, 2016: 452
      Top species (studied by at least 20 datasets): 1526 Homo sapiens, 485 Mus musculus, 150 Saccharomyces cerevisiae, 121 Arabidopsis thaliana, 102 Rattus norvegicus, 86 Escherichia coli, 44 Bos taurus, 35 Drosophila melanogaster, 32 Glycine max; ~700 species in total
  • 11. PRIDE Archive submitted datasets up until 1st April, 2016 • In the last year: ~150 submitted datasets per month • Size: ~210 TB
  • 12. PRIDE Archive: Size comparison with other EBI resources (May 2015). Chart: data accumulation by resource, bytes (log scale, 1.E+07 to 1.E+17) against date (2004 to 2016), for Metabolites, PRIDE, EGA, ENA (less AE) and AE. Chart generated by Guy Cochrane.
  • 13. Data reuse is increasing. Data download volume in 2015: ~200 TB
  • 14. Which are the most accessed datasets? (total number of hits)
  • 15. Citation statistics: top cited paper (citations/year) in proteomics in NBT
  • 16. Overview • Introduction and status • Submission and citation statistics • New prospective members: jPOST and iPROX • OmicsDI interface
  • 17. jPOST Features (Slice) Slides from Y. Ishihama
  • 18. jPOST Project (April 2015 – March 2018). The jPOST project is supported by the National Bioscience Database Center, Japan Science and Technology Agency (NBDC-JST). • Set the main servers (Dec, 2015) • Use the PSI terminology for data registration • Preparation of demo site for jPOST repository (until this meeting). jPOST Repository ready for PX partnership. • jPOST Repository start (May 2, 2016) • jPOST Database start (2017) (www.jpost.org)
  • 19. jPOST Repository site (May 2 ~: www.jpost.org)
  • 20. iProX: integrated proteome resources in China. At present, iProX contains: • 225 projects • 834 subprojects • 15398 data files • Most of the data come from the CNHPP http://www.iprox.org Slides from Y. Zhu
  • 21. Providing stable service to users. iProX diagram (labels): iProX submission system; iProX proteome database; dataset import and management; user information; MS/MS data processing pipeline; experiment raw files and metadata; information on datasets and identifications.
  • 22. Updates • Two full-time curators • Chunyuan Yang, Ph.D. in medical genetics • Xue Wang, M.Sc. in bioinformatics • Aspera license upgraded from 100 Mbps to 500 Mbps • High availability: hot standby • Will be deployed on a cloud platform in May, 2016 • Move to Network Information Center, Chinese Academy of Sciences • Internet connection for service will exceed 1 Gbps • Remote backup in Shanghai, China
  • 23. Overview • Introduction and status • Submission and citation statistics • New prospective members: jPOST and iPROX • OmicsDI interface
  • 24. ProteomeCentral: Portal for all PX datasets http://proteomecentral.proteomexchange.org/cgi/GetDataset
  • 25. OmicsDI: Portal for omics datasets http://www.ebi.ac.uk/Tools/omicsdi/ • Aims to integrate ‘omics’ datasets (genomics, proteomics and metabolomics at present). Not only EBI resources are included. Resources shown: PRIDE Archive, MassIVE, PASSEL, GPMDB, MetaboLights, Metabolomics Workbench, GNPS, EGA
  • 26. Acknowledgements: People. PRIDE team: Attila Csordas, Tobias Ternent, Noemi del Toro, Gerhard Mayer (Bochum, de.NBI), Johannes Griss, Yasset Perez-Riverol, Henning Hermjakob. Former team members: Rui Wang, Florian Reisinger and Jose A. Dianes. Acknowledgements: PX partners. Eric Deutsch, Nuno Bandeira, Yasushi Ishihama (jPOST team), Yunping Zhu (iPROX team)
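
Slide 7 notes that, for a complete submission, the spectral files referenced from the mzIdentML file need to be submitted as well. The following is a minimal sketch of how a submitter might check this before uploading, assuming the references live in the `location` attribute of `SpectraData` elements (as in mzIdentML 1.1). The function names `referenced_spectra_files` and `missing_spectra_files` and the example document are hypothetical and not part of any ProteomeXchange tool.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def referenced_spectra_files(mzid_xml: str) -> list[str]:
    """Return the file names referenced by SpectraData elements (hypothetical helper)."""
    root = ET.fromstring(mzid_xml)
    names = []
    for elem in root.iter():
        # Compare only the local tag name so the namespace version does not matter.
        if elem.tag.rsplit("}", 1)[-1] != "SpectraData":
            continue
        location = elem.get("location")
        if not location:
            continue
        # Locations may be plain file names or file:// URIs; keep the base name only.
        path = unquote(urlparse(location).path) or location
        names.append(PurePosixPath(path.replace("\\", "/")).name)
    return names


def missing_spectra_files(mzid_xml: str, submitted_files: list[str]) -> list[str]:
    """List referenced spectra files absent from the submission (case-insensitive)."""
    submitted = {name.lower() for name in submitted_files}
    return [n for n in referenced_spectra_files(mzid_xml) if n.lower() not in submitted]


# Made-up, heavily trimmed mzIdentML fragment used only for illustration.
EXAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1" version="1.1.0">
  <DataCollection>
    <Inputs>
      <SpectraData id="SD_1" location="file:///data/run1.mgf"/>
      <SpectraData id="SD_2" location="run2.mzML"/>
    </Inputs>
  </DataCollection>
</MzIdentML>"""

print(missing_spectra_files(EXAMPLE, ["results.mzid", "run1.mgf"]))
```

Matching on base names only is a simplification: two spectra files with the same name in different directories would be treated as one, so a real check may need to compare full paths.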