THE WORLD OF BIOCURATION
OPTIMIZING ITS IMPACT
April 7, 2014—Seventh International Biocuration Conference
WHAT IS A BIOCURATOR?
SOMEONE WHO IS RESPONSIBLE FOR THE CARE AND SUPERVISION OF BIOLOGICAL KNOWLEDGE RESOURCES AND THEIR USE
WHAT DO BIOCURATORS DO TODAY?
• Credits to Kaveh Bazargan ᔥ
• @kaveh1000
FRUIT IN FOOD PROCESSOR
SMOOTHIE
RESEARCH
RESEARCH IN WORD PROCESSOR
PDF
FRUIT??
RESEARCH??
?
RESEARCH??
YOU, THE BIOCURATOR
BIOCURATORS OF THE WORLD UNITE!
• You have nothing to lose but your PDF files
X
OUR ROLE IN THE RESEARCH LIFE CYCLE
THE WORLD OF BIOCURATION
http://www.nbcnews.com/id/49258816/ns/technology_and_science-science/t/live-concert-microbial-data-turned-song-lab/#.UzSB9ceT4_E
DESIGNING EXPERIMENTS
http://www.langdonbiology.org/AP/labs/Notebook/AP_notebook.htm
COLLECTING DATA
Thomas Nast - http://www.victorianweb.org/art/illustration/nast/51.jpg
WRITING UP RESULTS
http://rrresearch.fieldofscience.com/2012_02_01_archive.html
REVIEWING CONCLUSIONS
CAPTURING KNOWLEDGE
ISB
CAPTURING KNOWLEDGE
DESIGNING EXPERIMENTS
COLLECTING DATA
REVIEWING CONCLUSIONS
WRITING UP RESULTS
~300 BIOCURATORS
BIOCURATION INVERSION
DESIGNING EXPERIMENTS
COLLECTING DATA
WRITING UP RESULTS
REVIEWING CONCLUSIONS
CAPTURING KNOWLEDGE
http://www.nsf.gov/statistics/nsf13331/pdf/nsf13331.pdf
HUNDREDS OF THOUSANDS OF GRAD STUDENTS
POST-DOCS
LABORATORIES
JOURNALS
IN THE LAB
EARLY INTERVENTION: SUPPORTING STANDARDS
• Promote community-accepted identifiers, ontologies, & formats
SUPPORT STANDARDS, THEY’RE OUR FRIEND
• November, 1999
• 45 biologists
• 14 days
• 140 megabases of Drosophila genome
• Published in March 2000
GENE ONTOLOGY, ET AL.
QUEST FOR ORTHOLOGS
• 30 phylogenomic databases
• Vary in # of species, taxonomic range, sampling density, and methodology
• Joint benchmarking effort
• Only possible through the use of shared reference proteomes and formats
questfororthologs.org/ — www.ebi.ac.uk/reference_proteomes
EARLY INTERVENTION: SUPPORTING STANDARDS
• Promote community-accepted identifiers, ontologies, & formats
• Develop and follow guidelines (paper and web-based)
• e.g. Gaudet, P., et al. Towards BioDBcore: a community-defined
information specification for biological databases. Database
2011. PMCID: PMC3017395
• Resource Identification Initiative
• www.force11.org/Resource_identification_initiative
• Vasilevsky NA, et al. On the reproducibility of science: unique
identification of research resources in the biomedical literature.
PeerJ. 2013 Sep 5;1:e148. doi: 10.7717/peerj.148. PubMed
PMID: 24032093; PubMed Central PMCID: PMC3771067.
EARLY INTERVENTION: SUPPORTING STANDARDS
• Promote community-accepted identifiers, ontologies, & formats
• Embed community-accepted standards in the lab environment
KNOCKOUT MOUSE PROJECT 2
• Broad standardized phenotyping of knockout mice on a standard genetic background
• Data collection from many centres
• www.mousephenotype.org
Cindy Smith
PROTOCOLS ARE STANDARDIZED
REQUIRE USE OF PARTICULAR ONTOLOGY TERMS TO DESCRIBE PHENOTYPE
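One way to picture "require use of particular ontology terms" is a check at data submission time. The sketch below is only an illustration: the term identifiers, labels, centre names, and the `validate` helper are all hypothetical and are not taken from the project's actual pipeline.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary (placeholder IDs and labels).
# A real pipeline would load terms from the ontology itself.
ALLOWED_TERMS = {
    "EX:0000001": "increased body weight",
    "EX:0000002": "abnormal glucose homeostasis",
}


@dataclass
class PhenotypeRecord:
    centre: str   # submitting centre (hypothetical names below)
    term_id: str  # what the centre entered as the phenotype


def validate(records):
    """Split records into (accepted, rejected) by membership in ALLOWED_TERMS."""
    accepted, rejected = [], []
    for record in records:
        if record.term_id in ALLOWED_TERMS:
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


records = [
    PhenotypeRecord("centre_a", "EX:0000001"),
    PhenotypeRecord("centre_b", "heavy mouse"),  # free text, not an ontology term
]
accepted, rejected = validate(records)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Rejecting free text at entry time is what makes data from many centres directly comparable later.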
EARLY INTERVENTION: SUPPORTING STANDARDS
• Promote community-accepted identifiers, ontologies, & formats
• Embed community-accepted standards in the lab environment
• Work with labs to embed standards into their data generation pipeline
EARLY INTERVENTION: SUPPORTING STANDARDS
• Promote community-accepted identifiers, ontologies, & formats
• Embed community-accepted standards in the lab environment
• Stealth standards
STANDARDS THROUGH UTILITY: APOLLO
CSIRO VIDEO: DEMO AT GENOMEARCHITECT.ORG
TOOLS FOR THE COMMUNITY
• Web-based so researchers anywhere have access
• Concurrent access supports real-time collaboration
• Built-in support for standards (transparently compliant)
• Automatic generation of ready-made computable data
• Client-side application relieves server bottleneck and supports privacy
EARLY INTERVENTION: SUPPORTING STANDARDS
• Promote community-accepted identifiers, ontologies, & formats
• Embed community accepted standards in the lab environment
• Stealth standards
• Re-purpose internal curation tools for external users
• Provide on-line documentation, hands-on training and rapid-response user
help
• Work with educators to make these tools an integral part of the curriculum
• e.g. CACAO (Critical Assessment of Community Annotation using
Ontologies), ecoliwiki.net/colipedia/index.php/CACAO_0.1
• DNA subway (Apollo)
SUBMISSION
• CANTO: curation.pombase.org
• Structured Digital Abstracts
• Identifiers for all named genes, proteins, metabolites or other objects in the
article
• Main results described in simple ontology terms
• Experimental evidence types
• Not only a synopsis of the results but computer-readable
• Gerstein, M., et al. Structured digital abstract makes text mining easy.
Nature 447, 142 (10 May 2007) | doi:10.1038/447142a.
• Minimal Information reporting guidelines
• http://mibbi.sourceforge.net/portal.shtml
SUBMITTING DATA: IN A STRUCTURED WAY
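To make the idea of a structured digital abstract concrete, here is a minimal sketch of what a computer-readable summary might contain: identifiers for the named objects, results as simple ontology-style statements, and an evidence type for each. The field names, identifiers, and DOI are illustrative assumptions, not the format proposed by Gerstein et al. or used by CANTO.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Finding:
    subject_id: str     # identifier of the gene, protein or metabolite
    relation: str       # simple ontology-style term describing the result
    object_id: str      # identifier of the process or object it relates to
    evidence_type: str  # kind of experiment supporting the statement


@dataclass
class StructuredAbstract:
    article_doi: str
    entities: dict = field(default_factory=dict)  # identifier -> name used in the article
    findings: list = field(default_factory=list)


# All identifiers below are placeholders made up for this example.
abstract = StructuredAbstract(
    article_doi="10.0000/example",
    entities={"EX:GENE1": "example gene 1", "EX:PROC1": "example process"},
    findings=[
        Finding("EX:GENE1", "involved_in", "EX:PROC1", "mutant phenotype evidence"),
    ],
)

# The same content a reader sees as a synopsis, serialized for a machine.
serialized = json.dumps(asdict(abstract), indent=2, sort_keys=True)
print(serialized)
```

Because every statement carries identifiers rather than free-text names, a database can ingest it without text mining.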
PUBLISHING
• First there were letters
• Then Henry Oldenburg created the first scientific journal in 1665
• Result: too much to absorb
Washed away on the sea of information
PEER AND EDITORIAL REVIEW BECAME A FILTER
CONSEQUENTLY…
• Figshare: figshare.org
• iDigBio: www.idigbio.org
• Dryad: datadryad.org
• eLife: www.elifesciences.org
• Unlike journal articles, the scale of web-native
publishing may overwhelm attempts at manual
curation (using current strategies)
THE MEDIUM OF PUBLICATION IS CHANGING
DO WE NEED TO CURATE?
SCHOLARSHIP: BEYOND THE PAPER. JASON PRIEM.
NATURE 495, 437–440 (28 MARCH 2013)
“…powerful, online filters will distill communities’ impact judgements algorithmically”
SOME SAY NO…
DO WE NEED TO CURATE?
• Resolution of differences
• Clarity, eliminating noise
• Validation & design of automated methods
EVEN A PLACE LIKE GOOGLE USES CURATORS (*AND SOFTWARE)
• Hundreds of operators per country
• Multiple kinds of errors: overlapping jurisdictions, accidental merges, mismatches between road maps and satellite images, etc.
• Every road that you see has been hand-massaged
http://www.theatlantic.com/technology/archive/2012/09/how-google-builds-its-maps-and-what-it-means-for-the-future-of-everything/261913/
DO WE NEED TO CURATE?
• Resolution of differences
• Clarity, eliminating noise
• Validation & design of automated methods
CLARITY
• Answer boxes: Quick answers to concrete questions
• Much of this information comes from Freebase, which is structured in terms of entities and properties
Robert West, et al. Knowledge Base Completion via Search-Based Question Answering. http://www.cs.ubc.ca/~murphyk/Papers/www14.pdf
WWW’14 April 7–11, 2014, Seoul, Korea. ACM 978-1-4503-2744-2/14/04.
DOI:2568032
DO WE NEED TO CURATE?
• Resolution of differences
• Clarity, eliminating noise
• Validation & design of automated methods
• PDF is still the dominant form of distribution
• PDF “Annotation”
• UTOPIA, www.utopiadocs.com
• DOMEO, swan.mindinformatics.org
• Textpresso, www.textpresso.org
• All of these are still lacking domain specifics (or need to be taught)
• FORCE11, www.force11.org
• Common goal is advancing scientific communications
• Beyond the PDF
LITERATURE IS INFORMATIVE BUT IS NOT INFORMATION
X
VALIDATION AND DESIGN OF AUTOMATED METHODS
• Requires trusted reference datasets!
• Biocurators are partners with developers!
Cycle: write/modify software → run the algorithm → evaluate results
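The "evaluate results" step of this cycle is where a trusted reference dataset comes in. As a minimal sketch, assuming the curated reference and the algorithm's output can both be expressed as sets of (gene, term) pairs, precision and recall can be computed as follows. The gene and term names are invented for illustration.

```python
def precision_recall(predicted, reference):
    """Compare predicted annotations with a curated reference set.

    Returns (precision, recall); each is 0.0 when its denominator is empty.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical curated reference and hypothetical algorithm output.
reference = {("geneA", "termX"), ("geneB", "termY"), ("geneC", "termZ")}
predicted = {("geneA", "termX"), ("geneB", "termQ")}

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Without the curated set there is nothing to score against, so each pass through the write, run, evaluate loop depends on curator effort.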
SCHOLARSHIP: BEYOND THE PAPER. JASON PRIEM.
NATURE 495, 437–440 (28 MARCH 2013)
“…powerful, online filters will distill communities’ impact judgements algorithmically”
DO WE NEED TO CURATE?
THE PARABLE OF GOOGLE FLU: TRAPS IN BIG DATA ANALYSIS. DAVID LAZER ET AL. SCIENCE 14 MARCH 2014: VOL. 343 NO. 6176 PP. 1203-1205
“‘Big data hubris’ is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.”
DO WE NEED TO CURATE?
DO WE NEED TO CURATE?
• Yes
• But…
SYSTEMATIC REVIEW & CRITICISM IS REQUIRED
OUR STRENGTH IS IN QUALITY OF THE INFORMATION WE CAN PROVIDE
CUSICK, M., ET AL. LITERATURE-CURATED PROTEIN INTERACTION DATASETS
NAT METHODS. JAN 2009; 6(1): 39–46.
PMCID: PMC2683745
“…literature curated datasets have inherent
reliability difficulties…”
HOW CAN BIOCURATORS ADDRESS CRITICISMS?
GREENBERG, S., HOW CITATION DISTORTIONS CREATE UNFOUNDED AUTHORITY: ANALYSIS OF A CITATION NETWORK
BMJ JULY 2009; 339 DOI: HTTP://DX.DOI.ORG/10.1136/
THE RISK (BY ANALOGY)
WE'RE RESPONSIBLE FOR THE QUALITY
• “Reviewing the quality of the data is an obligation of
any entity that assumes responsibility over the data.”
• Limor Peer et al., IDCC 2014
PAINT APOPTOSIS - SUMMARY
• 52 families annotated:
- 8 were participants in execution phase of apoptosis;
• 44 others are either:
A. upstream of apoptosis
B. phenotypes
C. targets
Example 1: Protein (cytochrome c) upstream of
apoptosis execution
Cytochrome c is directly involved in apoptotic DNA fragmentation
➢ [Cells] – [cytochrome c] = No apoptotic DNA fragmentation
➢ [Cells] – [cytochrome c] + [cytochrome c] = apoptotic DNA fragmentation
Example 2: Phenotype of reduced cell survival and
increased DNA fragmentation
• E3 ubiquitin-protein ligase TRAF7 was annotated to execution phase of apoptosis
➢ Exogenous expression of TRAF7
➢ No other data in terms of where in apoptosis this may be.
➢ All we know is altering TRAF7 levels affects apoptosis.
Example 3: Target
DSG2 was annotated to execution phase of
apoptosis
DSG2 is a *target* of a protease (caspase), and
although its degradation indeed seems to be a part of
apoptosis it does not *mediate* apoptosis.
PROVE THE NEED FOR BIOCURATION
• Publish: Quantitative improvements before/after
• Publish: Curator consistency studies
• Publish: Independent external reviews
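A curator consistency study needs some agreement statistic. One common choice, used here as an assumption rather than something the slides prescribe, is Cohen's kappa, which corrects raw agreement between two curators for the agreement expected by chance. The labels below borrow the execution/upstream/target categories from the apoptosis review, but the data are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two curators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two curators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each curator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both curators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented example: two curators classify the same four protein families.
curator_1 = ["execution", "upstream", "target", "execution"]
curator_2 = ["execution", "upstream", "execution", "execution"]

kappa = cohens_kappa(curator_1, curator_2)
print(f"kappa = {kappa:.3f}")
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance, which gives a publishable, quantitative measure of consistency.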
RECOGNITION & CREDIT
ORCID.ORG
ENABLING RESEARCH
WHAT IS A BIOCURATOR?
• A highly skilled and trained keeper of our biological
heritage of knowledge.
• A content specialist who understands the research and
can succinctly distill biological research results into
computable data
• Considers the ease of finding this information, its
relatedness to other information, and its research and
educational usability
[Figure: genotype and phenotype profiles compared across species]
GENOTYPE
• B6.Cg-Alms1foz/fox/J
• ALMS1 (NM_015120.4) [c.10775delC] + [-]
• kcnj11c14/c14; insrt143/+(AB)
PHENOTYPE
• increased weight, adipose tissue volume, glucose homeostasis altered
• obesity, diabetes mellitus, insulin resistance
• increased food intake, hyperglycemia, insulin resistance
MODELS RECAPITULATE VARIOUS PHENOTYPIC ASPECTS OF DISEASE
?
RESEARCH RESOURCES
Doelken S C et al. Dis. Model. Mech. 2013;6:358-372
Smedley D et al. Database. 2013; bat025
Mungall CJ et al. Genome Biol. 2010; 11(1):R2
Washington N et al. Plos Biol 2009; e1000247
CROSS-SPECIES PHENOTYPE COMPARISONS BY SEMANTIC SIMILARITY
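To give a feel for semantic similarity between phenotype profiles, the sketch below expands each profile with its ancestor terms in a toy ontology and compares the expanded sets with a Jaccard index. This is a deliberately simple stand-in: the cited papers use their own measures, and the ontology fragment here is invented for illustration.

```python
# Toy ontology fragment: term -> list of parent terms (invented for this example).
PARENTS = {
    "obesity": ["increased body weight"],
    "increased body weight": ["abnormal body weight"],
    "abnormal body weight": ["phenotype"],
    "hyperglycemia": ["abnormal glucose homeostasis"],
    "insulin resistance": ["abnormal glucose homeostasis"],
    "abnormal glucose homeostasis": ["phenotype"],
    "phenotype": [],
}


def with_ancestors(terms):
    """Return the given terms together with all of their ancestors."""
    closure, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in closure:
            closure.add(term)
            stack.extend(PARENTS.get(term, []))
    return closure


def jaccard_similarity(profile_a, profile_b):
    """Jaccard index over the ancestor-expanded term sets of two profiles."""
    a, b = with_ancestors(profile_a), with_ancestors(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


human_profile = ["obesity", "insulin resistance"]
mouse_profile = ["increased body weight", "abnormal glucose homeostasis"]

score = jaccard_similarity(human_profile, mouse_profile)
print(f"similarity = {score:.2f}")
```

The two profiles share no identical terms, yet they score well above zero because the ontology records that their terms are related. That is the benefit of curating to an ontology rather than to free text.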
CANDIDATE GENE PRIORITIZATION
PHENOTYPIC INTERPRETATION OF VARIANTS IN EXOMES (PHIVE)
Whole exome
Remove off-target and
common variants
Variant score
from allele freq and pathogenicity
Phenotype score
from phenotypic similarity
PhenIX/PhIVE score
to give final candidates
http://monarchinitiative.org	
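The prioritization flow above (filter variants, score each variant, score the phenotype match, combine) can be sketched in a few lines. Everything numeric here is an assumption made for illustration: the frequency cutoff, the variant-score formula, and the plain average used to combine scores are not the published PhIVE/PhenIX method, and the pathogenicity and phenotype scores are taken as precomputed inputs.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    on_target: bool          # falls inside the exome capture target
    allele_frequency: float  # population frequency, 0..1
    pathogenicity: float     # 0..1, assumed to be precomputed
    phenotype_score: float   # 0..1 phenotypic similarity, assumed precomputed


MAX_ALLELE_FREQUENCY = 0.01  # assumed cutoff for a "common" variant


def variant_score(variant):
    """Hypothetical score: rarer and more pathogenic variants score higher."""
    rarity = 1.0 - min(variant.allele_frequency / MAX_ALLELE_FREQUENCY, 1.0)
    return rarity * variant.pathogenicity


def rank_candidates(variants):
    """Filter off-target and common variants, then rank genes by combined score."""
    kept = [
        v for v in variants
        if v.on_target and v.allele_frequency <= MAX_ALLELE_FREQUENCY
    ]
    # Assumed combination rule: simple average of the two scores.
    scored = [(v.gene, (variant_score(v) + v.phenotype_score) / 2) for v in kept]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


variants = [
    Variant("GENE_A", True, 0.0, 0.9, 0.8),
    Variant("GENE_B", True, 0.2, 0.9, 0.9),    # common variant, filtered out
    Variant("GENE_C", False, 0.0, 0.9, 0.9),   # off-target, filtered out
    Variant("GENE_D", True, 0.005, 0.6, 0.2),
]

ranking = rank_candidates(variants)
for gene, combined in ranking:
    print(gene, round(combined, 3))
```

The phenotype score is the part that depends on curated, ontology-based phenotype data from model organisms and human disease.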
  
CONFIRMED DIAGNOSES
• Infantile Parkinsonism-dystonia
• Wiedemann Steiner syndrome
• de novo SYNGAP1 mutation leading to autosomal dominant mental retardation
• Frank-ter Haar syndrome
• Infantile hypophosphatasia
• … (~28%)
RELATEDNESS ACROSS BIOLOGY
• Bio-Curator, not bio-Archivist
• Actively trying to represent current best understanding
• Support interoperability
• Support research and educational usability
• Support inference
• Not just for supporting searches, not just for finding
PDF/online papers!
WHAT CAN BE DONE?
BIODIVERSITY DATA JOURNAL
FROM WRITING, SUBMISSION, PEER-REVIEW, EDITING, PUBLICATION TO DISSEMINATION!
WHAT CAN ISB DO?
• Tangible support of standards efforts
• QfO, RII, MI, publish guidelines, validators …
• Create a curation mindset across the entire life cycle
• Support embedded/repurposed software, education, actively
engage with text-miners, provide on-line support …
• Prove the necessity for curation
• Publish studies, greater emphasis on review and quality (assessment)
• Work with traditional publishers
• FORCE11, structured submissions
WHAT CAN YOU DO?
• Consider
• The ease of finding information
• Its relatedness to other information
• Its research and educational usability
RESEARCH??
YOU, THE BIOCURATOR
ISB
ACKNOWLEDGEMENTS AND THANKS
YOU ARE NOT ALONE

Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 

Último (20)

Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 

Lewis isb 7 april2014

  • 1. THE WORLD OF BIOCURATION: OPTIMIZING ITS IMPACT. April 7, 2014, Seventh International Biocuration Conference
  • 2. WHAT IS A BIOCURATOR? SOMEONE WHO IS RESPONSIBLE FOR THE CARE AND SUPERVISION OF BIOLOGICAL KNOWLEDGE RESOURCES AND THEIR USE
  • 3. WHAT DO BIOCURATORS DO TODAY? • Credits to Kaveh Bazargan ᔥ • @kaveh1000
  • 4.
  • 5. FRUIT IN FOOD PROCESSOR
  • 6. SMOOTHIE
  • 7. RESEARCH
  • 8. RESEARCH IN WORD PROCESSOR
  • 10. FRUIT??
  • 11. RESEARCH???
  • 12. RESEARCH?? YOU, THE BIOCURATOR
  • 13. BIOCURATORS OF THE WORLD UNITE! • You have nothing to lose but your PDF files
  • 14. THE WORLD OF BIOCURATION: OUR ROLE IN THE RESEARCH LIFE CYCLE
  • 18. WRITING UP RESULTS. Image: Thomas Nast, http://www.victorianweb.org/art/illustration/nast/51.jpg
  • 20. CAPTURING KNOWLEDGE
  • 21. ISB: CAPTURING KNOWLEDGE, DESIGNING EXPERIMENTS, COLLECTING DATA, REVIEWING CONCLUSIONS, WRITING UP RESULTS
  • 22. BIOCURATION INVERSION. ~300 BIOCURATORS. DESIGNING EXPERIMENTS, COLLECTING DATA, WRITING UP RESULTS, REVIEWING CONCLUSIONS, CAPTURING KNOWLEDGE. HUNDREDS OF THOUSANDS OF GRAD STUDENTS, POST-DOCS, LABORATORIES, JOURNALS. http://www.nsf.gov/statistics/nsf13331/pdf/nsf13331.pdf
  • 23. IN THE LAB
  • 24. EARLY INTERVENTION: SUPPORTING STANDARDS • Promote community-accepted identifiers, ontologies, & formats
  • 25. SUPPORT STANDARDS, THEY’RE OUR FRIEND • November, 1999 • 45 biologists • 14 days • 140 megabases of Drosophila genome • Published in March 2000 • GENE ONTOLOGY, ET AL.
  • 26. QUEST FOR ORTHOLOGS. questfororthologs.org/; www.ebi.ac.uk/reference_proteomes
  • 27. QUEST FOR ORTHOLOGS • 30 phylogenomic databases. questfororthologs.org/; www.ebi.ac.uk/reference_proteomes
  • 28. QUEST FOR ORTHOLOGS • 30 phylogenomic databases • Vary in # of species, taxonomic range, sampling density, and methodology. questfororthologs.org/; www.ebi.ac.uk/reference_proteomes
  • 29. QUEST FOR ORTHOLOGS • 30 phylogenomic databases • Vary in # of species, taxonomic range, sampling density, and methodology • Joint benchmarking effort. questfororthologs.org/; www.ebi.ac.uk/reference_proteomes
  • 30. QUEST FOR ORTHOLOGS • 30 phylogenomic databases • Vary in # of species, taxonomic range, sampling density, and methodology • Joint benchmarking effort • Only possible through the use of shared reference proteomes and formats. questfororthologs.org/; www.ebi.ac.uk/reference_proteomes
  • 32. EARLY INTERVENTION: SUPPORTING STANDARDS • Promote community-accepted identifiers, ontologies, & formats • Develop and follow guidelines (paper and web-based) • e.g. Gaudet, P., et al. Towards BioDBcore: a community-defined information specification for biological databases. Database 2011. PMCID: PMC3017395 • Resource Identification Initiative • www.force11.org/Resource_identification_initiative • Vasilevsky NA, et al. On the reproducibility of science: unique identification of research resources in the biomedical literature. PeerJ. 2013 Sep 5;1:e148. doi: 10.7717/peerj.148. PubMed PMID: 24032093; PubMed Central PMCID: PMC3771067.
  • 33. EARLY INTERVENTION: SUPPORTING STANDARDS • Promote community-accepted identifiers, ontologies, & formats • Embed community accepted standards in the lab environment
  • 34. KNOCKOUT MOUSE PROJECT 2 • Broad standardized phenotyping of knockout mice on a standard genetic background • Data collection from many centres • www.mousephenotype.org
  • 35. KNOCKOUT MOUSE PROJECT 2 • Broad standardized phenotyping of knockout mice on a standard genetic background • Data collection from many centres • www.mousephenotype.org • Cindy Smith
  • 36. PROTOCOLS ARE STANDARDIZED. REQUIRE USE OF PARTICULAR ONTOLOGY TERMS TO DESCRIBE PHENOTYPE
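
As a rough illustration of what "require use of particular ontology terms" could look like in a lab data pipeline, the sketch below checks a phenotype record against an allowed vocabulary before it is accepted. The term identifiers, the `phenotype_terms` field name, and the `validate_record` function are all hypothetical placeholders, not part of any real phenotyping pipeline.

```python
# Hypothetical controlled vocabulary: term ID -> label (placeholder values).
ALLOWED_TERMS = {
    "ONT:0000001": "increased body weight",
    "ONT:0000002": "decreased body weight",
}


def validate_record(record, allowed=ALLOWED_TERMS):
    """Return a list of problems found in one phenotype record.

    An empty list means every phenotype term is drawn from the
    allowed vocabulary and at least one term was supplied.
    """
    errors = []
    terms = record.get("phenotype_terms", [])
    if not terms:
        errors.append("no phenotype terms supplied")
    for term in terms:
        if term not in allowed:
            errors.append(f"unknown ontology term: {term}")
    return errors


good = {"animal_id": "m-001", "phenotype_terms": ["ONT:0000001"]}
bad = {"animal_id": "m-002", "phenotype_terms": ["fat mouse"]}

print(validate_record(good))
print(validate_record(bad))
```

Rejecting free-text descriptions at data entry is one way the standard can be enforced where the data is generated, rather than repaired later by a curator.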
  • 37. EARLY INTERVENTION: SUPPORTING STANDARDS • Promote community-accepted identifiers, ontologies, & formats • Embed community accepted standards in the lab environment • Work with labs to embed standards into their data generation pipeline
  • 38. EARLY INTERVENTION: SUPPORTING STANDARDS • Promote community-accepted identifiers, ontologies, & formats • Embed community accepted standards in the lab environment • Stealth standards
  • 39. STANDARDS THROUGH UTILITY: APOLLO. CSIRO VIDEO; DEMO AT GENOMEARCHITECT.ORG
  • 41. TOOLS FOR THE COMMUNITY
  • 42. TOOLS FOR THE COMMUNITY • Web-based so researchers anywhere have access
  • 43. TOOLS FOR THE COMMUNITY • Web-based so researchers anywhere have access • Concurrent access supports real-time collaboration
  • 44. TOOLS FOR THE COMMUNITY • Web-based so researchers anywhere have access • Concurrent access supports real-time collaboration • Built-in support for standards (transparently compliant)
  • 45. TOOLS FOR THE COMMUNITY • Web-based so researchers anywhere have access • Concurrent access supports real-time collaboration • Built-in support for standards (transparently compliant) • Automatic generation of ready-made computable data
  • 46. TOOLS FOR THE COMMUNITY • Web-based so researchers anywhere have access • Concurrent access supports real-time collaboration • Built-in support for standards (transparently compliant) • Automatic generation of ready-made computable data • Client-side application relieves server bottleneck and supports privacy
  • 47. EARLY INTERVENTION: SUPPORTING STANDARDS • Promote community-accepted identifiers, ontologies, & formats • Embed community accepted standards in the lab environment • Stealth standards • Re-purpose internal curation tools for external users • Provide on-line documentation, hands-on training and rapid-response user help • Work with educators to make these tools an integral part of the curriculum • e.g. CACAO (Critical Assessment of Community Annotation using Ontologies), ecoliwiki.net/colipedia/index.php/CACAO_0.1 • DNA subway (Apollo)
  • 48. SUBMISSION
  • 49. SUBMITTING DATA IN A STRUCTURED WAY • CANTO: curation.pombase.org • Structured Digital Abstracts • Identifiers for all named genes, proteins, metabolites or other objects in the article • Main results described in simple ontology terms • Experimental evidence types • Not only a synopsis of the results but computer-readable • Gerstein, M., et al. Structured digital abstract makes text mining easy. Nature 447, 142 (10 May 2007) | doi:10.1038/447142a. • Minimal Information reporting guidelines • http://mibbi.sourceforge.net/portal.shtml
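
To make the idea of a structured digital abstract concrete, here is a minimal sketch of one possible computer-readable form: identifiers for the named objects, results expressed as ontology terms, and an evidence type for each result. The class names, field names, DOI, and every identifier below are invented for illustration; this is not the format used by CANTO or proposed by Gerstein et al.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Assertion:
    subject_id: str     # identifier of the gene, protein or metabolite
    ontology_term: str  # main result expressed as an ontology term ID
    evidence_type: str  # kind of experiment supporting the result


@dataclass
class StructuredDigitalAbstract:
    article_doi: str
    entities: dict = field(default_factory=dict)    # name in text -> identifier
    assertions: list = field(default_factory=list)  # list of Assertion

    def add_assertion(self, name, identifier, term, evidence):
        """Register a named object and one result about it."""
        self.entities[name] = identifier
        self.assertions.append(Assertion(identifier, term, evidence))

    def to_json(self):
        """Serialize the abstract so that software can consume it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# All values are placeholders, not real identifiers.
sda = StructuredDigitalAbstract(article_doi="10.0000/example")
sda.add_assertion("geneA", "DB:0001", "ONT:0000001", "mutant phenotype")
print(sda.to_json())
```

A record like this could be submitted alongside the manuscript, so the same facts are available both as prose and as data.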
  • 50. PUBLISHING
  • 52. PUBLISHING • First there were letters
  • 53. PUBLISHING • First there were letters • Then Henry Oldenburg created the first scientific journal in 1665
  • 54. PUBLISHING • First there were letters • Then Henry Oldenburg created the first scientific journal in 1665 • Result: too much to absorb
  • 55. PUBLISHING • First there were letters • Then Henry Oldenburg created the first scientific journal in 1665 • Result: too much to absorb • Washed away on the sea of information
  • 56. CONSEQUENTLY… PEER AND EDITORIAL REVIEW BECAME A FILTER
  • 57. THE MEDIUM OF PUBLICATION IS CHANGING • Figshare: figshare.org • iDigBio: www.idigbio.org • Dryad: datadryad.org • eLife: www.elifesciences.org • Unlike journal articles, the scale of web-native publishing may overwhelm attempts at manual curation (using current strategies)
  • 58. DO WE NEED TO CURATE?
  • 59. SOME SAY NO… “…powerful, online filters will distill communities impact judgements algorithmically” SCHOLARSHIP: BEYOND THE PAPER. JASON PRIEM. NATURE 495, 437–440 (28 MARCH 2013)
  • 60. DO WE NEED TO CURATE? • Resolution of differences • Clarity, eliminating noise • Validation & design of automated methods
  • 61. EVEN A PLACE LIKE GOOGLE USES CURATORS (*AND SOFTWARE) • Hundreds of operators per country • Multiple kinds of errors: overlapping jurisdictions, accidental merges, road maps to satellite images mismatch, etc. • Every road that you see has been hand-massaged • http://www.theatlantic.com/technology/archive/2012/09/how-google-builds-its-maps-and-what-it-means-for-the-future-of-everything/261913/
  • 62. DO WE NEED TO CURATE? • Resolution of differences • Clarity, eliminating noise • Validation & design of automated methods
  • 63. CLARITY • Answer boxes: Quick answers to concrete questions
  • 66. CLARITY • Answer boxes: Quick answers to concrete questions • Much of this information comes from Freebase which is structured in terms of entities and properties
  • 67. CLARITY • Answer boxes: Quick answers to concrete questions • Much of this information comes from Freebase which is structured in terms of entities and properties • Robert West, et al. Knowledge Base Completion via Search-Based Question Answering. http://www.cs.ubc.ca/~murphyk/Papers/www14.pdf WWW’14 April 7–11, 2014, Seoul, Korea. ACM 978-1-4503-2744-2/14/04. DOI:2568032
  • 68. DO WE NEED TO CURATE? • Resolution of differences • Clarity, eliminating noise • Validation & design of automated methods
  • 69. LITERATURE IS INFORMATIVE BUT IS NOT INFORMATION • PDF is still the dominant form of distribution • PDF “Annotation” • UTOPIA, www.utopiadocs.com • DOMEO, swan.mindinformatics.org • Textpresso, www.textpresso.org • All of these are still lacking domain specifics (or need to be taught) • FORCE11, www.force11.org • Common goal is advancing scientific communications • Beyond the PDF
  • 70. VALIDATION AND DESIGN OF AUTOMATED METHODS
  • 72. VALIDATION AND DESIGN OF AUTOMATED METHODS • Write/modify software
  • 73. VALIDATION AND DESIGN OF AUTOMATED METHODS • Write/modify software • Run the algorithm
  • 74. VALIDATION AND DESIGN OF AUTOMATED METHODS • Write/modify software • Run the algorithm • Evaluate results
  • 75. VALIDATION AND DESIGN OF AUTOMATED METHODS • Requires trusted reference datasets! • Write/modify software • Run the algorithm • Evaluate results
  • 76. VALIDATION AND DESIGN OF AUTOMATED METHODS • Requires trusted reference datasets! • Biocurators are partners with developers! • Write/modify software • Run the algorithm • Evaluate results
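
The "evaluate results" step of the write, run, evaluate loop can be sketched as a comparison of an algorithm's output against a curated reference set. The sketch below uses precision, recall and F1, which are standard measures but are not named on the slides; the annotation pairs and the `evaluate` function are made-up placeholders.

```python
def evaluate(predicted, reference):
    """Score predicted annotations against a trusted reference set.

    Both arguments are iterables of hashable annotations, for
    example (gene, ontology term) pairs.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical curated reference annotations and algorithm output.
reference = {("geneA", "ONT:1"), ("geneB", "ONT:2"), ("geneC", "ONT:3")}
predicted = {("geneA", "ONT:1"), ("geneB", "ONT:9")}

scores = evaluate(predicted, reference)
print(scores)
```

The numbers are only as trustworthy as the reference set, which is why the curated data has to exist before the loop can close.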
  • 77. DO WE NEED TO CURATE? “…powerful, online filters will distill communities impact judgements algorithmically” SCHOLARSHIP: BEYOND THE PAPER. JASON PRIEM. NATURE 495, 437–440 (28 MARCH 2013)
  • 78. DO WE NEED TO CURATE? “‘Big data hubris’ is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.” THE PARABLE OF GOOGLE FLU: TRAPS IN BIG DATA ANALYSIS. DAVID LAZER ET AL. SCIENCE 14 MARCH 2014: VOL. 343 NO. 6176 PP. 1203-1205
  • 79. DO WE NEED TO CURATE? • Yes
  • 80. DO WE NEED TO CURATE? • Yes • But…
  • 81. SYSTEMATIC REVIEW & CRITICISM IS REQUIRED. OUR STRENGTH IS IN QUALITY OF THE INFORMATION WE CAN PROVIDE
  • 82. HOW CAN BIOCURATORS ADDRESS CRITICISMS? “…literature curated datasets have inherent reliability difficulties…” CUSICK, M., ET AL. LITERATURE-CURATED PROTEIN INTERACTION DATASETS. NAT METHODS. JAN 2009; 6(1): 39–46. PMCID: PMC2683745
  • 83. THE RISK (BY ANALOGY). GREENBERG, S., HOW CITATION DISTORTIONS CREATE UNFOUNDED AUTHORITY: ANALYSIS OF A CITATION NETWORK. BMJ JULY 2009; 339. DOI: http://dx.doi.org/10.1136/
  • 84. WE'RE RESPONSIBLE FOR THE QUALITY • “Reviewing the quality of the data is an obligation of any entity that assumes responsibility over the data.” • Limor Peer et al., IDCC 2014
  • 85. PAINT APOPTOSIS - SUMMARY • 52 families annotated: - 8 were participants in execution phase of apoptosis; • 44 others are either: A. upstream of apoptosis; B. phenotypes; C. targets

  • 86. Example 1: Protein (cytochrome c) upstream of apoptosis execution Cytochrome c is directly involved in apoptotic DNA fragmentation
  • 87. Example 1: Protein (cytochrome c) upstream of apoptosis execution Cytochrome c is directly involved in apoptotic DNA fragmentation ➢ [Cells] – [cytochrome c] = No apoptotic DNA fragmentation
  • 88. Example 1: Protein (cytochrome c) upstream of apoptosis execution Cytochrome c is directly involved in apoptotic DNA fragmentation ➢ [Cells] – [cytochrome c] = No apoptotic DNA fragmentation ➢ [Cells] – [cytochrome c] + [cytochrome c] = apoptotic DNA fragmentation
  • 89. Example 2: Phenotype of reduced cell survival and increased DNA fragmentation • E3 ubiquitin-protein ligase TRAF7 was annotated to execution phase of apoptosis ➢ Exogenous expression of TRAF7 ➢ No other data in terms of where in apoptosis this may be. ➢ All we know is altering TRAF7 levels affects apoptosis.
  • 90. Example 3: Target DSG2 was annotated to execution phase of apoptosis
  • 92. Example 3: Target DSG2 was annotated to execution phase of apoptosis DSG2 is a *target* of a protease (caspase), and although its degradation indeed seems to be a part of apoptosis it does not *mediate* apoptosis.
  • 93. PROVE THE NEED FOR BIOCURATION • Publish: Quantitative improvements before/after • Publish: Curator consistency studies • Publish: Independent external reviews
  • 94. RECOGNITION & CREDIT. ORCID.ORG
  • 95. ENABLING RESEARCH
  • 96. WHAT IS A BIOCURATOR?
  • 99. WHAT IS A BIOCURATOR? • A highly skilled and trained keeper of our biological heritage of knowledge.
  • 100. WHAT IS A BIOCURATOR? • A highly skilled and trained keeper of our biological heritage of knowledge. • A content specialist who understands the research and can succinctly distill biological research results into computable data
  • 101. WHAT IS A BIOCURATOR? • A highly skilled and trained keeper of our biological heritage of knowledge. • A content specialist who understands the research and can succinctly distill biological research results into computable data • Considers the ease of finding this information, its relatedness to other information, and its research and educational usability
  • 102. MODELS RECAPITULATE VARIOUS PHENOTYPIC ASPECTS OF DISEASE. GENOTYPE: PHENOTYPE | ALMS1(NM_015120.4) [c.10775delC] + [-]: obesity, diabetes mellitus, insulin resistance | B6.Cg-Alms1foz/fox/J: increased weight, adipose tissue volume, glucose homeostasis altered | kcnj11c14/c14; insrt143/+(AB): increased food intake, hyperglycemia, insulin resistance
  • 103. MODELS RECAPITULATE VARIOUS PHENOTYPIC ASPECTS OF DISEASE. GENOTYPE: PHENOTYPE | ?: obesity, diabetes mellitus, insulin resistance | B6.Cg-Alms1foz/fox/J: increased weight, adipose tissue volume, glucose homeostasis altered | kcnj11c14/c14; insrt143/+(AB): increased food intake, hyperglycemia, insulin resistance
  • 104. RESEARCH RESOURCES. Doelken S C et al. Dis. Model. Mech. 2013;6:358-372
  • 105. CROSS-SPECIES PHENOTYPE COMPARISONS BY SEMANTIC SIMILARITY. Smedley D et al. Database. 2013; bat025. Mungall CJ et al. Genome Biol. 2010; 11(1):R2. Washington N et al. PLoS Biol 2009; e1000247
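
One simple way to see how phenotype profiles from different species can be compared is to expand each profile to include its ancestor terms in an ontology and then measure the overlap. The sketch below uses plain Jaccard similarity over those expanded sets as a stand-in; it is not the scoring method of the papers cited above, and the tiny term hierarchy in `TOY_PARENTS` is invented purely for illustration.

```python
# Hypothetical toy hierarchy: term -> list of parent terms.
TOY_PARENTS = {
    "obesity": ["increased weight"],
    "increased weight": ["abnormal weight"],
    "hyperglycemia": ["glucose homeostasis altered"],
    "diabetes mellitus": ["glucose homeostasis altered"],
}


def ancestors(term, parents):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def closure(profile, parents):
    """Expand a phenotype profile with the ancestors of every term."""
    expanded = set()
    for term in profile:
        expanded |= ancestors(term, parents)
    return expanded


def jaccard_similarity(profile_a, profile_b, parents=TOY_PARENTS):
    """Overlap of two expanded profiles, from 0.0 to 1.0."""
    a, b = closure(profile_a, parents), closure(profile_b, parents)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


human_profile = ["obesity", "diabetes mellitus"]
mouse_profile = ["increased weight", "glucose homeostasis altered"]

similarity = jaccard_similarity(human_profile, mouse_profile)
print(similarity)
```

The two profiles share no identical term, yet the shared ancestors give them a non-zero score, which is the point of comparing by meaning and not by string match.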
  • 107. PHENOTYPIC INTERPRETATION OF VARIANTS IN EXOMES (PHIVE) • Whole exome • Remove off-target and common variants • Variant score from allele freq and pathogenicity • Phenotype score from phenotypic similarity • PhenIX/PhIVE score to give final candidates • http://monarchinitiative.org
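
The pipeline on this slide can be sketched in a few lines: filter the exome, score each remaining variant, score the gene's phenotypic match, and combine the two. The slide does not say how the scores are computed or combined, so the frequency cutoff, the `variant_score` formula, and the unweighted mean below are assumptions made only for illustration, as are all names and values.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    on_target: bool       # falls inside the captured exome regions
    allele_freq: float    # population allele frequency, 0..1
    pathogenicity: float  # predicted pathogenicity, 0..1


def variant_score(variant, max_freq):
    """Assumed formula: rarer and more pathogenic scores higher."""
    return variant.pathogenicity * (1.0 - variant.allele_freq / max_freq)


def rank_candidates(variants, phenotype_scores, max_freq=0.01):
    """Filter, score and rank variants; returns (gene, score) pairs."""
    kept = [v for v in variants if v.on_target and v.allele_freq < max_freq]
    scored = []
    for v in kept:
        phenotype = phenotype_scores.get(v.gene, 0.0)
        # Assumption: combine the two scores with an unweighted mean.
        combined = (variant_score(v, max_freq) + phenotype) / 2
        scored.append((v.gene, combined))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder data: genes and scores are invented.
variants = [
    Variant("geneA", True, 0.0, 0.9),
    Variant("geneB", True, 0.2, 0.99),   # removed: common variant
    Variant("geneC", False, 0.0, 0.99),  # removed: off-target
    Variant("geneD", True, 0.005, 0.8),
]
phenotype_scores = {"geneA": 0.7, "geneD": 0.9}

ranked = rank_candidates(variants, phenotype_scores)
print(ranked)
```

The phenotype scores fed into such a ranking would come from a cross-species comparison against curated model-organism data, which is where curation enters the diagnosis.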
  • 108. CONFIRMED DIAGNOSES • Infantile Parkinsonism-dystonia • Wiedemann-Steiner syndrome • de novo SYNGAP1 mutation leading to autosomal dominant mental retardation • Frank-ter Haar syndrome • Infantile hypophosphatasia • … (~28%)
  • 109. RELATEDNESS ACROSS BIOLOGY
  • 110. RELATEDNESS ACROSS BIOLOGY • Bio-Curator, not bio-Archivist • Actively trying to represent current best understanding
  • 111. RELATEDNESS ACROSS BIOLOGY • Bio-Curator, not bio-Archivist • Actively trying to represent current best understanding • Support interoperability
  • 112. RELATEDNESS ACROSS BIOLOGY • Bio-Curator, not bio-Archivist • Actively trying to represent current best understanding • Support interoperability • Support research and educational usability
  • 113. RELATEDNESS ACROSS BIOLOGY • Bio-Curator, not bio-Archivist • Actively trying to represent current best understanding • Support interoperability • Support research and educational usability • Support inference
  • 114. RELATEDNESS ACROSS BIOLOGY • Bio-Curator, not bio-Archivist • Actively trying to represent current best understanding • Support interoperability • Support research and educational usability • Support inference • Not just for supporting searches, not just for finding PDF/online papers!
  • 115. WHAT CAN BE DONE?
  • 120. BIODIVERSITY DATA JOURNAL
  • 122. BIODIVERSITY DATA JOURNAL: FROM WRITING, SUBMISSION, PEER-REVIEW, EDITING, PUBLICATION TO DISSEMINATION!
  • 123. WHAT CAN ISB DO?
  • 124. WHAT CAN ISB DO? • Tangible support of standards efforts • QfO, RII, MI, publish guidelines, validators …
  • 125. WHAT CAN ISB DO? • Tangible support of standards efforts • QfO, RII, MI, publish guidelines, validators … • Create a curation mindset across the entire life cycle • Support embedded/repurposed software, education, actively engage with text-miners, provide on-line support …
  • 126. WHAT CAN ISB DO? • Tangible support of standards efforts • QfO, RII, MI, publish guidelines, validators … • Create a curation mindset across the entire life cycle • Support embedded/repurposed software, education, actively engage with text-miners, provide on-line support … • Prove the necessity for curation • Publish studies, greater emphasis on review and quality (assessment)
  • 127. WHAT CAN ISB DO? • Tangible support of standards efforts • QfO, RII, MI, publish guidelines, validators … • Create a curation mindset across the entire life cycle • Support embedded/repurposed software, education, actively engage with text-miners, provide on-line support … • Prove the necessity for curation • Publish studies, greater emphasis on review and quality (assessment) • Work with traditional publishers • FORCE11, structured submissions
  • 128. WHAT CAN YOU DO? • Consider • The ease of finding information • Its relatedness to other information • Its research and educational usability
  • 129. RESEARCH?? YOU, THE BIOCURATOR. ISB
  • 130. ACKNOWLEDGEMENTS AND THANKS. YOU ARE NOT ALONE