Why would a publisher care
about open data?
Anita de Waard
November 2019
Why would a publisher care about open data?
What do we mean by open?
What do we mean by data?
What do we mean by a publisher?
data
Data, after all, is stuff machines can handle […]
we could create a world in which it would be programs
-- not just people -- that would enjoy the data.
For data, as for documents, the value of any part of the web is
increased by the amount of other stuff out there.
For documents it is the ability to follow links,
but for open data it is the ability to also interconnect and join,
to summarise and compare, to monitor, extrapolate, to infer.
Tim Berners-Lee, 2009
NOW!
• Provenance of data: STAR Methods at Cell
• Contributor Roles (CRediT) taxonomy
• Citation and linking to data and software
• Versioned linking to data & software
REAGENT/RESOURCE | SOURCE | IDENTIFIER
Antibodies
Rabbit monoclonal anti-Snail | Cell Signaling Technology | Cat#3879S; RRID: AB_2255011
Mouse monoclonal anti-Tubulin (clone DM1A) | Sigma-Aldrich | Cat#T9026; RRID: AB_477593
Rabbit polyclonal anti-BMAL1 | This paper | N/A
Bacterial and Virus Strains
pAAV-hSyn-DIO-hM3D(Gq)-mCherry | Krashes et al., 2011 | Addgene AAV5; 44361-AAV5
AAV5-EF1a-DIO-hChR2(H134R)-EYFP | Hope Center Viral Vectors Core | N/A
Cowpox virus Brighton Red | BEI Resources | NR-88
Zika-SMGC-1, GENBANK: KX266255 | Isolated from patient (Wa 2016) | N/A
Staphylococcus aureus | ATCC | ATCC 29213
Streptococcus pyogenes: M1 serotype strain: strain SF370; M1 GAS | ATCC | ATCC 700294
Biological Samples
Healthy adult BA9 brain tissue | University of Maryland Brain & Tissue Bank | Cat#UMB1455
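A small sketch of why structured identifiers matter to machines: the snippet below pulls RRIDs out of identifier cells like the ones in the key resources table. The regular expression is an assumption made for illustration, not an official RRID grammar, and the function name is hypothetical.

```python
import re

# Identifier cells copied from the key resources table.
IDENTIFIERS = [
    "Cat#3879S; RRID: AB_2255011",
    "Cat#T9026; RRID: AB_477593",
    "N/A",
]

# Assumed pattern: "RRID:" then optional whitespace, then a prefix,
# an underscore and an accession (e.g. AB_2255011).
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9_]+)")


def extract_rrids(cells):
    """Return every RRID accession found in a list of identifier cells."""
    found = []
    for cell in cells:
        found.extend(RRID_PATTERN.findall(cell))
    return found


rrids = extract_rrids(IDENTIFIERS)
print(rrids)
```

Cells without an RRID (such as "N/A") simply contribute nothing, so the same function can be run over a whole table column.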
19.11.2019
Elsevier Data Solutions for Research
open
Scholix: A Linked Open Data Hub to connect papers and datasets
Research Object Composer: An Open Source editor for Research Objects
a publisher
What does a publisher even do anymore?
[Figure: citation graph running from 1977 to 2008, in which new papers cite existing ones]
Example 1: Human papilloma virus causes cervical cancer
What does a publisher even do anymore?
Example 2: Top 20 universities in Quantum Computing
Why publishers care about open science:
[Figure: a pipeline of Researcher, Author, Editor/Publishers and Reader/User, passing Data, Results, Article and UI, contrasted with a network of interlinked articles, tools, data and users]
Today: linear supply chains
Model: Castle
• Goal: selling content
• Metrics: number of units sold
• Strategy: optimize content delivery to users
The future: networked open science
Model: Marketplace
• Goal: grow number of interactions
• Metrics: number of interactions between users
• Strategy: optimize number of network interactions
Linear supply chains are evolving into complex, dynamic and connected value webs
Win by reputation Win by trust
Extra Slides:
1. Elsevier by the numbers
2. Research Data Management
3. Research Object Composer
4. Entellect and Life Science Solutions
5. Data analytics: Quantum Computing
6. Elsevier and Open Science
1. Elsevier by the numbers
Elsevier by the numbers
25,000
Our products are used at more than 25,000 academic and government institutes globally
14+ m
people a month use ScienceDirect, our flagship platform for academic research
320+
Reaxys®'s ML capability enables the chemistry of drug discovery and materials innovation for over 320 pharma innovators, 130 chemical companies, and over 1100 […]
7,500
Elsevier has 7,500 employees and serves customers in over 180 countries.
430,000
Elsevier publishes 430,000 peer-reviewed articles annually
9 m
Mendeley is a scientific social media platform that enables around 9 million users worldwide to organize, write, collaborate and promote their […]
2. Research Data Management
Elsevier Data Solutions for Research
Create & Collect, Store, Control, Collaborate, Analyze, Disseminate
Collect
Create
Extract
Store
Secure
Manage
Control
Workspaces
Researchers
Data sets
Search
Integrate
Analyze
Share
Publish
Archive
Entellect™
MACRO EDC
Hivebench GDPR
How we deliver
1. Open system: through open APIs, modules can be integrated with other RDM tools
2. Data remains private at, or owned by, the institution
3. System is integrated with researcher workflows, to ensure simple and clear use
4. Researchers continue to work the same way, avoiding additional bureaucracy and administration
Data Search
Retrieve active data, discover public data
Discover data
• 10 million+ datasets indexed from more than
35 repositories
• Deep indexing of data significantly enhances
the relevancy of results
• Keyword search within data files
• Filter search results by specific author,
institution, journal, subject category
Retrieve active data*
• Navigate to locally held institutional data
• Powerful keyword search and filtering
Data Manager
Researchers can
• Share data privately within a research project
• Invite external collaborators to join a project
• Gather research data from data sources as it’s
generated (including ELNs)
• Annotate research data with detailed, subject-
specific metadata
• Curate data according to project or institutional
workflows
• Prepare to publish data on a repository of your
choice
• Open APIs allow tailored upload forms, automated workflows, and analysis and re-upload of data files
Go from raw files to active datasets
Data Repository
Researchers can
• Store up to 100 GB of data per
dataset in many formats
• Describe how experiments can be
reproduced
• Keep track of dataset versions
• Create a DOI for citation (or university prefix)
Store datasets in a secure and trusted repository
Data Monitor
Institutions can
• Keep track of data inside
and outside institution
• Achieve credibility,
visibility and integrity of
key research outputs
• Maintain visibility of
events in RDM space
• Improve researchers' adoption of data sharing tools
• Communicate value of
data sharing to
researchers during the
research process
Encourage and monitor compliance
Five Facts about Elsevier and Research Data
Fact #1 Elsevier’s Mendeley Data supports the entire lifecycle of research data
The 5 modules that make up Mendeley Data are specifically designed to utilize data to its fullest potential, simplifying and enhancing current ways of working.
Fact #2 Researchers and institutions own and control all the data
Mendeley Data allows researchers to keep data private, or publish it under one of 16 open data licenses, so they stay in full control
Fact #3 Mendeley Data is an open system
It is a flexible platform: modules are designed to be used together, standalone, or combined with other Elsevier and non-Elsevier solutions
Fact #4 Mendeley Data can increase the exposure and impact of research
Mendeley Data Search indexes over 10 million datasets from more than 35
repositories
Fact #5 Elsevier is an active participant in the open data community
Elsevier partners with the open data community, and is currently working on
more than 20 projects globally
Mendeley Data already integrates through open APIs with the global Research Data
Management ecosystem, as well as other Elsevier solutions
+ 35 repositories
(BePress planned)
• Mendeley Data Repository
datasets are automatically
synced with the Pure
curation workflow
• Projects, grants,
equipment, showcase
on portal (planned)
• Mendeley Data Search results
are visible on Scopus
• Notify new articles to Monitor
for data sharing compliance
• Datasets appear as records
on Scopus (planned)
• Mendeley Data usage is
accessible through Plum API
and widget
• PlumX metrics (citations,
usage, social mentions) are
captured and shown on
Mendeley Data Repository
Publish datasets
alongside an article
on Mendeley Data
within the SSRN
publication flow
Publish or link datasets
alongside an article on
Mendeley Data within the
ScienceDirect publication flow
Researcher and
Institutional
Dataset metrics
• User identity & login
• Library (planned)
• Notes (planned)
• Projects (planned)
Existing integration
Planned integration
• Mendeley Data indexed
by OpenAIRE index
• OpenAIRE Zenodo
repository indexed by
Mendeley Data Search
Long-term
preservation of
published datasets
Links between articles and datasets:
• Contributed by Mendeley
Data to Scholix
• Indexed by Mendeley Data
Search and Data Monitor
• Consumed by Scopus and
ScienceDirect
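To make the article-to-dataset links above concrete, here is a minimal sketch of how a consumer might index such links. The record fields are simplified assumptions chosen for illustration and are not the Scholix schema itself; the DOIs and provider name are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Simplified article-to-dataset link (illustrative fields only)."""
    source_id: str      # identifier of the article
    target_id: str      # identifier of the dataset
    relationship: str   # e.g. "References"
    provider: str       # who contributed the link


def datasets_by_article(links):
    """Group dataset identifiers under the article that links to them."""
    index = defaultdict(list)
    for link in links:
        index[link.source_id].append(link.target_id)
    return dict(index)


# Hypothetical example records.
LINKS = [
    Link("doi:10.1234/example-article", "doi:10.1234/example-dataset-1",
         "References", "Example Provider"),
    Link("doi:10.1234/example-article", "doi:10.1234/example-dataset-2",
         "References", "Example Provider"),
]

index = datasets_by_article(LINKS)
print(index["doi:10.1234/example-article"])
```

A platform that consumes links could use an index like this to show related datasets next to an article.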
Integrate with machine-readable data management plans
• For more than 35 repositories the
metadata as well as the underlying
datasets are indexed by Mendeley
Data Search
• First repositories are actively
integrating with the free and open
‘push API’ of Mendeley Data
Search
• Mint DOIs for Mendeley Data
Repository
• DataCite indexed by
Mendeley Data Search
3. Research Object Composer
Building an open interoperable data ecosystem:
Aggregates: link things together
Annotations: about things & their relationships
Container: packaging content & links (Zip files, BagIt, Docker images)
Identification: locate things, regardless of where they are
Building an open interoperable data ecosystem:
database
Open
repository
Workflow Tool
Task 1
Workflow
Input
Task 2
Task 3
Output
Research Object Composer
http://www.researchobject.org
Research Object Profiler
Add annotation and
relationships (metadata)
to collection to describe a
research object:
- URI
- Length
- Filename
- Checksums
etc.
Research Object Serializer
(a manifest itemizing file names)
Serialise Research Object
in standard format based on BagIt
Open API
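The profiling and serialising steps on this slide can be sketched as follows. This is a minimal illustration of the idea (filename, length and checksum per file, plus a BagIt-style manifest), not the RO Composer's own code; the function names and example files are hypothetical, and the output follows the general BagIt layout of a `bagit.txt` declaration and a `manifest-sha256.txt` listing payload files under `data/`.

```python
import hashlib


def profile(files):
    """Describe each payload file by filename, length and SHA-256 checksum."""
    return [
        {
            "filename": name,
            "length": len(content),
            "sha256": hashlib.sha256(content).hexdigest(),
        }
        for name, content in sorted(files.items())
    ]


def bag_tag_files(files):
    """Build the text of the two basic BagIt tag files for the given payload."""
    records = profile(files)
    manifest = "".join(
        f"{r['sha256']}  data/{r['filename']}\n" for r in records
    )
    declaration = "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n"
    return {"bagit.txt": declaration, "manifest-sha256.txt": manifest}


# Hypothetical payload: file names and contents are placeholders.
FILES = {
    "results.csv": b"sample,value\n1,0.5\n",
    "workflow.txt": b"task 1 -> task 2 -> task 3\n",
}

tags = bag_tag_files(FILES)
print(tags["manifest-sha256.txt"])
```

Keeping the checksum in the manifest lets any service that later receives the package verify that the files arrived unchanged.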
Mendeley Data
• DOIs
• Metadata (Findability)
• Open repo (Accessibility)
• Versioning
• RO Standard (Interoperability, Reusability)
Purpose of the Research Object Composer*:
• The RO Composer is not a registry of research objects, but it can list research objects currently under construction.
• The RO Composer is a microservice whose responsibility is to help other services create and deposit research objects.
• The composer acts as a temporary construction site that can be completed by multiple services (e.g. a data management system, a workflow system, a user interface).
• These clients will be jointly building a Research Object that can then be validated according to the schema, before the RO is downloaded or deposited into an archive (like Zenodo or Mendeley Data).
• Clients of the RO Composer are applications (driven by a user interface) or agents (engaged automatically from other events, e.g. a workflow run).
• The RO Composer is not a required component to this: any software may generate research objects by following Research Object specifications.
* From: https://github.com/ResearchObject/research-object-composer/blob/master/introduction.ipynb
You can drive it today!
• API: https://researchobject.github.io/research-object-composer/api/
• Source: https://github.com/ResearchObject/research-object-composer
• Link to Jupyter Notebook tutorial (even I can do it!)
4. Entellect and Life Science Solutions
Human Papilloma Virus and Cervical Cancer
1946: Papanicolaou develops Pap smear
1976: zur Hausen proposes link between HPV and Cervical Cancer
2006: Gardasil HPV vaccine approved
2008: zur Hausen awarded Nobel Prize
Study impact of intervening research in this talk
Early Work
1977
“a hypothesis has been presented that the virus
found in genital warts may be involved in the etiology
of human genital cancer”
Consensus Reached
2010
Citation Mapping Process
Build corpus of papers using broad search (~20,000 papers) on all aspects of cervical cancer and HPV
Expand corpus by adding all cited works not in the original corpus
Add cited works from the cited corpus (“grandchild” references)
Connect the discrete steps of scientific advance by linking the works through their citations
Apply graph mathematics to find all connected paths
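The corpus expansion steps can be sketched as a two-level traversal of a citation map. This is an illustrative sketch under stated assumptions, not the pipeline used in the talk: the function name, the `cites` dictionary and the paper labels are all hypothetical.

```python
def expand_corpus(seed, cites, levels=2):
    """Grow a corpus by repeatedly adding cited works not yet included.

    seed   -- papers found by the initial search
    cites  -- mapping from a paper to the papers it cites
    levels -- 1 adds cited works, 2 also adds "grandchild" references
    """
    corpus = set(seed)
    frontier = set(seed)
    for _ in range(levels):
        cited = set()
        for paper in frontier:
            cited.update(cites.get(paper, ()))
        frontier = cited - corpus   # only works that are new to the corpus
        corpus |= frontier
    return corpus


# Toy citation map with placeholder labels.
CITES = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
}

corpus = expand_corpus({"A"}, CITES)
print(sorted(corpus))
```

With two levels, "F" stays outside the corpus because it is three citation steps away from the seed paper.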
Assembling The Graph
• Dense interconnected web of citations
• Filter for only cited works within 3 years of the citing work – building on relevant knowledge
[Figure: corpus expanded to first-level and second-level cited works; recognize identities in graph]
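The three-year filter can be sketched as a simple pass over the citation edges. The edge list, publication years and function name below are hypothetical placeholders used only to illustrate the rule.

```python
def within_window(edges, year, max_gap=3):
    """Keep (citing, cited) pairs where the cited work is at most
    max_gap years older than the citing work."""
    return [
        (citing, cited)
        for citing, cited in edges
        if 0 <= year[citing] - year[cited] <= max_gap
    ]


# Toy data: placeholder papers and publication years.
YEAR = {"P1": 1977, "P2": 1979, "P3": 1985, "P4": 1986}
EDGES = [("P2", "P1"), ("P3", "P1"), ("P4", "P3"), ("P4", "P2")]

kept = within_window(EDGES, YEAR)
print(kept)
```

Only citations to recent work survive, which thins the dense web into steps that build on then-current knowledge.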
Building the Corpus
'papillomaviridae' AND 'cancer' AND [article]/lim - 2,747 results from 1975-2019
• 55,414 references total cited in this set
• 29,064 unique references (the references overlap) 1870-2019
• 719,470 references cited in this set of 29,064 papers
• 259,908 unique in this set.
Total corpus of work using this method is 182,402 unique articles
• Citation network has 103,443 edges
Path Finding
Select “interesting” endpoints
• Significant starting point – proposal that HPV could be related to cancer
• Significant endpoint – recognition of HPV/cancer connection
Use graph traversal analytics to find all paths longer than 5 papers that connect the two ideas
Separate by year
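The path-finding step can be sketched as a depth-first enumeration of citation chains between two chosen endpoints, keeping only chains of more than 5 papers. This is an assumed, simplified approach for illustration; the talk does not specify its traversal algorithm, and the graph and labels below are placeholders.

```python
def all_paths(cites, start, end):
    """Yield every simple path from start to end along citation edges."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        for cited in cites.get(node, ()):
            if cited not in path:          # avoid revisiting a paper
                stack.append((cited, path + [cited]))


# Toy graph: P6 cites P5, P5 cites P4, ... down to P0,
# plus a shortcut where P6 also cites P3 directly.
CITES = {f"P{i}": [f"P{i - 1}"] for i in range(1, 7)}
CITES["P6"].append("P3")

# Walk from the recognition endpoint (P6) back to the original proposal (P0),
# then reverse each path so it reads from idea to acceptance.
long_paths = [
    list(reversed(path))
    for path in all_paths(CITES, "P6", "P0")
    if len(path) > 5
]
print(long_paths)
```

The shortcut path contains only 5 papers, so the length filter drops it and keeps the full chain.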
Example Pathway Linking Idea to Vaccine: 17 links, 30 years.
Resulting Graph
Represents the incremental advances by year from concept to acceptance
[Figure: citation graph running from 1977 to 2008, in which new papers cite existing ones]
5. Data Analytics: Quantum Computing
Quantum Computing Research: Highest FWCI Non-US
Quantum Computing Research: Highest FWCI US
Quantum Computing Research Worldwide--FWCI
Top 20 universities active in Quantum Computing
[Figure: scatter plot of FWCI (0 to 5) against Publications (0 to 250). Labelled institutions: University of Waterloo; National University of Singapore; Massachusetts Institute of Technology; University of Science and Technology of China; University of Oxford; Tsinghua University; University of Tokyo; Harvard University; University of Maryland; University of New South Wales; University of California at Santa Barbara; ETH Zurich; University of Sydney; RAS; University of Southern California; Perimeter Institute for Theoretical Physics; University College London; Princeton University; University of Michigan]
Quantum Computing Research in Top 10% Citation Percentile, US vs. Non-US
Quantum Computing Research Worldwide--Academic-Corporate Collaboration
Academic-Corporate Collaboration—Network Map of Top 20 universities in Quantum Computing
Quantum Computing: Academic-Corporate Collaboration and Patent Citations per
Scholarly Output, US vs. Non-US
Quantum Optics/Flux Qubits: Academic-Corporate Collaboration and Patent
Citations per Scholarly Output, US vs. Non-US
Quantum Optics/Flux Qubits Research by Country
Quantum Optics/Flux Qubits—Top Corporates
Quantum Optics/Flux Qubits—Top Universities
Quantum Optics/Flux Qubits—Keyphrase-based Analysis
6. Elsevier and Open Science
ELSEVIER I Elsevier Open Science: Creating value through collaboration I
CONFIDENTIAL
Global market dynamics and technologies are reconfiguring the academic ecosystem:
Macroeconomic developments
Ecological and societal sustainability
• Global population is growing; 9B people in 2050
• Challenge to produce more with less and cleaner
input
• Challenge to solve poverty and unequal
allocation of resources
Shifting power balance from West to East
• Strong economic growth in China and India
• Rise of the middle class; improvement of
educational and health care system and food
supply chain
Technological developments
The web
• Everyone is a publisher
• Content access is ubiquitous
The social web
• Professional and personal networks emerge
without traditional institutions
• Everyone is a peer reviewer
Big data
• Explosion of data through networking of
measurement tools
• Radically cheaper tools and computing
power
Social developments
• Pressure from society and funders to
justify the costs of science
• Need for reliable research results (that can
be trusted).
• Patients/citizens demand access and increased participation
• Distributed computing makes it easier to
make and share tools, content and code
• Overall need for more transparency and
accountability, also in doing and reporting
research
Emergence of open science:
Open Peer Review
New social networks
Data, tools and workflows are shared
Open Data
Society is engaging more
Open APIs
Open Source Software
Some examples of Open Science:
Scientists build data sharing tools: Carl Kesselman builds tools to enable neuroscientists to store and share their data in a better way
Computers are scientists: Viktor Pankratius builds software programs that generate hypotheses about volcano eruptions: the software can steer drones to collect data.
Science becomes a game, which anyone can play: Lena Deus solves scientific problems through Kaggle: the system awards her points for scoring highest on Machine Learning tasks.
Moving to a network of connected components:
1. Take an Open Source data repository and find some Open Data: Deriva, an Open Source data repository; Neuroscience data
2. Write some Open Source software to mash them up: Jupyter Notebook to calculate properties
3. Share outputs as OA/OD/OS: share new data sets on Deriva; publish papers in an OA journal; share code on platforms like GitHub
Networked system:
1. Community adds elements to open science platforms that can be used by everyone.
2. Researchers build upon the combination of shared content/system elements. This leads to new scientific knowledge and output.
3. All sharable elements find their way to other open platforms and formats and can be re-used, causing a network effect.
[Figure: users A, B and C exchanging data (v1, v2), tools and articles across Platform A, Platform B, an Open Research Platform, Open Data Repositories, Open Access Journals and Code Networks]
Open Science represents a transition from a pipeline to a networked knowledge system:
[Figure: a linear chain of Suppliers, Manufacturers, Distributors and Consumers, carrying data, tool, article and user, contrasted with a network of interlinked articles, tools, data and users]
Today: linear supply chains
Model: Castle
• Goal: selling content
• Metrics: number of units sold
• Strategy: optimize content delivery to users
The future: networked open science
Model: Marketplace
• Goal: grow number of interactions
• Metrics: number of interactions between users
• Strategy: optimize number of network interactions
Linear supply chains are evolving into complex, dynamic and connected value webs
Win by reputation Win by trust
Some current Open Science efforts:
[Figure: Open Science, comprising Open Access, Open Data, Open Metrics, Research Integrity & Reproducibility, Science & Society, and Open Tools and Software]
Open Access:
- Hybrid/Gold journals, open/self-archive options
- Contributing to CHORUS, CrossMark, RA21
- ‘Platinum OA’ on bepress Digital Commons
- Pilot SSRN Preprint of the Lancet
Research Integrity and Reproducibility:
Many efforts, including:
- Full GDPR Compliance across all Elsevier products
- Preregistration and Registered Reports
- STAR Methods for Cell, transparent reporting
- Plagiarism and Image manipulation detection
- Statistics checking
- Reproducibility badges/TOP guidelines
- Transparency in contributorship roles (CRediT Taxonomy)
- Research collaborations, e.g. Humboldt, Data Integrity
Science and Society:
- Science Literacy effort: Topic Pages, Audioslides, Science and People
- Access to content via Patient Inform, Research4Life, Bookshare and Load2Learn.
- Elsevier Foundation supporting many projects including Green and Sustainable Chemistry, awards for early-career women scientists from developing world, many more
Open Data:
- All data is open on all platforms
- Following TOP guidelines across the board
- Co-leads on Enabling FAIR Data requiring data deposits in Earth & Space Science
- Co-leads Data Citation Principles in Force11
- Supporting Scholix Linked Data repository and other open data standards, efforts through RDA, ORCID, CrossRef, etc.
Open Metrics:
- CiteScore free API
- PlumX metrics and NewsFlo: free layer of societal impact metrics on article level
- Helping lead RDA Make Data Count effort with CDL/DataCite to establish data metrics
Open Tools and Software:
- Open APIs for most products
- Many research collaborations leading to Open Source software, e.g. Github4Labs, NIH Data Commons
- Hackathons, in medicine <Elsevier Hacks>, for Mendeley
- Content and data available for research and development and hackathons

PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
 
Open ILRI
Open ILRIOpen ILRI
Open ILRI
 
The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?
 
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALSBROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
 
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
 
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
 
Ross Wilkinson - Data Publication: Australian and Global Policy Developments
Ross Wilkinson - Data Publication: Australian and Global Policy DevelopmentsRoss Wilkinson - Data Publication: Australian and Global Policy Developments
Ross Wilkinson - Data Publication: Australian and Global Policy Developments
 
Whitehead Seminar 5/2
Whitehead Seminar 5/2Whitehead Seminar 5/2
Whitehead Seminar 5/2
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with Dataverse
 
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
 
Facilitating good research data management practice as part of scholarly publ...
Facilitating good research data management practice as part of scholarly publ...Facilitating good research data management practice as part of scholarly publ...
Facilitating good research data management practice as part of scholarly publ...
 
Toward a FAIR Biomedical Data Ecosystem
Toward a FAIR Biomedical Data EcosystemToward a FAIR Biomedical Data Ecosystem
Toward a FAIR Biomedical Data Ecosystem
 
Recognising data sharing
Recognising data sharingRecognising data sharing
Recognising data sharing
 

Más de Anita de Waard

Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Anita de Waard
 
NFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataNFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataAnita de Waard
 
CNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsCNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsAnita de Waard
 
Enabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesEnabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesAnita de Waard
 
Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Anita de Waard
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data ManagementAnita de Waard
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of PublishingAnita de Waard
 
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data EcosystemsReal-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data EcosystemsAnita de Waard
 
Data Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryData Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryAnita de Waard
 
The Economics of Data Sharing
The Economics of Data SharingThe Economics of Data Sharing
The Economics of Data SharingAnita de Waard
 
Public Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingPublic Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingAnita de Waard
 
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective DataElsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective DataAnita de Waard
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016Anita de Waard
 
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...Anita de Waard
 
Argumentation in biology papers
Argumentation in biology papersArgumentation in biology papers
Argumentation in biology papersAnita de Waard
 
Ten Habits of Highly Effective Data
Ten Habits of Highly Effective DataTen Habits of Highly Effective Data
Ten Habits of Highly Effective DataAnita de Waard
 
Ten Habits of Highly Successful Data
Ten Habits of Highly Successful DataTen Habits of Highly Successful Data
Ten Habits of Highly Successful DataAnita de Waard
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with dataAnita de Waard
 
Ten habits of highly effective data
Ten habits of highly effective dataTen habits of highly effective data
Ten habits of highly effective dataAnita de Waard
 

Más de Anita de Waard (20)

Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
NFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataNFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR Data
 
CNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsCNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data Commons
 
Enabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesEnabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring Guidelines
 
Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data Management
 
History of the future
History of the futureHistory of the future
History of the future
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of Publishing
 
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data EcosystemsReal-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
 
Data Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryData Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost Recovery
 
The Economics of Data Sharing
The Economics of Data SharingThe Economics of Data Sharing
The Economics of Data Sharing
 
Public Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingPublic Identifiers in Scholarly Publishing
Public Identifiers in Scholarly Publishing
 
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective DataElsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016
 
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
 
Argumentation in biology papers
Argumentation in biology papersArgumentation in biology papers
Argumentation in biology papers
 
Ten Habits of Highly Effective Data
Ten Habits of Highly Effective DataTen Habits of Highly Effective Data
Ten Habits of Highly Effective Data
 
Ten Habits of Highly Successful Data
Ten Habits of Highly Successful DataTen Habits of Highly Successful Data
Ten Habits of Highly Successful Data
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with data
 
Ten habits of highly effective data
Ten habits of highly effective dataTen habits of highly effective data
Ten habits of highly effective data
 

Último

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 

Último (20)

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 

Why would a publisher care about open data?

  • 1. Why would a publisher care about open data? Anita de Waard November 2019
  • 2. Why would a publisher care about open data? What do we mean by open? What do we mean by data? What do we mean by a publisher?
  • 3. data
      Data, after all, is stuff machines can handle […] we could create a world in which it would be programs -- not just people -- that would enjoy the data. For data, as for documents, the value of any part of the web is increased by the amount of other stuff out there. For documents it is the ability to follow links, but for open data it is the ability to also interconnect and join, to summarise and compare, to monitor, extrapolate, to infer.
      Tim Berners-Lee, 2009
      NOW!
      • Provenance of data: STAR Methods at Cell
      • Contributor Roles (CRediT) taxonomy
      • Citation and linking to data and software
      • Versioned linking to data & software
      REAGENT/RESOURCE | SOURCE | IDENTIFIER
      Antibodies
      Rabbit monoclonal anti-Snail | Cell Signaling Technology | Cat#3879S; RRID: AB_2255011
      Mouse monoclonal anti-Tubulin (clone DM1A) | Sigma-Aldrich | Cat#T9026; RRID: AB_477593
      Rabbit polyclonal anti-BMAL1 | This paper | N/A
      Bacterial and Virus Strains
      pAAV-hSyn-DIO-hM3D(Gq)-mCherry | Krashes et al., 2011 | Addgene AAV5; 44361-AAV5
      AAV5-EF1a-DIO-hChR2(H134R)-EYFP | Hope Center Viral Vectors Core | N/A
      Cowpox virus Brighton Red | BEI Resources | NR-88
      Zika-SMGC-1, GENBANK: KX266255 | Isolated from patient (Wa 2016) | N/A
      Staphylococcus aureus | ATCC | ATCC 29213
      Streptococcus pyogenes: M1 serotype strain: strain SF370; M1 GAS | ATCC | ATCC 700294
      Biological Samples
      Healthy adult BA9 brain tissue | University of Maryland Brain & Tissue Bank | Cat#UMB1455
  • 4. 19.11.2019 Elsevier Data Solutions for Research open Scholix: A Linked Open Data Hub to connect papers and datasets Research Object Composer: An Open source editor for Research Objects
  • 5. a publisher What does a publisher even do anymore? Example 1: Human papilloma virus causes cervical cancer [Diagram: a new paper (2008) cites an existing paper (1977)]
  • 6. What does a publisher even do anymore? Example 2: Top 20 universities in Quantum Computing
  • 7. Why publishers care about open science:
      Today: linear supply chains
      [Diagram labels: Author; Editor/Publishers; Reader/User; Researcher; Data; Results; Article; UI]
      Model: Castle • Goal: selling content • Metrics: number of units sold • Strategy: optimize content delivery to users
      Win by reputation
      The future: networked open science
      [Diagram: a web of interconnected article, tool, data and user nodes]
      Linear supply chains are evolving into complex, dynamic and connected value webs
      Model: Marketplace • Goal: grow number of interactions • Metrics: number of interactions between users • Strategy: optimize number of network interactions
      Win by trust
  • 8. Extra Slides: 1. Elsevier in numbers 2. Research Data Management 3. Research Object Composer 4. Entellect and Life Science Solutions 5. Data analytics: Quantum Computing 6. Elsevier and Open Science
  • 9. 1. Elsevier by the numbers
  • 10. Elsevier by the numbers
      • 25,000: Our products are used at more than 25,000 academic and government institutes globally
      • 14+ m: people a month use ScienceDirect, our flagship platform for academic research
      • 320+: Reaxys®'s ML capability enables the chemistry of drug discovery and materials innovation for over 320 pharma innovators, 130 chemical companies, and over 1100 […]
      • 7,500: Elsevier has 7,500 employees and serves customers in over 180 countries
      • 430,000: Elsevier publishes 430,000 peer-reviewed articles annually
      • 9 m: Mendeley is a scientific social media platform that enables around 9 million users worldwide to organize, write, collaborate and promote their […]
  • 11. 2. Research Data Management
  • 12. Elsevier Data Solutions for Research
      Stages: Create & Collect, Store, Control, Collaborate, Analyze, Disseminate
      Functions: Collect, Create, Extract, Store, Secure, Manage, Control, Workspaces, Researchers, Data sets, Search, Integrate, Analyze, Share, Publish, Archive
      Labels: Entellect™, MACRO EDC, Hivebench, GDPR
  • 13. How we deliver 1. Open system: through open APIs, modules can be integrated with other RDM tools 2. Data remains private at, or owned by, the institution 3. System is integrated with the researcher workflows, to ensure simple and clear use 4. Researchers continue to work the same way, avoiding additional bureaucracy and administration
  • 14. Data Search: Retrieve active data, discover public data Discover data • 10 million+ datasets indexed from more than 35 repositories • Deep indexing of data significantly enhances the relevancy of results • Keyword search within data files • Filter search results by specific author, institution, journal, subject category Retrieve active data* • Navigate to locally held institutional data • Powerful keyword search and filtering
  • 15. Data Manager: Go from raw files to active datasets Researchers can • Share data privately within a research project • Invite external collaborators to join a project • Gather research data from data sources as it’s generated (including ELNs) • Annotate research data with detailed, subject-specific metadata • Curate data according to project or institutional workflows • Prepare to publish data on a repository of your choice • Open APIs allow tailored upload forms, automated workflows, and analysis and re-upload of data files
  • 16. Data Repository: Store datasets in a secure and trusted repository Researchers can • Store up to 100 GB of data per dataset in many formats • Describe how experiments can be reproduced • Keep track of dataset versions • Create DOI for citation (or university prefix)
  • 17. Data Monitor: Encourage and monitor compliance Institutions can • Keep track of data inside and outside institution • Achieve credibility, visibility and integrity of key research outputs • Maintain visibility of events in RDM space • Improve researchers' adoption of data sharing tools • Communicate value of data sharing to researchers during the research process
  • 18. Five Facts about Elsevier and Research Data
      Fact #1 Elsevier’s Mendeley Data supports the entire lifecycle of research data: The 5 modules that make up Mendeley Data are specifically designed to utilize data to its fullest potential, simplifying and enhancing current ways of working.
      Fact #2 Researchers and institutions own and control all the data: Mendeley Data allows researchers to keep data private, or publish it under one of 16 open data licenses, so they stay in full control.
      Fact #3 Mendeley Data is an open system: It is a flexible platform; modules are designed to be used together, standalone, or combined with other Elsevier and non-Elsevier solutions.
      Fact #4 Mendeley Data can increase the exposure and impact of research: Mendeley Data Search indexes over 10 million datasets from more than 35 repositories.
      Fact #5 Elsevier is an active participant in the open data community: Elsevier partners with the open data community, and is currently working on more than 20 projects globally.
  • 19. Mendeley Data already integrates through open APIs with the global Research Data Management ecosystem, as well as other Elsevier solutions + 35 repositories (BePress planned) • Mendeley Data Repository datasets are automatically synced with the Pure curation workflow • Projects, grants, equipment, showcase on portal (planned) • Mendeley Data Search results are visible on Scopus • Notify new articles to Monitor for data sharing compliance • Datasets appear as records on Scopus (planned) • Mendeley Data usage is accessible through Plum API and widget • PlumX metrics (citations, usage, social mentions) are captured and shown on Mendeley Data Repository Publish datasets alongside an article on Mendeley Data within the SSRN publication flow Publish or link datasets alongside an article on Mendeley Data within the ScienceDirect publication flow Researcher and Institutional Dataset metrics • User identity & login • Library (planned) • Notes (planned) • Projects (planned) Existing integration Planned integration • Mendeley Data indexed by OpenAIRE index • OpenAIRE Zenodo repository indexed by Mendeley Data Search Long-term preservation of published datasets Links between articles and datasets: • Contributed by Mendeley Data to Scholix • Indexed by Mendeley Data Search and Data Monitor • Consumed by Scopus and ScienceDirect Integrate with machine-readable data management plans • For more than 35 repositories the metadata as well as the underlying datasets are indexed by Mendeley Data Search • First repositories are actively integrating with the free and open ‘push API’ of Mendeley Data Search • Mint DOIs for Mendeley Data Repository • DataCite indexed by Mendeley Data Search
  • 20. 3. Research Object Composer
  • 21. Building an open interoperable data ecosystem: • Aggregates: link things together • Annotations: about things & their relationships • Container: packaging content & links (Zip files, BagIt, Docker images) • Identification: locate things regardless of where
  • 22. Building an open interoperable data ecosystem: Research Object Composer (http://www.researchobject.org)
      [Diagram: a database, an open repository and a workflow tool (Workflow Input, Task 1, Task 2, Task 3, Output) supply items 1, 2 and 3, which are combined into a research object (RO) through an Open API and deposited in Mendeley Data]
      • Research Object Profiler: add annotation and relationships (metadata) to a collection to describe a research object: URI, length, filename, checksums, etc.
      • Research Object Serializer (a manifest itemizing file names): serialise the Research Object in a standard format based on BagIt
      • Mendeley Data: DOIs • Metadata (Findability) • Open repo (Accessibility) • Versioning • RO Standard (Interoperability, Reusability)
  • 23. Purpose of the Research Object Composer*: • The RO Composer is not a registry of research objects, but it can list research objects currently under construction. • The RO Composer is a microservice whose responsibility is to help other services create and deposit research objects. • The composer acts as a temporary construction site that can be completed by multiple services (e.g. a data management system, a workflow system, a user interface). • These clients jointly build a Research Object that can then be validated according to the schema, before the RO is downloaded or deposited into an archive (like Zenodo or Mendeley Data). • Clients of the RO Composer are applications (driven by a user interface) or agents (engaged automatically from other events, e.g. a workflow run). • The RO Composer is not a required component for this: any software may generate research objects by following Research Object specifications. * From: https://github.com/ResearchObject/research-object-composer/blob/master/introduction.ipynb
  • 24. You can drive it today! • API: https://researchobject.github.io/research-object-composer/api/ • Source: https://github.com/ResearchObject/research-object-composer • Link to Jupyter Notebook tutorial (even I can do it!)
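The Research Object Serializer on slide 22 is described as producing a BagIt-based manifest that itemizes file names, with lengths and checksums recorded per file. The sketch below is only an illustration of that idea using the Python standard library; it does not call the RO Composer API, and the function names (`describe_file`, `build_manifest`, `demo_manifest`) and the sample file are hypothetical. It assumes the common BagIt convention of one `<checksum>  <path>` line per payload file.

```python
import hashlib
import os
import tempfile


def describe_file(path, root):
    """Return the per-file metadata named on the slide: filename, length, checksum."""
    with open(path, "rb") as handle:
        payload = handle.read()
    return {
        # Paths are stored relative to the bag root, with forward slashes.
        "filename": os.path.relpath(path, root).replace(os.sep, "/"),
        "length": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def build_manifest(root):
    """Walk `root` and return manifest lines, one '<checksum>  <path>' per file."""
    entries = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            entries.append(describe_file(os.path.join(folder, name), root))
    entries.sort(key=lambda entry: entry["filename"])
    return [f"{entry['sha256']}  {entry['filename']}" for entry in entries]


def demo_manifest():
    """Build a manifest for a throwaway directory holding one made-up data file."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "data"))
        target = os.path.join(root, "data", "results.csv")
        with open(target, "w", newline="") as out:
            out.write("sample,value\nA,1\n")
        return build_manifest(root)
```

Calling `demo_manifest()` returns a single manifest line for `data/results.csv`. A real research object would carry further annotations (URIs, relationships) that this sketch leaves out.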
  • 25. 4. Entellect and Life Science Solutions
  • 27. Human Papilloma Virus and Cervical Cancer: 1946 Papanicolaou develops Pap smear • 1976 zur Hausen proposes link between HPV and cervical cancer • 2006 Gardasil HPV vaccine approved • 2008 zur Hausen awarded Nobel Prize. Study impact of intervening research in this talk.
  • 28. Early Work, 1977: “a hypothesis has been presented that the virus found in genital warts may be involved in the etiology of human genital cancer”
  • 30. Citation Mapping Process • Build corpus of papers using broad search (~20,000 papers) on all aspects of cervical cancer and HPV • Expand corpus by adding all cited works not in the original corpus • Add cited works from the cited corpus (“grandchild” references) • Connect the discrete steps of scientific advances connecting the works • Apply graph mathematics to find all connected paths
  • 31. Assembling The Graph • Dense interconnected web of citations • Filter for only cited works within 3 years of the citing work (building on relevant knowledge). Diagram labels: Corpus; First level; Second level; Recognize identities in graph.
  • 32. Building the Corpus: 'papillomaviridae' AND 'cancer' AND [article]/lim: 2,747 results from 1975-2019 • 55,414 references total cited in this set • 29,064 unique references (the references overlap), 1870-2019 • 719,470 references cited in this set of 29,064 papers • 259,908 unique in this set • Total corpus of work using this method is 182,402 unique articles • Citation network has 103,443 edges
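The corpus-building steps on slides 30 and 32 (start from a search result, add every cited work not yet in the corpus, then add the “grandchild” references) can be sketched as a level-by-level expansion. This is a minimal illustration under stated assumptions, not the pipeline used for the talk: the paper identifiers and the `REFERENCES` mapping are invented toy data, and `expand_corpus` and `citation_edges` are hypothetical helper names.

```python
# Toy reference lists: each key cites the works in its list (invented data).
REFERENCES = {
    "P1": ["A", "B"],
    "P2": ["B", "C"],   # "B" is cited twice, so references overlap
    "A": ["X"],
    "C": ["X", "Y"],
    "X": ["Z"],         # third-level reference, outside a two-level expansion
}


def expand_corpus(seed_ids, references, levels=2):
    """Grow a corpus by following cited works for `levels` rounds."""
    corpus = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(levels):
        cited = set()
        for paper in frontier:
            cited.update(references.get(paper, ()))
        frontier = cited - corpus   # keep only works not already in the corpus
        corpus |= frontier
    return corpus


def citation_edges(corpus, references):
    """Return (citing, cited) pairs whose two ends are both in the corpus."""
    return {
        (citing, cited)
        for citing in corpus
        for cited in references.get(citing, ())
        if cited in corpus
    }


corpus = expand_corpus(["P1", "P2"], REFERENCES)
edges = citation_edges(corpus, REFERENCES)
```

In the toy data the two seed papers cite four references in total but only three unique ones, which mirrors the distinction the slide draws between total and unique references.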
  • 33. Path Finding: Select “interesting” endpoints • Significant starting point: proposal that HPV could be related to cancer • Significant endpoint: recognition of HPV/cancer connection. Use graph traversal analytics to find all paths greater than 5 papers that connect the two ideas. Separate by year.
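Slides 31 and 33 describe two operations on the citation graph: keep only citations where the cited work appeared within 3 years of the citing work, then enumerate every path of more than 5 papers between a starting paper and an endpoint. The sketch below shows one way to do this with a plain depth-first search; the slides do not say which traversal method was actually used, and the years, edges, and function names here are made up for illustration.

```python
# Invented toy data: publication years and (citing, cited) pairs.
YEARS = {"a": 1977, "b": 1979, "c": 1981, "d": 1984, "e": 1986, "f": 1988}
EDGES = [
    ("b", "a"), ("c", "b"), ("d", "c"), ("e", "d"), ("f", "e"),
    ("f", "a"),   # 11-year gap, removed by the filter
    ("d", "b"),   # 5-year gap, removed by the filter
]


def recent_edges(edges, year_of, max_gap=3):
    """Keep citations whose cited work is at most `max_gap` years older."""
    return [
        (citing, cited)
        for citing, cited in edges
        if 0 <= year_of[citing] - year_of[cited] <= max_gap
    ]


def find_paths(edges, start, end, min_papers=6):
    """List simple paths from the early paper `start` to the later paper `end`.

    Paths run forward in time: each step moves from a work to a paper citing it.
    Only paths with at least `min_papers` papers ("greater than 5") are kept.
    """
    cited_by = {}
    for citing, cited in edges:
        cited_by.setdefault(cited, []).append(citing)

    paths = []

    def walk(node, path):
        if node == end:
            if len(path) >= min_papers:
                paths.append(list(path))
            return
        for following in cited_by.get(node, ()):
            if following not in path:   # never revisit a paper on this path
                path.append(following)
                walk(following, path)
                path.pop()

    walk(start, [start])
    return paths


filtered = recent_edges(EDGES, YEARS)
paths = find_paths(filtered, "a", "f")
```

Exhaustive path enumeration grows very quickly with graph size, so on a network with around 100,000 edges some pruning, such as the 3-year filter or a cap on path length, would likely be needed in practice.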
  • 34. Example Pathway Linking Idea to Vaccine: 17 links, 30 years.
  • 35. Resulting Graph: represents the incremental advances by year from concept to acceptance (1977 to 2008). Edge legend: new cites existing.
  • 36. 5. Data Analytics: Quantum Computing
  • 41. Quantum Computing Research: Highest FWCI Non-US
  • 42. Quantum Computing Research: Highest FWCI US
  • 43. Quantum Computing Research Worldwide--FWCI
  • 44. Top 20 universities active in Quantum Computing (scatter plot; axes: FWCI, 0 to 5, and Publications, 0 to 250): University of Waterloo; National University of Singapore; Massachusetts Institute of Technology; University of Science and Technology of China; University of Oxford; Tsinghua University; University of Tokyo; Harvard University; University of Maryland; University of New South Wales; University of California at Santa Barbara; ETH Zurich; University of Sydney; RAS; University of Southern California; Perimeter Institute for Theoretical Physics; University College London; Princeton University; University of Michigan
  • 45. Quantum Computing Research in Top 10% Citation Percentile, US vs. Non-US
  • 46. Quantum Computing Research Worldwide--Academic-Corporate Collaboration
  • 47. Academic-Corporate Collaboration—Network Map of Top 20 universities in Quantum Computing
  • 48. Quantum Computing: Academic-Corporate Collaboration and Patent Citations per Scholarly Output, US vs. Non-US
  • 49. Quantum Optics/Flux Qubits: Academic-Corporate Collaboration and Patent Citations per Scholarly Output, US vs. Non-US
  • 50. Quantum Optics/Flux Qubits Research by Country
  • 54. 6. Elsevier and Open Science
  • 55. ELSEVIER I Elsevier Open Science: Creating value through collaboration I CONFIDENTIAL. Global market dynamics and technologies are reconfiguring the academic ecosystem. Macroeconomic developments: Ecological and societal sustainability • Global population is growing; 9B people in 2050 • Challenge to produce more with less and cleaner input • Challenge to solve poverty and unequal allocation of resources. Shifting power balance from West to East • Strong economic growth in China and India • Rise of the middle class; improvement of educational and health care system and food supply chain. Technological developments: The web • Everyone is a publisher • Content access is ubiquitous. The social web • Professional and personal networks emerge without traditional institutions • Everyone is a peer reviewer. Big data • Explosion of data through networking of measurement tools • Radically cheaper tools and computing power. Social developments: • Pressure from society and funders to justify the costs of science • Need for reliable research results (that can be trusted) • Patients/citizens demand access and increased participation • Distributed computing makes it easier to make and share tools, content and code • Overall need for more transparency and accountability, also in doing and reporting research. Emergence of open science: Open Peer Review; New social networks; Data, tools and workflows are shared; Open Data; Society is engaging more; Open APIs; Open Source Software.
  • 56. Some examples of Open Science: • Scientists build data sharing tools: Carl Kesselman builds tools to enable neuroscientists to store and share their data in a better way. • Computers are scientists: Viktor Pankratius builds software programs that generate hypotheses about volcano eruptions; the software can steer drones to collect data. • Science becomes a game, which anyone can play: Lena Deus solves scientific problems through Kaggle; the system awards her points for scoring highest on Machine Learning tasks.
  • 57. Moving to a network of connected components: 1. Take an Open Source data repository and find some Open Data: Deriva, an Open Source data repository; neuroscience data. 2. Write some Open Source software to mash them up: Jupyter Notebook to calculate properties. 3. Share outputs as OA/OD/OS: share new data sets on Deriva; publish papers in an OA journal; share code on platforms like Github. Networked system: 1. Community adds elements to open science platforms that can be used by everyone. 2. Researchers build upon the combination of shared content/system elements. This leads to new scientific knowledge and output. 3. All sharable elements find their way to other open platforms and formats and can be re-used, causing a network effect. Diagram labels: user A, user B, user C; Platform A, Platform B; Data v1, Data v2; Tools B, Tools C; article; Open Research Platform; Open Data Repositories; Open Access Journals; Code Networks.
  • 58. Open Science represents a transition from a pipeline to a networked knowledge system. Today: linear supply chains (Suppliers, Manufacturers, Distributors, Consumers). Model: Castle • Goal: selling content • Metrics: number of units sold • Strategy: optimize content delivery to users. The future: networked open science (diagram: a web of interconnected users, data, tools and articles). Model: Marketplace • Goal: grow number of interactions • Metrics: number of interactions between users • Strategy: optimize number of network interactions. Linear supply chains are evolving into complex, dynamic and connected value webs. Win by reputation. Win by trust.
  • 59. Some current Open Science efforts (diagram: Open Science at the centre of Open Access, Open Data, Open Metrics, Research Integrity & Reproducibility, Science & Society, Open Tools and Software). Open Access: - Hybrid/Gold journals, open/self-archive options - Contributing to CHORUS, CrossMark, RA21 - ‘Platinum OA’ on bepress Digital Commons - Pilot SSRN Preprint of the Lancet. Research Integrity and Reproducibility: many efforts, including: - Full GDPR compliance across all Elsevier products - Preregistration and Registered Reports - STAR Methods for Cell, transparent reporting - Plagiarism and image manipulation detection - Statistics checking - Reproducibility badges/TOP guidelines - Transparency in contributorship roles (CRediT Taxonomy) - Research collaborations, e.g. Humboldt, Data Integrity. Science and Society: - Science Literacy effort: Topic Pages, Audioslides, Science and People - Access to content via Patient Inform, Research4life, Bookshare and Load2Learn - Elsevier Foundation supporting many projects including Green and Sustainable Chemistry, awards for early-career women scientists from developing world, many more. Open Data: - All data is open on all platforms - Following TOP guidelines across board - Co-leads on Enabling FAIR Data requiring data deposits in Earth & Space Science - Co-leads Data Citation Principles in Force11 - Supporting Scholix Linked Data repository and other open data standards, efforts through RDA, ORCID, CrossRef, etc. Open Metrics: - CiteScore free API - PlumX metrics and NewsFlo: free layer of societal impact metrics on article level - Helping lead RDA Make Data Count effort with CDL/DataCite to establish data metrics. Open Tools and Software: - Open APIs for most products - Many research collaborations leading to Open Source software, e.g. Github4Labs, NIH Data Commons - Hackathons, in medicine <Elsevier Hacks>, for Mendeley - Content and data available for research and development and hackathons

Editor's notes

  1. Analogies: Manager is like OneDrive for datasets: collaborate on an active project; it allows for review and approval of datasets by the library prior to publication. Manager is the Trello for research project management. RESEARCHER: Example from Wouter: Why would a psychologist use this? Project management dashboard: it enables organized project management (where is the data? Could be Dropbox). Templates can be set up. MOVE FROM FILES TO DATASET (files with description, metadata and structure). Manager helps make your data FAIR. INSTITUTION: Monitor allows for clear presentation and enables librarians to make a decision to keep/delete private data, especially when someone has left the institution. Archival policies. Monitor helps prevent «data loss».
  2. Now let’s dive a little deeper into each module, starting with Repository. We know that counting only publications does not reflect the true amount of research created during an experiment; there is likely more than one dataset tied to a published article. By using Repository, researchers can: store up to 100GB of data per dataset; ensure proper metadata tagging and storage; increase discoverability of their dataset by easily creating a DOI to allow for citation. This also ensures datasets get counted as a research output.
  3. Standards-based metadata framework for logically and physically bundling resources with context http://researchobject.org
  4. So let’s get to quantum computing, which is the area we were asked to focus on within the larger topic of quantum technologies. Here we can see the institutions that create the largest number of papers on QC, with the Chinese Academy of Sciences and CNRS, two national lab systems, at or near the top.
  5. If we flip this to look at field-weighted citation impact, however, a measure of the work’s relative impact in the field, we get a very different picture: still highly international, but with more US institutions, and notably a number of US companies producing high-impact work.
  6. The word cloud represents the top 50 semantically-derived keyphrases for the total set of papers representing quantum computing.
  7. If we click on the specific term “polynomial approximation” in the word cloud, we find out how quickly the topic has grown over the last 5 years, and even which individuals and institutions worldwide are working on the particular concept of polynomial approximation. It’s immediately evident that quantum computing is a highly international and competitive field. And remember, 50 keyphrases exist for each of the 100,000 topics that are modeled in the topic prominence calculation.
  8. Let’s slice the data in a different way. Here are the top 20 institutions outside the US, again arranged by FWCI, that are doing important work in quantum computing. Notice anything? Virtually every one of these is a university.
  9. Here’s the same list for the US. What is different here? For the US list alone, there are 3 large corporations, the NSF, and a DOE national lab contributing high-impact research. We know that quantum computing is being invested in and chased vigorously across the globe. The Chinese are pouring immense financial resources into this, and they have plenty of human talent, including many who are likely employed by the people in this room. In my view, it is this nexus of different organizations, the close linkages between them, that gives the US its edge, if we have any edge. SEMATECH is another example of a complex of different organizations engaging in coordinated action. Over 90% of the research papers that Google publishes, and over 80% that IBM publishes, are done with one or more collaborators from academia.
  10. So what does this difference look like in action? This geomap captures global research activity in quantum computing. The size of the bubble is the number of papers, the color intensity is the FWCI. Here we can see research is fairly evenly distributed in the US, Europe, and East Asia.
  11. The Y axis here is the Field-Weighted Citation Impact for each university, while the position on the X axis shows the total number of papers. Clearly UC Santa Barbara is doing something exceptional here; we’ll explore that a bit more later. Waterloo and NUS are producing a lot of papers, though at a relatively low citation impact. Generally, the more papers one is publishing, the lower the overall impact will be. (Taken from a slightly different data set.)
  12. We can look at other proxies for quality, including the number of outputs in top percentiles—here the percentage of research in the top 10% of cited outputs, which is around 29% for the US in 2016, around 15.5% for non-US institutions.
  13. Here’s the same map, but now the color intensity is the level of academic-corporate collaboration. The dark red are tech companies, but US universities also have much higher levels of AC collaboration than others. Europe and Asia are very pale by comparison.
  14. Let’s look at a different and more granular view of the same information. There is a lot going on in this graphic; it’s a different way of looking at the landscape. The bluer the dot, the higher the FWCI. The thicker the line, the more papers are shared between the two nodes. Network centrality implies higher levels of connectedness. Japan is peripheral and mostly connected to other Japanese entities. China, particularly Tsinghua and UST China, is more connected, Singapore still more so. However, they are not as connected or central as a few key US, UK, Australian and Canadian institutions, and one can clearly see that a few large US corporations are also quite central here. In my view, one remaining advantage the US seems to have (in addition to lots of high-quality research) is the nexus between industry and academia. Because of the enormous manufacturing complexities, the SEMATECH kind of highly coordinated approach (academia/industry/govt) may make more sense in this sector than in many others, also given questions of cryptographic security and national security implications.
  15. We can also look at three-factor analysis. Here we map total scholarly output on the Y axis. US output of 2392 papers (2008-2016) represents about 27% of global output. The X axis is the level of academic-corporate collaboration. 7.7% of US papers, but only 1.2% of non-US papers, are AC collaborations. Finally, the size of the bubble shows the number of patent citations for every thousand papers published. For the US, this is 111 citations, or roughly 11 citations in patents worldwide for every 100 papers. It generally takes 3-5 years before papers are cited in patents, so this likely understates the total since we have 2016 papers in here. The same measure for non-US institutions is 21.6 per 1000, less than one-fifth the level. This graphic really points out the large gap between the US and the rest of the world (ROW) regarding university-industry (UI) collaboration, and overall patenting activity driven by university research as well.
  16. The quantum computing topic is actually an aggregate made up of more granular, distinct topics. The same kinds of analysis can be done on these topics, which are generated directly from the topical model that I covered earlier.
  17. These are the same topics by country and number of downloaded articles.
  18. We can look at top corporations publishing in this area, and can see that the bulk are US firms with some Japanese representation as well.
  19. Top universities for the same topic—Yale, UCSB, Berkeley, and MIT produce a great deal, with UCSB and Yale authors having a particularly high FWCI
  20. One can always do a keyphrase-based analysis to delve into a particular aspect of the topic. Here we look at the same set of papers on flux qubits that cover the concepts of circuits, resonators, and Josephson junctions. Note that the number of papers from Yale has gone down from 85 to 46 here. Dr. Devoret has produced more work than anyone else covering these concepts.
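The figures quoted in note 15 can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in that note; the constant and function names are chosen here for illustration, and the implied global output is an estimate derived from the rounded “about 27%” share rather than a figure from the talk.

```python
# Figures quoted in note 15 (2008-2016).
US_PAPERS = 2392
US_SHARE_OF_GLOBAL = 0.27           # "about 27% of global output"
US_AC_SHARE = 0.077                 # academic-corporate share, US papers
NON_US_AC_SHARE = 0.012             # academic-corporate share, non-US papers
US_PATENT_CITES_PER_1000 = 111.0
NON_US_PATENT_CITES_PER_1000 = 21.6


def implied_global_output(papers, share):
    """Estimate total output from one country's paper count and its share."""
    return papers / share


global_papers = implied_global_output(US_PAPERS, US_SHARE_OF_GLOBAL)
patent_ratio = NON_US_PATENT_CITES_PER_1000 / US_PATENT_CITES_PER_1000
ac_ratio = US_AC_SHARE / NON_US_AC_SHARE

print(f"Implied global output: about {global_papers:.0f} papers")
print(f"Non-US patent citation rate relative to US: {patent_ratio:.3f}")
print(f"US academic-corporate share relative to non-US: {ac_ratio:.1f}x")
```

The patent citation ratio comes out just under 0.2, consistent with the note's “less than one-fifth”, and the academic-corporate collaboration share is a little over six times higher for US papers.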