Measuring the adoption of open science

Heather Piwowar
Department of Biomedical Informatics
University of Pittsburgh

PSB workshop on Open Science, January 2008

You cannot manage what you do not measure

Measuring the adoption of open science
sharing data

lots of data sharing!
http://www.genome.jp/en/db_growth.html

but how much isn’t shared?
what isn’t shared?
who isn’t sharing it?
why not?
how much does it matter?
what can we do about it?

I’ll be highlighting the results of a number of studies:
  surveys
  manual reviews
  citation analyses

Preview
  Although some scientists voluntarily share their research data, many don’t.
  Data withholding correlates with the usual suspects.
  Feedback on incentives may surprise you.
  Much room for continued research, including several ways that you can help.

How much data gets shared?

Data sharing frequency depends on datatype
Bar chart (axis 0% to 100%):
  DNA sequences
  gene expression microarrays
  proteomics spectra
Noor et al. PLoS Biology 2006.
Ochsner et al. Nature Methods 2008.
Piwowar et al. PLoS ONE 2007.
Editorial. Nature Biotech 2007.

Data sharing frequency depends on who you ask
Bar chart (axis 0% to 40%):
  self-reported denying a request in last 3 years
  trainees self-reported denying a request
  been denied access to data, materials, code
  authors “not able to retrieve raw data”
  not willing to release data
Campbell et al. JAMA. 2002.
Kyzas et al. J Natl Cancer Inst. 2005.
Vogeli et al. Acad Med. 2006.
Reidpath et al. Bioethics 2001.

Are the outcomes of data sharing positive or negative?

80% of scientists report positive experiences from data sharing
Bar chart (axis 0% to 50%):
  positive only
  mixed experiences
  negative only
Positive experiences: collaboration, new research, etc.
Negative: scooping, preventing publishing, IP, or $$ benefit, etc.
Blumenthal et al. Acad Med. 2006

Why is data withheld?

Withholding is associated with industry links, competitiveness
Bar chart (axis 0 to 3):
  industry involvement
  perceived competitiveness of field
  male
  sharing discouraged in training
  human participants
  academic productivity
40% of surveyed scientists said data sharing was discouraged during their training!
Blumenthal et al. Acad Med. 2006

Withhold because too much effort, desire for continued publishing
Bar chart (axis 0% to 80%):
  sharing is too much effort
  want student or jr faculty to publish more
  they themselves want to publish more
  cost
  industrial sponsor
  confidentiality
  commercial value of results
Campbell et al. JAMA 2002.

Obstacles for sharing: publishing, control, cost
Bar chart (axis 0% to 50%):
  want to publish more papers first
  want exclusive use
  ensure data confidentiality
  control
  avoid cost of preparation
Hedstrom. Society of Am Archivists Ann Meeting. 2008.

Comments show desire for control
"Before I send you the data could I ask what you want it for?"
"Can you be more explicit, please, about the analyses you have in mind and what you plan to do with them?"
"We'll have to discuss your request with the other coauthors. Before we do that, I'd like to know your proposed analysis plan."
"We are not finished using the data, but when we are finished with it, we would be open to requests for the data."
"Any use of the data other than for the specific purpose laid down in the contract of collaboration is effectively ruled out."
Reidpath et al. Bioethics 2001.

What are the perceived and measured benefits?

Benefits both societal and personal
Bar chart (axis 0% to 80%):
  saves other people effort
  for the public good
  will be cited and enhance my reputation
  saves me effort in answering questions
  saves me effort in managing my data
Hedstrom et al. IASSIST 2006.

Measuring societal benefit
 - assume each database hit saves $0.10, or a fraction of data collection costs
 - assume the value is approximated by the (idealized) funding target for data maintenance: 20-25% of the cost of generating the data
 Remembering, moreover, that the indirect benefits are much higher than the direct ones.
Ball et al. Nature Biotechnol. 2004.

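The two assumptions on the societal benefit slide reduce to simple arithmetic, sketched here in Python. The function names and the inputs in the usage note are hypothetical, chosen only for illustration; only the $0.10 per hit and the 20-25% range come from the slide, and this is not the method used by Ball et al.

```python
from typing import Tuple


def value_from_hits(hits: int, saving_per_hit: float = 0.10) -> float:
    """Estimate direct value as database hits times an assumed saving per hit.

    The default of $0.10 per hit is the figure quoted on the slide.
    """
    return hits * saving_per_hit


def value_from_maintenance_target(
    generation_cost: float, low: float = 0.20, high: float = 0.25
) -> Tuple[float, float]:
    """Estimate value as a range: 20-25% of the cost of generating the data.

    Returns a (low_estimate, high_estimate) pair in the same currency
    as generation_cost.
    """
    return generation_cost * low, generation_cost * high


if __name__ == "__main__":
    # Hypothetical inputs, used only to show the arithmetic.
    print(value_from_hits(1_000_000))
    print(value_from_maintenance_target(2_000_000.0))
```

Both estimates cover direct benefits only; the slide notes that indirect benefits are much higher, and nothing here attempts to capture them.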
Measuring personal benefit: increased citations
Gleditsch et al. Int Studies Perspectives. 2003.
Piwowar et al. PLoS ONE. 2007.

What incentives are valued?

Incentives to share: perceived value, mandates, recognition as publication
Bar chart (axis 0% to 75%):
  if I thought it would really benefit others
  if required for future funding
  if required for publication
  if deposits counted as a publication
  if citations to data were valued
  if monetary compensation
Hedstrom. Society of Am Archivists Ann Meeting. 2008.

What would make it easier? help and straightforward guidelines
Bar chart (axis 0% to 75%):
  more funder time and money
  help with confidentiality issues
  on-site help
  more training
  better guidelines
  better tools
  simpler requirements
  less staff turn-over
Hedstrom et al. IASSIST 2006.

Incentives for quality and docs: help, visibility, and nagging
Bar chart (axis 0% to 40%):
  if I had help
  if quality was visible to others
  if I noticed that others had higher quality
  if the archivists nagged me
  if data users nagged me
  if I had released it sooner
Hedstrom et al. IASSIST 2006.

Do journal mandates work?

Journals with enforceable policies have more shared datasets
Bar chart (axis 0 to 4):
  sharing rate when no policy (baseline)
  unenforceable policy
  enforceable policy
Piwowar, Chapman. A review of journal policies for sharing research data. ELPUB 2008.

Once shared, always there?

Data contacts and storage decay with time
URL decay: [figure]
email decay: [figure]
Supplementary information in 6 top journals: 5% unavailable after 2 years, 10% unavailable after 5 years
Evangelou et al. FASEB J. 2006.
Wren. Bioinformatics 2008.
Wren et al. EMBO Rep 2006.

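Decay figures like "5% unavailable after 2 years" come from tallying availability checks by the age of the item. A minimal sketch of that tally follows, assuming each check is recorded as an (age in years, is available) pair. The function name and record format are hypothetical, and this is not the procedure used in the cited studies.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def unavailable_rate_by_age(
    checks: Iterable[Tuple[int, bool]]
) -> Dict[int, float]:
    """Return the fraction of unavailable items for each age in years.

    Each check is an (age_in_years, is_available) pair, for example the
    outcome of trying to retrieve one supplementary file or URL.
    """
    totals: Dict[int, int] = defaultdict(int)
    missing: Dict[int, int] = defaultdict(int)
    for age, is_available in checks:
        totals[age] += 1
        if not is_available:
            missing[age] += 1
    # Ages are reported in ascending order; every age in totals has >= 1 check.
    return {age: missing[age] / totals[age] for age in sorted(totals)}
```

Called on a list of check results, it returns a mapping from age to the unavailable fraction, which can then be compared across journals or data types.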
Anything else?
  data completeness?
  replicability?
  theoretical models of info behaviour?
Good questions.
Out of time.
Ask or see online bibliography for more info.

Do funder mandates work?
Which subdisciplines have best practices? particular weaknesses? why?
Good questions.
Research underway....
  NIH: Haga, S. Exploring Attitudes About Data Disclosure and Data-Sharing in Genomics Research.
  NSF: Hedstrom, M. Incentives for Data Producers to Create Archive-Ready Data Sets.
  National Inst of Nursing Research: Pienta, A. Barriers and Opportunities for Sharing Research Data.
  NLM training grant: Piwowar, H. Impact, prevalence, and patterns of shared biomedical data.
  +others

In some cases do the costs outweigh the benefits?
Do mandates decrease quality of shared data?
What is the prevalence of data reuse?
What would facilitate reuse?
Good questions.
We don’t know.
Future research!

Conclusions
Take home #1
Although some researchers voluntarily share data, many don’t.
The frequency of sharing depends on
  data type,
  who you ask,
  how you ask,
  what you plan to do with the data,
  what journal it is published in....

Take home #2
Withholding is correlated with the usual suspects:
  desire to publish more, avoid effort, maintain control, industry relationships.
Relative value of incentives is surprising:
  demonstrated value, visibility, help, straightforward guidelines, effective mandates, and nagging :)
Each of us can make a difference here:
  Write letters to the editor about journal policies,
  blog a how-to guide in plain English,
  get involved in data standards,
  offer help to colleagues,
  communicate instances of value.

Take home #3
Much room for future research:
  costs and benefits, data quality, reuse ....
Opportunities for traditional large-scale grants across a range of disciplines and agencies
But also opportunity for impact in less formal channels:
  You can help communicate anecdotes, evaluations, and visualizations via blogs, published research notes, perspectives, letters to the editor, and water-cooler conversations.

You cannot manage what you do not measure
->
If we measure current behaviour, we’ll learn how to facilitate the adoption of open science, and we’ll know what and when to celebrate!

Thanks to
  Wendy Chapman + the Dept of Biomedical Informatics at U of Pittsburgh
  NLM for training grant funding: 5 T15 LM007059-22 (U of Pitt DBMI)
  NIH for research and travel funding: 1R01LM009427-01 (Wendy Chapman)
  PSB, Shirley, and Cameron for organizing this workshop



Study references available at http://www.citeulike.org/user/hpiwowar/tag/psb-talk
Contact me for more info at hpiwowar@gmail.com

My shared data: www.dbmi.pitt.edu/piwowar
Share your research data too!
