Open access to scientific research data
Gudmundur A. Thorisson, PhD <gt50@leicester.ac.uk>
Research associate, University of Leicester
Guest scientist, University of Iceland
Participant in the GEN2PHEN Consortium and the ORCID Technical Working Group



                                  This work is published under the Creative Commons Attribution license (CC BY:
                                  http://creativecommons.org/licenses/by/3.0/) which means that it can be freely
                                  copied, redistributed and adapted, as long as proper attribution is given.
Overview



   ๏ Intro to the world of Big Science & Big Data
        • Why is inadequate access to data such a problem?
   ๏ Incentive-based approaches to tackling the sharing problem
        • Identification, identification, identification
   ๏ Key relevant developments internationally
   ๏ Some food for thought for funders, institutions, other key players
   ๏ Concluding remarks




RDFC2012 Conference on Open Access and Digital Rights, Reykjavik, March 29th 2012
Big Science, Big Data

• Scientific research increasingly large-scale and data-driven

• High-profile discipline examples

     – High-energy particle physics - experiments
       performed in the Large Hadron Collider

     – Astronomy - data from ground-based and space
       telescopes, the Virtual Observatory (VO)




                                                                             •   Doctorow, C. Big data: Welcome to the petacentre. Nature 455, 16-
                                                                                 21 (2008). http://dx.doi.org/10.1038/455016a
Hypothesis generation guided by available data




                                                                                              Kell and Oliver. Bioessays (2004) vol. 26 (1)



• Science paradigms
    – 1st: Empirical - describing natural phenomena
    – 2nd: Theoretical - models, generalizations
    – 3rd: Computational - simulating complex phenomena
    – 4th (1+2+3): Data exploration, e-Science


Gray, J. 2009. The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research




Biological research too is increasingly big and data-driven



  • From: small-scale datasets that
    fit into a printed journal article




                                    Richards, M. et al. Paleolithic and neolithic lineages in the European mitochondrial gene pool. American
                                    journal of human genetics 59, 185-203 (1996). http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1915109/




Biological research too is increasingly big and data-driven

• To: large-scale collection of
  biological data in digital form




• Huge technological advances in the last 5-10 years
     – experiments / observations: gathering data with high-throughput equipment
     – computer technology: storing & analyzing massive data volumes


• Example: massively-parallel sequencing
     – Determine human genome sequence in <1 day - the $1000 genome
     – Metagenomics: sequence *everything* in environment samples
     – Large bio-specimen collections
          • 100,000s of individuals in disease/population biobanks

Examples: domain repositories for sequence data


 • GenBank - genetic sequence
   repository, established 1986




                                                                                    • UniProt - knowledge base for
                                                                                      protein sequence & function
“Community resource projects” - large-scale data generation for the purpose of making the data available for broad reuse


• The sequence of the human genome
     – International Human Genome Project - mandatory rapid data sharing under the Bermuda principles



• Pattern of variation in the human genome
     – International Haplotype Map Project - genotyping population samples
     – 1000 Genomes Project - sequencing population samples




Big Data – challenges, opportunities
• Managing & making sense of large-scale datasets
     – Data easy/cheap to generate - not so cheap to store & use
     – Favorite quote: “the $1000 genome sequence, followed by the ++$10,000 analysis”



• Integration & analysis - combining datasets
     – more data of the same type - e.g. combine sequences from multiple species
     – related data of different type - e.g. a person’s genome sequence + his/her phenotype


• Potential for accelerating research, creating new knowledge and (in
  biomedicine) improving human health.


• Key driver = unrestricted sharing of scientific data deposited in
  the public domain

Data = “fuel” of science



                            [..] If digital technologies are the engine of this
                            revolution, digital data are its fuel. But for many
                            scientific disciplines, this fuel is in short supply.[..]

                         Smith,V. Data publication: towards a database of everything. BMC Res Notes (2009) vol. 2 (1)




Biology and data sharing in the “long tail”

• Biology is complex, so data are often very heterogeneous
• Technologies changing rapidly
• Lots of small-scale research projects
• Lots of small/medium datasets
• Data in the long tail usually *not* shared OR not shared in a useful way

[Figure: The ‘long tail’ of dark bio-data]

 • Contrast with other data-intensive disciplines with
      – a long history of sharing research data - a “culture of sharing”
      – big, expensive, shared facilities = the only way to do this kind of research
      – relatively homogeneous datasets, easier to scale up to big volumes (e.g. telescope images)


[…] Overall, only 47 papers (9%) deposited full primary raw data
online. None of the 149 papers not subject to data availability
policies made their full primary data publicly available.

Conclusion: A substantial proportion of original research papers published in
high-impact journals are either not subject to any data availability
policies, or do not adhere to the data availability instructions in their
respective journals. This empiric evaluation highlights opportunities for
improvement.




DATA
   → analysed, synthesised, interpreted →
INFORMATION
   → published →
KNOWLEDGE (Publication)

Lots of published knowledge, but hard/impossible to go back and
reproduce work & validate findings
+
Opportunity for maximising the value of data through reuse is wasted

[Image: a hand drawing the question “WHY?”. Credit: http://cutcaster.com/photo/800902839-The-hand-drawing-question-WHY/]
Lots and lots of diverse reasons!!

Some quotes from researchers:

   “Don't know which digital repository I should upload to”
   “Too much work, got better things to do!”
   “My competitors will just take the data and ‘scoop’ me”
   “It's my data, I collected them and no one else is entitled to use them”
   “[myriad other reasons]”

Worryingly, many authors don't seem to care whether the evidence
underpinning their published findings is accessible or not




Koslow. Should the neuroscience community make a paradigm shift to sharing primary data? Nat Neurosci (2000) vol. 3 (9). http://dx.doi.org/10.1038/78760


Gnarly issue #1: “ownership” vs “stewardship”



• Many researchers consider data their property, even if the research
  is funded by public money
   – e.g. want to do further analysis on data in future, publish more papers


• ...which conflicts with the interests of other stakeholders in the game
  (e.g. funders, universities) who want:
   – to maximize return on investment in the funded research
   – to ensure good, solid evidence-based science is done, etc.




Gnarly issue #2 – biomedical data

• Usually sensitive, cannot be shared without restrictions
     – Detailed, reidentifiable biomedical data that cannot be fully anonymized
     – Personal privacy considerations


• Specialized controlled-access archives deal with some of this
     – NCBI's database of Genotypes and Phenotypes – dbGaP
     – European Genome-phenome Archive – EGA
     – [specific diseases / disorders, research consortia, others]




[Image: web page “How to Make a Tackle in Rugby”. Credit: http://djamba.com/how-to-make-a-tackle-in-rugby.html]
Sharing now tends to be driven by mandates...

  * Journals increasingly require data to be made available

  “Provide supporting data in a repository OR we won’t
  publish your paper”

  * Funders increasingly require a data sharing plan &
  budget baked into grant proposals.

  “Publish the data we are funding you to generate OR we
  will not fund your research again”

...which are an imperfect solution

• Arguments that mandates by themselves are not the way

• Mandates likely to ensure only minimum compliance
     – sharing would be done in minimally useful form (as in, whatever is the least effort)

• ...and are meaningless if not enforced (currently the case with many journals)

Using just a stick gets you only so far
Strategies focused on encouraging sharing

                                 - Make it easy -
                                - Make it useful -
                                - Make it citable -




Treating data as citable publications in their own right

• Core strategy: enable data to be treated as 1st class citizens of the
  scholarly record which:
         i) are indexed and can be discovered, located and accessed, and
         ii) can be properly identified & cited unambiguously like other scholarly works

• Link datasets with the primary journal publication - citation crosslinks
• Give data creators/curators/analysts proper credit for their contribution
  to the digital resource


• Focus on the benefits to researchers from publishing their data
     –   Data sharing → Data PUBLICATION + CITATION
     – Others reuse & cite their stuff → more citations → more impact
     – The more useful a dataset, the more likely to be used & cited
Exemplar – Data Dryad

“international repository of data
underlying peer-reviewed articles in
the basic and applied biosciences”
  http://datadryad.org



• Combines
     – Mandates (journal policy)
            and
     – Citable data publication



• Citation cross-linking
     – Paper references dataset
     – Dataset references paper



Key building blocks: the 3 I’s of identification




I #1: Identifying scholarly publications (and other research outputs)
          • Why? So it is possible to..
             ..cite the work unambiguously (‘..we used the method described in Thorisson et al (2009)’)
             ..locate the work (retrieve Nature article as PDF from journal website)
             ..give credit to persons/entities who contributed to the work (G. Thorisson authored paper X)
          • Need for globally unique, persistent identifiers to combat unstable Web URLs, broken hyperlinks
          • e.g. Digital Object Identifiers (DOIs) for pubs, datasets and more:
               – Bell et al. 2009. Science 323(5919) doi:10.1371/journal.pone.0024357
               – Goodwillie C et al (2005) Data from: The evolutionary enigma of mixed mating systems in
                 plants: occurrence, theoretical explanations, and empirical evidence. Dryad Digital
                 Repository. doi:10.5061/dryad.292q34fp
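
To make the point about persistent identifiers concrete, here is a minimal sketch of how a DOI written in any of the forms used on these slides ("doi:..." or a resolver URL) could be reduced to one canonical string and turned into a resolvable link. The function names and the choice of https://doi.org/ as the resolver are assumptions made for illustration, and the validity check is deliberately crude. Nothing is fetched over the network.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy is used as the resolver.
DOI_RESOLVER = "https://doi.org/"

# Prefixes that people commonly write in front of a bare DOI.
KNOWN_PREFIXES = (
    "doi:",
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def normalize_doi(raw):
    """Strip a leading 'doi:' or resolver URL and return the bare DOI."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Crude sanity check only: a DOI starts with '10.' and has a '/'
    # separating the registrant prefix from the suffix.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def doi_to_url(raw):
    """Build a resolver link for a DOI given in any supported form."""
    return DOI_RESOLVER + quote(normalize_doi(raw), safe="/")


if __name__ == "__main__":
    for text in ("doi:10.5061/dryad.292q34fp",
                 "http://dx.doi.org/10.1038/455016a"):
        print(normalize_doi(text), "->", doi_to_url(text))
```

The design point is that the citation and the location are decoupled: the same identifier string keeps working even if the publisher's own URLs change.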




I #2: Identifying use/reuse - measuring impact

      – Historical reliance on formal citations and citation-based metrics
      – ISI Impact Factor widely used, but really a metric for the influence of a scholarly journal
      – Citation analysis not going away - remains the gold standard


      – Many other use/reuse indicators for impact of individual research outputs
          • Focus on the impact of the *publication* itself, not the journal in which it appears
          • Indicators: no. full-text downloads, tweets (i.e. mentions on Twitter), social bookmarking
          • AltMetrics - a growing grassroots movement “ to better measure and reward all the different
            ways that people contribute to the messy and complex process of scientific progress [..] born out
            of a simple recognition: Many of the traditional measurements are too slow or simplistic to
            keep pace with today’s Internet-age science” http://altmetrics.org
      – Lots of new tools and projects emerging to explore possibilities in this space
          • e.g. http://total-impact.org




I #3: Identifying contributors – attributing credit

      – Why? So we can..
         ..link content creators with their works - attribute credit accurately
         ..figure out: who contributed to publication X?
                       which publications has person/organization Y contributed to?
      – What kind of contributions? Characterizing ‘contributorship’
         author, creator, analyst, reviewer, ‘conceived of study & designed experiment’ etc
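
The two questions above (who contributed to publication X, and which publications has Y contributed to) can be sketched as a small index over contribution records. This is an illustrative data structure only, not any particular system's model: the class names and the "person:..." and "work:..." identifiers are hypothetical placeholders, and the roles are taken from the list above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    contributor_id: str  # hypothetical persistent identifier for a person
    work_id: str         # hypothetical identifier for a work (e.g. a DOI)
    role: str            # e.g. "author", "creator", "analyst", "reviewer"


class ContributionIndex:
    """Index contributions in both directions so either question is cheap."""

    def __init__(self):
        self._by_work = defaultdict(set)
        self._by_contributor = defaultdict(set)

    def add(self, contribution):
        self._by_work[contribution.work_id].add(contribution)
        self._by_contributor[contribution.contributor_id].add(contribution)

    def contributors_to(self, work_id):
        """Who contributed to this work, and in what role?"""
        found = self._by_work.get(work_id, ())
        return sorted((c.contributor_id, c.role) for c in found)

    def works_by(self, contributor_id):
        """Which works has this contributor contributed to, and how?"""
        found = self._by_contributor.get(contributor_id, ())
        return sorted((c.work_id, c.role) for c in found)


# Placeholder records, invented purely for illustration.
index = ContributionIndex()
index.add(Contribution("person:Y", "work:X", "author"))
index.add(Contribution("person:Z", "work:X", "analyst"))
index.add(Contribution("person:Y", "work:W", "creator"))

if __name__ == "__main__":
    print(index.contributors_to("work:X"))
    print(index.works_by("person:Y"))
```

Both lookups only work reliably if each person and each work has one stable identifier, which is the argument of this part of the talk.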




Tackling the author name ambiguity problem (or ‘Who’s Who?’)

  Are these authors all the same person?
   G. Thorisson, University of Leicester
   G. A. Thorisson, University of Leicester
   G. A. Thorisson, Cold Spring Harbor Laboratory

  How about these? Or these?
   J. Smith
   J. Smith
   J. Smith
   J. Smith
   J. Smith
   [etc.]




           ∼2/3 of the ∼6 million authors in MEDLINE share a last name and
           first initial with at least one other author, and an ambiguous name
           refers to ∼8 persons on average.
           Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge
           Discovery from Data (2009) vol. 3 (3)
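
The "last name plus first initial" collision described above can be sketched in a few lines. This is a toy illustration, not the disambiguation method of the cited paper: the helper names are hypothetical, the records are the example names from this slide, and the affiliations of the J. Smith entries are unknown, so they are left as None.

```python
from collections import defaultdict


def name_key(author):
    """Reduce a name such as 'G. A. Thorisson' to ('thorisson', 'g')."""
    parts = author.replace(".", " ").split()
    return parts[-1].lower(), parts[0][0].lower()


def group_by_name_key(records):
    """Group (author, affiliation) records that share last name + first initial."""
    groups = defaultdict(list)
    for author, affiliation in records:
        groups[name_key(author)].append((author, affiliation))
    return dict(groups)


records = [
    ("G. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "Cold Spring Harbor Laboratory"),
    ("J. Smith", None),
    ("J. Smith", None),
]
groups = group_by_name_key(records)

if __name__ == "__main__":
    for key, members in groups.items():
        print(key, "->", len(members), "records, unknown number of people")
```

The name string alone cannot say whether the records in a group belong to one person or several, which is the gap a unique contributor identifier is meant to close.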

The Open Researcher & Contributor ID initiative

Launched end of 2009, ORCID will work to
support the creation of a permanent, clear
and unambiguous record of scholarly
communication by enabling reliable
attribution of authors and contributors
through unique identifiers




The Open Researcher & Contributor ID initiative

ORCID will add value for scholars and
the organizations that they are
interacting with, including universities,
scholarly societies, funding
organizations and publishers


   • Joins faculty or student body
   • Joins scholarly society
   • Applies for grant
   • Submits manuscript




ORCID transcends discipline, geographic, national and
institutional boundaries - now >300 participants




http://www.orcid.org
Some food for thought / recommendation kind of stuff to conclude
• Status of research data in Iceland is unclear → need research
     – Build on & extend 2007 Rannís report “Gagnagrunnar á Íslandi um náttúru, umhverfi og orku”

     Rannís, we're looking at you!

• Funders to take lead
     – Mandates (aka sticks) - require data management plan + budget in grant proposals
          • Many best practices & tools available to draw upon, e.g. by the UK Digital Curation Centre
     – Call for & fund research proposals to build infrastructural foundations & explore
       technologies/initiatives
     – Raise awareness in the local research community




Even more food for thought / recommendation kind of stuff

• Universities & other research institutions need to
     – Take research data seriously
     – Build infrastructure for data storage & preservation, support personnel (e.g. data
       officers / coordinators)
     – Include datasets and other non-conventional outputs in professional evaluations


• Identify & engage with key international initiatives in this space
     – ORCID, DataCite, Dryad, Open Knowledge Foundation, others
     – OpenAIREplus ← Solveig's talk coming up!




Final bite of food-for-thought




                  Let's make research data an integral part of the
                   OA mission in Iceland, NOT an afterthought




Acknowledgements

GEN2PHEN Consortium
   http://www.gen2phen.org/about-gen2phen/partners
Prof Anthony J. Brookes Bioinformatics Group, Leicester
ORCID - http://www.orcid.org

This work has received funding from the European Community's Seventh Framework
Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.

Contact me!
   <gthorisson@gmail.com>
   http://www.linkedin.com/in/mummi
   http://www.twitter.com/gthorisson
   http://www.gthorisson.name

Published under the Creative Commons BY license (http://creativecommons.org/licenses/by/3.0/)

Más contenido relacionado

La actualidad más candente

Beyond Preservation: Situating Archaeological Data in Professional Practice
Beyond Preservation: Situating Archaeological Data in Professional PracticeBeyond Preservation: Situating Archaeological Data in Professional Practice
Beyond Preservation: Situating Archaeological Data in Professional PracticeEric Kansa
 
Big Data in the Arts and Humanities
Big Data in the Arts and HumanitiesBig Data in the Arts and Humanities
Big Data in the Arts and HumanitiesAndrew Prescott
 
Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...librarianrafia
 
DataCite - services and support for opening up research data
DataCite - services and support for opening up research dataDataCite - services and support for opening up research data
DataCite - services and support for opening up research dataHerbert Gruttemeier
 
RDAP13 Lorrie Johnson: Facilitating Access to Scientific Data
RDAP13 Lorrie Johnson: Facilitating Access to Scientific DataRDAP13 Lorrie Johnson: Facilitating Access to Scientific Data
RDAP13 Lorrie Johnson: Facilitating Access to Scientific DataASIS&T
 
Open Access, Open Data. Open Research?
Open Access, Open Data. Open Research?Open Access, Open Data. Open Research?
Open Access, Open Data. Open Research?Cameron Neylon
 
SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...
 SEAD Virtual Archive: Building a Federation of Institutional Repositories fo... SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...
SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...skonkiel
 
BeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN sessionBeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN sessionNick Jones
 
Managing and Sharing Research Data
Managing and Sharing Research DataManaging and Sharing Research Data
Managing and Sharing Research DataMartin Donnelly
 
Building a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceBuilding a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceRobert H. McDonald
 
Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...Martin Donnelly
 
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital TextsCase Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital TextsBeth Plale
 
Building the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of ScientistsBuilding the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of ScientistsCarole Goble
 
The Future of Open Science
The Future of Open ScienceThe Future of Open Science
The Future of Open SciencePhilip Bourne
 
Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...
Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...
Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...i_scienceEU
 

La actualidad más candente (18)

Beyond Preservation: Situating Archaeological Data in Professional Practice
Beyond Preservation: Situating Archaeological Data in Professional PracticeBeyond Preservation: Situating Archaeological Data in Professional Practice
Beyond Preservation: Situating Archaeological Data in Professional Practice
 
Cornell 2011 05-13
Cornell 2011 05-13Cornell 2011 05-13
Cornell 2011 05-13
 
Big Data in the Arts and Humanities
Big Data in the Arts and HumanitiesBig Data in the Arts and Humanities
Big Data in the Arts and Humanities
 
Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...
 
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
 
DataCite - services and support for opening up research data
DataCite - services and support for opening up research dataDataCite - services and support for opening up research data
DataCite - services and support for opening up research data
 
RDAP13 Lorrie Johnson: Facilitating Access to Scientific Data
RDAP13 Lorrie Johnson: Facilitating Access to Scientific DataRDAP13 Lorrie Johnson: Facilitating Access to Scientific Data
RDAP13 Lorrie Johnson: Facilitating Access to Scientific Data
 
Open Access, Open Data. Open Research?
Open Access, Open Data. Open Research?Open Access, Open Data. Open Research?
Open Access, Open Data. Open Research?
 
SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...
 SEAD Virtual Archive: Building a Federation of Institutional Repositories fo... SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...
SEAD Virtual Archive: Building a Federation of Institutional Repositories fo...
 
BeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN sessionBeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN session
 
Managing and Sharing Research Data
Managing and Sharing Research DataManaging and Sharing Research Data
Managing and Sharing Research Data
 
Data Publishing in Archaeozoology
Data Publishing in ArchaeozoologyData Publishing in Archaeozoology
Data Publishing in Archaeozoology
 
Building a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceBuilding a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability Science
 
Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...Managing and Sharing Research Data: Good practices for an ideal world...in th...
Managing and Sharing Research Data: Good practices for an ideal world...in th...
 
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital TextsCase Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
 
Building the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of ScientistsBuilding the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of Scientists
 
The Future of Open Science
The Future of Open ScienceThe Future of Open Science
The Future of Open Science
 
Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...
Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...
Chris Marsden, University of Essex (Plenary): Regulation, Standards, Governan...
 

RDFC2012 Open Access to Research Data

  • 1. Open access to scientific research data Gudmundur A. Thorisson, PhD <gt50@leicester.ac.uk> Research associate, University of Leicester Guest scientist, University of Iceland Participant in the GEN2PHEN Consortium and the ORCID Technical Working Group This work is published under the Creative Commons Attribution license (CC BY: http://creativecommons.org/licenses/by/3.0/) which means that it can be freely copied, redistributed and adapted, as long as proper attribution is given.
  • 2. Overview ๏ Intro to the world of Big Science & Big Data • Why is inadequate access to data such a problem? ๏ Incentive-based approaches to tackling the sharing problem: identification, identification, identification ๏ Key relevant developments internationally ๏ Some food for thought for funders, institutions, other key players ๏ Concluding remarks RDFC2012 Conference on Open Access and Digital Rights, Reykjavik, March 29th 2012
  • 3. Big Science, Big Data • Scientific research increasingly large-scale and data-driven • High-profile discipline examples – High-energy particle physics - experiments performed in the Large Hadron Collider – Astronomy - data from ground-based and space telescopes, the Virtual Observatory (VO) • Doctorow, C. Big data: Welcome to the petacentre. Nature 455, 16-21 (2008). http://dx.doi.org/10.1038/455016a
  • 4. Hypothesis generation guided by available data Kell and Oliver. Bioessays (2004) vol. 26 (1) • Science paradigms – 1st: Empirical - describing natural phenomena – 2nd: Theoretical - models, generalizations – 3rd: Computational - simulating complex phenomena – 4th (1+2+3): Data exploration, e-Science Gray, J. 2009. The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research
  • 5. Biological research too is increasingly big and data-driven • From: small-scale datasets that fit into a printed journal article Richards, M. et al. Paleolithic and neolithic lineages in the European mitochondrial gene pool. American journal of human genetics 59, 185-203 (1996). http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1915109/
  • 6. Biological research too is increasingly big and data-driven • To: large-scale collection of biological data in digital form • Huge technological advances in last 5-10 years – experimental / observations <-- gathering data with high-throughput equipment – computer technology <-- storing & analyzing massive data volumes • Example: massively-parallel sequencing – Determine human genome sequence in <1 day - the $1000 genome – Metagenomics: sequence *everything* in environment samples – Large bio-specimen collections • 100,000s of individuals in disease/population biobanks
  • 7. Examples: domain repositories for sequence data • GenBank - genetic sequence repository, established 1986 • UniProt - knowledge base for protein sequence & function
  • 8. “Community resource projects” - large-scale data generation for the purpose of making the data available for broad reuse • The sequence of the human genome – International Human Genome Project - mandatory rapid data sharing, the Bermuda principles • Pattern of variation in the human genome – International Haplotype Map Project - genotyping population samples – 1000 Genomes Project - sequencing population samples
  • 9. Big Data – challenges, opportunities • Managing & making sense of large-scale datasets – Data easy/cheap to generate - not so cheap to store & use – Favorite quote: “the $1000 genome sequence, followed by the ++$10,000 analysis” • Integration & analysis - combining datasets – more data of the same type - e.g. combine sequences from multiple species – related data of different type - e.g. a person’s genome sequence + his/her phenotype • Potential for accelerating research, creating new knowledge and (in biomedicine) improving human health. • Key driver = unrestricted sharing of scientific data deposited in the public domain
  • 10. Data = “fuel” of science Smith, V. Data publication: towards a database of everything. BMC Res Notes (2009) vol. 2 (1)
  • 13. Data = “fuel” of science [..] If digital technologies are the engine of this revolution, digital data are its fuel. But for many scientific disciplines, this fuel is in short supply. [..] Smith, V. Data publication: towards a database of everything. BMC Res Notes (2009) vol. 2 (1)
  • 14. Biology and data sharing in the “long tail” • Biology is complex, so data are often very heterogeneous • Technologies changing rapidly • Lots of small-scale research projects • Lots of small/medium datasets The ‘long tail’ of dark bio-data • Data in the long tail usually *not* shared OR not shared in a useful way • Contrast with other data-intensive disciplines with – a long history of sharing research data - a “culture of sharing” – big, expensive, shared facilities = the only way to do this kind of research – relatively homogeneous datasets, easier to scale up to big volumes (e.g. telescope images)
  • 15. […] Overall, only 47 papers (9%) deposited full primary raw data online. None of the 149 papers not subject to data availability policies made their full primary data publicly available. Conclusion: A substantial proportion of original research papers published in high-impact journals are either not subject to any data availability policies, or do not adhere to the data availability instructions in their respective journals. This empiric evaluation highlights opportunities for improvement
  • 16. DATA → (analysed, synthesised, interpreted) → INFORMATION → (published) → KNOWLEDGE • Publication: lots of published knowledge, but hard/impossible to go back and reproduce work & validate findings • Opportunity for maximising the value of data through reuse is wasted
  • 18. Lots and lots of diverse reasons!! Some quotes from researchers: “Don't know which digital repository I should upload to” “Too much work, got better things to do!” “My competitors will just take the data and ‘scoop’ me” “It's my data, I collected them and no one else is entitled to use them” “[myriad other reasons]” Worryingly, many authors don't seem to care whether evidence underpinning their published findings is accessible or not Koslow. Should the neuroscience community make a paradigm shift to sharing primary data? Nat Neurosci (2000) vol. 3 (9). http://dx.doi.org/10.1038/78760
  • 19. Gnarly issue #1: “ownership” vs “stewardship” • Many researchers consider data their property, even if research funded by public money – e.g. want to do further analysis on data in future, publish more papers • ..which conflicts with interests of other stakeholders in the game (e.g. funders, universities) who want: – to maximize return on investment in the funded research – to ensure good, solid evidence-based science is done, etc.
  • 20. Gnarly issue #2 – biomedical data • Usually sensitive, cannot be shared without restrictions – Detailed, reidentifiable biomedical data that cannot be fully anonymized – Personal privacy considerations • Specialized controlled-access archives deal with some of this – NCBI's database of Genotypes and Phenotypes – dbGaP – European Genome-phenome Archive – EGA – [specific diseases / disorders, research consortia, others]
  • 21. How to Make a Tackle in Rugby Tackling in rugby is one of the most important aspects of the game. [...] Credit: http://djamba.com/how-to-make-a-tackle-in-rugby.html
  • 22. ...which are an imperfect solution • Arguments that mandates by themselves are not the way • Mandates likely to ensure only minimum compliance – sharing would be done in minimally useful form (as in, whatever is the least effort) …. and are meaningless if not enforced (currently the case with many journals)
  • 23. Sharing now tends to be driven by mandates... * Journals increasingly require data to be made available “Provide supporting data in a repository OR we won’t publish your paper” * Funders increasingly require data sharing plan & budget baked into grant proposals. “Publish data we are funding you to generate OR we will not fund your research again” Using just a stick gets you only so far
  • 24. Strategies focused on encouraging sharing • Make it easy • Make it useful • Make it citable
  • 25. Treating data as citable publications in their own right • Core strategy: enable data to be treated as 1st class citizens of the scholarly record which: i) are indexed and can be discovered, located and accessed, and ii) can be properly identified & cited unambiguously like other scholarly works • Link datasets with the primary journal publication - citation crosslinks • Give data creators/curators/analysts proper credit for their contribution to the digital resource • Focus on the benefits to researchers from publishing their data – Data sharing → Data PUBLICATION + CITATION – Others reuse & cite their stuff → more citations → more impact – The more useful a dataset, the more likely to be used & cited
  • 26. Exemplar – Data Dryad “international repository of data underlying peer-reviewed articles in the basic and applied biosciences” http://datadryad.org • Combines – Mandates (journal policy) and – Citable data publication • Citation cross-linking – Paper references dataset – Dataset references paper
  • 27. Key building blocks: the 3 I’s of identification
  • 28. 1 I Identifying scholarly publications (and other research outputs) • Why? So it is possible to.. ..cite the work unambiguously (‘..we used the method described in Thorisson et al (2009)’) ..locate the work (retrieve Nature article as PDF from journal website) ..give credit to persons/entities who contributed to the work (G. Thorisson authored paper X) • Need for globally unique, persistent identifiers to combat unstable Web URLs, broken hyperlinks • e.g. Digital Object Identifiers (DOIs) for pubs, datasets and more: – Bell et al. 2009. Science 323(5919) doi:10.1371/journal.pone.0024357 – Goodwillie C et al (2005) Data from: The evolutionary enigma of mixed mating systems in plants: occurrence, theoretical explanations, and empirical evidence. Dryad Digital Repository. doi:10.5061/dryad.292q34fp
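To illustrate how a persistent identifier stays usable even when publisher URLs change, here is a minimal sketch that turns the DOI strings quoted on this slide into resolver links. It assumes the public `https://doi.org/` resolver; the function name `doi_to_url` and the simple sanity check are illustrative choices, not part of the talk or of any official DOI library.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver. Older links of the form
# http://dx.doi.org/... (as used elsewhere in these slides) point to the same service.
DOI_RESOLVER = "https://doi.org/"

# Prefixes that commonly precede a DOI in citations (all lowercase).
KNOWN_PREFIXES = ("doi:", "http://dx.doi.org/", "https://doi.org/")


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI written as 'doi:10.x/y', a bare DOI, or an old link."""
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    # Hypothetical sanity check: DOIs start with '10.' and have a '/' before the suffix.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the '/' separators readable.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("doi:10.5061/dryad.292q34fp"))
    print(doi_to_url("http://dx.doi.org/10.1038/78760"))
```

The point of the design is that a citation stores only the identifier, while the location is looked up at access time, so a dataset or paper can move without breaking the reference.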
  • 29. 2 I Identifying use/reuse - measuring impact – Historical reliance on formal citations and citation-based metrics – ISI Impact Factor widely used, but really a metric for the influence of a scholarly journal – Citation analysis not going away - remains the gold standard – Many other use/reuse indicators for impact of individual research outputs • Focus on the impact of the *publication* itself, not the journal in which it appears • Indicators: no. of full-text downloads, tweets (i.e. mentions on Twitter), social bookmarking • AltMetrics - a growing grassroots movement “to better measure and reward all the different ways that people contribute to the messy and complex process of scientific progress [..] born out of a simple recognition: Many of the traditional measurements are too slow or simplistic to keep pace with today’s Internet-age science” http://altmetrics.org – Lots of new tools and projects emerging to explore possibilities in this space • e.g. http://total-impact.org
  • 30. 3 I Identifying contributors – attributing credit – Why? So we can.. ..link content creators with their works - attribute credit accurately ..figure out: who contributed to publication X? which publications has person/organization Y contributed to? – What kind of contributions? Characterizing ‘contributorship’ author, creator, analyst, reviewer, ‘conceived of study & designed experiment’ etc
  • 31. Tackling the author name ambiguity problem (or ‘Who’s Who?’) Are these authors all the same person? J. Smith, J. Smith, J. Smith, J. Smith, J. Smith [etc.] How about these? Or these? G. Thorisson, University of Leicester; G. A. Thorisson, University of Leicester; G. A. Thorisson, Cold Spring Harbor Laboratory ∼2/3 of the ∼6 million authors in MEDLINE share a last name and first initial with at least one other author, and an ambiguous name refers to ∼8 persons on average. Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data (2009) vol. 3 (3)
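The "last name plus first initial" collision described on this slide can be sketched in a few lines. This is only an illustration of why name strings are poor identifiers: the helper names `ambiguity_key` and `group_by_key` are hypothetical, the name parsing is deliberately naive (it assumes "given names first, surname last"), and it is not the method used by Torvik and Smalheiser.

```python
from collections import defaultdict


def ambiguity_key(name: str) -> tuple[str, str]:
    """Reduce an author name to (surname, first initial), lowercased."""
    parts = name.replace(".", " ").split()
    if not parts:
        raise ValueError("empty author name")
    surname = parts[-1].lower()   # naive assumption: surname is the last token
    initial = parts[0][0].lower() # first letter of the first given name
    return (surname, initial)


def group_by_key(names: list[str]) -> dict[tuple[str, str], set[str]]:
    """Group name strings that cannot be told apart by surname + first initial."""
    groups: dict[tuple[str, str], set[str]] = defaultdict(set)
    for name in names:
        groups[ambiguity_key(name)].add(name)
    return dict(groups)


if __name__ == "__main__":
    authors = ["J. Smith", "John Smith", "Jane Smith", "G. Thorisson", "G. A. Thorisson"]
    for key, variants in group_by_key(authors).items():
        print(key, sorted(variants))
```

Note that the grouping fails in both directions: different people (John and Jane Smith) collapse into one key, while the key alone cannot confirm that the two Thorisson variants are the same person. That is the gap a unique contributor identifier is meant to close.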
  • 32. The Open Researcher & Contributor ID initiative Launched end of 2009, ORCID will work to support the creation of a permanent, clear and unambiguous record of scholarly communication by enabling reliable attribution of authors and contributors through unique identifiers
  • 33. The Open Researcher & Contributor ID initiative ORCID will add value for scholars and the organizations that they are interacting with, including universities, scholarly societies, funding organizations and publishers • Joins faculty or student body • Joins scholarly society • Applies for grant • Submits manuscript
  • 34. ORCID transcends discipline, geographic, national and institutional boundaries - now >300 participants http://www.orcid.org
  • 35. Some food for thought / recommendation kind of stuff to conclude • Status of research data in Iceland is unclear → need research – Build on & extend 2007 Rannís report “Gagnagrunnar á Íslandi um náttúru, umhverfi og orku” Rannís, we're looking at you! • Funders to take lead – Mandates (aka sticks) - require data management plan + budget in grant proposals • Many best practices & tools available to draw upon, e.g. by the UK Digital Curation Centre – Call for & fund research proposals to build infrastructural foundations & explore technologies/initiatives – Raise awareness in the local research community
  • 36. Even more food for thought / recommendation kind of stuff • Universities & other research institutions need to – Take research data seriously – Build infrastructure for data storage & preservation, support personnel (e.g. data officers / coordinators) – Include datasets and other non-conventional outputs in professional evaluations • Identify & engage with key international initiatives in this space – ORCID, DataCite, Dryad, Open Knowledge Foundation, others – OpenAIREplus ← Solveig's talk coming up!
  • 37. Final bite of food-for-thought Let's make research data an integral part of the OA mission in Iceland, NOT an afterthought
  • 38. Acknowledgements • GEN2PHEN Consortium http://www.gen2phen.org/about-gen2phen/partners • Prof Anthony J. Brookes Bioinformatics Group, Leicester • ORCID - http://www.orcid.org This work has received funding by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project. Contact me! <gthorisson@gmail.com> http://www.linkedin.com/in/mummi http://www.twitter.com/gthorisson http://www.gthorisson.name Published under the Creative Commons BY license (http://creativecommons.org/licenses/by/3.0/)