Identity in research data publication - meeting with SageCite people, March 2011
1. G. A. Thorisson, University of Leicester
Identity in research data publication
Gudmundur ‘Mummi’ Thorisson
<gt50@le.ac.uk>
Brookes lab
Identity Workshop prep-meeting, Helsinki, January 27 2011 1
Tuesday, 22 March 2011
6. G. A. Thorisson, University of Leicester
Non-unique names are a major problem in the scholarly literature
Are these authors all the same person?
G. Thorisson, University of Leicester
G. A. Thorisson, University of Leicester
G. A. Thorisson, Cold Spring Harbor Laboratory
Or these? How about these?
J. Smith
J. Smith
J. Smith
J. Smith
J. Smith
[etc.]
∼2/3 of the ∼6 million authors in MEDLINE share a last name and first
initial with at least one other author, and an ambiguous name refers to
∼8 persons on average.
Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge
Discovery from Data (2009) vol. 3 (3)
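The collision described above (authors sharing a last name and first initial) can be illustrated with a minimal sketch. This is not the disambiguation method of Torvik and Smalheiser; it only shows how byline strings collapse onto the same key. The function names `name_key` and `ambiguous_groups` are hypothetical, and the sample records reuse names and affiliations shown on these slides.

```python
from collections import defaultdict


def name_key(author: str) -> tuple[str, str]:
    """Reduce a byline such as 'G. A. Thorisson' to (last name, first initial)."""
    parts = author.replace(".", " ").split()
    return parts[-1].lower(), parts[0][0].lower()


def ambiguous_groups(records):
    """Group (name, affiliation) records by key; keep keys with more than one record."""
    groups = defaultdict(list)
    for name, affiliation in records:
        groups[name_key(name)].append((name, affiliation))
    return {key: members for key, members in groups.items() if len(members) > 1}


# Illustrative records only; a real index would hold millions of bylines.
records = [
    ("G. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "Cold Spring Harbor Laboratory"),
    ("J. Smith", "Univ. North Pole"),
    ("J. Smith", "Luthor Corporation"),
]

groups = ambiguous_groups(records)
for key, members in groups.items():
    print(key, len(members))
```

The key alone cannot say whether the records in a group are one person or several, which is the gap a unique contributor identifier is meant to close.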
17. G. A. Thorisson, University of Leicester
Geoffrey Bilder
Director of Strategic Initiatives
19. G. A. Thorisson, University of Leicester
• Contributor recognition - attribute published works to the
person(s) who contributed to them
23. G. A. Thorisson, University of Leicester
Unique identifiers for authors / contributors
automated author disambiguation
+
author involvement
Dec’09: launch of the Open Researcher
Contributor Identification Initiative - ORCID
24. G. A. Thorisson, University of Leicester
?
ORCID
ORCID ID: B-1242-2010
G. Thorisson, Univ. Leicester
G. A. Thorisson, Univ. Leicester F67572010
G. A. Thorisson, Cold Spring Harbor Lab.
ORCID ID: G-1442-2009
J. Smith, Univ. North Pole
ORCID ID: D-2400-2010
J. Smith, Luthor Corporation
Centrally-managed informatics infrastructure:
i) for researchers to manage & use their profile
ii) for tracking author-to-publication attribution links
iii) for interaction with other systems (e.g. publishers, digital libraries)
25. G. A. Thorisson, University of Leicester
Manuscript submission to journal
Attribution: ORCID ID for author <--> DOI for article
26. G. A. Thorisson, University of Leicester
Why publishers want this
– Single sign-on (SSO) for manuscript tracking systems
– Disambiguating contact information for use by editorial offices, royalty
payments systems, copyright clearances, etc.
– Automatic updating of email addresses for table of contents (TOC) alerts and
other automated email communications
– Automated tools for detecting potential reviewers, including tools for
detecting potential conflicts of interest
– Synchronization with publisher web site user profiles and granting researchers
customized, privileged access to content based on profiles
– Understanding all of the manifold ways in which an individual “contributes” to
a publisher or a field (e.g. as an editor, reviewer, letter writer, conference
chair, etc.).
– Etc.
27. G. A. Thorisson, University of Leicester
Why researchers will want this
– Verified publication record
– Streamlined manuscript submission
– Attribution for non-traditional scholarly output
– Etc.
29. G. A. Thorisson, University of Leicester www.gen2phen.org
>150 Organisations
32. G. A. Thorisson, University of Leicester
Research data as
scholarly output
• Provenance - I trust dataset X generated by a certain J. Smith
• Contributor recognition - publication credit for sharing data
• Access management - control access to sensitive research data
33. G. A. Thorisson, University of Leicester
Need to IDENTIFY people as they contribute to & access Internet resources
• The basic identity problem the Internet poses is establishing one party’s identity to another party’s satisfaction through communication across the network.
Weitzner. In Search of Manageable Identity Systems. IEEE
Internet Computing (2006) vol. 10 (6) pp. 84-86
Lab meeting Fri 11 Feb 2010 16
35. G. A. Thorisson www.gen2phen.org
Data-related applications for an online
digital identity (a.k.a. ‘researcher IDs’)
• Access management
– Controlling access to non-public resources on the Web
• Analytical resources - incl. high-performance computing clusters
• Potentially identifiable biomedical data
• Contribution tracking
– Data submissions to central repositories
– Data curation / micro-attribution
– Bio-resource impact factor + nanopublications
3rd Human Variome Project Meeting, Paris, 10-14 May, 2010 17
37. G. A. Thorisson, University of Leicester
The data sharing problem
[Diagram labels: Data, Information, Knowledge, Publication; analysed, synthesised, interpreted, published]
From http://www.nature.com/news/specials/datasharing/
39. G. A. Thorisson, University of Leicester
Publishing a journal article
Publishing a dataset
41. G. A. Thorisson, University of Leicester
Outcome 3/3
DOI <-> ORCID ID
• Thorisson, G. (A-883-2010), Bilder, G.W. (C-035-2009) and Fenner, M. (A-101-2010).
Icelandic 9th century viking bowl. Psychoceramics Archive. Sep 2 2010.
doi:10.4259/psycho.5gtpq-thorisson
• A-883-2010 <created> 10.4259/psycho.5gtpq-thorisson
• C-035-2009 <created> 10.4259/psycho.5gtpq-thorisson
• A-101-2010 <created> 10.4259/psycho.5gtpq-thorisson
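The three attribution statements above have a simple subject, relation, object shape. A minimal sketch of reading them into structured records follows; the `Attribution` type and `parse_link` helper are hypothetical names used for illustration and are not part of any ORCID or DOI service.

```python
from typing import NamedTuple


class Attribution(NamedTuple):
    """One person-to-work link: who, in what role, for which DOI."""
    orcid_id: str
    relation: str
    doi: str


def parse_link(line: str) -> Attribution:
    """Parse a line of the form 'A-883-2010 <created> 10.4259/...'."""
    orcid_id, relation, doi = line.split()
    return Attribution(orcid_id, relation.strip("<>"), doi)


# The statements exactly as they appear on the slide.
lines = [
    "A-883-2010 <created> 10.4259/psycho.5gtpq-thorisson",
    "C-035-2009 <created> 10.4259/psycho.5gtpq-thorisson",
    "A-101-2010 <created> 10.4259/psycho.5gtpq-thorisson",
]

links = [parse_link(line) for line in lines]
for link in links:
    print(link.orcid_id, link.relation, link.doi)
```

Keeping the relation as its own field leaves room for roles other than `created`, such as curation, without changing the record shape.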
42. G. A. Thorisson, University of Leicester
Cafe RouGE
1. Diagnostic laboratories
2. Central mutation depot
3. End-users (e.g. LSDB curators)
Publish data
Retrieve RSS feeds
• Digital IDs for security / access management
• Attribution for published data, via digital IDs
43. G. A. Thorisson, University of Leicester
• Who contributed to dataset 10.4259/psycho.5gtpq-thorisson?
• All data publications by A-883-2010?
• Which papers have cited the works of A-883-2010?
• Total no. citations to datasets by A-883-2010 in the last 2 years?
• Total no. downloads of datasets by A-883-2010?
• [....]
G. Thorisson, Univ. Leicester
gthorisson@gmail.com
ORCID ID: A-883-2010
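The first two questions above can be answered from person-to-DOI links alone. The sketch below assumes such links are available as plain pairs and indexes them in both directions; `LINKS`, `contributors` and `publications` are hypothetical names. Citation and download counts would need data from other systems and are not modelled here.

```python
from collections import defaultdict

# (ORCID ID, DOI) pairs, taken from the example citation in this deck.
LINKS = [
    ("A-883-2010", "10.4259/psycho.5gtpq-thorisson"),
    ("C-035-2009", "10.4259/psycho.5gtpq-thorisson"),
    ("A-101-2010", "10.4259/psycho.5gtpq-thorisson"),
]

# Build one index per direction so both questions are a single lookup.
by_doi = defaultdict(set)
by_person = defaultdict(set)
for orcid_id, doi in LINKS:
    by_doi[doi].add(orcid_id)
    by_person[orcid_id].add(doi)


def contributors(doi: str) -> list[str]:
    """Who contributed to this dataset?"""
    return sorted(by_doi.get(doi, ()))


def publications(orcid_id: str) -> list[str]:
    """All data publications linked to this person."""
    return sorted(by_person.get(orcid_id, ()))


print(contributors("10.4259/psycho.5gtpq-thorisson"))
print(publications("A-883-2010"))
```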
48. G. A. Thorisson, University of Leicester
A digital identity for
researchers centred on
scholarly profile?
ORCID ID: B-1242-2010
G. Thorisson, Univ. Leicester
G. A. Thorisson, Univ. Leicester
G. A. Thorisson, Cold Spring Harbor Lab.
http://mummi.myopenid.com
49. G. A. Thorisson, University of Leicester
Coming autumn 2011, to a venue near you!
Int’l workshop on researcher identity
• Co-organized by CSC (Finland IT Centre for Science)
• Provisional title: “Identity in research infrastructure and
scientific communication” - IRISC
• Location: Helsinki
• Time: September 12-13
50. G. A. Thorisson, University of Leicester
Acknowledgements
This work has received funding from the European Community's Seventh
Framework Programme (FP7/2007-2013) under grant agreement number
200754 - the GEN2PHEN project.
GEN2PHEN Consortium
http://www.gen2phen.org/about-gen2phen/partners
Anthony J. Brookes Bioinformatics Group
Contact me!
Gudmundur A. Thorisson
<gt50@le.ac.uk>
http://friendfeed.com/mummi
http://www.linkedin.com/in/mummi