TNC2012 Federated and scholarly identity - match made in heaven?
1. Federated identity and scholarly
identity - a match made in heaven?
Gudmundur A. Thorisson, PhD <gt50@leicester.ac.uk>
Research associate, University of Leicester
Guest scientist, University of Iceland
Participant in the GEN2PHEN Consortium and the ORCID Technical Working Group
This work is published under the Creative Commons Attribution license (CC BY:
http://creativecommons.org/licenses/by/3.0/) which means that it can be
freely copied, redistributed and adapted, as long as proper attribution is given.
TNC2012 TERENA Networking Conference, Reykjavik, May 21-24 2012
2. Overview
๏ Crash course in scholarly identity
๏ Some problems: name ambiguity and online identity fragmentation
๏ The Open Researcher & Contributor ID initiative - ORCID
background, current status and roadmap
๏ Applications of ORCID in the scholarly identity landscape
๏ Some thoughts on collaboration between ORCID and identity feds
3. Scholarly identity
‣ A scientific researcher’s publication record
- Defined by “authorship” of mostly “traditional” kinds of works
- Articles in peer-reviewed journals, books, conf. proceedings
‣ The “publish or perish” culture in scientific research
- Authorship of papers in top-tier, high-impact journals is the single
biggest factor in career advancement
- Not enough high-profile papers? no grant, no tenure, etc.
‣ At the heart of peer recognition / professional reputation
5. Digital scholarship
in the 21st century
‣ Creation of online digital research outputs increasingly
common & important part of doing scientific work
- Research datasets deposited in online repositories
- Data curation - adding value to research data
- Scientific software
- Research blogging
- Contributions to scientific articles in Wikipedia
- [and so on]
6. Big Science, Big Data
• Scientific research increasingly large-scale and data-driven
• High-profile examples
– High-energy particle physics - experiments
performed in the Large Hadron Collider
– Astronomy - data from ground-based and space
telescopes, the Virtual Observatory (VO)
• Doctorow, C. Big data: Welcome to the petacentre. Nature
455, 16-21 (2008). http://dx.doi.org/10.1038/455016a
7. Biological research too is
increasingly “Big” and data-driven
‣ From: small-scale datasets that
fit into a printed journal article
Richards, M. et al. Paleolithic and neolithic lineages in the European mitochondrial gene pool. American
journal of human genetics 59, 185-203 (1996). http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1915109/
8. Biological research too is
increasingly “Big” and data-driven
‣ To: large-scale collection of
biological data in digital form
‣ Huge technological advances in last 5-10 years
experimental / observations <-- gathering data with high-throughput equipment
computer technology <-- storing & analyzing massive data volumes
‣ Example: massively-parallel sequencing
Determine human genome sequence in <1 day - the $1000 genome
Metagenomics: sequence *everything* in environment samples
Large bio-specimen collections
100,000s of individuals in disease/population biobanks
9. http://www.gen2phen.org
Prof Anthony J Brookes
GEN2PHEN coordinator
Chair, Bioinformatics and Genomics
Department of Genetics
University of Leicester, UK
10. Identifying contributors
‣ Why? So we can..
- Attribution - link content creators with their works and attribute credit
appropriately
- Discovery - who contributed to publication X?
which publications has person/organization Y contributed to?
‣ What kind of contributions?
- Characterizing ‘contributorship‘:
role: author, creator, analyst, reviewer
contribution: ‘conceived of study & designed experiment’,
‘wrote paper’, ‘performed experiments’
‣ LHC example: ~2000 ‘authors’ and ~170 institutions
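The attribution and discovery questions above can be sketched as a small data model. This is only an illustration: the `Contribution` record, its field names and the example identifiers are all hypothetical and are not taken from ORCID or any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    """One contributor's part in one work (all field names are hypothetical)."""
    work_id: str          # identifier of the work, e.g. a DOI
    contributor_id: str   # unique person-level identifier
    role: str             # e.g. "author", "creator", "analyst", "reviewer"
    description: str      # e.g. "wrote paper", "performed experiments"


def contributors_to(work_id, records):
    """Discovery: who contributed to a given work, and in what role?"""
    return [(r.contributor_id, r.role) for r in records if r.work_id == work_id]


def works_by(contributor_id, records):
    """Discovery: which works has a given contributor contributed to?"""
    return sorted({r.work_id for r in records if r.contributor_id == contributor_id})


# Made-up example records
RECORDS = [
    Contribution("work-1", "person-A", "author", "wrote paper"),
    Contribution("work-1", "person-B", "analyst", "performed experiments"),
    Contribution("work-2", "person-A", "reviewer", "reviewed manuscript"),
]
```

Both lookups only work reliably if `contributor_id` is unique per person, which is the gap a person-level identifier is meant to fill.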
12. Problem #1: name ambiguity
Are these authors all the same person?
G. Thorisson, University of Leicester
How about these?
G. A. Thorisson, University of Leicester
G. A. Thorisson, Cold Spring Harbor Laboratory
Or these?
J. Smith
J. Smith
J. Smith
J. Smith
J. Smith
[...]
[..] ∼2/3 of the ∼6 million authors in MEDLINE share a last name and first initial with at
least one other author, and an ambiguous name refers to ∼8 persons on average.
Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge
Discovery from Data (2009) vol. 3 (3)
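A rough sketch of why "last name + first initial" is ambiguous: reducing full names to that form makes distinct people collide on the same key. The author ids and names below are made up, and the function names are hypothetical.

```python
from collections import defaultdict


def name_key(full_name):
    """Reduce 'First [Middle] Last' to an upper-case 'LAST F' key."""
    parts = full_name.split()
    return f"{parts[-1]} {parts[0][0]}".upper()


def ambiguous_keys(authors):
    """Map each name key shared by two or more distinct author ids to those ids.

    `authors` is an iterable of (author_id, full_name) pairs.
    """
    groups = defaultdict(set)
    for author_id, full_name in authors:
        groups[name_key(full_name)].add(author_id)
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


# Made-up example data: three different people who all become "SMITH J"
AUTHORS = [
    ("id-1", "John Smith"),
    ("id-2", "Jane Smith"),
    ("id-3", "Jack Smith"),
    ("id-4", "Gudmundur Thorisson"),
]
```

With a unique identifier attached to each record, the grouping step is unnecessary: the ids already tell the three J. Smiths apart.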
14. Problem #1: name ambiguity
‣ Number of authors and other
scholarly contributors is
increasing
‣ Number & kinds of “works” they
contribute to is increasing
‣ The scholarly record is broken
‣ Reliable attribution of authors and contributors is
impossible without unique person-level identifiers
15. Problem #2: digital identity crisis
‣ Session title: Scientific Schizophrenia - How many identities do
YOU have?
‣ Well, I have several! <-- identity crisis??
- 2x Universities I’m affiliated with
- Several scholarly/professional profile services
- LinkedIn professional profile / CV
- Twitter microblogging (for professional purposes)
- Several other author profiles that are not under my control
(Web of Science, Scopus, others)
‣ Identity fragmentation - big, big mess!!!!
16. How to Make a Tackle in Rugby
Tackling in rugby is one of the most important aspects of the game.[...]
Credit: http://djamba.com/how-to-make-a-tackle-in-rugby.html
17. The Open Researcher &
Contributor ID initiative
‣ ORCID is an international, interdisciplinary
organization involving multiple stakeholders:
- Research institutions, libraries, funding organizations,
publishers, intermediaries and individual researchers
‣ Started in late 2009 to solve the name ambiguity
problem in scholarly communication.
‣ Incorporated as a non-profit with a Board of
Directors in August 2010.
18. The Open Researcher &
Contributor ID initiative
ORCID will work to support the creation of
a permanent, clear and unambiguous
record of scholarly communication by
enabling reliable attribution of authors and
contributors through unique identifiers
19. ORCID Participants
ORCID has 328 participant organizations from across the
world, 50 of which have provided sponsorship funding.
20. Some knowledge discovery use cases
Given a work, tell me who is responsible for it and
describe the nature of that responsibility.
Credit: Geoff Bilder http://irisc-workshop.org/wp-content/uploads/2011/09/irisc2011-geoff-bilder.ppt
22. Some knowledge discovery use cases
Given a contributor, tell me what works he/she has
contributed to and describe the nature of the contributions.
Credit: Geoff Bilder http://irisc-workshop.org/wp-content/uploads/2011/09/irisc2011-geoff-bilder.ppt
23. Some knowledge discovery use cases
Given a contributor, tell me which other contributors are
“related” to the first one and tell me the nature of that
relationship.
Credit: Geoff Bilder http://irisc-workshop.org/wp-content/uploads/2011/09/irisc2011-geoff-bilder.ppt
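The last use case can be sketched as a lookup over (work, contributor) pairs. In this sketch "related" means only "contributed to the same work"; that definition, the function name and the example identifiers are assumptions made for illustration.

```python
from collections import defaultdict


def related_contributors(contributor_id, contributions):
    """Return {other_contributor: sorted list of works shared with them}.

    `contributions` is an iterable of (work_id, contributor_id) pairs.
    """
    # Index contributors by work
    by_work = defaultdict(set)
    for work_id, person in contributions:
        by_work[work_id].add(person)

    # Collect everyone who shares at least one work with the given contributor
    shared = defaultdict(set)
    for work_id, people in by_work.items():
        if contributor_id in people:
            for other in people - {contributor_id}:
                shared[other].add(work_id)
    return {other: sorted(works) for other, works in shared.items()}


# Made-up example data
PAIRS = [
    ("work-1", "person-A"), ("work-1", "person-B"),
    ("work-2", "person-A"), ("work-2", "person-B"), ("work-2", "person-C"),
    ("work-3", "person-D"),
]
```

The list of shared works is one simple way to describe the nature of the relationship; richer descriptions would need role information per contribution.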
24. WHO CARES!?
‣ Publishers who publish researchers’ work
- Accurate author info, dealing with coauthors, generally managing the
peer-review & publishing process
‣ Institutions that employ researchers
- Evaluating performance of research staff, tenure decisions
‣ Funders who give researchers money
- Which PI scientists are getting funded, who their co-applicants are, track
which research outputs were produced by a given grant
‣ Researchers themselves!
- Automated CVs, receive credit, save time when submitting manuscripts
to journals
31. What makes ORCID different?
• Some key facts:
• ORCID is the only researcher identifier that is not limited to discipline,
institution or geographic area
• ORCID is backed by a non-profit organization with >300 participants
• ORCID is backed by many different stakeholders
• Publishers are an important ORCID stakeholder but are just one part
• ORCID is serious about building an open system
32. Tackling problems
&
Creating new opportunities
33. ORCID uptake by “usual suspect”
stakeholders in scholarly communication
‣ Publishers, funding agencies, universities, libraries
‣ Big payoffs from solving big identification problems - BUT,
big, sprawling organizations take a long time to move
‣ HOWEVER, adoption could well happen relatively rapidly
- .. via integration with manuscript tracking systems
- .. via deposit of profile data from large organizations
‣ Several publishers & their software vendors are already
working on ORCID integration
34. Publisher integration - NPG example
- “Link your account now”
- User authenticates and approves NPG accessing their data
- ORCID returns User to the NPG registration
- NPG registration form is pre-populated with data ORCID sends back
Credit: Veronique Kiermer http://about.orcid.org/sites/default/files/kiermerorcidoutreachmay2012.pptx
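The final step of this flow, pre-populating a registration form, could look roughly like the sketch below. All field names on both the form side and the profile side are hypothetical; the real NPG form and the data ORCID returns are not described in these slides.

```python
def prepopulate(form, profile, field_map):
    """Return a copy of `form` with empty fields filled in from `profile`.

    `field_map` maps form field names to profile keys. Values the user has
    already entered are never overwritten.
    """
    filled = dict(form)
    for form_field, profile_key in field_map.items():
        if not filled.get(form_field) and profile.get(profile_key):
            filled[form_field] = profile[profile_key]
    return filled


# Hypothetical mapping: form field name -> profile key
FIELD_MAP = {
    "first_name": "given_names",
    "last_name": "family_name",
    "email": "email",
}
```

Leaving user-entered values untouched keeps the user in control of what ends up in the local account.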
35. Creating opportunities in the Long Tail
‣ Lots of small, diverse online scholarly services - more
nimble than bigger players so faster to onboard
‣ Rich flora of grassroots initiatives that can benefit from
integration with ORCID
- Example: #altmetrics movement
Total Impact - http://total-impact.org
ScienceCard - http://sciencecard.org
- Example: genetic variation databases
small-to-medium size data submissions
&
expert curation
** Part of the GEN2PHEN mission **
36. How? Play the social networking card
‣ Now in modern social networking arena:
- Rich flora of 3rd party applications built around social IDs
that users already have on Twitter, Facebook and other sites
‣ Coming soon:
- Lots of online scholarly communication tools built around
ORCID IDs that scholars already have
- Ease of use - build on users’ familiarity with mainstream apps
- Rich ecosystem of ‘ORCID apps’
- Lower the barrier to participation - tackle the “multiple
profiles syndrome”
37. Technologically, this is not rocket science
39. The Phase 1 ORCID service
will support this stuff!!!
‣ Simple RESTful API - focus on making integration *easy*
‣ Standard OAuth 2 authn/authz so users can:
- link their local accounts with their ORCID ID
- authorize client apps to access non-public profile data
- authorize client apps to add/update profile data on their behalf
‣ USER DRIVEN - up to individual authors/contributors
whether to link accounts
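A minimal sketch of the client side of an OAuth 2 authorization code flow, as a client app linking a local account might use it. The endpoint URL is a placeholder and not a real ORCID address, and the scope value in the usage note is made up; only the query parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`, `code`, `error`) come from the OAuth 2 specification.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Placeholder endpoint, NOT a real ORCID URL
AUTHORIZE_URL = "https://id-provider.example/oauth/authorize"


def build_authorize_url(client_id, redirect_uri, scope, state=None):
    """Return (url, state) for sending the user to the authorization page."""
    # `state` is a random value used to tie the callback to this request
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}", state


def parse_callback(callback_url, expected_state):
    """Extract the authorization code from the redirect, checking `state`."""
    params = parse_qs(urlparse(callback_url).query)
    if "error" in params:
        # The user declined, or the server rejected the request
        raise PermissionError(params["error"][0])
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible cross-site request forgery")
    return params["code"][0]
```

A client would call `build_authorize_url("my-client", "https://app.example/cb", "read-profile")`, redirect the user to the returned URL, and later pass the callback URL to `parse_callback`. The returned code is then exchanged for an access token at the provider's token endpoint, which is not shown here.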
41. IRISC2011 workshop @CSC, Helsinki
‣ Workshop themes
- unambiguously identifying authors/creators & attributing their scholarly works
- individual identification and access mgmt in the context of identity federations
‣ Workshop aims
- Raising overall awareness of key technical and non-technical challenges,
opportunities and developments.
- Facilitating a dialogue, cross-pollination of ideas, collaboration and coordination
between diverse – and largely unconnected – communities.
- Identifying & discussing existing/emerging technologies, best practices and
requirements for researcher identification.
‣ >60 participants, ~2/3 from IDF community
43. IRISC2011 workshop @CSC, Helsinki
‣ Workshop report published online
http://irisc-workshop.org/irisc2011-helsinki/workshop-report/
45. IRISC2011 workshop @CSC, Helsinki
‣ Recommendations / suggested actions from report
[...]
Opportunities for collaboration and interoperability
- Service providers should investigate possibilities for authenticating ‘homeless’ users
(i.e. freelance researchers with no affiliation, or affiliated researchers at institutions
which aren't part of an IDF) via ORCID or other trusted source of author identifiers that
may join IDFs in the future.
- The IDF community and ORCID should work to harmonize core profile fields/attributes
which are likely to hold institution-validated information.
- Establish a pilot on federated access management to a biomedical data provider
together with EGA, eduGAIN and related national IDFs.
- Investigate how an ORCID or other author identifier and its provenance can be
modelled as an attribute in IDF and interfederation services, as part of a set of
attributes automatically released by the identity provider.
[...]
46. Match made in heaven, no?
- Opportunities for collaboration -
‣ Pilot ORCID <-> IDF integration in high-value use cases
‣ Starting points - some suggestions
- A) Authenticate via federated identity to central ORCID system
- User authenticates the first time, registers, and their new profile is
populated on the fly with organization-validated information released by the IdP
- B) Starting from institution, link ORCID account with inst. user
account and pull in ORCID identifier + publication data
- Would need IDF attribute to carry universal, validated author identifier
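Starting point A can be sketched as a function that builds a new profile from attributes released by an identity provider and records where each value came from. The attribute names `displayName`, `mail` and `eduPersonPrincipalName` follow common inetOrgPerson/eduPerson usage; `authorIdentifier` and all profile field names are hypothetical, standing in for the attribute that would still need to be defined.

```python
def profile_from_attributes(attributes, idp_entity_id):
    """Build a new profile from attributes released by an identity provider.

    `attributes` maps attribute names to lists of values. Each value is stored
    together with its provenance, so institution-validated data can later be
    told apart from self-asserted data.
    """
    # Left: released attribute name; right: hypothetical profile field
    wanted = {
        "displayName": "name",
        "mail": "email",
        "eduPersonPrincipalName": "institutional_id",
        "authorIdentifier": "author_id",  # hypothetical attribute, not a standard one
    }
    profile = {}
    for attr, field in wanted.items():
        values = attributes.get(attr)
        if values:
            # Keep the first released value and note which IdP vouched for it
            profile[field] = {"value": values[0], "validated_by": idp_entity_id}
    return profile
```

Attributes the identity provider does not release are simply left out, so the user can fill them in later as self-asserted data.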
47. Where do we go from here?
‣ Get involved - join the discussion
- http://about.orcid.org - Main website, general info
- http://dev.orcid.org - Developer web portal - NEW!!
- Test “sandbox” system (bring your own sand!)
http://devsandbox.orcid.org
http://api.devsandbox.orcid.org
- Contact me, as (provisionally) co-chair of ORCID’s Technical
Outreach Working Group, together with Elsevier’s Mike Taylor
48. Acknowledgements
GEN2PHEN Consortium - http://www.gen2phen.org/about-gen2phen/partners
Prof Anthony J. Brookes Bioinformatics Group, Leicester
This work has received funding from the
European Community's Seventh
Framework Programme (FP7/2007-2013)
under grant agreement number 200754 -
the GEN2PHEN project.
Contact me!
<gthorisson@gmail.com>
http://www.linkedin.com/in/mummi
http://www.twitter.com/gthorisson
http://www.gthorisson.name
Published under the CC BY license:
http://creativecommons.org/licenses/by/3.0/