SC CTSI Team Members Present at International VIVO Conference: Over 115 authors from five continents presented at the 2014 VIVO conference focused on opportunities created by advancing data sharing and team science. Topics ranged from implementation to ontologies, visualization, and collaboration in research and scholarship.
Large research organizations have become increasingly interested in VIVO and other Research Networking Systems (RNSs) as a means of addressing multidisciplinary research challenges, fostering collaboration, and increasing the visibility of individual investigators. Read the full story: http://sc-ctsi.org/index.php/news/sc-ctsi-team-members-present-at-international-vivo-conference#.U_aiTmK9mKU
It’s All About the Links: An Automated Approach to Promote Research Networking Systems More Efficiently
INTRODUCTION
Large research organizations have begun deploying Research Networking Systems
(RNSs) that allow for easy search of their research experts and related networks as a
means of fostering research and collaborations.
However, increasing the adoption of RNSs has posed challenges. Previous data
show that cross-linking (i.e., links to the RNSs on other websites) serves as an effective
tool to significantly increase the traffic to and adoption of RNSs.
Embedding those cross-links on other websites manually requires establishing
partnerships and is a time-intensive effort that may take months or years before yielding
significant results.
We developed a more effective solution: an automated approach (i.e., NLA, Name
Linking Application) that identifies researchers' names on web pages (e.g.,
University news, directory, department pages) and links them to their respective
researcher page (if it exists).
METHODS
Our NLA utilizes natural language processing based on Named Entity Recognition
(NER). We use an NER library based on CRF (Conditional Random Fields) originally
developed at Stanford University to identify names in text.
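As a rough illustration of the step after NER tagging, the sketch below groups consecutive PERSON-tagged tokens into candidate names and keeps only those that fit the poster's definition of a full name (first name, optional middle name, last name). It assumes the tagger output has already been converted to (token, tag) pairs; the function name `extract_full_names` and the sample tokens are made up for illustration and are not taken from the NLA code.

```python
from typing import List, Tuple


def extract_full_names(tagged_tokens: List[Tuple[str, str]]) -> List[str]:
    """Group consecutive PERSON tokens and keep names of two or three tokens."""
    names: List[str] = []
    current: List[str] = []
    # A trailing non-PERSON sentinel flushes a name that ends the text.
    for token, tag in tagged_tokens + [("", "O")]:
        if tag == "PERSON":
            current.append(token)
            continue
        # First + last (2 tokens) or first + middle + last (3 tokens).
        if 2 <= len(current) <= 3:
            names.append(" ".join(current))
        current = []
    return names


# Hypothetical tagger output for one sentence (invented sample data).
tagged = [
    ("Dr.", "O"), ("Jane", "PERSON"), ("Q.", "PERSON"), ("Doe", "PERSON"),
    ("met", "O"), ("Smith", "PERSON"), ("at", "O"), ("USC", "ORGANIZATION"),
    ("with", "O"), ("John", "PERSON"), ("Roe", "PERSON"),
]
print(extract_full_names(tagged))  # ['Jane Q. Doe', 'John Roe']
```

The lone surname "Smith" is dropped because it does not meet the full-name definition.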
Our application searches the RNS database for the names it finds. Every match in the
RNS database receives a score based on the quality of the match. We apply a score
threshold to discard distant (poor) matches, which reduces false positives. To evaluate
accuracy, we manually checked each news article to verify that every name in it
was automatically recognized and linked to the correct profile.
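The scoring-and-threshold step might look like the following minimal sketch. The poster does not say how the score is computed or which threshold is used, so the string-similarity ratio from Python's `difflib`, the value 0.9, the function names, and the example profile URLs are all assumptions chosen only to make the idea concrete.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def match_score(found_name: str, profile_name: str) -> float:
    """Similarity in [0, 1]; a stand-in for the NLA's unspecified score."""
    return SequenceMatcher(None, found_name.lower(), profile_name.lower()).ratio()


def best_match(
    found_name: str,
    profiles: Dict[str, str],
    threshold: float = 0.9,  # assumed value, not from the poster
) -> Optional[Tuple[str, float]]:
    """Return (profile URL, score) for the best match, or None if below threshold."""
    scored = [(match_score(found_name, name), url) for name, url in profiles.items()]
    if not scored:
        return None
    score, url = max(scored)
    return (url, score) if score >= threshold else None


# Hypothetical profile index: display name -> profile URL.
profiles = {
    "Jane Doe": "https://rns.example.org/profile/1",
    "John Roe": "https://rns.example.org/profile/2",
}
print(best_match("Jane Doe", profiles))
print(best_match("Alex Kim", profiles))
```

An exact name yields a score of 1.0 and is linked; an unrelated name falls below the threshold and is left unlinked, which is how the threshold suppresses false positives.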
To make the NLA easy to use, we have given it a RESTful interface to which any client on
the Internet can send content scanning requests. Thus, the NLA can be used with any
website either i) dynamically, by injecting a client-side script into the website's pages, or ii)
statically, by integrating the NLA with a Content Management System (CMS) used by the
website.
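A content scanning request could be assembled roughly as follows. The poster does not document the endpoint, HTTP method, or payload format, so the URL `https://nla.example.org/scan`, the POST method, and the JSON field `content` are hypothetical placeholders. The sketch only builds the request and does not send it.

```python
import json
from urllib.request import Request

# Hypothetical endpoint; the real NLA URL is not given in the poster.
NLA_ENDPOINT = "https://nla.example.org/scan"


def build_scan_request(html: str, endpoint: str = NLA_ENDPOINT) -> Request:
    """Wrap page HTML in a JSON body for a POST to the (assumed) scan endpoint."""
    body = json.dumps({"content": html}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_scan_request("<p>Jane Doe received an award.</p>")
print(req.get_method(), req.full_url)
```

A CMS integration would send such a request when an article is saved (for example with `urllib.request.urlopen(req)`), whereas the dynamic variant would issue the equivalent call from a script in the browser.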
CONCLUSION
Previous data show that the RNS search traffic is heavily dominated by commercial
search engines such as Google and Bing, which is why we believe that this application
will serve research organizations well to efficiently promote and increase the
adoption of their RNSs without requiring time-intensive manual cross-linking.
The tool allows for automatic and accurate cross-linking within news articles and website
content to specific profile pages, improving search engine ranking and generating
referral traffic.
NEXT STEPS
Our NLA can be extended to any RNS (including VIVO) with minimal code
modifications. We will develop a client-side library that can be installed on any website.
This library will allow the NLA tool to automatically link content on partner websites.
ACKNOWLEDGEMENTS
This project was partially supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number
UL1TR000130 (formerly by the National Center for Research Resources, Award Number UL1RR031986). The content is solely the responsibility of the authors
and does not necessarily represent the official views of the National Institutes of Health.
Authors: Anirudha Kumar, Praveen Angyan, Francis Ukpolo, Katja Reuter, PhD
Southern California Clinical and Translational Science Institute (SC CTSI)
RESULTS
100% of full names in news articles were detected. (Definition of full name: first name
followed by last name, with an optional middle name.)
97.4% of names in news articles were linked to the correct page on our Profiles RNS
installation. In four instances the NLA did not succeed, because of name contractions
or a changed legal name.
10 seconds per article: the NLA processed 144 news articles containing 156 names.
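The reported figures can be cross-checked with a few lines of arithmetic. This assumes that the four failed instances are counted against the 156 names, and that the 10 seconds is an average for articles processed one after another; neither assumption is stated explicitly on the poster.

```python
names_total = 156          # names found in the processed articles
failures = 4               # instances where linking did not succeed
articles = 144             # news articles processed
seconds_per_article = 10   # reported processing time per article

linked_correctly = names_total - failures
accuracy_pct = round(100 * linked_correctly / names_total, 1)
total_minutes = articles * seconds_per_article / 60

print(linked_correctly, accuracy_pct, total_minutes)  # 152 97.4 24.0
```

Under these assumptions, 152 of 156 names were linked correctly, matching the reported 97.4%, and the full set of articles would take about 24 minutes to process.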
HOW IT WORKS
[Figure: three panels showing a news article on the organizational website, the linked investigator name in the news article, and the linked investigator profile.]
[Figure: Framework of the NLA. Components: Content Management System (CMS), Content Processor, Stanford Named Entity Recognizer (NER), Profiles RNS REST API. Data flows: unprocessed HTML content, processed HTML content (with profile links), full name search, search results.]
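The content processor step named in the framework figure, turning unprocessed HTML into processed HTML with profile links, can be sketched as below. This is a simplified illustration rather than the NLA's actual implementation: the function name `link_names`, the regex-based tag splitting, and the example URLs are assumptions, and a production version would more likely use a real HTML parser.

```python
import html
import re
from typing import Dict


def link_names(content: str, profile_urls: Dict[str, str]) -> str:
    """Wrap known names in profile links, skipping tags and existing anchors."""
    if not profile_urls:
        return content
    # Longest names first so a longer name wins over a shorter one it contains.
    alternatives = sorted(profile_urls, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, alternatives)) + r")\b")

    def to_link(match: "re.Match[str]") -> str:
        name = match.group(0)
        url = html.escape(profile_urls[name], quote=True)
        return '<a href="{}">{}</a>'.format(url, name)

    # Splitting on tags keeps attribute values and markup out of the search.
    parts = re.split(r"(<[^>]+>)", content)
    inside_anchor = False
    for i, part in enumerate(parts):
        if part.startswith("<"):
            lowered = part.lower()
            if lowered.startswith("<a ") or lowered == "<a>":
                inside_anchor = True
            elif lowered.startswith("</a"):
                inside_anchor = False
            continue
        if not inside_anchor:  # never nest a link inside an existing link
            parts[i] = pattern.sub(to_link, part)
    return "".join(parts)


# Hypothetical names and profile URLs (invented sample data).
urls = {
    "Jane Doe": "https://rns.example.org/profile/1",
    "John Roe": "https://rns.example.org/profile/2",
}
page = '<p>Jane Doe met <a href="/x">John Roe</a>.</p>'
print(link_names(page, urls))
```

Here "Jane Doe" is wrapped in a link to her profile, while "John Roe" is left untouched because it already sits inside an anchor.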