Harnessing User Library Statistics for Research Evaluation and Knowledge Domain Visualization
1. WWW'2012 Workshop on Large Scale Network Analysis
www.know-center.at
16/04/2012
Harnessing User Library
Statistics for Research
Evaluation and Knowledge
Domain Visualization
Peter Kraker (Know Center Graz)
Christian Körner (TU Graz)
Kris Jack (Mendeley London)
Michael Granitzer (University of Passau)
Funded by
2. Introduction
Information overload is not a contemporary issue
Science has been growing exponentially
for the last 400 years (Price 1961)
Papers (Larsen/von Ins 2010)
Scientists (NSF 2010)
Problems
Lacking overview of (sub-)disciplines
Simultaneous discoveries
Repeated research
Price 1961, extended by Leydesdorff (2008)
© Know-Center 2011
3. Quantitative Analysis of Science based on
Large Scale Citation Networks
Research Evaluation – Impact Factor
Knowledge Domain Visualization
Chen & Carr 1999
http://www.scimagojr.com
4. Quantitative Analysis of Science based on
Usage Data
Problems of citation-based approaches
Citations take a long time to become available (~ 3-5 years)
Corpus has to be limited, so results differ (Meho & Yang 2004)
Possible solution: usage data
References are available earlier
Not restricted to formal communication
Examples
Click data/download data (e.g. PLoS, arXiv)
Social reference management: “Add to library” (e.g.
bibsonomy, Mendeley, Zotero)
5. Mendeley
Social reference management platform
Organizing personal research library
Creating user profile
Reading and annotating PDFs
Sharing references/PDFs
Crowdsourced research catalog
1.5 million users
50 million unique articles
http://www.mendeley.com/research-papers/
6. Empirical Studies
Two studies in the main areas of quantitative analysis of
science
Large-scale impact factor analysis
Exploratory knowledge domain visualization of the
emerging research field of Technology Enhanced Learning
Basis: User library statistics from Mendeley
Measures: Occurrences and co-occurrences of references
in user libraries
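The two measures named above, occurrences and co-occurrences of references in user libraries, can be sketched as simple counts. This is a minimal illustration, not the code used in the studies: the function name `library_statistics`, the toy `libraries` mapping, and the choice to count a document once per library are all assumptions.

```python
from collections import Counter
from itertools import combinations


def library_statistics(libraries):
    """Count document occurrences and pairwise co-occurrences across user libraries.

    `libraries` maps a user id to the list of documents in that user's library
    (hypothetical input format).
    """
    occurrences = Counter()
    co_occurrences = Counter()
    for docs in libraries.values():
        # Assumption: a document counts once per library, even if listed twice.
        unique_docs = sorted(set(docs))
        occurrences.update(unique_docs)
        # Each unordered pair is stored once, as an alphabetically sorted tuple.
        co_occurrences.update(combinations(unique_docs, 2))
    return occurrences, co_occurrences


# Hypothetical toy data: three users and three documents.
libraries = {
    "u1": ["A", "B", "C"],
    "u2": ["A", "B"],
    "u3": ["B", "C"],
}
occ, co = library_statistics(libraries)
print(occ["B"], co[("A", "B")])  # 3 2
```

Document "B" sits in all three libraries, and "A" and "B" share two of them, which is the kind of raw count both studies build on.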
7. Data
437,812 users
18,080,679 unique documents
Snapshot from March 2011
Chart labels (disciplines): Biology, Comp. Sc., Medicine, Engineering, None, Social Sc., Psychology, Education, Business, Physics, Electrical, Chemistry, Environment, Economics, Arts, Humanities, Earth, Management, Materials, Mathematics, Linguistics, Law, Design, Philosophy, Astronomy, Sports
8. Study 1 – Large Scale Analysis of Journal
Impact
MRank: Measuring journal impact with number of readers
Research question: “How do library occurrences reflect
traditional measures of impact based on citations?”
Measures
Occurrences/unique occurrences
Authority score (Kleinberg)
External validation using SCIMago (based on Scopus)
Total number of documents
Citations per document (Impact factor)
Method
Ranking journals for each measure
Calculating Spearman correlations between rankings
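The method step of ranking journals per measure and correlating the rankings can be sketched as follows. This is a dependency-free illustration of Spearman's rank correlation (Pearson correlation computed on average ranks), not the authors' implementation. The journal names and numbers are invented, and `average_ranks` and `spearman` are hypothetical helper names.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Invented example: reader counts vs. citations per document for four journals.
readers = {"J1": 120, "J2": 45, "J3": 300, "J4": 80}
cites_per_doc = {"J1": 2.1, "J2": 0.9, "J3": 1.8, "J4": 1.0}
journals = sorted(readers)
rho = spearman([readers[j] for j in journals],
               [cites_per_doc[j] for j in journals])
print(round(rho, 2))  # 0.8
```

With real data, a library routine such as SciPy's `scipy.stats.spearmanr` would normally replace the hand-written helpers.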
9. Results
Unique occurrences of publications in Mendeley 2010 x Total number of documents in SCIMago 2010

         Overall   Biology   Comp. Sc.   Arts
N        3806      508       225         116
Corr.    0.70      0.76      0.57        0.28

Mendeley library statistics of publications from 2008 and 2009 x Citations per document (impact factor) from SCIMago 2010

            Authority Score   Occurrence
Overall     0.64              0.53
Biology     0.60              0.56
Comp. Sc.   0.60              0.59
Arts        0.52              0.30
10. Study 2 – Knowledge Domain Visualization of
Technology Enhanced Learning
Co-citation as a measure of subject similarity (Small 1973)
Research question: “Can we use library co-occurrences to
visualize a research field that is not yet covered by
traditional subject descriptors?”
Data
Researchers from computer science: 35,560 user libraries
and 1,964,367 articles.
Method
Identify libraries from the field by filtering research interests
Calculate co-occurrences of the most frequently occurring papers
Perform multi-dimensional scaling and hierarchical clustering
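The step from co-occurrences to clusters can be sketched as below. The slides do not state how co-occurrence counts were normalized or which linkage criterion was used, so the distance `1 - co / sqrt(occ_i * occ_j)` and the average-linkage rule are assumptions, as are the function names and the toy counts. Multi-dimensional scaling is left out; it would take the same pairwise distances as input.

```python
from math import sqrt


def distance_matrix(occ, co, docs):
    """Convert co-occurrence counts into distances: 1 - co / sqrt(occ_a * occ_b).

    `co` is assumed to store each unordered pair once, keyed by a sorted tuple.
    """
    dist = {}
    for i, a in enumerate(docs):
        for b in docs[i + 1:]:
            shared = co.get(tuple(sorted((a, b))), 0)
            dist[frozenset((a, b))] = 1.0 - shared / sqrt(occ[a] * occ[b])
    return dist


def average_linkage(docs, dist, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[d] for d in docs]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Invented toy counts: A and B are often saved together, as are C and D.
docs = ["A", "B", "C", "D"]
occ = {"A": 4, "B": 4, "C": 3, "D": 3}
co = {("A", "B"): 3, ("C", "D"): 3, ("A", "C"): 1, ("B", "D"): 1}
dist = distance_matrix(occ, co, docs)
clusters = average_linkage(docs, dist, n_clusters=2)
print(clusters)  # [['A', 'B'], ['C', 'D']]
```

Papers that are frequently saved together end up close to each other and merge early, which is the intuition behind using library co-occurrence as a subject-similarity signal in the same way as co-citation.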
11. Results
Multidimensional Scaling
Hierarchical Clustering
AH: Adaptive Hypermedia
GL: Game-based Learning
CC: Citation Classics
MC: Miscellaneous Publications from TEL
OD: Publications from Other Disciplines
12. Conclusions & Future Work
Results are encouraging…
Significant relationship between library statistics and the
impact factor
Meaningful results in knowledge domain visualization
…but need further validation
Longer periods of time, beyond Scopus (social media)
Outlook
Including more information from the user profile
(discipline, location, academic status)
Getting even closer to readership: incorporating click data
Implementing knowledge domain visualizations in Mendeley
13. www.know-center.at
Thank you for your
attention!
Peter Kraker
@PeterKraker
http://science20.wordpress.com
pkraker@know-center.at
Funded by the Competence Centers Program