CARLI Usage Stats Keynote 20130325

What's the Use: A Symposium on Usage Statistics

John McDonald & Jason Price, PhD
CIO & AVP Interim Library Director
Claremont Colleges Library
March 25, 2013
CARLI Electronic Resources and Collections Working Groups

Overview: a Keynote in three parts

1. Broad perspective: Where are we now?

2. Detailed perspective: Addressing the
challenges of usage statistics

3. The latest: our present & future projects

The Promise
& The Peril

How many people do you need in a room before it is
highly likely that two share a birthday?

a) Less than 30

b) 30 – 60

c) More than 60

d) 367
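
The quiz can be checked numerically. The following is a minimal sketch, assuming 365 equally likely birthdays and ignoring leap days (both simplifications, and the function name is ours, not from the slides). It computes the chance that at least two people in a room of n share a birthday.

```python
def shared_birthday_probability(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday.

    Assumes birthdays are uniform over `days` and independent.
    """
    if n > days:
        # Pigeonhole principle: more people than days guarantees a match.
        return 1.0
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


for n in (23, 30, 60, 367):
    print(n, round(shared_birthday_probability(n), 3))
```

Under these assumptions the probability passes 50% at 23 people and is close to certain well before 60, while only 367 (allowing for a leap day) guarantees a match.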

Which treatment for kidney stones is more
successful?

                 Treatment A      Treatment B
Success Rates    78% (273/350)    83% (289/350)

                 Treatment A               Treatment B
Small Stones     Group 1: 93% (81/87)      Group 2: 87% (234/270)
Large Stones     Group 3: 73% (192/263)    Group 4: 69% (55/80)
Success Rates    78% (273/350)             83% (289/350)
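
The reversal in the kidney stone numbers can be reproduced directly from the counts on the slide. This is a small sketch; the dictionary layout and function names are illustrative choices, and only the counts come from the slide.

```python
# (successes, patients) per treatment and stone size, taken from the slide.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes: int, patients: int) -> float:
    """Success rate as a fraction."""
    return successes / patients


def overall(treatment: str) -> tuple[int, int]:
    """Pool both stone sizes for one treatment."""
    groups = results[treatment].values()
    return sum(s for s, _ in groups), sum(n for _, n in groups)


for size in ("small", "large"):
    a = rate(*results["A"][size])
    b = rate(*results["B"][size])
    print(f"{size} stones: A {a:.0%} vs B {b:.0%}")

print(f"pooled: A {rate(*overall('A')):.0%} vs B {rate(*overall('B')):.0%}")
```

Treatment A wins within each stone size, yet Treatment B wins once the groups are pooled, because A was used far more often on the harder (large stone) cases.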

So what? What will
the data tell us…

• Harvesting the Crop: Implementing a Usage Statistics Management System at Georgia State
• Social Media, ROI and Cookie Day
• How Do E-Resources Contribute to Teaching and Learning? Findings from the Lib-Value Project
• Using Data Visualization Tools for Collection Analysis
• To Keep, or Not to Keep: The Effect of Discovery Tools on Licensed Resources
• Everything That's Wrong with E-book Statistics - A Comparison of E-book Packages
• Discovery & Usage: The Foundation of a Powerful Collection

Progress
Commonly agreed upon measures
Routine methods of transmission
Regular formatting of files
Standard dates for delivery
Audits of reports
Certification of compliant vendors
Established process for refinement

Still evolving
Comprehensive coverage of publishers
Sophistication on Ebooks & Databases
Automation
Further granularity
Measures for non-text usage (article
parts)
Article level metrics

Usage Statistics informing
decisions about acquisitions

Cost Projections - GVSU

                        # of Ebooks   Total $ of Ebooks   Additional    Total Savings over
                        Purchased     not Purchased       STL Costs     Existing Plan
Purchase on 4th Loan    89            $17,382.31          $3,327.20     $14,055.11
Purchase on 5th Loan    58            $24,512.55          $4,621.09     $19,891.46
Purchase on 6th Loan    34            $25,722.11          $5,041.64     $20,680.47
Purchase on 7th Loan    22            $26,899.83          $5,324.84     $21,579.99

Doug Way and Julie Garrison, “Financial Implications of Demand-Driven
Acquisition,” in David Swords (ed.) Patron-Driven Acquisitions: History
and Best Practices. (Berlin: De Gruyter Saur, 2011), p. 148.

decisions about print collection
management

DU Storage study

Levine-Clark, Michael, “Analyzing and Describing Collections Use: Strategies for
Managing a Library Move,” LYRASIS Ideas and Insights, Webinar, May 4, 2012.
http://www.slideshare.net/MichaelLevineClark/

decisions about shared print
projects

Each “Title-Holding” has different characteristics

                            Dominguez Hills   Fullerton   Long Beach   Los Angeles   Northridge   Pomona
Total Circulations          0 circs           19 circs    16 circs     12 circs      13 circs     8 circs
Last Circulation Date       -none-            11/30/11    12/16/08     5/30/07       4/27/07      3/11/08
Date added to Collection    6/27/02           4/23/02     9/21/01      5/03/00       11/11/02     8/11/00

Sustainable Collections Services, Maine Shared Collections Strategy Planning
Meeting, http://www.slideshare.net/Maine_SharedCollections/mscs-scs-planning-meeting-rick-lugg-andy-breeding

Sample Pilot Group - Title-Holdings by Holdings Level
[Stacked bar chart. X-axis: # of Pilot Group Libraries Holding Title (1, 2, 3-6). Y-axis: 0 to 2,000,000. Series: 0 circs, 1-3 Circs, 4+ circs. Data labels: 779,756; 305,438; 539,718; 257,739; 311,240; 220,071; 560,107; 362,050; 239,202.]

Resource Sharing: CAMINO Collections
[Bar chart, one bar per library: CUC, LMU, Oxy, Pep, UOP, CST, Wstmt, CalArts, CBU, Dom, WJU, WUHS, AJU, HNU. X-axis: 0 to 1,200,000.]

Books held only by library
Books held by BOTH library and the rest of Camino
Books held only by the rest of Camino

decisions about print & online
resources

Holy grail: Understanding User
Behavior

Tracking Impact Beyond Articles

http://www.zazzle.com/statistics_means_never_having_to_say_youre_certai_tshirt-235669028746970031

Part 2: Addressing the
Challenges of Usage Stats
1. Comparability
• Package price per use
• Defining the appropriate range(s) of cost per use
• Practical applications
2. Reliability
• Impact of mobile, discovery & harvesters
3. Prediction
• Demand Driven Acquisition
• Number of books available <> Size of budget
4. Context – Data about our data

Apples and oranges are both round(ish)…

Challenge 1: Comparing Package Price Per View

Cross-package comparison
S3. pkgID   Total Use   SubsCost     UnSubsCost   Overall PPV
1           140048      $1,652,000   $182,000     $13.10
2           20341       $333,000     $10,000      $16.86
3           13572       $282,000     $21,000      $22.33

So Pkg 1 is a better value than Pkg 3?

It might not be…

html to pdf Ratios vary widely for these packages
[Bar chart: html views and pdf downloads (# of views) for Packages 1, 2 and 3. Bar labels: 48047, 32688, 13004, 4066, 352, 568. Ratio labels: 1:1.3, 1:23, 1:12.]
How many pdfs in Pkg 1 are duplicates of html views?
(fmi: See Davis & Price, 2006 JASIST 57(9))

Getting a pdf from Package 1…

‘Get article’ links directly to the html version…
then the user downloads the pdf…
…2 uses are recorded for 1 pdf

Total full text views suffer from duplication issues

Package value revisited
S3. pkgID   Total Use      SubsCost     UnSubsCost   Overall PPV
1           140048         $1,652,000   $182,000     $13.10
2           20341          $333,000     $10,000      $16.86
3           13572          $282,000     $21,000      $22.33

vs.

pdf requests only tell a different story!
pkgID       Est. pdf Use   SubsCost     UnSubsCost   Overall PPP
1           83469          $1,652,000   $182,000     $21.97
2           18734          $333,000     $10,000      $18.31
3           13287          $282,000     $21,000      $22.80
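
The two rankings can be recomputed from the figures on these slides. The sketch below assumes that the overall price per view (PPV) and price per pdf (PPP) are (SubsCost + UnSubsCost) divided by the relevant use count. That formula is our reading and is not stated on the slide, although it appears to reproduce the published figures. Field and function names are illustrative.

```python
# Figures from the slides; costs in dollars.
packages = [
    {"pkg": 1, "total_use": 140048, "pdf_use": 83469, "subs": 1_652_000, "unsubs": 182_000},
    {"pkg": 2, "total_use": 20341, "pdf_use": 18734, "subs": 333_000, "unsubs": 10_000},
    {"pkg": 3, "total_use": 13572, "pdf_use": 13287, "subs": 282_000, "unsubs": 21_000},
]


def price_per_use(package: dict, use_key: str) -> float:
    """Assumed formula: total package cost divided by the chosen use count."""
    return (package["subs"] + package["unsubs"]) / package[use_key]


def rank(use_key: str) -> list[int]:
    """Package IDs ordered from cheapest to most expensive per use."""
    ordered = sorted(packages, key=lambda p: price_per_use(p, use_key))
    return [p["pkg"] for p in ordered]


for p in packages:
    ppv = price_per_use(p, "total_use")
    ppp = price_per_use(p, "pdf_use")
    print(f"Pkg {p['pkg']}: PPV ${ppv:.2f}, PPP ${ppp:.2f}")

print("Cheapest first by total views:", rank("total_use"))
print("Cheapest first by pdf requests:", rank("pdf_use"))
```

By total views, Package 1 looks cheapest. By estimated pdf requests, Package 2 moves ahead of it, which is the point of comparing both measures.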

Addressing Challenge 1:
Comparing Package Price Per View
When comparing packages, both total views
and PDF downloads should be compared

Extension of principle: Journal report 1B

JR 1a JR 1b
ARCHIVE FRONTFILE

Challenge 2:
Defining acceptable range(s) of
cost per use

Among packages

Within packages

Reality Check
Should we expect cost per use to be
equivalent among packages?
Content Quality
Business Model
For Profit vs Cost Recovery
Exposure in Discovery tools
Title list accuracy
Backfile access
ASSUMPTIONS

Reality!

Acceptable CPU range?

a) 0-$6
b) 0-$12
c) 0-$24
d) 0-$50
e) It depends on _________
f) Can’t say / Don’t know

Consortial Benchmarks

SCELC Package 'W' Overall Price per Use
[Bar chart. Y-axis: Price per full text article view, $0.00 to $50.00. X-axis: Consortium Member 1-24 (sorted by decreasing spend). Annotation: Use data not available.]

Subscribed titles w/in a pkg - Apples to apples?
[Chart. Y-axis: Price Per Use (PPU), $0 to $1,200. X-axis: Subscribed Title # (ordered by PPU), 0 to 50. Annotated titles: $4300/year, 11 uses; $8800/year, 33 uses; $3400/year, 17 uses.]

Strict title level cost per view is misleading

Cost per view by access type
Access Type                              Cost per view
All Titles [n=537]                       $13.41
Subscribed Titles [n=192]                $30.51
Unsubscribed (Leased) Titles [n=345]     $0.81

Best practices for usage comparison tasks
1. Goal - Identify pricing inequity
a. Best accomplished by consortial benchmarking
b. Requires readily available package level cost per use
across consortial participants
c. Leverage COUNTER consortial reports and economy
of scale of consortial specialist

2. Goal - Identify lower value packages
a. Use both total views and pdf download comparisons

3. Goal - Identify lower value titles
a. Only after targeting specific lower value packages
b. Recognize that by title price per use comparison is only
valid within a package

Challenge 2: The convenience/reliability trade off

COUNTER R4: Search activity generated by
federated search engines and other automated
search agents should be included in separate
“Searches_federated and automated” counts
…and are NOT to be included in the “Regular
Searches” counts.

Case: Direct from Google IP access
  Search inflation impact: Low? [Cost is granularity of usage stats]
  Full text inflation impact: Low?

Case: Mobile devices
  Search inflation impact: Unlikely to be significant
  Full text inflation impact: Low to None

Case: Federated search engines (built into some discovery tools) automate searches
  Search inflation impact: Significant, but COUNTER rules require separate
  reporting & number of searches has always had dubious meaning
  Full text inflation impact: None

Case: Harvesters (e.g. Quosa) automate article downloads from search results
  Search inflation impact: Same as federated search
  Full text inflation impact: Potentially very high

Harvesters (like Quosa): the real threat?

Usage factor may address the harvester challenge

Challenge 3: Prediction – Coming soon!
Observations:
• Libraries prefer predictability over savings!
• Title level journal usage is remarkably
predictable year on year
• Usage driven purchasing is ripe for modelling
based on this predictability

Example: Demand driven ebook forecasting

Estimated List Size

-OR-

Estimated Annual Expenditure =
List Size × [(% visible list purchased × mean book price) +
(% visible list w STL × mean cost per STL × mean STL per title)]
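
The forecast can be run in either direction: from list size to expenditure, or from a budget back to the list size it supports. The sketch below assumes that List Size multiplies both the purchase term and the STL (short-term loan) term. All function names, parameter names and input values are hypothetical placeholders, not figures from the talk.

```python
def cost_per_visible_title(pct_purchased: float, mean_book_price: float,
                           pct_with_stl: float, mean_cost_per_stl: float,
                           mean_stl_per_title: float) -> float:
    """Expected annual spend per title exposed to users (assumed reading)."""
    purchase_part = pct_purchased * mean_book_price
    stl_part = pct_with_stl * mean_cost_per_stl * mean_stl_per_title
    return purchase_part + stl_part


def estimated_annual_expenditure(list_size: float, **rates: float) -> float:
    """Forward direction: list size -> expected annual spend."""
    return list_size * cost_per_visible_title(**rates)


def estimated_list_size(budget: float, **rates: float) -> float:
    """Reverse direction: budget -> list size that budget can support."""
    return budget / cost_per_visible_title(**rates)


# Hypothetical inputs for illustration only.
rates = {
    "pct_purchased": 0.02,      # 2% of visible titles trigger a purchase
    "mean_book_price": 80.0,
    "pct_with_stl": 0.05,       # 5% of visible titles get at least one STL
    "mean_cost_per_stl": 10.0,
    "mean_stl_per_title": 1.5,
}

spend = estimated_annual_expenditure(10_000, **rates)
print(f"Expected spend for 10,000 titles: ${spend:,.2f}")
print(f"Titles supportable on $50,000: {estimated_list_size(50_000, **rates):,.0f}")
```

In practice the rates would come from a library's own demand-driven acquisition history, which is why the retrospective data discussed under Challenge 4 matters.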

Challenge 4 – Context = metadata!

• We do need good data about our data
• Data quality is more than just accuracy
• Retrospective studies require history!
• Circulation Statistics
• Dates of profile changes
• Cross library comparisons
• In an ideal world we’d share datasets with rich
metadata
• Library science is far from this ideal world
• An example of the power of good retrospective
data…

Total Books & Usage

Library   Model   User-Selected   Pre-Selected   Usage by Download   Usage Read Online
A         MIX     1131            552            6773                9888
B         MIX     5246            2612           42880               38329
C         USER    2198            102            0                   11801
D         USER    3010            48             697                 15126
E         MIX     4159            909            17396               25604
F         PRE     0               1451           4905                3082
G         PRE     31              2154           7001                4459
H         USER    801             0              556                 415
I         MIX     305             336            3334                2568
J         USER    2799            53             5                   13349
K         MIX     147             276            2436                2283
TOTAL             19,831          8,496          85,983              126,904

Data required
• Book purchase date
• Book purchase type
• Many years of use
• Different types of use
• Library purchasing profile
• Library list profile (what content was excluded)
• Individual user IDs (anonymized)
• Came from 4 files per library with a total of 69
data elements….
• We found one vendor that invested in library-facing
reports with the level of data needed; there are
few others…
• Addressing the challenge: a consortial solution?

Part 3: Our present & future
1. Improving usage stats collection
a. (External) Consortial PaperStats
b. (Internal) Dublin Six AUDITOR
2. Improving usage stats visualization
a. Excel Conditional formatting
b. Splunk for Dashboard Creation…
3. Better database metrics
4. Improving on Journal number
comparisons
5. Usage Factor for Journal Evaluation

Consortia: Enhancements

Track stats for each member

Automatic import of consortia stats

SCELC PaperStats
by the numbers
Total number of full text downloads tracked for
SCELC: 312,908,657
Total counter reports downloaded: 2000+
Total number of logins: 387
Number of month records: 20.3M
Earliest year covered: 2003
Total number of reports being harvested: 15
Total number of institutions covered: 95
Total number of participants: 14

Click through to article and user level detail!!!

Visualization (Excel conditional formatting)

Splunk for dashboard visualization

Better database metrics (beyond searches & sessions)

# of Online Journal Subscriptions: meaningful?
[Line chart, 2004 to 2009. Y-axis: 0 to 50000. Series: Claremont Colleges, 2nd Quartile, Median, 1st Quartile.]

Beyond numbers of journals & total usage

• Knowledge base & Usage statistics comparisons
• Selected group of peers with same
knowledgebase & stats consolidation vendor
• Run comparisons in Access & Excel

Usage Factor Formula
Usage Factor =

Total usage over period ‘x’ of articles published during period ‘y’
÷
Total articles published during period ‘y’
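
The formula can be applied to article-level usage records along these lines. This is a sketch only: the record layout, function name and sample numbers are invented for illustration and are not taken from any Usage Factor specification.

```python
def usage_factor(articles: list[dict], usage_period: set[int],
                 publication_period: set[int]) -> float:
    """Total usage in period x of articles published in period y,
    divided by the number of articles published in period y."""
    published = [a for a in articles if a["pub_year"] in publication_period]
    if not published:
        raise ValueError("no articles published in the publication period")
    total_usage = sum(
        a["usage_by_year"].get(year, 0)
        for a in published
        for year in usage_period
    )
    return total_usage / len(published)


# Hypothetical records: publication year plus usage counts per calendar year.
sample = [
    {"pub_year": 2011, "usage_by_year": {2011: 10, 2012: 30}},
    {"pub_year": 2011, "usage_by_year": {2012: 20}},
    {"pub_year": 2009, "usage_by_year": {2012: 100}},  # outside period y
]

print(usage_factor(sample, usage_period={2011, 2012}, publication_period={2011}))
```

Only the two 2011 articles count here, so the heavily used 2009 article does not raise the result.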

Impact and usage factor ranks are not related

[Chart comparing journal ranks. Both axes: (lower)-->RANK-->(higher), scale 0 to 140. Series: IF_rank, UF_rank(All), UF_rank(All)--not ISI rated. Annotation: num art?]
