3. Overview
2013: many growing pains
Growth in data
Growth in usage
Growth in complexity
2013: a lot of time spent fixing problems
Query performance: working hard to maintain acceptable levels
Query effectiveness: keeping matching rates up
Deposit throughput: dealing with spikes in re-deposits
Deposit processing with <citations>: can cause a lot of message traffic
4. Overview
2013: improvements and some new services
FundRef
Schema changes
Allow MathML in article titles
Allow JATS abstracts in deposits
Support non-CrossRef DOIs as components
Support text-data-mining (TDM)
Stand-alone deposits (easier than sending in all metadata again)
FundRef
CrossMark
Can query on ORCIDs
5. Overview
New service: "which RA" tool
doi.crossref.org/ra/10.5284/1000389
[ { "DOI": "10.5284/1000389", "RA": "Data Cite" }]
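A minimal sketch of how a client might use the lookup shown above. It only builds the request URL and parses a response body of the shape on the slide; it makes no network call. The https scheme and the function names ra_url and parse_ra_response are assumptions for illustration, not part of the service.

```python
import json
from urllib.parse import quote

# Assumption: the slide shows only host and path; the https scheme is a guess.
RA_ENDPOINT = "https://doi.crossref.org/ra/"


def ra_url(doi: str) -> str:
    """Build the lookup URL for one DOI, keeping the slash in the DOI intact."""
    return RA_ENDPOINT + quote(doi, safe="/")


def parse_ra_response(body: str) -> dict:
    """Map each DOI in a JSON response body to its registration agency."""
    records = json.loads(body)
    return {rec["DOI"]: rec.get("RA") for rec in records}


# Response body copied from the slide.
sample = '[ { "DOI": "10.5284/1000389", "RA": "Data Cite" }]'

print(ra_url("10.5284/1000389"))
print(parse_ra_response(sample))
```

Keeping URL construction and parsing separate from the HTTP call makes the sketch usable with whichever HTTP client is already in place.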
6. Roll over from 2013 into 2014
Tweak the query logic to improve precision
Return a DOI even if there are conflicts: publishers often (mistakenly) deposit
a second DOI for an item that already has one, which normally creates a
conflict. When a query finds two or more DOIs from the same publisher for a
given item, we could return one (the most recent).
Reliability and scaling
[Diagram, Current vs. Goal: Deposit System and Query System on top of Data Management; data stores named are Oracle, MySQL, Berkeley, Lucene, and Other]
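The "return a DOI even if there are conflicts" tweak on this slide can be sketched as follows. This is an illustration under stated assumptions, not the production query logic: the Match record, its field names, the use of deposit date to decide "most recent", and the example DOIs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Match:
    # Hypothetical record for one DOI that a query matched.
    doi: str
    publisher_id: str
    deposited: date  # assumption: "most recent" means latest deposit date


def resolve_matches(matches: List[Match]) -> Optional[str]:
    """Return one DOI when every match belongs to a single publisher."""
    if not matches:
        return None
    publishers = {m.publisher_id for m in matches}
    if len(publishers) > 1:
        # Matches from different publishers remain ambiguous in this sketch.
        return None
    # Same publisher, possibly several DOIs: prefer the most recent deposit.
    return max(matches, key=lambda m: m.deposited).doi


same_publisher = [
    Match("10.5555/old", "pub-a", date(2012, 3, 1)),
    Match("10.5555/new", "pub-a", date(2013, 6, 1)),
]
print(resolve_matches(same_publisher))
```

Returning None for cross-publisher matches keeps the sketch conservative: only the same-publisher case described on the slide is resolved automatically.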
7. Some 2013 fun facts
504 internal tickets created, 154 are still open
459 tickets closed so far in 2013 (some created in late 2012)
~ 400,000 lines of code
768,115,361 metadata queries so far in 2013
348,386,170 matched
207,109,812 forward link queries
3,578,469 new CY DOIs
2,320,151 new BY DOIs
17,735,351 updated DOIs
1,084,529,650 RAW DOI ‘clicks’ (Dec 2012 through Oct 2013)
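Two ratios follow from the figures above; a short calculation makes them explicit. It assumes the "matched" count is a subset of the metadata queries listed directly before it, which the slide layout suggests but does not state. Variable names are illustrative.

```python
# Figures copied from the slide.
metadata_queries = 768_115_361
matched = 348_386_170  # assumption: counted out of the metadata queries above
tickets_created = 504
tickets_open = 154

# Share of metadata queries that returned a match.
match_rate = matched / metadata_queries

# Share of the tickets created in 2013 that are still open.
open_share = tickets_open / tickets_created

print(f"match rate: {match_rate:.1%}")
print(f"tickets still open: {open_share:.1%}")
```

Under that assumption, the match rate works out to roughly 45% and the share of open tickets to roughly 31%.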
11. 2014
Re-design conflict processing
Current process requires too much labor following up and fixing
Conflicts should only be created inter-member
A given publisher will be allowed to create multiple DOIs; the system will
clean up
Title locks should prevent nearly all journal-to-journal conflicts
Automatically clean up the existing backlog
Consider alternatives to OAI-PMH for bulk data distribution
Accept full JATS file and/or PDF for deposits.
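The conflict redesign on this slide (conflicts only between members, automatic cleanup within one member) can be sketched as a simple classification step. This is a rough illustration, not the planned implementation: the tuple layout, the item_key used to recognize the same item, and all identifiers below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical deposit record: (item_key, member_id, doi).
Deposit = Tuple[str, str, str]


def classify(deposits: List[Deposit]) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Split colliding DOIs into inter-member conflicts and same-member cleanup."""
    by_item: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for item_key, member_id, doi in deposits:
        by_item[item_key][member_id].append(doi)

    conflicts: Dict[str, List[str]] = {}
    cleanup: Dict[str, List[str]] = {}
    for item_key, by_member in by_item.items():
        all_dois = [doi for dois in by_member.values() for doi in dois]
        if len(by_member) > 1:
            # DOIs from more than one member: a real conflict to follow up.
            conflicts[item_key] = sorted(all_dois)
        elif len(all_dois) > 1:
            # One member, several DOIs: left for the system to clean up.
            cleanup[item_key] = sorted(all_dois)
    return conflicts, cleanup


deposits = [
    ("article-1", "member-a", "10.5555/a1"),
    ("article-1", "member-a", "10.5555/a2"),
    ("article-2", "member-a", "10.5555/b1"),
    ("article-2", "member-b", "10.5555/b2"),
    ("article-3", "member-c", "10.5555/c1"),
]
conflicts, cleanup = classify(deposits)
print(conflicts)
print(cleanup)
```

Only the inter-member case would need manual follow-up in this scheme, which is the labor reduction the slide is aiming for.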