Innovation and the STM publisher of the future (SSP IN Conference 2011)
1. Innovation and the STM publisher of the future
Bradley P. Allen, Elsevier Labs
Innovation Session, SSP IN Conference 2011
Arlington, VA, USA
2011-09-19
2. Peak physical media
• “Music Sales”, New York Times, 1 August 2009.
http://www.nytimes.com/imagepages/2009/08/01/opinion/01blow.ready.html
• “Initial Circs per student”, William Denton, 31 January 2011.
http://www.miskatonic.org/2011/01/31/initial-circs-student
• “Rise of e-book Readers to Result in Decline of Book Publishing Business”, Steven Mather, iSuppli, 28 April 2011.
http://www.isuppli.com/Home-and-Consumer-Electronics/News/Pages/Rise-of-e-book-Readers-to-Result-in-Decline-of-Book-Publishing-Business.aspx
3. A simple model of the evolution of publishing
Print era: 1600s – 1980
• Packaged as books and articles
• Physically distributed
• Access and discovery through libraries
Digital Library era: 1980 – 2010s
• Packaged as books and articles
• Digitally distributed
• Access and discovery through search engines
Platform-as-a-Service era: 2010s
• Packaged as apps and APIs
• Digitally distributed
• Access and discovery through social networks
4. Facets of STM publishing in the PaaS era
• Process Type: Acquisition; Extract, Transform and Load; Enhancement; Indexing; Composition; Discovery and Access; Delivery
• Entity: Author; Supplier; Web site; Typesetter; Automated process; Subject matter expert; Search engine; Content repository; Entity registry; Product catalog; Editor; Reviewer; User; Designer; Developer; E-book; Mobile app; Mobile-enhanced Web site; API
• Activity: Submitting; Crawling; Syndicating; Formatting; Mapping; Cleansing; Indexing; Querying; Updating; Storing; Annotating; Subject tagging; Classification; Entity recognition; Entity extraction; Fact extraction; Clustering; Aggregating; Ordering; Summarizing; Filtering; Analysis; Data science; Rendering; Design; Publishing; Accessing; Retrieving; Deleting
• Content Type: Article; Book; Media object; Entity record; Asset metadata; Relational metadata; Provenance metadata; Usage metadata; Taxonomy; Ontology; User-generated content
5. STM publishing as business intelligence
Surajit Chaudhuri, Umeshwar Dayal, and Vivek Narasayya. 2011. An overview of business intelligence technology. Commun.
ACM 54, 8 (August 2011), 88-98. http://doi.acm.org/10.1145/1978542.1978562
6. Some scenarios to compare the two digital eras
• Scenario: A new medical term relevant to an emerging healthcare issue (e.g. a new type of avian flu virus) needs to be incorporated into a search index immediately
– Digital Library era: Organizational governance issues about how taxonomies are updated, coupled with manually intensive workflows and ad-hoc approaches to content tagging, inhibit rapid response
– Platform-as-a-Service era: A single, automated and standardized taxonomy management and content enhancement workflow allows rapid and timely update of search applications
• Scenario: Application developers want to mash up epidemiological data with medical journal articles to create a topic-specific Web resource
– Digital Library era: Data silos without easy means of programmatic access by developers, coupled with governance and business model questions, inhibit data reuse
– Platform-as-a-Service era: Content API and single-point-of-access repository allow data and content to be accessed, discovered and reused across multiple applications
• Scenario: Digital library developers want to stage content into a single repository for unified search index generation
– Digital Library era: Duplication of core content leads to synchronization and quality control issues
– Platform-as-a-Service era: Consolidation of duplicate repositories into a single point of truth across all content, accessible and discoverable through a Content API, eliminates the need for duplication and synchronization
• Scenario: Third-party solutions providers want to integrate content (e.g. tagged medical journal articles, medical taxonomies) into point-of-care solutions
– Digital Library era: No standards, no APIs for point-of-care content integration across all content and data
– Platform-as-a-Service era: Standards and APIs that scale across multiple partners, for all content types, for all delivery formats
• Scenario: Publishers want to deliver their content to tablets and e-readers in delivery formats that take advantage of the displays and interaction modalities on those devices
– Digital Library era: No clear standard or approach for targeting emerging eReader and tablet devices; multiple and divergent approaches lead to siloed solutions and duplication of effort
– Platform-as-a-Service era: Web and industry standards for eReader and tablet devices supported as part of standard automated processing into delivery channel-specific formats, regularly updated and exposed through a Content API
• Scenario: Journal publisher wants to integrate content enhancements across multiple subject matter areas to add value to products leveraging Article of the Future technology
– Digital Library era: No single point of access to content enhancements, no standards for content enhancement suppliers and partners to deliver enhancements for integration
– Platform-as-a-Service era: Easy access to multiple opportunities for content enhancements embedded in standard next-generation article formats and provided using standard content enhancement formats
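The first scenario above (a new term must reach the search index immediately) can be sketched as a single automated workflow. This is only an illustration under stated assumptions: the `Taxonomy` class, the naive substring tagger, the `reindex` function and the two sample documents are all hypothetical and do not describe any publisher's actual system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Taxonomy:
    """Hypothetical single source of truth: term -> lower-cased labels."""
    terms: dict = field(default_factory=dict)

    def add_term(self, term: str, synonyms: Optional[set] = None) -> None:
        labels = {term} | (synonyms or set())
        self.terms[term] = {label.lower() for label in labels}


def tag_document(text: str, taxonomy: Taxonomy) -> set:
    """Naive tagger: a term applies if any of its labels occurs in the text."""
    lowered = text.lower()
    return {term for term, labels in taxonomy.terms.items()
            if any(label in lowered for label in labels)}


def reindex(docs: dict, taxonomy: Taxonomy) -> dict:
    """Rebuild a term -> document-ids index from the current taxonomy."""
    index = {term: set() for term in taxonomy.terms}
    for doc_id, text in docs.items():
        for term in tag_document(text, taxonomy):
            index[term].add(doc_id)
    return index


# Made-up sample content, for illustration only.
docs = {
    "a1": "Cases of H5N1 reported in poultry workers",
    "a2": "Seasonal influenza vaccine uptake",
}
taxonomy = Taxonomy()
taxonomy.add_term("Influenza", {"flu"})
index_before = reindex(docs, taxonomy)

# A new term arrives: one taxonomy update plus one automated re-run.
taxonomy.add_term("Avian influenza (H5N1)", {"H5N1"})
index_after = reindex(docs, taxonomy)
```

The point of the sketch is that the taxonomy update and the re-tagging are one code path, so no manual tagging step sits between a new term and an updated index.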
7. Goals for the publisher of the future
• Craft content acquisition, production and management
systems that support with equal capability and flexibility a
broad range of content types and delivery channels
• Make it easy for authors, editors and reviewers to work
with bundles of content and data in the aggregate
• Make it easy to discover and access, across all content
assets, information in fragments smaller than the unit of
publication
• Then make it easy to aggregate and compose these
fragments into new products and services
• Leverage the tremendous power of Web architectural
standards and formats to increase the ease of content
integration and interoperability
8. New requirements for content management
• Broad range of content types
– Must treat as first-class objects video, audio, images, datasets, metadata and knowledge organization systems in addition to articles and books
• Standards-based
– Web-standard formats to support ease of integration and interoperability
• Fine-grained
– Must be decomposable into and addressable in fragments smaller than the unit of publication; e.g., down to the level of specific words, phrases, images, table cells in articles or book chapters, key frames and segments in videos
• Discoverable
– Must be easily located across all levels of granularity
• Accessible
– Must be easily accessed through content creation, retrieval, update and deletion (CRUD) services
• Flexible
– New content types and associated schemas must be easily added through configuration
• Reusable
– It must be efficient for product developers to aggregate and compose content fragments into new products
• Modifiable
– Support the enhancement and correction of content at any time following creation
• Broad range of delivery formats
– Content standards and services must support fulfillment, delivery and presentation across desktop, notebook, tablet and mobile computing devices
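A minimal sketch of how the fine-grained, accessible and reusable requirements above might fit together, assuming a toy in-memory store. The `ContentItem` and `ContentStore` names, the `item-id#fragment-id` address syntax and the sample fragments are all hypothetical choices made for illustration, not a described product.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A unit of publication split into addressable fragments (assumed model)."""
    content_type: str                             # e.g. "article", "video"
    fragments: dict = field(default_factory=dict)  # fragment id -> content


class ContentStore:
    """Hypothetical in-memory store exposing CRUD plus fragment composition."""

    def __init__(self) -> None:
        self._items = {}

    @staticmethod
    def _split(address: str) -> tuple:
        """Parse 'item-id#fragment-id' (an assumed address syntax)."""
        item_id, sep, fragment_id = address.partition("#")
        if not sep or not fragment_id:
            raise ValueError(f"not a fragment address: {address!r}")
        return item_id, fragment_id

    def create(self, item_id: str, item: ContentItem) -> None:
        if item_id in self._items:
            raise KeyError(f"{item_id} already exists")
        self._items[item_id] = item

    def retrieve(self, item_id: str) -> ContentItem:
        return self._items[item_id]

    def retrieve_fragment(self, address: str) -> str:
        item_id, fragment_id = self._split(address)
        return self._items[item_id].fragments[fragment_id]

    def update_fragment(self, address: str, content: str) -> None:
        item_id, fragment_id = self._split(address)
        self._items[item_id].fragments[fragment_id] = content

    def delete(self, item_id: str) -> None:
        del self._items[item_id]

    def compose(self, addresses: list) -> list:
        """Aggregate fragments from any items into a new ordered product."""
        return [self.retrieve_fragment(a) for a in addresses]


store = ContentStore()
store.create("art-1", ContentItem("article", {"table1-r2c3": "42 mg"}))
store.create("vid-1", ContentItem("video", {"keyframe-0012": "Frame showing the assay"}))
product = store.compose(["art-1#table1-r2c3", "vid-1#keyframe-0012"])
```

Here a table cell and a video key frame are treated the same way, which is the sense in which non-article content types become first-class objects.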
9. Leveraging Web standards for sharing
1. Use URIs to name things
2. Use HTTP URIs so they can
be looked up
3. Return useful data when
things are looked up
4. Include links to other things
in the returned data
“Linked data is just a term for how to publish
data on the web while working with the
web. And the web is the best
architecture we know for publishing
information in a hugely diverse and
distributed environment, in a gradual
and sustainable way.”
Tennison J, 2010. Why Linked Data for data.gov.uk?
http://www.jenitennison.com/blog/node/140
Shotton D, Portwin K, Klyne G, Miles A, 2009. Adventures in Semantic Publishing:
Exemplar Semantic Enhancements of a Research Article. PLoS Comput Biol 5(4):
e1000361. doi:10.1371/journal.pcbi.1000361
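The four linked data rules on this slide can be sketched without any network access by simulating the Web as a dictionary keyed by HTTP URIs. Everything here is an assumption made for illustration: the `example.org` URIs, the record fields, and the `lookup` and `follow_links` helpers. A real deployment would serve these descriptions over HTTP in a standard format.

```python
from collections import deque

# Rules 1 and 2: every thing is named with an HTTP URI.
# example.org is a placeholder domain; the records are made up.
WEB = {
    "http://example.org/article/1": {
        "type": "Article",
        "title": "A sample article",
        "links": {
            "author": "http://example.org/person/7",
            "cites": "http://example.org/article/2",
        },
    },
    "http://example.org/article/2": {
        "type": "Article",
        "title": "An earlier article",
        "links": {"author": "http://example.org/person/7"},
    },
    "http://example.org/person/7": {
        "type": "Person",
        "name": "A. Researcher",
        "links": {},
    },
}


def lookup(uri: str) -> dict:
    """Rule 3: stand-in for an HTTP GET that returns useful data."""
    return WEB[uri]


def follow_links(start: str, max_hops: int) -> list:
    """Rule 4: discover related things by following links breadth-first."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        uri, hops = queue.popleft()
        visited.append(uri)
        if hops == max_hops:
            continue
        for target in lookup(uri)["links"].values():
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return visited


neighbours = follow_links("http://example.org/article/1", max_hops=1)
```

Because each returned record carries links to further URIs, a client needs no prior knowledge of the data beyond a single starting URI.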
10. From books and articles to evolving research objects
(Diagram: Article, Media object and Entity record nodes connected to one another by relational metadata as linked data, passing through three stages: Acquire; Transform, Enhance, Compose; Deliver.)
11. Leveraging consumer Web innovations
• Emergent technologies driven by consumer Web applications
emphasize design choices that focus on delivering cheap, robust
and scalable Web applications
– Schemaless document stores provide read/write at Web scale with
support for analytics
• For more dynamic, fine-grained content and linked data
• For easier usage and citation analysis, bibliometrics and scientometrics
– Web application development frameworks that leverage HTML5/CSS/JS
to deliver across desktops, notebooks, tablets and smartphones
– Deploying in the cloud and moving scale-out from development to
operations to reduce time-to-market, cost of failure for emerging, niche
publishing opportunities
• As we shift to the Platform-as-a-Service era, these features
become an important part of the STM publishing technology stack
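To make the schemaless document store idea above concrete, here is a toy sketch in which articles and usage events with different fields live side by side and can be counted for simple usage analysis. The `DocumentStore` class and the sample documents are hypothetical; production document stores offer persistence, indexing and distributed queries that this sketch does not attempt.

```python
from collections import Counter


class DocumentStore:
    """Toy schemaless store: each document is a free-form dict."""

    def __init__(self) -> None:
        self._docs = {}

    def put(self, doc_id: str, doc: dict) -> None:
        # No schema is enforced; any set of fields is accepted.
        self._docs[doc_id] = dict(doc)

    def get(self, doc_id: str) -> dict:
        return self._docs[doc_id]

    def find(self, **criteria) -> list:
        """Return ids of documents whose fields match all criteria."""
        return [doc_id for doc_id, doc in self._docs.items()
                if all(doc.get(k) == v for k, v in criteria.items())]

    def count_by(self, key: str) -> Counter:
        """Simple analytics: tally values of one field across documents."""
        return Counter(doc[key] for doc in self._docs.values() if key in doc)


store = DocumentStore()
store.put("art-1", {"kind": "article", "title": "A sample article"})
store.put("ev-1", {"kind": "download", "article": "art-1"})
store.put("ev-2", {"kind": "citation", "article": "art-1", "citing": "art-2"})
store.put("ev-3", {"kind": "download", "article": "art-2"})

downloads = store.find(kind="download")
events_per_article = store.count_by("article")
```

New kinds of events or content can be added without a schema migration, which is the property the slide associates with more dynamic, fine-grained content.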
15. The publisher of the future as lean startup
• This stuff is not just for big publishers
• These are the tools that new consumer
Internet businesses are using to create new
products and services today… quickly and on
the cheap
• Smaller publishers and societies can use lean
startup techniques to drive app and API
design and development starting from
existing web presences and third-party APIs
19. Challenges for the publisher of the future
• When content can be mashed up at a fine level
of granularity using multiple third-party APIs,
what are the rights associated with the resulting
product? What are the appropriate business
models?
• What standards should there be for research
objects?
• Who gets credit for research objects? How is
impact determined and reputation managed?
• What is an acceptable trade-off between content
flexibility and high-touch presentation design?
20. In summary
• STM publishing is only beginning the transition
from print to online
• Articles and books are no longer sufficient
containers for scholarly communication
• Tools to effect this change come from the
consumer Internet and the business intelligence
worlds
• Publishers of the future will leverage the best
practices emerging around these tools to create
innovative new products to serve their
communities