From Finding to Discovering
1. From Finding to Discovering
EuropeanaTech, Paris, February 12, 2015
Jaap Kamps
How to Empower Visitors and Curators?
2. Motivation
• Current CH access tools are framed by web search tech
• privilege ultra-fast look-up search for facts, navigational needs
• mismatch in use-case!
• Need new/different tools supporting users
• to explore and experience,
• to learn and be amazed, and
• engage with the material from many different angles
• Although still fragmented, these tools are emerging
• groundwork is done,
• work on realizing the full potential of cultural heritage online
6. Searching for CH
Web Search | CH Search
Find information | Discover and learn
Clear information need | Non goal directed search
Search outcome | Search process
Navigational | Exploratory
Fast search: 1 click | Slow search: engagement
… | …
10. Empower
 | Curator (back office) | Visitor (front office)
Take control | Control over content++ | Control over search trajectory
Beyond metadata | From access to narratives | From finding to discovering
Beyond objects | Tools to build digital narratives | Interaction builds own narratives
Beyond one perspective | Multi-faceted/layered information | Explore layer by layer
Reunite | Digital content into gallery | Traditional + non-experts views
20. Wrap Up
• From Finding to Discovering
• Different use case → Different systems
• From “access” to enriched content
• Power to enrich/store/organize
• Empower users and curators
• Put visitors/curators in the driver’s seat
• Make other people’s trails explicit
21. Questions?
• Thank you to all collaborators:
• EU FP7: meSch project
• NWO CI: ExPoSe project
• Digging into Data: DiliPad project
• NWO CATCH:WebART project
• CLEF: Social Book Search Lab
• TREC: Contextual Suggestion Track
Editor's notes
Thanks for inviting me to talk about the important topic of discovery. Rather than report on what we did in a single project, I would like to abstract over our work on several projects: not all on Europeana data, but on data that could have, or perhaps should have, been part of Europeana.
I was asked to limit my talk to 10 minutes, and decided to structure it along four slogans.
Web Search != Cultural Heritage Search
The use case of searching and exploring cultural heritage information online is fundamentally different from the use case of modern web search.
This is a nonsensical query for Google: it will give no useful results.
No bashing of Google — I work with them on many projects, and in fact the search industry is extremely interested in the novel ways of searching and exploring information…
There are almost infinite differences — I could talk about these for hours — and there is a complete mismatch between the use cases and underlying assumptions.
These differences require a totally different type of system, and deeply impact and inform the design of such a system.
These are examples we worked on to support the search process rather than the search outcome. The system adapts to the needs in different stages of search: pre-focus supports exploratory search and the discovery of a clearer search focus; focus supports collecting material on the chosen topic, raw material as a first pass; post-focus supports management of the collected information and further analysis and synthesis of this information.
WebART project/CLEF Social Book Search
This is how I often feel, both in CH search systems and even on the web: there is clearly something wrong with the results, yet there is no transparency or control that lets me change them. Search experts like me can figure out how to adapt our searches to still find the desired results, but this requires us to adapt to the algorithm, rather than the algorithm adapting to us.
This is a question of who is in control: we must put the user back into the driver’s seat. This holds for both the back-office/curator and for the front-office/visitor-user.
Tools for curators to select and enrich content — to convey narratives rather than lists of object metadata.
meSch project.
Enriching data is key. This is an example of scanned archival records, specifically the parliamentary proceedings (Hansards), in which we explicitly annotated the debate structure and linked every speaker to his or her biographical information.
The results are individual speeches with rich context on the speaker and the debate. The second result is from the 2007 debate that caused the discussion on language use in parliament. We can delve deeper than ever before (find a unique speech of an MP in the last 200 years) and also look more generally (showing aggregated results over the whole result set, consisting of massive numbers of speeches).
And we can go beyond this, by allowing users to build/change their own search engine on the fly — decide what data to search, how it is processed, and what level (speech, topic, meeting day) — tailored to their needs.
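As a rough illustration of choosing the level at search time, the sketch below groups matching speeches by speech, topic, or meeting day. The records, field names, and the `search` function are hypothetical and made up for this example; this is not the project's actual system, and the matching is plain whole-word lookup.

```python
from collections import defaultdict

# Hypothetical records; ids, days, topics, and texts are invented for illustration.
speeches = [
    {"id": "s1", "day": "day-1", "topic": "language use",
     "text": "the language used in this house"},
    {"id": "s2", "day": "day-1", "topic": "budget",
     "text": "the budget for education"},
    {"id": "s3", "day": "day-2", "topic": "language use",
     "text": "language and respect"},
]

def search(records, term, level="speech"):
    """Count speeches containing `term`, grouped at the chosen level."""
    # Map the user-facing level name to the (assumed) record field.
    key = {"speech": "id", "topic": "topic", "day": "day"}[level]
    hits = defaultdict(int)
    for rec in records:
        if term.lower() in rec["text"].lower().split():
            hits[rec[key]] += 1
    return dict(hits)

print(search(speeches, "language", level="speech"))
print(search(speeches, "language", level="topic"))
print(search(speeches, "language", level="day"))
```

The same query yields individual speeches, one aggregated topic, or per-day counts, depending only on the `level` argument.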
This is complex — not possible (or very expensive) for 1,000s of parallel users, but certainly feasible for privileged users such as researchers. And we can scale this, that’s just an engineering challenge.
Definitions of heritage: the extensionalist view is about the objective meaning of objects (in terms of metadata features). The modern view is constructivist: the meaning is what it means to a user, a group of users, or society at large. Modern definitions of meaning are “in the eye of the beholder”, so there are many meanings, and all are important.
One key consequence is that we need to bring back the social aspects into the digital realm. E.g., museum visits are a social event, people go with friends or family, and talk to each other about art.
There are easy ways to bring social aspects back. We worked on using Europeana logs for collaborative recommendations: recommend objects based on profiles, earlier viewed objects, and/or the current object looked at.
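One simple way to recommend from the current object is to count which objects were viewed together in the same session. The sketch below assumes a log already reduced to lists of object identifiers per session; the identifiers and function names are hypothetical, and co-occurrence counting is a stand-in technique, not necessarily the algorithm we used on the Europeana logs.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(sessions):
    """Count how often two objects were viewed in the same session."""
    co = defaultdict(Counter)
    for viewed in sessions:
        for a, b in combinations(sorted(set(viewed)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend(co, current, already_seen=(), k=3):
    """Rank objects co-viewed with `current`, skipping ones already seen."""
    seen = set(already_seen) | {current}
    # Sort by descending count, then by identifier for a stable order.
    ranked = sorted(co[current].items(), key=lambda kv: (-kv[1], kv[0]))
    return [obj for obj, _ in ranked if obj not in seen][:k]

# Hypothetical log: each list is one visitor session of object identifiers.
sessions = [
    ["vermeer-01", "rembrandt-02", "hals-03"],
    ["vermeer-01", "rembrandt-02"],
    ["vermeer-01", "mondriaan-04"],
]
co = build_cooccurrence(sessions)
print(recommend(co, "vermeer-01"))
# → ['rembrandt-02', 'hals-03', 'mondriaan-04']
```

Passing earlier viewed objects as `already_seen` keeps the list from repeating what the visitor has already looked at.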
Any collaborative recommendation algorithm also already recommends similar trails of other people, and we can explicitly show these paths of previous like-minded people. This is in fact the main idea of Vannevar Bush’s Memex: rather than through hierarchical controlled systems, we discover useful information by following the footsteps of earlier users.
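Making trails explicit can be as simple as showing what earlier visitors looked at after the current object. The following is a minimal sketch under the assumption that each session is an ordered list of object identifiers; the data and the `trails_from` helper are hypothetical.

```python
def trails_from(sessions, current, max_len=3):
    """Return what earlier visitors viewed right after `current`, in order."""
    trails = []
    for viewed in sessions:
        if current in viewed:
            # Take the steps following the first visit to `current`.
            start = viewed.index(current) + 1
            tail = viewed[start:start + max_len]
            if tail:
                trails.append(tail)
    return trails

# Hypothetical ordered sessions of object identifiers.
sessions = [
    ["vermeer-01", "rembrandt-02", "hals-03"],
    ["hals-03", "vermeer-01", "mondriaan-04"],
    ["rembrandt-02", "vermeer-01"],
]
for trail in trails_from(sessions, "vermeer-01"):
    print(" -> ".join(trail))
```

Sessions that end at the current object contribute no trail, so only actual onward paths are shown.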
meSch project.
We did all the groundwork, and I am very optimistic that we can dramatically increase the value of our tools in the coming decade.