Overview of the Sheffield IR Group and Cultural Heritage Research
Using Pathways for Navigating and Personalised Access to Cultural Heritage Materials
Paul Clough
The Information School
University of Sheffield
Presentation at Glasgow University, 14th March 2011

Overview
• Introduction to Sheffield
• Information access in cultural heritage
– The EU-funded PATHS project
– The use of pathways for navigation and personalised access
• A brief advert
– TREC Session Track

The Information School
• Formerly known as the Department of Information Studies
– Formed in 1963 (PG School of Librarianship)
– Now in the faculty of social science
– http://www.shef.ac.uk/is/
• Leading researchers past and present include
– Tom Wilson
– Micheline Beaulieu
– Steve Whittaker
– Mark Sanderson
– Peter Willett
– Nigel Ford
• Now part of the emerging iSchool movement

Sheffield IR group
• Currently 2 academics, but more coming …
– Dr. Robert Villa (currently Glasgow University)
– Prof. Elaine Toms (currently Dalhousie University)
• Four RAs
– Paula Goodale and Mark Hall (PATHS)
– Evangelos Kanoulas (EFireEval)
– Monica Paramita (ACCURAT)
• Currently 6 PhD students
– Mixture of library and information science and computer science backgrounds

My areas of research
http://ir.shef.ac.uk/cloughie/
• Text re-use and plagiarism detection
• Multilingual information access
• Geographical Information Retrieval (GIR)
• Multimedia retrieval (images)
• Evaluation of IR systems
• User interfaces and interaction
• Construction of corpora and evaluation resources

Recent and current research projects
• Mining imprecise regions from the Web (EPSRC and Ordnance Survey)
• Improving Information Finding at the UK National Archives (TNA)
• User-Centered Design of a Recommender System for a 'Universal' Library Catalogue (AHRC and OCLC Inc.)
• Analysis and evaluation of Comparable Corpora for Under Resourced Areas of machine Translation (EU FP7)
• Personalised Access to Cultural Heritage Spaces (EU FP7)

Providing Personalised Access to Cultural Heritage Spaces
http://www.paths-project.eu/
Clough, P., Stevenson, M. & Ford, N. (2011) Personalising Access to Cultural Heritage Collections using Pathways, In Proceedings of Workshop on Personalised Access To Cultural Heritage (PATCH '11), IUI 2011.

Information access in Cultural Heritage
• Significant amounts of CH material available online
– Web portals, digital libraries, aggregated portals (e.g. Europeana), Wikipedia, …
• Users may find it difficult to navigate and interpret wealth of information
– keyword-based access provides limited success
– many users are not domain or subject experts
– limited support for knowledge exploration and discovery
• Contrast with traditional mechanisms (e.g. museums)
• Cultural institutions looking at new ways of providing rich user experiences to support lifelong learning
– user participation (e.g. web2.0), personalisation, …

Personalisation in Cultural Heritage
• Over 20 years of research in using personalisation to improve the user experience in cultural heritage
– adapt the suggestion and presentation of information (adaptive hypermedia systems) for physical and virtual worlds
– well-suited application domain for personalisation
• Involves modelling users, groups and communities to provide appropriate content
– provides personalised learning experiences
– Recent emphasis on semantic and social web
– development of collection-specific ontologies (e.g. CHIP)
– user-generated content seen as a useful form of metadata
– Derived from: Ardissono et al. (forthcoming 2011)

Typical user groups in cultural heritage
(Derived from Europeana user studies, http://www.europeana.org)
• General user
– e.g. cultural tourist
• School child
• Academic user
– students
– teachers
• Expert researcher
– e.g. museum curators
• Professional user
– e.g. librarian, archivist, etc.

Uses of cultural heritage websites
(Derived from Europeana user studies, http://www.europeana.org)
• For entertainment/leisure
– general interest, browsing
• To gain new knowledge
– specific interest, targeted
• To locate interesting items
– purposive, pre-visit
• To develop communities of interest
– sharing – opinions, knowledge, personal artefacts
– social platform
– educational purposes including making sense of items in a collection

Information seeking behaviours
• Information tasks by professionals (Amin et al., 2008)
– information gathering (63.0%): topic search, comparison, combination, exploration, relationship search
– information exchange (13.0%)
– fact-finding (10.2%)
– keeping-up-to-date (8.3%)
– information maintenance (5.6%)
• Information seeking by non-experts (Skov & Ingwersen, 2008)
– focus on virtual museum visitors
– broad coverage of needs/characteristics: highly visual experience, meaning making, known-item searching, exploratory behaviour

Related projects
• Steve: the museum social tagging project
– http://www.steve.museum/
• The SmartMuseum project
– http://www.smartmuseum.eu/
• Personal Experience with Active Cultural Heritage
– http://peach.fbk.eu/home.html
• Cultural Heritage Information Personalisation
– http://www.chip-project.org/
• Personalisation of the Digital Library Experience
– http://comminfo.rutgers.edu/imls/poodle/
• The Ensemble (Walden’s Paths) project
– http://ensemble.tamu.edu/walden_info

Personalised Access To Cultural Heritage Spaces (PATHS)
• STREP funded under the FP7 programme
• Duration of 36 months
– 1st January 2011 to 31st December 2013
• Budget
– 3,199,299 euros in total
– 2,300,000 euros EU grant
• 6 partners in 5 countries
• Project management
– 334 person months
– 8 work packages
– 22 deliverables

The consortium
• Two universities
– Sheffield University
– Universidad del Pais Vasco
• Two technology enterprises
– i-sieve technologies Ltd
– Asplan Viak Internet Ltd
• Two cultural heritage enterprises
– MDR Partners
– Alinari 24 Ore Spa
• Additional content provider
– Europeana

Additional user groups
• Wiltshire Heritage Museum
• Imperial War Museum
• The UK National Archives
• Archaeology Data Service
• Biblioteca Nazionale Centrale di Firenze
• Biblioteca Virtual Cervantes
• Biblioteca Nacional de España

The project vision
• Provide functionality to support user’s knowledge discovery and exploration
• The use of pathways/trails to help users navigate and explore the information space
• The use of personalisation (e.g. recommender systems) to adapt views/paths to specific users, groups or communities of users
• Show links to other items within and external to an item to help users contextualise and interpret the item

Project objectives
• Analysis of users’ requirements for discovering knowledge and construction of pathways/trails
• Automated organisation and enrichment of Cultural Heritage content for use within a navigation system
• Implementation of a system for navigating Cultural Heritage resources applied to multiple data sources
• Techniques for providing personalised access to Cultural Heritage content (e.g. recommender systems)
• Versions for use on mobile devices and Facebook
• Evaluation with user groups and field trials

Research areas
• Information Access
– user-driven navigation through collections of information
– knowledge of users’ requirements for access to cultural heritage collections
– modeling of user preferences and navigational context
• Educational Informatics
– adapting to individual learners in relation to being directed and being allowed the freedom to explore autonomously
• Content Interpretation and Enrichment
– representation and sharing of information about items in Digital Libraries
– identifying background information related to the items in cultural heritage collections (e.g. links to Wikipedia pages)

Pathways for navigation and personalisation
• Navigation through the information space is based around the metaphor of “paths”
– flexible model of navigation and exploration onto which various levels of personalisation can be added
• Paths can provide the following information (which can be adapted and mapped to an individual’s learning styles)
– a history of where the user has been
– suggestions of where the user might go next
– a narrative/story through a set of items
• Items in a path can be ordered (can be done manually or automatically)
– chronologically
– thematically
– ...

Paths/trails have been studied in many fields
• Trails (Memex, 1945)
– Associative trails explicitly created by users forming links between stored materials to help others navigate
• Destinations (search engines and web analytics)
– Origin/landing page (from query), intermediate pages and destination page
• Search strategies (information seeking)
– Users moving between information sources, perhaps due to changes in their information needs (e.g. Berrypicking)
• Guided tours (hypertext)
– authors create sequence of pages useful to others (manual)
– automatically generated trails to assist with web navigation
– used in educational informatics and cultural heritage

Existing paths/trails in cultural heritage
• The Ensemble (Walden’s Paths) project
– http://www.csdl.tamu.edu/walden/
– allow educators to arrange web pages into a series of sequential paths on specific topics
– educators can add comments at each node
– highly prescriptive and users cannot deviate from paths
• Thematic trails – Louvre
– http://www.louvre.fr/llv/activite/liste_parcours.jsp?bmLocale=en
– selection of works that typify a period, artistic movement or theme (routes provide narrative when viewing physical objects)
– trails can be viewed online or printed prior to visit to museum
– prescriptive with limited interactivity and personalisation

More examples of trails …
http://www.ilikemuseums.com/Page/MuseumTrails.aspx
http://www.vam.ac.uk/activ_events/adult_resources/trails_adults/index.html
http://www.nmm.ac.uk/visit/floor-plans-and-trails/trails/
http://www.vam.ac.uk/activ_events/adult_resources/memory_maps/trails/index.html
http://www.britishmuseum.org/visiting/floor_plans_and_galleries/ground_floor. spx
http://vna.nmolp.org/creativespaces/?page=help
http://www.sciencemuseum.org.uk/broughttolife.aspx

Our view of pathways
• A path is a ‘route’ through an information space
– defined as collections of cultural heritage resources
– consists of nodes and links to connect nodes (graph)
• Nodes can be connected in different ways
– pre-computed based on similarity between items
– computed on-the-fly (automated) and personalised
– defined by system/designers (guided paths)
– defined by users (individual or collectively)
• Exist as information objects in their own right
– can be indexed, organised and shared with others, and will be potential learning objects that can be offered to users alongside the cultural heritage content
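
The node-and-link view of a path, and the idea that a path is an information object that can be indexed and shared, can be sketched as a small data structure. This is an illustration only: the names `Path`, `Link`, `add_link` and `neighbours` are hypothetical, and the slides do not say how PATHS actually stores paths.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """A directed, labelled connection between two nodes (hypothetical type)."""
    source: str
    target: str
    label: str = ""


@dataclass
class Path:
    """A path as an information object: nodes, links and shareable metadata."""
    title: str
    creator: str
    nodes: list = field(default_factory=list)  # item identifiers, in the order added
    links: list = field(default_factory=list)  # Link objects connecting the nodes
    tags: set = field(default_factory=set)     # descriptive metadata, e.g. for indexing

    def add_node(self, item_id):
        # Keep each item once so the node list stays a set of distinct resources.
        if item_id not in self.nodes:
            self.nodes.append(item_id)

    def add_link(self, source, target, label=""):
        # Adding a link also registers both endpoints as nodes of the path.
        self.add_node(source)
        self.add_node(target)
        self.links.append(Link(source, target, label))

    def neighbours(self, item_id):
        """Return the nodes reachable in one step from item_id."""
        return [link.target for link in self.links if link.source == item_id]


# Made-up item identifiers and link labels, for illustration only.
path = Path(title="Sheffield steel industry", creator="a_user")
path.add_link("item:1", "item:2", label="same period")
path.add_link("item:1", "item:3", label="same maker")
path.tags.add("steel")
```

Whether links are pre-computed from item similarity, generated on the fly or entered by users only changes who calls `add_link`; the structure itself stays the same.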

Possible paths
[Figure: possible routes from a Start item to a Destination item, by way of subject knowledge (e.g. taxonomy), search, recommendations and knowledge discovery; other labels in the diagram: "Visited", "Sheffield", "e.g. WW II"]

Representing the data (and paths)
[Figure: RDF triples (Sunflowers, Painted by, Van Gogh; Van Gogh, Born in, Netherlands), multi-layer networks and external sources (Linked Data)]
• Relationships (e.g. “Born In”) can be used as:
– Facets for exploration
– Labels for path transitions
– ...
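
The two uses of relationships named on the slide (facets for exploration and labels for path transitions) can be illustrated with the slide's own example triples. The sketch below uses plain Python tuples rather than an RDF library, and the helper names `facets` and `labelled_path` are hypothetical.

```python
from collections import defaultdict

# (subject, predicate, object) triples taken from the slide's example.
triples = [
    ("Sunflowers", "painted by", "Van Gogh"),
    ("Van Gogh", "born in", "Netherlands"),
]


def facets(triples):
    """Group objects by predicate so each relationship can act as a facet."""
    grouped = defaultdict(set)
    for _subject, predicate, obj in triples:
        grouped[predicate].add(obj)
    return dict(grouped)


def labelled_path(triples, start):
    """Follow triples from `start`, returning (label, next_node) transitions.

    Assumes at most one outgoing triple per subject, which is enough for
    this toy example but not for real collections.
    """
    outgoing = {s: (p, o) for s, p, o in triples}
    steps, node, seen = [], start, {start}
    while node in outgoing:
        label, next_node = outgoing[node]
        if next_node in seen:  # stop rather than loop forever on a cycle
            break
        steps.append((label, next_node))
        seen.add(next_node)
        node = next_node
    return steps


facet_values = facets(triples)
route = labelled_path(triples, "Sunflowers")
```

Here `route` reads as a narrative: Sunflowers, painted by Van Gogh, born in Netherlands.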

Independent paths
• Users can construct their own “independent paths”
– can be saved for future reference, edited or shared with others
– e.g. “Sheffield steel industry”, “my favourite works by Rembrandt” or “items seen during my trip to London on 6th Feb 2010”
• More than a simple list of items in a collection that the user has visited (i.e. bookmarks)
– also contain information about the links between the items (relationships)
– descriptive text (e.g. annotations, tags)
– details of other items connected to them
– connections to information both within and outside the collection that provides context

Guided paths
• Users can also follow pre-defined “guided paths”
– created by domain experts, such as scholars or teachers
• Provide an easily accessible entry point to the collection
– can be followed in their entirety
– or left at any point to create an “independent path”
• Guided paths can be based around any theme
– artist and media (“paintings by Picasso”)
– historic periods (“the Cold War”)
– places (“Venice”)
– famous people (“Muhammad Ali”)
– emotions (“happiness”)
– events (“the World Cup”)
– or any other topic (e.g. “Europe”, “food”)

Collaborative paths
• Groups of users can work collectively to create “collaborative paths”
– adding new routes of discovery and annotations that can build upon the contributions made by others
• Could be used to encourage social interaction
– students working on a group project, the output of which is a trail/pathway
– experts working in collaboration to create exhibitions or trails
• Paths may also help identify individuals interested in the same topics and themes
– identifying where the pathways cross over

Users and their goals
• Some users may want to create paths as their specific goal (e.g. instructors and curators): producers
– locate and save nodes related to certain themes or subjects
– creating learning resources for non-experts by constructing narratives
– these experts manually create guided paths but may benefit from assistance with locating and constructing paths
• Some users specifically come with the intention of following trails (e.g. students and museum visitors): consumers
– non-experts following static paths created by experts
– may deviate from static paths and create individual paths
• Other users may not intend to generate or follow paths per se
– don’t specifically save nodes during their searching
– may benefit from paths as a record of interaction for future use
– generate paths through user-system interaction, allow post-editing
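
The idea of identifying where pathways cross over, so as to find people interested in the same topics, can be sketched as a pairwise set intersection over the nodes of each user's path. The function name `crossovers` and the sample data are hypothetical; a real system would presumably also weigh how many nodes are shared and how specific they are.

```python
from itertools import combinations


def crossovers(paths_by_user):
    """Map each pair of users to the nodes that their paths have in common."""
    shared = {}
    # Sort by user name so that each pair is always reported in the same order.
    users = sorted(paths_by_user)
    for user_a, user_b in combinations(users, 2):
        common = set(paths_by_user[user_a]) & set(paths_by_user[user_b])
        if common:
            shared[(user_a, user_b)] = common
    return shared


# Made-up users and item identifiers, for illustration only.
paths_by_user = {
    "ann": ["item:1", "item:2", "item:3"],
    "bob": ["item:3", "item:4"],
    "cat": ["item:5"],
}
result = crossovers(paths_by_user)
```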

User studies
• Focus on studying the activities of people who create pathways
– curators, educators, professional historians …
– currently interviewing subjects from a range of cultural heritage and educational organisations (from the user groups)
• Want to find out the following
– Who creates paths and for what purpose(s)?
– What processes/tasks are involved in creating paths?
– What criteria are used to select items to include in paths?
– How are paths adapted to specific audiences?
– What tools are used to help create paths and how are they presented?
– Where do paths begin and where do they end?
• Also want to gather the requirements/needs of the consumers
– based on user characteristics, how do users follow paths?

Adapting to individuals and groups
• Different users will have differing needs from pathways
– system will make user-specific recommendations about items of potential interest as individuals navigate through the collection
• Build up knowledge and understanding of users (user model)
– cognitive styles
– expertise/subject knowledge
– age
– gender
– language abilities (this and the items above are gathered explicitly)
– system interactions (implicit)
• User will be offered links to information both within and outside the collection
– provide contextual and background information, individually tailored to each user and their context
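
The user model listed above, with explicit attributes alongside implicitly logged interactions, can be sketched as a simple record. The field names, the `record_view` method and the `recommend` filter are all hypothetical; in particular the filter is a deliberately trivial placeholder and not the PATHS recommender.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UserModel:
    # Explicit attributes, supplied by the user (names are illustrative).
    cognitive_style: Optional[str] = None
    expertise: Optional[str] = None
    age: Optional[int] = None
    gender: Optional[str] = None
    languages: list = field(default_factory=list)
    # Implicit evidence, logged from interactions with the system.
    viewed_items: list = field(default_factory=list)

    def record_view(self, item_id):
        self.viewed_items.append(item_id)


def recommend(user, candidates):
    """Return ids of unseen candidates in a language the user can read.

    `candidates` is a list of dicts with "id" and "language" keys. If the
    user has declared no languages, no language filtering is applied.
    """
    seen = set(user.viewed_items)
    return [
        c["id"]
        for c in candidates
        if c["id"] not in seen
        and (not user.languages or c["language"] in user.languages)
    ]


user = UserModel(expertise="novice", languages=["en"])
user.record_view("a")
candidates = [
    {"id": "a", "language": "en"},
    {"id": "b", "language": "en"},
    {"id": "c", "language": "it"},
]
suggested = recommend(user, candidates)
```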

Learning and knowledge discovery
• A particular area of focus in PATHS will be on learning and knowledge discovery
– help people as they use cultural heritage resources to learn and discover new knowledge
• People learn and solve problems differently
– some people require a lot of guiding; others are self-directed
– some people welcome irrelevant material; others are intolerant
– some people creatively explore and come up with new ideas; others want to simply answer a set problem
• Users may perform information seeking
– must navigate through information spaces
– different people may require different levels of assistance

Key cognitive dimensions (Pask and Witkin)
[Figure: two dimensions, Local (analytic) versus Global and Autonomous versus Dependent]
• Learning/problem-solving goals
– Local (analytic): convergent goals; “find an answer”; learn pre-defined content
– Global: divergent goals; creatively explore; come up with new ideas
• Process goals
– Local (analytic): concerned with procedures and vertical deep detail (procedure building)
– Global: concerned with conceptual overview and horizontal broad inter-relationships (description building)
• Navigation styles
– Local (analytic): serialist navigation style; narrow focus; one thing at a time; short logical links between nodes; intolerance of strictly irrelevant material; finish with one topic before going on to the next
– Global: holist navigation style; broad global focus; many things on the go at the same time; rich links between nodes; welcoming of enrichment (but strictly irrelevant) material; layered approach returning to nodes at different level of detail
• Positive learning outcomes
– Local (analytic): good grasp of detailed evidence; deep understanding of individual topics; in-depth understanding of the parts
– Global: well developed conceptual overview; broad inter-relationship of ideas; good grasp of the “big picture”
• Characteristic learning pathologies
– Local (analytic): poor appreciation of topic inter-relationships; failing to see the “big picture”
– Global: poor grasp of detail; over-generalisation
• Adopting a navigation path that matches one’s predominant style can influence the effectiveness of the resultant learning.

Realising our vision
• Stage 1 leads towards functionality of prototype 1
– simple functionality for allowing registration of users
– functionality to allow users to manually generate, organise and share static paths
– visualisations of document space and provision of basic functionality for searching and browsing
• Stage 2 leads towards functionality of prototype 2
– focus on personalisation and recommendation (creating paths dynamically and navigating the document space)
– advanced visualisation and search/browse functionalities
• Stage 3 generates the additional applications
– adapt functionality for Facebook and mobile devices

Supporting exploration and discovery
• Explore different visualisations
– provide representations of the document space (nodes and connections) that users can explore and draw trails through
– personalise views to reduce irrelevant information
• Develop search and browse functionality
– jump to specific nodes (e.g. query or through subject ontology)
– explore relationships between nodes (e.g. “X student of Y”)
– support for more exploratory search behaviours
• Supporting adaptation to specific user model
– personalised views of results and the collection
– personalised navigational paths through the collection
– different forms of contextualisation for items (e.g. linking)

Creating, managing and sharing paths
• User registration and definition of custom settings
– workspace area for registered users
• Functionality (and user interfaces) to allow people to create paths
– save nodes discovered during search and browse
– arrange and organise nodes
– add metadata to nodes (e.g. description and annotations)
– edit and refine paths (e.g. add and delete nodes)
– automatically suggest nodes to add to paths
• Create paths as goal vs. create paths as side effect of interaction
• Functionality to allow management of paths (as objects)
• Functionality to allow users to follow created paths (users)
– presentation/visualisation of path (e.g. history list, graph)
– path overlaid on document space (contextualised)

Developing user interfaces
• Advanced visualisations and overviews of info space
– spatial metaphors for user interface
– different types of collection overview and browsing
– document space ordered thematically and hierarchically
• Automatic creation of themed collections and paths
• Encouraging engagement with pathways
– games/quizzes, surprises and use of images
– social interaction
– diversity in recommendations
• Browsing using ostensive relevance feedback models
– use past items (not one) to guide future direction of navigation
• Approaches for evaluating pathways (e.g. Session Track)
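
The ostensive browsing idea, using several past items rather than only the last one to guide where to go next, can be sketched as a recency-weighted profile. This is an assumption-laden illustration: the exponential decay factor, the term-weight dictionaries and the function names `ostensive_profile` and `rank_candidates` are all hypothetical, and other ageing profiles are possible.

```python
def ostensive_profile(viewed, decay=0.5):
    """Combine term vectors of viewed items into one profile.

    `viewed` is ordered oldest first. The most recent item gets weight 1,
    the one before it `decay`, then `decay ** 2`, and so on, so that older
    evidence counts for less but is not discarded.
    """
    profile = {}
    for age, terms in enumerate(reversed(viewed)):
        weight = decay ** age
        for term, value in terms.items():
            profile[term] = profile.get(term, 0.0) + weight * value
    return profile


def rank_candidates(profile, candidates):
    """Order candidate names by dot product with the profile, best first."""
    def score(terms):
        return sum(profile.get(term, 0.0) * value for term, value in terms.items())

    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)


# Made-up term vectors: the user looked at a "ships" item, then a "steel" item.
viewed = [{"ships": 1.0}, {"steel": 1.0}]
profile = ostensive_profile(viewed)
candidates = {"x": {"ships": 1.0}, "y": {"steel": 1.0}}
ranking = rank_candidates(profile, candidates)
```

With these inputs the steel item ranks first because it matches the more recent view, while the ships item still scores above zero.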

Silverlight zoomable interface
http://www.microsoft.com/silverlight/pivotviewer/

Summary
• Pathways offer powerful metaphor for navigation onto which personalisation can be added
– main focus and areas of novelty for the project
• Paths can be used to support various styles of cognitive information processing
– surface as different routes taken through information space
• Offering users suggested routes will
– help them locate information in large collections
– help encourage information exploration and discovery
– help them fulfil broader activities (e.g. constructing knowledge)
• Ultimately paths could help enhance user’s information access experience of digital library resources
– but we need to understand users and their specific needs for creating, managing and sharing pathways

Advert!
TREC Session Track
Ben Carterette
Paul Clough
Evangelos Kanoulas
Mark Sanderson

Session Track
• The Session track ran for first time as pilot task in TREC 2010
– aimed to build test collection to evaluate sessions rather than a single query
• We had two primary objectives
– to test whether systems can improve their performance for a given query by using a previous query
– to evaluate system performance over an entire query session instead of a single query

An “ideal” session test collection
• A set of information needs with title queries
• For each query, a static set of one or more possible reformulations
– For each of those, another possible set of reformulations, continuing recursively.
• For each reformulation, some "support" for it in retrieved documents from the previous query, e.g. clicks, keyword overlap, viewing time etc.
• A way to compute a probability distribution over reformulations given a ranked list for the previous query and support used
• Relevance judgments on documents to each information need represented by the query and its reformulations

2010 participant’s task
• Use first query to improve results for second
– 150 query pairs: initial query, reformulation
• Participants submitted 3 ranked lists (RLs) for each query pair
– (RL1) retrieval results for first query
– (RL2) retrieval results for second query
– (RL3) retrieval results for second query using anything that can be gleaned from first
• Evaluated using normalized session DCG (nsDCG@10)
– Session: RL1 -> RL2, nsDCG.RL12
– Session: RL1 -> RL3, nsDCG.RL13 (use to evaluate)
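
The nsDCG@10 measure named on the 2010 task slide can be illustrated with a short sketch. It follows the general idea of session-based DCG, in which gains are discounted both by rank within a list and by the position of the query in the session. The specific discount functions, the log bases (2 and 4) and the way the ideal session is supplied by the caller are assumptions made for illustration; this is not the track's official evaluation code.

```python
import math


def dcg(gains, rank_base=2.0):
    """Discounted cumulative gain of one ranked list of relevance gains.

    Rank 1 is undiscounted because log_b(b) = 1 (assumed discount form).
    """
    return sum(
        gain / math.log(rank + rank_base - 1, rank_base)
        for rank, gain in enumerate(gains, start=1)
    )


def session_dcg(ranked_lists, k=10, rank_base=2.0, query_base=4.0):
    """Sum DCG@k over the queries of a session, discounting later queries."""
    total = 0.0
    for position, gains in enumerate(ranked_lists, start=1):
        query_discount = 1.0 + math.log(position, query_base)
        total += dcg(gains[:k], rank_base) / query_discount
    return total


def normalised_session_dcg(ranked_lists, ideal_lists, k=10):
    """Divide the session DCG by that of an ideal session given by the caller."""
    ideal = session_dcg(ideal_lists, k)
    return session_dcg(ranked_lists, k) / ideal if ideal > 0 else 0.0


# Made-up binary gains for a two-query session: RL1 followed by RL3.
rl1 = [1, 0, 0]
rl3 = [0, 1, 1]
ideal = [[1, 0, 0], [1, 1, 0]]
score_rl13 = normalised_session_dcg([rl1, rl3], ideal)
```

Under these assumptions a relevant document found by the first query contributes more than the same document found only after reformulating, which is the behaviour a session-level measure is meant to capture.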

Participating Groups
• 27 systems, 3 ranked lists per system, from 10 participating sites
– Bauhaus University Weimar
– Gale Cengage Learning
– Hungarian Academy of Science
– RMIT University
– The University of Melbourne
– University of Amsterdam
– University of Arkansas at Little Rock
– University of Delaware
– University of Essex
– University of Lugano

TREC Session Track 2011
• The task of improving results of the second query given only the first proved difficult
– few groups were able to show improvements and none of the improvements were statistically significant
– limited feedback given to participants
• For the 2011 Session Track we aim to provide richer interaction data for participants
– we are asking users to generate sessions on ClueWeb09
– custom-built tool to log user activities during session
– will provide various ‘hints’ to participants (e.g. results lists, user’s clicks, queries etc.)
– http://ir.cis.udel.edu/sessions/
Contact
Thanks for listening
p.d.clough@sheffield.ac.uk