SlideShare a Scribd company logo
1 of 12
Tag-Based Browsing of Digital
Collections with Inverted Indexes and
Browsing Cache
Joaquín Gayoso-Cabada, Mercedes Gómez-Albarrán,
José Luis Sierra
Fac. Informática
Universidad Complutense de Madrid
2
Contents
Introduction
The Tag-Based Browsing Model
Tag-Based Browsing with Inverted
Indexes
Adding a Browsing Cache
Conclusions and Future Work
3
Introduction
Clavy: an experimental platform for learning
object repositories with reconfiguable
structures
Clavy makes it possible to rearrange the
hierarchical organization of elements in
metadata schemata.
These reconfigurations affect functionalities like
learning object presentation, and browsing.
In particular, although from a user’s point of
view Clavy supports a guided browsing
paradigm…
… internally it supports more free and flexible
browsing mechanisms…
… able to take account of all the posible ways of
browsing the repositories
4
Introduction
Clavy browsing is internally supported by a tag-
based browsing system
element – value pairs are abstracted as tags
The browsing system maintains:
– A set of active tags
– The set of filtered objects
– The set of additionally selectable tags, able to
further shrink, but not to vanish, the filtered
objects
Updating the browsing snapshot when the set of
active tags changes can be computationally-
intensive
To mitigate the cost we proposed a strategy
based on inverted indexes and a browsing
cache
5
The Tag-Based Browsing Model
Digital Collections
Resources Tagging Resources Tagging
r1 Cave-Painting
Cantabrian
Prehistoric
r4 Tartesian
Plateau
Protohistoric
r2 Cave-Painting
Levant
Prehistoric
r5 Phoenician
Penibaetic
Protohistoric
r3 Megalithic
Cantabrian
Prehistoric
r6 Punic
Levant
Protohistoric
Resources  Content of Learning objects
Tags  Element-value pairs
6
The Tag-Based Browsing Model
Browsing
Browsing state:
– F  Set of selected tags.
– RF  Set of filtered resources.
– SF  Set of selectable tags.
Browsing actions:
– +t  Select the tag t.
– xt  Remove the tag t
7
Browsing with Inverted Indexes
Inverted Indexes
For each tag t the inverted index  returns
the set of all the resources (t) tagged with t
(Cave-Painting)={r1,r2}
(Megalithic)={r3}
(Tartesian)={r4}
(Phoenician)={r5}
(Punic)={r6}
(Cantabrian)={r1,r3}
(Levant)={r2,r6}
(Plateau)={r4}
(Penibaetic)={r5}
(Prehistoric)={r1,r2,r3}
(Protohistoric)={r4,r5,r6}
Resources Tagging Resources Tagging
r1 Cave-Painting
Cantabrian
Prehistoric
r4 Tartesian
Plateau
Protohistoric
r2 Cave-Painting
Levant
Prehistoric
r5 Phoenician
Penibaetic
Protohistoric
r3 Megalithic
Cantabrian
Prehistoric
r6 Punic
Levant
Protohistoric
Inverted index
8
Browsing with Inverted Indexes
The Browsing Strategy
+t browsing action:
– F  F  {t}
– RF  RF(t)
– SF{t’SF-{t} |
0 < |RF(t’)| <|RF|}
xt browsing action:
– F  F - {t}
– RF  t’F (t’) (or all the
resources if F=)
– SF{t’- F |
0 < |RF(t’)| <|RF|}
F= is managed as a
particular case:
– RF  
– SF  {t | |(t)| < ||}
9
: filtered resource
store
F ⟶ RF
: selectable tag
store
F ⟶ SF
: representative
store
RF ⟶ F
Adding a Browsing Cache
CACHE#5 CACHE#4
CACHE#1
CACHE#2
()=
()=
CACHE#3
()=
(t10)=R1
F
(t10,t1)=R2
F
(R1
F
)={t10}
(R2
F
)={t10,t1}
()=
(t10)={t1,t2,t6,t7}
(t10,t1)={t6,t7}
()=
(t10)=R1
F
(t10,t1)=R2
F
(R1
F
)={t10}
(R2
F
)={t10,t1}
()=
(t10)={t1,t2,t6,t7}
(t10,t1)={t6,t7}
()=
(t10)=R1
F
(t10,t1)=R2
F
(R1
F
)={t10}
(R2
F
)={t10,t1}
()=
(t10)={t1,t2,t6,t7}
(t10,t1)={t6,t7}
()=
(t10)=R1
F
(t10,t1)=R2
F
(t1)=R5
F
(R1
F
)={t10}
(R2
F
)={t10,t1}
()=
(t10)={t1,t2,t6,t7}
(t10,t1)={t6,t7}
(t1)={t6,t7}
CACHE#6
()=
(t10)=R1
F
(R1
F
)={t10}
()=
(t10)={t1,t2,t6,t7}
+Prehistoric
CACHE#1
+Cave-Painting
CACHE#2
xCave-Painting
CACHE#3
xPrehistoric
CACHE#4+Cave-Painting
CACHE#5
{Cave-Painting}
{Cantabrian,
Levant}
 
 {Prehistoric}
{Cave-Painting,
Megalithic,
Cantabrian,
Levant}
{Prehistoric}
{Cave-Painting,
Megalithic,
Cantabrian,
Levant}
 

R1
F
=R0
F
  (t10) R2
F
=R1
F
  (t1)
R5
F
=R4
F
  (t1)
|R1
F
  (t1)|=2
|R1
F
  (t2)|=1
|R1
F
  (t3)|=0
|R1
F
  (t4)|=0
|R1
F
  (t5)|=0
|R1
F
  (t6)|=2
|R1
F
  (t7)|=1
|R1
F
  (t8)|=0
|R1
F
  (t9)|=0
|R1
F
  (t11)|=0
0<|R1
F
(t)|<|R1
F
|
|R2
F
  (t2)|=0
|R2
F
  (t6)|=1
|R2
F
  (t7)|=1
| (t1)|=2
| (t2)|=1
| (t3)|=1
| (t4)|=1
| (t5)|=1
| (t6)|=2
| (t7)|=2
| (t8)|=1
| (t9)|=1
| (t10)|=3
| (t11)|=3
|(t)|< ||
{Prehistoric,
Cave-Painting}
{Cantabrian,
Levant}
0<|R2
F
(t)|<|R2
F
|
345
{r1,r2,r3} {r1,r2}
{r1,r2,r3}{r1,r2}
0 1 2
CACHE#6
10
Conclusions
A browsing strategy based on a suitable combination of
inverted indexes and multilevel caches has been proposed
to speed up the browsing process in Clavy
Currently we are working on the empirical evaluation of our
approach in Chasqui, a real-world repository in the Pre-
Columbian American archeology field.
Preliminary experiments suggest that the browsing cache
can substantially speed up navigation with respect to a more
basic, un-cached strategy (solely based on inverted indexes).
The price to pay is the overhead generated by cache
management, as well as the higher memory footprint caused
by the technique.
However, the experiments also make apparent how: (i) the
cache management overhead is compensated by eliminating
the explicit computation of the information associated to many
browsing states, and (ii) the cache size is maintained within
reasonable ranges, even when it is not upper-bounded.
11
Future Work
To improve the cache strategy by combining it with our
previous work on navigation automata.
To generalize the browsing strategy to support navigation
through links among resources.
To combine browsing and search, letting users browse
search results according to the browsing model described.
Tag-Based Browsing of Digital
Collections with Inverted Indexes and
Browsing Cache
Joaquín Gayoso-Cabada, Mercedes Gómez-Albarrán,
José Luis Sierra
Fac. Informática
Universidad Complutense de Madrid

More Related Content

Similar to Tag-Based Browsing of Digital Collections with Inverted Indexes and Browsing Cache

E-ARK-iPRES2016-Bern-October-2016
E-ARK-iPRES2016-Bern-October-2016E-ARK-iPRES2016-Bern-October-2016
E-ARK-iPRES2016-Bern-October-2016
Sven Schlarb
 
final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)
Ankit Rathi
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
National Institute of Informatics
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
Rathachai Chawuthai
 

Similar to Tag-Based Browsing of Digital Collections with Inverted Indexes and Browsing Cache (20)

SPARQL-DL - Theory & Practice
SPARQL-DL - Theory & PracticeSPARQL-DL - Theory & Practice
SPARQL-DL - Theory & Practice
 
E-ARK-iPRES2016-Bern-October-2016
E-ARK-iPRES2016-Bern-October-2016E-ARK-iPRES2016-Bern-October-2016
E-ARK-iPRES2016-Bern-October-2016
 
LibreCat::Catmandu
LibreCat::CatmanduLibreCat::Catmandu
LibreCat::Catmandu
 
Benchmarking Cloud-based Tagging Services
Benchmarking Cloud-based Tagging ServicesBenchmarking Cloud-based Tagging Services
Benchmarking Cloud-based Tagging Services
 
The Ceph RGW archive zone feature (Ceph Days 2019)
The Ceph RGW archive zone feature (Ceph Days 2019)The Ceph RGW archive zone feature (Ceph Days 2019)
The Ceph RGW archive zone feature (Ceph Days 2019)
 
final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)final_copy_camera_ready_paper (7)
final_copy_camera_ready_paper (7)
 
Open Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learningOpen Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learning
 
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
 
The ARK Identifier Scheme at Ten Years Old
The ARK Identifier Scheme at Ten Years OldThe ARK Identifier Scheme at Ten Years Old
The ARK Identifier Scheme at Ten Years Old
 
Mastro
MastroMastro
Mastro
 
Mastro
MastroMastro
Mastro
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
 
04 open source_tools
04 open source_tools04 open source_tools
04 open source_tools
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemThe Apache Spark File Format Ecosystem
The Apache Spark File Format Ecosystem
 
The Ontario library research cloud
The Ontario library research cloudThe Ontario library research cloud
The Ontario library research cloud
 
SPARQL and RDF query optimization
SPARQL and RDF query optimizationSPARQL and RDF query optimization
SPARQL and RDF query optimization
 
ARIADNE Registry - towards interoperability
ARIADNE Registry - towards interoperabilityARIADNE Registry - towards interoperability
ARIADNE Registry - towards interoperability
 
Upgrading maps with Linked Data
Upgrading maps with Linked DataUpgrading maps with Linked Data
Upgrading maps with Linked Data
 

More from Technological Ecosystems for Enhancing Multiculturality

More from Technological Ecosystems for Enhancing Multiculturality (20)

A Preliminary Study of Proof of Concept Practices and their connection with I...
A Preliminary Study of Proof of Concept Practices and their connection with I...A Preliminary Study of Proof of Concept Practices and their connection with I...
A Preliminary Study of Proof of Concept Practices and their connection with I...
 
Social networks as a promotional space for Spanish radio content. The case st...
Social networks as a promotional space for Spanish radio content. The case st...Social networks as a promotional space for Spanish radio content. The case st...
Social networks as a promotional space for Spanish radio content. The case st...
 
Towards the study of sentiment in the public opinion of science in Spanish
Towards the study of sentiment in the public opinion of science in SpanishTowards the study of sentiment in the public opinion of science in Spanish
Towards the study of sentiment in the public opinion of science in Spanish
 
A Three-Step Data-Mining Analysis of Top-Ranked Higher Education Institutions...
A Three-Step Data-Mining Analysis of Top-Ranked Higher Education Institutions...A Three-Step Data-Mining Analysis of Top-Ranked Higher Education Institutions...
A Three-Step Data-Mining Analysis of Top-Ranked Higher Education Institutions...
 
Specifics of multimedia texts in the context of social networks media aesthetics
Specifics of multimedia texts in the context of social networks media aestheticsSpecifics of multimedia texts in the context of social networks media aesthetics
Specifics of multimedia texts in the context of social networks media aesthetics
 
Combined Effects of Similarity and Imagined Contact on First-Person Testimoni...
Combined Effects of Similarity and Imagined Contact on First-Person Testimoni...Combined Effects of Similarity and Imagined Contact on First-Person Testimoni...
Combined Effects of Similarity and Imagined Contact on First-Person Testimoni...
 
Direct online political communication effects on civil participation in spain...
Direct online political communication effects on civil participation in spain...Direct online political communication effects on civil participation in spain...
Direct online political communication effects on civil participation in spain...
 
University Media in Ecuador: Types, Functions and Self-determination
University Media in Ecuador: Types, Functions and Self-determinationUniversity Media in Ecuador: Types, Functions and Self-determination
University Media in Ecuador: Types, Functions and Self-determination
 
Like it or die: using social networks to improve collaborative learning in hi...
Like it or die: using social networks to improve collaborative learning in hi...Like it or die: using social networks to improve collaborative learning in hi...
Like it or die: using social networks to improve collaborative learning in hi...
 
Framing theory in studies of environmental information in press
Framing theory in studies of environmental information in pressFraming theory in studies of environmental information in press
Framing theory in studies of environmental information in press
 
Domain engineering for generating dashboards to analyze employment and employ...
Domain engineering for generating dashboards to analyze employment and employ...Domain engineering for generating dashboards to analyze employment and employ...
Domain engineering for generating dashboards to analyze employment and employ...
 
Mapping the systematic literature studies about software ecosystems
Mapping the systematic literature studies about software ecosystemsMapping the systematic literature studies about software ecosystems
Mapping the systematic literature studies about software ecosystems
 
A Multivocal Literature Review on the use of DevOps for e-learning systems
A Multivocal Literature Review on the use of DevOps for e-learning systemsA Multivocal Literature Review on the use of DevOps for e-learning systems
A Multivocal Literature Review on the use of DevOps for e-learning systems
 
Document Annotation Tools: Annotation Classification Mechanisms
Document Annotation Tools: Annotation Classification MechanismsDocument Annotation Tools: Annotation Classification Mechanisms
Document Annotation Tools: Annotation Classification Mechanisms
 
Toward supporting decision-making under uncertainty in digital humanities wit...
Toward supporting decision-making under uncertainty in digital humanities wit...Toward supporting decision-making under uncertainty in digital humanities wit...
Toward supporting decision-making under uncertainty in digital humanities wit...
 
Managing Uncertainty in the Humanities: Digital and Analogue Approaches
Managing Uncertainty in the Humanities: Digital and Analogue ApproachesManaging Uncertainty in the Humanities: Digital and Analogue Approaches
Managing Uncertainty in the Humanities: Digital and Analogue Approaches
 
Representing Imprecise and Uncertain Knowledge in Digital Humanities: A Theor...
Representing Imprecise and Uncertain Knowledge in Digital Humanities: A Theor...Representing Imprecise and Uncertain Knowledge in Digital Humanities: A Theor...
Representing Imprecise and Uncertain Knowledge in Digital Humanities: A Theor...
 
Dotmocracy and Planning Poker for Uncertainty Management in Collaborative Res...
Dotmocracy and Planning Poker for Uncertainty Management in Collaborative Res...Dotmocracy and Planning Poker for Uncertainty Management in Collaborative Res...
Dotmocracy and Planning Poker for Uncertainty Management in Collaborative Res...
 
Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...
Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...
Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...
 
Appliying topic modeling techniques to degraded texts. Spanish historical pre...
Appliying topic modeling techniques to degraded texts. Spanish historical pre...Appliying topic modeling techniques to degraded texts. Spanish historical pre...
Appliying topic modeling techniques to degraded texts. Spanish historical pre...
 

Recently uploaded

Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
ssuserdda66b
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
AnaAcapella
 

Recently uploaded (20)

Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 

Tag-Based Browsing of Digital Collections with Inverted Indexes and Browsing Cache

  • 1. Tag-Based Browsing of Digital Collections with Inverted Indexes and Browsing Cache Joaquín Gayoso-Cabada, Mercedes Gómez-Albarrán, José Luis Sierra Fac. Informática Universidad Complutense de Madrid
  • 2. 2 Contents Introduction The Tag-Based Browsing Model Tag-Based Browsing with Inverted Indexes Adding a Browsing Cache Conclusions and Future Work
  • 3. 3 Introduction Clavy: an experimental platform for learning object repositories with reconfiguable structures Clavy makes it possible to rearrange the hierarchical organization of elements in metadata schemata. These reconfigurations affect functionalities like learning object presentation, and browsing. In particular, although from a user’s point of view Clavy supports a guided browsing paradigm… … internally it supports more free and flexible browsing mechanisms… … able to take account of all the posible ways of browsing the repositories
  • 4. 4 Introduction Clavy browsing is internally supported by a tag- based browsing system element – value pairs are abstracted as tags The browsing system maintains: – A set of active tags – The set of filtered objects – The set of additionally selectable tags, able to further shrink, but not to vanish, the filtered objects Updating the browsing snapshot when the set of active tags changes can be computationally- intensive To mitigate the cost we proposed a strategy based on inverted indexes and a browsing cache
  • 5. 5 The Tag-Based Browsing Model Digital Collections Resources Tagging Resources Tagging r1 Cave-Painting Cantabrian Prehistoric r4 Tartesian Plateau Protohistoric r2 Cave-Painting Levant Prehistoric r5 Phoenician Penibaetic Protohistoric r3 Megalithic Cantabrian Prehistoric r6 Punic Levant Protohistoric Resources  Content of Learning objects Tags  Element-value pairs
  • 6. 6 The Tag-Based Browsing Model Browsing Browsing state: – F  Set of selected tags. – RF  Set of filtered resources. – SF  Set of selectable tags. Browsing actions: – +t  Select the tag t. – xt  Remove the tag t
  • 7. 7 Browsing with Inverted Indexes Inverted Indexes For each tag t the inverted index  returns the set of all the resources (t) tagged with t (Cave-Painting)={r1,r2} (Megalithic)={r3} (Tartesian)={r4} (Phoenician)={r5} (Punic)={r6} (Cantabrian)={r1,r3} (Levant)={r2,r6} (Plateau)={r4} (Penibaetic)={r5} (Prehistoric)={r1,r2,r3} (Protohistoric)={r4,r5,r6} Resources Tagging Resources Tagging r1 Cave-Painting Cantabrian Prehistoric r4 Tartesian Plateau Protohistoric r2 Cave-Painting Levant Prehistoric r5 Phoenician Penibaetic Protohistoric r3 Megalithic Cantabrian Prehistoric r6 Punic Levant Protohistoric Inverted index
  • 8. 8 Browsing with Inverted Indexes The Browsing Strategy +t browsing action: – F  F  {t} – RF  RF(t) – SF{t’SF-{t} | 0 < |RF(t’)| <|RF|} xt browsing action: – F  F - {t} – RF  t’F (t’) (or all the resources if F=) – SF{t’- F | 0 < |RF(t’)| <|RF|} F= is managed as a particular case: – RF   – SF  {t | |(t)| < ||}
  • 9. 9 : filtered resource store F ⟶ RF : selectable tag store F ⟶ SF : representative store RF ⟶ F Adding a Browsing Cache CACHE#5 CACHE#4 CACHE#1 CACHE#2 ()= ()= CACHE#3 ()= (t10)=R1 F (t10,t1)=R2 F (R1 F )={t10} (R2 F )={t10,t1} ()= (t10)={t1,t2,t6,t7} (t10,t1)={t6,t7} ()= (t10)=R1 F (t10,t1)=R2 F (R1 F )={t10} (R2 F )={t10,t1} ()= (t10)={t1,t2,t6,t7} (t10,t1)={t6,t7} ()= (t10)=R1 F (t10,t1)=R2 F (R1 F )={t10} (R2 F )={t10,t1} ()= (t10)={t1,t2,t6,t7} (t10,t1)={t6,t7} ()= (t10)=R1 F (t10,t1)=R2 F (t1)=R5 F (R1 F )={t10} (R2 F )={t10,t1} ()= (t10)={t1,t2,t6,t7} (t10,t1)={t6,t7} (t1)={t6,t7} CACHE#6 ()= (t10)=R1 F (R1 F )={t10} ()= (t10)={t1,t2,t6,t7} +Prehistoric CACHE#1 +Cave-Painting CACHE#2 xCave-Painting CACHE#3 xPrehistoric CACHE#4+Cave-Painting CACHE#5 {Cave-Painting} {Cantabrian, Levant}    {Prehistoric} {Cave-Painting, Megalithic, Cantabrian, Levant} {Prehistoric} {Cave-Painting, Megalithic, Cantabrian, Levant}    R1 F =R0 F   (t10) R2 F =R1 F   (t1) R5 F =R4 F   (t1) |R1 F   (t1)|=2 |R1 F   (t2)|=1 |R1 F   (t3)|=0 |R1 F   (t4)|=0 |R1 F   (t5)|=0 |R1 F   (t6)|=2 |R1 F   (t7)|=1 |R1 F   (t8)|=0 |R1 F   (t9)|=0 |R1 F   (t11)|=0 0<|R1 F (t)|<|R1 F | |R2 F   (t2)|=0 |R2 F   (t6)|=1 |R2 F   (t7)|=1 | (t1)|=2 | (t2)|=1 | (t3)|=1 | (t4)|=1 | (t5)|=1 | (t6)|=2 | (t7)|=2 | (t8)|=1 | (t9)|=1 | (t10)|=3 | (t11)|=3 |(t)|< || {Prehistoric, Cave-Painting} {Cantabrian, Levant} 0<|R2 F (t)|<|R2 F | 345 {r1,r2,r3} {r1,r2} {r1,r2,r3}{r1,r2} 0 1 2 CACHE#6
  • 10. 10 Conclusions A browsing strategy based on a suitable combination of inverted indexes and multilevel caches has been proposed to speed up the browsing process in Clavy Currently we are working on the empirical evaluation of our approach in Chasqui, a real-world repository in the Pre- Columbian American archeology field. Preliminary experiments suggest that the browsing cache can substantially speed up navigation with respect to a more basic, un-cached strategy (solely based on inverted indexes). The price to pay is the overhead generated by cache management, as well as the higher memory footprint caused by the technique. However, the experiments also make apparent how: (i) the cache management overhead is compensated by eliminating the explicit computation of the information associated to many browsing states, and (ii) the cache size is maintained within reasonable ranges, even when it is not upper-bounded.
  • 11. 11 Future Work To improve the cache strategy by combining it with our previous work on navigation automata. To generalize the browsing strategy to support navigation through links among resources. To combine browsing and search, letting users browse search results according to the browsing model described.
  • 12. Tag-Based Browsing of Digital Collections with Inverted Indexes and Browsing Cache Joaquín Gayoso-Cabada, Mercedes Gómez-Albarrán, José Luis Sierra Fac. Informática Universidad Complutense de Madrid