Tag-Based Browsing of Digital
Collections with Inverted Indexes and
Browsing Cache
Joaquín Gayoso-Cabada, Mercedes Gómez-Albarrán,
José Luis Sierra
Fac. Informática
Universidad Complutense de Madrid
Contents
Introduction
The Tag-Based Browsing Model
Tag-Based Browsing with Inverted
Indexes
Adding a Browsing Cache
Conclusions and Future Work
Introduction
Clavy: an experimental platform for learning
object repositories with reconfigurable
structures
Clavy makes it possible to rearrange the
hierarchical organization of elements in
metadata schemata.
These reconfigurations affect functionalities like
learning object presentation, and browsing.
In particular, although from a user's point of
view Clavy supports a guided browsing
paradigm…
… internally it supports freer and more flexible
browsing mechanisms…
… able to take account of all the possible ways of
browsing the repositories
Introduction
Clavy browsing is internally supported by a tag-based
browsing system
Element-value pairs are abstracted as tags
The browsing system maintains:
– A set of active tags
– The set of filtered objects
– The set of additionally selectable tags, able to
further shrink, but not to vanish, the filtered
objects
Updating the browsing snapshot when the set of
active tags changes can be computationally intensive
To mitigate the cost, we proposed a strategy
based on inverted indexes and a browsing
cache
The Tag-Based Browsing Model
Digital Collections
Resource   Tagging
r1         Cave-Painting, Cantabrian, Prehistoric
r2         Cave-Painting, Levant, Prehistoric
r3         Megalithic, Cantabrian, Prehistoric
r4         Tartesian, Plateau, Protohistoric
r5         Phoenician, Penibaetic, Protohistoric
r6         Punic, Levant, Protohistoric
Resources ⟶ Content of learning objects
Tags ⟶ Element-value pairs
The Tag-Based Browsing Model
Browsing
Browsing state:
– $F$: set of selected tags.
– $R_F$: set of filtered resources.
– $S_F$: set of selectable tags.
Browsing actions:
– +t: select the tag t.
– xt: remove the tag t.
Browsing with Inverted Indexes
Inverted Indexes
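For each tag $t$, the inverted index returns
the set $\mathit{index}(t)$ of all the resources tagged with $t$
$\mathit{index}(\text{Cave-Painting}) = \{r1, r2\}$
$\mathit{index}(\text{Megalithic}) = \{r3\}$
$\mathit{index}(\text{Tartesian}) = \{r4\}$
$\mathit{index}(\text{Phoenician}) = \{r5\}$
$\mathit{index}(\text{Punic}) = \{r6\}$
$\mathit{index}(\text{Cantabrian}) = \{r1, r3\}$
$\mathit{index}(\text{Levant}) = \{r2, r6\}$
$\mathit{index}(\text{Plateau}) = \{r4\}$
$\mathit{index}(\text{Penibaetic}) = \{r5\}$
$\mathit{index}(\text{Prehistoric}) = \{r1, r2, r3\}$
$\mathit{index}(\text{Protohistoric}) = \{r4, r5, r6\}$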
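As a minimal sketch of how such an index could be built, the Python snippet below inverts the sample collection shown on the slides. The names `COLLECTION` and `build_inverted_index` are illustrative assumptions and are not taken from Clavy.

```python
from collections import defaultdict

# Sample collection from the slides: resource -> set of tags.
COLLECTION = {
    "r1": {"Cave-Painting", "Cantabrian", "Prehistoric"},
    "r2": {"Cave-Painting", "Levant", "Prehistoric"},
    "r3": {"Megalithic", "Cantabrian", "Prehistoric"},
    "r4": {"Tartesian", "Plateau", "Protohistoric"},
    "r5": {"Phoenician", "Penibaetic", "Protohistoric"},
    "r6": {"Punic", "Levant", "Protohistoric"},
}


def build_inverted_index(collection):
    """Map each tag to the set of resources tagged with it (hypothetical helper)."""
    index = defaultdict(set)
    for resource, tags in collection.items():
        for tag in tags:
            index[tag].add(resource)
    return dict(index)


INDEX = build_inverted_index(COLLECTION)
print(sorted(INDEX["Cave-Painting"]))  # ['r1', 'r2']
```

Building the index is a single pass over the collection; afterwards, looking up the resources for a tag is a dictionary access.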
Browsing with Inverted Indexes
The Browsing Strategy
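+t browsing action:
– $F \leftarrow F \cup \{t\}$
– $R_F \leftarrow R_F \cap \mathit{index}(t)$
– $S_F \leftarrow \{t' \in S_F - \{t\} \mid 0 < |R_F \cap \mathit{index}(t')| < |R_F|\}$
xt browsing action:
– $F \leftarrow F - \{t\}$
– $R_F \leftarrow \bigcap_{t' \in F} \mathit{index}(t')$ (or all the resources if $F = \emptyset$)
– $S_F \leftarrow \{t' \in T - F \mid 0 < |R_F \cap \mathit{index}(t')| < |R_F|\}$, where $T$ is the set of all the tags
$F = \emptyset$ is managed as a particular case:
– $R_F \leftarrow R$, the set of all the resources
– $S_F \leftarrow \{t \mid |\mathit{index}(t)| < |R|\}$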
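The two browsing actions can be sketched in Python as follows. This is an illustration under stated assumptions, not Clavy's implementation: the browsing state is modelled as a tuple (F, R_F, S_F), and the function names `initial_state`, `select`, `remove` and `selectable` are hypothetical.

```python
# Sample collection from the slides: resource -> set of tags.
COLLECTION = {
    "r1": {"Cave-Painting", "Cantabrian", "Prehistoric"},
    "r2": {"Cave-Painting", "Levant", "Prehistoric"},
    "r3": {"Megalithic", "Cantabrian", "Prehistoric"},
    "r4": {"Tartesian", "Plateau", "Protohistoric"},
    "r5": {"Phoenician", "Penibaetic", "Protohistoric"},
    "r6": {"Punic", "Levant", "Protohistoric"},
}

# Inverted index: tag -> set of resources tagged with it.
INDEX = {}
for resource, tags in COLLECTION.items():
    for tag in tags:
        INDEX.setdefault(tag, set()).add(resource)

ALL_RESOURCES = set(COLLECTION)


def selectable(candidates, filtered):
    """Tags that shrink the filtered set without emptying it."""
    return {t for t in candidates if 0 < len(filtered & INDEX[t]) < len(filtered)}


def initial_state():
    """F is empty: every resource is filtered in."""
    return frozenset(), set(ALL_RESOURCES), selectable(INDEX, ALL_RESOURCES)


def select(state, tag):
    """+t: narrow the current state incrementally."""
    active, filtered, candidates = state
    new_filtered = filtered & INDEX[tag]
    return active | {tag}, new_filtered, selectable(candidates - {tag}, new_filtered)


def remove(state, tag):
    """xt: recompute the state from the remaining active tags."""
    active, _, _ = state
    new_active = active - {tag}
    if not new_active:
        return initial_state()
    new_filtered = set.intersection(*(INDEX[t] for t in new_active))
    return new_active, new_filtered, selectable(set(INDEX) - new_active, new_filtered)


state = select(initial_state(), "Prehistoric")
print(sorted(state[1]))  # ['r1', 'r2', 'r3']
```

Selecting a tag only intersects the current filtered set with one index entry and re-checks the tags that were already selectable, whereas removing a tag has to intersect the index entries of every remaining active tag and re-check all other tags. This asymmetry is the cost that a cache can help amortize.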
Adding a Browsing Cache
– Filtered resource store: $F \longrightarrow R_F$
– Selectable tag store: $F \longrightarrow S_F$
– Representative store: $R_F \longrightarrow F$
[Figure: step-by-step example of the browsing cache on the sample collection, showing the cache contents CACHE#1 to CACHE#6.]
Tags are numbered in the order of the inverted index listing, so that $t_1$ = Cave-Painting, $t_2$ = Megalithic, $t_6$ = Cantabrian, $t_7$ = Levant and $t_{10}$ = Prehistoric.
Browsing sequence:
– +Prehistoric (CACHE#1): $F$ = {Prehistoric}, $R_F$ = {r1,r2,r3}, $S_F$ = {Cave-Painting, Megalithic, Cantabrian, Levant}
– +Cave-Painting (CACHE#2): $F$ = {Prehistoric, Cave-Painting}, $R_F$ = {r1,r2}, $S_F$ = {Cantabrian, Levant}
– xCave-Painting (CACHE#3): $F$ = {Prehistoric}, $R_F$ = {r1,r2,r3}, $S_F$ = {Cave-Painting, Megalithic, Cantabrian, Levant}
– xPrehistoric (CACHE#4): $F = \emptyset$
– +Cave-Painting (CACHE#5): $F$ = {Cave-Painting}, $R_F$ = {r1,r2}, $S_F$ = {Cantabrian, Levant}
Filtered resources:
– $R_F^1 = R_F^0 \cap \mathit{index}(t_{10})$
– $R_F^2 = R_F^1 \cap \mathit{index}(t_1)$
– $R_F^5 = R_F^4 \cap \mathit{index}(t_1)$
Selectable tags:
– Initial state, condition $|\mathit{index}(t)| < |R|$: $|\mathit{index}(t)|$ is 2 for $t_1$, $t_6$, $t_7$; 1 for $t_2$, $t_3$, $t_4$, $t_5$, $t_8$, $t_9$; 3 for $t_{10}$, $t_{11}$
– $R_F^1$, condition $0 < |R_F^1 \cap \mathit{index}(t)| < |R_F^1|$: $|R_F^1 \cap \mathit{index}(t)|$ is 2 for $t_1$, $t_6$; 1 for $t_2$, $t_7$; 0 for $t_3$, $t_4$, $t_5$, $t_8$, $t_9$, $t_{11}$
– $R_F^2$, condition $0 < |R_F^2 \cap \mathit{index}(t)| < |R_F^2|$: $|R_F^2 \cap \mathit{index}(t)|$ is 0 for $t_2$; 1 for $t_6$, $t_7$
Cache entries shown in CACHE#5:
– Filtered resource store: $(t_{10}) \longrightarrow R_F^1$, $(t_{10}, t_1) \longrightarrow R_F^2$, $(t_1) \longrightarrow R_F^5$
– Representative store: $R_F^1 \longrightarrow \{t_{10}\}$, $R_F^2 \longrightarrow \{t_{10}, t_1\}$
– Selectable tag store: $(t_{10}) \longrightarrow \{t_1, t_2, t_6, t_7\}$, $(t_{10}, t_1) \longrightarrow \{t_6, t_7\}$, $(t_1) \longrightarrow \{t_6, t_7\}$
CACHE#6 shows only the entries for $(t_{10})$.
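One way the three stores might interact is sketched below in Python. The class `BrowsingCache`, its attribute names and the lookup order are assumptions made for illustration, not the authors' implementation; the sketch never evicts entries and recomputes the filtered resources by full intersection on a miss. It relies on the fact that the selectable tags depend only on the filtered resources: every active tag covers the whole filtered set, so the strict upper bound already excludes it.

```python
# Sample collection from the slides: resource -> set of tags.
COLLECTION = {
    "r1": {"Cave-Painting", "Cantabrian", "Prehistoric"},
    "r2": {"Cave-Painting", "Levant", "Prehistoric"},
    "r3": {"Megalithic", "Cantabrian", "Prehistoric"},
    "r4": {"Tartesian", "Plateau", "Protohistoric"},
    "r5": {"Phoenician", "Penibaetic", "Protohistoric"},
    "r6": {"Punic", "Levant", "Protohistoric"},
}

# Inverted index: tag -> set of resources tagged with it.
INDEX = {}
for resource, tags in COLLECTION.items():
    for tag in tags:
        INDEX.setdefault(tag, set()).add(resource)

ALL_RESOURCES = frozenset(COLLECTION)


class BrowsingCache:
    """Hypothetical, unbounded cache with the three stores from the slides."""

    def __init__(self):
        self.filtered = {}        # F   -> R_F
        self.selectable = {}      # F   -> S_F
        self.representative = {}  # R_F -> first F that produced it
        self.computed = 0         # explicit computations of S_F

    def snapshot(self, active):
        """Return (R_F, S_F) for the active tag set, filling the stores on a miss."""
        key = frozenset(active)
        if key not in self.filtered:
            sets = [INDEX[t] for t in key]
            self.filtered[key] = (
                frozenset(set.intersection(*sets)) if sets else ALL_RESOURCES
            )
        resources = self.filtered[key]
        if key not in self.selectable:
            rep = self.representative.get(resources)
            if rep is not None:
                # Same filtered resources as a known state: reuse its tags.
                self.selectable[key] = self.selectable[rep]
            else:
                self.computed += 1
                self.selectable[key] = frozenset(
                    t for t in INDEX
                    if 0 < len(resources & INDEX[t]) < len(resources)
                )
                self.representative[resources] = key
        return resources, self.selectable[key]


# Replay of the browsing sequence from the example.
cache = BrowsingCache()
for tags in ([], ["Prehistoric"], ["Prehistoric", "Cave-Painting"],
             ["Prehistoric"], [], ["Cave-Painting"]):
    resources, tags_left = cache.snapshot(tags)
print(cache.computed)  # 3
```

In this replay the selectable tags are computed explicitly only for the initial state and the first two selections. The two removals hit the selectable tag store directly, and the final {Cave-Painting} state has the same filtered resources as {Prehistoric, Cave-Painting}, so its selectable tags are copied through the representative store.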
Conclusions
A browsing strategy based on a suitable combination of
inverted indexes and multilevel caches has been proposed
to speed up the browsing process in Clavy.
We are currently working on the empirical evaluation of our
approach in Chasqui, a real-world repository in the field of
Pre-Columbian American archeology.
Preliminary experiments suggest that the browsing cache
can substantially speed up navigation with respect to a more
basic, un-cached strategy (solely based on inverted indexes).
The price to pay is the overhead generated by cache
management, as well as the higher memory footprint caused
by the technique.
However, the experiments also show that (i) the
cache management overhead is offset by no longer having
to compute explicitly the information associated with many
browsing states, and (ii) the cache size stays within
reasonable ranges, even when it is not upper-bounded.
Future Work
To improve the cache strategy by combining it with our
previous work on navigation automata.
To generalize the browsing strategy to support navigation
through links among resources.
To combine browsing and search, letting users browse
search results according to the browsing model described.
 
Cognitive Development Adolescence Psychology
Cognitive Development Adolescence PsychologyCognitive Development Adolescence Psychology
Cognitive Development Adolescence Psychology
 
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 

Tag-Based Browsing of Digital Collections with Inverted Indexes and Browsing Cache

  • 1. Tag-Based Browsing of Digital Collections with Inverted Indexes and Browsing Cache Joaquín Gayoso-Cabada, Mercedes Gómez-Albarrán, José Luis Sierra Fac. Informática Universidad Complutense de Madrid
  • 2. Contents: Introduction; The Tag-Based Browsing Model; Tag-Based Browsing with Inverted Indexes; Adding a Browsing Cache; Conclusions and Future Work
  • 3. Introduction. Clavy: an experimental platform for learning object repositories with reconfigurable structures. Clavy makes it possible to rearrange the hierarchical organization of elements in metadata schemata. These reconfigurations affect functionalities such as learning object presentation and browsing. In particular, although from a user's point of view Clavy supports a guided browsing paradigm, internally it supports freer and more flexible browsing mechanisms, able to take account of all the possible ways of browsing the repositories.
  • 4. Introduction. Clavy browsing is internally supported by a tag-based browsing system in which element-value pairs are abstracted as tags. The browsing system maintains:
    – A set of active tags
    – The set of filtered objects
    – The set of additionally selectable tags, i.e. tags that further shrink the set of filtered objects without emptying it
    Updating the browsing snapshot when the set of active tags changes can be computationally intensive. To mitigate the cost we proposed a strategy based on inverted indexes and a browsing cache.
  • 5. The Tag-Based Browsing Model: Digital Collections. Resources and their tagging:
    – r1: Cave-Painting, Cantabrian, Prehistoric
    – r2: Cave-Painting, Levant, Prehistoric
    – r3: Megalithic, Cantabrian, Prehistoric
    – r4: Tartesian, Plateau, Protohistoric
    – r5: Phoenician, Penibaetic, Protohistoric
    – r6: Punic, Levant, Protohistoric
    Resources correspond to the content of learning objects; tags correspond to element-value pairs.
  • 6. The Tag-Based Browsing Model: Browsing.
    Browsing state:
    – F: set of selected tags.
    – R_F: set of filtered resources.
    – S_F: set of selectable tags.
    Browsing actions:
    – +t: select the tag t.
    – xt: remove the tag t.
  • 7. Browsing with Inverted Indexes: Inverted Indexes. For each tag t, the inverted index returns the set index(t) of all the resources tagged with t. For the example collection (r1 to r6):
    – index(Cave-Painting) = {r1, r2}
    – index(Megalithic) = {r3}
    – index(Tartesian) = {r4}
    – index(Phoenician) = {r5}
    – index(Punic) = {r6}
    – index(Cantabrian) = {r1, r3}
    – index(Levant) = {r2, r6}
    – index(Plateau) = {r4}
    – index(Penibaetic) = {r5}
    – index(Prehistoric) = {r1, r2, r3}
    – index(Protohistoric) = {r4, r5, r6}
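As an illustration of the inverted index on this slide, the following minimal Python sketch builds it from the example tagging. The names `TAGGING`, `build_inverted_index` and `INDEX` are hypothetical, chosen for this example only; the slides do not show how Clavy implements the index.

```python
from collections import defaultdict

# Example collection from the slides: resource -> set of tags.
TAGGING = {
    "r1": {"Cave-Painting", "Cantabrian", "Prehistoric"},
    "r2": {"Cave-Painting", "Levant", "Prehistoric"},
    "r3": {"Megalithic", "Cantabrian", "Prehistoric"},
    "r4": {"Tartesian", "Plateau", "Protohistoric"},
    "r5": {"Phoenician", "Penibaetic", "Protohistoric"},
    "r6": {"Punic", "Levant", "Protohistoric"},
}


def build_inverted_index(tagging):
    """Map each tag to the set of resources tagged with it."""
    index = defaultdict(set)
    for resource, tags in tagging.items():
        for tag in tags:
            index[tag].add(resource)
    return dict(index)


# Hypothetical module-level index used for lookups.
INDEX = build_inverted_index(TAGGING)
print(sorted(INDEX["Levant"]))
```

Looking up a tag is then a single dictionary access, which is what makes the set intersections of the browsing strategy cheap to set up.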
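  • 8. Browsing with Inverted Indexes: The Browsing Strategy. Here index(t) is the set of resources tagged with t, R is the set of all resources, and T is the set of all tags.
    +t browsing action:
    – F ← F ∪ {t}
    – R_F ← R_F ∩ index(t)
    – S_F ← {t' ∈ S_F − {t} | 0 < |R_F ∩ index(t')| < |R_F|}
    xt browsing action:
    – F ← F − {t}
    – R_F ← ⋂ index(t') over all t' ∈ F (or all the resources if F = ∅)
    – S_F ← {t' ∈ T − F | 0 < |R_F ∩ index(t')| < |R_F|}
    F = ∅ is managed as a particular case:
    – R_F ← R
    – S_F ← {t | |index(t)| < |R|}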
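The two browsing actions above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' code: `State`, `initial`, `select` and `remove` are hypothetical names, and the inverted index is hard-coded for the six-resource example.

```python
from dataclasses import dataclass

# Inverted index of the example collection (tag -> resources).
INDEX = {t: frozenset(rs) for t, rs in {
    "Cave-Painting": {"r1", "r2"}, "Megalithic": {"r3"},
    "Tartesian": {"r4"}, "Phoenician": {"r5"}, "Punic": {"r6"},
    "Cantabrian": {"r1", "r3"}, "Levant": {"r2", "r6"},
    "Plateau": {"r4"}, "Penibaetic": {"r5"},
    "Prehistoric": {"r1", "r2", "r3"},
    "Protohistoric": {"r4", "r5", "r6"},
}.items()}
ALL_RESOURCES = frozenset().union(*INDEX.values())


@dataclass(frozen=True)
class State:
    selected: frozenset    # F
    filtered: frozenset    # R_F
    selectable: frozenset  # S_F


def _narrowing(candidates, filtered):
    """Tags that shrink `filtered` without emptying it."""
    return frozenset(
        t for t in candidates
        if 0 < len(filtered & INDEX[t]) < len(filtered)
    )


def initial():
    """Special case F = empty set: every resource is filtered in."""
    return State(frozenset(), ALL_RESOURCES,
                 _narrowing(INDEX, ALL_RESOURCES))


def select(state, tag):
    """The +t action: intersect R_F with index(t), prune S_F."""
    filtered = state.filtered & INDEX[tag]
    return State(state.selected | {tag}, filtered,
                 _narrowing(state.selectable - {tag}, filtered))


def remove(state, tag):
    """The xt action: recompute R_F from the remaining tags."""
    selected = state.selected - {tag}
    if not selected:
        return initial()
    filtered = frozenset.intersection(*(INDEX[t] for t in selected))
    return State(selected, filtered,
                 _narrowing(set(INDEX) - selected, filtered))


s1 = select(initial(), "Prehistoric")
s2 = select(s1, "Cave-Painting")
s3 = remove(s2, "Prehistoric")
print(sorted(s2.filtered), sorted(s2.selectable))
```

Note the asymmetry: `select` only intersects and prunes sets already at hand, while `remove` has to rebuild the filtered set from the index and rescan all tags, which is the cost a cache is meant to avoid.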
  • 9. Adding a Browsing Cache. The cache consists of three stores:
    – Filtered resource store: F ⟶ R_F
    – Selectable tag store: F ⟶ S_F
    – Representative store: R_F ⟶ F
    [Figure: evolution of the cache contents (CACHE#1 to CACHE#6) along the browsing sequence +Prehistoric (CACHE#1), +Cave-Painting (CACHE#2), xCave-Painting (CACHE#3), xPrehistoric (CACHE#4), +Cave-Painting (CACHE#5). Browsing states shown: {Prehistoric}, with filtered resources {r1, r2, r3} and selectable tags {Cave-Painting, Megalithic, Cantabrian, Levant}; {Prehistoric, Cave-Painting}, with filtered resources {r1, r2} and selectable tags {Cantabrian, Levant}; {Cave-Painting}, with selectable tags {Cantabrian, Levant}.]
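A rough Python sketch of how three such stores could cooperate is given below. The slide does not spell out the cache policy, so this rests on an assumption: the representative store is used to reuse the selectable tags of a previously visited tag set that yields the same filtered resources. That reuse is sound for the model above, because S_F depends only on R_F. `BrowsingCache`, `state` and `computed` are hypothetical names, and eviction is not modelled.

```python
# Inverted index of the example collection (tag -> resources).
INDEX = {t: frozenset(rs) for t, rs in {
    "Cave-Painting": {"r1", "r2"}, "Megalithic": {"r3"},
    "Tartesian": {"r4"}, "Phoenician": {"r5"}, "Punic": {"r6"},
    "Cantabrian": {"r1", "r3"}, "Levant": {"r2", "r6"},
    "Plateau": {"r4"}, "Penibaetic": {"r5"},
    "Prehistoric": {"r1", "r2", "r3"},
    "Protohistoric": {"r4", "r5", "r6"},
}.items()}
ALL_RESOURCES = frozenset().union(*INDEX.values())


class BrowsingCache:
    """Assumed cache layout: three dictionaries, never evicted."""

    def __init__(self):
        self.filtered = {}        # F   -> R_F
        self.selectable = {}      # F   -> S_F
        self.representative = {}  # R_F -> F
        self.computed = 0         # S_F computed from scratch

    def state(self, selected):
        """Return (R_F, S_F) for the tag set F, using the cache."""
        selected = frozenset(selected)
        if selected not in self.filtered:
            self.filtered[selected] = frozenset.intersection(
                ALL_RESOURCES, *(INDEX[t] for t in selected))
        resources = self.filtered[selected]
        if selected not in self.selectable:
            rep = self.representative.get(resources)
            if rep is not None:
                # Same filtered resources seen before: reuse S_F.
                self.selectable[selected] = self.selectable[rep]
            else:
                self.computed += 1
                self.selectable[selected] = frozenset(
                    t for t in INDEX
                    if 0 < len(resources & INDEX[t]) < len(resources))
                self.representative[resources] = selected
        return resources, self.selectable[selected]


# Replay the browsing sequence from the slide.
cache = BrowsingCache()
for tags in [{"Prehistoric"}, {"Prehistoric", "Cave-Painting"},
             {"Prehistoric"}, set(), {"Cave-Painting"}]:
    resources, selectable = cache.state(tags)
print(sorted(resources), sorted(selectable), cache.computed)
```

In this replay the third step is a direct cache hit, and the last step ({Cave-Painting}) reuses the selectable tags stored for {Prehistoric, Cave-Painting}, since both tag sets filter the same resources.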
  • 10. Conclusions. A browsing strategy based on a suitable combination of inverted indexes and multilevel caches has been proposed to speed up the browsing process in Clavy. We are currently working on the empirical evaluation of our approach in Chasqui, a real-world repository in the field of Pre-Columbian American archaeology. Preliminary experiments suggest that the browsing cache can substantially speed up navigation with respect to a more basic, uncached strategy (based solely on inverted indexes). The price to pay is the overhead generated by cache management, as well as the higher memory footprint caused by the technique. However, the experiments also make apparent that: (i) the cache management overhead is compensated by eliminating the explicit computation of the information associated with many browsing states, and (ii) the cache size stays within reasonable ranges, even when it is not upper-bounded.
  • 11. Future Work.
    – To improve the cache strategy by combining it with our previous work on navigation automata.
    – To generalize the browsing strategy to support navigation through links among resources.
    – To combine browsing and search, letting users browse search results according to the browsing model described.