Improving Personal Tagging Consistency Through Visualization Of Tag Relevancy
1. HCI International 2009, 19-24 July 2009, San Diego, CA, USA. Improving Personal Tagging Consistency through Visualization of Tag Relevancy. Dr. Qin Gao*, Yusen Dai, and Kai Fu. Institute of Human Factors & Ergonomics, Dept. of Industrial Engineering, Tsinghua University.
2. Content: Introduction; Research Question; Methodology; Results & Discussion; Conclusion. Tagging consistency is important for users to organize things effectively and to retrieve them efficiently later on.
3. Introduction: Tagging has emerged as a new means of information organization and retrieval. Tagging is easy to use, flexible, and able to harvest the intelligence of the crowd. But there are many inconsistencies in tagging systems! Figure: tripartite model of a tagging system, from Halpin, Robu, & Shepherd, 2007.
4. Introduction: Vocabulary problems: “bad” tags (misspelt tags, badly encoded tags, mixed use of singulars and plurals, etc.) and inevitable semantic inconsistency (polysemy, synonymy, and basic level variations) (Golder & Huberman 2006). Consistency between taggers: the extent to which different users agree on the selection of certain tags for specific content. Allows a true representation of knowledge and multiple interpretations of the same content; trends towards stabilization (Golder & Huberman, 2005). Consistency within individual taggers: the extent to which an individual user selects the same tags for specific content at different points in time.
5. Introduction: Consistency within individual taggers is important to individual users and to the system. It affects the efficiency of information organization and retrieval tasks for individual users: organizing information is one of the main motivations for tagging (Ames and Naaman, 2007; Marlow et al., 2006), and indexing research shows that reliance on consistently used indexing cues is desirable for effective access to information. It also impacts users’ perceived usefulness of the system and their satisfaction. How to improve individual tagging consistency? Providing tag suggestions based on existing tagging patterns can shape users’ tagging behavior (Sen et al., 2006; Binkowski, 2006). How to present such suggestions? How to select tags for suggestion?
6. Visualization of Tags: The first generation of tag clouds: tag popularity is represented by visual cues. The second generation of tag clouds: semantic relations among tags are revealed by visualization. Figures: semantic clustering of tags by Montero & Solana (2006); tag clouds from Amazon, from Bateman 2007; Nielson, 2007.
7. Research Question: Goal of the study: to examine the effect of tag frequency visualization and semantic clustering on users’ tagging consistency. Hypothesis 1: visualization of the occurrence frequency of tags improves personal tagging consistency and reduces users’ workload. Hypothesis 2: visualization of inter-tag relevancy improves personal tagging consistency.
9. Methodology: Frequency visualization by font size. The font size was determined by a logarithmic function (formula not preserved in this transcript). Definition of font size levels: Current_i is the font size level of the current tag; O_i is the use frequency of the current tag. Figure: the relationship between font size level and tag frequency.
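The logarithmic function itself is not recoverable from the slide text, so the following is only a minimal sketch of how a logarithmic frequency-to-font-size mapping could look. The function name `font_size_level`, the default of 5 levels, and the normalization by the most frequent tag are all assumptions for illustration, not the authors' formula.

```python
import math


def font_size_level(freq, max_freq, levels=5):
    """Map a tag's use frequency onto a font size level in 1..levels.

    ASSUMPTION: log(freq) is scaled by log(max_freq); the study's actual
    function and number of levels are not given in the transcript.
    """
    if freq < 1 or freq > max_freq:
        raise ValueError("expected 1 <= freq <= max_freq")
    if max_freq == 1:
        return 1  # every tag was used once, so all share the smallest level
    ratio = math.log(freq) / math.log(max_freq)  # 0.0 at freq=1, 1.0 at max_freq
    return 1 + round(ratio * (levels - 1))


# A tag used 10 times, when the most frequent tag was used 100 times,
# lands in the middle of the scale under this assumed mapping.
print(font_size_level(10, 100))
```

The logarithm compresses the long tail of tag frequencies, so a tag used ten times more often grows by a fixed number of levels rather than ten times in size.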
10. Methodology: Visualization of tag relevancy: semantic clustering. Clusters of relevant tags were calculated based on co-occurrence similarity with the K-means algorithm, following Montero and Solana (2006). The approach was shown to reduce the semantic density of tag clouds significantly. Definition of the vector space: $t_i = (d_{1i}, d_{2i}, d_{3i}, \ldots, d_{ni})$. Similarity: $\cos(t_1, t_2) = \dfrac{t_1 \cdot t_2}{\lVert t_1 \rVert \, \lVert t_2 \rVert}$.
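The cosine similarity between two tag vectors can be sketched as follows. The slide does not say what the components d_ji contain, so this example assumes d_ji is 1 when tag i was assigned to document j and 0 otherwise; the helper names `tag_vectors` and `cosine` are hypothetical, and the K-means clustering step is not shown.

```python
import math


def tag_vectors(documents):
    """Build one vector per tag: t_i = (d_1i, ..., d_ni).

    ASSUMPTION: d_ji is 1 if tag i was assigned to document j, else 0.
    """
    tags = sorted({tag for doc in documents for tag in doc})
    return {tag: [1 if tag in doc else 0 for doc in documents] for tag in tags}


def cosine(t1, t2):
    """cos(t1, t2) = (t1 . t2) / (||t1|| * ||t2||); 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(t1, t2))
    norm1 = math.sqrt(sum(a * a for a in t1))
    norm2 = math.sqrt(sum(b * b for b in t2))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


# Three tagged documents (illustrative data, not from the study).
documents = [{"sky", "cloud"}, {"sky", "cloud", "city"}, {"city", "street"}]
vectors = tag_vectors(documents)
print(cosine(vectors["sky"], vectors["cloud"]))   # always used together
print(cosine(vectors["sky"], vectors["street"]))  # never used together
```

Tags that always appear on the same documents score close to 1 and tags that never co-occur score 0, which is the signal a clustering algorithm such as K-means would then group on.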
11. Methodology: Dependent variables. Tagging consistency: let A_i and B_i denote the sets of tags assigned to the same document in two different tagging sessions; the tagging consistency for this document and the overall tagging consistency are computed from these sets (formulas not preserved in this transcript). Workload: measured by NASA-TLX.
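The slide's formulas for per-document and overall consistency are missing from this transcript. As an illustration only, the sketch below uses the Jaccard index (shared tags divided by all tags used in either session) for one document and the mean over documents for the overall score; both choices are assumptions and may differ from the measure used in the study.

```python
def document_consistency(a, b):
    """Overlap of the tag sets A_i and B_i given to one document in two sessions.

    ASSUMPTION: Jaccard index |A & B| / |A | B|; the study's own formula
    is not preserved in the transcript.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # no tags in either session: treated as fully consistent
    return len(a & b) / len(union)


def overall_consistency(session_a, session_b):
    """ASSUMPTION: overall consistency is the mean per-document consistency."""
    if not session_a or len(session_a) != len(session_b):
        raise ValueError("sessions must be non-empty and cover the same documents")
    scores = [document_consistency(a, b) for a, b in zip(session_a, session_b)]
    return sum(scores) / len(scores)


# Two documents tagged in two sessions (illustrative data, not from the study).
first = [{"sky", "cloud"}, {"city"}]
second = [{"sky", "sunset"}, {"city"}]
print(overall_consistency(first, second))
```

In the example, the first document shares one of three distinct tags across sessions and the second is tagged identically, so the overall score is the average of 1/3 and 1.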
12. Methodology: Stimuli: 100 pictures selected from Flickr, tagged as “nature”, “city”, or “people”; 20 were stimuli, and the other 80 were filler pictures. Participants: 40 participants, including 10 females and 30 males, aged from 20 to 31; all were experienced tagging users. Procedure: two tagging sessions, with a disruptive interval in between.
13. Results: Testing of Hypothesis 1 (result tables not preserved in this transcript). a: Kruskal-Wallis test. *: significant differences at p < .05.
14. Results: Frequency visualization has no significant impact on tagging consistency. Frequency visualization reduces perceived physical demand significantly, but also increases mental demand. An interaction effect on physical demand (χ² = 6.4, p = .01).
15. Results: Testing of Hypothesis 2 (result table not preserved in this transcript). a: Kruskal-Wallis test. *: significant differences at p < .05.
16. Results: Semantic clustering improves personal tagging consistency significantly; H2 was supported. But there was no significant difference in workload or in the number of tags given by participants. The consistency level of participants tagging with semantic clustering is 12% higher than that of participants tagging without such visualization.
17. Discussion: Two types of tags. General categorical tags, influenced by the basic level: high recall but low accuracy; users have a strong bias to use them as first tags (Golder & Huberman, 2005); relatively more consistent. Descriptive/specific tags, ego-centered: high accuracy but low recall; all participants expressed their intention to tag consistently, but often failed to do so due to limited memory; major source of inconsistencies.
18. Discussion: Semantic clustering of tags helps users’ tag formulation tasks and improves their consistency in identifying and deciding on specific tags. It improves the performance of specific search and increases the attention towards tags in small fonts compared to other layouts (Schrammel et al., 2009). Frequency visualization does not provide support for search of specific tags. When used in combination with semantic clustering, it helps reduce perceived physical demand.
19. Conclusion: Visualizing the relevancy among tags has a significant positive effect on tagging consistency, whereas visualizing tagging frequency does not. This gives empirical support for the effort of visualizing semantic relationships among tags. When the tag relevancy is visualized, highlighting frequently used tags can reduce perceived physical demand; however, it increases perceived mental demand as well. Implications for professional indexer aid design.
20. Thank you for your attention. Contact: gaoqin@tsinghua.edu.cn http://trisha.snappages.com Q & A