Research Interests: Their Dynamics, Structures and Applications in Personalized Web Search
Yi Zeng 1, Erzhong Zhou 1, Xu Ren 1, Yulin Qin 1,3, Ning Zhong 1,2, Zhisheng Huang 4
1. International WIC Institute, Beijing University of Technology, China; 2. Maebashi Institute of Technology, Japan; 3. Carnegie Mellon University, USA; 4. Vrije University Amsterdam, the Netherlands
Web Intelligence Consortium
The Large Knowledge Collider Project: 13 partner institutions (from 11 countries, 2 from Asia)
Motivation
The Acquisition, Structure and Dynamics of Research Interests
Different Interests Evaluation Functions: An analysis of cumulative interests in different time intervals (Paul Erdos, with more than 1400 papers involved). The "basic level advantage" [Rogers2007]: concepts at the basic level appear more frequently than other terms [Wisniewski1989].
Weights of Interests: An analysis of Ricardo Baeza-Yates' weighted interests w(t(i), j).
Obtaining the Retained Interests: Memory retention (frequency and recency).
Pictures from [Schooler 1993]: Schooler, L. J. & Anderson, J. R.: Recency and Context: An Environmental Analysis of Memory. In: Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society, pp. 889-894, 1993.
Obtaining the Retained Interests (cont.)
[Zeng 2009a] Yi Zeng, Haiyan Zhou, Ning Zhong, Yulin Qin, Shengfu Lu, Yiyu Yao, Yang Gao. Cognitive Memory Retention Based Starting Point for Query Extension and Granular Selection. In: Cognitive Memory Component (v1), LarKC deliverable 2-3-1, coordinated by Jose Quesada and Yi Zeng, March 30, 2009.
[Zeng 2009b] Yi Zeng, Yiyu Yao, Ning Zhong. DBLP-SSE: A DBLP Search Support Engine. In: Proceedings of the 2009 IEEE/WIC/ACM International Conference on Web Intelligence, IEEE Computer Society, Milan, Italy, September 15-18, 2009.
[Maanen 2009] Leendert van Maanen, Julian N. Marewski. Recommender Systems for Literature Selection: A Competition between Decision Making and Memory Models. CogSci 2009, July 31-August 1, 2009.
Obtaining the Top N Interests: A comparative study of total research interests from 1990 to 2008 and retained interests in 2009 (based on both the power law and exponential law models). Papers published in different years contribute different values.
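The comparison between total and retained interests can be illustrated with a minimal sketch. It assumes that a term's retained value is the sum of its yearly counts, each discounted by the age of the papers, using either a power-law or an exponential decay. The function names, the parameters `scale` and `decay`, and the sample counts are illustrative assumptions, not values from the talk.

```python
import math
from collections import defaultdict

# Hypothetical input: for each year, how many paper titles mention each term.
yearly_counts = {
    2006: {"search": 3, "web": 2},
    2007: {"search": 1, "query": 2},
    2008: {"query": 3, "mining": 1},
}

def retained_interests(yearly_counts, now, model="power", scale=1.0, decay=1.0):
    """Sum each term's yearly counts, discounted by the age of the papers."""
    totals = defaultdict(float)
    for year, counts in yearly_counts.items():
        age = now - year                  # years elapsed since publication
        if age < 1:
            continue                      # only past years contribute
        if model == "power":
            weight = scale * age ** (-decay)
        elif model == "exponential":
            weight = scale * math.exp(-decay * age)
        else:
            raise ValueError(f"unknown model: {model}")
        for term, count in counts.items():
            totals[term] += count * weight
    return dict(totals)

def top_n(scores, n):
    """Return the n highest-scoring terms, ties broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:n]

# decay=0.0 gives every year the same weight, i.e. plain total interests.
total = retained_interests(yearly_counts, now=2009, model="power", decay=0.0)
power = retained_interests(yearly_counts, now=2009, model="power")
expo = retained_interests(yearly_counts, now=2009, model="exponential")

print(top_n(total, 3))
print(top_n(power, 3))
print(top_n(expo, 3))
```

With these sample counts the older term "web" drops out of the top 3 once decay is applied, while the recent term "mining" enters it, which is the kind of difference the comparison is meant to expose.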
Building and Analyzing the Structure of Research Interests: An Author's Research Interest Evolution Network
Observed phenomena:
[1] Main research interests (pivotal nodes) change dynamically over time: older ones disappear and new ones emerge.
[2] Relations among research interests vary over time (they strengthen or weaken).
[3] Main research interests are closely related to each other. The closeness grows stronger over time, bringing the degree of separation to around 2-3. This indicates that an author's research interests are not isolated but highly related.
[4] Many top research interests (pivotal nodes) remain active in the interest network (e.g. search, analysis, match).
Figure 7. Ricardo's research interest dynamic evolution network from 1991 to 2009 (based on the DBLP publication list, with 232 papers involved). The network is a graph with weighted edges and weighted vertices.
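A graph with weighted vertices and weighted edges can be held in two counters. The sketch below assumes that a vertex weight is the number of titles containing a term and that an edge weight is the number of titles in which two terms co-occur. Both definitions and the sample term lists are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def build_interest_network(term_sets):
    """Return (vertex weights, edge weights) for a term co-occurrence graph."""
    vertex_weight = Counter()
    edge_weight = Counter()
    for terms in term_sets:
        unique = sorted(set(terms))                   # one vote per title
        vertex_weight.update(unique)
        edge_weight.update(combinations(unique, 2))   # keys are sorted pairs
    return vertex_weight, edge_weight

# Hypothetical term lists, one per paper title.
titles = [
    ["web", "search", "analysis"],
    ["search", "query"],
    ["web", "search"],
]
vertices, edges = build_interest_network(titles)

print(vertices.most_common(2))
print(edges[("search", "web")])
```

Building one such network per time window and comparing the edge weights between windows is one way to observe relations strengthening or weakening.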
Statistical Characteristics on the Dynamics of Total Research Interests. Pictures from math.ucsd.edu and math.tsukuba.ac.jp.
Figure 2: Power-law distribution on weights of research interests for Leonhard Euler (publication list from Euler's Archive, with 856 papers), Paul Erdos (publication list from Erdos' publication collection project (1929-1989) and MathSciNet (1990-2004), with 1437 papers involved; titles in German, French and Hungarian were translated with Google translation and Babylon translation), and Ricardo Baeza-Yates (from DBLP). Titles were preprocessed to handle meaningless words, tense, singular and plural forms, third person, etc.
Interests with Self Organized Criticality
Figure 11. Zdzislaw Pawlak's interest statistics showing self-organized criticality. (Figure label: Rough Set.)
Figure 12. Zdzislaw Pawlak's interest connection network (1984-2008, with 62.1% of interests directly connected to "rough").
Timing characteristics of research interests: Dynamic characteristics of Interest Longest Duration and Interest Cumulative Duration, inspired by Human Dynamics [Barabasi 2005].
Figure 9: Ricardo's research interest lasting time and appearance time distribution statistics.
Figure 9(b): the probability of having n research interests whose lasting time is a fixed time interval. The statistical distribution is approximated by linear fit.
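A common way to approximate a power-law distribution by linear fit is to fit a straight line to the data on log-log axes. The slope is then the exponent. The sketch below is an assumption about how such a fit could be done, using ordinary least squares on synthetic data. It is not the fitting procedure or the data from the talk.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y); returns (exponent, prefactor)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)

# Synthetic data generated from counts = 80 * duration ** -2.
durations = [1, 2, 4, 8]
counts = [80.0, 20.0, 5.0, 1.25]

exponent, prefactor = fit_power_law(durations, counts)
print(round(exponent, 3), round(prefactor, 3))
```

Least squares on log-log axes is simple but can be biased for noisy tails, so the fitted exponent is best read as a rough estimate.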
Explanations on the Observed Power Law Distributions: the "rich get richer" effect [Simon 1955].
The picture is from: Peter Csermely. Weak Links: Stabilizers of Complex Systems from Proteins to Social Networks, Springer, 2006.
[Simon 1955] Simon, H.: On a class of skew distribution functions. Biometrika 42, 425–440, 1955.
[Barabasi 1999] Barabasi, A.L. and Albert, R.: Emergence of scaling in random networks. Science 286, 509–512, 1999.
A Comparative Study of Different Interest Evaluation Methods: Zhisheng Huang's interests evaluated by CI, ILD (Interest Longest Duration) and ICD (Interest Cumulative Duration).
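The two duration measures can be sketched under a simple assumption: a duration is a run of consecutive years in which a term appears, the longest duration is the longest such run, and the cumulative duration is the sum of all runs. These definitions and the sample years are assumptions made for illustration.

```python
def interest_durations(years):
    """Split the years in which a term appears into runs of consecutive years."""
    runs = []
    for year in sorted(set(years)):
        if runs and year == runs[-1][-1] + 1:
            runs[-1].append(year)      # extends the current run
        else:
            runs.append([year])        # a gap starts a new run
    return [len(run) for run in runs]

def longest_duration(years):
    """Length of the longest run of consecutive years."""
    return max(interest_durations(years), default=0)

def cumulative_duration(years):
    """Total number of distinct years in which the term appears."""
    return sum(interest_durations(years))

# Hypothetical years in which one term appears in an author's titles.
appearances = [2001, 2002, 2003, 2006, 2008, 2009]
print(interest_durations(appearances))
print(longest_duration(appearances), cumulative_duration(appearances))
```

A term that keeps returning after gaps scores high on cumulative duration but not on longest duration, so the two measures can rank the same interests differently.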
Social Network based Group Interests Models: How to acquire the top N interests?
Carlos Castillo: Web, PageRank, Network, Spam, Search, Detection, Analysis, Link, Content
Ricardo A. Baeza-Yates: Web, Search, Retrieval, Information, Query, Analysis, Challenge, Engine, Mining
Overlap of User Interests and Group Interests: Top 9 retained interests of a user and his group's retained interests (Ricardo A. Baeza-Yates, based on the May 2008 version of SwetoDBLP).
Top 9 Retained Interests | Value | Top 9 Group Retained Interests | Value
Web | 7.81 | Search | 35
Search | 5.59 | Retrieval | 30
Retrieval | 3.19 | Web | 28
Information | 2.27 | Information | 26
Query | 2.14 | System | 19
Engine | 2.10 | Query | 18
Mining | 1.26 | Analysis | 14
Challenge | … | Text | …
Analysis | … | Model | …
A Step Forward: Semantic Similarity: Obtaining More Accurate Interest Descriptions. Consistent interests with consideration of semantic similarity.
Figure labels: Carlos Castillo (Web, PageRank, Network, Spam, Search, Detection, Analysis, Link, Content) and Ricardo A. Baeza-Yates (Web, Search, Retrieval, Information, Query, Analysis, Challenge, Engine, Mining).
Semantic Similarity and Interests Re-ranking: Semantic similarity is judged by Normalized Google Distance (NGD) [Rudi and Paul 2007], with Google and Bing as the knowledge base.
interest x | interest y | NGD
search | retrieval | 0.529
search | query | 0.483
search | pagerank | 0.490
retrieval | query | 0.403
retrieval | pagerank | 0.497
query | pagerank | 0.460
logic | reasoning | 0.239
logic | semantic | 0.276
ontology | semantic | -0.003
reasoning | semantic | 0.050
logic | ontology | 0.332
ontology | reasoning | 0.080
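For reference, the Normalized Google Distance between two terms x and y is computed from page counts: NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f gives the number of pages containing the term or terms and N is the total number of indexed pages. The sketch below evaluates that formula. The hit counts and the value of N are made-up numbers, since real counts depend on the search engine queried.

```python
import math

def normalized_google_distance(fx, fy, fxy, n):
    """NGD from hit counts: fx and fy for each term, fxy for both, n pages in total."""
    if min(fx, fy, fxy) <= 0:
        return math.inf                    # terms never (co-)occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

# Hypothetical counts: 1000 and 100 pages per term, 10 pages with both, 10**6 pages overall.
ngd = normalized_google_distance(1000, 100, 10, 10**6)
print(round(ngd, 3))

# Two terms that always appear together are at distance 0.
print(normalized_google_distance(500, 500, 500, 10**6))
```

Smaller values mean closer terms. A slightly negative value such as the -0.003 in the table can only arise when the reported joint count exceeds one of the single-term counts, which happens because search engines return estimated counts.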
Semantic Similarity and Interests Re-ranking (cont.): Interests Re-ranking Function
Interests ranking from three perspectives (one column per perspective, rows in rank order).
(a) Without semantic similarity based re-ranking:
Agent | Ontology | Ontology
Web | Web | Semantic
Ontology | Semantic | Reasoning
Logic | Reasoning | Web
Semantic | Inconsistent | Inconsistent
Reasoning | Logic | Prolog
Dynamic | Prolog | Logic
Inconsistent | Dynamic | Agent
Prolog | Agent | Dynamic
(b) With semantic similarity based re-ranking:
Agent | Ontology | Ontology
Web | Semantic | Semantic
Ontology | Reasoning | Reasoning
Semantic | Logic | Logic
Reasoning | Web | Web
Logic | Inconsistent | Inconsistent
Dynamic | Prolog | Prolog
Inconsistent | Dynamic | Agent
Prolog | Agent | Dynamic
Similarity Measures
Illustration (speech bubbles): "Just ask me! I know everything and believe me!"; "I know Chemistry"; "I know Medical Science"; "I know Cognition"; "I know Mathematics"; "My question is about Medical Science".
Evaluations on Normalized Medline Distance (NMD)
Experts evaluated 30 Medline term pairs. Pearson correlation: NMD gets the highest value among the measures, 0.792. T-test significance: 0.995.
Experts from AstraZeneca evaluated 90 randomly generated pairs. Pearson correlation: NMD 0.736 vs NGD 0.531. Average: experts 0.590, NMD 0.390, NGD 0.289. NMD is closer to the experts' evaluation.
Motivation for User Interests Description. The Linked Open Data figure is from http://richard.cyganiak.de/2007/10/lod/
Defining User Interests: < Interest URI, AgentURI, Property(i), Value(i), Time(i) >
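The five-part tuple above maps naturally onto a small record type. The sketch below is one possible in-memory representation. The class name, field names, and the example.org URIs are hypothetical, and the property name is borrowed from the e-FOAF:interest vocabulary only as sample data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InterestStatement:
    """One <Interest URI, Agent URI, Property(i), Value(i), Time(i)> tuple."""
    interest_uri: str
    agent_uri: str
    prop: str
    value: float
    time: int          # here: the year the value refers to

def latest_value(statements, interest_uri, prop):
    """Return the most recent value recorded for an interest and property, or None."""
    matching = [s for s in statements
                if s.interest_uri == interest_uri and s.prop == prop]
    if not matching:
        return None
    return max(matching, key=lambda s: s.time).value

# Hypothetical URIs and values, for illustration only.
statements = [
    InterestStatement("http://example.org/interest/web", "http://example.org/agent/a1",
                      "e-foaf:retained_interest_value", 6.5, 2008),
    InterestStatement("http://example.org/interest/web", "http://example.org/agent/a1",
                      "e-foaf:retained_interest_value", 7.81, 2009),
]

print(latest_value(statements, "http://example.org/interest/web",
                   "e-foaf:retained_interest_value"))
```

Keeping the time in every statement, rather than overwriting a single value, preserves the history needed to study how an interest changes.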
The e-FOAF:interest Vocabulary
Vocabulary branches: E-foaf:interest Complete, E-foaf:interest Basic, E-foaf:interest Complement.
Vocabulary Branch | Vocabulary | Type
E-foaf:interest Basic | e-foaf:interest | Class
E-foaf:interest Basic | e-foaf:interest_value | Attribute
E-foaf:interest Basic | e-foaf:interest_value_updatetime | Attribute
E-foaf:interest Basic | e-foaf:interest_appeared_in | Attribute
E-foaf:interest Basic | e-foaf:interest_appeare_time | Attribute
E-foaf:interest Basic | e-foaf:interest_has_synonym | Attribute
E-foaf:interest Basic | e-foaf:interest_co-occur_with | Attribute
E-foaf:interest Basic | e-foaf:cumulative_interest_value | Attribute
E-foaf:interest Complement | e-foaf:interest_cumulative_duration | Attribute
E-foaf:interest Complement | e-foaf:retained_interest_value | Attribute
E-foaf:interest Complement | e-foaf:interest_longest_duration | Attribute
Integration of WI and e-FOAF:interests by FOAF community. By Balthasar A.C. Schopman from Vrije University Amsterdam.
Integration of WI and e-FOAF:interests by FOAF community (cont.): The wi:ComplexInterest concept as a graph with relations. This photo was taken by Professor Lora Aroyo from Vrije University Amsterdam at Vocamp 2010.
Computer Scientists’ Research Interests Dataset
The Utilization of e-FOAF:interests Vocabulary: The SPARQL endpoint for DBLP user interests is available at http://www.wici-lab.org/wici/dblp-sse/ [Dieter & Frank 2007]
Bring User Interests to Literature Search Refinement. Diagram labels: Pre-existing Knowledge; Search; Acquired Knowledge. Outcome: useful literature that is relevant to both the query and the authors' research interests.
Search Refinement by Interests from Different Perspectives
DBLP-SSE: DBLP Search Support Engine
Log in: Dieter Fensel
Top 9 interests: Web, Service, Semantic, Architecture, Model, Ontology, Knowledge, Computing, Language
Query: Artificial Intelligence
List 1: without current interests constraints (Top 5 results)
* PROLOG Programming for Artificial Intelligence, Second Edition.
* Artificial Intelligence Architectures for Composition and Performance Environment.
* Artificial Intelligence in Music Education: A Critical Review.
* Music, Intelligence and Artificiality. Artificial Intelligence and Music Education.
* Musical Knowledge: What can Artificial Intelligence Bring to the Musician?
* ...
List 2: with current interests constraints (Top 5 results)
* Web Intelligence and Artificial Intelligence in Education.
* Artificial Intelligence Exchange and Service Tie to All Test Environments (AI-ESTATE)-A New Standard for System Diagnostics.
* Semantic Model for Artificial Intelligence Based on Molecular Computing.
* Open Information Systems Semantics for Distributed Artificial Intelligence.
* Artificial Intelligence and Financial Services.
* …
Diagram labels: The DBLP dataset; Web; Semantic; Knowledge; Sub datasets pre-selection
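The effect of interest constraints on a result list can be imitated with a very small sketch. It assumes plain case-insensitive substring matching of interest terms against titles and keeps only titles with at least one match, most matches first. This is an illustrative stand-in, not the query mechanism of DBLP-SSE. The function name is hypothetical and the titles are taken from the two lists above.

```python
def refine_by_interests(titles, interests):
    """Keep titles that mention at least one interest term, most matches first."""
    lowered = [term.lower() for term in interests]
    scored = []
    for title in titles:
        text = title.lower()
        hits = sum(1 for term in lowered if term in text)
        if hits:
            scored.append((hits, title))
    scored.sort(key=lambda pair: -pair[0])    # stable sort: ties keep input order
    return [title for _, title in scored]

interests = ["Web", "Service", "Semantic", "Architecture", "Model",
             "Ontology", "Knowledge", "Computing", "Language"]

results = [
    "PROLOG Programming for Artificial Intelligence, Second Edition.",
    "Web Intelligence and Artificial Intelligence in Education.",
    "Artificial Intelligence in Music Education: A Critical Review.",
    "Semantic Model for Artificial Intelligence Based on Molecular Computing.",
    "Artificial Intelligence and Financial Services.",
]

for title in refine_by_interests(results, interests):
    print(title)
```

Substring matching is crude (it treats "Services" as a match for "Service" but knows nothing about synonyms), which is one reason a semantic similarity measure is useful on top of it.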
Search Results without any Refinement
Search Results with Interests-based Refinement: http://www.wici-lab.org/wici/dblp-sse/
User Evaluation of Refinement Strategy
Xu Ren, Yi Zeng, Yulin Qin, Ning Zhong, Zhisheng Huang, Yan Wang, and Cong Wang. Social Relation Based Search Refinement: Let Your Friends Help You! In: Proceedings of the 2010 International Conference on Active Media Technology, Lecture Notes in Computer Science 6335, 475-485, 2010.
Scalability for Query Time
| Unrefined query | Refined query based on interests | Interest based selection before querying
Query Time | medium | much slower | the fastest
Results | may be very far from user needs | much closer to user needs | equivalent to "Refined query based on interests"
With selection: approximately 80% of the time can be saved.
The Effect of Query Constraints Numbers
Recall and Spent Time (Unrefined queries vs Interest-based Selection)
Context-Aware Linked Life Data Search
Publications related to this talk
Thank you! URL:   http://www.wici-lab.org/wici/~yizeng  Email:  yizeng@bjut.edu.cn
