Web Recommender Project Final Report
                                   Wei Chen, Yue (Jenny) Cui


Motivation
People use the web to browse information, but there is so much of it that searching
takes time. It would therefore be helpful to make the web-browsing experience
convenient, fast, and accurate. An existing solution is the search engine: in a typical
scenario, the user types in a query and the search engine returns relevant pages. This
is not fully automatic, however; it takes user effort to formulate and type the query.
Our goal is to develop a tool that automatically generates queries for the user while
s/he is reading a web page, and then uses those queries to recommend relevant web pages.


Problem Statement
What is a Web Recommender?
A web recommender is a web-browsing tool which recommends relevant web pages to the user
while s/he is reading a page.

Why is it important?
A web recommender provides a convenient way to browse the web: it automatically
recommends relevant information, requiring less effort to formulate and type queries. At
the same time, it retains the benefits of state-of-the-art search engines.

Why is it hard?
Making queries from a web page is a keyword summarization problem, which is still an
active research topic. Also, search engines are not perfect: they can return dead links
and irrelevant pages. Furthermore, it is often hard to define what it means to be
relevant, since relevance depends on the reading goal. All of these issues bear on web
recommendation; we do not attempt to solve them all. In this project, we focus on the
first issue: extracting queries from a web page.

Link to Vision Statement
Goals for this project (solution)
We have three goals for this project:

   (1) Provide a software framework for Web Recommendation
   (2) Provide basic recommendation algorithms
   (3) Propose an evaluation prototype

The first goal defines the basic functionality of the software. The second goal serves
three purposes. First, the algorithms provide the web recommender's core service.
Second, they offer baselines for future research on web recommendation. Finally, they
can serve as a tutorial teaching people how to develop their own algorithms on top of
our software framework.

Link to Vision Statement

Link to Domain Model


Requirements
Functional Requirements
   (1) Given a web page as input, the system should be able to find a list of relevant web pages.
   (2) The system should provide three recommendation algorithms.
       a. Baseline algorithm: uses simple string processing techniques
       b. HTML-Structure-based algorithm: uses HTML structure features
       c. Semantics-based algorithm: uses NLP techniques (named entity recognizer) to
          extract features
   (3) The system should provide a simple GUI for evaluation.

Non-functional Requirements
Recommendation results should be retrievable within 5 seconds.

Link to Requirement Analysis


Design
Our design has three components: a general software framework design, algorithm design, and
evaluation task design.

Software Framework Design
Class Diagram:
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/ClassDiagramFinal
The main algorithm of WebRecommender is implemented in the recommend() method. The util
package provides the HTML-parsing, basic text-processing, and NLP tools that the
recommendation algorithms need. QueryFilter is used for key-term selection;
QueryFormulator combines multiple queries.
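A minimal sketch of how these pieces interact (only recommend(), QueryFilter, and QueryFormulator come from the design; the filter()/formulate() method names, the length-based filtering rule, and the search_fn hook are illustrative assumptions):

```python
class QueryFilter:
    """Selects key terms from candidate tokens."""
    def filter(self, tokens):
        return [t for t in tokens if len(t) > 3]  # placeholder selection rule

class QueryFormulator:
    """Combines several queries into one, keeping first occurrences."""
    def formulate(self, queries):
        seen, merged = set(), []
        for q in queries:
            for term in q:
                if term not in seen:
                    seen.add(term)
                    merged.append(term)
        return merged

class WebRecommender:
    def __init__(self, search_fn):
        self.search_fn = search_fn          # e.g. a search-engine client
        self.query_filter = QueryFilter()
        self.formulator = QueryFormulator()

    def recommend(self, pages_tokens):
        """Turn tokenized page content into a query and fetch results."""
        queries = [self.query_filter.filter(t) for t in pages_tokens]
        query = self.formulator.formulate(queries)
        return self.search_fn(query)
```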

The sequence diagram illustrates an example message flow:
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/SequenceDiagram


Algorithm Design
We designed three algorithms: baseline algorithm, HTML-structure-based algorithm and
semantics-based algorithm. The algorithms are described below.

Baseline Algorithm
   1. Strip off HTML tags (e.g. </html>)
   2. Remove non-word tokens (e.g. “/**/”)
   3. Remove stop words (e.g. “the”)
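The three steps above can be sketched as follows (the stop-word set and the regular expressions are illustrative stand-ins for the lists actually used in the project):

```python
import re

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}  # illustrative subset

def baseline_query(html: str) -> list:
    # 1. Strip off HTML tags such as </html>
    text = re.sub(r"<[^>]+>", " ", html)
    # 2. Remove non-word tokens such as "/**/" by keeping alphabetic tokens only
    tokens = re.findall(r"[A-Za-z]+", text)
    # 3. Remove stop words such as "the"
    return [t for t in tokens if t.lower() not in STOP_WORDS]
```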

HTML Structure-based Algorithm
  1. Parse HTML page
  2. Extract text content from node <title> and <a>
  3. Remove stop words (e.g. “the”)
  4. Select the 10 most frequent words
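A sketch of these four steps using Python's standard-library HTML parser (the stop-word set is again an illustrative stand-in, not the project's list):

```python
import re
from collections import Counter
from html.parser import HTMLParser

class TitleAnchorText(HTMLParser):
    """Collects word tokens found inside <title> and <a> nodes."""
    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting level inside <title>/<a>
        self.words = []
    def handle_starttag(self, tag, attrs):
        if tag in ("title", "a"):
            self.depth += 1
    def handle_endtag(self, tag):
        if tag in ("title", "a") and self.depth:
            self.depth -= 1
    def handle_data(self, data):
        if self.depth:
            self.words += re.findall(r"[A-Za-z]+", data)

def structure_query(html: str, k: int = 10) -> list:
    stop = {"the", "a", "an", "of", "and", "to"}  # illustrative stop list
    parser = TitleAnchorText()
    parser.feed(html)                             # steps 1-2: parse, extract text
    counts = Counter(w for w in parser.words
                     if w.lower() not in stop)    # step 3: remove stop words
    return [w for w, _ in counts.most_common(k)]  # step 4: top-k frequent words
```

Note that counting is case-sensitive here, which is consistent with Table 1 below, where both "entropy" and "Entropy" appear in the output query.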

Semantics-based Algorithm
   1. Strip off HTML tags (e.g. </html>)
   2. Tag the page using Stanford named entity tagger
   3. Remove non-word tokens (e.g. “/**/”)
   4. Remove stop words (e.g. “the”)
   5. Select named entities with highest frequency (top 5)
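The project uses the Stanford named entity tagger for step 2; the sketch below assumes its output is already available as (token, tag) pairs and shows only the frequency-based selection of step 5:

```python
from collections import Counter

def top_entities(tagged_tokens, k=5):
    """Select the k most frequent named entities.

    tagged_tokens: (token, tag) pairs as a named entity tagger might
    emit, e.g. ("Boltzmann", "PERSON"); the tag "O" marks non-entities.
    """
    counts = Counter(tok for tok, tag in tagged_tokens if tag != "O")
    return [tok for tok, _ in counts.most_common(k)]
```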

Example Query Comparison
Input page: http://en.wikipedia.org/wiki/Entropy

Table 1. Example query comparison

Algorithm        Output Query
Baseline         Entropy, free, encyclopedia, Jump, search, article
HTML-Structure   ISBN, edit, entropy, thermodynamics, Entropy, energy,
                 system, law, heat, thermodynamic
Semantic         ISBN, University, Press, Boltzmann, John


Evaluation Design

Evaluation Form
We designed an evaluation form with three fields: input page, recommended page, and
relevance score. We ask evaluators to score each recommended page; the relevance score
takes two values, 1 for “relevant” and 0 for “irrelevant”. The form also contains two
fields that are hidden from the evaluator and used only for analysis: the algorithm that
produced the recommended page and the page's rank.
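The form's five fields map naturally onto a small record type; a sketch (the field names are ours, not necessarily those used in the implementation):

```python
from dataclasses import dataclass

@dataclass
class EvaluationRecord:
    input_page: str        # visible: URL of the page being read
    recommended_page: str  # visible: URL of the recommended page
    relevance: int         # visible: filled in by the evaluator (1 or 0)
    algorithm: str         # hidden: which algorithm produced the page
    rank: int              # hidden: the page's position in the list
```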

Evaluation Criteria
We used the modified Average Precision to aggregate relevance scores. The standard average
precision is calculated as the sum of precision at each position divided by the total number of
relevant pages. In our modified version, we replace the number of relevant pages in the
denominator with the total number of retrieved pages.

ModifiedAveP = ( Σ_{r=1}^{N} P(r) · rel(r) ) / N
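The calculation described above can be sketched as:

```python
def modified_average_precision(rels):
    """rels: 1/0 relevance judgments for the ranked list, rank 1 first.

    Sums precision P(r) at every relevant rank r, then divides by N,
    the total number of retrieved pages (rather than the number of
    relevant pages, as standard average precision would).
    """
    n = len(rels)
    total = 0.0
    for r, rel in enumerate(rels, start=1):
        if rel:
            total += sum(rels[:r]) / r   # precision at rank r
    return total / n
```

For example, judgments [1, 0, 1, 0, 0] give (1/1 + 2/3) / 5 = 1/3.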
An example of the calculation of modified average precision is shown in our final project
presentation:

link to Final Presentation

Test Data Selection
Our criterion for test data selection is that it has to span multiple dimensions. The dimensions
we considered include:
   1. Popular vs. Unpopular (e.g., “Harry Potter” vs. “Wei Chen”)
   2. Ambiguous vs. Unambiguous (e.g., “Entropy” vs.“Sushi”)
   3. New vs. Old (e.g., “Waterboarding” vs. “Entropy”)
   4. Procedural vs. Conceptual (e.g., “How to” vs. “Entropy”)
   5. Technological vs. Mass media (e.g., “Entropy” vs. “Harry Potter”)
Based on the test data selection criteria, we selected 5 input pages from 5 topics:
   1. “Harry Potter” http://en.wikipedia.org/wiki/Harry_potter
   2. “Waterboarding” http://en.wikipedia.org/wiki/Waterboarding
   3. “Wei Chen@CMU homepage” http://www.cs.cmu.edu/~weichen/
   4. “Entropy (thermodynamics)” http://en.wikipedia.org/wiki/Entropy
   5. “How to make Sushi” http://www.wikihow.com/Make-Sushi

Evaluation GUI
Link to GUI Demo

Our evaluation GUI is composed of three functional areas: a top panel where the user
types in the URL of the input web page, a left panel where the URLs of the recommended
web pages are displayed, and a content panel that displays the page the user selects.
The top panel includes an address bar and a Recommend button. After the user types in a
URL, pressing Enter shows the input page in the large content panel, and clicking the
Recommend button displays the URLs of the recommended pages in the left panel.



Evaluation and Results
One important question we want to answer in this project is how well each of our
algorithms performs, so we need an experiment that measures user satisfaction fairly.
Our first hypothesis is that the performance of our algorithms will differ across kinds
of topics, though at the design stage we were not sure how large the variation would be.

Our second hypothesis is that users will disagree on how useful the recommended web
pages are, because a user who changes his goal changes his evaluation criteria at the
same time. To avoid non-standard criteria, we limit evaluation to the relevance of the
recommended pages, and our ReadMe file specifies the definition of relevance for each
topic. In this way we believe we can measure user satisfaction with each of our
algorithms.

Experimental Design
link to example evaluation form

We have three algorithms: baseline, semantic, and structure. We chose 5 topics for the
experiment. It is important that the recommended web pages contain the information the
user needs, but it is equally important that relevant pages appear near the top of the
recommendation list. Combining each algorithm with the 5 topics yields 15 categories in
total (e.g., (baseline, topic 1) is one category). We use the top five recommended pages
from each algorithm, so each rater evaluates 75 recommended pages in total. Whenever a
rater judges a recommended page relevant, he enters a 1 in the score column of the
evaluation form.
Participants
We have a total of 5 participants, three female and two male. All of the raters hold at
least a master's degree in computer science. One is a native English speaker; the other
four are not.

Experimental procedure
Before doing the evaluation, all raters read the ReadMe file, which gives the definition
of relevance for each topic.

Results
Link to a presentation of evaluation results

Our results show that our first hypothesis is correct: topics that are popular and well
resourced on the web receive better scores. The topic “Harry Potter” has the highest
relevance score, and all three of our algorithms recommended satisfying web pages; we
think this is because there are so many pages about Harry Potter that relevant ones are
easy to find. The topic “Waterboarding” has the highest number of invalid web pages. We
think this is because waterboarding is a typical news topic: most of the time few pages
discuss it, but once it becomes a news headline many resources are added to the web, and
when it drops out of the headlines many of those resources are probably deleted, which
would cause the invalid links. The topic “How to make Sushi” has the lowest relevance
score; we think this is because it concerns a specific procedure, which makes the
definition of relevance stricter.

Among our three algorithms, the structure algorithm performs best in this experiment.
The difference between the baseline and structure algorithms is significant (p < 0.001);
the difference between the baseline and semantic algorithms is not significant.

The structure algorithm performs best on the topic “Entropy”, with a relevance score of
1. This is a very promising result: if the target users of the web recommender are
people in academia, they would use it to find technical information. For example, we
could combine the web recommender with Wikipedia, and users would then get more
comprehensive information on the topics they are interested in. The structure algorithm
also performs very well on “How to make Sushi”, whereas the baseline and semantic
algorithms perform worst there. We think this is because the structure algorithm uses
key terms extracted from anchor tags, and since those tags point to other relevant web
pages, the key terms extracted from them are much more relevant than key terms extracted
from other parts of the page.
Error Analysis
Table 2. Key terms and score of all the categories of topic and algorithm

Topic               Algorithm   Key Terms                                            Score
Entropy             Baseline    Entropy free encyclopedia Jump search article        0.5190
                    Semantic    ISBN University Press Boltzmann John                 0.6926
                    Structure   ISBN edit entropy thermodynamics Entropy energy
                                system law heat thermodynamic                        1.0000
Harry Potter        Baseline    Harry Potter free encyclopedia Jump search           0.9032
                    Semantic    Harry Potter Voldemort BBC Rowling                   0.9686
                    Structure   Potter Harry Rowling Witch Deathly Goblet Magic
                                witchcraft Film Hallows                              0.9820
Waterboarding       Baseline    Waterboarding free encyclopedia Jump search
                                Cambodia Khmer                                       0.5070
                    Semantic    CIA United York Bush States                          0.1738
                    Structure   Torture News York Waterboarding Times Press CIA
                                ISBN torture Washington                              0.8564
Wei Chen            Baseline    Wei Chen graduate student Language Technologies
                                Carnegie Mellon research advisor                     0.7444
                    Semantic    Chen Wei University NMF Johns                        0.4570
                    Structure   States Natural Language Mental Fahlman Word Jack
                                Lingual AAAI Wei                                     0.7130
How to make Sushi   Baseline    Make 10 steps wikiHow Manual Edit RSS Create
                                account log prepared                                 0.1274
                    Semantic    RL Commons Article Creative Nicole                   0.0000
                    Structure   Sushi Make edit Ads Roll wikiHow Article make
                                Rice Show                                            0.7444


Overall the semantic algorithm's performance is not as good as we expected; we expected
it to be at least as good as the structure algorithm. The semantic algorithm scores zero
on the topic “How to make Sushi”. Looking into the causes, we find that the named entity
recognizer we use can only identify names of persons and organizations, so it misses the
important keyword “Sushi”. Looking at the other topics, we find that the semantic
algorithm does identify important named entities that are relevant to the topic and
useful to the algorithm, but using these named entities alone is not sufficient. We
think that combining the keywords in the page title with the extracted named entities as
query input would give much better results in the future.

For the topic “Entropy”, both the semantic and structure algorithms score better than
the baseline. We think the reason is that the baseline algorithm includes some “noise”
key terms, which hurt its performance and lead it to return some irrelevant web pages.
It is also a promising sign that the semantic and structure algorithms can make a
difference in the recommendation results.

All three algorithms perform very well on the topic “Harry Potter”. We think there are
two reasons. First, the definition of relevance for a popular topic is much broader:
anything about Harry Potter, whether the book, the movie, the author, or the actors, is
considered relevant. Second, there are so many web pages about Harry Potter that it is
easy to find 5 relevant ones.

The error pages for the topic “Waterboarding” are caused by invalid links. Because
“Waterboarding” is a time-sensitive topic, the content of the recommended pages could
have been deleted by the time of evaluation. Looking at the links, we see that they
usually point to user-generated content such as forum pages.

For the topic “Wei Chen”, very few relevant web pages exist. The structure algorithm
returns only 4 pages, but two of them are relevant. So we think the major cause of the
error pages is the scarcity of relevant pages on the web.

The topic “How to make Sushi” is a difficult case for the web recommender. One problem
is caused by the keyword “make”: the algorithm returned pages about how to make
something other than sushi. The other problem is that much of the content on this topic
is user generated, so some of the recommended pages came from forums and were invalid at
the time of evaluation.

Conclusion
This experiment gave us a lot of feedback on the algorithms used in the web recommender.
We now know how topics affect the recommendation results of each algorithm. We can also
conclude that our algorithms make a significant difference in the recommendation
results, and we can probably predict for which kinds of topic the web recommender will
be most useful.


Software Engineering Techniques used in this project
We followed a standard software engineering process in this project: requirements
analysis, design, implementation, and evaluation. We used an iterative development
process in the design, implementation, and evaluation phases. Table 3 summarizes the
iterations in each phase and the main changes we went through.
Table 3. Highlights of software engineering process
              Design                       Implementation                  Evaluation
Iteration 1   1. Initial design of         1. Initial implementation       1. Pilot study
              framework                    of framework                    2. Weighted average
              2. Composite-pattern-        2. Implemented evaluation          relevance score
              based evaluation design      component based on
                                           composite pattern
Iteration 2   1. Added query formulator    1. Implemented query            1. 5 raters, 5 input
              and query filter             formulator and query filter        pages
              2. Simplified evaluation     2. Implemented simplified       2. Modified average
              design                       version of evaluation GUI          precision



What changed over the semester?
As Table 3 shows, we made changes in each development phase. Major changes are
documented in several meeting notes:

Changes in Main Framework:
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes02-02-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes02-11-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes02-18-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes03-04-2009

Changes in Evaluation Component:
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes04-06-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes04-20-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes04-22-2009


Our evaluation GUI went through several rounds of changes.
Stage 1: Planned to use a relational database to store and retrieve evaluation results.
Stage 2: Discarded the relational database idea. Used the composite pattern to aggregate
evaluation scores. Link to composite pattern based design
Stage 3: Discarded the composite pattern idea. Simplified and implemented the evaluation
GUI. Link to the GUI Demo
Stage 4: The GUI was found to be slow. Used Excel files to store and calculate
evaluation scores. link to example evaluation form

What would we change if we did the project over again?

   1. We would improve our risk analysis: one tricky thing about risk is that it is
      unexpected. We didn't expect that speed would be a problem for our GUI.
   2. Evaluation took more time than we had thought. We would allow more time for
      evaluation, because we need time for a pilot study before conducting the
      experiment; then we could analyze the algorithms in detail and systematically,
      and improve them based on the analysis.
   3. We would improve our time management: we should start evaluation earlier so
      that we can improve our algorithms based on evaluation results.


Acknowledgements
We owe many thanks to Dr. Nyberg, Dr. Tomasic, Shilpa, and Hideki for their valuable
comments and suggestions on our project throughout the semester. We thank our raters for
the evaluation task, and our classmates for many helpful discussions.

Sequence Diagram
 
Sequence Diagram
Sequence DiagramSequence Diagram
Sequence Diagram
 
Class Diagram Final
Class Diagram FinalClass Diagram Final
Class Diagram Final
 
Class Diagram Final
Class Diagram FinalClass Diagram Final
Class Diagram Final
 
Domain Model Ve
Domain Model VeDomain Model Ve
Domain Model Ve
 
Domain Model Ve
Domain Model VeDomain Model Ve
Domain Model Ve
 
Domain Model Ve
Domain Model VeDomain Model Ve
Domain Model Ve
 
Domain Model Ve
Domain Model VeDomain Model Ve
Domain Model Ve
 
Domain Model V7
Domain Model V7Domain Model V7
Domain Model V7
 
Class Diagram V7
Class Diagram V7Class Diagram V7
Class Diagram V7
 
Sequence Diagram V6
Sequence Diagram V6Sequence Diagram V6
Sequence Diagram V6
 
Domain Model V2
Domain Model V2Domain Model V2
Domain Model V2
 
Class Diagram V5
Class Diagram V5Class Diagram V5
Class Diagram V5
 
Sequence Diagram V5
Sequence Diagram V5Sequence Diagram V5
Sequence Diagram V5
 
Sequence Diagram V4
Sequence Diagram V4Sequence Diagram V4
Sequence Diagram V4
 
Class Diagram V2
Class Diagram V2Class Diagram V2
Class Diagram V2
 
Sequence Diagram
Sequence DiagramSequence Diagram
Sequence Diagram
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Último (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Web Rec Final Report

(1) Provide a software framework for web recommendation
(2) Provide basic recommendation algorithms
(3) Propose an evaluation prototype

The first goal defines the basic functionality of the software. The second goal provides three kinds of service. First, it supplies the basic recommendation service of the web recommender. Second, it offers baselines for future research on web recommendation. Finally, it can serve as a tutorial that teaches people how to develop their own algorithms on top of our software framework.

Link to Vision Statement
Link to Domain Model

Requirements

Functional Requirements
(1) Given a web page as input, the system should be able to find a list of relevant web pages.
(2) The system should provide three recommendation algorithms:
    a. Baseline algorithm: uses simple string-processing techniques
    b. HTML-structure-based algorithm: uses HTML structure features
    c. Semantics-based algorithm: uses NLP techniques (a named entity recognizer) to extract features
(3) The system should provide a simple GUI for evaluation.

Non-functional Requirements
Recommendation results can be retrieved within 5 seconds.

Link to Requirement Analysis

Design
Our design has three components: a general software framework design, algorithm design, and evaluation task design.

Software Framework Design
Class Diagram: http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/ClassDiagramFinal
The main algorithm of WebRecommender is implemented in the method recommend(). The util package provides the HTML-parsing, basic text-processing and NLP tools that the recommendation algorithms need. QueryFilter is used for key-term selection. QueryFormulator can be used to combine multiple queries.

The sequence diagram illustrates an example message flow: http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/SequenceDiagram

Algorithm Design
We designed three algorithms: a baseline algorithm, an HTML-structure-based algorithm, and a semantics-based algorithm. They are described below.

Baseline Algorithm
1. Strip off HTML tags (e.g. </html>)
2. Remove non-word tokens (e.g. "/**/")
3. Remove stop words (e.g. "the")

HTML-Structure-based Algorithm
1. Parse the HTML page
2. Extract the text content of the <title> and <a> nodes
3. Remove stop words (e.g. "the")
4. Select the 10 most frequent words

Semantics-based Algorithm
1. Strip off HTML tags (e.g. </html>)
2. Tag the page using the Stanford named entity tagger
3. Remove non-word tokens (e.g. "/**/")
4. Remove stop words (e.g. "the")
5. Select the 5 most frequent named entities

Example Query Comparison
Input page: http://en.wikipedia.org/wiki/Entropy

Table 1. Example query comparison

| Algorithm      | Output Query                                                                            |
|----------------|-----------------------------------------------------------------------------------------|
| Baseline       | Entropy, free, encyclopedia, Jump, search, article                                      |
| HTML-Structure | ISBN, edit, entropy, thermodynamics, Entropy, energy, system, law, heat, thermodynamic  |
| Semantic       | ISBN, University, Press, Boltzmann, John                                                |
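As a concrete illustration, the HTML-structure-based steps above might be sketched in Python using only the standard library. This is a minimal sketch, not the project's actual code: the class and function names are ours, and we assume the stop-word list is supplied by the caller.

```python
from collections import Counter
from html.parser import HTMLParser


class TitleAnchorExtractor(HTMLParser):
    """Collects the text that appears inside <title> and <a> nodes."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # nesting level inside <title>/<a> elements
        self.texts = []       # text chunks found inside those elements

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "a"):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("title", "a") and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.texts.append(data)


def structure_query(html, stop_words, k=10):
    """Return the k most frequent non-stop words from <title> and <a> text."""
    parser = TitleAnchorExtractor()
    parser.feed(html)
    words = []
    for chunk in parser.texts:
        for token in chunk.split():
            token = token.strip(".,;:!?()\"'")
            if token and token.lower() not in stop_words:
                words.append(token)
    return [w for w, _ in Counter(words).most_common(k)]
```

Restricting extraction to title and anchor text is what distinguishes this algorithm from the baseline: anchor text tends to name the pages a document links to, which is why its key terms turned out to be more relevant in our evaluation.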
Evaluation Design

Evaluation Form
We designed an evaluation form which consists of three fields: input page, recommended page, and relevance score. We ask our evaluators to score each recommended page. The relevance score takes two values: 1 means "relevant" and 0 means "irrelevant". The form also contains two fields that are hidden from the evaluators: the algorithm that produced the recommended page and the rank of the page. These two fields are used for the evaluation itself.

Evaluation Criteria
We used a modified average precision to aggregate the relevance scores. The standard average precision is the sum of the precision at each relevant position divided by the total number of relevant pages. In our modified version, we replace the number of relevant pages in the denominator with the total number of retrieved pages N:

    ModifiedAveP = ( Σ_{r=1}^{N} P(r) · rel(r) ) / N

where P(r) is the precision at rank r and rel(r) is 1 if the page at rank r is relevant and 0 otherwise. An example of the calculation of modified average precision is shown in our final project presentation: link to Final Presentation

Test Data Selection
Our criterion for test data selection is that it has to span multiple dimensions. The dimensions we considered include:
1. Popular vs. unpopular (e.g., "Harry Potter" vs. "Wei Chen")
2. Ambiguous vs. unambiguous (e.g., "Entropy" vs. "Sushi")
3. New vs. old (e.g., "Waterboarding" vs. "Entropy")
4. Procedural vs. conceptual (e.g., "How to" vs. "Entropy")
5. Technological vs. mass media (e.g., "Entropy" vs. "Harry Potter")

Based on these criteria, we selected 5 input pages from 5 topics:
1. "Harry Potter" http://en.wikipedia.org/wiki/Harry_potter
2. "Waterboarding" http://en.wikipedia.org/wiki/Waterboarding
3. "Wei Chen @ CMU homepage" http://www.cs.cmu.edu/~weichen/
4. "Entropy (thermodynamics)" http://en.wikipedia.org/wiki/Entropy
5. "How to make Sushi" http://www.wikihow.com/Make-Sushi
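The modified average precision above can be computed directly from a ranked list of 0/1 relevance scores. A minimal sketch (the function name is ours, not the project's):

```python
def modified_average_precision(rels):
    """Modified average precision over a ranked list of 0/1 relevance scores.

    rels: relevance judgments in rank order, rels[0] being the top result.
    The sum of precision-at-r over relevant positions is divided by the
    total number of retrieved pages N, not the number of relevant pages.
    """
    n = len(rels)
    if n == 0:
        return 0.0
    total = 0.0
    hits = 0
    for r, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / r   # P(r): precision at rank r
    return total / n
```

Dividing by N rather than by the number of relevant pages means an algorithm is not rewarded for retrieving only a handful of pages; a list of five results with one relevant page scores lower than a list where all five are relevant.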
Evaluation GUI
Link to GUI Demo

Our evaluation GUI is composed of three functional areas: the top panel, where the user types in the URL of the input web page; the left panel, where the URLs of the recommended web pages are displayed; and the content panel, which displays the web page the user selects. The top panel includes an internet address bar and the recommendation button. The user types in the URL of a web page; if he presses the Enter key, the input web page is shown in the large content panel behind it. If the user clicks the recommend button, the URLs of the recommended web pages are displayed in the left panel of the GUI.

Evaluation and Results
One important question we want to answer in this project is how well each of our algorithms performs, so we need to design an experiment which can measure user satisfaction fairly. Our first hypothesis is that the performance of our algorithms will differ across different kinds of topics, although at the design stage we were not sure how large the variation would be. Our second hypothesis is that users will disagree on how useful the recommended web pages are, because when a user changes his goal, he changes his evaluation criteria at the same time. In order to avoid non-standard criteria, we limit our evaluation criterion to the relevance of the recommended pages. Our ReadMe file specifies the definition of relevance for each of the topics. By doing this, we believe we can fairly measure user satisfaction with each of our algorithms.

Experimental Design
link to example evaluation form

We have three algorithms: the baseline, semantic and structure algorithms. We chose 5 topics for our experiment. It is important that the web pages our algorithms recommend contain the information the user needs, but it is equally important that they appear at the top of the list of recommended web pages. Combining each algorithm with the 5 topics gives 15 categories in total (e.g. (baseline, topic 1) is one category).
We use the top five recommended web pages from each algorithm, so each rater evaluates 75 recommended web pages in total. Whenever a rater considers a recommended web page relevant, he enters 1 in the score column of the evaluation form.
Participants
We had a total of 5 participants: three female and two male. All of the raters hold at least a master's degree in computer science. One of them is a native English speaker; the other four are not.

Experimental Procedure
Before doing the evaluation, all of the raters read the ReadMe file, which gives the definition of relevance for each topic.

Results
Link to a presentation of evaluation results

Our results show that our first hypothesis is correct: topics that are popular and have more resources on the web receive better scores. The topic "Harry Potter" has the highest relevance score, and all three of our algorithms recommended satisfying web pages. We think the reason is that there are so many web pages about Harry Potter that it is easy to find relevant ones.

The topic "Waterboarding" has the highest number of invalid web pages. We think the reason is that waterboarding is a typical news topic. Most of the time there are few web pages about it, but once it becomes a news headline, many resources are added to the web; after it drops out of the headlines, many of those resources are probably deleted. That would cause the invalid links.

The topic "How to make Sushi" has the lowest relevance score. We think the reason is that it is about a specific procedure, which makes the definition of relevance stricter.

Among our three algorithms, the structure algorithm has the best performance in this experiment. The difference between the baseline and structure algorithms is significant (p < 0.001). The difference between the baseline and semantic algorithms is not significant. The structure algorithm performs best on the topic "Entropy", with a relevance score of 1. This is a really promising result, because if the target users of the web recommender are people in academia, they would use it to find technical information.
For example, we could combine the web recommender with Wikipedia; then users would get more comprehensive information on the topics they are interested in. The structure algorithm also performs very well on the topic "How to make Sushi", whereas the baseline and semantic algorithms perform worst there. We think the reason is that the structure algorithm uses key terms extracted from anchor tags. These anchor tags point to other relevant web pages, so the key terms extracted from them are much more relevant than the key terms extracted from other parts of the page.
Error Analysis

Table 2. Key terms and scores for all topic/algorithm categories

| Topic             | Algorithm | Key Terms                                                                        | Score  |
|-------------------|-----------|----------------------------------------------------------------------------------|--------|
| Entropy           | Baseline  | Entropy free encyclopedia Jump search article                                    | 0.519  |
| Entropy           | Semantic  | ISBN University Press Boltzmann John                                             | 0.6926 |
| Entropy           | Structure | ISBN edit entropy thermodynamics Entropy energy system law heat thermodynamic    | 1      |
| Harry Potter      | Baseline  | Harry Potter free encyclopedia Jump search                                       | 0.9032 |
| Harry Potter      | Semantic  | Harry Potter Voldemort BBC Rowling                                               | 0.9686 |
| Harry Potter      | Structure | Potter Harry Rowling Witch Deathly Goblet Magic witchcraft Film Hallows          | 0.982  |
| Waterboarding     | Baseline  | Waterboarding free encyclopedia Jump search Cambodia Khmer                       | 0.507  |
| Waterboarding     | Semantic  | CIA United York Bush States                                                      | 0.1738 |
| Waterboarding     | Structure | Torture News York Waterboarding Times Press CIA ISBN torture Washington          | 0.8564 |
| Wei Chen          | Baseline  | Wei Chen graduate student Language Technologies Carnegie Mellon research advisor | 0.7444 |
| Wei Chen          | Semantic  | Chen Wei University NMF Johns                                                    | 0.457  |
| Wei Chen          | Structure | States Natural Language Mental Fahlman Word Jack Lingual AAAI Wei                | 0.713  |
| How to make Sushi | Baseline  | Make 10 steps wikiHow Manual Edit RSS Create account make log prepared Sushi     | 0.1274 |
| How to make Sushi | Semantic  | RL Commons Article Creative Nicole                                               | 0      |
| How to make Sushi | Structure | Sushi Make edit Ads Roll wikiHow Article make Rice Show                          | 0.7444 |

Overall, the semantic algorithm's performance is not as good as we expected; we expected it to be at least as good as the structure algorithm's. The semantic algorithm scores zero on the topic "How to make Sushi". Looking into the causes, we find that the named entity recognizer we use can only identify names of persons and organizations, so it misses the important keyword "Sushi". Looking at the other topics, we find that the semantic algorithm does identify important named entities that are relevant to the topic and useful to the algorithm, but using these named entities alone is not sufficient.
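One possible remedy, which we return to below, is to supplement the named entities with keywords from the page title. A hypothetical sketch of such a combined query builder (the function and parameter names are illustrative, not the project's API):

```python
def combined_query(title_keywords, named_entities, max_terms=10):
    """Merge page-title keywords with extracted named entities into one query.

    Title keywords come first so topic words like "Sushi" are kept even when
    the named entity recognizer misses them; case-insensitive duplicates are
    dropped while preserving order.
    """
    seen = set()
    query = []
    for term in title_keywords + named_entities:
        key = term.lower()
        if key not in seen:
            seen.add(key)
            query.append(term)
    return query[:max_terms]
```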
We think that if we combine the keywords in the title of the web page with the named entities we extract as the input to our query, we would get much better results in the future.

For the topic "Entropy", both the semantic and structure algorithms score better than the baseline algorithm. We think the reason is that in the baseline algorithm there are some "noise" key
terms that affect its performance and make it return some irrelevant web pages. It is also a very promising sign that the semantic and structure algorithms make a difference in the recommendation results.

All three algorithms perform very well on the topic "Harry Potter". We think there are two reasons for this. First, the definition of relevance for a popular topic is much broader: anything about Harry Potter is considered relevant, whether it is about the book, the movie, the author, or the actors. Second, there are so many web pages about Harry Potter on the web that it is easy to find 5 relevant ones.

The reason for the error pages on the topic "Waterboarding" is that some of the links are invalid. Because "Waterboarding" is a time-sensitive topic, the content of the recommended web pages could have been deleted by the time of evaluation. Looking at the links, we see that they are usually links to user-generated content pages such as forums.

For the topic "Wei Chen", there are very few relevant web pages on the web. The structure algorithm returns only 4 web pages, but two of them are relevant. So we think the major reason for the error pages is the scarcity of relevant web pages.

For the topic "How to make Sushi", we expected a difficult case for the web recommender. One problem is caused by the keyword "make": the algorithm returned pages about how to make something other than Sushi. The other problem is that a lot of the content on this topic is user generated, so some of the recommended pages are from forums and were invalid at the time of evaluation.

Conclusion
This experiment gave us a lot of feedback about the algorithms used in the web recommender. Now we know what role topics play in the recommendation results of each algorithm.
We can also conclude from the experiment that our algorithms do make a significant difference in the recommendation results, and we can probably predict for which kinds of topics the web recommender will be most useful.

Software Engineering Techniques Used in this Project
We followed a standard software engineering process in this project: requirement analysis, design, implementation, and evaluation. We used an iterative development process in the design, implementation, and evaluation phases. Table 3 summarizes the iterations in each phase as well as the main changes we went through.
Table 3. Highlights of the software engineering process

| Iteration   | Design                                                               | Implementation                                                                               | Evaluation                                               |
|-------------|----------------------------------------------------------------------|----------------------------------------------------------------------------------------------|----------------------------------------------------------|
| Iteration 1 | 1. Initial design of framework 2. Composite-pattern-based evaluation design | 1. Initial implementation of framework 2. Implemented evaluation component based on composite pattern | 1. Pilot study 2. Weighted-average relevance score       |
| Iteration 2 | 1. Added query formulator and query filter 2. Simplified evaluation design  | 1. Implemented query formulator and query filter 2. Implemented simplified version of evaluation GUI  | 1. 5 raters, 5 input pages 2. Modified average precision |

What changed over the semester?
As Table 3 shows, we made changes in each of the development phases. Major changes are documented in several meeting notes.

Changes in the main framework:
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes02-02-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes02-11-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes02-18-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes03-04-2009

Changes in the evaluation component:
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes04-06-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes04-20-2009
http://seit1.lti.cs.cmu.edu/projects/webrecommender/wiki/MeetingNotes04-22-2009

Our evaluation GUI went through several rounds of changes:
Stage 1: Planned to use a relational database to store and retrieve evaluation results.
Stage 2: Discarded the relational database idea; used the composite pattern to implement the aggregation of evaluation scores. Link to composite pattern based design
Stage 3: Discarded the composite pattern; simplified the evaluation GUI and implemented it. Link to the GUI Demo
Stage 4: The GUI was found to be slow; used Excel files to store and calculate evaluation scores. link to example evaluation form
What would we change if we did the project over again?
1. We would improve our risk analysis. The tricky thing about risks is that they are unexpected: we did not anticipate that speed would be a problem for our GUI.
2. Evaluation took more time than we expected. We would allow more time for evaluation, because we need time for a pilot study before conducting the experiment. Then we could carry out a detailed, systematic analysis of the algorithms and improve them based on that analysis.
3. We would improve our time management: we should start evaluation early so that we can improve our algorithms based on the evaluation results.

Acknowledgements
We owe many thanks to Dr. Nyberg, Dr. Tomasic, Shilpa and Hideki for valuable comments and suggestions on our project throughout the semester. We thank our raters for the evaluation task. We also thank our classmates for many helpful discussions.