Coping with the Persistent Coldstart Problem
Siarhei Bykau, Georgia Koutrika, Yannis Velegrakis
PersDB, 30.08.2013
Siarhei Bykau, U of Trento
Recommendation Systems
● Amazon (products)
● Netflix (movies)
● Facebook (friends)
● Google (news)
● Twitter (who to follow)
Recommendation Approaches
● Content-based filtering (CB)
– build user's profile & look for similar items
● Collaborative filtering (CF)
– find users with similar tastes
● Hybrid
– combine previous two
Course Evaluations
cid year area instructor trimester exam student rating
cs343 2011 DB Fox 1 written s5 avg
cs343 2010 DB Fox 1 written s6 low
cs343 2011 DB Fox 1 written s7 avg
cs241 2010 PL Smith 2 oral s9 avg
cs241 2011 PL Smith 2 oral s5 low
cs241 2011 PL Smith 2 oral s1 high
cs241 2010 PL Smith 2 oral s2 low
cs120 2008 OS Fox 1 oral s4 low
cs120 2009 OS Fox 1 oral s4 high
cs400 2010 DB Newton 3 oral s20 high
cs400 2011 DB Newton 3 oral s18 high
Course Evaluations
cid year area instructor trimester exam student rating
cs343 2011 DB Fox 1 written s5 avg
cs343 2010 DB Fox 1 written s6 low
cs343 2011 DB Fox 1 written s7 avg
cs241 2010 PL Smith 2 oral s9 avg
cs241 2011 PL Smith 2 oral s5 low
cs241 2011 PL Smith 2 oral s1 high
cs241 2010 PL Smith 2 oral s2 low
cs120 2008 OS Fox 1 oral s4 low
cs120 2009 OS Fox 1 oral s4 high
cs400 2010 DB Newton 3 oral s20 high
cs400 2011 DB Newton 3 oral s18 high
cs241 2012 PL Smith 2 oral s19 ?
Course Evaluations
cid year area instructor trimester exam student rating
cs343 2011 DB Fox 1 written s5 avg
cs343 2010 DB Fox 1 written s6 low
cs343 2011 DB Fox 1 written s7 avg
cs241 2010 PL Smith 2 oral s9 avg
cs241 2011 PL Smith 2 oral s5 low
cs241 2011 PL Smith 2 oral s1 high
cs241 2010 PL Smith 2 oral s2 low
cs120 2008 OS Fox 1 oral s4 low
cs120 2009 OS Fox 1 oral s4 high
cs400 2010 DB Newton 3 oral s20 high
cs400 2011 DB Newton 3 oral s18 high
cs421 2012 DB Fox 3 oral s19 ?
Cold-Start Problem
● Existing users, existing items
– Collaborative filtering
– Content-based filtering
– Hybrid approaches
– SVD
– ...
● New users, existing items
– recommend highly-rated items to new users
● Existing users, new items
– recommend new items to existing users based on the users' historical ratings and features of items
● New users, new items
– We are here
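The four cases of the grid can be told apart by checking whether the user and the item already appear in the rating history. A minimal Python sketch, assuming (this is not stated on the slides) that an item is identified by its course id and a user by the student id; `cold_start_case` and `HISTORY` are hypothetical names introduced only for illustration:

```python
# Course-evaluation rows from the running example, reduced to (cid, student).
# Assumption: the course id identifies the item and the student id the user.
HISTORY = [
    ("cs343", "s5"), ("cs343", "s6"), ("cs343", "s7"),
    ("cs241", "s9"), ("cs241", "s5"), ("cs241", "s1"), ("cs241", "s2"),
    ("cs120", "s4"), ("cs120", "s4"),
    ("cs400", "s20"), ("cs400", "s18"),
]


def cold_start_case(history, course, student):
    """Name the cell of the cold-start grid for a (course, student) pair."""
    known_items = {cid for cid, _ in history}
    known_users = {sid for _, sid in history}
    user = "existing user" if student in known_users else "new user"
    item = "existing item" if course in known_items else "new item"
    return f"{user}, {item}"


print(cold_start_case(HISTORY, "cs241", "s19"))
print(cold_start_case(HISTORY, "cs421", "s19"))
```

Under this assumption, the query for cs241 by s19 falls in the new-user cell, while cs421 by s19 is new on both sides, which is the case the talk targets.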
Cold-Start: Existing Approaches
● Random recommendations
● External knowledge
– social network [Guy et al. 2009]
– trust network [Jamali et al. 2010]
– ontologies [Middleton et al. 2002]
● Interviews [Rashid et al. 2002]
● Pairwise regression [Park et al. 2009]
Similarity Based Predictions
● Similar items have similar ratings:
● Similarity between two items:
● Use only the top-K most similar items
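The formulas for this slide are not preserved, so the following Python sketch shows only one plausible reading of the idea. The assumptions are all mine: features are area, instructor, trimester and exam type; ratings map to low=1, avg=2, high=3; similarity is Jaccard overlap of feature sets; and the prediction is the similarity-weighted average over the K most similar rated rows. `predict_by_similarity` is a hypothetical name.

```python
# Rated rows from the running example as (feature set, rating).
# Assumptions: Jaccard similarity, numeric scale low=1 / avg=2 / high=3.
SCALE = {"low": 1, "avg": 2, "high": 3}
RATED = (
    [({"DB", "Fox", "t1", "written"}, r) for r in ("avg", "low", "avg")]
    + [({"PL", "Smith", "t2", "oral"}, r) for r in ("avg", "low", "high", "low")]
    + [({"OS", "Fox", "t1", "oral"}, r) for r in ("low", "high")]
    + [({"DB", "Newton", "t3", "oral"}, r) for r in ("high", "high")]
)


def jaccard(a, b):
    """Share of features two items have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_by_similarity(rated, new_features, k):
    """Similarity-weighted average rating of the k most similar rated rows."""
    scored = sorted(
        ((jaccard(features, new_features), SCALE[rating]) for features, rating in rated),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None  # nothing similar enough: no prediction (hurts coverage)
    return sum(sim * rating for sim, rating in scored) / total


# New course cs421: DB, Fox, trimester 3, oral exam.
print(predict_by_similarity(RATED, {"DB", "Fox", "t3", "oral"}, k=2))
```

With `k=2` the two cs400 rows (which share area, trimester and exam type with cs421) are the nearest neighbours, so the prediction is approximately 3, i.e. "high" on the assumed scale.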
Feature Based Prediction
● An item's rating transfers equally to each of its features
● Rating of a feature:
● Prediction is the average of feature ratings:
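The bullets above can be sketched in Python as follows. The slide's formulas are not preserved, so this is a reading under stated assumptions: a feature's rating is taken to be the plain average of the ratings of all rows carrying that feature, ratings map to low=1, avg=2, high=3, and features never seen before are ignored. The function names are hypothetical.

```python
# Rated rows from the running example as (features, rating).
# Assumption: numeric scale low=1 / avg=2 / high=3.
SCALE = {"low": 1, "avg": 2, "high": 3}
ROWS = (
    [(("DB", "Fox", "t1", "written"), r) for r in ("avg", "low", "avg")]
    + [(("PL", "Smith", "t2", "oral"), r) for r in ("avg", "low", "high", "low")]
    + [(("OS", "Fox", "t1", "oral"), r) for r in ("low", "high")]
    + [(("DB", "Newton", "t3", "oral"), r) for r in ("high", "high")]
)


def feature_ratings(rows):
    """Average rating per feature; each row's rating counts once for each of its features."""
    sums, counts = {}, {}
    for features, rating in rows:
        for feature in features:
            sums[feature] = sums.get(feature, 0) + SCALE[rating]
            counts[feature] = counts.get(feature, 0) + 1
    return {feature: sums[feature] / counts[feature] for feature in sums}


def predict_by_features(rows, new_features):
    """Average of the known feature ratings of a new item, or None if none are known."""
    ratings = feature_ratings(rows)
    known = [ratings[f] for f in new_features if f in ratings]
    return sum(known) / len(known) if known else None


# New course cs421: DB, Fox, trimester 3, oral exam.
print(predict_by_features(ROWS, ("DB", "Fox", "t3", "oral")))
```

For cs421 the feature averages are 2.2 (DB), 1.8 (Fox), 3.0 (trimester 3) and 2.125 (oral), giving a prediction of about 2.28, slightly above "avg" on the assumed scale.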
Preference Pattern
cid year area instructor trimester exam student rating
cs343 2011 DB Fox 1 written s5 avg
cs343 2010 DB Fox 1 written s6 low
cs343 2011 DB Fox 1 written s7 avg
cs241 2010 PL Smith 2 oral s9 avg
cs241 2011 PL Smith 2 oral s5 low
cs241 2011 PL Smith 2 oral s1 high
cs241 2010 PL Smith 2 oral s2 low
cs120 2008 OS Fox 1 oral s4 low
cs120 2009 OS Fox 1 oral s4 high
cs400 2010 DB Newton 3 oral s20 high
cs400 2011 DB Newton 3 oral s18 high
The preference pattern <<DB,Fox>,avg> has frequency 2/11: two of the eleven evaluations have area DB, instructor Fox, and rating avg.
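The frequency of a preference pattern can be checked by counting matching rows in the table. A small Python sketch; `pattern_frequency` is a hypothetical helper, and treating a pattern as a set of column/value pairs plus a rating is my reading of the notation:

```python
from fractions import Fraction

COLUMNS = ("cid", "year", "area", "instructor", "trimester", "exam", "student", "rating")
TABLE = """\
cs343 2011 DB Fox 1 written s5 avg
cs343 2010 DB Fox 1 written s6 low
cs343 2011 DB Fox 1 written s7 avg
cs241 2010 PL Smith 2 oral s9 avg
cs241 2011 PL Smith 2 oral s5 low
cs241 2011 PL Smith 2 oral s1 high
cs241 2010 PL Smith 2 oral s2 low
cs120 2008 OS Fox 1 oral s4 low
cs120 2009 OS Fox 1 oral s4 high
cs400 2010 DB Newton 3 oral s20 high
cs400 2011 DB Newton 3 oral s18 high
"""
ROWS = [dict(zip(COLUMNS, line.split())) for line in TABLE.splitlines()]


def pattern_frequency(rows, features, rating):
    """Fraction of rows that carry all the given feature values and the given rating."""
    hits = sum(
        1
        for row in rows
        if row["rating"] == rating
        and all(row[column] == value for column, value in features.items())
    )
    return Fraction(hits, len(rows))


print(pattern_frequency(ROWS, {"area": "DB", "instructor": "Fox"}, "avg"))  # 2/11
```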
Entropy Based Prediction
1. Model features and ratings as random variables.
2. Introduce a joint distribution over features and ratings to model the observations.
3. Use Generalized Iterative Scaling (GIS) to find the distribution that satisfies the frequent preference patterns.
4. Use the fitted distribution to predict missing ratings.
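The steps above can be sketched with a textbook version of GIS on the running example. This is an illustration under my own assumptions, not the authors' implementation: only area and instructor are used as features, a pattern counts as frequent when it occurs at least twice, patterns pair a single feature value with a rating, and the predicted rating is the most probable one under the fitted distribution. All names are hypothetical.

```python
import math
from itertools import product

# (area, instructor, rating) projected from the course-evaluation table.
ROWS = [
    ("DB", "Fox", "avg"), ("DB", "Fox", "low"), ("DB", "Fox", "avg"),
    ("PL", "Smith", "avg"), ("PL", "Smith", "low"), ("PL", "Smith", "high"),
    ("PL", "Smith", "low"), ("OS", "Fox", "low"), ("OS", "Fox", "high"),
    ("DB", "Newton", "high"), ("DB", "Newton", "high"),
]
RATINGS = ("low", "avg", "high")
# Steps 1-2: the joint space of all (area, instructor, rating) combinations.
STATES = list(product(sorted({r[0] for r in ROWS}), sorted({r[1] for r in ROWS}), RATINGS))
C = 2.0  # GIS constant: a state matches at most one pattern per attribute


def frequent_patterns(rows, min_count=2):
    """Patterns (attribute index, value, rating) seen at least min_count times."""
    counts = {}
    for row in rows:
        for attr in (0, 1):
            key = (attr, row[attr], row[2])
            counts[key] = counts.get(key, 0) + 1
    return sorted(key for key, count in counts.items() if count >= min_count)


def feature_vector(state, patterns):
    """Pattern indicators plus the GIS slack feature, so every vector sums to C."""
    f = [1.0 if state[a] == v and state[2] == r else 0.0 for a, v, r in patterns]
    return f + [C - sum(f)]


def distribution(lam, feats):
    """Exponential-family distribution p(s) proportional to exp(lam . f(s))."""
    weights = {s: math.exp(sum(l * x for l, x in zip(lam, f))) for s, f in feats.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}


def fit_gis(rows, iterations=3000):
    """Step 3: scale the weights until model and empirical pattern frequencies agree."""
    patterns = frequent_patterns(rows)
    feats = {s: feature_vector(s, patterns) for s in STATES}
    n = len(patterns) + 1
    empirical = [sum(feats[r][i] for r in rows) / len(rows) for i in range(n)]
    lam = [0.0] * n
    for _ in range(iterations):
        p = distribution(lam, feats)
        model = [sum(p[s] * feats[s][i] for s in STATES) for i in range(n)]
        lam = [l + math.log(e / m) / C for l, e, m in zip(lam, empirical, model)]
    return distribution(lam, feats), patterns


def predict(p, area, instructor):
    """Step 4: most probable rating for the given feature values."""
    return max(RATINGS, key=lambda rating: p[(area, instructor, rating)])


p, patterns = fit_gis(ROWS)
print(predict(p, "DB", "Fox"))
```

Enumerating the full joint space is only feasible for a toy example like this one; with many features the expectations would need a more careful computation.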
Max Entropy Intuition
Metrics
● Predictability
– Root Mean Square Error (individual rating accuracy)
– Normalized Discounted Cumulative Gain (accuracy of the ranked order)
● Coverage
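The three metrics can be sketched in a few lines of Python. The exact definitions used in the evaluation are not given on the slide, so these are common textbook forms: DCG with linear gain and a log2 rank discount, and coverage as the share of test cases for which a method returns any prediction (represented here as a non-`None` value).

```python
import math


def rmse(pairs):
    """Root mean square error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((pred - actual) ** 2 for pred, actual in pairs) / len(pairs))


def dcg(relevances):
    """Discounted cumulative gain of relevances listed in ranked order."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(predicted_scores, true_relevances):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    order = sorted(range(len(predicted_scores)), key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_relevances[i] for i in order]
    ideal = dcg(sorted(true_relevances, reverse=True))
    return dcg(ranked) / ideal if ideal > 0 else 0.0


def coverage(predictions):
    """Share of test cases for which a prediction could be made at all."""
    return sum(pred is not None for pred in predictions) / len(predictions)


print(rmse([(3, 3), (1, 3)]))
print(ndcg([0.9, 0.5, 0.1], [3, 2, 1]))
print(coverage([2.0, None, 3.0, None]))
```

RMSE penalizes each individual rating error, while NDCG only looks at whether highly rated items are ranked ahead of lower rated ones, so a method can score well on one and poorly on the other.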
Datasets
● Stanford Courses
– from 1997 to 2008
– 9799 ratings
– 675 courses
– 193 instructors
– features: title, description, department
● MovieLens
– 100K ratings
– 1000 users
– 1700 movies
– 42000 unique features (39 features per movie on average)
Algorithms
● Similarity-based
● Feature-based
● Max entropy
● Linear regression [Park et al. 2009]
Accuracy/Coverage for Varying Training Data Size (Stanford)
Accuracy/Coverage for Varying Density of Features (MovieLens)
Conclusions
● Addressed the new-user, new-item cold-start problem
● Proposed a number of algorithms:
– Similarity-based
– Feature-based
– Max entropy
● Experimental evaluation showed the algorithms to be highly effective (Max entropy performed best)
