5. TECHNOLOGY/TOOLS USED BY KOBO BIG DATA
•Processing/Streaming: Hadoop, Storm, Flume
•Storage/streaming: Hive, SQL, Redis, Couchbase
•Search: Solr+plugins
•Languages: Python, Java, C++, R
7. BIG DATA IS LIKE TEENAGE SEX: EVERYONE TALKS ABOUT IT, NOBODY REALLY KNOWS HOW TO DO IT, EVERYONE THINKS EVERYONE ELSE IS DOING IT, SO EVERYONE CLAIMS THEY ARE DOING IT...
DAN ARIELY
18. DISAMBIGUATING KEY-TERMS
•Problem: Disambiguating the item-to-Wikipedia-article mapping
•Solution: Choose Wikipedia articles that make sense collectively [1]
[1] Local and Global Algorithms for Disambiguation to Wikipedia, ACL 2011
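The idea of choosing articles that "make sense collectively" can be sketched as a joint search: pick one candidate article per key term so that the pairwise relatedness of the chosen set is as high as possible. This is only an illustration under stated assumptions, not the algorithm from the cited paper. The function name `disambiguate`, the candidate lists, and the relatedness scores are all hypothetical.

```python
from itertools import combinations, product

def disambiguate(candidates, relatedness):
    """Pick one article per term so the chosen set is maximally coherent.

    candidates:  dict term -> list of candidate article titles (hypothetical data)
    relatedness: dict frozenset({a, b}) -> score; missing pairs count as 0
    """
    terms = list(candidates)
    best_score, best_choice = float("-inf"), None
    # Exhaustive search over joint assignments; only viable for small inputs.
    for choice in product(*(candidates[t] for t in terms)):
        score = sum(relatedness.get(frozenset(pair), 0.0)
                    for pair in combinations(choice, 2))
        if score > best_score:
            best_score, best_choice = score, choice
    return dict(zip(terms, best_choice))

# Made-up example: two ambiguous key terms from one item.
candidates = {
    "java": ["Java (programming language)", "Java (island)"],
    "python": ["Python (programming language)", "Pythonidae"],
}
relatedness = {
    frozenset({"Java (programming language)",
               "Python (programming language)"}): 0.9,
    frozenset({"Java (island)", "Pythonidae"}): 0.3,
}
result = disambiguate(candidates, relatedness)
```

Exhaustive search grows exponentially with the number of terms, so a real system would need an approximate or greedy variant.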
30. LAYOUT GENERATION
•Local search
•Start from a handful of expert generated layouts
•Use widget statistics to make informed swap/exchange steps
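A minimal sketch of such a local search, assuming a layout is an ordered list of widgets, that the only widget statistic is a per-widget click rate, and that higher slots carry more weight. The scoring function, widget names, and numbers below are hypothetical and chosen only to show the swap and exchange steps.

```python
def layout_score(layout, widget_ctr, slot_weight):
    # Assumed objective: click rate of each widget weighted by its slot.
    return sum(slot_weight[i] * widget_ctr[w] for i, w in enumerate(layout))

def neighbors(layout, pool):
    n = len(layout)
    # Swap step: exchange the positions of two widgets already in the layout.
    for i in range(n):
        for j in range(i + 1, n):
            swapped = list(layout)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            yield swapped
    # Exchange step: replace one widget with a widget not in the layout.
    unused = [w for w in pool if w not in layout]
    for i in range(n):
        for w in unused:
            exchanged = list(layout)
            exchanged[i] = w
            yield exchanged

def local_search(seed_layouts, pool, widget_ctr, slot_weight):
    best_layout, best_score = None, float("-inf")
    for seed in seed_layouts:
        current = list(seed)
        current_score = layout_score(current, widget_ctr, slot_weight)
        improved = True
        while improved:
            improved = False
            for cand in neighbors(current, pool):
                s = layout_score(cand, widget_ctr, slot_weight)
                if s > current_score:  # take the first improving move
                    current, current_score, improved = cand, s, True
                    break
        if current_score > best_score:
            best_layout, best_score = current, current_score
    return best_layout, best_score

widget_ctr = {"bestsellers": 0.05, "new_releases": 0.03,
              "recommended": 0.08, "deals": 0.02}
slot_weight = [1.0, 0.5, 0.25]
seeds = [["deals", "bestsellers", "new_releases"]]  # stand-in for expert layouts
best_layout, best_score = local_search(seeds, list(widget_ctr),
                                       widget_ctr, slot_weight)
```

With this simple additive score the search ends with the highest-rate widgets in the highest-weight slots; a richer objective could leave it in a local optimum, which is one reason to start from several seed layouts.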
31. OPTIMIZATION FRAMEWORK – EXPLORE/EXPLOIT
•Several actions an agent can take
•Agent takes action, and observes payoff
32. OPTIMIZATION FRAMEWORK – EXPLORE/EXPLOIT
•Pick a policy for choosing actions with a good trade-off
•If you always take the best one so far, you may be missing out on better options
•If you always explore new things, your losses accumulate
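One simple policy that balances the two extremes is epsilon-greedy: with a small probability the agent tries a random action, otherwise it takes the best action seen so far. The sketch below is an illustration only, not the policy used in the talk. The payoff rates are simulated and the function name is hypothetical.

```python
import random

def epsilon_greedy(true_rates, rounds, epsilon, rng):
    """Simulate an epsilon-greedy agent on Bernoulli payoffs; return pull counts."""
    n = len(true_rates)
    pulls = [0] * n
    wins = [0] * n

    def mean(i):
        # Untried actions get an optimistic value so each is tried at least once.
        return wins[i] / pulls[i] if pulls[i] else float("inf")

    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n)         # explore: random action
        else:
            arm = max(range(n), key=mean)  # exploit: best observed average
        # Simulated payoff; in practice this would be an observed click or sale.
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
    return pulls

pulls = epsilon_greedy([0.2, 0.5, 0.8], rounds=5000, epsilon=0.1,
                       rng=random.Random(0))
```

Setting `epsilon` to 0 gives the "always take the best so far" policy, and setting it to 1 gives pure exploration.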
33. MULTI-ARMED BANDIT MODEL
•Maximize total expected reward
•Bayesian framework (Agarwal et al [1])
•Context Sensitive Variant (Li et al [2])
[1] Explore/Exploit schemes for web content optimization, Yahoo Research, ICDM ’09
[2] A contextual-bandit approach to personalized news article recommendation, Yahoo Research, WWW ’10
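As a rough illustration of a Bayesian bandit, the sketch below uses Thompson sampling with a Beta prior over each arm's success rate. This is a standard textbook policy swapped in for illustration, not necessarily the scheme of Agarwal et al., and it ignores context, unlike the variant of Li et al. The reward rates are simulated and all names are hypothetical.

```python
import random

def thompson_sampling(true_rates, rounds, rng):
    """Beta-Bernoulli Thompson sampling on simulated arms; return pull counts."""
    n = len(true_rates)
    successes = [1] * n  # Beta(1, 1) prior: uniform over each arm's rate
    failures = [1] * n
    pulls = [0] * n
    for _ in range(rounds):
        # Draw one plausible rate per arm from its posterior, play the largest.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(n)]
        arm = max(range(n), key=samples.__getitem__)
        # Simulated payoff; in practice this would be an observed click.
        reward = 1 if rng.random() < true_rates[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

ts_pulls = thompson_sampling([0.1, 0.3, 0.6], rounds=3000, rng=random.Random(1))
```

Arms with uncertain posteriors still get sampled occasionally, so exploration fades on its own as evidence accumulates rather than being fixed by a parameter.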
34. FEATURE VECTOR CONSTRUCTION
•User information:
•Purchase history, by genre, by price sensitivity, freshness
•Browsing history
•Geo
•Extensions to
•Optimize for profit (not CTR)
•Account for different widget categories
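The user information listed above could be turned into a fixed-length numeric vector roughly as follows. This is a sketch under several assumptions: the genre and region vocabularies, the field names of the `user` dictionary, the use of average price as a price-sensitivity signal, and the reading of "freshness" as recency of the last purchase are all hypothetical.

```python
GENRES = ["romance", "mystery", "scifi"]  # assumed genre vocabulary
REGIONS = ["CA", "US", "OTHER"]           # assumed geo buckets

def shares(items, vocabulary):
    # Fraction of items falling into each vocabulary entry (zeros if empty).
    total = len(items)
    return [items.count(v) / total if total else 0.0 for v in vocabulary]

def build_feature_vector(user):
    purchases = user.get("purchases", [])  # list of (genre, price) pairs
    genre_share = shares([g for g, _ in purchases], GENRES)
    # Average price paid, used here as a crude price-sensitivity signal.
    avg_price = (sum(p for _, p in purchases) / len(purchases)
                 if purchases else 0.0)
    # Recency of last purchase mapped to (0, 1]; missing data maps to 0.
    days = user.get("days_since_last_purchase")
    freshness = 1.0 / (1.0 + days) if days is not None else 0.0
    browse_share = shares(user.get("browsed", []), GENRES)
    region = user.get("region")
    if region not in REGIONS:
        region = "OTHER"
    geo = [1.0 if r == region else 0.0 for r in REGIONS]  # one-hot encoding
    return genre_share + [avg_price, freshness] + browse_share + geo

user = {
    "purchases": [("mystery", 4.0), ("mystery", 6.0)],
    "days_since_last_purchase": 3,
    "browsed": ["scifi"],
    "region": "CA",
}
vector = build_feature_vector(user)
```

A vector like this could serve as the context for a context-sensitive bandit, with the reward changed from clicks to profit for the first extension listed.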