Modeling taste with Cassandra




Affinity is based on user tastes, preferences, and interests

What is a taste profile?

               Operational definition: the set of things you like and dislike

Stuff I like                                   Stuff I don’t like




     Challenge: how do you build the taste profile for someone?
Thesis: Likes are correlated
Inferring correlations
[Figure: graph of nodes A, B, C, D, and E, with one edge marked "?"]

1)   User A:
      •   Democrat
      •   Likes Arugula
2)   User B:
      •   Republican
      •   Dislikes Arugula
3)   User C indicates:
      •   Democrat

What would we infer is User C’s affinity for Arugula?

Answer: User C would like Arugula

Inferring correlations

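[Figure: scatter plot. Horizontal axis: Dislike Obama to Like Obama. Vertical axis: Dislike arugula to Like arugula.]

Points, as <Obama affinity, arugula affinity>:
      User A:   <3, 2.5>
                <1, 1>
      User B:   <-2, -1.5>
                <-3, -3>
      User C:   <2, ?>

If someone’s affinity for Obama is 2.0, what is their affinity for arugula?
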
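One way to answer the question on this slide is to fit a straight line through the four known points and read off the value at an Obama affinity of 2.0. The sketch below uses ordinary least squares; that choice is an assumption for illustration, not something the slides state, and the names `fit_line` and `known` are made up for the example.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# <Obama affinity, arugula affinity> pairs read off the slide.
known = [(3, 2.5), (1, 1), (-2, -1.5), (-3, -3)]

slope, intercept = fit_line(known)

# Predicted arugula affinity for someone whose Obama affinity is 2.0.
predicted_arugula = slope * 2.0 + intercept
print(round(predicted_arugula, 2))
```

With these four points the fit predicts roughly 1.75. A different set of points, or a different model, would give a different number.
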
Discovering latent factors
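[Figure: scatter plot. Horizontal axis: Dislike Obama to Like Obama. Vertical axis: Dislike arugula to Like arugula. A diagonal latent-factor axis runs from Conservative to Liberal.]

Points, as <Obama affinity, arugula affinity>:
      User A:   <3, 2>
                <1, 1>
      User B:   <-2, -1.5>
      User C:   <2, 1.5>

Along the latent-factor axis:
      Liberal end:        labels Obama, Arugula, Liberal; points <4, 4> and <5, 5>
      Conservative end:   labels Iceberg, GOP, Conservative; points <-3, -3>, <-4, -4>, and <-5, -5>

Predict 1.5 for how much this person will like arugula.
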
Taste space = many latent factors

[Figure: three-dimensional taste space with axes Liberal/Conservative, Extroverted/Introverted, and Masculine/Feminine.]

Labeled points:
                <0.7, 4.4, -0.1>
      A:        <0.5, 2.4, -0.4>
      B:        <-0.5, -3.1, 0.1>

What is a taste profile?

                 Operational definition: a coordinate in taste space

Stuff I like (close to me in taste space)   Stuff I don’t like (far away in taste space)




        Challenge: how do you calculate taste coordinates?
Calculating taste coordinates
Edge weight = dot product of nodes, to constrain similar items to be close to each other.

Assume edge weights of:
   +2 = “love”
   -2 = “hate”

[Figure: graph of nodes with coordinates A <1, -2>, B <-1, 2>, C <1, -1>, E <1, -0.5>, and D <x, y> (unknown). Edges are labeled with weights 2 and -2.]

Democratic node must solve:
   1*x - 2*y = 2 (edge from A)
   1*x - 1*y = 2 (edge from C)

Solution = <2, 0>

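The two edge constraints on this slide form a 2x2 linear system, so the solution <2, 0> can be checked directly. A minimal sketch using Cramer's rule follows; the helper name `solve_2x2` is made up for the example, and a real graph with many edges per node would presumably need a least-squares or iterative solver rather than an exact solve.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 with Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the two constraints are not independent")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


node_a = (1, -2)   # coordinate of A, from the slide
node_c = (1, -1)   # coordinate of C, from the slide
love = 2           # edge weight for "love", from the slide

# Each edge requires dot(neighbor, <x, y>) == edge weight.
x, y = solve_2x2(node_a[0], node_a[1], love,
                 node_c[0], node_c[1], love)
print((x, y))
```

Substituting back, 1*2 - 2*0 = 2 and 1*2 - 1*0 = 2, so both edges are satisfied.
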
Updating taste coordinates

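          User A purchases a camera...

[Figure: the taste graph before and after the purchase. Edges are labeled with weights 2 and -2.
   Before: A <1, -2>, B <-1, 2>, C <1, -1>, other nodes at <1, -1>, <1, -0.5>, and <-1, 0.5>.
   After:  A <0.75, -2.5>, B <-1, 2>, C <1, -1>, other nodes at <1, -0.5>, <1, -0.5>, and <-1, 0.5>.]

          Resulting in blue coordinates changing.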
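The slides do not say how the new coordinates are computed. One plausible reading, offered here only as an assumption, is that each new edge nudges a node's coordinate so that its dot product with the neighbor moves toward the edge weight. The sketch below takes a single gradient step on the squared error; the camera coordinate, the learning rate, and the function names are all hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))


def nudge(u, v, target, learning_rate=0.1):
    """Move u one gradient step toward dot(u, v) == target, holding v fixed."""
    error = target - dot(u, v)
    return tuple(a + learning_rate * error * b for a, b in zip(u, v))


user_a = (1.0, -2.0)      # User A before the purchase, from the slide
camera = (0.5, -1.5)      # hypothetical camera coordinate, not from the slide
purchase_weight = 2.0     # assumption: a purchase counts as "love" (+2)

before = purchase_weight - dot(user_a, camera)
user_a = nudge(user_a, camera, purchase_weight)
after = purchase_weight - dot(user_a, camera)
print(before, after)
```

After the step the remaining error is smaller in magnitude than before. Repeating the step, and alternating between user and item updates, is one common way to fit this kind of model, but nothing in the deck confirms that this is what the system did.
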
v1 System overview - Model updates

Components: Rec. Engine, Updater, Reco. DB (User -> coord, Item -> coord), Taste graph

1) Updater receives event (e.g., Purchase)
2a) Write Purchase edge (Taste graph)
2b) Read other edges for this user and item (Taste graph)
3) Write user and item coordinates (Reco. DB)

v1 System overview - Rec serving

Components: Rec. Engine, Updater, Reco. DB (User -> coord, Item -> coord), Taste graph

1) Page load requests recommendations
2) Rec. engine finds other cameras close to user’s coordinates
3) Recommendations shown to user

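Step 2 of the serving path can be sketched as a nearest-neighbor lookup over item coordinates. The example below ranks a handful of items by Euclidean distance to the user; the distance metric, the item ids, and the coordinates are assumptions for illustration, and a brute-force scan like this would not scale to billions of items without some kind of index or candidate pre-filtering.

```python
import math


def nearest_items(user_coord, item_coords, k=2):
    """Return the ids of the k items closest to the user in taste space."""
    ranked = sorted(
        item_coords,
        key=lambda item_id: math.dist(user_coord, item_coords[item_id]),
    )
    return ranked[:k]


user = (1.0, -2.0)  # hypothetical user coordinate

# Hypothetical camera items with two-dimensional taste coordinates.
cameras = {
    "camera_1": (0.75, -2.5),
    "camera_2": (1.0, -0.5),
    "camera_3": (-1.0, 2.0),
}

top = nearest_items(user, cameras)
print(top)
```

Here `camera_1` and `camera_2` are returned, in that order, because they sit closest to the user.
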
v1 Taste Graph data size


40 billion edges
2 billion item nodes
200 million user nodes

5TB of data, takes up 10TB with Replication Factor of 2

We expect this to quadruple next year as we get more events and add
  new types of edges




v1 Taste Graph DB configuration


32 Linux machines
  128GB RAM
  1TB iSCSI SSD
  10 GigE NIC


Cassandra version 1.0.8

8GB JVM heap space

Size-tiered compaction strategy
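A quick back-of-the-envelope check ties the data-size figures to this cluster. The sketch below assumes decimal terabytes, an even spread of data across the 32 machines, and no extra headroom for compaction; none of those assumptions come from the slides. The 200-byte vector size is taken from the schema slide.

```python
TB = 10**12  # assumption: decimal terabytes

item_nodes = 2 * 10**9
user_nodes = 200 * 10**6
vector_bytes = 200            # 50 floats per node
raw_bytes = 5 * TB
replication_factor = 2
machines = 32
disk_per_machine = 1 * TB

# Space taken by the taste vectors alone (before replication).
node_bytes = (item_nodes + user_nodes) * vector_bytes

# Replicated footprint today, and after the expected 4x growth.
replicated_bytes = raw_bytes * replication_factor
per_machine_bytes = replicated_bytes / machines
projected_per_machine = replicated_bytes * 4 / machines

print(node_bytes / TB, per_machine_bytes / TB, projected_per_machine / TB)
```

Under these assumptions the node vectors account for about 0.44 TB of the 5 TB, each machine holds about 0.31 TB today, and a fourfold increase would put about 1.25 TB on each machine, which is more than a single 1 TB disk.
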
v1 Taste Graph schema

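User Edges
    row key:        user_id
    column names:   (timestamp, edge_type, item_id)   …
    column values:  <empty>
Item Edges
    row key:        item_id
    column names:   (timestamp, edge_type, user_id)   …
    column values:  <empty>
User Nodes
    row key:        user_id
    column name:    tastevector
    column value:   200 bytes (50 floats)
Item Nodes
    row key:        item_id
    column name:    tastevector
    column value:   200 bytes (50 floats)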
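The "200 bytes (50 floats)" figure is consistent with 50 single-precision (4-byte) floats packed back to back. A minimal sketch of that encoding with Python's `struct` module follows; the little-endian byte order and the function names are assumptions, since the slides do not describe the serialization.

```python
import struct

# 50 single-precision floats; "<" (little-endian) is an assumption.
VECTOR_FORMAT = "<50f"


def encode_taste_vector(values):
    """Pack 50 floats into a 200-byte blob suitable for a column value."""
    if len(values) != 50:
        raise ValueError("a taste vector must have exactly 50 components")
    return struct.pack(VECTOR_FORMAT, *values)


def decode_taste_vector(blob):
    """Unpack a 200-byte blob back into a list of 50 floats."""
    return list(struct.unpack(VECTOR_FORMAT, blob))


# Multiples of 0.5 are exactly representable, so the round trip is lossless here.
vector = [i * 0.5 for i in range(50)]
blob = encode_taste_vector(vector)
print(len(blob))
```

Arbitrary double-precision values would lose some precision when stored as 4-byte floats, which is usually acceptable for this kind of coordinate.
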
v1 Real-time taste updates

Edges and nodes read per second
v1 Real-time taste updates

Edges and nodes written per second
Questions?


tp@hunch.com




Graph Based Recommendation Systems at eBay
