HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily & HBase - ngdata

Making Sense of Data

Lily goes shopping –
real-time recommendations with HBase
HBaseCon, May 2012

Steven Noels – VP Product – @stevenn

WWW.NGDATA.COM

Lily Core 2’ recap
•  HBase-backed data repository, with batteries included
•  Data model:
•  high-level data model on top of HBase’s byte[]’s
•  schema
•  versioning (schema and data)
•  links, variants
•  Java & REST APIs
•  Indexing:
•  through configuration, not implementation
•  incremental and batch index maintenance
•  RowLog: distributed, durable queue for secondary actions
•  Open Source: www.lilyproject.org (Apache License)

[Diagram labels: client app, Lily, RowLog, HBase, Solr et al.]
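As a rough illustration of the RowLog idea (a queue of row changes that secondary actions such as index updates consume), here is a minimal in-memory sketch. It is neither distributed nor durable, and the class and method names (`RowLogSketch`, `subscribe`, `put`, `process`) are hypothetical; they are not the Lily API.

```python
from collections import deque


class RowLogSketch:
    """In-memory stand-in for a queue of row changes (hypothetical, not Lily's API)."""

    def __init__(self):
        self._queue = deque()      # pending (row_key, payload) messages, oldest first
        self._subscribers = []     # secondary actions, e.g. an index updater

    def subscribe(self, action):
        """Register a callable taking (row_key, payload)."""
        self._subscribers.append(action)

    def put(self, row_key, payload):
        """Record that a row changed; secondary actions run later in process()."""
        self._queue.append((row_key, payload))

    def process(self):
        """Deliver every pending message to every subscriber, in arrival order."""
        processed = 0
        while self._queue:
            row_key, payload = self._queue[0]
            for action in self._subscribers:
                action(row_key, payload)
            # Remove the message only after all actions have run on it.
            self._queue.popleft()
            processed += 1
        return processed


indexed_rows = []
log = RowLogSketch()
log.subscribe(lambda row_key, payload: indexed_rows.append(row_key))
log.put("row1", {"title": "Scarface"})
log.put("row2", {"title": "The Notebook"})
processed_count = log.process()
```

Removing a message only after its actions have run mirrors the intent of a durable queue: a failure mid-delivery leaves the message in place for a retry.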

Why HBase?
•  BigTable model
•  sparseness
•  atomic row updates aka consistency
•  auto-partitioning
•  Apache license
•  A great community led by a Saint :-)

Portfolio Overview

Real-time AI
Recommendations
Industry algorithms and rules

commercial availability

Trend Analytics
Pattern Detection

Profile Development
Context and Activity Tracking

open source

Social Stream Ingestion

Schema and Data Management
Total Data Aggregation
Real-time Index and Retrieval
Security and Enterprise Connectors

Lily (=HBase) In Use
Some of the larger Lily deployments

•  media
•  aggregation, database publishing and online archives
•  finance
•  real-time identity fraud detection
•  retail banking
•  contextualized (time+loc+person) mobile coupons
•  retail
•  e-commerce platform: product catalog, consumer data store, real-time indexing

Collaborative Filtering?

Recommend items similar to a user’s highly-preferred items

Collaborative Filtering is … Matrixes

Sean likes “Scarface” a lot (123,654,5.0)
Robin likes “Scarface” somewhat (789,654,3.0)
Grant likes “The Notebook” not at all (345,876,1.0)
…

(Magic)

Grant may like “Scarface” quite a bit (345,654,4.5)
…
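One simple way to fill in the "(Magic)" step is item co-occurrence: items that appear together in users' histories are treated as related. The sketch below assumes the triples are (user ID, item ID, preference). The first three triples come from the slide; the last two are hypothetical, added so that users share an item. The resulting score is a ranking weight, not a predicted rating such as the 4.5 on the slide, and this is not necessarily the algorithm behind the slide.

```python
from collections import defaultdict
from itertools import combinations

# (user_id, item_id, preference) triples.
PREFS = [
    (123, 654, 5.0),   # Sean, "Scarface"
    (789, 654, 3.0),   # Robin, "Scarface"
    (345, 876, 1.0),   # Grant, "The Notebook"
    (123, 111, 4.0),   # hypothetical extra item for Sean
    (345, 111, 5.0),   # hypothetical extra item for Grant
]


def build_cooccurrence(prefs):
    """Count, for each pair of items, how many users have both."""
    items_by_user = defaultdict(set)
    for user, item, _pref in prefs:
        items_by_user[user].add(item)
    cooc = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend(user, prefs, cooc, top_n=3):
    """Score unseen items by co-occurrence count times the user's preference."""
    own = {item: pref for u, item, pref in prefs if u == user}
    scores = defaultdict(float)
    for item, pref in own.items():
        for other, count in cooc[item].items():
            if other not in own:
                scores[other] += count * pref
    # Highest score first; ties broken by item ID for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


cooc = build_cooccurrence(PREFS)
grant_recs = recommend(345, PREFS, cooc)
```

With this toy data, Grant (345) is recommended item 654 because Sean, who shares the hypothetical item 111 with him, also has 654.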

Contextualized recommendations

Personalized offers

[Diagram labels: Profile, Activity, Item; shops & merchants, product families, offers/coupons; credit card statements]

Fitting Recommendations into the Lily Architecture

[Architecture diagram; recoverable labels: LILY CRUD API; Lily/HBase Secondary Indexes; read/write demultiplexer; co-occurrence lookup matrix; rowlog; activity store; recommender engine; LILY data store; profile store; data, activity, profile indexes; scoring; algorithm support: propensity, k-means, ALS, custom ...; Lily Core Repository]

Steven Noels
stevenn@ngdata.com
www.ngdata.com
telephone: +32 9 33 88 220
Gent (Belgium)
Makers of Lily

Preferencing aka Feeding the Matrix
•  Transaction-based preferencing
•  Pluggable preference strategies, using Lily-based data (HBase & Solr) for decision making
•  e.g. credit card statement = transactions between users and product families
•  Preference weighting
•  Ingest: REST API, bulk support
•  Real-time updating of the recommendation model

•  Profile Store
•  Profile activities can be preferenced
•  Support for Profile behavior analysis
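Transaction-based preferencing with a pluggable strategy and preference weighting could look roughly like the sketch below. The activity kinds, the weights, and the function names are all hypothetical; the slides do not specify how Lily computes preference values.

```python
from collections import defaultdict

# Hypothetical weights per activity kind (not taken from the slides).
ACTIVITY_WEIGHTS = {"purchase": 1.0, "coupon_redeemed": 0.5, "view": 0.1}


def weight_by_kind(activity):
    """Example strategy: weight an activity by its kind; unknown kinds count as 0."""
    _user, _family, kind = activity
    return ACTIVITY_WEIGHTS.get(kind, 0.0)


def preferences_from_activities(activities, strategy=weight_by_kind, max_pref=5.0):
    """Fold (user, product_family, kind) activities into capped preference values."""
    raw = defaultdict(float)
    for activity in activities:
        user, family, _kind = activity
        raw[(user, family)] += strategy(activity)
    # Cap the accumulated value so heavy buyers do not dominate the matrix.
    return {key: min(max_pref, value) for key, value in raw.items()}


# A credit card statement read as transactions between a user and product families.
statement = [
    ("u1", "coffee", "purchase"),
    ("u1", "coffee", "purchase"),
    ("u1", "books", "view"),
]
matrix_entries = preferences_from_activities(statement)
```

Passing a different `strategy` callable swaps the weighting rule without touching the aggregation, which is one way to read "pluggable preference strategies".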

Making recommendations
•  Recommender
•  Pluggable recommender strategies, using Lily-based data (HBase & Solr) for decision making
•  Multi-model support: user-item & item-user recommendations
•  Estimation of both preferenced and non-preferenced items
•  Geolocation-based recommendations
•  Re-scoring
•  REST API

•  (Planned)
•  Support for Classifications (scenario: recommend me all (possible) coffee drinkers)
•  Matrix / recommendation indexing
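Geolocation-based re-scoring can be sketched as a post-processing step over candidate recommendations: drop offers that are too far away and discount the rest by distance. The linear decay, the 5 km cutoff, and the function names are assumptions for illustration only; the slides do not describe the actual re-scoring rule.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rescore_by_distance(candidates, user_pos, max_km=5.0):
    """Re-score (item, score, (lat, lon)) candidates with a linear distance decay.

    Items farther than max_km from the user are dropped (hypothetical rule).
    """
    rescored = []
    for item, score, (lat, lon) in candidates:
        distance = haversine_km(user_pos[0], user_pos[1], lat, lon)
        if distance <= max_km:
            rescored.append((item, score * (1 - distance / max_km)))
    return sorted(rescored, key=lambda pair: -pair[1])


# Approximate coordinates, used only as sample input.
user_position = (51.05, 3.72)
candidates = [
    ("coupon-nearby", 2.0, (51.05, 3.72)),
    ("coupon-far-away", 9.0, (50.85, 4.35)),
]
nearby_offers = rescore_by_distance(candidates, user_position)
```

The far-away coupon has the higher base score but is filtered out, which is the point of contextualizing by location.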

Other upcoming Lily Features
•  Secondary indexes (= Lily Core!)
•  indexes are defined through configuration
•  single or multi-field indexes
•  range queries and prefix queries
•  asc or desc sorted results
•  can read huge, sorted lists
•  synchronously updated: index updates are applied by rowlog secondary actions
•  online building of new indexes (no table locks)
•  MapReduce integration

•  SolrCloud integration
•  Index shards and configuration managed through ZooKeeper
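Range queries, prefix queries, and sorted results all follow from keeping index entries ordered by field value, much as HBase keeps rows ordered by key. The sketch below shows the idea with an in-memory sorted list; `SortedIndex` and its methods are hypothetical names and do not reflect how Lily stores or configures its secondary indexes.

```python
import bisect


class SortedIndex:
    """Single-field index kept as a sorted list of (field_value, row_key) pairs."""

    def __init__(self):
        self._entries = []

    def put(self, value, row_key):
        # insort keeps the list ordered, so queries never need a full sort.
        bisect.insort(self._entries, (value, row_key))

    def range(self, low, high, descending=False):
        """Row keys whose field value lies in [low, high), in sorted order."""
        # A 1-tuple sorts before any 2-tuple sharing its first element,
        # so (low,) marks the first entry with value >= low.
        start = bisect.bisect_left(self._entries, (low,))
        stop = bisect.bisect_left(self._entries, (high,))
        rows = [row_key for _value, row_key in self._entries[start:stop]]
        return rows[::-1] if descending else rows

    def prefix(self, prefix):
        """Row keys whose string field value starts with prefix.

        Uses the highest code point as an upper bound, so values that continue
        with that exact character right after the prefix are not matched.
        """
        return self.range(prefix, prefix + "\U0010ffff")


index = SortedIndex()
index.put("tea", "r3")
index.put("cola", "r2")
index.put("coffee", "r1")
prefix_hits = index.prefix("co")
range_hits_desc = index.range("a", "d", descending=True)
```

A multi-field index could follow the same pattern with tuples of field values as the sort key.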

Making Sense of Data

Questions? Thank you!
