No martial art beats speed: 
 Building a real-time classifier and a real-time recommender system from streaming data 
Yahoo! Taiwan EC Data Team
Who I am 
▪ Norman Huang (normany@yahoo-inc.com) 
▪ Software & Data Engineer of Yahoo! Taiwan 
▪ Aims to retrieve and deliver data insights via BI 
platform and data mining algorithms. 
Who I am 
▪ Jason Lin (jasonysl@yahoo-inc.com) 
▪ Software & Data Engineer of Yahoo! Taiwan 
▪ Responsible for recommendation system personalization mechanisms; cloud computing developer. 
Agenda 
▪ Challenges 
▪ Solution: Pinball 
▪ Q&A 
Challenges 
▪ Static content until next batch job. 
▪ Batched product recommendation algorithms have become common features among e-commerce platforms. 
[Figure: Processing]
Challenges 
▪ Nearly 72% of visitors made their decision on the same day. 
▪ A real-time solution is needed to interact with potential buyers. 
[Figure: timeline (Time) split into data "Absorbed into batch views" and data "Not yet absorbed"; the latter covers several hours of data]
Solution: Pinball 
Pinball 
[Figure: a User, two Classifier stages, and Profile A, Profile B, Profile C]
Pinball 
▪ Real-time classifier 
▪ Detect buyers’ preferences by streaming data processing 
▪ Deliver personalized ads and product recommendations on the fly 
▪ Challenges 
› How to do it in real time? 
› Storm! 
› How to determine customers’ purchasing desire? 
› Buying Intention Detection 
Solution: Pinball 
▪ Storm Overview 
▪ Buying Intention (BI) 
▪ Architecture and Design 
Pinball 
[Figure, built up over five slides: Buyers feed Learning in Storm; Storm asks "Is Potential Buyer?" about a Visitor and Pinball delivers Promotions; on the last slide the Visitor is relabeled Buyer]
Storm Concepts 
▪ Tuple & Streams 
▪ Spouts & Bolts 
▪ Topologies 
Yahoo Confidential & Proprietary 
Tuple & Streams 
▪ Tuple: a list of fields (Field 1, Field 2, Field 3, Field 4, Field 5) 
▪ Stream: a sequence of tuples (Tuple 1, Tuple 2, Tuple 3, ..., Tuple n) 
Spouts & Bolts 
[Figure: a Spout emits a stream of tuples (T) to a Bolt, which emits tuples (T) onward]
Topology 
[Figure: Spout → Bolt → Bolt, connected by Streams] 
▪ Hadoop map-reduce job vs. Storm topology
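
The spout-to-bolt chain can be imitated in plain Python to get a feel for the dataflow. This is only a sketch: generators stand in for Storm components, and the names `LogTuple`, `log_spout`, `parse_bolt`, `filter_bolt`, and the sample records are hypothetical, not part of Storm's API.

```python
from typing import Iterable, Iterator, List, NamedTuple


class LogTuple(NamedTuple):
    """A tuple: a fixed list of named fields (hypothetical field names)."""
    userid: str
    action: str


def log_spout(raw_lines: Iterable[str]) -> Iterator[str]:
    """Spout: the source of a stream; here it simply replays raw lines."""
    for line in raw_lines:
        yield line


def parse_bolt(stream: Iterable[str]) -> Iterator[LogTuple]:
    """First bolt: consumes one stream and emits a stream of parsed tuples."""
    for line in stream:
        userid, action = line.split(",")
        yield LogTuple(userid.strip(), action.strip())


def filter_bolt(stream: Iterable[LogTuple], action: str) -> Iterator[LogTuple]:
    """Second bolt: keeps only the tuples with the requested action."""
    for tup in stream:
        if tup.action == action:
            yield tup


def run_topology(raw_lines: Iterable[str]) -> List[LogTuple]:
    """Topology: spout -> bolt -> bolt, wired together by streams."""
    return list(filter_bolt(parse_bolt(log_spout(raw_lines)), "view"))


sample = ["userA, view", "userB, buy", "userA, view"]
result = run_topology(sample)
```

A real Storm topology differs in that it runs continuously and in parallel across worker processes, whereas this single-process chain stops when its input ends, much like a batch job.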
Storm Concepts 
          Computational Primitives   Use Case            High-level Language 
Hadoop    Map & Reduce               Batch Processing    Pig 
Storm     Spout & Bolt               Stream Processing   Trident
Storm 
▪ Nimbus: master node, similar to the Hadoop JobTracker 
▪ Zookeeper: coordinates the Storm cluster 
▪ Supervisor: runs worker processes 
[Figure: one Nimbus, three Zookeeper nodes, five Supervisors]
Buying Intention 
▪ Based on our findings: 
› The more page views, the higher the chance that a visitor will buy. 
› BUT the buying intention value varies from category to category. 
[Figure showing the values 2 and 6]
How to leverage 
Storm with Buying Intention (BI)?
Data Flow Diagram 
Adaptive Learning 
Learning & Classifier 
▪ Online Binary Classification 
› Simple and computationally efficient 
▪ Update rule: BI’ = BI + (PV − BI) × γ 
› PV = page views before purchasing, γ = learning rate 
▪ e.g. 
› assumptions: γ = 0.1, BI = 3 
› scenario: a user makes 6 page views before purchasing 
• BI’ = 3 + (6 − 3) × 0.1 
• BI’ = 3.3 
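
The update rule and the worked example can be checked with a few lines of Python. This is a minimal sketch, assuming one BI value is kept per category (the slides only say that BI varies by category); `observe_purchase`, `bi_by_category`, and the category names are hypothetical.

```python
from collections import defaultdict

GAMMA = 0.1        # learning rate, taken from the slide's example
INITIAL_BI = 3.0   # starting BI, taken from the slide's example


def update_bi(bi: float, page_views: float, gamma: float = GAMMA) -> float:
    """Apply BI' = BI + (PV - BI) * gamma."""
    return bi + (page_views - bi) * gamma


# Assumption: one BI value per category, each starting at INITIAL_BI.
bi_by_category = defaultdict(lambda: INITIAL_BI)


def observe_purchase(category: str, page_views: int) -> float:
    """Update a category's BI after a purchase preceded by `page_views` views."""
    bi_by_category[category] = update_bi(bi_by_category[category], page_views)
    return bi_by_category[category]


# The slide's scenario: 6 page views before purchasing, BI starts at 3.
first = observe_purchase("shoes", 6)   # approximately 3.3
```

The rule can be rewritten as BI’ = (1 − γ) × BI + γ × PV, an exponentially weighted moving average, which is why each update needs only the previous BI and one new observation.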
Buying Intention Qualification 
Topology Design
Lambda Architecture 
▪ Term created by Nathan Marz (Creator of Apache Storm) 
▪ Batch + Real-time processing 
› Hybrid batch and real-time processing 
› Batch processing is treated as the source of truth, and real-time processing updates models/insights between batches. 
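
The hybrid idea can be sketched as two views that are merged at query time. This is an illustrative sketch only, assuming per-user page-view counts as the quantity being served; `merge_views`, `run_batch`, and the sample numbers are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def merge_views(batch_view: Counter, realtime_view: Counter) -> Counter:
    """Answer a query by combining the batch view with the real-time view."""
    return batch_view + realtime_view


def run_batch(all_events: List[str]) -> Tuple[Counter, Counter]:
    """Recompute the batch view from the full event log.

    The batch result is the source of truth, so the real-time view is
    reset to empty once the batch job has absorbed the events.
    """
    return Counter(all_events), Counter()


# Page views per user that the last batch job has already absorbed ...
batch_view = Counter({"userA": 5, "userB": 2})
# ... and those counted by the real-time layer since that batch job.
realtime_view = Counter({"userA": 1, "userC": 3})

merged = merge_views(batch_view, realtime_view)
```

Because the real-time view only has to cover the hours since the last batch run, it can stay small and approximate; any error it accumulates is discarded when the next batch view replaces it.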
Lambda Architecture 
[Figure: Lambda Architecture diagram, annotated over successive slides with "Storm Streaming" and "Summingbird"] 
[REF] http://lambda-architecture.net/
Pinball Demonstration
How to keep it generic and flexible? 
▪ to add more signals 
▪ to add more online learning algorithms 
▪ to add more channels
How to keep it generic and flexible? 
▪ Signals: Click, Login, Buy, View, Bounce, Time Spent, Webpage Sequence 
▪ Algorithms: Buying Intention, Fraud Detection 
▪ Channels: Email, Y! Webpages, Mobile Apps, Messenger
Summary 
▪ Scalable to process real-time data 
▪ Supports online learning algorithms 
▪ Flexible interactions with visitors 
▪ Increases user engagement 
▪ Increases the conversion rate 
▪ Creates synergy by combining the batched recommender and Pinball 
Simple Hands-on 
-> Find out the heavy users!
Find out the heavy users! 
▪ Keep track of the number of page views for each user 
▪ If the number is greater than 3, the user is a heavy user 
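
The two rules above fit in a short Python function. This is a sketch under assumptions: events are `(userid, type)` pairs and the page-view event type is spelled `"view"`; both, along with the name `find_heavy_users`, are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

HEAVY_USER_THRESHOLD = 3  # from the slide: more than 3 page views


def find_heavy_users(
    events: Iterable[Tuple[str, str]],
    threshold: int = HEAVY_USER_THRESHOLD,
) -> List[str]:
    """Return users in the order in which they exceed `threshold` page views."""
    page_views = Counter()
    heavy = []
    for userid, event_type in events:
        if event_type != "view":
            continue  # only page views are counted
        page_views[userid] += 1
        # Report each user exactly once, when the count first exceeds the threshold.
        if page_views[userid] == threshold + 1:
            heavy.append(userid)
    return heavy


events = [("userB", "view")] * 4 + [("userA", "view")] * 3 + [("userB", "buy")]
heavy_users = find_heavy_users(events)
```

Here `userB` qualifies on the fourth page view, while `userA` stops at exactly 3 and does not.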
Find out the heavy users! 
[Figure: User Log Spout → Learning Bolt; tuple fields: userid, type, catlv1, catlv2, timestamp]
Find out the heavy users! 
[Figure: User Log Spout → two Learning Bolts, connected by shuffleGroup; tuple fields: userid, type, catlv1, catlv2, timestamp] 
Tuples arriving at the two Learning Bolts (with shuffleGroup, tuples of the same user can reach either bolt): 
userA, xxxxx 
userB, xxxxx 
userD, xxxxx 
userB, xxxxx 
userE, xxxxx 
userC, xxxxx 
userB, xxxxx 
userC, xxxxx
Find out the heavy users! 
[Figure: User Log Spout → two Learning Bolts, connected by fieldGroup; tuple fields: userid, type, catlv1, catlv2, timestamp] 
Tuples arriving at the two Learning Bolts (with fieldGroup on userid, all tuples of a user reach the same bolt): 
userA, xxxxx 
userD, xxxxx 
userF, xxxxx 
userF, xxxxx 
userE, xxxxx 
userC, xxxxx 
userB, xxxxx 
userB, xxxxx 
userB, xxxxx 
userC, xxxxx
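
The difference between the two groupings matters because each Learning Bolt keeps its own page-view counts. The sketch below imitates both routing strategies in plain Python; it is not Storm's implementation, and `shuffle_group`, `field_group`, `route`, and the CRC32-based hash are assumptions chosen for illustration.

```python
import random
import zlib
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable


def shuffle_group(userid: str, n_bolts: int, rng: random.Random) -> int:
    """Shuffle grouping: pick a bolt at random, ignoring the tuple's fields."""
    return rng.randrange(n_bolts)


def field_group(userid: str, n_bolts: int) -> int:
    """Fields grouping: hash the key so equal userids map to the same bolt."""
    return zlib.crc32(userid.encode("utf-8")) % n_bolts


def route(userids: Iterable[str], pick: Callable[[str], int]) -> Dict[int, Counter]:
    """Count page views per user separately on each bolt."""
    bolts: Dict[int, Counter] = defaultdict(Counter)
    for userid in userids:
        bolts[pick(userid)][userid] += 1
    return bolts


stream = ["userA", "userB", "userD", "userB", "userE", "userC", "userB", "userC"]
rng = random.Random(0)  # seeded only to make the sketch repeatable
by_shuffle = route(stream, lambda u: shuffle_group(u, 2, rng))
by_field = route(stream, lambda u: field_group(u, 2))
```

With `by_field`, every user's full count lives on exactly one bolt, so a per-bolt threshold check is reliable. With `by_shuffle`, a user's count may be split between bolts, and no single bolt is guaranteed to see the total.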
Find out the heavy users! 
[Figure: User Log Spout → two Learning Bolts (fieldGroup) → Qualification Bolt; the Learning Bolts receive the same tuples as on the previous slide] 
Tuples arriving at the Qualification Bolt: 
userA, totalPV 
userB, totalPV 
userC, totalPV 
userF, totalPV
Questions? 
Norman: @normanyhuang, www.linkedin.com/in/normany 
Jason: @kalijason, www.linkedin.com/pub/jason-lin/67/93/743

Más contenido relacionado

La actualidad más candente

Hadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for BioinformaticsHadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for Bioinformaticsosintegrators
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and RoadmapImply
 
Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)Javier de la Rosa
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...MongoDB
 
Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010Rob Ragan
 
Getting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4jGetting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4jSuroor Wijdan
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...NoSQLmatters
 
Neo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseNeo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseMindfire Solutions
 
Caching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesCaching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesRoi Blanco
 
Tenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of BingTenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of BingRob Ragan
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4jjexp
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...jexp
 
Neo4j Fundamentals
Neo4j FundamentalsNeo4j Fundamentals
Neo4j FundamentalsMax De Marzi
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache DruidImply
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherMongoDB
 
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...Rob Ragan
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB
 
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)barcelonajug
 

La actualidad más candente (20)

Hadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for BioinformaticsHadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for Bioinformatics
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and Roadmap
 
Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
 
Big Data made easy with a Spark
Big Data made easy with a SparkBig Data made easy with a Spark
Big Data made easy with a Spark
 
Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010
 
Getting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4jGetting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4j
 
Neo4j in Depth
Neo4j in DepthNeo4j in Depth
Neo4j in Depth
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
 
Neo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseNeo4J : Introduction to Graph Database
Neo4J : Introduction to Graph Database
 
Caching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesCaching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental Indices
 
Tenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of BingTenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of Bing
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4j
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...
 
Neo4j Fundamentals
Neo4j FundamentalsNeo4j Fundamentals
Neo4j Fundamentals
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache Druid
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
 
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business Insights
 
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
 

Destacado

李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接台灣資料科學年會
 
一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易台灣資料科學年會
 
林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅台灣資料科學年會
 
Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析台灣資料科學年會
 
[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123台灣資料科學年會
 
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)台灣資料科學年會
 

Destacado (7)

李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接
 
Z > B 的資料科學
Z > B 的資料科學Z > B 的資料科學
Z > B 的資料科學
 
一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易
 
林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅
 
Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析
 
[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123
 
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
 

Similar a 天下武功唯快不破:利用串流資料實做出即時分類器和即時推薦系統

Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products Lili Wu
 
Trending with Purpose
Trending with PurposeTrending with Purpose
Trending with PurposeJason Dixon
 
Schema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowSchema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowRichard Wallis
 
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...Dataconomy Media
 
Big Data Berlin - Criteo
Big Data Berlin - CriteoBig Data Berlin - Criteo
Big Data Berlin - CriteoSofian Djamaa
 
Wireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the JobWireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the JobCatharine Robertson
 
Graphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in ProductionGraphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in ProductionNeo4j
 
Architecting a next generation data platform
Architecting a next generation data platformArchitecting a next generation data platform
Architecting a next generation data platformhadooparchbook
 
Learn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-TimeLearn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-TimeDynamic Yield
 
Semantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsSemantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsPeter Mika
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Impetus Technologies
 
Natural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion JamNatural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion JamCraig Sullivan
 
Complex things explained easily
Complex things explained easilyComplex things explained easily
Complex things explained easilyLuca Tumedei
 
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajH2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajSri Ambati
 
IronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour PresentationIronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour PresentationIronEdge Group
 
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, EuropePatterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, EuropeFlip Kromer
 
2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockups2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockupsBaltimore Lean Startup
 
iPhone game development - Joash Chee
iPhone game development - Joash CheeiPhone game development - Joash Chee
iPhone game development - Joash Cheejasonong
 
Architecting next generation big data platform
Architecting next generation big data platformArchitecting next generation big data platform
Architecting next generation big data platformhadooparchbook
 

Similar a 天下武功唯快不破:利用串流資料實做出即時分類器和即時推薦系統 (20)

Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products
 
Trending with Purpose
Trending with PurposeTrending with Purpose
Trending with Purpose
 
Schema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowSchema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & How
 
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
 
Big Data Berlin - Criteo
Big Data Berlin - CriteoBig Data Berlin - Criteo
Big Data Berlin - Criteo
 
NoSQL e Python RuPy 2012
NoSQL e Python RuPy 2012NoSQL e Python RuPy 2012
NoSQL e Python RuPy 2012
 
Wireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the JobWireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the Job
 
Graphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in ProductionGraphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in Production
 
Architecting a next generation data platform
Architecting a next generation data platformArchitecting a next generation data platform
Architecting a next generation data platform
 
Learn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-TimeLearn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-Time
 
Semantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsSemantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistants
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
 
Natural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion JamNatural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion Jam
 
Complex things explained easily
Complex things explained easilyComplex things explained easily
Complex things explained easily
 
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajH2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
 
IronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour PresentationIronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour Presentation
 
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, EuropePatterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
 
2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockups2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockups
 
iPhone game development - Joash Chee
iPhone game development - Joash CheeiPhone game development - Joash Chee
iPhone game development - Joash Chee
 
Architecting next generation big data platform
Architecting next generation big data platformArchitecting next generation big data platform
Architecting next generation big data platform
 

Más de 台灣資料科學年會

[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用台灣資料科學年會
 
[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告台灣資料科學年會
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰台灣資料科學年會
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機台灣資料科學年會
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機台灣資料科學年會
 
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話台灣資料科學年會
 
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇台灣資料科學年會
 
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 [TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 台灣資料科學年會
 
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵台灣資料科學年會
 
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用台灣資料科學年會
 
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告台灣資料科學年會
 
[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話台灣資料科學年會
 
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人台灣資料科學年會
 
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維台灣資料科學年會
 
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察台灣資料科學年會
 
[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰台灣資料科學年會
 
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT台灣資料科學年會
 
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達台灣資料科學年會
 
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳台灣資料科學年會
 

Más de 台灣資料科學年會 (20)

[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用
 
[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
 
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
 
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
 
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 [TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
 
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
 
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
 
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
 
台灣人工智慧學校成果發表會
台灣人工智慧學校成果發表會台灣人工智慧學校成果發表會
台灣人工智慧學校成果發表會
 
[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話
 
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
 
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
 
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
 
[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰
 
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
 
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
 
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
 


Speed Beats Everything: Building a Real-Time Classifier and a Real-Time Recommender System from Streaming Data

  • 2. Who I am ▪ Norman Huang (normany@yahoo-inc.com) ▪ Software & Data Engineer at Yahoo! Taiwan ▪ Aims to retrieve and deliver data insights via a BI platform and data mining algorithms.
  • 3. Who I am ▪ Jason Lin (jasonysl@yahoo-inc.com) ▪ Software & Data Engineer at Yahoo! Taiwan ▪ Works on recommendation system personalization mechanisms and cloud computing development.
  • 4. Agenda ▪ Challenges ▪ Solution: Pinball ▪ Q&A
  • 5-6. Challenges [Figure label: Processing] ▪ Content stays static until the next batch job. ▪ Batch product recommendation algorithms have become a common feature of e-commerce platforms.
  • 7-8. Challenges [Figure: timeline of data already absorbed into batch views versus the most recent several hours of data not yet absorbed] ▪ Nearly 72% of visitors made their decision on the same day. ▪ A real-time solution is needed to interact with potential buyers.
  • 10. Pinball [Figure labels: User, Classifier, Classifier, Profile A, Profile B, Profile C]
  • 11-15. Pinball ▪ Real-time classifier ▪ Detects buyers’ preferences through streaming data processing ▪ Delivers personalized ads and product recommendations on the fly ▪ Challenges › How to do it in real time? Storm! › How to determine customers’ purchasing desire? Buying Intention Detection
  • 16. Solution: Pinball ▪ Storm Overview ▪ Buying Intention (BI) ▪ Architecture and Design
  • 17-21. Pinball [Figure, built up over five slides; labels: Storm, Learning, Buyers, Visitor, "Is Potential Buyer?", Promotions, Pinball. In the last build the Visitor label becomes Buyer.]
  • 22. Storm Concepts ▪ Tuple & Streams ▪ Spouts & Bolts ▪ Topologies Yahoo Confidential & Proprietary
  • 23. Tuple & Streams ▪ Tuple [Figure: one tuple made up of Field 1 through Field 5] ▪ Stream [Figure: a sequence of tuples: Tuple 1, Tuple 2, Tuple 3, ..., Tuple n]
  • 24. Spouts & Bolts [Figure: a spout emitting tuples (T) to a bolt, which emits tuples in turn]
  • 25-26. Topology [Figure: a spout connected to bolts by streams] ▪ Hadoop map-reduce job vs. Storm topology
  • 27. Storm Concepts ▪ Hadoop: computational primitives Map & Reduce; use case batch processing; high-level language Pig ▪ Storm: computational primitives Spout & Bolt; use case stream processing; high-level language Trident
  • 28-30. Storm [Figure: one Nimbus node, three Zookeeper nodes, five Supervisor nodes] ▪ Nimbus: master node, similar to the Hadoop JobTracker ▪ Zookeeper: coordinates the Storm cluster ▪ Supervisor: runs worker processes
  • 31. Buying Intention ▪ Based on our findings: › The more page views, the higher the chance that a visitor will buy. › BUT the buying intention value varies by category. [Figure: example values 2 and 6]
  • 32. How to leverage Storm with Buying Intention (BI)?
  • 35. Learning & Classifier ▪ Online Binary Classification › Simple and computationally efficient › Update rule: BI' = BI + (PV − BI) × γ ▪ e.g. › Assumptions: γ = 0.1, BI = 3 › Scenario: a user makes 6 page views (PV) before purchasing › BI' = 3 + (6 − 3) × 0.1 = 3.3
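The update rule on slide 35 can be sketched in a few lines of Python. This is a minimal illustration, not the speakers' implementation: the function names are hypothetical, and the decision rule in `is_potential_buyer` (flag a visitor once their page views reach the learned BI value) is an assumption, since the slide only gives the update formula.

```python
def update_bi(bi: float, page_views: int, gamma: float = 0.1) -> float:
    """Apply one online update: BI' = BI + (PV - BI) * gamma.

    Each observed purchase moves BI a fraction gamma of the way
    toward the page-view count (PV) seen before that purchase.
    """
    return bi + (page_views - bi) * gamma


def is_potential_buyer(page_views: int, bi: float) -> bool:
    # Assumed decision rule (not stated on the slide): a visitor whose
    # page views have reached the learned BI value is a potential buyer.
    return page_views >= bi


# Worked example from the slide: gamma = 0.1, BI = 3, PV = 6.
bi = update_bi(3.0, page_views=6, gamma=0.1)
print(round(bi, 2))  # 3.3
```

Because the talk notes that the buying intention value varies by category, a fuller version would presumably keep one BI value per category, for example in a dictionary keyed by category.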
  • 38-41. Lambda Architecture ▪ Term coined by Nathan Marz (creator of Apache Storm) ▪ Batch + real-time processing › Hybrid batch and real-time processing › Batch processing is treated as the source of truth; real-time processing updates models/insights between batches.
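The division of labor between the batch and real-time layers can be sketched as a query that merges two views. This is a simplified illustration under stated assumptions: the function name and the per-user page-view counts are hypothetical, and the real-time view is assumed to hold only events that arrived after the last batch job, so the two views never double count.

```python
from collections import Counter


def merged_view(batch_view: dict, realtime_view: dict) -> dict:
    """Answer a query by adding real-time deltas to the batch view.

    The batch view is the source of truth up to the last batch run;
    the real-time view covers only what arrived since then (assumption).
    """
    merged = Counter(batch_view)
    merged.update(realtime_view)  # Counter.update adds counts per key
    return dict(merged)


# Hypothetical page-view counts per user.
batch_view = {"userA": 10, "userB": 2}
realtime_view = {"userB": 3, "userC": 1}
print(merged_view(batch_view, realtime_view))
```

When the next batch job finishes, its output replaces `batch_view` and the real-time deltas it has absorbed can be discarded.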
  • 42-45. Lambda Architecture [Figure: Lambda Architecture diagram; later builds add the labels Storm, Streaming, and Summingbird] [REF] http://lambda-architecture.net/
  • 48. How to keep it generic and flexible? ▪ to add more signals ▪ to add more online learning algorithms ▪ to add more channels
  • 49. How to keep it generic and flexible? ▪ Signals: Click, Login, Buy, View, Bounce, Time Spent ▪ Algorithms: Buying Intention, Fraud Detection, Webpage Sequence ▪ Channels: Email, Y! Webpages, Mobile Apps, Messenger
  • 50. Summary ▪ Scales to process real-time data ▪ Supports online learning algorithms ▪ Allows flexible interactions with visitors ▪ Increases user engagement ▪ Increases the conversion rate ▪ Creates synergy by combining the batch recommender with Pinball
  • 51. Simple Hands-on -> Find out the heavy users!
  • 52. Find out the heavy users! ▪ Keep a count of page views for each user ▪ If the count is greater than 3, the user is a heavy user
  • 53. Find out the heavy users! [Figure: User Log Spout emits tuples (userid, type, catlv1, catlv2, timestamp) to a Learning Bolt]
  • 54. Find out the heavy users! [Figure: with shuffleGroup, the spout's tuples are spread across two Learning Bolt instances regardless of userid, so tuples for one user (e.g. userB, userC) can reach both instances]
  • 55. Find out the heavy users! [Figure: with fieldGroup, tuples with the same userid all go to the same Learning Bolt instance]
  • 56. Find out the heavy users! [Figure: the Learning Bolt instances, fed via fieldGroup, send (userid, totalPV) pairs for userA, userB, userC, and userF to a Qualification Bolt]
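Why the grouping matters for the heavy-user count can be shown with a small plain-Python simulation. This is a sketch, not Storm code: it does not use the Storm API, the log records are made up, `fields_group` hashes the userid the way a fields grouping on userid would route tuples, and `make_round_robin` is a deterministic stand-in for shuffle grouping (which in Storm is random). All names are hypothetical.

```python
from collections import Counter
from itertools import count
from zlib import crc32

HEAVY_THRESHOLD = 3  # slide 52: more than 3 page views means heavy user


def fields_group(userid: str, n_tasks: int) -> int:
    # Same userid always maps to the same Learning Bolt task.
    return crc32(userid.encode()) % n_tasks


def make_round_robin():
    # Deterministic stand-in for shuffle grouping: ignores the userid.
    counter = count()
    return lambda userid, n_tasks: next(counter) % n_tasks


def run_topology(log, n_tasks, route):
    """Simulate Spout -> Learning Bolt tasks -> Qualification Bolt."""
    task_counts = [Counter() for _ in range(n_tasks)]  # per-task state
    heavy = set()
    for userid, *_rest in log:
        task = route(userid, n_tasks)
        task_counts[task][userid] += 1  # Learning Bolt: count page views
        if task_counts[task][userid] > HEAVY_THRESHOLD:
            heavy.add(userid)  # Qualification Bolt: flag heavy users
    return heavy


# Hypothetical log tuples: (userid, type, catlv1, catlv2, timestamp).
LOG = [
    ("userA", "view", "c1", "c11", 1),
    ("userB", "view", "c1", "c12", 2),
    ("userB", "view", "c1", "c12", 3),
    ("userB", "view", "c2", "c21", 4),
    ("userB", "view", "c2", "c21", 5),
    ("userC", "view", "c3", "c31", 6),
]

print(run_topology(LOG, 2, fields_group))        # {'userB'}
print(run_topology(LOG, 2, make_round_robin()))  # set()
```

With the stand-in for shuffle grouping, userB's four page views are split two and two across the tasks, so neither task ever sees a count above 3 and the heavy user is missed. Grouping by userid keeps each user's count in one place.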
  • 57. Questions? ▪ Norman: @normanyhuang, www.linkedin.com/in/normany ▪ Jason: @kalijason, www.linkedin.com/pub/jason-lin/67/93/743