Hivemall:	Machine	Learning	
Library	for	Apache	Hive/Spark
Research	Engineer
Makoto	YUI	(油井 誠) @myui
<myui@treasure-data.com>
2016/09/09 HadoopCon 16, Taipei
➢ 2015.04~ Research Engineer at Treasure Data, Inc.
• My mission is developing ML-as-a-Service in a Hadoop-as-a-service company
➢ 2010.04-2015.03 Senior Researcher at National Institute of Advanced Industrial Science and Technology (産業技術総合研究所), Japan
• Developed Hivemall as a personal research project
➢ 2009.03 Ph.D. in Computer Science from NAIST
• Majored in Parallel Data Processing, not ML, at the time
➢ Visiting scholar at CWI, Amsterdam and Univ. Edinburgh
Little	about	me	..
Hiro Yoshikawa, CEO: open source business veteran
Kaz Ota, CTO: founder of the world’s largest Hadoop group
Sada Furuhashi, Chief Architect: invented Fluentd and MessagePack
Today: 100+ employees, 30M+ funding
2015: New office in Seoul, Korea
2013: New office in Tokyo, Japan
2012: Founded in Mountain View, CA
Investors
Jerry Yang (Yahoo! Founder)
Bill Tai (Angel Investor)
Yukihiro Matsumoto (Ruby Inventor)
Sierra Ventures - Tim Guleri (Enterprise Software)
Scale Ventures - Andy Vitus (B2B SaaS)
Treasure	Data
We Open-source! TD invented ..
• Streaming log collector
• Bulk data import/export
• Efficient binary serialization
• Streaming query processor
• Machine learning on Hadoop
• Workflow engine (Beta): digdag.io
Microsoft Operations Management Suite and Google Cloud Platform (Kubernetes) are using Fluentd for log collection
Point
Our	technology	users
Treasure	Data’s	Solution
Big	Data	Stats	in	TD
Ad-tech
IoT
三菱重工
Agency	/	Trading Desk DMP / DSP Ad-Network
EC, Media, Game/SNS
Gaming, e-Commerce, Internet Service
Retail, Finance, Technology, Telecommunication, Maker
Other	domain
Our	Customers
1. What	is	Hivemall	(introduction)
2. Why	Hivemall	(motivations	etc.)
3. Hivemall	Internals
4. How	to	use	Hivemall
5. Future	roadmap
Agenda
What	is	Hivemall
Scalable	machine	learning	library	built	
as	a	collection	of	Hive	UDFs,	licensed	
under	the	Apache	License	v2
https://github.com/myui/hivemall
Hivemall’s Technology Stack
Machine Learning: Hivemall, MLlib
Query Processing: Hive, Pig, SparkSQL
Parallel Data Processing Framework: MapReduce (MRv1), Apache Tez (DAG processing), Apache Spark
Resource Management: Apache YARN, MESOS
Distributed File System / Cloud Storage: Hadoop HDFS, Amazon S3
Hivemall’s Vision:	ML	on	SQL
Classification	with	Mahout
CREATE TABLE lr_model AS
SELECT
  feature, -- reducers perform model averaging in parallel
  avg(weight) as weight
FROM (
  SELECT logress(features,label,..) as (feature,weight)
  FROM train
) t -- map-only task
GROUP BY feature; -- shuffled to reducers
✓Machine	Learning	made	easy	for	SQL	
developers	(ML	for	the	rest	of	us)
✓Interactive	and	Stable	APIs	w/ SQL	abstraction
This	SQL	query	automatically	runs	in	
parallel	on	Hadoop	
List	of	supported	Algorithms
Classification
✓ Perceptron
✓ Passive Aggressive (PA, PA1, PA2)
✓ Confidence Weighted (CW)
✓ Adaptive Regularization of Weight Vectors (AROW)
✓ Soft Confidence Weighted (SCW)
✓ AdaGrad+RDA
✓ Factorization Machines
✓ RandomForest Classification
Regression
✓ Logistic Regression (SGD)
✓ AdaGrad (logistic loss)
✓ AdaDELTA (logistic loss)
✓ PA Regression
✓ AROW Regression
✓ Factorization Machines
✓ RandomForest Regression
SCW is a good first choice
Try RandomForest if SCW does not work
Logistic regression is good for getting the probability of the positive class
Factorization Machines work well where features are sparse and categorical
List	of	Algorithms	for	Recommendation
K-Nearest Neighbor
✓ Minhash and b-Bit Minhash (LSH variant)
✓ Similarity Search on Vector Space (Euclid/Cosine/Jaccard/Angular)
Matrix Completion
✓ Matrix Factorization
✓ Factorization Machines (regression)
each_top_k function	of	Hivemall	is	
useful	for	recommending	top-k	items
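To make the idea of per-group top-k selection concrete, here is a minimal Python sketch. It is not Hivemall's implementation and does not reproduce the each_top_k signature. The function name `top_k_per_group` and the sample rows are assumptions made for this illustration only.

```python
import heapq
from collections import defaultdict

def top_k_per_group(rows, k):
    """Return {group: [(score, item), ...]} holding the k highest scores per group."""
    grouped = defaultdict(list)
    for group, item, score in rows:
        grouped[group].append((score, item))
    # heapq.nlargest returns the k largest entries in descending order
    return {g: heapq.nlargest(k, pairs) for g, pairs in grouped.items()}

# Hypothetical (user, item, predicted score) rows
rows = [
    ("u1", "a", 0.9), ("u1", "b", 0.4), ("u1", "c", 0.7),
    ("u2", "a", 0.2), ("u2", "d", 0.8),
]
top2 = top_k_per_group(rows, k=2)
```

Each user ends up with at most k recommended items, ordered from the highest score down.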
Other	Supported	Algorithms
Anomaly Detection
✓ Local Outlier Factor (LOF)
Feature Engineering
✓ Feature Hashing
✓ Feature Scaling (normalization, z-score)
✓ TF-IDF vectorizer
✓ Polynomial Expansion (Feature Pairing)
✓ Amplifier
NLP
✓ Basic English text tokenizer
✓ Japanese tokenizer (Kuromoji)
Problem: Recommending hot items is hard in a handcrafted-product market because each creator sells only a few one-off items (which soon go out of stock)
minne.com
• CTR	prediction	of	Ad	click	logs
• Algorithm:	Logistic	regression
• Freakout Inc.,	Smartnews,	and	more
• Gender	prediction	of	Ad	click	logs
• Algorithm:	Classification
• Scaleout Inc.
• Item/User	recommendation
• Algorithm:	Recommendation
• Wish.com,	GMO	pepabo
• Value prediction of real estate
• Algorithm: Regression
• Livesense
• User score calculation
• Algorithm: Regression
• Klout
Industry	use	cases	of	Hivemall
bit.ly/klout-hivemall
Influencer	marketing
klout.com
OISIX,	a	leading	food	delivery	service	company	in	Japan,	
used	Hivemall’s Logistic	Regression	to	get	churn	probability	
Churn Detection of Monthly Payment Service
Churn rate dropped by almost half after giving gift points to customers predicted to leave
Motivation	– Why	a	new	ML	framework?
Mahout?
Vowpal	Wabbit?
(w/	Hadoop	streaming)
Spark	MLlib?
0xdata	H2O? Cloudera	Oryx?
Machine	Learning	frameworks	out	there that	
run	with	Hadoop
Quick	Poll:	
How	many	people	in	this	room	are	using	them?
How I used to do ML projects before Hivemall
Given raw data stored on Hadoop HDFS
[Figure: raw data (HDFS, S3) → Extract-Transform-Load → feature vector file → Machine Learning]
Feature vector example: height:173cm, weight:60kg, age:34, gender: man, …
• Extract-Transform-Load: need to do expensive data preprocessing (joins, filtering, and formatting of data that does not fit in memory)
• Machine Learning: does not scale; have to learn R/Python APIs
Hivemall	on	Apache	Spark
Installation	is	very	easy	as	follows:
$	spark-shell	--packages	maropu:hivemall-spark:0.0.6	
Implemented	machine	learning	algorithms	as	
User-Defined	Table	generating	Functions	(UDTFs)
How	Hivemall	works	in	training
[Figure: the training table holds tuples <label, array<features>>, e.g. (+1, <1,2>), (+1, <1,7,9>), (-1, <1,3,9>), (+1, <3,8>). Parallel train UDTFs emit tuples <feature, weights>, which are shuffled by feature and combined by param-mix into the prediction model, a relation <feature, weights>.]
● The resulting prediction model is a relation of features and their weights
● The number of mappers and reducers is configurable
A UDTF is a function that returns a relation
Parallelism is Powerful
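The training flow described above can be sketched in plain Python: each "mapper" learns weights on its own partition and emits (feature, weight) tuples, and the "reducers" average the weights per feature. This is a minimal sketch under stated assumptions, not Hivemall's code. The names `train_partition` and `average_by_feature`, the learning rate, the epoch count, and the 0/1 label encoding (the slide uses +1/-1) are all choices made for this example.

```python
import math
from collections import defaultdict

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_partition(rows, eta=0.1, epochs=10):
    """One 'mapper': logistic regression by SGD over its own rows.
    Each row is (label in {0, 1}, list of feature ids with implicit value 1.0).
    Emits (feature, weight) tuples, like a UDTF returning a relation."""
    w = defaultdict(float)
    for _ in range(epochs):
        for label, features in rows:
            p = sigmoid(sum(w[f] for f in features))
            for f in features:
                w[f] += eta * (label - p)
    return list(w.items())

def average_by_feature(emitted):
    """The 'reducers': group emitted tuples by feature and average the weights."""
    sums, counts = defaultdict(float), defaultdict(int)
    for feature, weight in emitted:
        sums[feature] += weight
        counts[feature] += 1
    return {f: sums[f] / counts[f] for f in sums}

# Two hypothetical partitions of the training table
partitions = [
    [(1, [1, 2]), (1, [1, 7, 9]), (0, [1, 3, 9])],
    [(0, [2, 7, 9]), (1, [3, 8])],
]
emitted = [t for part in partitions for t in train_partition(part)]
model = average_by_feature(emitted)
```

A feature's weight is averaged only over the partitions that actually emitted it, which is the same behaviour as `avg(weight)` with `GROUP BY feature`.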
[Figure: the training table of tuples <label, features> is split across parallel train tasks; each task produces an array<weight>, and the MIX step combines them.]
Parameter	averaging	(bagging)
Alternative	Approach	in	Hivemall
Hivemall provides the amplify UDTF to emulate the effect of training iterations without running several MapReduce steps
SET hivevar:xtimes=3;
CREATE VIEW training_x3
as
SELECT
  *
FROM (
  SELECT
    amplify(${xtimes}, *) as (rowid, label, features)
  FROM
    training
) t
CLUSTER BY rand();
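The effect of the view above can be illustrated with a small Python sketch: every training row is emitted xtimes times, and the result is put in random order so that a single pass over it resembles several passes over the original data. The helper names and the sample rows are assumptions for this example, and `random.shuffle` only loosely stands in for what `CLUSTER BY rand()` does on a cluster.

```python
import random

def amplify(xtimes, rows):
    """Emit every input row xtimes times."""
    for row in rows:
        for _ in range(xtimes):
            yield row

def amplify_and_shuffle(xtimes, rows, seed=None):
    """Replicate rows, then shuffle them (a loose stand-in for CLUSTER BY rand())."""
    out = list(amplify(xtimes, rows))
    random.Random(seed).shuffle(out)
    return out

# Hypothetical (rowid, label, features) rows
training = [(1, 1.0, ["1", "2"]), (2, -1.0, ["1", "3", "9"])]
training_x3 = amplify_and_shuffle(3, training, seed=42)
```

The output has three copies of each input row; only their order is randomized.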
How to use Hivemall
[Figure: workflow. Training: labels and feature vectors go into machine learning, which produces a prediction model. Prediction: the prediction model takes a feature vector and outputs a label. Current step: Data preparation]
CREATE EXTERNAL TABLE e2006tfidf_train (
  rowid int,
  label float,
  features ARRAY<STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ","
STORED AS TEXTFILE LOCATION '/dataset/E2006-tfidf/train';
How	to	use	Hivemall	- Data	preparation
Define	a	Hive	table	for	training/testing	data
create view e2006tfidf_train_scaled
as
select
  rowid,
  rescale(label, ${min_label}, ${max_label}) as label,
  features
from
  e2006tfidf_train;
Applying a Min-Max Feature
Normalization
How	to	use	Hivemall	- Feature	Engineering
Transforming	a	label	value	
to	a	value	between	0.0	and	1.0
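Min-max normalization itself is a one-line formula, (value - min) / (max - min). The Python sketch below shows it on made-up label values. The function is a stand-alone illustration that borrows the name `rescale` for readability; raising an error when min equals max is a choice made for this example, not a statement about how Hivemall handles that case.

```python
def rescale(value, min_value, max_value):
    """Min-max normalization: map value from [min_value, max_value] to [0.0, 1.0]."""
    if max_value == min_value:
        raise ValueError("min_value and max_value must differ")
    return (value - min_value) / (max_value - min_value)

labels = [-7.9, -3.6, -0.5]  # hypothetical label values
lo, hi = min(labels), max(labels)
scaled = [rescale(v, lo, hi) for v in labels]
```

The smallest label maps to 0.0, the largest to 1.0, and everything else falls in between.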
How	to	use	Hivemall	- Training
CREATE TABLE lr_model AS
SELECT
  feature,
  avg(weight) as weight
FROM (
  SELECT logress(features,label,..) as (feature,weight)
  FROM train
) t
GROUP BY feature
Training	by	logistic	regression
map-only	task	to	learn	a	prediction	model
Shuffle map outputs to reducers by feature
Reducers	perform	model	averaging	
in	parallel
How	to	use	Hivemall	- Training
CREATE TABLE news20b_cw_model1 AS
SELECT
  feature,
  voted_avg(weight) as weight
FROM (
  SELECT
    train_cw(features,label) as (feature,weight)
  FROM
    news20b_train
) t
GROUP BY feature
Training	of	Confidence	Weighted	Classifier
Vote	to	use	negative	or	positive	
weights	for	avg
+0.7,	+0.3,	+0.2,	-0.1,	+0.7
Training	for	the	CW	classifier
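The voting step can be read as: count the positive and the negative weights for a feature, then average only the side that wins. The Python sketch below follows that reading. It is an interpretation of the slide, not Hivemall's source; in particular, sending ties to the negative side and returning 0.0 for an empty input are assumptions made here.

```python
def voted_avg(weights):
    """Average only the weights whose sign wins the majority vote.
    Assumptions of this sketch: ties go to the negative side,
    and an input with no winning weights yields 0.0."""
    positives = [w for w in weights if w > 0]
    negatives = [w for w in weights if w < 0]
    winners = positives if len(positives) > len(negatives) else negatives
    if not winners:
        return 0.0
    return sum(winners) / len(winners)

# The weights shown on the slide: four positive votes, one negative
slide_example = voted_avg([0.7, 0.3, 0.2, -0.1, 0.7])
```

For the slide's values the four positive weights win the vote, so the single -0.1 is ignored and the result is about 0.475.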
How	to	use	Hivemall	- Prediction
CREATE TABLE lr_predict
as
SELECT
  t.rowid,
  sigmoid(sum(m.weight)) as prob
FROM
  testing_exploded t LEFT OUTER JOIN
  lr_model m ON (t.feature = m.feature)
GROUP BY
  t.rowid
Prediction	is	done	by	LEFT	OUTER	JOIN
between	test	data	and	prediction	model
No	need	to	load	the	entire	model	into	memory
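What the join-based prediction computes can be sketched in Python: for every test row, look up the weight of each of its features, sum the weights, and pass the sum through the sigmoid. The names and sample data below are assumptions for this illustration. A dict lookup stands in for the join, so unlike the distributed query this sketch does hold the model in memory; it also gives 0.5 to a row with no known features, whereas a SQL `sum` over only NULL weights yields NULL.

```python
import math
from collections import defaultdict

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(testing_exploded, model):
    """testing_exploded: iterable of (rowid, feature) pairs, one per feature of a test row.
    model: dict mapping feature -> weight.
    Features missing from the model contribute nothing to the sum."""
    totals = defaultdict(float)
    for rowid, feature in testing_exploded:
        totals[rowid] += model.get(feature, 0.0)
    return {rowid: sigmoid(total) for rowid, total in totals.items()}

# Hypothetical model and exploded test rows
model = {"a": 2.0, "b": -1.0}
testing_exploded = [(1, "a"), (1, "b"), (2, "b"), (2, "zzz"), (3, "zzz")]
probs = predict(testing_exploded, model)
```

Row 1 sums 2.0 and -1.0 before the sigmoid is applied; row 2 only finds the weight for "b" because "zzz" is not in the model.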
Real-time	prediction
[Figure: batch training on Hadoop turns labels and feature vectors into a prediction model; the prediction model is exported so that online prediction on an RDBMS can map a feature vector to a label.]
bit.ly/hivemall-rtp
RandomForest in Hivemall
Ensemble of Decision Trees
Training of RandomForest
Prediction of RandomForest
Future	of	Hivemall
Hivemall will become Apache Hivemall (?)
The vote is still in progress, though.
Apache Incubation status
• Makoto Yui <Treasure Data>
• Takeshi Yamamuro <NTT>
➢ Hivemall on Apache Spark
• Daniel Dai <Hortonworks>
➢ Hivemall on Apache Pig
➢ Apache Pig PMC member
• Tsuyoshi Ozawa <NTT>
➢ Apache Hadoop PMC member
• Kai Sasaki <Treasure Data>
Initial committers
Champion
Nominated	Mentors
Project	mentors
• Reynold	Xin	<Databricks,	ASF	member>
Apache	Spark	PMC	member
• Markus	Weimer	<Microsoft,	ASF	member>
Apache	REEF	PMC	member
• Xiangrui Meng <Databricks,	ASF	member>
Apache	Spark	PMC	member
• Roman	Shaposhnik <Pivotal,	ASF	member>
Apache	Bigtop/Incubator	PMC	member
• Possibly enter Apache Incubator soon
• IP clearance and project/repository site setup
• Contribution guidelines
• Create a list of who uses Hivemall
• More documentation! (Sept to Nov)
• Initial Apache release will be in Dec (or late Nov?)
Roadmap
✓ Hivemall on Spark 2.0 w/ DataFrame support
✓ XGBoost support
Coming New Features - already merged in Master
Please refer to bit.ly/hivemall-xgboost for details
✓ ChangeFinder
• Efficient algorithm for finding change points and outliers in time-series data
Coming New Features - already merged in Master
J. Takeuchi and K. Yamanishi, “A Unifying Framework for Detecting Outliers and Change Points from Time Series,” IEEE Transactions on Knowledge and Data Engineering, pp. 482-492, 2006.
✓ Various Evaluation Metrics
• PR #326
Coming New Features - already merged in Master
• v0.5-beta{1,2} release (Oct-Nov)
✓ One-hot encoding
✓ Field-aware Factorization Machines
✓ Kernelized Passive Aggressive
✓ Generalized Linear Model
✓ Optimizer framework including ADAM
✓ L1/L2 regularization
✓ Gradient Tree Boosting
✓ Online LDA
Other new features in progress
Conclusion	and	Takeaway
Hivemall	provides	a	collection	of	machine	
learning	algorithms	as	Hive	UDFs/UDTFs
➢ For SQL users who need ML
➢ For those already using Hive
➢ Designed with ease of use and scalability in mind
Does not require coding, packaging, compiling, or introducing a new programming language or APIs.
Hivemall’s Positioning
We welcome your contributions to Apache Hivemall
Any feature requests or questions?
#hivemall

Más contenido relacionado

La actualidad más candente

Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirLuciano Resende
 
Fast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARNFast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARNDataWorks Summit
 
The Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanLuke Han
 
The Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanThe Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanLuke Han
 
Extending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and NumbaExtending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and NumbaUwe Korn
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningLuciano Resende
 
Spark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphDataWorks Summit
 
Data Science Languages and Industry Analytics
Data Science Languages and Industry AnalyticsData Science Languages and Industry Analytics
Data Science Languages and Industry AnalyticsWes McKinney
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelinfelixcss
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...Luke Han
 
Hw09 Hadoop Applications At Yahoo!
Hw09   Hadoop Applications At Yahoo!Hw09   Hadoop Applications At Yahoo!
Hw09 Hadoop Applications At Yahoo!Cloudera, Inc.
 
PyCon Singapore 2013 Keynote
PyCon Singapore 2013 KeynotePyCon Singapore 2013 Keynote
PyCon Singapore 2013 KeynoteWes McKinney
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende
 
Improving data interoperability in Python and R
Improving data interoperability in Python and RImproving data interoperability in Python and R
Improving data interoperability in Python and RWes McKinney
 
Adding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupLuke Han
 
Apache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataApache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataWes McKinney
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin IntroductionLuke Han
 

La actualidad más candente (20)

Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
 
Fast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARNFast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARN
 
The Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke Han
 
The Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanThe Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke Han
 
Extending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and NumbaExtending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and Numba
 
MahoutNew
MahoutNewMahoutNew
MahoutNew
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine Learning
 
Spark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar Castaneda
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
 
Data Science Languages and Industry Analytics
Data Science Languages and Industry AnalyticsData Science Languages and Industry Analytics
Data Science Languages and Industry Analytics
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelin
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
 
Hw09 Hadoop Applications At Yahoo!
Hw09   Hadoop Applications At Yahoo!Hw09   Hadoop Applications At Yahoo!
Hw09 Hadoop Applications At Yahoo!
 
PyCon Singapore 2013 Keynote
PyCon Singapore 2013 KeynotePyCon Singapore 2013 Keynote
PyCon Singapore 2013 Keynote
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conference
 
Improving data interoperability in Python and R
Improving data interoperability in Python and RImproving data interoperability in Python and R
Improving data interoperability in Python and R
 
Apache Spark & MLlib
Apache Spark & MLlibApache Spark & MLlib
Apache Spark & MLlib
 
Adding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark Meetup
 
Apache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataApache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory Data
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin Introduction
 

Destacado

Log Event Stream Processing In Flink Way
Log Event Stream Processing In Flink WayLog Event Stream Processing In Flink Way
Log Event Stream Processing In Flink WayGeorge T. C. Lai
 
機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会Makoto Yui
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache FlinkC4Media
 
Hivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupHivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupMakoto Yui
 
Dots20161029 myui
Dots20161029 myuiDots20161029 myui
Dots20161029 myuiMakoto Yui
 
Hivemall meetup vol2 oisix
Hivemall meetup vol2 oisixHivemall meetup vol2 oisix
Hivemall meetup vol2 oisixTaisuke Fukawa
 
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)Jing-Doo Wang
 
Yarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine LearningYarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine Learningojavajava
 
How to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentHow to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentAnna Yen
 
Apache Flink & Graph Processing
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph ProcessingVasia Kalavri
 
Hivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービスHivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービスKentaro Yoshida
 
2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform SecurityJazz Yao-Tsung Wang
 
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-OnApache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-OnApache Flink Taiwan User Group
 
Batch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkBatch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkVasia Kalavri
 
2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈晨揚 施
 
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...Apache Flink Taiwan User Group
 
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰Wayne Chen
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudScott Miao
 

Destacado (20)

Log Event Stream Processing In Flink Way
Log Event Stream Processing In Flink WayLog Event Stream Processing In Flink Way
Log Event Stream Processing In Flink Way
 
機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache Flink
 
Hivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupHivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetup
 
Dots20161029 myui
Dots20161029 myuiDots20161029 myui
Dots20161029 myui
 
Hivemall meetup vol2 oisix
Hivemall meetup vol2 oisixHivemall meetup vol2 oisix
Hivemall meetup vol2 oisix
 
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
 
Yarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine LearningYarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine Learning
 
How to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentHow to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environment
 
Apache Flink & Graph Processing
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph Processing
 
Hivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービスHivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービス
 
2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security
 
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-OnApache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
 
Batch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkBatch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache Flink
 
2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈
 
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
 
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
 
BI in Xuenn
BI in XuennBI in Xuenn
BI in Xuenn
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloud
 

HadoopCon'16, Taipei @myui