SlideShare una empresa de Scribd logo
1 de 41
Descargar para leer sin conexión
DLoBD:	An	Emerging	Paradigm	of	Deep	Learning	
Over	Big	Data	Stacks	
Talk	at	Spark+AI	Summit	2018	
by	
Dhabaleswar	K.	(DK)	Panda	
The	Ohio	State	University	
E-mail:	panda@cse.ohio-state.edu	
http://www.cse.ohio-state.edu/~panda	
Xiaoyi	Lu	
The	Ohio	State	University	
E-mail:	luxi@cse.ohio-state.edu	
http://www.cse.ohio-state.edu/~luxi
Spark+AI	Summit	2018	 2	Network	Based	Computing	Laboratory	
•  Deep	Learning	is	a	sub-set	of	Machine	Learning	
–  But,	it	is	perhaps	the	most	radical	and	revolutionary	
subset	
•  Deep	Learning	is	going	through	a	resurgence	
–  Model:	Excellent	accuracy	for	deep/convolutional	
neural	networks	
–  Data:	Public	availability	of	versatile	datasets	like	
MNIST,	CIFAR,	and	ImageNet	
–  Capability:	Unprecedented	computing	and	
communication	capabilities:	Multi-/Many-Core,	
GPGPUs,	Xeon	Phi,	InfiniBand,	RoCE,	etc.	
•  Big	Data	has	become	one	of	the	most	important	
elements	in	business	analytics	
–  Increasing	demand	for	getting	Big	Value	out	of	Big	
Data	to	drive	the	revenue	continuously	growing	
Why	Deep	Learning	is	so	hot?	
Courtesy:	
http://www.zdnet.com/article/caffe2-deep-learning-wide-ambitions-flexibility-
scalability-and-advocacy/	
MNIST	handwritten	digits	 Deep	Neural	Network
Spark+AI	Summit	2018	 3	Network	Based	Computing	Laboratory	
Application	Example	of	DL:	Flickr’s	Magic	View	Photo	Filtering	
Courtesy:	https://thenextweb.com/opinion/2015/05/22/flickrs-new-magic-view-photo-filtering-feature-works-so-well-it-convinced-me-to-ditch-iphoto/#.tnw_RaZEaD6g	
•  Image	recognition	to	divide	pictures	into	surprisingly	accurate	categories	
•  Magic	of	AI/DL:		Generate	accurate	tags	for	billions	of	pictures
Spark+AI	Summit	2018	 4	Network	Based	Computing	Laboratory	
(1)	Prepare	
Datasets	@Scale	
(2)	Deep	
Learning	@Scale	
(3)	Non-deep	
learning	
analytics	@Scale	
(4)	Apply	ML	
model	@Scale	
•  Deep	Learning	over	Big	Data	(DLoBD)	is	one	of	the	most	efficient	analyzing	paradigms	
•  More	and	more	deep	learning	tools	or	libraries	(e.g.,	Caffe,	TensorFlow)	start	running	over	big	
data	stacks,	such	as	Apache	Hadoop	and	Spark	
•  Benefits	of	the	DLoBD	approach	
–  Easily	build	a	powerful	data	analytics	pipeline	
•  E.g.,	Flickr	DL/ML	Pipeline,	“How	Deep	Learning	Powers	Flickr”,	http://bit.ly/1KIDfof	
	
–  Better	data	locality	
–  Efficient	resource	sharing	and	cost	effective	
Deep	Learning	over	Big	Data	(DLoBD)
Spark+AI	Summit	2018	 5	Network	Based	Computing	Laboratory	
•  CaffeOnSpark		
•  SparkNet		
•  TensorFlowOnSpark	
•  TensorFrame			
•  DeepLearning4J	
•  BigDL		
•  mmlspark	
–  CNTKOnSpark	
•  Many	others…	
Examples	of	DLoBD	Stacks
Spark+AI	Summit	2018	 6	Network	Based	Computing	Laboratory	
•  Layers	of	DLoBD	Stacks	
–  Deep	learning	application	layer	
–  Deep	learning	library	layer	
–  	Big	data	analytics	framework	layer		
–  Resource	scheduler	layer		
–  Distributed	file	system	layer		
–  Hardware	resource	layer	
•  How	much	performance	benefit	we	
can	achieve	for	end	deep	learning	
applications?	
Overview	of	DLoBD	Stacks	
Sub-optimal	
Performance	
?
Spark+AI	Summit	2018	 7	Network	Based	Computing	Laboratory	
Big	Data	
(Hadoop,	Spark,	
HBase,	
Memcached,	
etc.)	
Deep	Learning	
(Caffe,	TensorFlow,	
BigDL,	etc.)	
HPC		
(MPI,	RDMA,	
Lustre,	etc.)	
Increasing	Usage	of	HPC,	Big	Data	and	Deep	Learning	
Convergence	of	HPC,	Big	Data,	and	Deep	Learning!!!
Spark+AI	Summit	2018	 8	Network	Based	Computing	Laboratory	
•  BLAS	Libraries	–	the	heart	of	math	operations	
–  Atlas/OpenBLAS	
–  NVIDIA	cuBlas	
–  Intel	Math	Kernel	Library	(MKL)	
•  DNN	Libraries	–	the	heart	of	Convolutions!	
–  NVIDIA	cuDNN	(already	reached	its	7th	
iteration	–	cudnn-v7)	
–  Intel	MKL-DNN	(MKL	2017)	–	recent	but	a	very	
promising	development	
•  Communication	Libraries	–	the	heart	of	
model	parameter	updating	
–  RDMA		
–  GPUDirect	RDMA	
Highly-Optimized	Underlying	Libraries	with	HPC	Technologies
Spark+AI	Summit	2018	 9	Network	Based	Computing	Laboratory	
Outline		
•  Accelerating	Big	Data	Stacks	
•  Benchmarking	and	Characterizing	DLoBD	Stacks	
–  CaffeOnSpark,	TensorFlowOnSpark,	MMLSpark,	and	BigDL	
•  Accelerating	DLoBD	Stacks	
–  BigDL	on	RDMA-Spark		
–  TensorFlow
Spark+AI	Summit	2018	 10	Network	Based	Computing	Laboratory	
•  RDMA	for	Apache	Spark		
•  RDMA	for	Apache	Hadoop	2.x	(RDMA-Hadoop-2.x)	
–  Plugins	for	Apache,	Hortonworks	(HDP)	and	Cloudera	(CDH)	Hadoop	distributions	
•  RDMA	for	Apache	HBase	
•  RDMA	for	Memcached	(RDMA-Memcached)	
•  RDMA	for	Apache	Hadoop	1.x	(RDMA-Hadoop)	
•  OSU	HiBD-Benchmarks	(OHB)	
–  HDFS,	Memcached,	HBase,	and	Spark	Micro-benchmarks	
•  http://hibd.cse.ohio-state.edu	
•  Users	Base:	285	organizations	from	34	countries	
•  More	than	26,550	downloads	from	the	project	site	
The	High-Performance	Big	Data	(HiBD)	Project	
Available	for	InfiniBand	and	RoCE	
Also	run	on	Ethernet	
	
Available	for	x86	and	OpenPOWER	
	
Support	Singularity	and	Docker
Spark+AI	Summit	2018	 11	Network	Based	Computing	Laboratory	
HiBD	Release	Timeline	and	Downloads	
0	
5000	
10000	
15000	
20000	
25000	
30000	
Number	of	Downloads	
Timeline	
RDMA-Hadoop	2.x	0.9.1	
RDMA-Hadoop	2.x	0.9.6	
RDMA-Memcached	0.9.4	
RDMA-Hadoop	2.x	0.9.7	
RDMA-Spark	0.9.1	
RDMA-HBase	0.9.1	
RDMA-Hadoop	2.x	1.0.0	
RDMA-Spark	0.9.4	
RDMA-Hadoop	2.x	1.3.0	
RDMA-Memcached	0.9.6	&	OHB-0.9.3	
RDMA-Spark	0.9.5	
RDMA-Hadoop	1.x	0.9.0	
RDMA-Hadoop	1.x	0.9.8	
RDMA-Memcached	0.9.1	&	OHB-0.7.1	
RDMA-Hadoop	1.x	0.9.9
Spark+AI	Summit	2018	 12	Network	Based	Computing	Laboratory	
Version	Number
0.9.1
Example:	RDMA-based	Apache	Spark	
Default	Design	Choices:	
1.	Hash-based	Shuffle	
2.	NIO-based	data	
transfer	
1.2.0
Default	Design	Choices:	
1.  Sort-based	Shuffle	
2.  NIO-based	data	
transfer		
Default	Design	Choices:	
1.  Sort-based	Shuffle	
2.  Netty-based	data	
transfer		
1.3.1
Default	Design	Choices:	
1.  Sort-based	Shuffle	
2.  Netty-based	data	transfer	
New	choice:	Tungsten-Sort!	
1.4.0~2.0.0+
RDMA-based	
Apache	Spark	
(HotI’14)	
RDMA-based	
Apache	Spark	
(BigData’16)	
1.6.0~2.2.0+
New	Workloads	--	DLoBD:	
CaffeOnSpark,	
TensorFlowOnSpark,	
BigDL,	etc.	
RDMA-based	
Apache	Spark	
(HotI’17)
Spark+AI	Summit	2018	 13	Network	Based	Computing	Laboratory	
•  Design	Features	
–  RDMA	based	shuffle	plugin	
–  SEDA-based	architecture	
–  Dynamic	connection	
management	and	sharing	
–  Non-blocking	data	transfer	
–  Off-JVM-heap	buffer	
management	
–  InfiniBand/RoCE	support	
Design	Overview	of	Spark	with	RDMA
•  Enables	high	performance	RDMA	communication,	while	supporting	traditional	socket	interface	
•  JNI	Layer	bridges	Scala	based	Spark	with	communication	library	written	in	native	code	
	X.	Lu,	M.	W.	Rahman,	N.	Islam,	D.	Shankar,	and	D.	K.	Panda,	Accelerating	Spark	with	RDMA	for	Big	Data	Processing:	Early	Experiences,	Int'l	Symposium	on	High	
Performance	Interconnects	(HotI'14),	August	2014	
X.	Lu,	D.	Shankar,	S.	Gugnani,	and	D.	K.	Panda,	High-Performance	Design	of	Apache	Spark	with	RDMA	and	Its	Benefits	on	Various	Workloads,	IEEE	BigData	‘16,	Dec.	2016.	
Spark	Core	
RDMA	Capable	Networks	
(IB,		iWARP,	RoCE	..)	
Apache	Spark	Benchmarks/Applications/Libraries/Frameworks	
1/10/40/100	GigE,	IPoIB		Network	
Java	Socket	Interface	 Java	Native	Interface	(JNI)	
	
Native	RDMA-based	Comm.	Engine	
	
Shuffle	Manager	(Sort,	Hash,	Tungsten-Sort)	
Block	Transfer	Service	(Netty,	NIO,	RDMA-Plugin)	
Netty	
Server	
NIO	
Server	
RDMA	
Server	
Netty	
Client	
NIO	
Client	
RDMA	
Client
Spark+AI	Summit	2018	 14	Network	Based	Computing	Laboratory	
•  High-Performance	Design	of	Spark		over	RDMA-enabled	Interconnects	
–  High	performance	RDMA-enhanced	design	with	native	InfiniBand	and	RoCE	support	at	the	verbs-level	for	Spark	
–  RDMA-based	data	shuffle	and	SEDA-based	shuffle	architecture	
–  Non-blocking	and	chunk-based	data	transfer	
–  Off-JVM-heap	buffer	management	
–  Support	for	OpenPOWER	
–  Easily	configurable	for	different	protocols	(native	InfiniBand,	RoCE,	and	IPoIB)	
•  Current	release:	0.9.5	
–  Based	on	Apache	Spark		2.1.0	
–  Tested	with	
•  Mellanox	InfiniBand	adapters	(DDR,	QDR,	FDR,	and	EDR)	
•  RoCE	support	with	Mellanox	adapters	
•  Various	multi-core	platforms	(x86,	POWER)	
•  RAM	disks,	SSDs,	and	HDD	
–  http://hibd.cse.ohio-state.edu	
RDMA	for	Apache	Spark	Distribution
Spark+AI	Summit	2018	 15	Network	Based	Computing	Laboratory	
•  RDMA	for	Apache	Hadoop	2.x	and	RDMA	for	Apache	Spark	are	installed	and	
available	on	SDSC	Comet.	
–  Examples	for	various	modes	of	usage	are	available	in:	
•  RDMA	for	Apache	Hadoop	2.x:	/share/apps/examples/HADOOP	
•  RDMA	for	Apache	Spark:	/share/apps/examples/SPARK/	
–  Please	email	help@xsede.org	(reference	Comet	as	the	machine,	and	SDSC	as	the	
site)	if	you	have	any	further	questions	about	usage	and	configuration.		
•  RDMA	for	Apache	Hadoop	is	also	available	on	Chameleon	Cloud	as	an	
appliance	
–  https://www.chameleoncloud.org/appliances/17/		
HiBD	Packages	on	SDSC	Comet	and	Chameleon	Cloud	
M.	Tatineni,	X.	Lu,	D.	J.	Choi,	A.	Majumdar,	and	D.	K.	Panda,	Experiences	and	Benefits	of	Running	RDMA	Hadoop	and	Spark	on	SDSC	
Comet,		XSEDE’16,	July	2016
Spark+AI	Summit	2018	 16	Network	Based	Computing	Laboratory	
•  InfiniBand	FDR,	SSD,	32/64	Worker	Nodes,	768/1536	Cores,	(768/1536M	768/1536R)	
•  RDMA-based	design	for	Spark	1.5.1		
•  RDMA	vs.	IPoIB	with	768/1536	concurrent	tasks,	single	SSD	per	node.		
–  32	nodes/768	cores:	Total	time	reduced	by	37%	over	IPoIB	(56Gbps)		
–  64	nodes/1536	cores:	Total	time	reduced	by	43%	over	IPoIB	(56Gbps)		
Performance	Evaluation	on	SDSC	Comet	–	HiBench	PageRank
32	Worker	Nodes,	768	cores,	PageRank	Total	Time	 64	Worker	Nodes,	1536	cores,	PageRank	Total	Time	
0	
50	
100	
150	
200	
250	
300	
350	
400	
450	
Huge	 BigData	 Gigantic	
Time	(sec)	
Data	Size	(GB)	
IPoIB	
RDMA	
0	
100	
200	
300	
400	
500	
600	
700	
800	
Huge	 BigData	 Gigantic	
Time	(sec)	
Data	Size	(GB)	
IPoIB	
RDMA	
43%	37%
Spark+AI	Summit	2018	 17	Network	Based	Computing	Laboratory	
Outline		
•  Accelerating	Big	Data	Stacks	
•  Benchmarking	and	Characterizing	DLoBD	Stacks	
–  CaffeOnSpark,	TensorFlowOnSpark,	MMLSpark,	and	BigDL	
•  Accelerating	DLoBD	Stacks	
–  BigDL	on	RDMA-Spark		
–  TensorFlow
Spark+AI	Summit	2018	 18	Network	Based	Computing	Laboratory	
•  Choose	proper	DL	workloads,	models	and	
datasets	
–  Varied	sizes	to	cover	big	and	small	models.	
Small	and	large	data	sets	
–  Cover	different	kinds	of	combinations	
•  Choose	representative	DLoBD	stacks		
–  CaffeOnSpark,	TensorFlowOnSpark,	and	BigDL	
–  Running	over	Spark,	Yarn,	HDFS	
Benchmarking	and	Characterization	Methodology	
	
•  Define	characterization	dimensions	
–  Processor	Type	
–  Parameter	updating	approach	(i.e.,	communication)	
–  Network	Protocol	(IPoIB,	RDMA)	
•  Generate	evaluation	reports		
–  Performance	(End-to-end	training	time;	time	to	a	certain	
accuracy;	epoch	execution	time)	
–  Accuracy,	Scalability,	Resource	Utilization	
–  Breakdown
Spark+AI	Summit	2018	 19	Network	Based	Computing	Laboratory	
•  Spark	Driver:	Job	Launching	and	Job	
Control	
•  Spark	Executor:	For	data	feeding	and	task	
control	
•  Model	Synchronizer:	Communicates	across	
nodes	with	RDMA	/	TCP,	and	output	model	
file	on	HDFS	
•  Scalable	and	Communication	intensive	
–  Server-to-server	direct	communication	
(Ethernet	or	InfiniBand)	achieves	faster	
learning	and	eliminates	scalability	
bottleneck	
–  Out-of-band	communication	
Overview	of	Representative	DLoBD	Stacks	-	CaffeOnSpark
Spark+AI	Summit	2018	 20	Network	Based	Computing	Laboratory	
•  Spark	Executors	acting	as	containers	
used	to	run	TensorFlow	code	
•  Two	different	modes	to	ingesting	data	
–  Read	data	directly	from	HDFS	using	built-
in	TensorFlow	modules		
–  Feeding	data	from	Spark	RDDs	to	Spark	
executors	(TensorFlow	core)	
•  Scalable	and	Communication	intensive	
–  Parameter	Server-based	approach	
–  Embedded	inside	one	Spark	executor	and	
talk	to	other	workers	over	gRPC	or	gPRC	
with	RDMA	
–  Out-of-band	communication	
Overview	of	Representative	DLoBD	Stacks	-	TensorFlowOnSpark
Spark+AI	Summit	2018	 21	Network	Based	Computing	Laboratory	
•  Microsoft	Cognitive	Toolkit	(CNTK)	and	
OpenCV	into	Spark	Machine	Learning	
pipelines	without	data	transfer	overhead		
•  Feeding	data	for	CNTK	Core	(e.g.	images	
or	texts)	can	be	directly	read	from	HDFS	
by	Spark	Executors		
•  Scalable	and	Communication	intensive	
–  Embedded	inside	one	Spark	executor	and	
talk	to	other	workers	over	MPI	(RDMA,	
TCP)	
–  Out-of-band	communication	
Overview	of	Representative	DLoBD	Stacks	–	CNTKOnSpark/
MMLSpark
Spark+AI	Summit	2018	 22	Network	Based	Computing	Laboratory	
•  Users	can	write	deep	learning	applications	as	
Spark	programs	
•  Users	can	load	pre-trained	Caffe	or	Torch	
models	into	Spark	programs	using	BigDL	
•  Feed	data	to	BigDL	core	by	Spark	Executor	
which	can	directly	load	data	from	HDFS	
•  High	performance	
–  Support	Intel	MKL		
–  Support	both		Xeon	and	Xeon	Phi	(e.g.,	KNL)	
•  Scalable	and	Communication	intensive	
–  Spark	block	manager	as	parameter	server		
–  Organically	designed	and	integrated	with	
Spark	architecture	
–  In-band	Communication	
•  RDMA	communication	can	be	achieved	through	
our	RDMA-Spark	package!	
Overview	of	Representative	DLoBD	Stacks	-	BigDL
Spark+AI	Summit	2018	 23	Network	Based	Computing	Laboratory	
MNIST	 CIFAR-10		 ImageNet		
Category		 Digit	Classification		 Object	Classification		 Object	Classification		
Resolution		 28	×	28	B&W		 32	×	32	Color		 256	×	256	Color		
Classes		 10	 10	 1000	
Training	Images		 60	K	 50	K	 1.2	M	
Testing	Images		 10	K		 10	K	 100	K	
Selected	Various	Datasets	and	Models	
Model	 Layers	(Conv.	/	Full-connected)		 Dataset	 Framework	
LeNet	 2	/	2	 MNIST	 CaffeOnSpark,	TensorFlowOnSpark		
SoftMax	Regression	 NA	/	NA	 MNIST	 TensorFlowOnSpark		
CIFAR-10	Quick	 3	/	1	 CIFAR-10	 CaffeOnSpark,	TensorFlowOnSpark,	MMLSpark	
VGG-16	 13	/	3	 CIFAR-10	 BigDL		
AlexNet	 5	/	3	 ImageNet	 CaffeOnSpark		
GoogLeNet	 22	/	0	 ImageNet	 CaffeOnSpark		
Resnet-50	 53/1	 Synthetic		 TensorFlow
Spark+AI	Summit	2018	 24	Network	Based	Computing	Laboratory	
Performance	Characterization	for	CPU-/GPU-based	Deep	Learning	
with	CaffeOnSpark		
0
1000
2000
3000
4000
5000
6000
1 2 4 8 16
End-to-EndTime(secs)
Number of Nodes (one device/node)
CIFAR-10	Quick	
CPU+OpenBlas
CPU+MKL
GPU
GPU+cuDNN
0
200
400
600
800
1000
1200
1 2 4 8 16
End-to-EndTime(secs)
Number of Nodes (one device/node)
LeNet	on	MNIST	
CPU+OpenBlas
CPU+MKL
GPU
GPU+cuDNN
63%	
91.26%	
98.08%	
78.3%	
41.5%	
15.9%	
81.31%	
88.91%	
96.78%	
80.08%	
63.92%	
•  DL	workloads	can	benefit	from	the	high	performance	of	the	DLoBD	stacks.		
•  Network	will	become	a	bottleneck	at	some	point	if	the	sub-optimal	IPoIB	network	protocol	is	used.	
•  GPU/GPU+cuDNN	can	get	the	best	performance.	GPU	+	cuDNN	is	degraded	at	a	large	scale	(e.g.,	16	nodes).	
•  For	some	models,	solutions	with	CPU	+	MKL	may	outperform	GPU-based	solutions.
Spark+AI	Summit	2018	 25	Network	Based	Computing	Laboratory	
Performance	Characterization	for	IPoIB	and	RDMA	with	
CaffeOnSpark	and	TensorFlowOnSpark	(IB	EDR)	
•  CaffeOnSpark	benefits	from	the	high	performance	of	RDMA	compared	to	IPoIB	once	
communication	overhead	becomes	significant.	
•  Our	experiments	show	that	the	default	RDMA	design	in	TensorFlowOnSpark	is	not	fully	
optimized	yet.	For	MNIST	tests,	RDMA	is	not	showing	obvious	benefits.	
0
50
100
150
200
250
300
350
400
450
IPoIB RDMA IPoIB RDMA IPoIB RDMA IPoIB RDMA
GPU GPU-cuDNN GPU GPU-cuDNN
CIFAR-10 Quick LeNet on MNIST
End-to-EndTime(secs)
CaffeOnSpark	
2 4 8 16
0
50
100
150
200
250
300
350
400
450
IPoIB RDMA IPoIB RDMA
CIFAR10 (2 GPUs/node) MNIST (1 GPU/node)
End-to-EndTime(secs)
TensorFlowOnSpark	
2 GPUs
4 GPUs
8 GPUs
14.2%	
13.3%	
51.2%	
45.6%	
33.01%	
4.9%
Spark+AI	Summit	2018	 26	Network	Based	Computing	Laboratory	
Performance	Characterization	with	MMLSpark		
0	
1000	
2000	
3000	
4000	
5000	
1	 2	 4	 8	 16	
End-to-End	Time	(secs)	
Number	of	Nodes	(one	device/node)		
CPU+OpenBlas	
CPU+MKL	
GPU+cuDNN		
0	
40	
80	
120	
160	
IPoIB		 RDMA	
End-to-End	Time	(secs)	
CIFAR-10	
2	 4	 8	 16	
•  The	solution	of	GPU	+	cuDNN	performs	best,	up	to	55x	faster	than	CPU	+	OpenBLAS,	and	up	to	15x	
than	CPU	+	MKL.		
•  OpenMPI-based	communication	over	IPoIB	and	RDMA;	Similar	performance;	The	latency	and	
bandwidth	of	IPoIB	in	this	cluster	are	sufficient	for	small	models.	
•  Could	not	find	other	benchmarks	with	bigger	models	for	MMLSpark
Spark+AI	Summit	2018	 27	Network	Based	Computing	Laboratory	
Characterization	on	Performance	and	Accuracy		
	
10
20
30
40
50
60
70
80
286
1001
5297
12939
18673
22974
27273
30138
37301
43030
50326
Accuracy(%)
Time (secs)
IPoIB
RDMA
10
30
50
70
90
160
511
863
1214
1565
1917
2268
2619
2971
3322
3908
4259
4610
4962
Accuracy(%)
Time (secs)
IPoIB RDMA
AlexNet	on	ImageNet	with	CaffeOnSpark	 GoogleNet	on	ImageNet	with	CaffeOnSpark	
•  Performance	Evaluation	of	CaffeOnSpark		(training	time	to	achieve	a	70%	accuracy)	
–  RDMA	reduces	the	overall	time	cost	by	22%	in	training	AlexNet	on	ImageNet	
–  RDMA	reduces	the	overall	time	cost	by	15%	in	training	GoogleNet	on	ImageNet	
•  Performance	Evaluation	of	BigDL	(training	time	to	achieve	a	70%	accuracy)	
–  RDMA	reduces	the	overall	time	cost	by	48%	in	training	VGG	on	CIFAR-10	
0
20
40
60
80
100
165
474
711
961
1280
1598
1920
2017
2132
2261
2908
3645
4337
Accuracy(%)
Time (secs)
IPoIB RDMA
VGG	on	CIFAR-10	with	BigDL	
70% Accuracy
Reduce by 48%
Reduce by 22%
70% Accuracy Reduce by 15%
70% Accuracy
Spark+AI	Summit	2018	 28	Network	Based	Computing	Laboratory	
Memory	and	Network	Utilization	of	CaffeOnSpark	
•  CIFAR-10	Quick	Model	and	CIFAR-10	Dataset	
•  GPU-based	solutions	use	less	memory	than	CPU-based	ones	as	they	mostly	use	GPU	memory.		
•  CPU	+	MKL	solution	uses	host	memory	more	efficiently	and	has	better	performance	than	CPU	+	
OpenBLAS.	
•  RDMA	utilizes	the	network	resources	more	efficiently	than	the	IPoIB	in	CaffeOnSpark.		
•  CaffeOnSpark	still	does	not	fully	utilize	the	high	throughput	characteristic	of	RDMA	and	memory	
resource.	
0
10
20
30
0 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57
HostMemoryUsage
(GB)
Time (min)
CPU+OpenBLAS
CPU+MKL
GPU
GPU+cuDNN
0
10
20
0 1 2 3 4 5 6 7
NetworkThroughput
(MB/s)
Time (min)
IPoIB
RDMA
Spark+AI	Summit	2018	 29	Network	Based	Computing	Laboratory	
•  SoftMax	Regression	model,	over	MNIST	
dataset		
•  Up	to	15.5%	time	in	Apache	Hadoop	
YARN	scheduler	layer		
•  Up	to	18.1%	execution	time	in	Spark	job	
execution	layer		
•  Data	size	is	small,	so	we	do	not	count	the	
time	spent	on	accessing	HDFS	layer.		
•  Need	more	effort	to	reduce	the	
overhead	across	different	layers	of	
DLoBD	stacks	
•  Maybe	amortized	in	long-running	deep	
learning	jobs	
Performance	Overhead	across	Layers	in	DLoBD	Stacks	
0
10
20
30
40
50
60
70
IPoIB RDMA IPoIB RDMA
TensorFlowOnSpark Native TensorFlow
Time(second)
Other
Spark-Overhead
YARN-Overhead
Actual Training
Spark+AI	Summit	2018	 30	Network	Based	Computing	Laboratory	
•  RDMA	can	benefit	DL	workloads	
–  Up	to	2.7x	speedup	with	RDMA	compared	to	the	IPoIB	scheme	for	deep	learning	workloads.		
–  RDMA	can	scale	better	and	utilize	resources	more	efficiently	than	IPoIB	over	InfiniBand	clusters	
•  GPU-based	DL	designs	can	outperform	CPU-based	designs,	but	not	always		
–  LeNet	on	MNIST,	CPU	+	MKL	achieved	better	performance	than	GPU	and	GPU	+	cuDNN	on	8/16	
nodes		
•  Large	rooms	for	further	improvement	in	DLoBD	stacks!!!	
•  We	need	more	benchmarks,	public	datasets,	and	analysis	tools!!!	
Insights	and	Guidance		
X.	Lu,	H.	Shi,	M.	H.	Javed,	R.	Biswas,	and	D.	K.	Panda,	Characterizing	Deep	Learning	over	Big	Data	(DLoBD)	Stacks	on	RDMA-capable	Networks,	HotI	2017.		
X.	Lu,	H.	Shi,	R.	Biswas,	M.	H.	Javed,	and	D.	K.	Panda,	DLoBD:	A	Comprehensive	Study	of	Deep	Learning	over	Big	Data	Stacks	on	HPC	Clusters,	IEEE	Transactions	on	
Multi-Scale	Computing	Systems	(TMSCS),	2018.
Spark+AI	Summit	2018	 31	Network	Based	Computing	Laboratory	
Outline		
•  Accelerating	Big	Data	Stacks	
•  Benchmarking	and	Characterizing	DLoBD	Stacks	
–  CaffeOnSpark,	TensorFlowOnSpark,	MMLSpark,	and	BigDL	
•  Accelerating	DLoBD	Stacks	
–  BigDL	on	RDMA-Spark		
–  TensorFlow
Spark+AI	Summit	2018	 32	Network	Based	Computing	Laboratory	
Epoch-Level	Evaluation	with	BigDL	on	SDSC	Comet		
	
0
10
20
30
40
50
60
0
500
1000
1500
2000
2500
3000
3500
4000
4500
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Accuracy(%)
AccumulativeEpochsTime(secs)
Epoch Number
IPoIB-Time
RDMA-Time
IPoIB-Accuracy
RDMA-Accuracy
VGG	on	CIFAR-10	with	BigDL	
•  Epoch-level	evaluation	of	
training	VGG	model	using	
BigDL	on	default	Spark	with	
IPoIB	and	our	RDMA-based	
Spark.	
•  RDMA	version	takes	
constantly	less	time	than	
the	IPoIB	version	to	finish	
every	epoch.	
–  RDMA	finishes	epoch	18	in	
2.6x	time	faster	than	IPoIB		
2.6X
Spark+AI	Summit	2018	 33	Network	Based	Computing	Laboratory	
Scalability	Evaluation	with	BigDL	on	SDSC	Comet	
•  Using	BigDL	with	IPoIB	&	RDMA	Spark	
•  For	VGG	model	trained	with	BigDL,	RDMA-
based	Spark	scales	better	than	default	
IPoIB	Spark		
•  For	384	CPU	cores,	18	epochs	and	same	
batch	size,	RDMA	takes	about	870	seconds	
while	IPoIB	takes	2,372	seconds	
•  A	speedup	of	2.7x	using	RDMA	for	the	
epoch-level	training	time	
10
510
1010
1510
2010
2510
3010
3510
4010
4510
5010
96 192 384
AccumulativeEpochsTime(secs)
Number of CPU Cores
IPoIB RDMA
Spark+AI	Summit	2018	 34	Network	Based	Computing	Laboratory	
Outline		
•  Accelerating	Big	Data	Stacks	
•  Benchmarking	and	Characterizing	DLoBD	Stacks	
–  CaffeOnSpark,	TensorFlowOnSpark,	MMLSpark,	and	BigDL	
•  Accelerating	DLoBD	Stacks	
–  BigDL	on	RDMA-Spark		
–  TensorFlow
Spark+AI	Summit	2018	 35	Network	Based	Computing	Laboratory	
Overview	of	gRPC	with	TensorFlow	
Worker	services	communicate	among	each	other	using	gRPC,	or	gRPC+X!		
Client	 Master	
Worker		
/job:PS/task:0	
	
	
	
Worker		
/job:Worker/task:0	
CPU	 GPU	
	gRPC	server/	client		
CPU	 GPU	
gRPC	server/	client
Spark+AI	Summit	2018	 36	Network	Based	Computing	Laboratory	
Performance	Benefits	for	RDMA-gRPC	with	Micro-Benchmark	
RDMA-gRPC	RPC	Latency	
•  gRPC-RDMA	Latency	on	SDSC-Comet-FDR	
–  Up	to	2.7x	performance	speedup	over	IPoIB	for	Latency	for	small	messages	
–  Up	to	2.8x	performance	speedup	over	IPoIB	for	Latency	for	medium	messages	
–  Up	to	2.5x	performance	speedup	over	IPoIB	for	Latency	for	large	messages	
	
0	
15	
30	
45	
60	
75	
90	
2	 8	 32	 128	 512	 2K	 8K	
Latency	(us)	
	
payload	(Bytes)	
Default	gRPC	
OSU	RDMA	gRPC	
0	
200	
400	
600	
800	
1000	
16K	 32K	 64K	 128K	 256K	 512K	Latency	(us)	
Payload	(Bytes)	
Default	gRPC	
OSU	RDMA	gRPC	
100	
3800	
7500	
11200	
14900	
18600	
1M	 2M	 4M	 8M	
Latency	(us)	
Payload	(Bytes)	
Default	gRPC	
OSU	RDMA	gRPC	
R.	Biswas,	X.	Lu,	and	D.	K.	Panda,	Accelerating	gRPC	and	TensorFlow	with	RDMA	for	High-Performance	Deep	Learning	over	InfiniBand,	Under	Review.
Spark+AI	Summit	2018	 37	Network	Based	Computing	Laboratory	
0	
20	
40	
60	
80	
100	
120	
140	
160	
16	 32	 64	
Images	/	Second	
Batch	Size	/	GPU	
gRPPC	 Verbs	
MPI	 gRPC+RDMA	
35%	
Performance	Benefit	for	TensorFlow	-	Resnet50	
0	
50	
100	
150	
200	
250	
300	
16	 32	 64	
Images	/	Second	
Batch	Size	/	GPU	
gRPPC	 Verbs	
MPI	 gRPC+RDMA	
41%	
•  TensorFlow	Resnet50	performance	evaluation	on	an	IB	EDR	cluster	
–  Up	to	35%	performance	speedup	over	IPoIB	for	4	nodes.	
–  Up	to	41%	performance	speedup	over	IPoIB	for	8	nodes.	
4	Nodes	 8	Nodes
Spark+AI	Summit	2018	 38	Network	Based	Computing	Laboratory	
0	
10	
20	
30	
40	
50	
60	
70	
80	
90	
16	 32	 64	
Images	/	Second	
Batch	Size	/	GPU	
gRPPC	 Verbs	 MPI	 gRPC+RDMA	
Performance	Benefit	for	TensorFlow	-	Inception3	
0	
20	
40	
60	
80	
100	
120	
140	
160	
180	
200	
16	 32	 64	
Images	/	Second	
Batch	Size	/	GPU	
gRPPC	 Verbs	 MPI	 gRPC+RDMA	
TensorFlow	Inception3	performance	evaluation	on	an	IB	EDR	cluster	
•  Up	to	27%	performance	speedup	over	IPoIB	for	4	nodes	
•  Up	to	36%	performance	speedup	over	IPoIB	for	8	nodes.	
4	Nodes	 8	Nodes	
27%	
36%
Spark+AI	Summit	2018	 39	Network	Based	Computing	Laboratory	
•  Discussed	challenges	in	benchmarking,	characterizing,	and	accelerating		Deep	
Learning	over	Big	Data	(DLoBD)	stacks	
•  RDMA	can	benefit	DL	workloads	as	showed	by	our	RDMA-Spark,	AR-gRPC,	and	
other	RDMA	designs		
•  Many	other	open	issues	need	to	be	solved		
•  Will	enable	Big	Data	and	Deep	Learning	community	to	take	advantage	of	
modern	HPC	technologies	to	carry	out	their	analytics	in	a	fast	and	scalable	
manner		
Concluding	Remarks
Spark+AI	Summit	2018	 40	Network	Based	Computing	Laboratory	
The	4th	International	Workshop	on		
High-Performance	Big	Data	Computing	(HPBDC)
HPBDC	2018	was	held	with	IEEE	International	Parallel	and	Distributed	Processing	
Symposium	(IPDPS	2018),	Vancouver,	British	Columbia	CANADA,	May,	2018	
Workshop	Date:	May	21st,	2018	
Keynote	Talk:	Prof.	Geoffrey	Fox,	Twister2:	A	High-Performance	Big	Data	Programming	Environment	
Six	Regular	Research	Papers	and	Two	Short	Research	Papers	
Panel	Topic:	Which	Framework	is	the	Best	for	High-Performance	Deep	Learning:		
Big	Data	Framework	or	HPC	Framework?	
http://web.cse.ohio-state.edu/~luxi/hpbdc2018			
	
	
HPBDC	2017	was	held	in	conjunction	with	IPDPS’17	
http://web.cse.ohio-state.edu/~luxi/hpbdc2017		
	
HPBDC	2016	was	held	in	conjunction	with	IPDPS’16	
http://web.cse.ohio-state.edu/~luxi/hpbdc2016
Spark+AI	Summit	2018	 41	Network	Based	Computing	Laboratory	
Thank	You!	
Network-Based	Computing	Laboratory	
http://nowlab.cse.ohio-state.edu/
The	High-Performance	Big	Data	Project	
http://hibd.cse.ohio-state.edu/
{panda,	luxi}@cse.ohio-state.edu	
http://www.cse.ohio-state.edu/~panda	
http://www.cse.ohio-state.edu/~luxi

Más contenido relacionado

La actualidad más candente

H2O Rains with Databricks Cloud - NY 02.16.16
H2O Rains with Databricks Cloud - NY 02.16.16H2O Rains with Databricks Cloud - NY 02.16.16
H2O Rains with Databricks Cloud - NY 02.16.16Sri Ambati
 
H2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth RedmoreH2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth RedmoreSri Ambati
 
H2O World - Survey of Available Machine Learning Frameworks - Brendan Herger
H2O World - Survey of Available Machine Learning Frameworks - Brendan HergerH2O World - Survey of Available Machine Learning Frameworks - Brendan Herger
H2O World - Survey of Available Machine Learning Frameworks - Brendan HergerSri Ambati
 
Improving Search in Workday Products using Natural Language Processing
Improving Search in Workday Products using Natural Language ProcessingImproving Search in Workday Products using Natural Language Processing
Improving Search in Workday Products using Natural Language ProcessingDataWorks Summit
 
H2O Rains with Databricks Cloud - Parisoma SF
H2O Rains with Databricks Cloud - Parisoma SFH2O Rains with Databricks Cloud - Parisoma SF
H2O Rains with Databricks Cloud - Parisoma SFSri Ambati
 
Data Engineer Intro - WeCloudData
Data Engineer Intro - WeCloudDataData Engineer Intro - WeCloudData
Data Engineer Intro - WeCloudDataWeCloudData
 
Microsoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionMicrosoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionKarthik Murugesan
 
Introduction to Sparkling Water - Spark Summit East 2016
Introduction to Sparkling Water - Spark Summit East 2016Introduction to Sparkling Water - Spark Summit East 2016
Introduction to Sparkling Water - Spark Summit East 2016Sri Ambati
 
Data Day Seattle 2015: Sarah Guido
Data Day Seattle 2015: Sarah GuidoData Day Seattle 2015: Sarah Guido
Data Day Seattle 2015: Sarah GuidoBitly
 
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...Databricks
 
Embracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuire
Embracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuireEmbracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuire
Embracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuireDatabricks
 
Big Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al EssaBig Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al EssaSpark Summit
 
Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...
Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...
Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...Databricks
 
Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation PlatformKarthik Murugesan
 
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli Spark Summit
 
Intro to H2O Machine Learning in R at Santa Clara University
Intro to H2O Machine Learning in R at Santa Clara UniversityIntro to H2O Machine Learning in R at Santa Clara University
Intro to H2O Machine Learning in R at Santa Clara UniversitySri Ambati
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 editionDavid Talby
 
Data Science with Python - WeCloudData
Data Science with Python - WeCloudDataData Science with Python - WeCloudData
Data Science with Python - WeCloudDataWeCloudData
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...Big Data Spain
 
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...Spark Summit
 

La actualidad más candente (20)

H2O Rains with Databricks Cloud - NY 02.16.16
H2O Rains with Databricks Cloud - NY 02.16.16H2O Rains with Databricks Cloud - NY 02.16.16
H2O Rains with Databricks Cloud - NY 02.16.16
 
H2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth RedmoreH2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth Redmore
 
H2O World - Survey of Available Machine Learning Frameworks - Brendan Herger
H2O World - Survey of Available Machine Learning Frameworks - Brendan HergerH2O World - Survey of Available Machine Learning Frameworks - Brendan Herger
H2O World - Survey of Available Machine Learning Frameworks - Brendan Herger
 
Improving Search in Workday Products using Natural Language Processing
Improving Search in Workday Products using Natural Language ProcessingImproving Search in Workday Products using Natural Language Processing
Improving Search in Workday Products using Natural Language Processing
 
H2O Rains with Databricks Cloud - Parisoma SF
H2O Rains with Databricks Cloud - Parisoma SFH2O Rains with Databricks Cloud - Parisoma SF
H2O Rains with Databricks Cloud - Parisoma SF
 
Data Engineer Intro - WeCloudData
Data Engineer Intro - WeCloudDataData Engineer Intro - WeCloudData
Data Engineer Intro - WeCloudData
 
Microsoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionMicrosoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER Introduction
 
Introduction to Sparkling Water - Spark Summit East 2016
Introduction to Sparkling Water - Spark Summit East 2016Introduction to Sparkling Water - Spark Summit East 2016
Introduction to Sparkling Water - Spark Summit East 2016
 
Data Day Seattle 2015: Sarah Guido
Data Day Seattle 2015: Sarah GuidoData Day Seattle 2015: Sarah Guido
Data Day Seattle 2015: Sarah Guido
 
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
 
Embracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuire
Embracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuireEmbracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuire
Embracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuire
 
Big Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al EssaBig Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al Essa
 
Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...
Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...
Productionizing H2O Models with Apache Spark with Jakub Hava and Michal Maloh...
 
Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation Platform
 
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
 
Intro to H2O Machine Learning in R at Santa Clara University
Intro to H2O Machine Learning in R at Santa Clara UniversityIntro to H2O Machine Learning in R at Santa Clara University
Intro to H2O Machine Learning in R at Santa Clara University
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
 
Data Science with Python - WeCloudData
Data Science with Python - WeCloudDataData Science with Python - WeCloudData
Data Science with Python - WeCloudData
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
 
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
The Little Warehouse That Couldn't Or: How We Learned to Stop Worrying and Mo...
 

Similar a DLoBD: An Emerging Paradigm of Deep Learning Over Big Data Stacks with Dhabaleswar K. Panda and Xiaoyi Lu

Introduction to Big Data: Smart Factory
Introduction to Big Data: Smart FactoryIntroduction to Big Data: Smart Factory
Introduction to Big Data: Smart FactoryJongwook Woo
 
President Election of Korea in 2017
President Election of Korea in 2017President Election of Korea in 2017
President Election of Korea in 2017Jongwook Woo
 
Big Data and the Art of Data Science
Big Data and the Art of Data ScienceBig Data and the Art of Data Science
Big Data and the Art of Data ScienceAndrew Gardner
 
Big Data Platform adopting Spark and Use Cases with Open Data
Big Data  Platform adopting Spark and Use Cases with Open DataBig Data  Platform adopting Spark and Use Cases with Open Data
Big Data Platform adopting Spark and Use Cases with Open DataJongwook Woo
 
Data Culture Series - Keynote - 24th feb
Data Culture Series - Keynote - 24th febData Culture Series - Keynote - 24th feb
Data Culture Series - Keynote - 24th febJonathan Woodward
 
(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdf(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdfPoornimaShetty27
 
(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdf(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdfSreenivasa Harish
 
KM Initiatives at DePaul University
KM Initiatives at DePaul UniversityKM Initiatives at DePaul University
KM Initiatives at DePaul UniversityKM Chicago
 
Trustworthy AI and Open Science
Trustworthy AI and Open ScienceTrustworthy AI and Open Science
Trustworthy AI and Open ScienceBeth Plale
 
GP-Write computing group
GP-Write computing groupGP-Write computing group
GP-Write computing groupChris Dwan
 
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAI
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAIMAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAI
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAIBig Data Week
 
Big Data Trend and Open Data
Big Data Trend and Open DataBig Data Trend and Open Data
Big Data Trend and Open DataJongwook Woo
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...Carole Goble
 
Distributed Trust Architecture: The New Foundation of Everything
Distributed Trust Architecture: The New Foundation of EverythingDistributed Trust Architecture: The New Foundation of Everything
Distributed Trust Architecture: The New Foundation of EverythingLiming Zhu
 
Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...New York University
 

Similar a DLoBD: An Emerging Paradigm of Deep Learning Over Big Data Stacks with Dhabaleswar K. Panda and Xiaoyi Lu (20)

Big Data Presentation
Big Data PresentationBig Data Presentation
Big Data Presentation
 
On Big Data
On Big DataOn Big Data
On Big Data
 
Introduction to Big Data: Smart Factory
Introduction to Big Data: Smart FactoryIntroduction to Big Data: Smart Factory
Introduction to Big Data: Smart Factory
 
President Election of Korea in 2017
President Election of Korea in 2017President Election of Korea in 2017
President Election of Korea in 2017
 
Big Data and the Art of Data Science
Big Data and the Art of Data ScienceBig Data and the Art of Data Science
Big Data and the Art of Data Science
 
Big Data Platform adopting Spark and Use Cases with Open Data
Big Data  Platform adopting Spark and Use Cases with Open DataBig Data  Platform adopting Spark and Use Cases with Open Data
Big Data Platform adopting Spark and Use Cases with Open Data
 
Data Culture Series - Keynote - 24th feb
Data Culture Series - Keynote - 24th febData Culture Series - Keynote - 24th feb
Data Culture Series - Keynote - 24th feb
 
(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdf(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdf
 
(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdf(R17A0528) BIG DATA ANALYTICS.pdf
(R17A0528) BIG DATA ANALYTICS.pdf
 
Big data intro.pptx
Big data intro.pptxBig data intro.pptx
Big data intro.pptx
 
KM Initiatives at DePaul University
KM Initiatives at DePaul UniversityKM Initiatives at DePaul University
KM Initiatives at DePaul University
 
Trustworthy AI and Open Science
Trustworthy AI and Open ScienceTrustworthy AI and Open Science
Trustworthy AI and Open Science
 
GP-Write computing group
GP-Write computing groupGP-Write computing group
GP-Write computing group
 
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAI
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAIMAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAI
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAI
 
AI on Big Data
AI on Big DataAI on Big Data
AI on Big Data
 
Big Data Trend and Open Data
Big Data Trend and Open DataBig Data Trend and Open Data
Big Data Trend and Open Data
 
Open problems big_data_19_feb_2015_ver_0.1
Open problems big_data_19_feb_2015_ver_0.1Open problems big_data_19_feb_2015_ver_0.1
Open problems big_data_19_feb_2015_ver_0.1
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
 
Distributed Trust Architecture: The New Foundation of Everything
Distributed Trust Architecture: The New Foundation of EverythingDistributed Trust Architecture: The New Foundation of Everything
Distributed Trust Architecture: The New Foundation of Everything
 
Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...
 

Más de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

Más de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Último

INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 

Último (20)

Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 

DLoBD: An Emerging Paradigm of Deep Learning Over Big Data Stacks with Dhabaleswar K. Panda and Xiaoyi Lu