SlideShare una empresa de Scribd logo
1 de 51
Descargar para leer sin conexión
1 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Timothy	Spann
2017	Future	of	Data	– Princeton	Meetup	
May	16,	2017
Introduction to Hadoop Tools
2 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Agenda
• 6:00pm – 6:45pm Registration and Food
• 6:45pm – 7:00pm Introduction and Welcome
• 7:00pm – 7:45 pm Install and Cloud by Milind Pandit
• 7:45pm – 8:30 pm Tools by Tim Spann
• 8:30pm – 8:45pm Questions
3 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
All	Your	Data	Are	Belong	to	Hadoop
à CSV
à JSON
à TSV
à TEXT
à PDF
à XML
à HTML
à AVRO
à PARQUET
à ORC
à Sequence	File
à HFile
à JPEG
à PNG
à MP4
à RTF
à Word
à Excel
4 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Apache	Big	Data	Ecosystem
5 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hadoop	Architecture
Data	Access	Engines
Distributed	Reliable	Storage
Distributed	Compute	Framework
Resource	Management,	Data	LocalityData	Operating	System
Batch Interactive Real-time
Governance
&
Integration
Security
Applications
Deploy	Anywhere
6 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
HORTONWORKS	DATA	PLATFORM
DATA	MGMT
HDP	2.2
Dec	2014
HDP	2.1
April	2014
HDP	2.0
Oct	2013
HDP	2.2
Dec	2014
HDP	2.1
April	2014
HDP	2.0
Oct	2013
2.2.0
2.4.0
2.6.0
Ongoing	Innovation	in	Apache
HDFS
YARN
MapReduce
Hadoop	Core
What	is	Apache	Hadoop?
Yahoo!
2006
Hortonworks	
Oct	2011
Yahoo!	start	focus	on	multiple	Hadoop	apps	&	clusters	
Contributes	Hadoop	to	Apache
2008
HDP	1.0
Oct	2012
Apache	Hadoop	v2	YARN
Google	publishes	GFS	&	MapReduce papers
2004-2005
HDP	2.4
March	2016 2.7.1
HDP	2.2
Dec	2014
HDP	2.3
July	2015
2.7.1
HDP 2.5
Aug	2016
2.7.3
7 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Apache Hadoop = Storage + Compute
storage storage
storage storage
Hadoop	Distributed	File	
System	(HDFS)
CPU RAM
Yet	Another	Resource	
Negotiator	(YARN)
8 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
`
+
/directory/structure/in/memory.txt
Resource management + schedulingDisk, CPU, Memory
Core
NameNode
HDFS
ResourceManager
YARN
Hadoop daemon
User application
NN
RM
DataNode
HDFS
NodeManager
YARN
Worker Node
9 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hadoop Distributed File System (HDFS)
Fault Tolerant Distributed Storage
• Divide	files	into	big	blocks	and	distribute	3	copies	randomly across	the	cluster
• Processing	Data	Locality
• Not	Just	storage	but	computation
10110100101
00100111001
11111001010
01110100101
00101100100
10101001100
01010010111
01011101011
11011011010
10110100101
01001010101
01011100100
11010111010
0
Logical File
1
2
3
4
Blocks
1
Cluster
1
1
2
2
2
3
3
34
4
4
10 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
HDFS	Storage	Architecture	- Before
Before
• DataNode is	a	single	storage
• Storage	is	uniform	- Only	
storage	type	Disk
• Storage	types	hidden	from	the	
file	system
All	disks	as	
single	storage
11 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Cloud
Storage
HDFS	Storage	Architecture	- Now
New Architecture
• DataNode is	a	collection	of	storages
• Support	different	types	of	storages
– Disk,	SSDs,	Memory
Block Storage Policies
– Describes	how	to	store	data	blocks
in	HDFS
Collection	of
tier	storage
12 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
It Looks Like a File System
13 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
WebHDFS
https://community.hortonworks.com/articles/60480/using-images-stored-in-hdfs-for-web-pages.html
http://princeton.server.com:50070/webhdfs/v1/demo/clickstream/weblogs/clickstream-feed-
generated.tsv?OP=open
14 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hadoop	emerged	as	foundation	of	new	data	architecture
Apache	Hadoop	is	an	open	source	data	platform	for	managing	large	
volumes	of	high	velocity	and	variety	of	data
• Built	by	Yahoo!	to	be	the	heartbeat	of	its	ad	&	search	business
• Donated	to	Apache	Software	Foundation	in	2005	with	rapid	adoption	by	large	
web	properties	&	early	adopter	enterprises
• Incredibly	disruptive	to	current	platform	economics
Traditional	Hadoop	Advantages
ü Manages	new	data	paradigm
ü Handles	data	at	scale
ü Cost	effective
ü Open	source
Traditional	Hadoop	Had	Limitations
Batch-only	architecture	
Single	purpose	clusters,	specific	data	sets
Difficult	to	integrate	with	existing	investments
Not	enterprise-grade
Application
Storage
HDFS
Batch Processing
MapReduce
15 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
YARN	extends	Hadoop	into	data	center	leaders
YARN
The Architectural
Center of Hadoop
• Common data platform, many applications
• Support multi-tenant access & processing
• Batch, interactive & real-time use cases
• Supports 3rd-party ISV tools
(ex. SAS, Syncsort, Actian, etc.)
YARN Ready Applications
Facilitates ongoing innovation and enterprise adoption via
ecosystem of new and existing “YARN Ready” solutions
YARN : Data Operating System
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS
1 ° ° ° ° ° ° ° ° °
° ° ° ° ° ° ° ° ° °
°
N
HDFS Hadoop Distributed File System
DATA MANAGEMENT
Batch
MapReduce
Script
Pig
Search
Solr
SQL
Hive
NoSQL
HBase
Accumulo
Phoenix
Stream
Storm
In-memory
Spark
Others
ISV Engines
Tez Tez Slider Slider
16 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Overview of SQL on Hadoop Solutions
Spark's	module	for	working	with	structured	data.	Run	SQL	
queries	alongside	complex	analytic	algorithms.
Apache	Hive	is	a	data	warehouse	infrastructure	built	on	
top	of	Hadoop	for	providing	data	summarization,	query,	
and	analysis.	
High	performance	relational	database	layer	over	HBase	for	
low	latency	applications.
Traditional
MPP	on
Hadoop
Many	traditionally	architected	MPP	solutions	have	been	
ported	to	Hadoop	and	some	new	ones	have	been	
developed	from	scratch.
17 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
SQL on Hadoop: Vitals
Project First	GA	Release
Lines	of	Code
(June	2015) (*) Most	Typical	Use
Apache	Hive April,	2009	(7	Years) 1	Million EDW	/	ETL	Offload
SparkSQL March,	2015	(4
Months)
56.6k Exploratory	Analytics
Apache	Phoenix March, 2014	(2	Year) 200k Low-Latency	
Dashboards
18 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Apache	Hive:	Fast	Facts
Most	Queries	Per	Hour
100,000	Queries	Per	Hour
(Yahoo	Japan)
Analytics	Performance
100	Million	rows/s	Per	Node
(with	Hive	LLAP)
Largest	Hive	Warehouse
300+	PB	Raw	Storage
(Facebook)
Largest	Cluster
4,500+	Nodes
(Yahoo)
19 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Phoenix	and	HBase:	Fast	Facts
Largest	Database
5	Petabytes
(Flurry)
Best	Known	App
Facebook	Messages
(Facebook)
Fastest	Ingestion
10	Million	Events/s
(Yahoo)
Biggest	SQL	App
Real-Time	SQL	on	140m+	Records
(PubMatic)
20 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Data	Types SQL Features File Formats
Numeric Core	SQL	Features Columnar
FLOAT/DOUBLE Date,	Time and	Arithmetical	Functions ORCFile
DECIMAL INNER,	OUTER,	CROSS	and	SEMI	Joins Parquet
INT/TINYINT/SMALLINT/BIGINT Derived	Table	Subqueries Text
BOOLEAN Correlated	and	Uncorrelated	Subqueries CSV
String UNION	ALL Logfile
CHAR	/	VARCHAR UDFs, UDAFs,	UDTFs Nested	/	Complex
STRING Common	Table	Expressions Avro
BINARY UNION	DISTINCT JSON
Date, Time INTERSECT, EXCEPT XML
DATE Non-Equality	Joins Custom	Formats
TIMESTAMP Advanced	Analytics Other	Features
Interval	Types OLAP	and	Windowing	Functions XPath Analytics
Complex	Types CUBE and	Grouping	Sets Procedural	Extensions	(PL/Hive)
ARRAY Nested	Data	Analytics
MAP Nested	Data	Traversal
STRUCT Lateral	Views
UNION ACID	Transactions
INSERT	/	UPDATE	/	DELETE
MERGE
Apache	Hive:	Journey	to	SQL:2011	Analytics
Legend
Hive	1.0
Future
Hive	1.2
21 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
S Q L 	 W I T H 	 H I V E
D A T A 	 A C C E S S
Apache	Hive	facilitates	querying	and	managing	large	
datasets.
Hive	provides	SQL	on	Hadoop.	
Data	analysts	use	Hive	to	explore,	structure	and	
analyze that	data	using	the	familiar	comfortable	SQL	
syntax	they	are	used	to.
Hive	also	comes	with	HCatalog;	a	global	metadata	
management	layer	that	exposes	Hive	table	metadata	
to	all	other	Hadoop	applications.	
OPERATIONS SECURITY
GOVERNANCE
STORAGE
STORAGE
22 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Z E P P E L I N 	 N OT E B O O K
T O O L S
Apache	Zeppelin	is	a	Web-based	notebook	that	
enables	interactive	data	analytics.	
Data	Scientists	and	End-Users	alike	can	make	
beautiful	data-driven,	interactive	and	collaborative	
documents	with	SparkSQL,	Scala,	Python,	JDBC	
connections,	Files,	and	more.
Notebooks	contain	code	samples,	source	data,	
descriptive	markup,	result	sets,	and	rich	visualizations.	
OPERATIONS SECURITY
GOVERNANCE
STORAGE
STORAGE
23 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
24 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
A M B A R I 	 V I E W S
T O O L S
Ambari	Views	are	a	built-in	set	of	Views	that	are	pre-
deployed	for	you	to	use	with	your	cluster.		
These	GUI	components	increase	ease-of-use	to	end	
users.	Current	Ambari	Views	include	Hive,	Pig,	Tez,	
Capacity	Scheduler,	File,	HDFS.		
The	Ambari	Views	Framework	allow	developers	to	
create	new	user	interface	components	that	plug	into	
the	Ambari	web	interface.		
OPERATIONS SECURITY
GOVERNANCE
STORAGE
STORAGE
25 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
A M B A R I 	 V I E W S
T O O L S
26 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hive Server 2
(compile, optimize, execute)
HDFS
Apache	Hive	Architecture
Client – beeline, Ambari Hive
View, Zeppelin, BI of Choice
SQL over
JDBC/ ODBC
database
Table
1
Partition
1
Table
2
Partition
2
Hive
MetaStore
TEZ / MR
Data in HDFS
• Structured
• Unstructured
• Semi structured
Schema definitions
Distribution Engine
Data Storage
Interpreter
27 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Submitting	Hive	Queries
à Hive	CLI
– Traditional	Hive	client	that	connects	to	a	HiveServer instance
– $ hive
hive>
à Beeline
– A	new	command	line	client	that	connects	to	a	HiveServer2	instance
– $ beeline
beeline> !connect
jdbc:hive2://hostname:10000 username password
org.apache.hive.jdbc.HiveDriver
28 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Best Practices for Data Loading - Create External Table, Create
ORC Table
hdfs dfs -copyFromLocal cars.csv /visdata
CREATE	EXTERNAL	TABLE	IF	NOT	EXISTS	Cars(Name	STRING,	Miles_per_Gallon INT,		Cylinders	
INT,	Displacement	INT,	Horsepower	INT,	Weight_in_lbs INT,	Acceleration	DECIMAL,	Year	DATE,								
Origin	CHAR(1))	COMMENT	'Data'				ROW	FORMAT	DELIMITED				FIELDS	TERMINATED	BY	','				
STORED	AS	TEXTFILE	 location	’/visdata';
CREATE	TABLE	IF	NOT	EXISTS	mycars(	Name	STRING,	Miles_per_Gallon INT,	Cylinders	INT,	
Displacement	INT,	Horsepower	INT,	Weight_in_lbs INT,	Acceleration	DECIMAL,	Year	DATE,	Origin	
CHAR(1))	COMMENT	'Data'	ROW	FORMAT	DELIMITED	FIELDS	TERMINATED	BY	','	STORED	AS	
ORC;
INSERT	OVERWRITE	TABLE	mycars SELECT	*	FROM	cars;
29 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Data	Formats
More	Flexible Better	Storage	and	Performance///
30 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
What	Is	Apache	Spark?
à Apache	open	source	project	
originally	developed	at	AMPLab
(University	of	California	Berkeley)
à Unified	data	processing	engine	that	
operates	across	varied	data	
workloads	and	platforms
31 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Why	Apache	Spark?
à Elegant	Developer	APIs
– Single	environment	for	data	munging,	data	wrangling,	and	Machine	Learning	(ML)
à In-memory	computation	model	– Fast!
– Effective	for	iterative	computations	and	ML
à Machine	Learning
– Implementation	of	distributed	ML	algorithms
– Pipeline	API	(Spark	ML)
32 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Spark	SQL
Structured	Data
Spark	Streaming
Near	Real-time
Spark	MLlib
Machine	Learning
GraphX
Graph	Analysis
33 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
DataFrames
à Distributed collection of	data organized into	named	
columns
à Conceptually	equivalent	to	a	table	in	relational	DB	or	
a	data	frame	in	R/Python
à API	available	in	Scala,	Java,	Python,	and	R
Col1 Col2 … … ColN
DataFrame
Column
Row
Data	is	described	as	a	DataFrame
with	rows,	columns,	and	a	schema
34 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Sources
CSVAvro
HIVE
Spark	SQL
Col1 Col2 … … ColN
DataFrame
Column
Row
JSON
35 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Create	a	DataFrame
val path = "examples/flights.json"
val flights = spark.read.json(path)
Example
36 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Rapid	Ecosystem	Adoption:	210+	Processors
HTTP
Syslog
Email
HTML
Image
Hash Encrypt
Extract
TailMerge
Evaluate
Duplicate Execute
Scan
GeoEnrich
Replace
ConvertSplit
Translate
HL7
FTP
UDP
XML
SFTP
Route	Content
Route	Context
Route	Text
Control	Rate
Distribute	Load
AMQP
37 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
FlowFiles are	like	HTTP	data
HTTP	Data FlowFile
HTTP/1.1	200	OK
Date:	Sun,	10	Oct	2010	23:26:07	GMT
Server:	Apache/2.2.8	(CentOS)	OpenSSL/0.9.8g
Last-Modified:	Sun,	26	Sep	2010	22:04:35	GMT
ETag:	"45b6-834-49130cc1182c0"
Accept-Ranges:	bytes
Content-Length:	13
Connection:	close
Content-Type:	text/html
Hello	world!
Standard	FlowFile Attributes
Key:	'entryDate’ Value:	'Fri	Jun	17	17:15:04	EDT	2016'
Key:	'lineageStartDate’			Value:	'Fri	Jun	17	17:15:04	EDT	2016'
Key:	'fileSize’ Value:	'23609'
FlowFile Attribute	Map	Content
Key:	'filename’ Value:	'15650246997242'
Key:	'path’ Value:	'./’
Binary	Content	*
Header
Content
38 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
http://hortonworks.com/blog/hdf-2-0-flow-processing-real-time-tweets-strata-hadoop-slack-tensorflow-phoenix-zeppelin/
39 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
FlowFile:	Each	piece	of	"User	Data"	(i.e.,	data	that	the	user	brings	into	NiFi	for	processing	and	distribution)	is	
referred	to	as	a	FlowFile.	A	FlowFile is	made	up	of	two	parts:	Attributes	and	Content.	The	Content	is	the	User	Data	
itself.	Attributes	are	key-value	pairs	that	are	associated	with	the	User	Data.
Processor:	The	Processor	is	the	NiFi	component	that	is	responsible	for	creating,	sending,	receiving,	transforming,	
routing,	splitting,	merging,	and	processing	FlowFiles.	It	is	the	most	important	building	block	available	to	NiFi	users	
to	build	their	dataflows.
https://nifi.apache.org/docs/nifi-docs/html/getting-started.html
https://nifi.apache.org/docs/nifi-docs/html/overview.html
http://www.slideshare.net/aldrinpiri/apache-nifi-crash-course-san-jose-hadoop-summit-66967077
https://hortonworks.com/hadoop-tutorial/learning-ropes-apache-nifi/
Quick Terms and Reference
40 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
InvokeHttp
GetTwitter
Input
41 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
RouteOnAttribute
ExecuteStreamCommand
UpdateAttribute
Processing
42 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
PutHDFS:			Have	access	to	your	Hadoop	HDFS	from	the	NIFI	box	and	have	this	configuration:			
/etc/hadoop/conf/core-site.xml .		Also	create	a	directory	to	use	like		hdfs dfs –mkdir /nifi-place
PutSQL:			 Create	a	connection	pool,	know	your	JDBC	information.
1. Phoenix:		 URL=		jdbc:phoenix:clusterzookeeper:2181:/hbase-unsecure			
org.apache.phoenix.jdbc.PhoenixDriver file:///opt/demo/phoenix-client.jar
User=	root				pool=2				You	will	need	the	JDBC	JAR	on	the	local	file	system.
2. MySQL:		
jdbc:mysql://tspanndev11.field.hortonworks.com:3306/datacom.mysql.jdbc.Driver/usr/share/java/mysql
-connector-java.jar
PutSlack
Need	to	get	your	webhook URL	from	your	slack	site.		You	can	go	to	slack.com and	get	your	own	free	
room	to	test	with.
https://api.slack.com/incoming-webhooks
Output
43 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
python	classify_image.py --image_file /opt/demo/dronedataold/Bebop2_20160920083655-0400.jpg
solar	dish,	solar	collector,	solar	furnace	(score	=	0.98316)
window	screen	(score	=	0.00196)
manhole	cover	(score	=	0.00070)
radiator	(score	=	0.00041)
doormat,	welcome	mat	(score	=	0.00041)
bazel-bin/tensorflow/examples/label_image/label_image --
image=/opt/demo/dronedataold/Bebop2_20160920083655-0400.jpg	
tensorflow/examples/label_image/main.cc:204]	solar	dish	(577):	0.983162I	
tensorflow/examples/label_image/main.cc:204]	window	screen	(912):	0.00196204I	
tensorflow/examples/label_image/main.cc:204]	manhole	cover	(763):	0.000704005I	
tensorflow/examples/label_image/main.cc:204]	radiator	(571):	0.000408321I	
tensorflow/examples/label_image/main.cc:204]	doormat	(972):	0.000406186
Local TensorFlow via Python or C++ Binary
44 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
/opt/demo/sentiment/run.sh
python	/opt/demo/sentiment/sentiment.py "$@”
from	nltk.sentiment.vader import	SentimentIntensityAnalyzer
import	sys
sid =	SentimentIntensityAnalyzer()
ss =	sid.polarity_scores(sys.argv[1])
print('Compound	{0}	Negative	{1}	Neutral	{2}	Positive	{3}	'.format(
ss['compound'],ss['neg'],ss['neu'],ss['pos']))
or
if	ss['compound']	==	0.00: print('Neutral')
elif ss['compound']	<	0.00: print	('Negative')
else: print('Positive')
Local Sentiment Analysis via Python
45 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
https://pip.pypa.io/en/latest/installing/
http://www.nltk.org/install.html
wget https://bootstrap.pypa.io/get-pip.py
python	get-pip.py
sudo pip install -U nltk
sudo pip install -U numpy
Installing NLTK for Python 2.7
Installing	TensorFlow is	a	very	difficult	exercise,	after	getting	NLTK	you	can	start	the	process.			You	will	need	
most	of	the	development	tools	for	Python,	C,	C++,	Bezel,	Pip	and	more.			A	beefy	machine	with	a	lot	of	
RAM,	CPUs	and	GPUs	would	be	useful.
Check	out	my	install	article	for	a	guide:
https://dzone.com/articles/deep-learning-resources
Installing TensorFlow for Python 2.7
46 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Results
47 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Results
48 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Installation
Download the binary from here:
http://hortonworks.com/downloads/#dataflow
Or here:
https://nifi.apache.org/download.html
Or on Mac:
brew install nifi
https://nifi.apache.org/docs/nifi-docs/html/getting-
started.html#starting-nifi
bin/nifi.sh start
bin/nifi.sh install (now it’s installed as a service on Linux)
49 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Contact:
Timothy	Spann			@PaaSDeV
www.meetup.com/futureofdata-princeton
community.hortonworks.com/users/9304/tspann.html
http://www.coreservlets.com/hadoop-tutorial/
https://www.slideshare.net/HadoopSummit/hadoop-crash-course
50 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hortonworks	Community	Connection
Read access for everyone, join to participate and be recognized
• Full	Q&A	Platform	(like	StackOverflow)
• Knowledge	Base	Articles
• Code	Samples	and	Repositories
51 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Community	Engagement
Participate now at: community.hortonworks.com©	Hortonworks	Inc.	2011	– 2015.	All	Rights	Reserved
4,000+
Registered	Users
10,000+
Answers
15,000+
Technical	Assets
One Website!

Más contenido relacionado

La actualidad más candente

Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseHortonworks
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiTimothy Spann
 
Introduction to HDF 3.0
Introduction to HDF 3.0Introduction to HDF 3.0
Introduction to HDF 3.0Timothy Spann
 
IoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJIoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJDaniel Madrigal
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive DataHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsDataWorks Summit
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...DataWorks Summit/Hadoop Summit
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextHortonworks
 

La actualidad más candente (20)

Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical Enterprise
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar
 
Apache Hadoop Crash Course - HS16SJ
Apache Hadoop Crash Course - HS16SJApache Hadoop Crash Course - HS16SJ
Apache Hadoop Crash Course - HS16SJ
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
 
Introduction to HDF 3.0
Introduction to HDF 3.0Introduction to HDF 3.0
Introduction to HDF 3.0
 
Intro to Spark & Zeppelin - Crash Course - HS16SJ
Intro to Spark & Zeppelin - Crash Course - HS16SJIntro to Spark & Zeppelin - Crash Course - HS16SJ
Intro to Spark & Zeppelin - Crash Course - HS16SJ
 
IoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJIoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJ
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerations
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 

Similar a Introduction to Hadoop

Hive2 Introduction -- Interactive SQL for Big Data
Hive2 Introduction -- Interactive SQL for Big DataHive2 Introduction -- Interactive SQL for Big Data
Hive2 Introduction -- Interactive SQL for Big DataYifeng Jiang
 
SoCal BigData Day
SoCal BigData DaySoCal BigData Day
SoCal BigData DayJohn Park
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics OptimizationHortonworks
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics OptimizationIsheeta Sanghi
 
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...VMware Tanzu
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache HadoopHortonworks
 
Cloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a championCloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a championAmeet Paranjape
 
introduction-to-apache-kafka
introduction-to-apache-kafkaintroduction-to-apache-kafka
introduction-to-apache-kafkaYifeng Jiang
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016alanfgates
 
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...Big Data Spain
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache ZeppelinIntro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache ZeppelinAlex Zeltov
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksData Con LA
 
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方HortonworksJapan
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopSlim Baltagi
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Hortonworks
 
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Spark Summit
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemShivaji Dutta
 
Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...
Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...
Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...Databricks
 

Similar a Introduction to Hadoop (20)

Hive2 Introduction -- Interactive SQL for Big Data
Hive2 Introduction -- Interactive SQL for Big DataHive2 Introduction -- Interactive SQL for Big Data
Hive2 Introduction -- Interactive SQL for Big Data
 
SoCal BigData Day
SoCal BigData DaySoCal BigData Day
SoCal BigData Day
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
 
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
 
Cloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a championCloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a champion
 
introduction-to-apache-kafka
introduction-to-apache-kafkaintroduction-to-apache-kafka
introduction-to-apache-kafka
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016
 
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache ZeppelinIntro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
 
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise Hadoop
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
 
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystem
 
Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...
Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...
Spark HBase Connector: Feature Rich and Efficient Access to HBase Through Spa...
 

Más de Timothy Spann

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...Timothy Spann
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-PipelinesTimothy Spann
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTimothy Spann
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-ProfitsTimothy Spann
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...Timothy Spann
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsTimothy Spann
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Timothy Spann
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI PipelinesTimothy Spann
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkTimothy Spann
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...Timothy Spann
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesTimothy Spann
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel AlertsTimothy Spann
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data PipelinesTimothy Spann
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoTimothy Spann
 

Más de Timothy Spann (20)

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI Pipelines
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
 

Último

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 

Último (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Introduction to Hadoop