SlideShare una empresa de Scribd logo
1 de 36
Descargar para leer sin conexión
#elasticsearch
MAKE SENSE OF YOUR (BIG) DATA!
David Pilato
Technical advocate	

!
elasticsearch.
@dadoonet
StartUp #elasticsearch
data ?
StartUp #elasticsearch
StartUp #elasticsearch
StartUp #elasticsearch
StartUp #elasticsearch
StartUp #elasticsearch
StartUp #elasticsearch
BIG data ?
StartUp #elasticsearch
BIG data ?
StartUp #elasticsearch
Source: http://www.csc.com/insights/flxwd/78931-big_data_just_beginning_to_explode
35.000.000.000.000.000 mb
StartUp #elasticsearch
Source: http://www.thebigdatainsightgroup.com/site/article/big-data-infographic
StartUp #elasticsearch
search = like % ?
SELECT 	

doc.*, country.* 	

FROM 	

doc, country	

WHERE 	

doc.country_code = country.code AND	

doc.date_doc > to_date('2011-12', 'yyyy-mm') AND 	

doc.date_doc < to_date('2012-01', 'yyyy-mm') AND 	

lower(country.name) = 'france' AND 	

lower(doc.comment) LIKE ‘%product%' AND
lower(doc.comment) LIKE ‘%david%';
StartUp #elasticsearch
Search engine ?
StartUp #elasticsearch
elasticsearch ?
plug & play
REST/JSON
scalable
Apache 2 license
Lucene
elasticsearch
#elasticsearch
Start…
$ wget https://download.elasticsearch.org/elasticsearch/
elasticsearch/elasticsearch-1.1.0.tar.gz!
$ tar -xf elasticsearch-1.1.0.tar.gz!
$ ./elasticsearch-1.1.0/bin/elasticsearch!
[INFO ][node ][Ghost Maker] {1.1.0}[5645]: initializing
#elasticsearch
… and play!
$ curl -XPUT localhost:9200/sessions/session/1 -d '{!
"title" : "Elasticsearch",!
"subtitle" : "Make sense of your (BIG) data !",!
"date" : "2014-04-12T14:10:00",!
"tags" : [ "elasticsearch", "codemotion", "bigdata" ],!
"speakers" : [{!
"first_name" : "David", !
"last_name" : "Pilato" !
}]!
}'
#elasticsearch
Search!
$ curl http://localhost:9200/sessions/session/_search -d'
{
"query": {
"multi_match": {
"query": "elasticsearch codemotion david",
"fields": [ "title^3", "tags^2", "speakers.first_name" ]
}
},
"post_filter": {
"range": {
"date": {
"from": "2014-04-09",
"to": "2014-04-13"
}
}
}
}'
StartUp #elasticsearch
Compute?
#elasticsearch
$ curl http://localhost:9200/sessions/session/_search -d'
{
"query": { ... },
"aggs": {
"by_date": {
"date_histogram": {
"field": "date",
"interval": "day",
"format" : "dd/MM/yyyy"
}
}
}
}'
"by_date": [
{ "key_as_string": "03/04/2014", "doc_count": 1 },
{ "key_as_string": "12/04/2014", "doc_count": 2 },
{ "key_as_string": "16/04/2014", "doc_count": 3 }
]
Compute!
#mstechdays #elasticsearch StartUp #elasticsearch
• logs	

• twitter	

• github	

• marketing data	

• ...	

• your data	

• your big data
Let’s make sense of …
#mstechdays #elasticsearch StartUp #elasticsearch
• logs	

• twitter	

• github	

• marketing data	

• ...	

• your data	

• your big data
Let’s make sense of …
{
"name":"Pilato David",
"dateOfBirth":"1971-12-26",
"gender":"male",
"children":3,
"marketing":{
"fashion":334,
"music":3363,
"hifi":2351
},
"address":{
"country":"France",
"city":"Paris",
"location": [2.332395, 48.861871]
}
}
démo
#mstechdays #elasticsearch StartUp #elasticsearch
MAKE SENSE OF YOUR (BIG) DATA!
let’s inject some marketing documents…
#elasticsearch
ELASTICSEARCH
StartUp #elasticsearch
Distributed indices
node 1
orders
products
1 2
3 4
1 2
$ curl -XPUT localhost:9200/orders -d '{!
"settings.index.number_of_shards" : 4,!
"settings.index.number_of_replicas" : 1!
}'
$ curl -XPUT localhost:9200/products -d '{!
"settings.index.number_of_shards" : 2,!
"settings.index.number_of_replicas" : 0!
}'
StartUp #elasticsearch
Distributed indices
node 1
orders
products
1 2
3 4
1 2
node 2
$ bin/elasticsearch!
[INFO ][cluster.service][Armageddon] detected_master [Ghost Maker]
StartUp #elasticsearch
Distributed indices
node 1
orders
products
1
4
1
node 2
orders
products
2
3
2
2
3
1
4
2
3
2
StartUp #elasticsearch
node 3
Distributed indices
node 1
orders
products
1
4
1
node 2
orders
products
2
3
2
2
3
1
4
$ bin/elasticsearch!
[INFO ][cluster.service][Karnak] detected_master [Ghost Maker]
StartUp #elasticsearch
node 3
products
orders
Distributed indices
node 1
orders
products
1
4
1
node 2
orders
products
2
33
2
2
3
1
4 3
1
4
elasticsearch.
elasticsearch
kibana
logstash
Marvel
elasticsearch.
Training (public and on-site)
Development support
Production support
Marvel
@dadoonet
questions ?
we are hiring! jobs@elasticsearch.com
@dadoonet
grazie !

Más contenido relacionado

Similar a Make sense of your big data - Pilato

The Business of Big Data (IA Ventures)
The Business of Big Data (IA Ventures)The Business of Big Data (IA Ventures)
The Business of Big Data (IA Ventures)
Ben Siscovick
 
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
ALTER WAY
 
Big Data in Practice.pdf
Big Data in Practice.pdfBig Data in Practice.pdf
Big Data in Practice.pdf
Tom Tan
 

Similar a Make sense of your big data - Pilato (20)

David pilato codemotion-telaviv
David pilato codemotion-telavivDavid pilato codemotion-telaviv
David pilato codemotion-telaviv
 
JSON Data Modeling - July 2018 - Tulsa Techfest
JSON Data Modeling - July 2018 - Tulsa TechfestJSON Data Modeling - July 2018 - Tulsa Techfest
JSON Data Modeling - July 2018 - Tulsa Techfest
 
The Business of Big Data (IA Ventures)
The Business of Big Data (IA Ventures)The Business of Big Data (IA Ventures)
The Business of Big Data (IA Ventures)
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
 
KDD 2019 IADSS Workshop - Research Updates from Usama Fayyad & Hamit Hamutcu
KDD 2019 IADSS Workshop - Research Updates from Usama Fayyad & Hamit HamutcuKDD 2019 IADSS Workshop - Research Updates from Usama Fayyad & Hamit Hamutcu
KDD 2019 IADSS Workshop - Research Updates from Usama Fayyad & Hamit Hamutcu
 
Test trend analysis: Towards robust reliable and timely tests
Test trend analysis: Towards robust reliable and timely testsTest trend analysis: Towards robust reliable and timely tests
Test trend analysis: Towards robust reliable and timely tests
 
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
 
Intro to Data Science
Intro to Data ScienceIntro to Data Science
Intro to Data Science
 
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
 
iTrain Malaysia: Data Science by Tarun Sukhani
iTrain Malaysia: Data Science by Tarun SukhaniiTrain Malaysia: Data Science by Tarun Sukhani
iTrain Malaysia: Data Science by Tarun Sukhani
 
Open Data and News Analytics Demo
Open Data and News Analytics DemoOpen Data and News Analytics Demo
Open Data and News Analytics Demo
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data science
 
Chapter 4 marketing research pace
Chapter 4 marketing research paceChapter 4 marketing research pace
Chapter 4 marketing research pace
 
Dpo latin strategy 04132018
Dpo latin strategy 04132018Dpo latin strategy 04132018
Dpo latin strategy 04132018
 
JSON Data Modeling - GDG Indy - April 2020
JSON Data Modeling - GDG Indy - April 2020JSON Data Modeling - GDG Indy - April 2020
JSON Data Modeling - GDG Indy - April 2020
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science
 
Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16
 
Why CxOs care about Data Governance; the roadblock to digital mastery
Why CxOs care about Data Governance; the roadblock to digital masteryWhy CxOs care about Data Governance; the roadblock to digital mastery
Why CxOs care about Data Governance; the roadblock to digital mastery
 
Big Data in Practice.pdf
Big Data in Practice.pdfBig Data in Practice.pdf
Big Data in Practice.pdf
 

Más de Codemotion

Más de Codemotion (20)

Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
 
Pompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending storyPompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending story
 
Pastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storiaPastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storia
 
Pennisi - Essere Richard Altwasser
Pennisi - Essere Richard AltwasserPennisi - Essere Richard Altwasser
Pennisi - Essere Richard Altwasser
 
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
 
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
 
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
 
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 - Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
 
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
 
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
 
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
 
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
 
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
 
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
 
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
 
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
 
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
 
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
 
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
 
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
 

Último

Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Peter Udo Diehl
 

Último (20)

Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, Ocado
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
THE BEST IPTV in GERMANY for 2024: IPTVreel
THE BEST IPTV in  GERMANY for 2024: IPTVreelTHE BEST IPTV in  GERMANY for 2024: IPTVreel
THE BEST IPTV in GERMANY for 2024: IPTVreel
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
Connecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAKConnecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAK
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
 

Make sense of your big data - Pilato