Interesting ways Big Data is used
today
Daniel Sarbe
May 2015, Big Data Romanian Tour - Timisoara
Agenda
1. Source of (Big)Data
2. Why now?
3. Interesting patterns of using BigData
4. BigData – Big Opportunities
“There is a big data revolution.
But it is not the quantity of data that is revolutionary.
The big data revolution is that now we can do something with the data.”
Gary King, professor at Harvard University
“In God we Trust, all others bring data”
William Edwards Deming - American statistician
“If we have data, let’s look at data. If all we have are opinions, let’s go with mine.”
Jim Barksdale, former Netscape CEO
Source of (Big) Data
The 3+1Vs of Big Data
Just how Big is the Big Data market?
Big Data Market Forecast, 2011-2020
(in $ billion)
Source: Wikibon 2015
Interest in BigData
Why now?
1) So much data is generated that we cannot store
and analyze it with conventional tools
Big Data in Aviation Industry
2) Companies realized the potential and started to
invest money
Companies sources of data analysis
Source: captricity.com
Data Mining & Machine Learning
• Data Mining - The process of discovering meaningful correlations, patterns and
trends by sifting through large amounts of data
• Machine Learning is the study of computer algorithms that improve automatically
through experience
▫ Supervised machine learning - The program is “trained” on a pre-defined set of
“training examples”, which then facilitate its ability to reach an accurate conclusion when
given new data.
▫ Unsupervised machine learning - The program is given a bunch of data and must find
patterns and relationships therein.
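To make the supervised/unsupervised distinction concrete, here is a minimal sketch in plain Python. The data, the function names (`train_centroids`, `predict`, `two_means`) and the choice of a nearest-centroid classifier and a one-dimensional two-cluster k-means are illustrative assumptions, not something taken from the slides.

```python
from statistics import mean

# Supervised: labelled training examples -> nearest-centroid classifier.
def train_centroids(examples):
    """examples: list of (value, label). Returns {label: mean value}."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Assign the label whose centroid is closest to value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Unsupervised: no labels; split 1-D data into two clusters (k-means, k=2).
def two_means(values, iterations=10):
    lo, hi = min(values), max(values)  # initial cluster centres
    for _ in range(iterations):
        near_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        near_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not near_hi:  # all values identical, nothing to split
            break
        lo, hi = mean(near_lo), mean(near_hi)
    return lo, hi

# Hypothetical data: message length in words, labelled spam / ham.
training = [(5, "spam"), (7, "spam"), (40, "ham"), (55, "ham")]
centroids = train_centroids(training)
label = predict(centroids, 9)  # classify a new, unseen message

# The same kind of values without labels: the program finds the groups itself.
centres = two_means([5, 7, 6, 40, 55, 48])
```

In the supervised half the labels come from a human; in the unsupervised half the two groups emerge from the data alone, and it is up to the analyst to decide what they mean.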
BigData use-cases
Source: IBM
Analytics Maturity
Predictive Analysis
• Predictive analytics is an area of data mining that deals with extracting
information from data and using it to predict trends and behavior patterns.
• The accuracy and usability of results will depend greatly on the level of data
analysis and the quality of assumptions
BigData used for predictions – 2012 US Election
The 2012 Election: A Big Win for Big Data
• Statistician Nate Silver gave Barack Obama over a
90 percent chance of victory in the Electoral College.
• The model's name, 538 (FiveThirtyEight), is the number of electors in the US Electoral College
• In 2008 (John McCain vs. Barack Obama) his mathematical model correctly called 49
out of 50 states, missing only Indiana (which went to
Obama by about 1%).
• In 2012 Silver's model correctly predicted all 50
states.
• He incorporated hundreds of state-level polls into his
analysis, along with demographics, past electoral
outcomes, historical polls, economic data
and party registration figures
• While some analysts might cherry-pick data sources
according to whether they were qualitatively
"reliable" or "unbiased", Silver incorporated them
all. Silver's model instead looked at trends over time
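The idea of folding every available poll into one estimate, rather than picking favourites, can be illustrated with a weighted average. This is only a sketch under stated assumptions: the weighting by sample size and recency (with a hypothetical `half_life_days` parameter), the function name and the poll numbers are all invented for illustration and are not Silver's actual model.

```python
def aggregate_polls(polls, half_life_days=14.0):
    """Weighted average of a candidate's poll share.

    polls: list of (share_percent, sample_size, age_days).
    Weight = sample_size * 0.5 ** (age_days / half_life_days),
    so larger and more recent polls count more (an assumed scheme).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for share, sample_size, age_days in polls:
        weight = sample_size * 0.5 ** (age_days / half_life_days)
        weighted_sum += share * weight
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no usable polls")
    return weighted_sum / total_weight

# Hypothetical state-level polls: (share %, sample size, age in days)
polls = [(51.0, 800, 0), (49.0, 800, 14), (50.0, 400, 0)]
estimate = aggregate_polls(polls)
```

No poll is discarded here; an older or smaller poll simply contributes less to the estimate.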
BigData used for predictions - 2014 Sochi Winter Olympics
• “Canada will enjoy their best Olympics ever,
while the U.S. and host Russia will struggle.”
BigData used for predictions - 2014 Sochi Winter Olympics
• The analysts used publicly available data on all Winter Olympic Games from 1924 forward
• The model's inputs are Gross Domestic Product(GDP), year, if the country is
communist or not, if the country is a host or not, population of that country, and
its historical performances and medal counts in previous Olympics.
• All variables are given the same weight in the model
• The medal count prediction is based on a linear regression model
• The algorithm is based on historical data, and doesn’t necessarily reflect more current
information such as emerging stars, recent funding boosts, and an unexpectedly large addition of
new events to the program.
• “Based on the above mentioned data and analysis, the analysts predict that Canadian athletes
will grab the most medals and the United States will finish seventh. Germany, Norway,
Austria, China and Russia will rank second to sixth respectively.”
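A linear regression of medal counts on country-level inputs can be sketched as follows. This is a minimal illustration in plain Python, not the analysts' model: it uses only two of the listed inputs (medals at the previous Games and a host flag), the training numbers are invented, and the function names `fit_linear` and `predict` are hypothetical.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    rows: list of feature lists; an intercept column is added here.
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

def predict(weights, row):
    """Intercept plus the weighted sum of the features."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))

# Invented training data: [medals at previous Games, is_host (0 or 1)]
history = [[10, 0], [20, 0], [15, 1], [30, 0], [25, 1]]
medals = [12, 22, 21, 32, 31]  # invented medal counts
weights = fit_linear(history, medals)
forecast = predict(weights, [18, 1])  # hypothetical host with 18 prior medals
```

Because the fit uses only historical data, it inherits the limitation noted above: it cannot see emerging stars, funding changes or newly added events.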
Big Data used in other sports
Germany Uses Big Data to Crush Brazil in World Cup Semifinal
• Forget about Moneyball - Germany has now used serious Big Data to win a World Cup match.
• Soccer, a more fluid game, was thought to be less amenable than baseball to Big Data's wiles.
• According to assistant coach Hansi Flick, team managers combed through years of research about
the Brazilian team compiled by students at Cologne's Sports University, looking for any advantage to
be gained over the Brazilian team.
• The compiled information included a detailed analysis of all Brazil's players--their favorite moves,
how they deal with high pressure scenarios, their reactions when fouled, and even how they sprint
for the ball.
3) Cost of cloud/hardware and maturity of
software solutions (Hadoop ecosystem)
Cost per GB of disk
Hadoop – The platform for BigData
• Hadoop became a very stable and mature
platform (and faster)
Hadoop 1.0 to 2.0
Hadoop myths debunked
Hadoop isn’t enterprise ready
Hadoop isn’t stable, clusters go down
You lose data on HDFS
Data cannot be shared across the organization
Hadoop is not secure
NameNode does not scale
Software upgrades are rare
Hadoop use cases are limited
I need expensive servers to get more
Hadoop is so dead
Source: Sumeet Singh - Yahoo
Cost of BigData vs Traditional DBs
Hadoop Providers
◦ 1. Cloudera - $4B market value - 1,000+ paying customers
◦ 2. Hortonworks - $1B market value - 800+ paying customers
◦ 3. MapR - $1B market value - 700+ paying customers
Open Data Platform
The Open Data Platform Initiative (ODP) is a shared
industry effort focused on promoting and advancing the
state of Apache Hadoop® and Big Data technologies for the
enterprise.
Other (interesting) Big Data real use-cases
Netflix
Netflix collects a lot of data to understand how its users behave and what their
preferences are
• It collects metrics including what people watch, when they watch, where they watch,
what devices they use, ratings, searches, when users pause or stop watching, etc.
• Netflix made the House of Cards decision by identifying that subscribers who
watched the original British version of House of Cards were very likely to watch
movies starring Kevin Spacey or directed by David Fincher
• Netflix made ten different versions of the trailer for House of Cards geared towards
different audiences
▫ Fans of Kevin Spacey watched trailers that were focused on him while people who liked
female-oriented movies saw trailers that highlighted the women in the show.
Verizon
• 103.3 million wireless customers, 6.2 million Internet users and 5.3 million TV subscribers.
• Data collected:
▫ Calls (e.g. ordering flowers) or pages accessed
▫ Locations in City + Roaming
▫ Home + Mobile web pages + Television
• Formed a Precision Marketing division – e.g. Event attendance information
▫ Did migrating from iPhone 5 to iPhone 6 result in higher data plan usage or not?
▫ Some customers migrated from Android to iPhone and consumed 3x-5x more data
Notes:
• Customers can choose not to participate in the program by going to their privacy choices page on MyVerizon or by
calling 866-211-0874
• Verizon’s business and government customers are not part of the Precision program
The Perfect Milk - Digital Cow - The internet of cows
• Embedded sensors in cows' stomachs
• If a cow is sick, the sensor lets a veterinarian know while there is still time to treat
the disease
• Sensor to detect the presence of E. coli bacteria
• Vital Herd, a Texas-based start-up: its e-Pill collects information about the
animal: breathing rate, heart rate, temperature, rumination time, rumen
acidity and estrogen levels
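A sensor-based alert of the kind described above could be as simple as comparing each reading with a normal range. The sketch below is hypothetical: the field names and the threshold values are placeholders chosen for illustration, not veterinary reference values and not Vital Herd's actual logic.

```python
# Placeholder normal ranges (assumed values, for illustration only).
NORMAL_RANGES = {
    "temperature_c": (38.0, 39.5),
    "heart_rate_bpm": (48, 84),
    "rumen_ph": (5.8, 7.0),
}

def out_of_range(reading):
    """Return the names of metrics that fall outside their normal range."""
    alerts = []
    for metric, (low, high) in NORMAL_RANGES.items():
        value = reading.get(metric)
        if value is not None and not (low <= value <= high):
            alerts.append(metric)
    return alerts

# Hypothetical reading sent by one animal's sensor.
reading = {"cow_id": "A17", "temperature_c": 40.2,
           "heart_rate_bpm": 70, "rumen_ph": 6.3}
alerts = out_of_range(reading)
if alerts:
    print(f"Notify veterinarian: cow {reading['cow_id']} -> {alerts}")
```

A real system would more likely track each animal's own baseline over time than rely on fixed thresholds, but the flow is the same: collect, compare, notify.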
The City of Las Vegas
• Archaic records and inaccurate information
• Took advantage of smart data to develop a living
model of its utilities network
• Aggregated data from various sources into a single
real-time 3D model created with Autodesk
technology for both above and below ground
utilities
Google Flu Trends
BigData – Big Opportunities
• Big data means big IT job opportunities -- for the right people
Big Opportunities
• Gartner predicted in 2013 that by 2015, Big Data demand will generate 4.4
million jobs in the IT Industry all around the world.
• 1.9 million IT jobs will be created just in the U.S. That is how Big Data
directly affects the IT Industry.
• Only one third of these jobs will be filled, due to a lack of
skilled candidates
What is needed?
• A Curious Mind Is Key - The most important qualifications for these positions
aren't academic degrees, certifications, job experience or titles. Rather, they seem to
be soft skills: a curious mind, the ability to communicate with nontechnical people, a
persistent -- even stubborn -- character and a strong creative bent.
• The CIA is hiring data scientists: “We are looking for curious, creative
individuals interested in serving their country through the field of data
science.”
“I keep saying that the sexy job in the next 10 years will be
statisticians, and I’m not kidding.”
Hal Varian, chief economist at Google
“Without big data, you are blind and deaf and in the
middle of a freeway.”
Geoffrey Moore, author and consultant
Thank you!
Twitter: @danielsarbe

Editor's notes

  1. In 60 seconds, Google receives over 4,000,000 search queries, YouTube users upload 71 hours of new video and Twitter users share 277,000 tweets. Apple Watch: predicted 30 M units in the first year, versus 29.2 M units for all Swiss watches sold.
  2. In 60 seconds: 200 M emails are sent, Twitter users share 277,000 tweets and Apple users download 48,000 apps. Facebook generates 10 PB of data per day. The first documented use of the term “big data” appeared in a 1997 paper by scientists at NASA, describing the problem they had with visualization (i.e. computer graphics), which “provides an interesting challenge for computer systems: data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data. When data sets do not fit in main memory (in core), or when they do not fit even on local disk, the most common solution is to acquire more resources.”
  3. Cloud Market Size: $16 B, 30% Amazon, 10% Microsoft
  4. temperature, humidity, air pressure, etc. http://www.theguardian.com/technology/2015/apr/29/apple-ipad-fail-grounds-few-dozen-american-airline-flights - $1.2 M saved per year from paper and fuel
  5. “Extracting knowledge from data”; “torturing the data until it confesses”. Examples: machine translation, spam filters, face recognition, car/housing price predictors.
  6. From batch processing to a data operating system. YARN (Yet Another Resource Negotiator) separates the processing engine from resource management, which makes Hadoop more like an operating system that supports multiple users and multiple applications. In Hadoop 1.0, everything was batch-oriented. In 2.0, multiple applications can hit the data at once: streaming, online, in-memory.
  7. Due to archaic records and inaccurate information, most utilities have no idea where all of their underground assets are located, resulting in those all-too-common service interruptions for residents when a power line is accidentally cut or a water line bursts. To avoid these problems, the City of Las Vegas took advantage of smart data to develop a living model of its utilities network. VTN Consulting helped the city aggregate data from various sources into a single real-time 3D model created with Autodesk technology. The model includes both above and below ground utilities, and is being used to visualize the location and performance of critical assets located under the city.
  8. http://www.fastcompany.com/1842928/time-build-your-big-data-muscles