SlideShare una empresa de Scribd logo
1 de 24
Understanding Big Data
Overview 
Why Big Data 
Big Data Users 
What is Big Data
Big Data 
Big data is a popular term used to describe the exponential growth and availability of data, both structured and 
unstructured. And big data may be as important to business – and society – as the Internet has become. Why? 
More data may lead to more accurate analyses. 
More accurate analyses may lead to more confident decision making. And better decisions can mean greater 
operational efficiencies, cost reductions and reduced risk.
Big Data 
Mainstream definition of big data as the three V’s of big data:
Big Data
Big Data 
• Volume: Many factors contribute to the increase in data volume. Transaction-based data stored through 
the years. Unstructured data streaming in from social media. Increasing amounts of sensor and machine-to- 
machine data being collected. 
• Velocity: Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID 
tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. 
Reacting quickly enough to deal with data velocity is a challenge for most organizations. 
• Variety: Data today comes in all types of formats. Structured, numeric data in traditional databases. 
Information created from line-of-business applications. Unstructured text documents, email, video, audio, 
stock ticker data and financial transactions. Managing, merging and governing different varieties of data is 
something many organizations still grapple with.
Big Data 
Let us consider two additional dimensions when thinking about big data: 
• Variability: In addition to the increasing velocities and varieties of data, data flows can be highly 
inconsistent with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered 
peak data loads can be challenging to manage. Even more so with unstructured data involved. 
• Complexity: Today's data comes from multiple sources. And it is still an undertaking to link, match, cleanse 
and transform data across systems. However, it is necessary to connect and correlate relationships, 
hierarchies and multiple data linkages or your data can quickly spiral out of control.
Big Data
Why Big Data ?? 
The real issue is not that you are acquiring large amounts of data. It's what you do with the data that counts. The 
hopeful vision is that organizations will be able to take data from any source, harness relevant data and analyse 
it to find answers that enable 
1) cost reductions 
2) time reductions 
3) new product development and optimized offerings 
4) smarter business decision making.
Big Data Ecosystem
Big Data platform typically works by storing data first into clusters , then process the data through 
MapReduce workflows which executes by Mapping the input data through independent chunks processed 
by appropriate algorithms, the output from Map phase then moves to Shuffle/Sorting phase & finally the 
output from Shuffle phase comes to Reduce phase as input. 
Typical Big Data MapReduce workflow:
Big Data users in Next Five Years
Big Data users in Private Sector 
• Ebay.com uses two data warehouses at 7.5 petabytes and 40PB as well as a 40PB Hadoop cluster for search, 
consumer recommendations, and merchandising. 
• Amazon.com handles millions of back-end operations every day, as well as queries from more than half a 
million third-party sellers. The core technology that keeps Amazon running is Linux-based and as of 2005 
they had the world’s three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB. 
• Walmart handles more than 1 million customer transactions every hour, which are imported into databases 
estimated to contain more than 2.5 petabytes (2560 terabytes) of data – the equivalent of 167 times the 
information contained in all the books in the US Library of Congress. 
• Facebook handles 50 billion photos from its user base.
Big Data users in Private Sector 
• FICO Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts world-wide. The volume of 
business data worldwide, across all companies, doubles every 1.2 years, according to estimates. 
• Windermere Real Estate uses anonymous GPS signals from nearly 100 million drivers to help new home 
buyers determine their typical drive times to and from work throughout various times of the day.
Thank You.. 
Queries are welcome 
Praneet Samaiya

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Big data
Big dataBig data
Big data
 
Big data
Big dataBig data
Big data
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
Big data.
Big data.Big data.
Big data.
 
Introduction to Big Data & Big Data 1.0 System
Introduction to Big Data & Big Data 1.0 SystemIntroduction to Big Data & Big Data 1.0 System
Introduction to Big Data & Big Data 1.0 System
 
Big data
Big dataBig data
Big data
 
A Short History of Big Data
A Short History of Big DataA Short History of Big Data
A Short History of Big Data
 
Big Data
Big DataBig Data
Big Data
 
Big Data, Big Deal: For Future Big Data Scientists
Big Data, Big Deal: For Future Big Data ScientistsBig Data, Big Deal: For Future Big Data Scientists
Big Data, Big Deal: For Future Big Data Scientists
 
Big data
Big dataBig data
Big data
 
Big data
Big dataBig data
Big data
 
Big Data
Big DataBig Data
Big Data
 
Data Mining With Big Data
Data Mining With Big DataData Mining With Big Data
Data Mining With Big Data
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
 
big data Presentation
big data Presentationbig data Presentation
big data Presentation
 
Data mining on big data
Data mining on big dataData mining on big data
Data mining on big data
 
JPJ1417 Data Mining With Big Data
JPJ1417   Data Mining With Big DataJPJ1417   Data Mining With Big Data
JPJ1417 Data Mining With Big Data
 
BIG DATA & DATA ANALYTICS
BIG  DATA & DATA  ANALYTICSBIG  DATA & DATA  ANALYTICS
BIG DATA & DATA ANALYTICS
 

Destacado

Nuclear Winter
Nuclear WinterNuclear Winter
Nuclear Winterbrookeec
 
acceso a la biblioteca
acceso a la biblioteca acceso a la biblioteca
acceso a la biblioteca kthriin
 
FACTORS AFFECTING LLS
FACTORS AFFECTING LLSFACTORS AFFECTING LLS
FACTORS AFFECTING LLSMau5pls
 
Діагностика уваги
Діагностика увагиДіагностика уваги
Діагностика увагиyfnfkmz1990
 
Movie assignment MoeR
Movie assignment MoeRMovie assignment MoeR
Movie assignment MoeRXin Yi Zyx
 
Anuario estadístico América Latina 2013
Anuario estadístico América Latina 2013Anuario estadístico América Latina 2013
Anuario estadístico América Latina 2013Manager Asesores
 
Klaster pariwisata desa sembungan
Klaster pariwisata desa sembunganKlaster pariwisata desa sembungan
Klaster pariwisata desa sembunganfebrinasas
 
Horario cesanjose2014
Horario cesanjose2014Horario cesanjose2014
Horario cesanjose2014dallas60
 
#10dieci: Turismo
#10dieci: Turismo#10dieci: Turismo
#10dieci: Turismopaticchio
 
How to create a new Master Page in SharePoint 2013?
How to create a new Master Page in SharePoint 2013?How to create a new Master Page in SharePoint 2013?
How to create a new Master Page in SharePoint 2013?Velocity Software
 
Pardalis & Nohavicka llp final
Pardalis & Nohavicka llp finalPardalis & Nohavicka llp final
Pardalis & Nohavicka llp finalJoseph Nohavicka
 
Ieee 2014 2015 dotnet projects titles globalsoft technologies
Ieee 2014 2015 dotnet projects titles globalsoft technologiesIeee 2014 2015 dotnet projects titles globalsoft technologies
Ieee 2014 2015 dotnet projects titles globalsoft technologiesIEEEJAVAPROJECTS
 
Tests for intergranular corrosion and stress corrosion cracking
Tests for intergranular corrosion and stress corrosion crackingTests for intergranular corrosion and stress corrosion cracking
Tests for intergranular corrosion and stress corrosion crackingkoshykanjirapallikaran
 

Destacado (19)

Nuclear Winter
Nuclear WinterNuclear Winter
Nuclear Winter
 
Bab 6 kls xi
Bab 6 kls xiBab 6 kls xi
Bab 6 kls xi
 
acceso a la biblioteca
acceso a la biblioteca acceso a la biblioteca
acceso a la biblioteca
 
FACTORS AFFECTING LLS
FACTORS AFFECTING LLSFACTORS AFFECTING LLS
FACTORS AFFECTING LLS
 
Діагностика уваги
Діагностика увагиДіагностика уваги
Діагностика уваги
 
Movie assignment MoeR
Movie assignment MoeRMovie assignment MoeR
Movie assignment MoeR
 
Anuario estadístico América Latina 2013
Anuario estadístico América Latina 2013Anuario estadístico América Latina 2013
Anuario estadístico América Latina 2013
 
Klaster pariwisata desa sembungan
Klaster pariwisata desa sembunganKlaster pariwisata desa sembungan
Klaster pariwisata desa sembungan
 
Horario cesanjose2014
Horario cesanjose2014Horario cesanjose2014
Horario cesanjose2014
 
#10dieci: Turismo
#10dieci: Turismo#10dieci: Turismo
#10dieci: Turismo
 
How to create a new Master Page in SharePoint 2013?
How to create a new Master Page in SharePoint 2013?How to create a new Master Page in SharePoint 2013?
How to create a new Master Page in SharePoint 2013?
 
Pardalis & Nohavicka llp final
Pardalis & Nohavicka llp finalPardalis & Nohavicka llp final
Pardalis & Nohavicka llp final
 
Ieee 2014 2015 dotnet projects titles globalsoft technologies
Ieee 2014 2015 dotnet projects titles globalsoft technologiesIeee 2014 2015 dotnet projects titles globalsoft technologies
Ieee 2014 2015 dotnet projects titles globalsoft technologies
 
Encuestas a niños
Encuestas a niñosEncuestas a niños
Encuestas a niños
 
Sky fall production company
Sky fall production company Sky fall production company
Sky fall production company
 
Tests for intergranular corrosion and stress corrosion cracking
Tests for intergranular corrosion and stress corrosion crackingTests for intergranular corrosion and stress corrosion cracking
Tests for intergranular corrosion and stress corrosion cracking
 
Grand Cianjur
Grand CianjurGrand Cianjur
Grand Cianjur
 
VozDigital DevFest 31/10/14
VozDigital DevFest 31/10/14VozDigital DevFest 31/10/14
VozDigital DevFest 31/10/14
 
Presentation
PresentationPresentation
Presentation
 

Similar a Understanding big data

Similar a Understanding big data (20)

Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Bigdata " new level"
Bigdata " new level"Bigdata " new level"
Bigdata " new level"
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
Kartikey tripathi
Kartikey tripathiKartikey tripathi
Kartikey tripathi
 
Big data
Big dataBig data
Big data
 
Content1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docxContent1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docx
 
Special issues on big data
Special issues on big dataSpecial issues on big data
Special issues on big data
 
ppt final.pptx
ppt final.pptxppt final.pptx
ppt final.pptx
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Big Data By Vijay Bhaskar Semwal
Big Data By Vijay Bhaskar SemwalBig Data By Vijay Bhaskar Semwal
Big Data By Vijay Bhaskar Semwal
 
Bigdatappt 140225061440-phpapp01
Bigdatappt 140225061440-phpapp01Bigdatappt 140225061440-phpapp01
Bigdatappt 140225061440-phpapp01
 
Big_Data_ppt[1] (1).pptx
Big_Data_ppt[1] (1).pptxBig_Data_ppt[1] (1).pptx
Big_Data_ppt[1] (1).pptx
 
Big data
Big dataBig data
Big data
 
Big data Analytics
Big data Analytics Big data Analytics
Big data Analytics
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 
Big data
Big dataBig data
Big data
 
Big data and analytics
Big data and analyticsBig data and analytics
Big data and analytics
 
Big data
Big dataBig data
Big data
 
Big data
Big dataBig data
Big data
 

Último

Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 

Último (20)

Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 

Understanding big data

  • 2. Overview Why Big Data Big Data Users What is Big Data
  • 3. Big Data Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. And big data may be as important to business – and society – as the Internet has become. Why? More data may lead to more accurate analyses. More accurate analyses may lead to more confident decision making. And better decisions can mean greater operational efficiencies, cost reductions and reduced risk.
  • 4.
  • 5.
  • 6.
  • 7. Big Data Mainstream definition of big data as the three V’s of big data:
  • 9. Big Data • Volume: Many factors contribute to the increase in data volume. Transaction-based data stored through the years. Unstructured data streaming in from social media. Increasing amounts of sensor and machine-to- machine data being collected. • Velocity: Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations. • Variety: Data today comes in all types of formats. Structured, numeric data in traditional databases. Information created from line-of-business applications. Unstructured text documents, email, video, audio, stock ticker data and financial transactions. Managing, merging and governing different varieties of data is something many organizations still grapple with.
  • 10. Big Data Let us consider two additional dimensions when thinking about big data: • Variability: In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage. Even more so with unstructured data involved. • Complexity: Today's data comes from multiple sources. And it is still an undertaking to link, match, cleanse and transform data across systems. However, it is necessary to connect and correlate relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control.
  • 12. Why Big Data ?? The real issue is not that you are acquiring large amounts of data. It's what you do with the data that counts. The hopeful vision is that organizations will be able to take data from any source, harness relevant data and analyse it to find answers that enable 1) cost reductions 2) time reductions 3) new product development and optimized offerings 4) smarter business decision making.
  • 13.
  • 14.
  • 15.
  • 17.
  • 18. Big Data platform typically works by storing data first into clusters , then process the data through MapReduce workflows which executes by Mapping the input data through independent chunks processed by appropriate algorithms, the output from Map phase then moves to Shuffle/Sorting phase & finally the output from Shuffle phase comes to Reduce phase as input. Typical Big Data MapReduce workflow:
  • 19.
  • 20.
  • 21. Big Data users in Next Five Years
  • 22. Big Data users in Private Sector • Ebay.com uses two data warehouses at 7.5 petabytes and 40PB as well as a 40PB Hadoop cluster for search, consumer recommendations, and merchandising. • Amazon.com handles millions of back-end operations every day, as well as queries from more than half a million third-party sellers. The core technology that keeps Amazon running is Linux-based and as of 2005 they had the world’s three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB. • Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes (2560 terabytes) of data – the equivalent of 167 times the information contained in all the books in the US Library of Congress. • Facebook handles 50 billion photos from its user base.
  • 23. Big Data users in Private Sector • FICO Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts world-wide. The volume of business data worldwide, across all companies, doubles every 1.2 years, according to estimates. • Windermere Real Estate uses anonymous GPS signals from nearly 100 million drivers to help new home buyers determine their typical drive times to and from work throughout various times of the day.
  • 24. Thank You.. Queries are welcome Praneet Samaiya