Big Data
What’s the real BIG problem?
--Brian Pereira, Editor-in-chief, InformationWeek
20:20 MSL – 17-May-2013
Terminology
• Data – Unprocessed, captured in raw form
• Information – Processed data – meaningful, insightful
SELECT first_name, last_name FROM student_details;
• Structured Data – Databases, tabular data
(It’s searchable, you can filter it, and extract meaningful information)
• Unstructured Data – Tweets, video, social media
updates, blogs, images
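To make the structured-data point concrete, here is a minimal sketch that loads a few raw rows into an in-memory SQLite table and runs the query from the slide. The table name student_details and the columns first_name and last_name come from the slide; the grade column, the sample rows, and the added ORDER BY (used only to get a stable result) are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample rows (not from the slides): first name, last name, grade.
SAMPLE_ROWS = [("Asha", "Rao", 9), ("Vikram", "Mehta", 10)]


def query_students(rows):
    """Load raw rows into a table and return (first_name, last_name) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student_details "
        "(first_name TEXT, last_name TEXT, grade INTEGER)"
    )
    conn.executemany("INSERT INTO student_details VALUES (?, ?, ?)", rows)
    # The slide's query, plus ORDER BY so the result order is predictable.
    result = conn.execute(
        "SELECT first_name, last_name FROM student_details ORDER BY last_name"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(query_students(SAMPLE_ROWS))
```

Because the data sits in named columns, filtering and extraction are one-line queries; tweets, video, and images offer no such schema to query against.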
Unstructured Data = Big Data
• Users posting comments and reviews about a particular product on
Facebook
• Users creating their own videos using their smart phones and
uploading on YouTube
• Journalists posting tweets during the launch event of a particular
product
• CCTV cameras at traffic intersections or in stores -- streaming video feeds
back to the server for storage
• Sensors around an aircraft or spaceship or a piece of complex
machinery -- relaying data about temperature, wind speed, fuel
levels -- back to a server.
Big Data definition
Big Data is the trillions and quintillions of bytes
being generated, mostly in unstructured form,
by millions of users and devices
How BIG is Big Data?
• 90% of all digital data was generated in the last two years
• 23x – The expected increase in digital data in India
during 2012 – 2020 (from 127 Exabytes to 2.9 Zettabytes)
• 12 Terabytes is the size of the tweets posted in a day
• 5 Exabytes of data was created between the dawn of
civilization and 2003; today that much information is
created every two days!
• 72 hours of video are uploaded to YouTube every minute!
Bits and Bytes
1,000 bytes = 1 Kilobyte (KB)
1,000 KB = 1 Megabyte (MB)
1,000 MB = 1 Gigabyte (GB)
1,000 GB = 1 Terabyte (TB)
1,000 TB = 1 Petabyte (PB)
1,000 PB = 1 Exabyte (EB)
1,000 EB = 1 Zettabyte (ZB)
1,000 ZB = 1 Yottabyte (YB)
1,000 yottabytes = 1 brontobyte (informal name, not an SI unit)
1,000 brontobytes = 1 geopbyte (informal name, not an SI unit)
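The ladder above can be sketched as a small conversion helper, which also lets us check the "23x" figure quoted earlier (127 Exabytes growing to 2.9 Zettabytes). The function names to_bytes and growth_factor are made up for this illustration, and the sketch stops at the yottabyte because the two larger names are informal.

```python
# Decimal (base-1000) units in ascending order, as listed on the slide.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value, unit):
    """Convert a value expressed in a decimal unit to a number of bytes."""
    return value * 1000 ** UNITS.index(unit)


def growth_factor(start, start_unit, end, end_unit):
    """Return how many times larger the end volume is than the start volume."""
    return to_bytes(end, end_unit) / to_bytes(start, start_unit)


if __name__ == "__main__":
    # 127 EB (2012) to 2.9 ZB (2020), the India figures from the earlier slide.
    print(round(growth_factor(127, "EB", 2.9, "ZB"), 1))
```

The ratio comes out at a little under 23, which matches the rounded "23x" on the slide.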
What led to the Big Data explosion?
• Nexus of forces – cloud, social, mobile, information
(Intersection of cloud & mobile)
• The Internet of things (connected devices, sensors)
• Devices going digital (cameras, phones, power meters, traffic
signals, etc.)
• User-generated content (digital photos, videos, social media,
blogs, SMS, email, tweets)
• Earlier – transaction systems for structured data, under
the organization’s control
How much can be analysed?
0.5% or less of the digital information is
analysed in India today
36% is the share of data that technology can
analyse now
32% expected growth in the global Big Data
technology services market by 2016
Analyze this!
• We need to analyze all the data and look for insights that
can help us make decisions (in business)
• Can you analyze the video stream from a camera in real
time and predict a crime?
• Can a marketer analyze all the tweets to gauge how his
customers feel about his product?
• Can sensors in a car monitor the “health” of different
systems in real time, predict the failure of a part,
and send an SMS to the service center?
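As a rough illustration of the marketer's question above, the sketch below tallies tweets as positive, negative, or neutral using two tiny word lists. The word lists, function names, and sample tweets are all assumptions made for this example; a production system would rely on a trained model or a curated lexicon rather than keyword matching.

```python
import re

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"love", "great", "good", "fast", "awesome"}
NEGATIVE = {"hate", "bad", "slow", "broken", "terrible"}


def score_tweet(text):
    """Return positive-word count minus negative-word count for one tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(tweets):
    """Count how many tweets lean positive, negative, or neutral."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for tweet in tweets:
        score = score_tweet(tweet)
        if score > 0:
            counts["positive"] += 1
        elif score < 0:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Love the new phone, great camera",
        "Battery is terrible and slow",
        "Just unboxed it",
    ]
    print(summarize(sample))
```

Keyword matching misses sarcasm, negation, and slang, which is part of why unstructured data is hard to analyze at scale.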
So what’s the real problem?
• Not all data captured is useful to business
• You need to find the right data sets in a heap
of data (harnessing the data)
• And do this fast enough to:
– make timely decisions, prevent a disaster, prevent
outages, curb negative customer sentiment on
social media
Examples
• Walmart and Amazon are harnessing Big Data
to improve customer service, stock better
inventory, gauge sales trends and improve
operational efficiencies
• Big Data analytics is used in genetic research and to
improve traffic management, generate alerts
on freak weather (storms), prevent crime,
improve power grid efficiencies, etc.
Quotes
• “Big Data is contextual though in sheer numbers,
I would place the market beyond 100 TB when
‘normal’ systems start struggling a bit” – Arun
Gupta, CIO, Cipla
• “Top challenges in managing the massive amount
of data are backup, security and incorporation of
unstructured data into business processes” – N
Jayantha Prabhu, CTO, Essar Group
Big Data facts
• Big Data (term) did not exist five years ago
• It was less than a $100 million industry in 2009
(Deloitte)
• The Big Data market is worth $5 billion (IDC)
• By 2015, Big Data revenue will touch $30 billion (IDC)
• By 2017, it will cross $54 billion (IDC)
• 300 exabytes of data is stored today (IBM)
Players
• Leaders – IBM, Microsoft, SAS, SAP, QlikTech,
NetApp, Teradata, EMC
• Disrupters – Amazon, Google
• US Startups – Cloudera, GoodData, Parcel
• Indian Startups – Mu Sigma, Xurmo,
Metaome, Vizury, Meshlabs
• Indian companies – TCS, Infosys, Wipro, HCL
Tech
Big Data magic quadrant (Gartner)
Tools for analyzing data
• EMC Greenplum
• NetApp E-Series
• IBM InfoSphere solutions
• IBM Smarter Analytics (hardware, software, services)
• SAS Business Analytics
• SAP HANA
• Oracle Big Data Appliance
• Hadoop
• MapReduce (programming model)
• QlikView BI Dashboards
• JasperSoft BI Suite
• http://www.infoworld.com/d/business-intelligence/7-top-tools-taming-big-data-191131?page=0,2
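Since the list above names MapReduce as a programming model, a single-process sketch of the idea may help: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step combines each group. This is plain Python written for illustration and does not use Hadoop or any other framework's API; the function names are made up for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big problem"]))
```

In a real cluster the map and reduce calls run in parallel on many machines, and the shuffle moves data across the network; the structure of the computation is the same.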
The people factor
• Demand for Data Scientists, statisticians, data
architects will grow
• Skills: analytical skills, statistics
Related fields
• Predictive analytics
• In-memory databases
• Data modeling
• Data visualization
• Business Intelligence
Thank you!
• brian9p@gmail.com
• Brian.pereira@ubm.com
