Big Data
 “Big data” is data whose scale, diversity, and
complexity require new architectures, techniques,
algorithms, and analytics to manage it and to extract
value and hidden knowledge from it.
 Most analysts and practitioners currently refer to data
sets from 30-50 terabytes (1,000 gigabytes per terabyte)
to multiple petabytes (1,000 terabytes per petabyte) as
big data.
 Progress and innovation are no longer hindered by the ability to collect data,
 but by the ability to manage, analyze, summarize, visualize, and discover
knowledge from the collected data in a timely and scalable fashion.
Social media and networks
(all of us are generating data)
Scientific instruments
(collecting all sorts of data)
Mobile devices
(tracking all objects all the time)
Sensor technology and
networks
(measuring all kinds of data)
 Volume: The massive scale and growth of
unstructured data outstrip traditional storage and
analytical solutions.
 Velocity: Data is generated in real time, with
demands for usable information to be served up
immediately.
 Variety: Data is generated in many forms, including
relational data, text, semi-structured data, and graph
data.
 There were 5 billion mobile phones in use in 2010.
 There is a 40% projected growth in global data
generated per year vs. 5% growth in global IT spending.
 There were 235 terabytes of data collected by the US
Library of Congress in April 2011.
 15 out of 17 major business sectors in the United States
have more data stored per company than the US Library
of Congress.
The Problem
The complexity of big data is driven primarily by the
unstructured nature of much of the data generated by
modern technologies: web logs, radio-frequency
identification (RFID), sensors embedded in devices,
machinery, and vehicles, Internet searches, social networks
such as Facebook, portable computers, smartphones and other
cell phones, GPS devices, and call center records.
In most cases, big data can be used effectively only when it
is combined with structured data (typically from a
relational database) from a more conventional business
application, such as Enterprise Resource Planning (ERP) or
Customer Relationship Management (CRM).
Global market for big data
Industry size:
Today every organisation across the globe faces unprecedented growth in
data.
The digital universe of data was expected to expand to 2.7 zettabytes (ZB) by the end of
2012. It is then predicted to double every two years, reaching 8 ZB by 2015. It is
hard to conceptualize this quantity of information.
The US Library of Congress holds 462 terabytes (TB) of digital data. At that size, 8 ZB is
equivalent to almost 18 million Libraries of Congress.
That translates to a ten-fold increase over the last five years and an astounding 29-fold
increase over the next ten years.
This year, the world’s digital information is expected to grow by 57%. Within that,
internet traffic is growing by 35%, and mobile data traffic at 110%, according to Cisco.
The big data industry is worth somewhere between $30bn and $200bn.
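The Library of Congress comparison and the doubling projection above can be checked with a few lines of arithmetic. This is only a sketch: the function names are illustrative, and it assumes decimal units (1,000 per step) as in the definition earlier in the deck. With decimal units the ratio comes out near 17 million; binary units (1,024 per step) would push it above 18 million, so the slide's figure sits between the two conventions.

```python
# Assumption: decimal units, matching the 1,000-per-step convention used earlier.
TB = 10**12  # bytes in one terabyte
ZB = 10**21  # bytes in one zettabyte

library_bytes = 462 * TB   # Library of Congress holding quoted above
universe_bytes = 8 * ZB    # projected digital universe for 2015


def libraries_equivalent(total_bytes, one_library_bytes):
    """How many libraries of the given size fit into total_bytes."""
    return total_bytes / one_library_bytes


def project(start_zb, years, doubling_period_years=2.0):
    """Size after `years` if the volume doubles every doubling_period_years."""
    return start_zb * 2 ** (years / doubling_period_years)


millions_of_libraries = libraries_equivalent(universe_bytes, library_bytes) / 1e6
size_2015_zb = project(2.7, years=3)  # end of 2012 to end of 2015

print(f"{millions_of_libraries:.1f} million libraries")
print(f"{size_2015_zb:.1f} ZB projected for 2015")
```

Doubling every two years from 2.7 ZB gives a little under 8 ZB after three years, which is consistent with the rounded figure quoted above.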
Smartphones, tablets, sensors, social networks, online
games, video streams and mobile payments will all drive
big data for many years to come
Internet companies:
Amazon, Apple, Facebook, Google, Microsoft
The big Internet companies control where the data comes
from and where it goes to.
Amazon, Baidu, Facebook and Google may one day make a
lucrative side business from selling their proprietary
distributed database technologies, competing with IBM
and Oracle
Data storage, networking and hardware companies:
ARM, Brocade, Cisco, Dell, EMC, HP, Intel, Lenovo,
NetApp, Seagate
Many hardware makers like Cisco, Dell, Lenovo and HP are
investing heavily in big data appliances
Data storage companies are likely to continue to beat
earnings expectations as the data deluge goes into
overdrive
Enterprise software companies:
Adobe, Citrix Systems, IBM, Fujitsu, Informatica, Oracle,
Red Hat, SAP, Salesforce.com
Hadoop is fast becoming the industry-standard enterprise
data platform.
Cloud database services are likely to be the fastest-growing
sector this year within the enterprise software space.
A wide variety of techniques and technologies has been developed and adapted to
aggregate, manipulate, analyze, and visualize big data.
BIG DATA TECHNIQUES:
 A/B testing: A technique in which a control group is compared with a variety of test
groups in order to determine what treatments (i.e., changes) will improve a given
objective variable, e.g., marketing response rate.
This technique is also known as split testing or bucket testing. An example application is
determining what copy text, layouts, images, or colors will improve conversion rates on
an e-commerce Web site.
 Association rule learning: A set of techniques for discovering interesting
relationships, i.e., “association rules,” among variables in large databases.
These techniques consist of a variety of algorithms to generate and test possible rules.
One application is market basket analysis, in which a retailer can determine which
products are frequently bought together and use this information for marketing (a
commonly cited example is the discovery that many supermarket shoppers who buy
diapers also tend to buy beer). These techniques are widely used in data mining.
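The market basket idea above can be sketched in a few lines. The baskets, thresholds, and function name below are made up for illustration; the sketch only considers rules between pairs of items and scores them with two standard measures, support (share of baskets containing both items) and confidence (share of baskets with the left-hand item that also contain the right-hand item). Real rule-mining algorithms such as Apriori handle larger item sets and far larger data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for this example.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
    {"diapers", "beer", "bread"},
]


def pair_rules(baskets, min_support=0.5, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs that pass both thresholds."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        frozenset(pair)
        for basket in baskets
        for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for pair, count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be interesting
        a, b = sorted(pair)
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data only the diapers and beer pair is frequent enough to produce rules, and the rule is stronger in the beer-to-diapers direction than the reverse.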
 Cluster analysis: A statistical method for classifying objects
that splits a diverse group into smaller groups of similar
objects, whose characteristics of similarity are not known in
advance.
 Crowdsourcing: A technique for collecting data submitted by
a large group of people or community through an open call,
usually through networked media such as the Web.
 Statistics: The science of the collection, organization, and
interpretation of data, including the design of surveys and
experiments
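As one concrete instance of cluster analysis, the sketch below runs k-means on a handful of two-dimensional points. K-means is chosen here only for illustration (the slide does not name an algorithm), and the points and starting centroids are invented. The groups are not labeled in advance; they emerge from repeatedly assigning each point to its nearest centroid and moving each centroid to the mean of its points.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data: two loose groups of points.
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The starting centroids are fixed so the result is repeatable; practical implementations usually pick them at random and run several restarts.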
BIG DATA TECHNOLOGIES
 There is a growing number of technologies used to
aggregate, manipulate, manage, and analyze big data.
 Bigtable: A proprietary distributed database system
built on the Google File System. Tables are split
into multiple tablets, and tablets are split further as
the data grows beyond a size limit. Stored data can be
compressed with algorithms such as Snappy.
 Business intelligence (BI): A type of application
software designed to report, analyze, and present
data. BI tools are often used to read data that have
been previously stored in a data warehouse or data
mart. BI tools can also be used to create standard
reports that are generated on a periodic basis, or
to display information on real-time management
dashboards, i.e., integrated displays of metrics
that measure the performance of a system.
 Data warehouse: Specialized database optimized for reporting,
often used for storing large amounts of structured data. Data is
uploaded using ETL (extract, transform, and load) tools from
operational data stores, and reports are often generated using
business intelligence tools.
 Extract, transform, and load (ETL): Software tools used to extract
data from outside sources, transform them to fit operational
needs, and load them into a database or data warehouse.
 Hadoop: An open source (free) software framework for
processing huge datasets on certain kinds of problems on a
distributed system. Its development was inspired by Google’s
MapReduce and Google File System.
 HBase: An open source (free), distributed, non-relational
database modeled on Google’s Bigtable. It provides a fault-tolerant
way of storing large quantities of data.
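The MapReduce model that inspired Hadoop can be illustrated with the classic word count. The sketch below runs in a single Python process and does not use any Hadoop API; the function names are illustrative. It only shows the three conceptual steps: map each input to key/value pairs, group the pairs by key, and reduce each group to a result. In a real cluster the map and reduce steps run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    mapped = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data", "Big table", "data warehouse"])
print(counts)
```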
 Opportunities:
Data intent and capacity
• The data revolution
• Intent in an age of growing volatility
Social Science and Policy Applications
 Challenges:
Data
• Privacy
• Access and Sharing
Analysis
• Defining and Detecting Anomalies in Human Ecosystems
• HP’s Big Data strategy and Vertica
• CSC Buys Infochimps for Big Data, Analytics Expertise
• Market Intelligence Provider FirstRain Unveils New Big Data Tool,
Market Insights
Investment risks:
Whilst big data industry revenues are certain to grow, investors face
significant risks.
Bandwidth risk
Today, internet bandwidth prices are capped, effectively making internet
bandwidth a free resource for big data companies. But, without
substantial investment by the world’s mobile operators, big data is likely
to grow far faster than the ability of the network to carry it.
As networks get overloaded, network latency rises, reducing the speed
and efficiency of analytical engines, especially those powered through
the cloud. The coming mobile bandwidth shortage will shift competitive
advantage from technology companies to telecom operators.
Open source risk
With the source code free, barriers to entry remain low. In the longer term, this may depress
the database industry’s margins.
Patent risk
Ever since Apple took on the mobile phone industry – and won – with barely a handful of
mobile patents to its name, a patent war has erupted across the technology sector. Were a
patent war to break out in the big data space, technological progress could be slowed down.
Whilst regulators are unlikely to allow any hoarding of patents on anti-competitive grounds,
the risk remains. Oracle, a leader in big data, is well known for filing multi-billion dollar
patent infringement lawsuits against its competitors.
Cyber risk
Last month Global Payments, a credit card transaction processor, admitted that hackers had
stolen the details of 1.5m North American card holders. This is the latest in a string of
security breaches that have hit companies dealing in big data. Apple, EMC, Google, Oracle and
Sony are all recent hacking victims. As the level of cyber-crime rises, so does the risk of
dealing with big data. Just as the Fukushima incident dampened prospects for the nuclear
sector, so a large cyber-attack could adversely impact big data industry profits.
 Often misunderstood and ill-applied
 The question is not “how big is your data?”, it is “what are
you doing with your data?”
 It fails to supply its customers with products that solve
business problems.
 Companies searching for data solutions are often confused
by all the big data marketing hype and sometimes end up
wasting resources.
Big data

Más contenido relacionado

La actualidad más candente

JIMS Rohini IT Flash Monthly Newsletter - October Issue
JIMS Rohini IT Flash Monthly Newsletter  - October IssueJIMS Rohini IT Flash Monthly Newsletter  - October Issue
JIMS Rohini IT Flash Monthly Newsletter - October IssueJIMS Rohini Sector 5
 
06. 9534 14985-1-ed b edit dhyan
06. 9534 14985-1-ed b edit dhyan06. 9534 14985-1-ed b edit dhyan
06. 9534 14985-1-ed b edit dhyanIAESIJEECS
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaSkillspeed
 
Implementation of application for huge data file transfer
Implementation of application for huge data file transferImplementation of application for huge data file transfer
Implementation of application for huge data file transferijwmn
 
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Onyebuchi nosiri
 
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Onyebuchi nosiri
 
Big data - a review (2013 4)
Big data - a review (2013 4)Big data - a review (2013 4)
Big data - a review (2013 4)Sonu Gupta
 
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...AnthonyOtuonye
 
Data Mining and Big Data Challenges and Research Opportunities
Data Mining and Big Data Challenges and Research OpportunitiesData Mining and Big Data Challenges and Research Opportunities
Data Mining and Big Data Challenges and Research OpportunitiesKathirvel Ayyaswamy
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAkshata Humbe
 
A Survey on Data Mining
A Survey on Data MiningA Survey on Data Mining
A Survey on Data MiningIOSR Journals
 
wireless sensor network
wireless sensor networkwireless sensor network
wireless sensor networkparry prabhu
 
An introduction to Data Mining
An introduction to Data MiningAn introduction to Data Mining
An introduction to Data MiningShobhita Dayal
 

La actualidad más candente (20)

1
11
1
 
JIMS Rohini IT Flash Monthly Newsletter - October Issue
JIMS Rohini IT Flash Monthly Newsletter  - October IssueJIMS Rohini IT Flash Monthly Newsletter  - October Issue
JIMS Rohini IT Flash Monthly Newsletter - October Issue
 
06. 9534 14985-1-ed b edit dhyan
06. 9534 14985-1-ed b edit dhyan06. 9534 14985-1-ed b edit dhyan
06. 9534 14985-1-ed b edit dhyan
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social Media
 
Implementation of application for huge data file transfer
Implementation of application for huge data file transferImplementation of application for huge data file transfer
Implementation of application for huge data file transfer
 
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
 
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
 
Big data - a review (2013 4)
Big data - a review (2013 4)Big data - a review (2013 4)
Big data - a review (2013 4)
 
Big Data for Ag (2019)
Big Data for Ag (2019)Big Data for Ag (2019)
Big Data for Ag (2019)
 
Big data
Big dataBig data
Big data
 
Understanding big data
Understanding big dataUnderstanding big data
Understanding big data
 
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
 
Data Mining and Big Data Challenges and Research Opportunities
Data Mining and Big Data Challenges and Research OpportunitiesData Mining and Big Data Challenges and Research Opportunities
Data Mining and Big Data Challenges and Research Opportunities
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Big data survey
Big data surveyBig data survey
Big data survey
 
A Survey on Data Mining
A Survey on Data MiningA Survey on Data Mining
A Survey on Data Mining
 
Big data
Big dataBig data
Big data
 
wireless sensor network
wireless sensor networkwireless sensor network
wireless sensor network
 
National Conference - Big Data - 31 Jan 2015
National Conference - Big Data - 31 Jan 2015National Conference - Big Data - 31 Jan 2015
National Conference - Big Data - 31 Jan 2015
 
An introduction to Data Mining
An introduction to Data MiningAn introduction to Data Mining
An introduction to Data Mining
 

Destacado

"Опыт создания системы управления сборкой и тестированием" (слайдкаст)
"Опыт создания системы управления сборкой и тестированием" (слайдкаст)"Опыт создания системы управления сборкой и тестированием" (слайдкаст)
"Опыт создания системы управления сборкой и тестированием" (слайдкаст)SPB SQA Group
 
Pictures of students in sw 475
Pictures of students in sw 475Pictures of students in sw 475
Pictures of students in sw 475pegart
 
消費者行動論(小松崎班)A
消費者行動論(小松崎班)A消費者行動論(小松崎班)A
消費者行動論(小松崎班)Ayahohsoaho
 
Segway pt se and seg solutions launch webinar -- united states (3-24-14)
Segway pt se and seg solutions launch webinar  -- united states (3-24-14)Segway pt se and seg solutions launch webinar  -- united states (3-24-14)
Segway pt se and seg solutions launch webinar -- united states (3-24-14)Mark Vena
 
Het zorgportaal documentatie (juni2012)
Het zorgportaal   documentatie  (juni2012)Het zorgportaal   documentatie  (juni2012)
Het zorgportaal documentatie (juni2012)Raymond
 
ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!
ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!
ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!SPB SQA Group
 
How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...
How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...
How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...cloudcontroller
 
Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Dimitri Corpakis
 
Knowing your purpose in life lesson #4
Knowing your purpose in life lesson #4Knowing your purpose in life lesson #4
Knowing your purpose in life lesson #4Vision of Hope
 
Are people the problem or solution: reflections from the wild
Are people the problem or solution: reflections from the wildAre people the problem or solution: reflections from the wild
Are people the problem or solution: reflections from the wildNottingham Trent University
 
mHealth Insights for Wireless Carrier
mHealth Insights for Wireless CarriermHealth Insights for Wireless Carrier
mHealth Insights for Wireless CarrierKarthik Ethirajan
 
да славит
да славитда славит
да славитko63ar
 
Thesis_AnoukKon_421037_1662016
Thesis_AnoukKon_421037_1662016Thesis_AnoukKon_421037_1662016
Thesis_AnoukKon_421037_1662016anoukkonQompas
 

Destacado (20)

"Опыт создания системы управления сборкой и тестированием" (слайдкаст)
"Опыт создания системы управления сборкой и тестированием" (слайдкаст)"Опыт создания системы управления сборкой и тестированием" (слайдкаст)
"Опыт создания системы управления сборкой и тестированием" (слайдкаст)
 
Pictures of students in sw 475
Pictures of students in sw 475Pictures of students in sw 475
Pictures of students in sw 475
 
消費者行動論(小松崎班)A
消費者行動論(小松崎班)A消費者行動論(小松崎班)A
消費者行動論(小松崎班)A
 
Cau kien 36 70
Cau kien 36 70Cau kien 36 70
Cau kien 36 70
 
Evaluacion de biologia 10°
Evaluacion de biologia 10°Evaluacion de biologia 10°
Evaluacion de biologia 10°
 
Segway pt se and seg solutions launch webinar -- united states (3-24-14)
Segway pt se and seg solutions launch webinar  -- united states (3-24-14)Segway pt se and seg solutions launch webinar  -- united states (3-24-14)
Segway pt se and seg solutions launch webinar -- united states (3-24-14)
 
Het zorgportaal documentatie (juni2012)
Het zorgportaal   documentatie  (juni2012)Het zorgportaal   documentatie  (juni2012)
Het zorgportaal documentatie (juni2012)
 
ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!
ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!
ITGM8. Илья Коробицын (Grid Dinamics) Автоматизатор, копай глубже, копай шире!
 
Inspirational Instruments 1 LAMPSHAPES
Inspirational Instruments 1 LAMPSHAPESInspirational Instruments 1 LAMPSHAPES
Inspirational Instruments 1 LAMPSHAPES
 
Horrible_jobs
Horrible_jobsHorrible_jobs
Horrible_jobs
 
How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...
How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...
How Edmodo Uses Splunk For Real-Time Tag-Based Reporting of AWS Billing and U...
 
Sfe time robbers
Sfe time robbersSfe time robbers
Sfe time robbers
 
Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...
 
Knowing your purpose in life lesson #4
Knowing your purpose in life lesson #4Knowing your purpose in life lesson #4
Knowing your purpose in life lesson #4
 
Single Sign On Social Login
Single Sign On Social LoginSingle Sign On Social Login
Single Sign On Social Login
 
Are people the problem or solution: reflections from the wild
Are people the problem or solution: reflections from the wildAre people the problem or solution: reflections from the wild
Are people the problem or solution: reflections from the wild
 
The Science Behind Climate Change
The Science Behind Climate ChangeThe Science Behind Climate Change
The Science Behind Climate Change
 
mHealth Insights for Wireless Carrier
mHealth Insights for Wireless CarriermHealth Insights for Wireless Carrier
mHealth Insights for Wireless Carrier
 
да славит
да славитда славит
да славит
 
Thesis_AnoukKon_421037_1662016
Thesis_AnoukKon_421037_1662016Thesis_AnoukKon_421037_1662016
Thesis_AnoukKon_421037_1662016
 

Similar a Big data

Similar a Big data (20)

Big data
Big dataBig data
Big data
 
Introduction to big data – convergences.
Introduction to big data – convergences.Introduction to big data – convergences.
Introduction to big data – convergences.
 
Bigdata " new level"
Bigdata " new level"Bigdata " new level"
Bigdata " new level"
 
Big data
Big dataBig data
Big data
 
A Big Data Concept
A Big Data ConceptA Big Data Concept
A Big Data Concept
 
Big data-ppt-
Big data-ppt-Big data-ppt-
Big data-ppt-
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit Dubey
 
Big data
Big dataBig data
Big data
 
Unit 1
Unit 1Unit 1
Unit 1
 
Whitepaper: Know Your Big Data – in 10 Minutes! - Happiest Minds
Whitepaper: Know Your Big Data – in 10 Minutes! - Happiest MindsWhitepaper: Know Your Big Data – in 10 Minutes! - Happiest Minds
Whitepaper: Know Your Big Data – in 10 Minutes! - Happiest Minds
 
QuickView #3 - Big Data
QuickView #3 - Big DataQuickView #3 - Big Data
QuickView #3 - Big Data
 
BIG DATA & DATA ANALYTICS
BIG  DATA & DATA  ANALYTICSBIG  DATA & DATA  ANALYTICS
BIG DATA & DATA ANALYTICS
 
Big data Analytics
Big data Analytics Big data Analytics
Big data Analytics
 
Data Mining With Big Data
Data Mining With Big DataData Mining With Big Data
Data Mining With Big Data
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notes
 
Big data-ppt
Big data-pptBig data-ppt
Big data-ppt
 
Big data analytics, survey r.nabati
Big data analytics, survey r.nabatiBig data analytics, survey r.nabati
Big data analytics, survey r.nabati
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
Big data seminor
Big data seminorBig data seminor
Big data seminor
 

Último

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 

Último (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Big data

  • 2. “Big Data” is data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and analytics to manage it and to extract value and hidden knowledge from it. Most analysts and practitioners currently refer to data sets from 30-50 terabytes (1,000 gigabytes per terabyte) to multiple petabytes (1,000 terabytes per petabyte) as big data.
  • 3. Progress and innovation are no longer hindered by the ability to collect data, but by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion. Sources of this data: social media and networks (all of us are generating data); scientific instruments (collecting all sorts of data); mobile devices (tracking all objects all the time); sensor technology and networks (measuring all kinds of data).
  • 4. Volume: The massive scale and growth of unstructured data outstrips traditional storage and analytical solutions. Velocity: Data is generated in real time, with demands for usable information to be served up immediately. Variety: Data is generated in many forms, such as relational data, text data, semi-structured data, and graph data.
  • 5. There were 5 billion mobile phones in use in 2010. There is a 40% projected growth in global data generated per year vs. 5% growth in global IT spending. There were 235 terabytes of data collected by the US Library of Congress in April 2011. 15 out of 17 major business sectors in the United States have more data stored per company than the US Library of Congress.
  • 6. The Problem: The complex nature of big data is primarily driven by the unstructured nature of much of the data generated by modern technologies, such as web logs, radio-frequency identification (RFID), sensors embedded in devices, machinery, vehicles, Internet searches, social networks such as Facebook, portable computers, smartphones and other cell phones, GPS devices, and call center records. In most cases, to use big data effectively, it must be combined with structured data (typically from a relational database) from a more conventional business application, such as Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM).
  • 7. Global market for big data. Industry size: Today every organisation across the globe is faced with unprecedented growth in data. The digital universe of data was expected to expand to 2.7 zettabytes (ZB) by the end of 2012. It is then predicted to double every two years, reaching 8 ZB by 2015. It is hard to conceptualize this quantity of information. The US Library of Congress holds 462 terabytes (TB) of digital data. At that size, 8 ZB is equivalent to roughly 17 million Libraries of Congress (in decimal units). That translates to a ten-fold increase over the last five years and an astounding 29-fold increase over the next ten years. This year, the world’s digital information is expected to grow by 57%. Within that, internet traffic is growing by 35% and mobile data traffic by 110%, according to Cisco. The big data industry is worth somewhere between $30bn and $200bn.
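
The size comparison on this slide can be checked with a little arithmetic. The sketch below assumes decimal (SI) units, 10^12 bytes per terabyte and 10^21 bytes per zettabyte, and the function names are illustrative only. Under those assumptions 8 ZB comes to about 17.3 million collections of 462 TB, the same order as the figure quoted on the slide, and 2.7 ZB doubling every two years gives about 7.6 ZB after three years.

```python
# Back-of-the-envelope check; decimal (SI) units are an assumption.
TB = 10**12  # bytes per terabyte
ZB = 10**21  # bytes per zettabyte


def libraries_of_congress(total_bytes, library_bytes=462 * TB):
    """Return how many collections of size library_bytes fit in total_bytes."""
    return total_bytes / library_bytes


def size_after(start_zb, years, doubling_period_years=2.0):
    """Project a size that doubles every doubling_period_years."""
    return start_zb * 2 ** (years / doubling_period_years)


ratio = libraries_of_congress(8 * ZB)
print(f"{ratio / 1e6:.1f} million Libraries of Congress")
print(f"{size_after(2.7, 3):.1f} ZB three years after 2.7 ZB")
```

With binary units (2^40 and 2^70 bytes) the ratio would instead come out near 18.6 million, so the exact figure depends on which convention is used.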
  • 8. Smartphones, tablets, sensors, social networks, online games, video streams, and mobile payments will all drive big data for many years to come.
  • 9. Internet companies: Amazon, Apple, Facebook, Google, Microsoft. The big Internet companies control where the data comes from and where it goes. Amazon, Baidu, Facebook, and Google may one day make a lucrative side business from selling their proprietary distributed database technologies, competing with IBM and Oracle.
  • 10. Data storage, networking, and hardware companies: ARM, Brocade, Cisco, Dell, EMC, HP, Intel, Lenovo, NetApp, Seagate. Many hardware makers, such as Cisco, Dell, Lenovo, and HP, are investing heavily in big data appliances. Data storage companies are likely to continue to beat earnings expectations as the data deluge goes into overdrive.
  • 11. Enterprise software companies: Adobe, Citrix Systems, IBM, Fujitsu, Informatica, Oracle, Red Hat, SAP, Salesforce.com. Hadoop is fast becoming the industry-standard enterprise database platform. Cloud database services are likely to be the fastest-growing sector this year within the enterprise software space.
  • 12. A wide variety of techniques and technologies has been developed and adapted to aggregate, manipulate, analyze, and visualize big data. BIG DATA TECHNIQUES: A/B testing: A technique in which a control group is compared with a variety of test groups in order to determine what treatments (i.e., changes) will improve a given objective variable, e.g., marketing response rate. This technique is also known as split testing or bucket testing. An example application is determining what copy text, layouts, images, or colors will improve conversion rates on an e-commerce Web site. Association rule learning: A set of techniques for discovering interesting relationships, i.e., “association rules,” among variables in large databases. These techniques consist of a variety of algorithms to generate and test possible rules. One application is market basket analysis, in which a retailer can determine which products are frequently bought together and use this information for marketing (a commonly cited example is the discovery that many supermarket shoppers who buy diapers also tend to buy beer). Association rule learning is used for data mining.
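
Both techniques on this slide can be illustrated in a few lines. The sketch below is not taken from the slides: it assumes a two-proportion z-test as one common way to compare a control group with a test group, and it scores association rules between item pairs by support and confidence. The function names, the conversion counts, and the sample baskets are all made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def ab_z_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # For a standard normal, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return z, math.erfc(abs(z) / math.sqrt(2))


def pair_rules(baskets, min_support=0.3):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    rules = []
    for (x, y), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # Confidence of "x -> y" is the share of x-baskets that also hold y.
        rules.append((x, y, support, count / item_counts[x]))
        rules.append((y, x, support, count / item_counts[y]))
    return rules


# Hypothetical data: control converts 200 of 5000, variant 260 of 5000.
z, p_value = ab_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")

baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
    {"diapers", "beer", "bread"},
]
for rule in pair_rules(baskets):
    print(rule)
```

Real market basket analysis usually relies on algorithms such as Apriori or FP-growth to avoid counting every item combination; the exhaustive pair count here only works for small inputs.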
  • 13. Cluster analysis: A statistical method for classifying objects that splits a diverse group into smaller groups of similar objects, whose characteristics of similarity are not known in advance. Crowdsourcing: A technique for collecting data submitted by a large group of people or community through an open call, usually through networked media such as the Web. Statistics: The science of the collection, organization, and interpretation of data, including the design of surveys and experiments. BIG DATA TECHNOLOGIES: There is a growing number of technologies used to aggregate, manipulate, manage, and analyze big data.
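
As an illustration of cluster analysis, the sketch below uses k-means (Lloyd's algorithm), which is only one of many clustering methods and is not named on the slide. The sample points, the choice of two clusters, and the function name are assumptions made for the example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


# Hypothetical data: two visibly separated groups of points.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

The groups are not given to the algorithm in advance; it only receives the points and the number of clusters, which matches the slide's description of similarity that is not known beforehand.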
  • 14. Bigtable: A proprietary distributed database system built on the Google File System. Tables are split into multiple tablets. When the size of the data grows beyond limits, tablets are compressed using algorithms such as Snappy. Business intelligence (BI): A type of application software designed to report, analyze, and present data. BI tools are often used to read data that have been previously stored in a data warehouse or data mart. BI tools can also be used to create standard reports that are generated on a periodic basis, or to display information on real-time management dashboards, i.e., integrated displays of metrics that measure the performance of a system.
  • 15. Data warehouse: A specialized database optimized for reporting, often used for storing large amounts of structured data. Data is uploaded using ETL (extract, transform, and load) tools from operational data stores, and reports are often generated using business intelligence tools. Extract, transform, and load (ETL): Software tools used to extract data from outside sources, transform them to fit operational needs, and load them into a database or data warehouse. Hadoop: An open source (free) software framework for processing huge datasets for certain kinds of problems on a distributed system. Its development was inspired by Google’s MapReduce and Google File System. HBase: An open source (free), distributed, non-relational database modeled on Google’s Bigtable. It provides a fault-tolerant way of storing large quantities of data.
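
The three ETL stages described above can be sketched with Python's standard library. This is a toy illustration rather than a real warehouse pipeline: the CSV content, the `orders` table, and the cleaning rules are all invented for the example, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an operational system.
RAW_CSV = """order_id,customer,amount_usd
1, alice ,19.99
2,BOB,5.00
3,alice,not_a_number
"""


def extract(text):
    """Extract: read rows from a CSV source (here an in-memory string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, convert amounts to cents, drop bad rows."""
    clean = []
    for row in rows:
        try:
            cents = round(float(row["amount_usd"]) * 100)
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        name = row["customer"].strip().lower()
        clean.append((int(row["order_id"]), name, cents))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)

# A simple report of the kind a BI tool might run against the warehouse.
report = conn.execute(
    "SELECT customer, SUM(amount_cents) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(report)
```

Keeping the three stages as separate functions mirrors the definition on the slide and lets each stage be swapped out independently, for example by replacing the in-memory CSV with a file or the SQLite connection with another database.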
  • 16. Opportunities: Data intent and capacity (the data revolution; intent in an age of growing volatility); social science and policy applications. Challenges: Data (privacy; access and sharing); analysis (defining and detecting anomalies in human ecosystems).
  • 17. HP’s Big Data strategy and Vertica; CSC Buys Infochimps for Big Data, Analytics Expertise; Market Intelligence Provider FirstRain Unveils New Big Data Tool, Market Insights.
  • 18. Investment risks: Whilst big data industry revenues are certain to grow, investors face significant risks. Bandwidth risk: Today, internet bandwidth prices are capped, effectively making internet bandwidth a free resource for big data companies. But without substantial investment by the world’s mobile operators, big data is likely to grow far faster than the ability of the network to carry it. As networks get overloaded, network latency rises, reducing the speed and efficiency of analytical engines, especially those powered through the cloud. The coming mobile bandwidth shortage will shift competitive advantage from technology companies to telecom operators.
  • 19. Open source risk: With the source code free, barriers to entry remain low. In the longer term, this may depress the database industry’s margins. Patent risk: Ever since Apple took on the mobile phone industry, and won, with barely a handful of mobile patents to its name, a patent war has erupted across the technology sector. Were a patent war to break out in the big data space, technological progress could be slowed down. Whilst regulators are unlikely to allow any hoarding of patents on anti-competitive grounds, the risk remains. Oracle, a leader in big data, is well known for filing multi-billion dollar patent infringement lawsuits against its competitors. Cyber risk: Last month Global Payments, a credit card transaction processor, admitted that hackers had stolen the details of 1.5m North American card holders. This is the latest in a string of security breaches that have hit companies dealing in big data. Apple, EMC, Google, Oracle, and Sony are all recent hacking victims. As the level of cyber-crime rises, so does the risk of dealing with big data. Just as the Fukushima incident dampened prospects for the nuclear sector, so a large cyber-attack could adversely impact big data industry profits.
  • 20. Big data is often misunderstood and ill-applied. The question is not “how big is your data?” but “what are you doing with your data?” It fails to supply its customers with products that solve business problems. Companies searching for data solutions are often confused by all the big data marketing hype and sometimes end up wasting resources.