analyze(NoSQL,BigData);
/* history, hype, opportunities */




              // By: Vishy Poosala
          // Head of Bell Labs, India
       // poosala@alcatel-lucent.com
                   // @vishyp
                                        1
The dark ages of COBOL




                         2
..then Codd said
let there be tables

• Rows & Columns
• SQL
• Normal Forms
• ACID
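The "A" in ACID (atomicity) can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the names in it, and the `transfer` helper are hypothetical and are not taken from the talk; the point is only that a failed statement undoes the whole transaction.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failing debit has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


def balances(conn):
    """Return {name: balance} for every row in the table."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Hypothetical schema: rows and columns, with a constraint on one column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer fails on the debit, and the credit that had already been applied to the other row is rolled back with it.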


                                 3
www.data-for-humans.com


• WHAT COLUMNS?
• SET-VALUED ATTRIBUTES
• XML
• Schema Evolution




                                  4
Billions of Keys & Values

• GFS
• Google Big Table
• Hadoop
• Cassandra
• Dynamo
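One way Dynamo-style stores spread billions of keys over many machines is consistent hashing. The sketch below is a simplified illustration under stated assumptions, not the actual Dynamo or Cassandra implementation: the `HashRing` class, the node names, and the choice of 64 virtual nodes per machine are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping keys to node names."""

    def __init__(self, nodes, vnodes=64):
        # Each node owns `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256 as an integer position on the ring.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # A key belongs to the first ring point after its hash, wrapping around.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))
```

The design choice that matters is that removing a node only reassigns the keys that node owned; keys on the other nodes stay where they were.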


                                  5
How would you build a super-fast,
 FB-scale chat service, in 2012?

          (for example)



                                    6
I want my own DB!

Main Memory     • Memcached
                • redis
Distr. K-V      • MongoDB
Versions        • CouchDB
Social Graphs   • Neo4j


                                    7
Era         60’s     80-96       96-’07             ’07-
BIG         KB       GB          TB                 PB
Data        FILES    TABLES      Semi-Structured    Variety, Dynamic
Analytics   STATS    OLAP Cube   Apps               Mahout
Language    COBOL    SQL         XML                NoSQL

                                                 8
Following *AMAZING* Slides Courtesy: Gregory Piatetsky-Shapiro, kdnuggets.com

You can find all the slides from his talk at:

http://www.slideshare.net/gpiatetskyshapiro/analytics-and-data-mining-industry-overview

                                                                                          9
Data Tsunami
• In 2010 enterprises stored 7 exabytes (= 7,000,000,000 GB) of new data (McKinsey)
• 90 percent of the world's data has been generated in the past two years (IBM)
Image with apologies to KDD-2011
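The exabyte-to-gigabyte figure above can be checked with a few lines of arithmetic. This sketch assumes decimal (SI) units, where 1 GB is 10^9 bytes and 1 EB is 10^18 bytes; the function name is only illustrative.

```python
# Decimal (SI) storage units, in bytes.
GB = 10**9
EB = 10**18


def exabytes_to_gigabytes(exabytes):
    """Convert a whole number of exabytes to gigabytes."""
    return exabytes * EB // GB


print(f"{exabytes_to_gigabytes(7):,} GB")  # prints "7,000,000,000 GB"
```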
                                                             10
Pre-history




Statistics is the biggest term in the 20th century, but data mining and analytics appear only in the late 1990s.
From Google Ngram Viewer – English-language books
Note: Our analysis uses only English-language data.
Other languages, especially Chinese, need to be considered for the full picture.
                                                                               11
Recent History:
Analytics, Data Mining, Knowledge Discovery




Analytics has been used since 1800, but started to rise in 2005
Data Mining jumps around 1996 (soon after the first KDD conference) but declines after 2003 (TIA controversy, associated with government invasion of privacy).
Knowledge Discovery appears in 1989, jumps in 1996, and plateaus after 2000
                                                                           12
Google Trends:
After 2006, Data Mining < Analytics




                                  13
Google Insights: searches for
data mining, analytics -google
are most popular in India, US




                                 14
Analytics > Data Mining > Data Science




                                 15
Data Science, Big Data




                         16
Data Types Analyzed/Mined




www.KDnuggets.com/polls/2011/data-types-analyzed-mined.html   17
Largest Dataset Analyzed?
• 2011 median dataset size ~10-20 GB, vs 8-10 GB in 2010.
• Increase in 10 GB to 1 PB range




www.KDnuggets.com/polls/2011/largest-dataset-analyzed-data-mined.html
                                                                 18
Which methods/algorithms did you use for data analysis in 2011?
(Bar chart: % of analysts who used each method, on an axis from 0% to 70%; methods as listed on the chart)
• Decision Trees
• Regression
• Clustering
• Statistics
• Visualization
• Time series/Sequence analysis
• Support Vector (SVM)
• Association rules
• Ensemble methods
• Text Mining
• Neural Nets
• Boosting
• Bayesian
• Bagging
• Factor Analysis
• Anomaly/Deviation detection
• Social Network Analysis
• Survival Analysis
• Genetic algorithms
• Uplift modeling



 www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
                                                                  19
Cloud Analytics is not common (yet)




www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
                                                                 20
Shortage of Skills
• McKinsey: shortage by 2018 in the US of
  – 140,000-190,000 people with deep analytical skills
  – 1.5 M managers/analysts with the know-how to use the analysis of big data to make effective decisions.

  Source: www.mckinsey.com/mgi/publications/big_data/
                                            21
Job data: Data Scientist




                           22
Jobs: Data Mining >> Data Scientist




                            23
“Ground” Analytics (LinkedIn Skills)
• ~ 75,000 with Data Mining skill
• ~ 7,000 with Predictive Modeling
• Also ~ 20,000 with Predictive Analytics (not related to Predictive Modeling??)




                                             24
Analytics LinkedIn Skills




(Figure labels: Predictive Analytics, Machine Learning, Text Mining, MapReduce)



                                                      25
Big Data Bubble?

(Figure: Big Data on the Gartner Hype Cycle)

                                 26
27

NoSQL & Big Data Analytics: History, Hype, Opportunities

  • 1. analyze(NoSQL,BigData); /* history, hype, opportunities */ // By: Vishy Poosala // Head of Bell Labs, India // poosala@alcatel-lucent.com // @vishyp
  • 2. The dark ages of COBOL
  • 3. ..then Codd said let there be tables: Rows & Columns, SQL, Normal Forms, ACID
  • 4. www.data-for-humans.com: What columns? Set-valued attributes, XML, schema evolution
  • 5. Billions of Keys & Values: GFS, Google Big Table, Hadoop, Cassandra, Dynamo
  • 6. How would you build a super-fast, FB-scale chat service, in 2012? (for example)
  • 7. I want my own DB! Main Memory: Memcached, redis. Distr. K-V: MongoDB. Versions: CouchDB. Social Graphs: Neo4j
  • 8. BIG, by scale (KB, GB, TB, PB) and era (60’s, 80-96, 96-’07, ’07-). Data: Files, Tables, Semi-Structured, Variety/Dynamic. Analytics: Stats, OLAP Cube, Apps, Mahout. Language: COBOL, SQL, XML, NoSQL
  • 9. Following *AMAZING* Slides Courtesy: Gregory Piatetsky-Shapiro, kdnuggets.com. You can find all the slides from his talk at: http://www.slideshare.net/gpiatetskyshapiro/analytics-and-data-mining-industry-overview
  • 10. Data Tsunami: In 2010 enterprises stored 7 exabytes (= 7,000,000,000 GB) of new data (McKinsey). 90 percent of the world's data has been generated in the past two years (IBM). Image with apologies to KDD-2011.
  • 11. Pre-history: "Statistics" is the biggest term in the 20th century, but "data mining" and "analytics" appear in the late 1990s. From Google Ngram viewer, English language books. Note: Our analysis uses only English language data. Other languages, especially Chinese, need to be considered for the full picture.
  • 12. Recent History: Analytics, Data Mining, Knowledge Discovery. "Analytics" has been used since 1800, but started to rise in 2005. "Data Mining" jumps around 1996 (soon after the first KDD conference) but declines after 2003 (TIA controversy, associated with government invasion of privacy). "Knowledge Discovery" appears in 1989, jumps in 1996, and plateaus after 2000.
  • 13. Google Trends: After 2006, Data Mining < Analytics
  • 14. Google Insights: searches for "data mining" and "analytics -google" are most popular in India, US
  • 15. Analytics > Data Mining > Data Science
  • 16. Data Science, Big Data
  • 18. Largest Dataset Analyzed? 2011 median dataset size ~10-20 GB, vs 8-10 GB in 2010. Increase in the 10 GB to 1 PB range. www.KDnuggets.com/polls/2011/largest-dataset-analyzed-data-mined.html
  • 19. Which methods/algorithms did you use for data analysis in 2011? (Bar chart: % of analysts who used each method, axis from 0% to 70%.) Methods listed: Decision Trees, Regression, Clustering, Statistics, Visualization, Time series/Sequence analysis, Support Vector (SVM), Association rules, Ensemble methods, Text Mining, Neural Nets, Boosting, Bayesian, Bagging, Factor Analysis, Anomaly/Deviation detection, Social Network Analysis, Survival Analysis, Genetic algorithms, Uplift modeling. www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
  • 20. Cloud Analytics is not common (yet). www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
  • 21. Shortage of Skills: McKinsey: shortage by 2018 in the US of 140,000-190,000 people with deep analytical skills, and 1.5 M managers/analysts with the know-how to use the analysis of big data to make effective decisions. Source: www.mckinsey.com/mgi/publications/big_data/
  • 22. Job data: Data Scientist
  • 23. Jobs: Data Mining >> Data Scientist
  • 24. “Ground” Analytics (LinkedIn Skills): ~75,000 with Data Mining skill; ~7,000 with Predictive Modeling. Also ~20,000 with Predictive Analytics (not related to Predictive Modeling??)
  • 25. Analytics LinkedIn Skills: Predictive Analytics, Machine Learning, Text Mining, MapReduce
  • 26. Big Data Bubble? Big Data Gartner Hype Cycle