Data Discovery Tool
BigSheets
MapReduce with No Coding?
Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com)
Big Data Tiger Team
IBM Software
Looking at Data
• What would you do with big data?
• How do you make use of it?
• It is difficult: the goal is too vague.
   • No specific problem that needs to be solved.
   • No specific question that needs to be answered.
• All you know is that you need to improve the business.
• But you have *data*.
• So, what would you do first?
  Look at the data!
IBM with Hadoop
• IBM has been working with the open source community for a long time.
  – Eclipse, Hadoop, and so on

• BigInsights includes Hadoop.
BigInsights
• BigInsights is IBM's Hadoop product for big data analytics.
  – Basic Edition (up to 10 TB): free to use!
  – Enterprise Edition

• Next version of BigInsights coming soon.
  – v1.2 available.

• And many more
BigInsights Components
• BigInsights includes:
  –   IBM Java
  –   JAQL           - a language developed by IBM (open source)
  –   IBM Distribution of Hadoop
  –   BigSheets      - a data exploration tool
  –   FLEX scheduler for Adaptive MapReduce
  –   Orchestrator (Workflow Engine)
  –   SystemT (Text Analytics), SystemML (Machine Learning)
  –   LDAP
  –   Web Console / Developer Studio
BigInsights – Basic Edition
Function                                                          Version*         Basic      Enterprise
                                                                                   Edition    Edition
Integrated Install                                                                 Inc        Inc
Open Source components:
Hadoop (including common utilities, HDFS, MapReduce framework)    0.20.2           Inc        Inc
Jaql (programming / query language)                               0.5.2            Inc        Inc
Pig (programming / query language)                                0.7              Inc        Inc
Flume (data collection/aggregation)                               0.9.1            Inc        Inc
Hive (data summarization/querying)                                0.5              Inc        Inc
Lucene (text search)                                              3.0.2            Inc        Inc
Zookeeper (process coordination)                                  3.2.2            Inc        Inc
Avro (data serialization)                                         1.3.0            Inc        Inc
HBase (real time read/write)                                      0.20.6           Inc        Inc
Oozie (workflow/ job orchestration)                               2.2.2            Inc        Inc
Online documentation                                                               Inc        Inc
Capability to integrate with DB2, InfoSphere Warehouse                             Inc        Inc
 Two DB2 UDFs to submit jobs, and read results from BigInsights
* Versions will be updated in the Nov release.
BigInsights – Enterprise Edition
Function                                                             Basic       Enterprise
                                                                     Edition     Edition
R Connector
 Jaql module to invoke R statistical capabilities from BigInsights   n/a         Inc
Netezza Connector
 Jaql modules to read/write data from/to Netezza                     n/a         Inc
LDAP                                                                 n/a         Inc
Web Console                                                          n/a         Inc
Workflow Engine                                                      n/a         Inc
Scheduler (Orchestrator)                                             n/a         Inc
Text Analytics Module (System T)                                     n/a         Inc
Eclipse support (for System T)*                                      n/a         Inc
BigSheets – Data Discovery Tool                                      n/a         Inc
IBM Optim Development Studio V2.2.1.0                                n/a         Inc
Support by IBM                                                       n/a         Inc
BigSheets
• A data exploration tool for Hadoop
• Only comes with BigInsights Enterprise Edition
BigSheets Concept Model
[Diagram: BigSheets gathers data from the Internet, intranet, logs, and other sources. Users can then enrich, inspect, explore, get/manipulate, and publish it, and explore and analyze the massive results in BigInsights. No coding is required!]
It's like a spreadsheet.
Looks very familiar?!
Visualizations
• Predefined visualizations
• Customer plug-ins

[Figure: the number of coffee shops in North America, by state.]
DEMO
[Diagram: Gather. Data from the Internet, intranet, logs, and other sources is gathered into BigSheets on BigInsights.]
• BigInsights can gather data from
  – Predefined formats:
     •   BigSheets data reader
     •   Basic crawler data reader
     •   Basic crawler data reader (binary support)
     •   Character-delimited data reader
     •   Tab Separated Value (TSV) data reader
     •   JavaScript Object Notation (JSON) array reader
     •   Comma Separated Value (CSV) data reader

  – Customer BigSheets Reader

• BigInsights can import structured and unstructured data
  – CSV
  – Files
  – Network
     • http
     • hdfs
     • AWS (S3n/S3)
  – Other
     • Customer Importer
[Screenshot: Collection. A complete list of McDonald's in North America.]
[Screenshot: Import, Reformat, Calculate. A complete list of McDonald's in North America.]
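BigSheets performs the Import, Reformat, and Calculate steps without any coding. For readers who want to see roughly what those steps amount to, here is a minimal Python sketch. It assumes a hypothetical CSV with `name` and `state` columns. The sample rows, column names, and function names are made up for illustration and do not come from the demo or from BigSheets itself.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; the real collection used in the demo is not shown.
SAMPLE_CSV = """name,state
Store A, ca
Store B,NY
Store C,CA
Store D, ny
Store E,TX
"""

def import_rows(text):
    """Import: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def reformat(rows):
    """Reformat: trim whitespace and upper-case the state code."""
    return [
        {"name": r["name"].strip(), "state": r["state"].strip().upper()}
        for r in rows
    ]

def calculate(rows):
    """Calculate: count stores per state, the kind of input a chart needs."""
    return Counter(r["state"] for r in rows)

counts = calculate(reformat(import_rows(SAMPLE_CSV)))
for state, n in sorted(counts.items()):
    print(state, n)
```

On a Hadoop cluster the per-state count would run as a MapReduce job over far more rows; this single-process version only illustrates the shape of the three steps.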
[Screenshots: Column chart; Heat map.]
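BigSheets in Action
• Blockbuster: movie sales forecasting
 – From ABC News
Blockbuster – Movie Sales Forecasting
    IBM BigInsights/BigSheets
                 (1) Takes in a feed of the Tweets posted over the weekend
                 (about 200,000),

                 (2) and within a few hours
                 (previously, not until Monday morning):
                 - builds a sales forecast chart
                 - runs sentiment analysis
                 For example, this summer X-man was more popular
                 (more tweeted about) than any other film
                 → advertising and screening strategies
                 can be adjusted frequently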
Conclusion

• We all need to improve the business.

• So, where would you start with big data?

 Data discovery is a key to start improving
              YOUR business!
Thank you!

Hadoop Summit Japan 2011 Fall - LT by IBM

  • 1. Data Discovery Tool BigSheets: MapReduce with No Coding?
      Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com)
      Big Data Tiger Team, IBM Software
  • 2. Looking at Data
      • What would you do with big data?
      • How do you make use of it?
      • It is difficult! Too vague.
          • No specific problem that needs to be solved.
          • No specific question that needs to be answered.
      • All you know is that you need to improve the business.
      • But you have *data*.
      • So, what would you do first?
      Looking at Data!
  • 3. IBM with Hadoop
      • IBM has been working with the open source community for a long time.
          – Eclipse, Hadoop and so on …
      • BigInsights includes Hadoop.
  • 4. BigInsights
      • BigInsights is IBM's Hadoop product for big data analytics.
          – Basic Edition (up to 10TB): free to use!
          – Enterprise Edition
      • Next version of BigInsights coming soon.
          – v1.2 available.
      • And many more
  • 5. BigInsights Components
      • BigInsights includes:
          – IBM Java
          – JAQL: a language developed by IBM (open source)
          – IBM Distribution of Hadoop
          – BigSheets: a data exploration tool
          – FLEX scheduler for Adaptive MapReduce
          – Orchestrator (Workflow Engine)
          – SystemT (Text Analytics), SystemML (Machine Learning)
          – LDAP
          – Web Console / Developer Studio
  • 6. BigInsights – Basic Edition
      Function                                                          Version  Basic Edition  Enterprise Edition
      Integrated install                                                         Inc            Inc
      Open source components:
        Hadoop (including common utilities, HDFS, MapReduce framework)  0.20.2   Inc            Inc
        Jaql (programming / query language)                             0.5.2    Inc            Inc
        Pig (programming / query language)                              0.7      Inc            Inc
        Flume (data collection/aggregation)                             0.9.1    Inc            Inc
        Hive (data summarization/querying)                              0.5      Inc            Inc
        Lucene (text search)                                            3.0.2    Inc            Inc
        Zookeeper (process coordination)                                3.2.2    Inc            Inc
        Avro (data serialization)                                       1.3.0    Inc            Inc
        HBase (real time read/write)                                    0.20.6   Inc            Inc
        Oozie (workflow/job orchestration)                              2.2.2    Inc            Inc
      Online documentation                                                       Inc            Inc
      Capability to integrate with DB2, InfoSphere Warehouse                     Inc            Inc
        (two DB2 UDFs to submit jobs and read results from BigInsights)
      Note: versions will be updated in the Nov release.
  • 7. BigInsights – Enterprise Edition
      Function                                Basic Edition  Enterprise Edition
      R Connector                             n/a            Inc
        (Jaql module to invoke R statistical capabilities from BigInsights)
      Netezza Connector                       n/a            Inc
        (Jaql modules to read/write data from/to Netezza)
      LDAP                                    n/a            Inc
      Web Console                             n/a            Inc
      Workflow Engine                         n/a            Inc
      Scheduler (Orchestrator)                n/a            Inc
      Text Analytics Module (System T)        n/a            Inc
      Eclipse support (for System T)*         n/a            Inc
      BigSheets – Data Discovery Tool         n/a            Inc
      IBM Optim Development Studio V2.2.1.0   n/a            Inc
      Support by IBM                          n/a            Inc
  • 8. BigSheets
      • A data exploration tool for Hadoop
      • Only comes with BigInsights Enterprise Edition
  • 9. BigSheets Concept Model
      No coding is required!
      [Diagram labels: sources (Internet, Intranet, Logs, Other); Gather; BigSheets; Get/Manipulate; Explore & Analyze (Enrich, Inspect, Explore); Publish; massive results in BigInsights]
  • 10. It's like a spreadsheet. Looks very familiar?!
  • 11. Visualizations
      • Predefined visualizations
      • Customer plug-ins
      [Figure: the number of coffee shops in each state in North America.]
  • 12. DEMO
  • 13. [Diagram labels: Internet, Intranet, Logs, Other; Gather; BigSheets; BigInsights]
      • BigInsights can gather data from
          – Predefined formats:
              • BigSheets data reader
              • Basic crawler data reader
              • Basic crawler data reader (binary support)
              • Character-delimited data reader
              • Tab Separated Value (TSV) data reader
              • JavaScript Object Notation (JSON) array reader
              • Comma Separated Value (CSV) data reader
          – Customer BigSheets Reader
  • 14. [Diagram labels: Internet, Intranet, Logs, Other; Gather; BigSheets; BigInsights]
      • BigInsights can import structured and unstructured data
          – CSV
          – Files
          – Network
              • http
              • hdfs
              • AWS (S3n/S3)
          – Other
              • Customer Importer
  • 15. [Diagram labels: Internet, Intranet, Logs, Other; Collection; BigSheets; BigInsights]
      A complete list of McDonald's in North America.
  • 16. [Diagram labels: Internet, Intranet, Logs, Other; BigSheets; BigInsights; Import, Reformat, Calculate]
      A complete list of McDonald's in North America.
  • 17. [Diagram labels: Internet, Intranet, Logs, Other; BigSheets; BigInsights]
      [Figures: column chart, heat map]
  • 18. BigSheets in Action
      • Blockbuster: movie box-office sales forecasting (from ABC News)
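  • 19. Blockbuster – movie box-office sales forecasting
      IBM BigInsights/BigSheets
      (1) takes in a feed of the Tweets posted over the weekend (about 200,000), and
      (2) within a few hours (previously, not until Monday morning):
          – builds a sales forecast chart
          – runs sentiment analysis
      For example, this summer X-Men was more popular (more tweeted about) than any other title, so advertising and screening strategies can be adjusted frequently.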
  • 20. Conclusion
      • We all need to improve the business.
      • So, where would you start with big data?
      Data discovery is the key to start improving YOUR business!
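
As a rough illustration of the kind of job BigSheets spares the user from writing, the per-state count behind the coffee-shop chart on slide 11 can be sketched in plain Python as a map step followed by a reduce step. This is only a sketch: the sample records, field names, and function names are hypothetical and are not taken from the talk or from any BigInsights API.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical sample records; the talk does not show its actual data.
SHOPS = [
    {"name": "Shop A", "state": "CA"},
    {"name": "Shop B", "state": "NY"},
    {"name": "Shop C", "state": "CA"},
]


def map_shop(record):
    """Map step: emit one (state, 1) pair per record."""
    yield record["state"], 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_by_state(records):
    """Run the map step over every record, then reduce the pairs."""
    pairs = chain.from_iterable(map_shop(r) for r in records)
    return reduce_counts(pairs)


if __name__ == "__main__":
    print(count_by_state(SHOPS))
```

On a real cluster the map and reduce steps would run in parallel across many machines; here they run in one process purely to show the shape of the computation.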