WINDOWS AZURE HDINSIGHT
Matt Winkler
Azure Data Platform
Windows Azure
cloud services



    application
building blocks
Windows Azure
HDInsight Service
  elastic
  simple
  secure
Built on HDP
      Core
       Pig
      Hive
     Oozie
     Sqoop
    Ambari
   HCatalog
   Templeton
Demo
Provisioning
Leverage Azure Storage
    Economic Flexibility
          Scale
     Geo-Redundancy
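
A minimal sketch of how a job might address data that lives in Azure Storage rather than on cluster disks. The slide does not name a URI scheme; the `wasb://` form below is an assumption based on the Hadoop Azure Blob Storage driver (early previews used a different scheme), and the container, account, and path names are hypothetical.

```python
def wasb_uri(container: str, account: str, path: str, secure: bool = False) -> str:
    """Build a blob URI in the wasb[s]://container@account/path form.

    Assumption: the cluster uses the Hadoop Azure Blob Storage driver,
    which resolves URIs of this shape. All names passed in are examples.
    """
    scheme = "wasbs" if secure else "wasb"
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


if __name__ == "__main__":
    # Hypothetical container "logs" in a hypothetical account "examplestore".
    print(wasb_uri("logs", "examplestore", "/2013/clicks.tsv"))
```

Because the data is addressed by storage account rather than by cluster, a cluster can be deleted and re-created without moving the data, which is the economic flexibility the slide refers to.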
Secure
        Isolated
Single REST Entrypoint
Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…




C#, F# Map/Reduce, LINQ to Hive, .NET management clients




JavaScript Map/Reduce, Browser hosted console, Node.js management clients




PowerShell, Cross Platform CLI tools
Price
     Compute*
(~
        +
     Storage
(~
The Data Platform
for Modern Apps

 Any Data, Any Size, Anywhere
 Data Management and Insights at Scale
Resources
 Windows Azure
 Free trial
 Getting Started with HDInsight
 Pricing
 .NET SDK For Hadoop
 Halo 4 Case Study
start now.
Management
 UI Tooling
  Cluster usage
  Job authoring
  Result consumption in common tools
 PowerShell & Cross platform scripting
 API Surface
  RDFE – Azure provisioning
  Ambari – Cluster monitoring
  WebHCatalog – Metadata and job submission
  WebHDFS, Blob Storage – Storage
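
As an illustration of the storage part of the API surface, here is a sketch that builds a WebHDFS REST URL. The `/webhdfs/v1/<path>?op=...` layout comes from the Apache Hadoop WebHDFS API; the host name is hypothetical, and port 50070 is only the usual NameNode HTTP default of that Hadoop generation, so the endpoint a hosted cluster actually exposes may differ.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, path: str, op: str, port: int = 50070, **params: str) -> str:
    """Build a WebHDFS URL for one operation, e.g. LISTSTATUS or OPEN.

    Assumptions: plain HTTP and a directly reachable NameNode; the host
    and port are placeholders, not values taken from the slides.
    """
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


if __name__ == "__main__":
    # List a hypothetical user directory.
    print(webhdfs_url("namenode.example.com", "/user/demo", "LISTSTATUS"))
```

The URL can be handed to any HTTP client, which is what makes the REST surface usable from .NET, Node.js, and scripting tools alike.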
Existing Ecosystem
 Actively contributing to:
  Core
  Pig
  Hive
  HCatalog

 Branching to other projects
 Simple one-box developer install on Windows
.NET
 Map/Reduce
 LINQ to Hive
 Client APIs
  WebHCat
  Ambari
  WebHDFS
  Azure

 Visual Studio Tooling
    Local debugging support
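
The client APIs listed above wrap REST calls. As a rough sketch of what a Hive submission through WebHCat could look like on the wire, the function below prepares the URL and form body without sending anything. The `/templeton/v1/hive` resource and its `execute`, `statusdir`, and `user.name` parameters come from the Apache WebHCat (Templeton) API; the host, user, query, and output directory are hypothetical, and port 50111 is only the WebHCat default.

```python
from typing import Tuple
from urllib.parse import urlencode


def hive_job_request(host: str, user: str, query: str, statusdir: str,
                     port: int = 50111) -> Tuple[str, str]:
    """Return (url, form_body) for a WebHCat Hive job submission.

    Nothing is sent here; the pair would be POSTed with
    Content-Type application/x-www-form-urlencoded.
    """
    url = f"http://{host}:{port}/templeton/v1/hive?{urlencode({'user.name': user})}"
    body = urlencode({"execute": query, "statusdir": statusdir})
    return url, body


if __name__ == "__main__":
    # Hypothetical head node, user, table, and status directory.
    url, body = hive_job_request(
        "headnode.example.com", "demo",
        "SELECT COUNT(*) FROM logs;", "/user/demo/out",
    )
    print(url)
    print(body)
```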
JavaScript
 MRjs – Map/Reduce in JavaScript
 Node.js client APIs
  WebHCat
  WebHDFS
  Ambari
  Azure
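
For readers new to the Map/Reduce model that the .NET and JavaScript libraries expose, here is a language-neutral sketch in Python. It is not the MRjs or .NET SDK API; the function names are made up for illustration, and the shuffle step is simulated in memory rather than run on a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Sum all counts emitted for one word."""
    return word, sum(counts)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Simulate map, shuffle (group by key), and reduce in a single process."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    print(run_job(["big data", "big clusters"]))
    # prints {'big': 2, 'clusters': 1, 'data': 1}
```

On a real cluster the framework runs many mappers and reducers in parallel and performs the grouping between them; only the two small functions are written by the developer.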
Open
 Sources
  http://hadoopsdk.codeplex.com
  http://www.github.com/windowsazure

 NuGet packages
  Microsoft.Hadoop.MapReduce
  Microsoft.Hadoop.Hive
  Microsoft.Hadoop.WebHDFS => WebClient

 NPM packages
  Azure
  Azure-cli
  Hadoop REST clients pending…

Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

Drive Smarter Decisions with Hadoop and Windows Azure HDInsight

  • 1. WINDOWS AZURE HDINSIGHT. Matt Winkler, Azure Data Platform
  • 3. Cloud services, application building blocks
  • 4. Windows Azure HDInsight Service: elastic, simple, secure
  • 5. Built on HDP: Core, Pig, Hive, Oozie, Sqoop, Ambari, HCatalog, Templeton
  • 9. Leverage Azure Storage: economic flexibility, scale, geo-redundancy
  • 10. Secure: isolated, single REST entrypoint
  • 11. Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus… | C#, F# Map/Reduce, LINQ to Hive, .NET management clients | JavaScript Map/Reduce, browser-hosted console, Node.js management clients | PowerShell, cross-platform CLI tools
  • 12. Price Compute* (~ + Storage (~
  • 13. The Data Platform for Modern Apps: Any Data, Any Size, Anywhere; Data Management and Insights at Scale
  • 14. Resources: Windows Azure; Free trial; Getting Started with HDInsight; Pricing; .NET SDK For Hadoop; Halo 4 Case Study
  • 17. Management. UI tooling: cluster usage, job authoring, result consumption in common tools. PowerShell and cross-platform scripting. API surface: RDFE (Azure provisioning), Ambari (cluster monitoring), WebHCatalog (metadata and job submission), WebHDFS and Blob Storage (storage)
  • 19. Existing Ecosystem. Actively contributing to: Core, Pig, Hive, HCatalog. Branching to other projects. Simple one-box developer install on Windows
  • 20. .NET: Map/Reduce; LINQ to Hive; client APIs (WebHCat, Ambari, WebHDFS, Azure); Visual Studio tooling (local debugging support)
  • 21. JavaScript: MRjs (Map/Reduce in JavaScript); Node.js client APIs (WebHCat, WebHDFS, Ambari, Azure)
  • 23. Sources: http://hadoopsdk.codeplex.com (open); http://www.github.com/windowsazure. NuGet packages: Microsoft.Hadoop.MapReduce; Microsoft.Hadoop.Hive; Microsoft.Hadoop.WebHDFS => WebClient. NPM packages: Azure; Azure-cli. Hadoop REST clients pending…
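
The slides list WebHDFS as the storage part of the API surface. As a rough illustration only, the sketch below builds request URLs in the shape used by the standard Apache Hadoop WebHDFS REST API (`/webhdfs/v1/<path>?op=...`). It is not taken from the deck: the helper name `webhdfs_url`, the host `example-cluster.invalid`, and the port `50070` are assumptions made for the example, and an HDInsight cluster may expose WebHDFS through a different gateway address.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, params=None):
    """Build a WebHDFS REST URL (hypothetical helper, not part of any SDK).

    host and port are placeholders supplied by the caller; path is an
    absolute HDFS path; op is a WebHDFS operation name such as LISTSTATUS.
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /user/demo")
    # The operation goes first in the query string, followed by any extras.
    query = urlencode({"op": op, **(params or {})})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed, illustrative values: list a directory, then read a file from offset 0.
list_url = webhdfs_url("example-cluster.invalid", 50070, "/user/demo", "LISTSTATUS")
open_url = webhdfs_url("example-cluster.invalid", 50070, "/user/demo/data.txt",
                       "OPEN", {"offset": 0})
print(list_url)
print(open_url)
```

The sketch only constructs URLs and sends nothing over the network; authentication and the actual cluster endpoint are left out because the slides do not specify them.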

Editor's notes

  1. View from Camp Muir looking to Mount Adams, Mount Rainier National Park, Washington 2011, © matt winkler
  2. Innovate across the stack