Delivering Data Science to the Business


DataWorks Summit 2017 - Sydney Keynote
Madhu Kochar, Vice President, Analytics Product Development and Client Success, IBM

Data science holds the promise of transforming businesses and disrupting entire industries. However, many organizations struggle to deploy and scale key technologies such as machine learning and deep learning. IBM will share how it is making data science accessible to all by simplifying the use of a range of open source technologies and data sources, including high-performing, open architectures geared for cognitive workloads.


Delivering Data Science to the Business
Madhu Kochar
Vice President, Analytics Product Development and Client Services, IBM

Operationalizing Machine Learning and getting actionable insights has been a huge challenge
Data still lives in silos
IBM Db2
Operationalize Machine Learning
Organizations need to act fast: ACT NOW!

Let’s meet Amy, who works for Outdoor Equipment Inc.
Amy, Marketing Director
Company: Outdoor Equipment Inc. is a full-line sporting goods retailer
Business Objective:
Drive top-line growth and market share
Optimize Real-Time Marketing (RTM) and improve Return on Investment (ROI)

Amy wants to run a sales campaign aimed at targeted customers to increase the organization’s revenue
Sleeping Bags
Camping Chairs and Bedding

Amy needs to work with different teams, each performing specific tasks, to execute the campaign
Amy, Marketing Director
Ryan, Data Scientist
Nick, Application Developer
Chris, Data Engineer
Product details, Customer details, Sales campaign
Operationalize Machine Learning

With Big SQL, Amy’s team can self-serve their requirements, save time on execution, and enhance productivity
IBM Big SQL: Self Service
Federation, Spark Integration, Application Integration
Ryan, Data Scientist
Nick, Application Developer
Chris, Data Engineer

Big SQL Key Capabilities
SQL Compatibility
Federation and Spark: Relational Databases, NoSQL, Object Stores
Performance: Leads performance metrics on high volumes of data and concurrent streams
Enterprise and Security: Role and Column level Security, Ranger Integration

Get more performance with Power Systems
3x Price-Performance Guaranteed
Layers: Processing, Data, Storage, Access
Components: Hortonworks, IBM Big SQL, IBM Power Systems, IBM Elastic Storage Server

Find New Business Opportunities or Solve Business Problems using Big SQL
How do I get started?
Big SQL sandbox: Try NOW!
Big SQL v5.0.1: NOW available on HDP v2.6.2

Check out the Breakout sessions
1. Scaling Data Science on Big Data. Date: Wed, 9/20 @ 11:00 AM. Room: C2.3
2. Ingesting Data at Blazing Speed using Apache ORC. Date: Wed, 9/20 @ 4:20 PM. Room: C4.7
3. Open metadata and governance with Apache Atlas. Date: Wed, 9/20 @ 5:10 PM. Room: C4.6
4. Empowering YOU with Democratized Data Access, Data Science and Machine Learning. Date: Wed, 9/20 @ 6:00 PM. Room: C4.5
5. Breaching the 100TB mark with SQL over Hadoop. Date: Thurs, 9/21 @ 2:20 PM. Room: C2.3
6. Birds-of-a-Feather: Apache Spark, Apache Zeppelin and Data Science. Date: Thurs, 9/21 @ 6:00 PM. Room: C4.5
Thank you!
Visit IBM Booth for More Information!

Find more #DWS17 sessions and slides at: www.DataWorksSummit.com

Editor's notes

Organizations understand the importance of machine learning and are exploring ways to use it to improve their business. However, every line of business faces the challenge of finding the best way to operationalize machine learning. Data gravity creates silos in the organization, and bringing all this data together for analysis is a challenge. Even when the data can be brought together, applying an ML model to it requires a special set of skills and development effort. After operationalizing the machine learning model, businesses want to act on the insights it uncovers. These actions vary widely and demand integration and development effort. Businesses can't be agile and act swiftly on data unless these problems are tied together and addressed with a self-service tool.
Let's meet Amy, who works for Outdoor Equipment Inc. Outdoor Equipment Inc. is a sporting goods retailer. Amy is the organization's Marketing Director. As an executive, her business objectives are to grow the business and her organization’s market share. She plans to achieve these goals through real-time marketing and improved ROI.
Based on competitive analysis, market trends, and customer behavior, Amy’s team has concluded that a prospective customer may convert into a paying customer if given the proper incentive to shop. This key finding motivated Amy to design a sales campaign that sends product promotions to targeted users based on their interest in products. Amy is a well-informed executive who understands the power of data science, and she has decided to leverage it to maximize ROI. She wants to put the right incentives in the hands of the right customers to convert them. She has put together a plan to run a three-month sales campaign covering a variety of products available in the store.
Amy has to work with different teams, each performing a specific set of tasks, to execute the marketing campaign. Chris is the data engineer who unifies the data that lives on different data platforms such as Hadoop, Db2, and other RDBMSs. Chris pulls all the data together into a single platform so that it can be used to operationalize the machine learning model and get predictions. Ryan is the data scientist who creates the ML model, based on Amy’s requirements, to recommend the product category a customer is likely to be interested in. Once Chris has applied the ML model created by Ryan, they have a result set of customers and their interests. Finally, Nick integrates the result with the mail gun app to send emails with product promotions to targeted customers. This repeats every day because the product promotions are refreshed daily, and they are extensive during seasonal sales.
With Big SQL, Amy’s team can become self-sufficient in operationalizing the assets on a regular basis once Ryan and Nick have created them. Amy’s team can leverage Big SQL’s federation capabilities to connect to and query data stored in separate data sources, securely, as set up by Chris. Chris no longer needs to ingest the data and bring it into a single location. With federation and predicate pushdown, only the data that matters travels over the wire. With Big SQL's Spark integration, Amy’s team can operationalize Spark ML models without knowing the details of how Spark works or which Spark APIs are involved. Finally, Amy’s team can push the discounted sales promotions, which are refreshed every day, out to customers by leveraging Big SQL’s ability to call user-developed applications. Technically, the application is wrapped as a UDF that Big SQL can invoke. Let me show you in a demo how Eric, a marketing analyst who works for Amy, is able to operationalize this whole effort in just a couple of SQL statements. By using Big SQL, Amy’s team is more self-reliant in executing the marketing campaign because Big SQL ties all these separate tasks together in a single tool. Amy still works with Ryan and Nick, but only when she needs changes to the assets.
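The two technical ideas in the note above, predicate pushdown and wrapping an application as a UDF, can be illustrated with a small, generic sketch. This is not Big SQL code: an in-memory SQLite database stands in for a remote data source, and the table schema, the sample rows, and the `send_promotion` function are all hypothetical names invented for the illustration.

```python
import sqlite3


def make_customer_source():
    """Stand-in for a remote customer database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, interest TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "a@example.com", "camping"),
            (2, "b@example.com", "fishing"),
            (3, "c@example.com", "camping"),
            (4, "d@example.com", "cycling"),
        ],
    )
    return conn


def fetch_without_pushdown(source, interest):
    """Pull every row from the source, then filter on the caller's side."""
    rows = source.execute("SELECT id, email, interest FROM customers").fetchall()
    transferred = len(rows)  # every row crossed the "wire"
    kept = [row for row in rows if row[2] == interest]
    return kept, transferred


def fetch_with_pushdown(source, interest):
    """Send the predicate to the source so only matching rows are returned."""
    rows = source.execute(
        "SELECT id, email, interest FROM customers WHERE interest = ?",
        (interest,),
    ).fetchall()
    return rows, len(rows)


sent = []  # records what the hypothetical application was asked to send


def send_promotion(email, product):
    """Hypothetical application function; it records the request instead of emailing."""
    sent.append((email, product))
    return 1


source = make_customer_source()
local_rows, moved_all = fetch_without_pushdown(source, "camping")
pushed_rows, moved_filtered = fetch_with_pushdown(source, "camping")

# Register the application function as a SQL function (a UDF in SQLite terms)
# and invoke it from a query, once per matching customer.
source.create_function("send_promotion", 2, send_promotion)
notified = source.execute(
    "SELECT SUM(send_promotion(email, 'Sleeping Bags')) "
    "FROM customers WHERE interest = 'camping'"
).fetchone()[0]

print(moved_all, moved_filtered, notified)
```

Both fetch functions return the same customers, but the pushdown version moves only the matching rows. The last query shows how a single SQL statement can both select the targeted customers and trigger an application for each of them.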
After that exciting demo, I would like to summarize how Big SQL can help make your teams more productive and improve your business. Big SQL understands different SQL dialects, so you can leverage your existing skills on Oracle and Netezza to build applications on Big SQL, or import enterprise workloads onto the Hadoop platform and run them as is, without any change. Big SQL can access remote databases and perform query pushdown to these federated data sources. Big SQL integrates with Spark bi-directionally, in memory, to exchange data between Big SQL data sets and Spark DataFrames. This lets Big SQL call any Spark application and operationalize Spark ML models with enterprise data. Big SQL exhibits high performance even when data scales up to 100TB with complex SQL queries. It comes with a workload manager that lets the enterprise fine-tune resource allocation and workloads. Big SQL also has a proven track record of supporting many concurrent users without degrading performance. Big SQL comes enterprise-ready with built-in security features and also integrates with Apache Ranger for centralized management of your Hadoop environment.
Details:
SQL COMPATIBILITY: SQL compatible with Netezza, Oracle, Db2, etc.; applications work as is, without any changes.
FEDERATION AND SPARK: Federates to more than 10 data sources (RDBMS, NoSQL, and/or object stores); integrates bi-directionally with Spark, like no other; operationalizes ML models.
PERFORMANCE: Exhibits high performance even when data scales up to 100TB with complex SQL; handles many concurrent users without relinquishing performance.
ENTERPRISE & SECURITY: Secures data using SQL with roles; integrates with Ranger for centralized management.
We have some very exciting sessions lined up for you at this conference. Please attend them to learn more. If you have questions about the demo or need more information, please visit us at the IBM booth in the expo hall.
