© 2017 IBM Corporation
Data: Spark and the Data Federation
Leif Pedersen
Executive IT Specialist,
z Analytics, Europe
Email: Leif.Pedersen@dk.ibm.com
Systems of Record, Systems of Insight, Systems of Engagement
Looks like “déjà vu”?
In the new insight economy, winners infuse analytics
everywhere to drive better outcomes!
• Create new business models (CEO)
• Attract, grow, retain customers (CMO)
• Transform financial & management processes (CFO)
• Manage risk (CRO)
• Prioritize IT investment for innovation (CIO, CDO)
• Optimize operations (COO)
• Fight fraud and counter threats (CSO)
Systems of Record, Systems of Insight, Systems of Engagement
IBM Analytics Point of View - Make DATA SIMPLE and ACCESSIBLE to ALL
DATA Professionals are leading THE Transformation!
Diagram labels: All Data, New Analytics, New Dev Styles, More People, Business Value
• Embrace all data
• Enable all analytics
• Run at the speed of business
The Evolution in the Approach to Getting Value from Data
Stages: Operations, Data Warehousing, Self-service Analytics, New Business Imperatives
Axes: Value (Low to High) and Maturity (Low to High)
Lower the Cost of Storage
Ensure resiliency and availability
Warehouse Modernization
• Data lake
• Data offload
• ETL offload
• Queryable archive and staging
Data-Informed Decision Making
• Full dataset analysis (no more sampling)
• Extract value from non-relational data
• 360° view of all enterprise data
• Exploratory analysis and discovery
Business Transformation
• Create new business models
• Risk-aware decision making
• Fight fraud and counter threats
• Optimize operations
• Attract, grow, retain customers
We are here
Analytics evolution to support all Analytics Apps on all Data – The Mainframe Use case
SoR (Systems of Record), on z/OS
• Applications: Core Business supported by CICS, IMS, WAS
• Data: Operational Data stored in VSAM, IMS, DB2
• Rules and score execution
SoI (Systems of Insight)
• The Data Lake Evolution: HDFS, Map / Reduce, Spark
• Historical data in DB2 for z/OS & IBM DB2 Analytics Accelerator
• BI Reporting: Data Warehouse / Data Marts
• The Predictive Analytics Evolution: Machine Learning, score creation
• Other Data
• IT Operational Data
SoE (Systems of Engagement)
z Systems Analytics Areas complement existing Analytics Environments.
Diagram labels: zApps, zData, Scoring, Rules, IBM DB2 Analytics Accelerator
• In-transaction rules and score execution
• Intraday capability for ad-hoc queries & predictive analytics
• Availability of historical data (in raw format)
• Accelerated reporting to fulfill internal and regulatory requirements
• Ability to transform data before offload to DWH or reporting
• Ability to create new models at any time
• Quasi real-time availability of data for analytics
• Instant access to raw data for new report generation in hours instead of days
• Load and merge of ANY non-DB2 z/OS data
• Explore data to uncover hidden insights
The integration of transactions and analytics is an emerging and important market segment
• Opportunity to rethink business processes: analytics as an integral part of the process itself, rather than a separate activity performed after the fact
  o Transform business processes, not just provide existing styles of analytics faster and without latency
• Enable business leaders to perform, in the context of operational processes, advanced and sophisticated real-time analysis of their business data
“Hybrid transaction/analytical processing will empower application leaders to innovate via greater situation awareness and improved business agility.”
Gartner Research Note G00259033, 28 January 2014: Hybrid Transaction/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation
Analytics as part of the flow of business
Insights on every transaction
Hybrid Transaction/Analytical Processing (HTAP) - with DB2 Analytics Accelerator
Diagram labels: DB2 for z/OS Processing, IBM DB2 Analytics Accelerator, Hybrid Transactional & Analytical Processing
Workload spectrum: OLTP Transactions, High concurrency, Standard reports, Complex queries (more history), OLAP
DB2 for z/OS CPU savings target
• Operational (in transaction) analytics
• (complex) OLTP
Accelerator focus
• Ad-hoc queries
• Complex queries scanning large amounts of data
• ETL acceleration/virtual transformation
Data Warehouse and Data Lake
A Data Lake is…
+ An analytics sandbox for exploring data to gain insight
+ An enterprise-wide catalog to find data across the enterprise and to link from business term to technical metadata
+ An environment for enabling reuse of data transformations and queries
+ An environment where users can access vast amounts of raw data
+ An environment for developing and proving an analytics model and then moving it into production; experience in production may drive further experimentation in the data lake
A Data Lake is not…
- A data warehouse or data mart of all of the data in an enterprise
- A high-performance production environment
- A production reporting application
- A purpose-built system to solve a specific problem
Spark, a Transaction Manager for Analytics Applications
Fast Runtime Environment
– Interactive or batch processing
– Based on in-memory data processing
• High performance for multi-step processes where Spark can pass the data directly without using disk storage.
– Parallel processing
Interface to Data
– Accessing Hadoop-based HDFS data, Cassandra, HBase, …
– Accessing any traditional database using JDBC
Interface for Applications – Ease-of-use APIs supported by modern languages
– Stack of libraries including SQL, Machine Learning, GraphX, and Spark Streaming
– Over 80 high-level operators that make it easy to build parallel applications
– Many languages supported including Java, Scala, Python and R
• Spark is actually written in Scala
Spark is NOT a datastore, NOT a replacement for Hadoop!
Why Spark matters to a business?
1. Spark makes it easier to access and work with all data
- Enables new data-based use cases
- All data: Internal/External, Structured/Unstructured
- Real-time insights, from all data sources
- Automates analytics with Machine Learning
- Clients that lead in data, lead their industry
2. Spark lets you develop line-of-business applications faster
3. Spark learns from data and delivers in real time
With Hadoop, you ask a question and get back a batch of data. With Spark, you may say, “continue to give me answers to this question”…and when new data comes, the user is smarter.
Diagram labels: Design, Development, Data Science
Spark can run on z/OS close to z/OS-based Applications & Data
Diagram labels:
• Key Business Transaction & Batch Systems
• z/OS data sources: DB2 z/OS, IMS, VSAM, Adabas, and *many* more including SMF, OPERLOG, SYSLOGs, . . .
• Distributed data sources: Teradata, HDFS, and *many* more . . .
• IBM z/OS Platform for Apache Spark: Optimized data access; Apache Spark Core with Spark Streaming, Spark SQL, MLlib, GraphX (RDD, DF)
• Spark Applications: IBM and Partners
Values:
• Data-in-place analytics, without need to ETL or move data for analytic purposes
• Optimized access and z/OS governed ‘in-memory’ capabilities for core business data
• Unique capability to access almost all z/OS sources with Apache Spark SQL & many non-z data sources
• Almost all zIIP eligible
• Integration of analytics across core systems, social data, website information, etc.
Examples of Spark Use Cases
Use Case Patterns
Client Insight Analytics over transactions & customer interactions
• Leverage data on z/OS (DB2, VSAM) & distributed (Oracle, SQL Server, HDFS) to enable real-time access for data science teams focused on client insight to develop patterns and models
Data Distillation - Hybrid Architecture
• Run Spark z/OS to access, aggregate, filter and *distill* large volumes of data
• Make smaller, aggregated analytic results available to customer insight solutions and data science environments
360 Degree View: Customers, Payments, Transactions
• Leverage Spark z/OS to get a real-time or near real-time view of the current status of payments, transactions and customers, combining data from OLTP, distributed sources & streaming
IT Analytics
• Analyze real-time streamed SMF data, combined with archived SMF data and syslog data; visualize and interact with the data in a data science Jupyter Notebook to find patterns
Use Case #1: Hybrid Data Science
Distill the Data:
• Use Spark z/OS for data blending, cleansing, transformation, etc. with data in place
• Store results in ‘Tidy’ Data Repository
• Refresh as needed
Explore the results:
• Data exploration, investigation leveraging ‘Tidy’ Repository
Values:
• Leverage most current business data for data science
• Efficiencies in reducing ETL
• Leverage common analytics ecosystem skills
• Integrate Spark on multiple platforms for optimal analytics infrastructure
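The "distill" step in this use case can be illustrated with a minimal, plain-Python sketch. It is not Spark code and not IBM's implementation: the record layout, the field names (`customer`, `amount`, `channel`) and the `distill` helper are all hypothetical, chosen only to show cleansing and aggregating raw records into a small "tidy" result set that data scientists would explore.

```python
from collections import defaultdict

# Hypothetical raw transaction records; field names are illustrative only.
raw_transactions = [
    {"customer": "C1", "amount": "120.50", "channel": "web"},
    {"customer": "C1", "amount": "30.00", "channel": "branch"},
    {"customer": "C2", "amount": "", "channel": "web"},        # missing amount
    {"customer": "C2", "amount": "75.25", "channel": "WEB "},  # untidy channel value
]

def distill(records):
    """Cleanse raw records and aggregate them into one tidy row per customer."""
    totals = defaultdict(lambda: {"total": 0.0, "count": 0, "channels": set()})
    for rec in records:
        if not rec["amount"]:
            continue  # cleansing: skip records without an amount
        row = totals[rec["customer"]]
        row["total"] += float(rec["amount"])
        row["count"] += 1
        row["channels"].add(rec["channel"].strip().lower())  # normalize channel names
    return [
        {
            "customer": customer,
            "total": round(row["total"], 2),
            "count": row["count"],
            "channels": sorted(row["channels"]),
        }
        for customer, row in sorted(totals.items())
    ]

# The tidy repository holds far fewer rows than the raw data.
tidy = distill(raw_transactions)
for row in tidy:
    print(row)
```

In a Spark deployment the same filter-and-aggregate logic would typically be expressed with DataFrame operations, and the result refreshed on whatever schedule the tidy repository needs.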
Use Case #2: Optimized Customer Insight
Diagram labels:
• z/OS data: Customer, Transaction, Merchant
• IBM z/OS Platform for Apache Spark: Optimized Data Layer; Apache Spark Core with Spark Streaming, Spark SQL, MLlib, GraphX (RDD, DF)
• Spark Analytic Result Set: subset of data, distilled, filtered, transformed (Tidy Data)
• Transform (if needed), & populate BBCI staging area / cache
• Customer Insight for Banking Solution: Input & Output, Data Cube, Analytical Engines, Analytics API Gateway, APIs, Web Portal, Call Center, BI Dashboard Components, Pre-Built Dashboards, Pre-Built Data Models, Pre-Built Analytical Models
Values:
• Avoid costly and ineffective wholesale copy of data
• Frequent refresh of most relevant data elements to customer insights solution
• Faster time to implementation for business solution to deliver insights on churn, cross-sell, etc.
Use Case #3: Real-Time Application Event Analytics
• CICS event triggers create an event stream that would be captured by Spark running on its own z/OS LPAR
• Spark configured for high availability to avoid impacting CICS
Real-Time Analytics with Spark z/OS:
• Real-time analytics to provide feedback into the Systems of Engagement or Monitoring Systems on types of banking services and frequency of consumption
• Real-time monitoring of core business processes and applications
Historical Analysis leverages IDAA (IBM DB2 Analytics Accelerator):
• Batch load of events for historical, trending and reporting
Diagram labels:
• System of Engagement, CICS Transactions, Event Stream, Logstream, Channel, Monitor
• Spark z/OS: Real-Time Consumption; Real-Time Analytics, can include scoring
• DB2 Analytics Accelerator Loader: Batch Load Overnight
• DB2 z/OS and IBM DB2 Analytics Accelerator: Historical Analysis, Reporting
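The "types of banking services and frequency of consumption" feedback can be sketched as a sliding-window counter. This is a plain-Python illustration under stated assumptions, not Spark Streaming code and not the CICS event format: the `ServiceUsageWindow` class, the event tuples and the service names are hypothetical, and events are assumed to arrive in non-decreasing timestamp order.

```python
from collections import Counter, deque

class ServiceUsageWindow:
    """Counts events per service type over the most recent `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()   # (timestamp, service) pairs in arrival order
        self.counts = Counter()

    def add(self, timestamp, service):
        """Record one event, then evict events that fell out of the window."""
        self.events.append((timestamp, service))
        self.counts[service] += 1
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_service = self.events.popleft()
            self.counts[old_service] -= 1
            if self.counts[old_service] == 0:
                del self.counts[old_service]

    def snapshot(self):
        """Return the current per-service counts as a plain dict."""
        return dict(self.counts)

# Hypothetical event stream: (seconds since start, service type).
window = ServiceUsageWindow(window_seconds=60)
for ts, service in [(0, "payment"), (10, "balance"), (30, "payment"), (70, "transfer")]:
    window.add(ts, service)

print(window.snapshot())  # events at t=0 and t=10 are outside the 60-second window by t=70
```

A real deployment would replace the in-process deque with a streaming engine's windowing support and publish each snapshot to the monitoring system or System of Engagement.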
Use Case #4: Surface Spark Results to JDBC / ODBC Applications
Diagram labels:
• Data sources: DB2 z/OS, IMS, VSAM, HDFS
• z/OS: Optimized Data Layer; Apache Spark Core with Spark Streaming, Spark SQL, MLlib, GraphX (DF, RDD)
DF Store:
• Persist specific Spark Result Sets
• Backed by VSAM
• Leverage z/OS SAF, Dataset mgmt
JDBC / ODBC / REST, noSQL client accessing Spark RDDs, example: Cognos, Tableau, …
Use Case #5: Analyzing SMF Data with Spark
• Spark application is agnostic to data source and number of sources
• MDSS required on at least one system, MDSS agents required on all systems. No IPL required for installation
• Logstream recording mode required for real-time interfaces
Diagram labels:
• LPAR1, LPAR2, LPAR3, LPARn: each with an MDSS Client reading SMF Realtime and Logstreams
• Dump Data Sets
• Optimized Data Integration Layer (MDSS)
• JDBC
• Spark Application using SparkSQL
Values:
• Analyze real-time in-memory SMF data, combined with archived data
• Analyze data across multiple LPARs
• Augment with SYSLOG and other sources for richer analytic outcome
• Efficiencies in avoiding data movement
Use Cases for Real Time SMF Analytics
Detect excessive memory consumption – SMF 30
• Monitor high water mark for real memory usage for jobs and send alerts if usage exceeds normal consumption
Detect security violations in real time – SMF 80
• Monitor volume of datasets/files accessed per user within a given time period and raise alerts for above-normal access rates
Real-time monitoring of resource usage in cloud environments (CPU, Memory, Disk)
A list of supported SMF record types can be found in the Redbook “Apache Spark Implementation on IBM z/OS”, page 78
http://www.redbooks.ibm.com/abstracts/sg248325.html
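The two alerting rules above can be sketched in plain Python. This is an illustration under stated assumptions only: the input shapes, the function names `memory_alerts` and `access_rate_alerts`, the three-sigma threshold and the fixed (non-sliding) time buckets are all hypothetical choices, and the data does not follow the real SMF 30 or SMF 80 record layouts.

```python
from collections import defaultdict
from statistics import mean, pstdev

def memory_alerts(history, current, sigmas=3.0):
    """Flag jobs whose current high water mark exceeds mean + sigmas * stdev of their history."""
    alerts = []
    for job, high_water_mark in current.items():
        past = history.get(job, [])
        if len(past) < 2:
            continue  # not enough history to define "normal consumption"
        threshold = mean(past) + sigmas * pstdev(past)
        if high_water_mark > threshold:
            alerts.append(job)
    return sorted(alerts)

def access_rate_alerts(events, window_seconds, max_accesses):
    """Flag users who access more than max_accesses distinct datasets in one fixed time bucket."""
    buckets = defaultdict(set)
    for timestamp, user, dataset in events:
        buckets[(user, timestamp // window_seconds)].add(dataset)
    return sorted({user for (user, _), datasets in buckets.items() if len(datasets) > max_accesses})

# Hypothetical inputs: per-job memory high water marks (MB) and dataset access events.
history = {"PAYROLL": [100, 110, 105, 95], "BACKUP": [500, 520, 510, 490]}
current = {"PAYROLL": 300, "BACKUP": 515}
events = [
    (0, "alice", "A.DATA"), (5, "alice", "B.DATA"), (9, "alice", "C.DATA"),
    (12, "bob", "A.DATA"), (70, "bob", "B.DATA"),
]

memory_result = memory_alerts(history, current)
access_result = access_rate_alerts(events, window_seconds=60, max_accesses=2)
print(memory_result, access_result)
```

With Spark on z/OS, the same rules would more likely be written as Spark SQL queries over the SMF data exposed by the data integration layer; the thresholds are tuning parameters that depend on each installation's workload.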
IBM Open Data Analytics for z/OS
Leveraging IBM Z for Optimized Analytics
Federate analytics leveraging data in place for more current insights at scale, optimized security, privacy and reduced costs
Diagram labels:
• Business Applications: Customer, Transaction, Merchant
• IBM Open Data Analytics for z/OS: Apache Spark, Python, Optimized Data Integration Layer
• IBM Machine Learning for z/OS: Data, Data Prep, ML Algo, Model, Deploy, Predict
• Govern, Manage, Algorithm Assist…; Monitor, Feedback
• Distilled Insight, Analytic Result Sets, Query Acceleration
• Distributed
• Pauseless GC, New SIMD instructions, 32 TB Memory, Pervasive Encryption
IBM Open Data Analytics for z/OS: Offering Overview
What is in the Offering?
IBM Open Data Analytics for z/OS (IBM product):
• Apache Spark 2.1.1 enabled for z/OS
• Python 3.6.1
• All prerequisite libraries
• Select Anaconda libraries (approx. 250 including pandas, dask, numpy, scikit-learn, matplotlib…)
• Optimized Data Integration Layer: optimized for Spark & Python db access to z/OS data
• Integration with WLM z/OS for resource management aligned with job priority
• Integration with security (SAF) interfaces
• Support & Service available from IBM for a fee
– Very aggressive pricing for zIIPs (cores) and memory for Open Data Analytics z/OS workload
Ecosystem
– GitHub zos-spark repository
• Jupyter Notebooks (Scala, Python Workbenches)
• Kernel gateway, Jupyter client, Toree kernel
• Sample data & code snippets
– Rocket:
• Collaboration for Optimized Data Layer
• Industry vertical mappings, e.g. ISO8583-1, ACH, SMF, etc.
– Continuum:
• Access to z/OS channel on Anaconda cloud for updates / refreshes & package management
• Option to license private mirrored environment
• Services & Consulting for Python
Value: Increase Integration through Persisting Analytic Results for Enterprise Collaboration
Diagram labels:
• Data sources: VSAM, IMS, DB2 z/OS, HDFS
• IBM Open Data Analytics for z/OS: Optimized Data Layer; Apache Spark Core with Spark Streaming, Spark SQL, MLlib, GraphX (DF); Python 3.6.1
• Python core packages: numpy, scikit-learn, dask, pandas, matplotlib, etc.
DF Store:
• Specific Spark & Python Result Sets
• Backed by VSAM
• Leverage z/OS SAF, Dataset mgmt
JDBC / ODBC / REST, noSQL client accessing Spark RDDs, example: Cognos, Tableau, …
CHEAP Call Girls in Ashok Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Ashok Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Ashok Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Último (20)

Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Th...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Th...Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Th...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Th...
 
Call Girls Banashankari Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Banashankari Just Call 👗 7737669865 👗 Top Class Call Girl Service ...Call Girls Banashankari Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Banashankari Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
 
CHEAP Call Girls in Mayapuri (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Mayapuri  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Mayapuri  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Mayapuri (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Shikrapur Call Girls Most Awaited Fun 6297143586 High Profiles young Beautie...
Shikrapur Call Girls Most Awaited Fun  6297143586 High Profiles young Beautie...Shikrapur Call Girls Most Awaited Fun  6297143586 High Profiles young Beautie...
Shikrapur Call Girls Most Awaited Fun 6297143586 High Profiles young Beautie...
 
怎样办理维多利亚大学毕业证(UVic毕业证书)成绩单留信认证
怎样办理维多利亚大学毕业证(UVic毕业证书)成绩单留信认证怎样办理维多利亚大学毕业证(UVic毕业证书)成绩单留信认证
怎样办理维多利亚大学毕业证(UVic毕业证书)成绩单留信认证
 
HLH PPT.ppt very important topic to discuss
HLH PPT.ppt very important topic to discussHLH PPT.ppt very important topic to discuss
HLH PPT.ppt very important topic to discuss
 
Get Premium Pimple Saudagar Call Girls (8005736733) 24x7 Rate 15999 with A/c ...
Get Premium Pimple Saudagar Call Girls (8005736733) 24x7 Rate 15999 with A/c ...Get Premium Pimple Saudagar Call Girls (8005736733) 24x7 Rate 15999 with A/c ...
Get Premium Pimple Saudagar Call Girls (8005736733) 24x7 Rate 15999 with A/c ...
 
VVIP Pune Call Girls Warje (7001035870) Pune Escorts Nearby with Complete Sat...
VVIP Pune Call Girls Warje (7001035870) Pune Escorts Nearby with Complete Sat...VVIP Pune Call Girls Warje (7001035870) Pune Escorts Nearby with Complete Sat...
VVIP Pune Call Girls Warje (7001035870) Pune Escorts Nearby with Complete Sat...
 
Call Girls Pimple Saudagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Pimple Saudagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Pimple Saudagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Pimple Saudagar Call Me 7737669865 Budget Friendly No Advance Booking
 
CHEAP Call Girls in Vinay Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Vinay Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Vinay Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Vinay Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Top Rated Pune Call Girls Katraj ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated  Pune Call Girls Katraj ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Top Rated  Pune Call Girls Katraj ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls Katraj ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
 
Introduction-to-4x4-SRAM-Memory-Block.pptx
Introduction-to-4x4-SRAM-Memory-Block.pptxIntroduction-to-4x4-SRAM-Memory-Block.pptx
Introduction-to-4x4-SRAM-Memory-Block.pptx
 
Vip Mumbai Call Girls Kalyan Call On 9920725232 With Body to body massage wit...
Vip Mumbai Call Girls Kalyan Call On 9920725232 With Body to body massage wit...Vip Mumbai Call Girls Kalyan Call On 9920725232 With Body to body massage wit...
Vip Mumbai Call Girls Kalyan Call On 9920725232 With Body to body massage wit...
 
Deira Dubai Escorts +0561951007 Escort Service in Dubai by Dubai Escort Girls
Deira Dubai Escorts +0561951007 Escort Service in Dubai by Dubai Escort GirlsDeira Dubai Escorts +0561951007 Escort Service in Dubai by Dubai Escort Girls
Deira Dubai Escorts +0561951007 Escort Service in Dubai by Dubai Escort Girls
 
Book Sex Workers Available Pune Call Girls Yerwada 6297143586 Call Hot India...
Book Sex Workers Available Pune Call Girls Yerwada  6297143586 Call Hot India...Book Sex Workers Available Pune Call Girls Yerwada  6297143586 Call Hot India...
Book Sex Workers Available Pune Call Girls Yerwada 6297143586 Call Hot India...
 
(👉Ridhima)👉VIP Model Call Girls Mulund ( Mumbai) Call ON 9967824496 Starting ...
(👉Ridhima)👉VIP Model Call Girls Mulund ( Mumbai) Call ON 9967824496 Starting ...(👉Ridhima)👉VIP Model Call Girls Mulund ( Mumbai) Call ON 9967824496 Starting ...
(👉Ridhima)👉VIP Model Call Girls Mulund ( Mumbai) Call ON 9967824496 Starting ...
 
Develop Keyboard Skill.pptx er power point
Develop Keyboard Skill.pptx er power pointDevelop Keyboard Skill.pptx er power point
Develop Keyboard Skill.pptx er power point
 
(ISHITA) Call Girls Service Aurangabad Call Now 8617697112 Aurangabad Escorts...
(ISHITA) Call Girls Service Aurangabad Call Now 8617697112 Aurangabad Escorts...(ISHITA) Call Girls Service Aurangabad Call Now 8617697112 Aurangabad Escorts...
(ISHITA) Call Girls Service Aurangabad Call Now 8617697112 Aurangabad Escorts...
 
9004554577, Get Adorable Call Girls service. Book call girls & escort service...
9004554577, Get Adorable Call Girls service. Book call girls & escort service...9004554577, Get Adorable Call Girls service. Book call girls & escort service...
9004554577, Get Adorable Call Girls service. Book call girls & escort service...
 
CHEAP Call Girls in Ashok Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Ashok Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Ashok Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Ashok Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 

NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation

  • 1. Data: Spark and the Data Federation. Leif Pedersen, Executive IT Specialist, z Analytics, Europe. Email: Leif.Pedersen@dk.ibm.com. © 2017 IBM Corporation
  • 2. Systems of Insight, Systems of Record, Systems of Engagement: does this look like “déjà vu”?
  • 3. In the new insight economy, winners infuse analytics everywhere to drive better outcomes: create new business models (CEO); attract, grow, retain customers (CMO); transform financial & management processes (CFO); manage risk (CRO); prioritize IT investment for innovation (CIO, CDO); optimize operations (COO); fight fraud and counter threats (CSO). Diagram labels: Systems of Insight, Systems of Record, Systems of Engagement.
  • 4. IBM Analytics Point of View: Make DATA SIMPLE and ACCESSIBLE to ALL. DATA Professionals are leading THE Transformation! Embrace all data; enable all analytics; run at the speed of business. Diagram labels: Business Value; All Data, New Analytics, New Dev Styles, More People.
  • 5. The Evolution in the Approach to Getting Value from Data. Axes: Maturity and Value (low to high). Stages: Operations, Data Warehousing, Self-service Analytics, New Business Imperatives. Lower the Cost of Storage. Ensure resiliency and availability. Warehouse Modernization: data lake; data offload; ETL offload; queryable archive and staging. Data-Informed Decision Making: full dataset analysis (no more sampling); extract value from non-relational data; 360° view of all enterprise data; exploratory analysis and discovery. Business Transformation: create new business models; risk-aware decision making; fight fraud and counter threats; optimize operations; attract, grow, retain customers. Marker: “We are here”.
  • 6. Analytics evolution to support all Analytics Apps on all Data: The Mainframe Use case. Columns: Applications, Data. SoE. SoI: Map / Reduce, HDFS; Spark; Historical data in DB2 for z/OS & IBM DB2 Analytics Accelerator; Other Data; BI Reporting; Data Warehouse / Data Marts. The Data Lake Evolution. SoR: Core Business supported by CICS, IMS, WAS z/OS; Operational Data stored in VSAM, IMS, DB2. The Predictive Analytics Evolution: Score Creation; Machine Learning; Rules; Score execution. IT Operational Data.
  • 7. z Systems Analytics Areas complement existing Analytics Environments. In-transaction rules and score execution; intraday capability for ad-hoc queries & predictive analytics; availability of historical data (in raw format); accelerated reporting to fulfill internal and regulatory requirements; ability to transform data before offload to DWH or reporting; ability to create new models at any time; quasi-real-time availability of data for analytics; instant access to raw data for new report generation in hours instead of days; load and merge of ANY non-DB2 z/OS data; explore data to uncover hidden insights. Diagram labels: IBM DB2 Analytics Accelerator, zData, zApps, Scoring, Rules.
  • 8. The integration of transactions and analytics is an emerging and important market segment. Analytics as part of the flow of business; insights on every transaction. Opportunity to rethink business processes: analytics as an integral part of the process itself, rather than a separate activity performed after the fact. Transform business processes, not just provide existing styles of analytics faster and without latency. Enable business leaders to perform, in the context of operational processes, advanced and sophisticated real-time analysis of their business data. “Hybrid transaction/analytical processing will empower application leaders to innovate via greater situation awareness and improved business agility.” Source: Gartner Research Note G00259033, 28 January 2014: Hybrid Transaction/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation.
  • 9. Hybrid Transaction/Analytical Processing (HTAP) with DB2 Analytics Accelerator. DB2 for z/OS CPU savings target: operational (in-transaction) analytics; (complex) OLTP. Accelerator focus: ad-hoc queries; complex queries scanning large amounts of data; ETL acceleration/virtual transformation. Workload spectrum: OLTP transactions (high concurrency); standard reports; complex queries (more history); OLAP. Processing: DB2 for z/OS and IBM DB2 Analytics Accelerator.
  • 10. Data Warehouse and Data Lake. A Data Lake is: an analytics sandbox for exploring data to gain insight; an enterprise-wide catalog to find data across the enterprise and to link from business term to technical metadata; an environment for enabling reuse of data transformations and queries; an environment where users can access vast amounts of raw data; an environment for developing and proving an analytics model and then moving it into production (experience in production may drive further experimentation in the data lake). A Data Lake is not: a data warehouse or data mart of all of the data in an enterprise; a high-performance production environment; a production reporting application; a purpose-built system to solve a specific problem.
  • 11. Spark, a Transaction Manager for Analytics Applications. Fast Runtime Environment: interactive or batch processing; based on in-memory data processing (high performance for multi-step processes where Spark can pass the data directly without using disk storage); parallel processing. Interface to Data: accessing Hadoop-based HDFS data, Cassandra, HBase, …; accessing any traditional database using JDBC. Interface for Applications: ease-of-use APIs supported by modern languages; stack of libraries including SQL, Machine Learning, GraphX, and Spark Streaming; over 80 high-level operators that make it easy to build parallel applications; many languages supported including Java, Scala, Python and R (Spark is actually written in Scala). Spark is NOT a datastore, NOT a replacement for Hadoop!
  • 12. Why Spark matters to a business? 1. Spark makes it easier to access and work with all data: enables new data-based use cases; all data (internal/external, structured/unstructured); real-time insights from all data sources; automates analytics with Machine Learning; clients that lead in data, lead their industry. 2. Spark lets you develop line-of-business applications faster. 3. Spark learns from data and delivers in real time: with Hadoop, you ask a question and get back a batch of data; with Spark, you may say, “continue to give me answers to this question”, and when new data comes, the user is smarter. Diagram labels: Design, Development, Data Science.
  • 13. Spark can run on z/OS close to z/OS-based Applications & Data. IBM z/OS Platform for Apache Spark: Apache Spark Core, Spark Stream, Spark SQL, MLlib, GraphX (RDD, DF), with optimized data access. Spark Applications: IBM and Partners. Data sources: z/OS key business transaction & batch systems (VSAM, Adabas, IMS, DB2 z/OS), distributed sources (Teradata, HDFS), and *many* more including SMF, OPERLOG, SYSLOGs, . . . Values: data-in-place analytics, without need to ETL or move data for analytic purposes; optimized access and z/OS-governed ‘in-memory’ capabilities for core business data; unique capability to access almost all z/OS sources with Apache Spark SQL & many non-z data sources; almost all zIIP eligible; integration of analytics across core systems, social data, website information, etc.
  • 14. Examples of Spark Use Cases
  • 15. Use Case Patterns. Client Insight (analytics over transactions & customer interactions): leverage data on z/OS (DB2, VSAM) & distributed (Oracle, SQL Server, HDFS) to enable real-time access from data science teams focused on client insight to develop patterns and models. Data Distillation (Hybrid Architecture): run Spark z/OS to access, aggregate, filter and *distill* large volumes of data; make smaller, aggregated analytic results available to customer insight solutions and data science environments. 360 Degree View (Customers, Payments, Transactions): leverage Spark z/OS to get a real-time or near-real-time view of the current status of payments, transactions and customers, combining data from OLTP, distributed sources, & streaming. IT Analytics: analyze real-time streamed SMF data, combined with archived SMF data and syslog data; visualize and interact with a data science Jupyter Notebook to find patterns.
  • 16. Use Case #1: Hybrid Data Science. Distill the data: use Spark z/OS for data blending, cleansing, transformation, etc. with data in place; store results in a ‘Tidy’ Data Repository; refresh as needed. Explore the results: data exploration and investigation leveraging the ‘Tidy’ Repository. Values: leverage most current business data for data science; efficiencies in reducing ETL; leverage common analytics ecosystem skills; integrate Spark on multiple platforms for optimal analytics infrastructure.
  • 17. Use Case #2: Optimized Customer Insight (Customer Insight for Banking Solution). Spark Analytic Result Set: subset of data, distilled, filtered, transformed. Transform (if needed) & populate the BBCI staging area / cache. Input & output: Tidy Data. Values: avoid costly and ineffective wholesale copy of data; frequent refresh of the most relevant data elements to the customer insights solution; faster time to implementation for a business solution to deliver insights on churn, cross-sell, etc. Diagram labels: z/OS; Customer, Transaction, Merchant, Call Center; IBM z/OS Platform for Apache Spark (Apache Spark Core, Spark Stream, Spark SQL, MLlib, GraphX; RDD, DF; Optimized Data Layer); BI Dashboard Components, Data Cube, Analytical Engines, Web Portal, Analytics API Gateway, APIs, Pre-Built Dashboards, Pre-Built Data Models, Pre-Built Analytical Models.
  • 18. Use Case #3: Real-Time Application Event Analytics. CICS event triggers create an event stream that would be captured by Spark running on its own z/OS LPAR; Spark is configured for high availability to avoid impacting CICS. Real-Time Analytics with Spark z/OS: real-time analytics to provide feedback into the Systems of Engagement or Monitoring Systems on types of banking services and frequency of consumption; real-time monitoring of core business processes and applications; real-time analytics can include scoring. Historical Analysis leverages IDAA: batch load of events for historical analysis, trending and reporting. Diagram labels: System of Engagement; CICS Transactions; Monitor; Event Stream; Logstream; Channel; Spark z/OS; Real-Time Consumption; DB2 Analytics Accelerator Loader; Batch Load Overnight; DB2 z/OS; IBM DB2 Analytics Accelerator; Historical Analysis, Reporting.
  • 19. Use Case #4: Surface Spark Results to JDBC / ODBC Applications. DF Store: persist specific Spark result sets; backed by VSAM; leverage z/OS SAF, dataset mgmt. JDBC / ODBC / REST, noSQL client accessing Spark RDDs, example: Cognos, Tableau, … Diagram labels: z/OS; Apache Spark Core, Spark Stream, Spark SQL, MLlib, GraphX; DF, RDD; Optimized Data Layer; DB2 z/OS, IMS, VSAM, HDFS.
  • 20. Use Case #5: Analyzing SMF Data with Spark. The Spark application is agnostic to data source and number of sources. MDSS is required on at least one system, and MDSS agents are required on all systems; no IPL is required for installation. Logstream recording mode is required for real-time interfaces. Analyze real-time in-memory SMF data, combined with archived data; analyze data across multiple LPARs; augment with SYSLOG and other sources for a richer analytic outcome; efficiencies in avoiding data movement. Diagram labels: Spark Application using SparkSQL; JDBC; Optimized Data Integration Layer (MDSS); MDSS Client on LPAR1, LPAR2, LPAR3, … LPARn, each with SMF Realtime and Logstreams; Dump Data Sets.
  • 21. Use Cases for Real Time SMF Analytics. Detect excessive memory consumption (SMF 30): monitor the high-water mark for real memory usage for jobs and send alerts if usage exceeds normal consumption. Detect security violations in real time (SMF 80): monitor the volume of datasets/files accessed per user within a given time period and raise alerts for above-normal access rates. Real-time monitoring of resource usage in cloud environments (CPU, memory, disk). A list of supported SMF record types can be found in the Redbook “Apache Spark Implementation on IBM z/OS”, page 78: http://www.redbooks.ibm.com/abstracts/sg248325.html
  • 22. IBM Open Data Analytics for z/OS
  • 23. Leveraging IBM Z for Optimized Analytics: federate analytics leveraging data in place for more current insights at scale, optimized security, privacy and reduced costs. Components: IBM Open Data Analytics for z/OS; IBM Machine Learning for z/OS; Optimized Data Integration Layer. Machine learning flow: Data, Data Prep, ML Algo, Model, Deploy, Predict; Govern, Manage, Algorithm Assist…; Monitor, Feedback. Platform: Pauseless GC; new SIMD instructions; 32 TB memory; Pervasive Encryption. Diagram labels: Business Applications; Customer, Transaction, Merchant; Distributed Apache Spark; Python; Distilled Insight; Analytic Result Sets; Query Acceleration.
  • 24. IBM Open Data Analytics for z/OS: Offering Overview. What is in the offering? IBM Open Data Analytics for z/OS (IBM product): Apache Spark 2.1.1 enabled for z/OS; Python 3.6.1; all prerequisite libraries; select Anaconda libraries (approx. 250 including pandas, dask, numpy, scikit-learn, matplotlib…); Optimized Data Integration Layer, optimized for Spark & Python db access to z/OS data; integration with WLM z/OS for resource management aligned with job priority; integration with security (SAF) interfaces; support & service available from IBM for a fee (very aggressive pricing for zIIPs (cores) and memory for Open Data Analytics z/OS workload). Ecosystem. GitHub zos-spark repository: Jupyter Notebooks (Scala, Python workbenches); kernel gateway, Jupyter client, Toree kernel; sample data & code snippets. Rocket: collaboration for Optimized Data Layer; industry vertical mappings, e.g. ISO8583-1, ACH, SMF, etc. Continuum: access to z/OS channel on Anaconda cloud for updates / refreshes & package management; option to license a private mirrored environment; services & consulting for Python.
  • 25. Value: Increase Integration through Persisting Analytic Results for Enterprise Collaboration. DF Store: specific Spark & Python result sets; backed by VSAM; leverage z/OS SAF, dataset mgmt. Python 3.6.1 core packages: numpy, scikit-learn, dask, pandas, Matplotlib, etc. JDBC / ODBC / REST, noSQL client accessing Spark RDDs, example: Cognos, Tableau, … Diagram labels: IBM Open Data Analytics for z/OS; z/OS; Apache Spark Core, Spark Stream, Spark SQL, MLlib, GraphX; DF; Optimized Data Layer; VSAM, IMS, DB2 z/OS, HDFS.
  • 26. © 2017 IBM Corporation
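
The two real-time SMF use cases on slide 21 (SMF 30 memory high-water marks and SMF 80 access rates) can be illustrated with a minimal sketch. It is written in plain Python rather than Spark so that it runs anywhere; in the setup the deck describes, the records would instead arrive through SparkSQL over the optimized data integration layer. The record shapes, the function names, and both alerting rules (mean plus k standard deviations, and a fixed count per sliding window) are assumptions made for illustration, not something the slides specify.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


def memory_alerts(samples, min_history=3, k=3.0):
    """Yield (job, value) when a job's real-memory high-water mark looks abnormal.

    samples: iterable of (job_name, high_water_mark_mb) in arrival order.
    Assumed rule: a value is abnormal when it exceeds mean + k * stddev of
    the earlier values seen for the same job.
    """
    history = defaultdict(list)
    for job, hwm in samples:
        past = history[job]
        if len(past) >= min_history:
            threshold = mean(past) + k * pstdev(past)
            if hwm > threshold:
                yield job, hwm
        # Every observation, normal or not, joins the job's history.
        past.append(hwm)


def access_rate_alerts(events, window_seconds=60, max_accesses=5):
    """Yield (user, count) when a user exceeds max_accesses within the window.

    events: iterable of (timestamp_seconds, user), sorted by timestamp.
    """
    recent = defaultdict(deque)
    for ts, user in events:
        window = recent[user]
        window.append(ts)
        # Drop accesses that have fallen out of the sliding window.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        if len(window) > max_accesses:
            yield user, len(window)


# Hypothetical SMF 30-style samples: (job name, real-memory high-water mark in MB).
samples = [
    ("PAYROLL", 100), ("PAYROLL", 110), ("PAYROLL", 105),
    ("PAYROLL", 400), ("BILLING", 50),
]
print(list(memory_alerts(samples)))  # → [('PAYROLL', 400)]
```

Both functions keep only per-key state (a list of past values or a short queue of timestamps), which is the kind of logic that would map onto a grouped or windowed aggregation if the same checks were expressed in Spark.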