Building Your Enterprise Data
Marketplace with DMX-h
Jennifer Cheplick
Sr. Director, Product Marketing
Today’s agenda
• The Need for an Enterprise Data Marketplace
• Attributes of a Successful Enterprise Data Marketplace
• Building an Enterprise Data Marketplace
• Potential Roadblocks
• How Syncsort Helps
Data Growth
2.5Q (quintillion) bytes of data created every day
90% of the world’s data generated in the past two years alone
200B smart devices projected by 2020
Data Delivers
Competitive
Advantage
“Compared with their
peers, high performers
report a greater variety
of actions to monetize
data – with greater
revenue impact”
- McKinsey Global Survey: Fueling growth
through data monetization
Enterprise Data Marketplace4
73.2%
Percentage of executives
whose firms have
achieved measurable
results from Big Data
and AI investments
- NewVantage Partners Big Data Executive
Survey 2018
$1.8 Trillion
Projected annual
revenue for insights-
driven businesses by
2021
- “Insights-Driven Businesses Set the Pace
for Global Growth,” Forrester, October 19,
2018
85%
Firms that leverage
customer behavioral
insights outperform peers
by 85 percent in sales
growth and 25 percent in
gross margin
- McKinsey Global Survey: Capturing value
from your customer data
Promise of a Data-Driven Culture
ACCURATE ANALYTICS & FASTER TIME-TO-VALUE
▪ Reduce bias, uncertainty, and misunderstanding
▪ Uncover new, previously inaccessible insights
▪ Accelerate speed of organizational decision-making
TARGETED MARKETING & REVENUE GROWTH
▪ Gain the most accurate, in-depth view of your customers
▪ Monitor and respond to customer activity in real-time
REDUCED RISK & COMPLIANCE WITH CONFIDENCE
▪ Ensure confidence in regulatory reporting
▪ Identify and manage risk more quickly and completely
OPERATIONAL EFFICIENCY & COST REDUCTION
▪ Minimize time spent on manual data preparation
▪ Ensure accuracy of global operations and supply chain
• Data has outgrown the
data warehouse
• Data lakes can be
polluted and chaotic
• Data is inconsistent
across data marts
• Every part of the
business demands
sophisticated data
analysis
• Departments need
access to the
company’s many
data sets,
combined in
different ways
• IT can’t be a
bottleneck
But most
organizations
are not getting
the full value of
their data
91% of organizations
have not yet reached
a “transformational”
level of maturity in
data and analytics
- Gartner
68% of IT professionals
state that data silos
negatively impact their
organization’s ability
to get value from their
data
The Rise of
The Enterprise
Data
Marketplace
• Enables data-driven
organizations
• Analytics teams and
business users can shop
and find the data they
need
• Data can be combined
for ever-expanding
applications
Overcomes the
limitations of previous
solutions to deliver
the best of each, in
one central repository
• Volume and variety of
the data lake
• Veracity and auditability
of the data warehouse
• Velocity and specificity
of purpose of the data
mart
Enables data-driven
organizations
Enterprise Data
Marketplace
Attributes:
Reliability
Provides a centralized location for
curated, trusted data that is:
• Clean
• Standardized
• Verified
Guardian Life Insurance
needed to enable Machine
Learning, visualization and BI
on broad range of datasets,
and reduce time-to-market for
analytics projects.
• Reduce data preparation,
transformation times
• Make data assets available to
whole enterprise – including
Mainframe data
Data Marketplace –
centralized, reusable, up-to-the-
minute current, searchable,
accessible, managed,
trustworthy data for analytics
Fast Time-to-Market
for new analytics and reporting
Enterprise Data
Marketplace
Attributes:
Flexibility
Pulls data from across the
enterprise and lets users pick
and choose the data they need,
depending on what they want to
accomplish.
Progressive Insurance needs
cost-effective, easily accessible
operational data – including
Claims Liability, Policy,
Customer, Incident and more –
for advanced analytics
• Data marketplace includes 50
data sources
• More are added as business
needs evolve
Better Analytics – with readily
accessible, up-to-date data.
Fast Analytics Time-to-Market –
Data available in hours not days.
Audit Trails for Compliance
while keeping the EDW current
Low Archival Costs
Enterprise Data
Marketplace
Attributes:
Availability
Empower analytics teams to create
new data schemas on their own
• The right data sets are available
• Data is always up to date and
ready for various types of
analytics
• Removes wait times and IT
bottlenecks
Analysts at Symphony
Health no longer wait for
requests for specific data
schemas, or data subsets,
to work their way through
the IT team’s queue
“Before, part of the
data wasn’t available
for a day, and other
parts, not for a week.
Now it’s all available
for analysis within
minutes of the data
arriving.”
Robert Hathaway
Senior Manager Big Data
Today’s agenda
• The Need for an Enterprise Data Marketplace
• Attributes of a Successful Enterprise Data Marketplace
• Building an Enterprise Data Marketplace
• Potential Roadblocks
• How Syncsort Helps
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Access & Onboard – Elect to include data to understand
• What you don’t know CAN hurt you – e.g. bias
• If you’ve left it out, you cannot know it exists
• Data sets have more power to predict when combined
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Refined Zone
Refine – cleanse, enrich, de-duplicate
• What data needs refinement? – use cases will determine
• Each data set should be refined once – don’t repeat work
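As a rough illustration of the refine step, here is a minimal Python sketch. It assumes hypothetical records with `name` and `email` fields and treats the email address as the duplicate key; the slides do not specify a schema, and in the deck this work is done with Syncsort tooling rather than hand-written code.

```python
def cleanse(record):
    """Standardize one raw record: collapse whitespace, normalize case."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def refine(raw_records):
    """Cleanse every record, then keep the first record seen per email."""
    seen = set()
    refined = []
    for record in map(cleanse, raw_records):
        if record["email"] not in seen:
            seen.add(record["email"])
            refined.append(record)
    return refined


# Hypothetical raw landing zone contents (illustrative only).
raw = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "alan turing", "email": "alan@example.com"},
]
refined = refine(raw)
```

Writing `refined` to the refined zone once, and pointing every consumer at it, is one way to honor the "refine once" rule: no downstream team repeats the cleansing.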
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Refined Zone
Track Provenance
• Data lineage documentation is necessary for establishing that data
can be trusted, and for auditing and regulatory compliance
• Also useful for reproducing steps in production machine
learning data pipelines
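One way to picture provenance tracking is a log that records every transformation together with a fingerprint of its input and output. The sketch below is an assumption-laden illustration, not how DMX-h records lineage: the step names, the `claims_extract.csv` source label, and the record layout are all hypothetical.

```python
import hashlib
import json


def fingerprint(rows):
    """Stable SHA-256 digest of a list of JSON-serializable rows."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_with_lineage(rows, steps, source):
    """Apply each (name, function) step in order and log what happened."""
    lineage = [{"step": "source", "detail": source, "output": fingerprint(rows)}]
    for name, func in steps:
        before = fingerprint(rows)
        rows = func(rows)
        lineage.append({"step": name, "input": before, "output": fingerprint(rows)})
    return rows, lineage


# Hypothetical refinement steps (illustrative only).
steps = [
    ("drop_missing_amount", lambda rows: [r for r in rows if r["amount"] is not None]),
    ("cents_to_dollars", lambda rows: [{**r, "amount": r["amount"] / 100} for r in rows]),
]
raw = [{"id": 1, "amount": 1250}, {"id": 2, "amount": None}]
result, lineage = run_with_lineage(raw, steps, source="claims_extract.csv")
```

Because each entry's input fingerprint equals the previous entry's output fingerprint, an auditor can check that no undocumented change happened between steps, and replaying the same steps on the same source reproduces the same log.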
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Refined Zone
Shop for data sets, features & validate against your questions
• Analyst, data scientist shops for data
• What do I need for my purpose?
• Quality is already assured, provenance documented
• Improves trust, saves time
5 Potential Roadblocks to Building Your
Enterprise Data Marketplace
Siloed, Hard-to-Reach Datasets
• Can be trapped in hard-to-reach systems like mainframes, etc.
• Found in streams from POS, web clicks, etc.
• Incompatible formats, making it difficult to gather and prepare the data for model training.
Data Cleansing at Scale
• Cleanse, enrich, de-duplicate
• What data needs refinement? – use cases will determine
• Each data set should be refined once – don’t repeat work
Tracking Lineage from the Source
• Complete lineage must be captured from source to end point, across systems.
• Data changes made to help train models must be duplicated exactly in production so that models make accurate predictions on new data, and to support required audit trails.
Entity Resolution
• Matching records across massive datasets that refer to a single specific entity (person, company, product, etc.)
• Requires sophisticated multi-field matching algorithms and a lot of compute power.
Ongoing Real-Time Changed Data Capture
• Tracking and detection need to happen very rapidly.
• Current transactions need to be constantly added to combined datasets, prepared and presented to models as close to real-time as possible.
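To make the changed data capture roadblock concrete, here is a minimal Python sketch of applying an ordered stream of change events to a keyed dataset. The event layout (`op`, `key`, `row`) and the policy records are hypothetical; real capture tools read changes from database logs and handle ordering, failures, and scale that this sketch ignores.

```python
def apply_changes(current, changes):
    """Return a new keyed dataset with ordered change events applied."""
    updated = dict(current)
    for change in changes:
        key = change["key"]
        if change["op"] == "delete":
            updated.pop(key, None)
        elif change["op"] in ("insert", "update"):
            updated[key] = change["row"]
        else:
            raise ValueError(f"unknown operation: {change['op']}")
    return updated


# Hypothetical policy data and change events (illustrative only).
policies = {"P1": {"status": "active"}, "P2": {"status": "active"}}
changes = [
    {"op": "update", "key": "P1", "row": {"status": "lapsed"}},
    {"op": "delete", "key": "P2"},
    {"op": "insert", "key": "P3", "row": {"status": "active"}},
]
latest = apply_changes(policies, changes)
```

Applying only the changes, instead of reloading the full source, is what keeps the combined dataset close to real time as transaction volumes grow.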
Today’s agenda
• The Need for an Enterprise Data Marketplace
• Attributes of a Successful Enterprise Data Marketplace
• Building an Enterprise Data Marketplace
• Potential Roadblocks
• How Syncsort Helps
Build Your Enterprise Data Marketplace with Syncsort
Onboard ALL
enterprise
data.
Access
Join, transform,
cleanse, de-
duplicate batch
or streaming
data.
Integrate
Secure, govern,
manage and
monitor
everything.
Comply
Design once,
deploy anywhere.
Simplify
Simplify Big Data Integration with Syncsort
Simplify Big Data Integration
Onboard ALL
enterprise
data.
Access
Access & Integrate ALL Enterprise Data – Mainframe to Streaming
Data Sources
Onboard data, modify
on-the-fly to match
Hadoop storage model,
or store unchanged for
archive and compliance.
Access data from
streaming and batch
sources outside
cluster.
Cluster or Cloud
Data
Refine, transform, join,
cleanse, enhance
data in cluster or Cloud
with MapReduce,
EMR, or Spark.
Comply: Govern and Track Everything for Compliance
• Metadata and data lineage for Hive, Avro and Parquet
through HCatalog
• Metadata lineage export and API from DMX/DMX-h
• Simplify audits, analytics dashboards, metrics
• Integrate with enterprise metadata repositories
• Cloudera Navigator certified integration
• Track lineage from source – even changes made off cluster
• HDFS, YARN, Spark and other metadata
• Lineage, tagging
• Business and structural metadata
• Apache Atlas ingestion lineage integration
• Lineage, tagging
• Track lineage from source – even changes made off cluster
DMX-h
Comply: Secure the Entire Process
• Native Kerberos and LDAP support
• Kerberos-secured clusters
• Authenticated browsing
• Authenticated sampling
• Security certified
• Apache Ranger
• Apache Sentry
• FTPS, Connect:Direct secure data transfers
DMX-h
Simplify: Design Once, Deploy Anywhere
Intelligent Execution - Insulate your organization from underlying complexities of Hadoop.
Get excellent performance every time
without tuning, load balancing, etc.
No re-design, re-compile, no re-work ever
• Future-proof job designs for emerging
compute frameworks, e.g. Spark 2.x
• Move from dev to test to production
• Move from on-premise to Cloud
• Move from one Cloud to another
Use existing ETL skills
No parallel programming – Java, MapReduce, Spark …
No worries about:
• Mappers, Reducers
• Big side or small side of joins …
Design Once
in visual GUI
Deploy Anywhere!
On-Premise,
Cloud
MapReduce, Spark,
Future Platforms
Windows, Unix,
Linux
Batch,
Streaming
Single Node,
Cluster
Trillium Quality for Big Data – Data Cleansing at Scale
Boost effectiveness of machine learning, AI with complete, standardized data.
1. Visually create and test data
quality processes locally
2. Execute in MapReduce or Spark
On premise or in the Cloud
Build Your
Enterprise
Data
Marketplace
with Syncsort
“Ingestion has
gone from
days to hours”
- Progressive Big Data Tech
Lead
“DMX-h is already
optimized. We use
its Intelligent
Execution and it
just performs.”
- Robert Hathaway
Senior Manager Big Data,
Symphony Health
“We found DMX-h
to be very usable
and easy to ramp
up in terms of
skills. Most of all,
Syncsort has been
a very good
partner in terms
of support and
listening to our
needs.”
- Alex Rosenthal, Enterprise
Data Office, Guardian Life
Insurance
Visit
www.syncsort.com
to learn more
Enterprise Data Marketplace26

Más contenido relacionado

La actualidad más candente

Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data Governance
DATAVERSITY
 

La actualidad más candente (20)

Data Profiling, Data Catalogs and Metadata Harmonisation
Data Profiling, Data Catalogs and Metadata HarmonisationData Profiling, Data Catalogs and Metadata Harmonisation
Data Profiling, Data Catalogs and Metadata Harmonisation
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - Snowflake
 
Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data Governance
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
Data Governance
Data GovernanceData Governance
Data Governance
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best Practices
 
Data-Ed Webinar: Data Quality Success Stories
Data-Ed Webinar: Data Quality Success StoriesData-Ed Webinar: Data Quality Success Stories
Data-Ed Webinar: Data Quality Success Stories
 
Rethinking Trust in Data
Rethinking Trust in Data Rethinking Trust in Data
Rethinking Trust in Data
 
Preparing a data migration plan: A practical guide
Preparing a data migration plan: A practical guidePreparing a data migration plan: A practical guide
Preparing a data migration plan: A practical guide
 
Actionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceActionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data Science
 
Data Marketplace and the Role of Data Virtualization
Data Marketplace and the Role of Data VirtualizationData Marketplace and the Role of Data Virtualization
Data Marketplace and the Role of Data Virtualization
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data Intelligence
 
Data Catalog as a Business Enabler
Data Catalog as a Business EnablerData Catalog as a Business Enabler
Data Catalog as a Business Enabler
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Strategic Approach To Data Migration Project Plan
Strategic Approach To Data Migration Project PlanStrategic Approach To Data Migration Project Plan
Strategic Approach To Data Migration Project Plan
 
Data Strategy
Data StrategyData Strategy
Data Strategy
 
You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 

Similar a Building Your Enterprise Data Marketplace with DMX-h

Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Vishal Bamba
 

Similar a Building Your Enterprise Data Marketplace with DMX-h (20)

How to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics CloudHow to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics Cloud
 
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
 
Deliveinrg explainable AI
Deliveinrg explainable AIDeliveinrg explainable AI
Deliveinrg explainable AI
 
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
Assessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use CasesAssessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use Cases
 
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
 
How 360 Degree Data Integration Enables the Customer-centric Business
How 360 Degree Data Integration Enables the Customer-centric BusinessHow 360 Degree Data Integration Enables the Customer-centric Business
How 360 Degree Data Integration Enables the Customer-centric Business
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Foundational Strategies for Trust in Big Data Part 3: Data Lineage
Foundational Strategies for Trust in Big Data Part 3: Data LineageFoundational Strategies for Trust in Big Data Part 3: Data Lineage
Foundational Strategies for Trust in Big Data Part 3: Data Lineage
 
Data Mashups for Analytics
Data Mashups for AnalyticsData Mashups for Analytics
Data Mashups for Analytics
 
Data Mashups for Analytics
Data Mashups for AnalyticsData Mashups for Analytics
Data Mashups for Analytics
 
MT101 Dell OCIO: Delivering data and analytics in real time
MT101 Dell OCIO:  Delivering data and analytics in real timeMT101 Dell OCIO:  Delivering data and analytics in real time
MT101 Dell OCIO: Delivering data and analytics in real time
 
CHAPTER 2.ppt
CHAPTER 2.pptCHAPTER 2.ppt
CHAPTER 2.ppt
 
Data mining wrhousing-lec
Data mining wrhousing-lecData mining wrhousing-lec
Data mining wrhousing-lec
 
ADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and ComparisonADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and Comparison
 
Turning Big Data into Better Business Outcomes
Turning Big Data into Better Business OutcomesTurning Big Data into Better Business Outcomes
Turning Big Data into Better Business Outcomes
 
How PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
How PepsiCo's Big Data Strategy is Disrupting CPG Retail AnalyticsHow PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
How PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
 
Increasing Agility Through Data Virtualization
Increasing Agility Through Data VirtualizationIncreasing Agility Through Data Virtualization
Increasing Agility Through Data Virtualization
 

Más de Precisely

How to Build Data Governance Programs That Last - A Business-First Approach.pdf
How to Build Data Governance Programs That Last - A Business-First Approach.pdfHow to Build Data Governance Programs That Last - A Business-First Approach.pdf
How to Build Data Governance Programs That Last - A Business-First Approach.pdf
Precisely
 
Zukuntssichere SAP Prozesse dank automatisierter Massendaten
Zukuntssichere SAP Prozesse dank automatisierter MassendatenZukuntssichere SAP Prozesse dank automatisierter Massendaten
Zukuntssichere SAP Prozesse dank automatisierter Massendaten
Precisely
 
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
Precisely
 
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3fTestjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
Precisely
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
Precisely
 
Moving IBM i Applications to the Cloud with AWS and Precisely
Moving IBM i Applications to the Cloud with AWS and PreciselyMoving IBM i Applications to the Cloud with AWS and Precisely
Moving IBM i Applications to the Cloud with AWS and Precisely
Precisely
 
Automate Your Master Data Processes for Shared Service Center Excellence
Automate Your Master Data Processes for Shared Service Center ExcellenceAutomate Your Master Data Processes for Shared Service Center Excellence
Automate Your Master Data Processes for Shared Service Center Excellence
Precisely
 

Más de Precisely (20)

How to Build Data Governance Programs That Last - A Business-First Approach.pdf
How to Build Data Governance Programs That Last - A Business-First Approach.pdfHow to Build Data Governance Programs That Last - A Business-First Approach.pdf
How to Build Data Governance Programs That Last - A Business-First Approach.pdf
 
Zukuntssichere SAP Prozesse dank automatisierter Massendaten
Zukuntssichere SAP Prozesse dank automatisierter MassendatenZukuntssichere SAP Prozesse dank automatisierter Massendaten
Zukuntssichere SAP Prozesse dank automatisierter Massendaten
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Crucial Considerations for AI-ready Data.pdf
Crucial Considerations for AI-ready Data.pdfCrucial Considerations for AI-ready Data.pdf
Crucial Considerations for AI-ready Data.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Justifying Capacity Managment Webinar 4/10
Justifying Capacity Managment Webinar 4/10Justifying Capacity Managment Webinar 4/10
Justifying Capacity Managment Webinar 4/10
 
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
 
Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...
Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...
Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...
 
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3fTestjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
 
Data Innovation Summit: Data Integrity Trends
Data Innovation Summit: Data Integrity TrendsData Innovation Summit: Data Integrity Trends
Data Innovation Summit: Data Integrity Trends
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
Optimisez la fonction financière en automatisant vos processus SAP
Optimisez la fonction financière en automatisant vos processus SAPOptimisez la fonction financière en automatisant vos processus SAP
Optimisez la fonction financière en automatisant vos processus SAP
 
SAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
SAPS/4HANA Migration - Transformation-Management + nachhaltige InvestitionenSAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
SAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
 
Automatisierte SAP Prozesse mit Hilfe von APIs
Automatisierte SAP Prozesse mit Hilfe von APIsAutomatisierte SAP Prozesse mit Hilfe von APIs
Automatisierte SAP Prozesse mit Hilfe von APIs
 
Moving IBM i Applications to the Cloud with AWS and Precisely
Moving IBM i Applications to the Cloud with AWS and PreciselyMoving IBM i Applications to the Cloud with AWS and Precisely
Moving IBM i Applications to the Cloud with AWS and Precisely
 
Effective Security Monitoring for IBM i: What You Need to Know
Effective Security Monitoring for IBM i: What You Need to KnowEffective Security Monitoring for IBM i: What You Need to Know
Effective Security Monitoring for IBM i: What You Need to Know
 
Automate Your Master Data Processes for Shared Service Center Excellence
Automate Your Master Data Processes for Shared Service Center ExcellenceAutomate Your Master Data Processes for Shared Service Center Excellence
Automate Your Master Data Processes for Shared Service Center Excellence
 
5 Keys to Improved IT Operation Management
5 Keys to Improved IT Operation Management5 Keys to Improved IT Operation Management
5 Keys to Improved IT Operation Management
 
Unlock Efficiency With Your Address Data Today For a Smarter Tomorrow
Unlock Efficiency With Your Address Data Today For a Smarter TomorrowUnlock Efficiency With Your Address Data Today For a Smarter Tomorrow
Unlock Efficiency With Your Address Data Today For a Smarter Tomorrow
 
Navigating Cloud Trends in 2024 Webinar Deck
Navigating Cloud Trends in 2024 Webinar DeckNavigating Cloud Trends in 2024 Webinar Deck
Navigating Cloud Trends in 2024 Webinar Deck
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Último (20)

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Building Your Enterprise Data Marketplace with DMX-h

  • 1. Building Your Enterprise Data Marketplace with DMX-h Jennifer Cheplick Sr. Director, Product Marketing
  • 2. Today’s agenda • The Need for an Enterprise Data Marketplace • Attributes of a Successful Enterprise Data Marketplace • Building an Enterprise Data Marketplace • Potential Roadblocks • How Syncsort Helps
  • 3. Data Growth: 2.5Q (quintillion) bytes of data created every day; 90% of the world’s data generated in the past two years alone; 200B smart devices projected by 2020
  • 4. Data Delivers Competitive Advantage. “Compared with their peers, high performers report a greater variety of actions to monetize data – with greater revenue impact” - McKinsey Global Survey: Fueling growth through data monetization. 73.2%: Percentage of executives whose firms have achieved measurable results from Big Data and AI investments - NewVantage Partners Big Data Executive Survey 2018. $1.8 Trillion: Projected annual revenue for insights-driven businesses by 2021 - “Insights-Driven Businesses Set the Pace for Global Growth,” Forrester, October 19, 2018. 85%: Firms that leverage customer behavioral insights outperform peers by 85 percent in sales growth and 25 percent in gross margin - McKinsey Global Survey: Capturing value from your customer data
  • 5. Promise of a Data-Driven Culture. ACCURATE ANALYTICS & FASTER TIME-TO-VALUE ▪ Reduce bias, uncertainty, and misunderstanding ▪ Uncover new, previously inaccessible insights ▪ Accelerate speed of organizational decision-making. TARGETED MARKETING & REVENUE GROWTH ▪ Gain the most accurate, in-depth view of your customers ▪ Monitor and respond to customer activity in real-time. REDUCED RISK & COMPLIANCE WITH CONFIDENCE ▪ Ensure confidence in regulatory reporting ▪ Identify and manage risk more quickly and completely. OPERATIONAL EFFICIENCY & COST REDUCTION ▪ Minimize time spent on manual data preparation ▪ Ensure accuracy of global operations and supply chain
  • 6. But most organizations are not getting the full value of their data • Data has outgrown the data warehouse • Data lakes can be polluted and chaotic • Data is inconsistent across data marts • Every part of the business demands sophisticated data analysis • Departments need access to the company’s many data sets, combined in different ways • IT can’t be a bottleneck. 91% of organizations have not yet reached a “transformational” level of maturity in data and analytics - Gartner. 68% of IT professionals state that data silos negatively impact their organization’s ability to get value from their data
  • 7. The Rise of The Enterprise Data Marketplace. Enables data-driven organizations • Analytics teams and business users can shop and find the data they need • Data can be combined for ever-expanding applications. Overcomes the limitations of previous solutions to deliver the best of each, in one central repository • Volume and variety of the data lake • Veracity and auditability of the data warehouse • Velocity and specificity of purpose of the data mart
  • 8. Enterprise Data Marketplace Attributes: Reliability. Provides a centralized location for curated, trusted data that is: • Clean • Standardized • Verified. Guardian Life Insurance needed to enable Machine Learning, visualization and BI on a broad range of datasets, and reduce time-to-market for analytics projects. • Reduce data preparation and transformation times • Make data assets available to the whole enterprise, including Mainframe data. Data Marketplace – centralized, reusable, up-to-the-minute current, searchable, accessible, managed, trustworthy data for analytics. Fast Time-to-Market for new analytics and reporting
  • 9. Enterprise Data Marketplace Attributes: Flexibility. Pulls data from across the enterprise and allows users to pick and choose the data they need, depending on what they want to accomplish. Progressive Insurance needs cost-effective, easily accessible operational data – including Claims Liability, Policy, Customer, Incident and more – for advanced analytics • Data marketplace includes 50 data sources • More are added as business needs evolve. Better Analytics – with readily accessible, up-to-date data. Fast Analytics Time-to-Market – Data available in hours not days. Audit Trails for Compliance while keeping the EDW current. Low Archival Costs
  • 10. Enterprise Data Marketplace Attributes: Availability. Empower analytics teams to create new data schemas on their own • The right data sets are available • Data is always up to date and ready for various types of analytics • Removes wait times and IT bottlenecks. Analysts at Symphony Health no longer wait for requests for specific data schemas, or data subsets, to work their way through the IT team’s queue. “Before, part of the data wasn’t available for a day, and other parts, not for a week. Now it’s all available for analysis within minutes of the data arriving.” - Robert Hathaway, Senior Manager Big Data
  • 11. Today’s agenda • The Need for an Enterprise Data Marketplace • Attributes of a Successful Enterprise Data Marketplace • Building an Enterprise Data Marketplace • Potential Roadblocks • How Syncsort Helps
  • 12. Building an Enterprise Data Marketplace (Data Lake or Cloud: Raw Landing Zone). Access & Onboard – Elect to include data to understand • What you don’t know CAN hurt you – e.g. bias • If you’ve left it out, you cannot know it exists • Data sets have more power to predict when combined
  • 13. Building an Enterprise Data Marketplace (Data Lake or Cloud: Raw Landing Zone, Refined Zone). Refine – cleanse, enrich, de-duplicate • What data needs refinement? – use cases will determine • Each data set should be refined once – don’t repeat work
  • 14. Building an Enterprise Data Marketplace (Data Lake or Cloud: Raw Landing Zone, Refined Zone). Track Provenance • Data lineage documentation is necessary for establishing that data can be trusted, and for auditing and regulatory compliance • Also useful for reproducing steps in production machine learning data pipelines
  • 15. Building an Enterprise Data Marketplace (Data Lake or Cloud: Raw Landing Zone, Refined Zone). Shop for data sets, features & validate against your questions • Analysts and data scientists shop for data • What do I need for my purpose? • Quality is already assured, provenance documented • Improves trust, saves time
  • 16. 5 Potential Roadblocks to Building Your Enterprise Data Marketplace. Siloed, Hard to Reach Datasets • Can be trapped in hard-to-reach systems like mainframes, etc. • Found in streams from POS, web clicks, etc. • Incompatible formats, making it difficult to gather and prepare the data for model training. Data Cleansing at Scale • Cleanse, enrich, de-duplicate • What data needs refinement? – use cases will determine • Each data set should be refined once – don’t repeat work. Tracking Lineage from the Source • Capture of complete lineage, from source to end point, across systems, is needed. • Data changes made to help train models have to be exactly duplicated in production, in order for models to accurately make predictions on new data, and for required audit trails. Entity Resolution • Matching records across massive datasets that indicate a single specific entity (person, company, product, etc.) • Requires sophisticated multi-field matching algorithms and a lot of compute power. Ongoing Real-Time Changed Data Capture • Tracking and detection need to happen very rapidly. • Current transactions need to be constantly added to combined datasets, prepared and presented to models as close to real-time as possible.
  • 17. Today’s agenda • The Need for an Enterprise Data Marketplace • Attributes of a Successful Enterprise Data Marketplace • Building an Enterprise Data Marketplace • Potential Roadblocks • How Syncsort Helps
  • 18. Build Your Enterprise Data Marketplace with Syncsort. Access: Onboard ALL enterprise data. Integrate: Join, transform, cleanse, de-duplicate batch or streaming data. Comply: Secure, govern, manage and monitor everything. Simplify: Design once, deploy anywhere.
  • 19. Simplify Big Data Integration with Syncsort. Access: Onboard ALL enterprise data.
  • 20. Access & Integrate ALL Enterprise Data – Mainframe to Streaming. Data Sources: Onboard data, modify on-the-fly to match the Hadoop storage model, or store unchanged for archive and compliance. Access data from streaming and batch sources outside the cluster. Cluster or Cloud Data: Refine, transform, join, cleanse, enhance data in cluster or Cloud with MapReduce, EMR, or Spark.
  • 21. Comply: Govern and Track Everything for Compliance • Metadata and data lineage for Hive, Avro and Parquet through HCatalog • Metadata lineage export and API from DMX/DMX-h • Simplify audits, analytics dashboards, metrics • Integrate with enterprise metadata repositories • Cloudera Navigator certified integration: track lineage from source, even changes made off cluster; HDFS, YARN, Spark and other metadata; lineage, tagging; business and structural metadata • Apache Atlas ingestion lineage integration: lineage, tagging; track lineage from source, even changes made off cluster
  • 22. Comply: Secure the Entire Process • Native Kerberos and LDAP support • Kerberos-secured clusters • Authenticated browsing • Authenticated sampling • Security certified • Apache Ranger • Apache Sentry • FTPS, Connect:Direct secure data transfers
  • 23. Simplify: Design Once, Deploy Anywhere. Intelligent Execution - Insulate your organization from underlying complexities of Hadoop. Get excellent performance every time without tuning, load balancing, etc. No re-design, no re-compile, no re-work ever • Future-proof job designs for emerging compute frameworks, e.g. Spark 2.x • Move from dev to test to production • Move from on-premise to Cloud • Move from one Cloud to another. Use existing ETL skills. No parallel programming – Java, MapReduce, Spark … No worries about: • Mappers, Reducers • Big side or small side of joins … Design Once in visual GUI. Deploy Anywhere! On-Premise, Cloud; MapReduce, Spark, Future Platforms; Windows, Unix, Linux; Batch, Streaming; Single Node, Cluster
  • 24. Trillium Quality for Big Data – Data Cleansing at Scale. Boost effectiveness of machine learning, AI with complete, standardized data. 1. Visually create and test data quality processes locally 2. Execute in MapReduce or Spark, on premise or in the Cloud
  • 25. Build Your Enterprise Data Marketplace with Syncsort. “Ingestion has gone from days to hours” - Progressive Big Data Tech Lead. “DMX-h is already optimized. We use its Intelligent Execution and it just performs.” - Robert Hathaway, Senior Manager Big Data, Symphony Health. “We found DMX-h to be very usable and easy to ramp up in terms of skills. Most of all, Syncsort has been a very good partner in terms of support and listening to our needs.” - Alex Rosenthal, Enterprise Data Office, Guardian Life Insurance. Visit www.syncsort.com to learn more
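
Slide 16 names entity resolution as a roadblock that needs multi-field matching. As a rough illustration only, the sketch below scores two records across several fields and groups records that appear to describe the same entity. It is not how DMX-h or Trillium performs matching; the field names, weights, threshold, and sample records are all hypothetical, and the string comparison uses Python's standard `difflib` rather than any vendor algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical field weights (assumption); real weights would be tuned per data set.
WEIGHTS = {"name": 0.5, "email": 0.3, "city": 0.2}


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(str(value).lower().split())


def similarity(a, b):
    """String similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_score(rec_a, rec_b):
    """Weighted sum of per-field similarities across all configured fields."""
    return sum(
        weight * similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def resolve(records, threshold=0.85):
    """Greedily group records whose score against a group's first record meets the threshold."""
    groups = []
    for rec in records:
        for group in groups:
            if match_score(group[0], rec) >= threshold:
                group.append(rec)
                break
        else:
            # No existing group matched, so this record starts a new entity.
            groups.append([rec])
    return groups


# Made-up sample records: the first two differ only in case and spacing.
records = [
    {"name": "Jane Smith", "email": "jane.smith@example.com", "city": "Boston"},
    {"name": "jane  smith", "email": "jane.smith@example.com", "city": "boston"},
    {"name": "Robert Jones", "email": "rjones@example.org", "city": "Denver"},
]
groups = resolve(records)
```

This pairwise comparison grows quickly with record count, which is consistent with the slide's note about compute power; at scale, candidate pairs are usually narrowed first (for example by grouping on a shared key) before scoring.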