1. The new dominant companies are running on data
Take your company to the next level of value and efficiency
Rich Dill – Enterprise Solutions Architect – rdill@snaplogic.com
2. ©2017 SnapLogic, Inc. All Rights Reserved Confidential Content
What problem do we want to solve?
How do we get value from all this data?
3. What is the solution?
Sometimes it is not obvious to everyone involved
Decisions made without facts are opinions
◦ “What are the facts? Again and again and again – what are the facts? Shun wishful thinking, ignore divine revelation, forget what ‘the stars foretell,’ avoid opinion, care not what the neighbors think, never mind the unguessable ‘verdict of history’ – what are the facts, and to how many decimal places? You pilot always into an unknown future; facts are your single clue. Get the facts!” Robert A. Heinlein
Turn your latent assets into liquid assets to realize their value
- No longer latent but now liquid
◦ Data has to be on the move
- It must be leveraged by the masses
The business goal
◦ Actually deliver on the promise of transforming data into actionable information
◦ Predictive analytics improve forecasting
◦ Prescriptive analytics can guide business behaviour
◦ Geolocation analytics can improve resource utilization and inventory turns
What are the results?
- Delivering insights to executives yields direction
- Delivering insights to line workers yields results
4. Corporate overview
Not everyone has the same problem
Use cases are variations on a common theme
5. Sampling of Industry Focused Use Cases
Use cases (columns): Fraud Detection | Upsell & Cross-sell | Customer360 | Fault Prediction | Sentiment Analysis | Personalization | M & A | Management Consulting
Umbrella Industry (rows), with one X per applicable use case:
Manufacturing: X X
Retail: X X X X X X
Healthcare: X X
Financial Services: X X X X X
Energy: X X
Logistics & Transportation: X X X
Services: X X
CPG: X X X
Computer Software: X X
Telecom: X X X X X X X
Deployment Pattern (one per use case, in column order): Data Refinery or Data Lake Pop. | Hub-and-Spoke | Hub-and-Spoke | Data Refinery | Data Refinery or Data Lake Pop. | Data Refinery or Data Lake Pop. | Common Data Modeling | Common Data Modeling
6. Data Lake Population
Diagram: Source Systems 1 through N (Streaming, Database, SaaS App, File) feed the Data Lake by Pull, Push, or Stream.
Data Lake layers: Ingestion, Processing/Transformation, Storage (S3, HDFS).
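The population pattern on this slide can be sketched in a few lines of Python. This is a minimal illustration, not SnapLogic's implementation: the DataLake and Ingestor classes, the "raw" zone name, and the sample records are all hypothetical, and an in-memory dictionary stands in for S3 or HDFS.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class DataLake:
    """Hypothetical in-memory stand-in for lake storage such as S3 or HDFS."""
    zones: Dict[str, List[dict]] = field(default_factory=dict)

    def write(self, zone: str, records: List[dict]) -> int:
        self.zones.setdefault(zone, []).extend(records)
        return len(records)


def transform(source: str, record: dict) -> dict:
    """Processing/Transformation step: lower-case the keys and tag the source."""
    out = {key.lower(): value for key, value in record.items()}
    out["source"] = source
    return out


class Ingestor:
    """Ingestion layer offering the three modes named on the slide."""

    def __init__(self, lake: DataLake, zone: str = "raw") -> None:
        self.lake = lake
        self.zone = zone

    def _land(self, source: str, records: Iterable[dict]) -> int:
        return self.lake.write(self.zone, [transform(source, r) for r in records])

    def pull(self, source: str, fetch: Callable[[], Iterable[dict]]) -> int:
        """Pull: the lake side asks the source for its records."""
        return self._land(source, fetch())

    def push(self, source: str, records: Iterable[dict]) -> int:
        """Push: the source hands records to the lake."""
        return self._land(source, records)

    def stream(self, source: str, events: Iterable[dict], batch_size: int = 2) -> int:
        """Stream: land events in small batches as they arrive."""
        landed, batch = 0, []
        for event in events:
            batch.append(event)
            if len(batch) == batch_size:
                landed += self._land(source, batch)
                batch = []
        if batch:  # flush the final partial batch
            landed += self._land(source, batch)
        return landed


lake = DataLake()
ingestor = Ingestor(lake)
pulled = ingestor.pull("database", lambda: [{"ID": 1}, {"ID": 2}])
pushed = ingestor.push("saas_app", [{"ID": 3}])
streamed = ingestor.stream("streaming", iter([{"ID": 4}, {"ID": 5}, {"ID": 6}]))
```

All three modes end in the same landing step, so the storage layer never needs to know how a record arrived.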
7. Data Refinery
Diagram: Source Systems 1 through N (Streaming, Database, SaaS App, File) feed the Data Lake by Pull, Push, or Stream.
Data Lake layers: Ingestion, Processing/Transformation, Storage (S3, HDFS).
The Data Lake then pushes data to OLAP.
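A refinery step might look like the sketch below: raw lake records are cleaned, aggregated, and pushed to an OLAP target. The record fields, the cleaning rules, and the olap_table list are assumptions made for illustration only; the slide does not specify them.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical raw records as they might sit in the lake's storage layer.
raw_zone: List[dict] = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 1, "region": "west", "amount": 120.0},  # duplicate
    {"order_id": 2, "region": "east", "amount": None},   # incomplete
    {"order_id": 3, "region": "west", "amount": 30.0},
    {"order_id": 4, "region": "east", "amount": 50.0},
]


def refine(records: List[dict]) -> List[dict]:
    """Drop incomplete rows and duplicates (first occurrence wins)."""
    seen = set()
    clean = []
    for record in records:
        if record["amount"] is None or record["order_id"] in seen:
            continue
        seen.add(record["order_id"])
        clean.append(record)
    return clean


def to_olap_rows(records: List[dict]) -> List[dict]:
    """Aggregate refined records into one summary row per region."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record["region"]] += record["amount"]
    return [
        {"region": region, "total_amount": total}
        for region, total in sorted(totals.items())
    ]


olap_table: List[dict] = []  # stand-in for the OLAP target


def push_to_olap(rows: List[dict]) -> None:
    """Push step: deliver the refined rows to the OLAP store."""
    olap_table.extend(rows)


push_to_olap(to_olap_rows(refine(raw_zone)))
```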
8. Common Data Model
Diagram: several groups of Source Systems 1 through N (Streaming, Database, SaaS App, File) feed the Data Lake by Pull, Push, or Stream.
Data Lake layers: Ingestion, Processing/Transformation, Storage (S3, HDFS).
Staging: HDFS, S3, Blob.
Push to Downstream Apps.
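The core idea of a common data model is that each source keeps its own field names while downstream apps see one shape. A minimal sketch, assuming a hypothetical Customer schema and two made-up sources ("crm" and "billing"):

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Customer:
    """Hypothetical common model shared by every downstream app."""
    customer_id: str
    full_name: str
    email: str


def from_crm(row: dict) -> Customer:
    # Assumed CRM layout: Id, Name, Email
    return Customer(str(row["Id"]), row["Name"].strip(), row["Email"].lower())


def from_billing(row: dict) -> Customer:
    # Assumed billing layout: cust_no, first, last, mail
    full_name = f"{row['first']} {row['last']}"
    return Customer(str(row["cust_no"]), full_name, row["mail"].lower())


# One mapper per source system; adding a source means adding one entry here.
MAPPERS: Dict[str, Callable[[dict], Customer]] = {
    "crm": from_crm,
    "billing": from_billing,
}


def to_common(source: str, row: dict) -> Customer:
    """Translate a source-specific row into the common model."""
    return MAPPERS[source](row)


staged = [
    to_common("crm", {"Id": 7, "Name": " Ada Lovelace ", "Email": "ADA@Example.com"}),
    to_common("billing", {"cust_no": "B-9", "first": "Grace", "last": "Hopper",
                          "mail": "grace@example.com"}),
]
```

Downstream apps then read only Customer records from staging and never need to know which source a row came from.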
11. Michelangelo@Uber
Welcome my son to the machine…
The problem
◦ “There were no systems in place to build reliable, uniform, and reproducible pipelines for creating and
managing training and prediction data at scale.”
The solution: Machine Learning as a Service
◦ ML-as-a-service platform that democratizes machine learning and makes scaling AI to meet the needs of
business as easy as requesting a ride.
Michelangelo consists of a mix of open source systems and components built in-house. The primary open sourced components used are HDFS, Spark, Samza, Cassandra, MLlib, XGBoost, and TensorFlow.
Cost
◦ Two years
◦ $60 million
Results
◦ A Wall Street Journal report claims SoftBank has been in touch with Uber with the apparent goal of buying a
“multi-billion dollar stake” in the company. To date, Uber has raised close to $12 billion from investors, with its
most recent valuation reportedly above $60 billion. July 25, 2017
13. Building on success
Both the systems and staff continue to learn and evolve
“As the platform layers mature, we plan to invest in higher level tools and services to
drive democratization of machine learning and better support the needs of our
business”
For more information
◦ https://eng.uber.com/michelangelo/
15. The five year plan
Rome was not built in a day
The problem
◦ A large multinational corporation grew in part by acquisition
◦ Technology stacks and silos as far as the eye can see
◦ They had one or more of every kind of technology
◦ They had hundreds of data warehouses and data marts
The cost
◦ Implementing any new business process was blindingly expensive, took too long, and did not deliver what users expected or needed
The solution
◦ Simplify, standardize, consolidate, and adopt a cloud strategy
◦ Insert a Data Lake into the data lifecycle
◦ Adopt a Citizen Integrator model wherever possible
The business result
◦ The combination of migration from a perpetual software license model to SaaS and the reduced labor costs of the Citizen
Integrator model resulted in savings in the millions
16. The evolving data lifecycle
Diagram: two versions of the data lifecycle, each fed by Source Systems 1 through N.
Two stages: OLTP to the DW (EDW) and Data Marts.
Three stages: OLTP to the Data Lake, then on to the Data Marts and DW (EDW). The diagram also includes a Data Science Workbench.
17. Results
Happy productive business users
Faster time to market for new programs with agility and LOB alignment
Over 500 users from almost all business units
Savings in the millions
A more agile business environment
19. The solution approach
Business goals drive the architectural requirements
The problem/business goal
◦ Obtain a customer 360 view by removing the constraints of an on-premises environment and moving to a cloud-first environment where multiple departments/constituents can access data and obtain insights.
Key Characteristics of a cloud-first enterprise stack:
◦ Scalable
◦ Collaborative
◦ Promotes easy data sharing
◦ Reduces on-premises maintenance overhead with auto updates
The process
◦ Upgrade the cloud data warehouse
◦ Move legacy BI to a modern tool like Tableau or PowerBI, for greater data fluency
◦ Create a foundation for an AI/ML workbench for predictive analytics
◦ Use an ML framework like TensorFlow from Google, whose trained models can be exported and run from other languages, including Java
20. Proposed Enterprise Stack
Sources & Targets
◦ Streaming: Kafka, JMS
◦ Database: HBase, Hive, Dynamo, Mongo, Redshift, SQLServer, AzureSQL, Aurora, MySQL
◦ Webservices: REST, SOAP
◦ File: Flat Files, XML, JSON, Excel, Word doc, PDF, S3, FTP/SFTP, ORC, Parquet
◦ Social Media: Facebook, LinkedIn, Twitter
Integration
◦ SnapLogic (AWS Deployed): Pull, Push, Stream
Storage and processing
◦ Amazon S3
◦ Amazon EMR
Analytics (fed by Push)
◦ Tableau, SAS, Cognos
Machine Learning Integration Point
21. Key Benefits of Proposed Architecture
Enables migration in phases rather than all at once
Promotes data re-use and reduces time to insight across the organization
Scalable and flexible to accommodate company’s changing needs
Reduced maintenance costs to enable IT to stay focused on enabling the business
Complete view of the customer with real-time data updates
Better focused marketing programs (less waste, higher performance)
Greater customer loyalty due to more relevant customer engagement
22. Observations from the field
Some observations and a few of Rich’s rules of technology
Technology is a tool, use the right one for the job
◦ It amazes me how some engineers have almost religious beliefs in their favorite technology
- If the only tool you have is a hammer…
Software evolves like a funnel
◦ Early releases have limitations that are fixed with later releases
We work in an industry where change is constant
◦ Absolute truths can change every 5-10 years
◦ The rate of change can make you old, or keep you young. As the Iron Giant said, choose!
Different technologies require different approaches and techniques
◦ I don’t code Scala like C or Cobol
◦ “A mind is like a parachute; it only functions when it is open.” Thomas Dewar
The adoption curve entails risk… and costs
◦ There is a reason we call it the bleeding edge
Open source is not free
◦ The money you save on license cost, you will spend on additional labor, plus 25%
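The open source rule of thumb above can be read as simple arithmetic. The sketch below takes it literally, with a made-up license figure; the function name and the 100,000 amount are illustrative assumptions, not data from the field.

```python
def open_source_labor_cost(license_savings: float, overhead: float = 0.25) -> float:
    """Rule of thumb from the slide: extra labor equals the license savings plus 25%."""
    return license_savings * (1 + overhead)


license_savings = 100_000.0  # hypothetical annual license cost avoided
extra_labor = open_source_labor_cost(license_savings)
net_effect = license_savings - extra_labor  # negative means open source cost more overall
```

Under this reading, avoiding a 100,000 license means spending 125,000 on labor, for a net cost of 25,000.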
24. Thank You
San Mateo, CA
Boulder, CO
New York, NY
London, UK
Melbourne, AUS
Hyderabad, India
www.snaplogic.com
Rich Dill– Enterprise Solutions Architect
rdill@snaplogic.com