Copyright © 2018 Uptake | 06-Sep-18 | AWS Startup Day
Data Engineering the Startup Way
Dan Collins | 2018
AGENDA
Who is Uptake?
Solving Hard Problems
Facts of Life
Continuous Evolution
Data Engineering the Startup Way
AGENDA
Who is Uptake?
• CEO and Co-founder Brad Keywell
• President Ganesh Bell
• ~ 4 years old
• 100+ Customers
• Two-time CNBC Disruptor 50 honoree
• World Economic Forum Technology Pioneer
• One of Chicago’s best workplaces for 2018 by Fortune
• Uptake is ranked in the top 25 of the 2017 “Forbes Cloud 100”
AGENDA
Solving Hard Problems
Solving Hard Problems
Industrial AI and IOT
• Predictive Analytics
• Anomaly Detection
• Label Correction
• Applications and AI UX
Solving Hard Problems
Industrial Data
• Telematics
• SCADA Systems
• PLC / Sensor Data
• Contextual Data
• Resource Planning
• Customer Relationships
• Content Management
Solving Hard Problems
Industrial Data is… Dirty
• Out of Order
• High Volatility
• System-wide Snapshots with no deltas
• Pre-determined Aggregation
• Duplicated, Partitioned, Compressed
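Two of the problems above, out-of-order and duplicated readings, can be illustrated with a minimal Python sketch. The `Reading` record and its fields are hypothetical and chosen only for illustration; they are not Uptake's schema. The sketch drops exact duplicates and restores event-time order within one batch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    # Hypothetical record shape, for illustration only.
    asset_id: str
    event_time: int  # epoch seconds assigned by the source system
    value: float


def dedupe_and_order(batch):
    """Drop exact duplicates, then sort by asset and event time."""
    unique = set(batch)  # frozen dataclasses are hashable
    return sorted(unique, key=lambda r: (r.asset_id, r.event_time))


batch = [
    Reading("pump-1", 3, 0.7),
    Reading("pump-1", 1, 0.5),
    Reading("pump-1", 3, 0.7),  # duplicate delivery
    Reading("pump-1", 2, 0.6),
]
cleaned = dedupe_and_order(batch)
print([r.event_time for r in cleaned])
```

A real stream has no batch boundary, so a production system would more likely need a bounded buffer or watermark; this sketch only shows the idea on a small in-memory list.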
Solving Hard Problems
Industrial Data has a past…
• Very old systems (some > 30 years old)
• Susceptible to policy changes over time (formatting, time, etc.)
• Most integrations follow a standard, but not the same one
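Formatting policies that change over time often show up as timestamps written in several styles by the same source. The following is a minimal sketch of one way to cope, assuming a hypothetical list of known formats and assuming (purely for illustration) that all source times are UTC. Values that match no known format are returned as `None` so the caller can set them aside.

```python
from datetime import datetime, timezone

# Hypothetical formats a single source might have used over the years.
KNOWN_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%m/%d/%Y %H:%M",
    "%Y%m%d%H%M%S",
)


def parse_timestamp(raw):
    """Try each known format in turn; return a UTC datetime or None."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
        # Assumption: the source recorded UTC. Real systems rarely say so.
        return parsed.replace(tzinfo=timezone.utc)
    return None


samples = ["2018-09-06T12:30:00", "09/06/2018 12:30", "20180906123000", "n/a"]
parsed = [parse_timestamp(s) for s in samples]
```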
Solving Hard Problems
• ~150,000 writes/second
• Across tenants
• Across integrations
[Figure: three streams of updates, each numbered 1 to 10, arriving in order at T1 to T10 on a “Processing Time” axis]
Solving Hard Problems
• How it really works
• Remember, industrial data is dirty
• We need to validate, hydrate, quarantine, and persist updates as they come in
• We need to be consistent or our data science models lose their efficacy
• At ~150,000 writes/second
[Figure: three streams of updates with gaps, repeats, and out-of-order arrivals at T1 to T10 on a “Processing Time” axis]
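The validate, hydrate, quarantine, and persist steps named on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names, the `assets` lookup table, and the in-memory `store` and `quarantine` containers are all hypothetical stand-ins, not Uptake's implementation. Writes are keyed by asset and event time so that late or repeated updates do not create extra rows.

```python
REQUIRED_FIELDS = {"asset_id", "event_time", "value"}


def process(update, assets, store, quarantine):
    """Validate, hydrate, then persist one update; quarantine it on failure."""
    # Validate: required fields are present and the value is numeric.
    if not REQUIRED_FIELDS <= set(update):
        quarantine.append((update, "missing field"))
        return False
    if not isinstance(update["value"], (int, float)):
        quarantine.append((update, "non-numeric value"))
        return False
    # Hydrate: attach contextual data about the asset.
    context = assets.get(update["asset_id"])
    if context is None:
        quarantine.append((update, "unknown asset"))
        return False
    hydrated = {**update, **context}
    # Persist: idempotent write keyed by (asset, event time).
    store[(update["asset_id"], update["event_time"])] = hydrated
    return True


assets = {"pump-1": {"site": "plant-a"}}  # hypothetical contextual data
store, quarantine = {}, []
updates = [
    {"asset_id": "pump-1", "event_time": 2, "value": 0.6},
    {"asset_id": "pump-1", "event_time": 1, "value": 0.5},  # late arrival
    {"asset_id": "pump-1", "event_time": 2, "value": 0.6},  # duplicate
    {"asset_id": "pump-9", "event_time": 1, "value": 0.1},  # unknown asset
    {"asset_id": "pump-1", "event_time": 3},                # missing value
]
results = [process(u, assets, store, quarantine) for u in updates]
```

Keeping rejected updates in a quarantine, rather than dropping them, leaves room to replay them once the missing context or a fixed parser is available.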
Solving Hard Problems
[Figure: diagram of “Platform Instance” and many “Platform” boxes, with the callouts “Oh my!” and “We did it!”]
Solving Hard Problems
[Figure: diagram with the labels “Shared Platform”, “Configured Product”, “Bespoke Solution”, “Platform”, “More”, and “Feature set”, and the callout “We did it!”]
AGENDA
Facts of Life
Facts of Life
Machine Learning: The High-Interest Credit Card of Tech Debt
Facts of Life
[Figure: “What people talk about” vs. “The hard parts”]
Facts of Life
Changing Anything Changes Everything
Facts of Life
So, to recap:
• Take dirty data from old systems
• Scale it to > 150,000 writes/second
• Spin up data science models on top and balance them really carefully
• What could go wrong?
xkcd.com/1838
AGENDA
Continuous Evolution
Continuous Evolution
1. Proof of Concept
2. Build it
3. Learn from it
4. Repeat
Continuous Evolution – Proof of Concept
• Prototype: from “works on my machine” to “scales in the cloud”
• We create real-world working models, written in R and Python, and sample data sets
• Focus on the problem, not the infrastructure, monitoring, etc.
• Use the “beefiest” boxes to find equilibrium
• AWS allows you to go all in as soon as you’re ready to start
• Quickly spin up test instances or scaffold an environment
Continuous Evolution – Build It
• Build out for scale
• Account for real-world data sets on distributed systems
• Lean on managed services and IaaS as your foundation
• AWS managed services and elastic scaling can drastically reduce the time it takes to get up and running
• You can be production ready very quickly
Continuous Evolution – Build It
[Figure: “What people talk about” vs. “The hard parts”, annotated “AWS kickstarts your data engineering here”]
Continuous Evolution – Learn from It
• Codify patterns and encourage repeatability
• From bespoke to baked in
• Review trade-offs
• Analyze compute, I/O, parallelism
• Partition the problem space
• The scientific method, AWS’ huge array of services, and some luck let you put hindsight to work as you build
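One common way to partition a problem space like this is to hash a key such as tenant and asset to a fixed number of partitions, so that each worker owns a stable slice of the writes. The sketch below is an assumption about how that could look, not a description of Uptake's platform; `partition_for` and the partition count are hypothetical. It uses `hashlib` rather than Python's built-in `hash()`, which is randomized per process for strings and so would not give stable assignments across workers.

```python
import hashlib


def partition_for(tenant_id, asset_id, num_partitions):
    """Map a (tenant, asset) key to a stable partition number."""
    key = f"{tenant_id}:{asset_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions


NUM_PARTITIONS = 16  # hypothetical value
first = partition_for("tenant-a", "pump-1", NUM_PARTITIONS)
second = partition_for("tenant-a", "pump-1", NUM_PARTITIONS)
```

Because the same key always lands on the same partition, all updates for one asset are handled in one place, which makes per-asset ordering and deduplication easier to reason about.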
Continuous Evolution
Repeat
“A program that is used and that as an implementation of its specification reflects some other reality, undergoes continual change or becomes progressively less useful. The change or decay process continues until it is judged more cost effective to replace the system with a recreated version.”
- Meir Lehman’s law of software evolution
AGENDA
Data Engineering the Startup Way
Data Engineering the Startup Way
Monolith
Microservices
Platform
• Features and efficiency are a better fit with each iteration
• Survival depends on flexibility and feedback
[Figure labels: Data Science, Applications, Data Engineering, Platform]
Data Engineering the Startup Way
1. Focus on Value
2. Choose good abstractions
3. Act like an enterprise
4. Invest
5. Be Open
Data Engineering the Startup Way – Focus on Value
• You have great ideas.
• Focus on where you have value, let others solve the less interesting problems
• Use what’s available when it’s available, check often
• AWS and services like it can remove noise, letting you focus on where you’re most innovative
Data Engineering the Startup Way – Choose good abstractions
• Choose abstractions that let you take advantage of managed services
• Don’t reinvent the wheel and don’t be afraid to change the implementation
• Docker, microservices, test-driven development, continuous delivery, automation, etc. can all help you here
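As a small illustration of an abstraction that leaves the implementation free to change, the sketch below defines a storage interface and one in-memory implementation. All names here (`BlobStore`, `InMemoryStore`, `archive_reading`) are hypothetical. The point is that calling code depends only on the interface, so a class backed by a managed object store could be swapped in later without touching the callers.

```python
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical storage interface the rest of the code depends on."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Simple implementation for prototypes and tests."""

    def __init__(self) -> None:
        self._items: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> bytes:
        return self._items[key]


def archive_reading(store: BlobStore, asset_id: str, payload: bytes) -> str:
    """Write a payload through the interface and return the key used."""
    key = f"readings/{asset_id}"
    store.put(key, payload)
    return key


store = InMemoryStore()
key = archive_reading(store, "pump-1", b"0.5")
```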
Data Engineering the Startup Way – Act like an enterprise
• When you use world-class, global services, you get the service levels of world-class, global services
• Use services to let your two-person team operate like the army of infra/ops they’re used to working with
• An outage is an outage, no matter how small…
Data Engineering the Startup Way – Invest
• Pairing really smart people with really great services gives you the flexibility to be curious while you deliver
• Put down a foundation in your data platform and use managed services where you can
• Craft your platform
• Investing in your data engineering gives you repeatability and “paved roads” you can use to accelerate your delivery
Data Engineering the Startup Way – Be Open
• There are a lot of smart people working on really useful projects
• Scala, Flink, Spark, Kafka, Postgres, Docker, Airflow, Kubernetes, Mesos, Kudu, Hive, Impala
• Get involved, share back, and use open source
Data Engineering the Startup Way – Oh, and Have Fun
• Don’t fight change, build systems and orgs that are flexible
• Use all the cool tech and packaged solutions to get you closer to your vision
• And have fun!
• There’s never been a better time to be building
AGENDA
Recap
AGENDA
Who is Uptake?
Solving Hard Problems
Facts of Life
Continuous Evolution
Data Engineering the Startup Way
In Summary
• Uptake is awesome
• There are hard problems and we’re solving them
• You can solve your hard problems too if you try
• AWS makes it easier, especially for startups
• Build, Learn, Repeat
• Have fun
Copyright © 2018 by Uptake Technologies Inc. All rights reserved. No parts of this document may be
distributed, reproduced, transmitted, or stored electronically without Uptake’s prior written permission. This
document contains Uptake's confidential and proprietary information. If a pre-existing contract containing
disclosure and use restrictions exists between your company and Uptake, you and your company will use the
information in this document subject to the terms of the pre-existing contract. If no such pre-existing contract
exists, you and your Company agree to protect the information in this document and agree not to reproduce or
disclose the information in any way. Uptake makes no warranties, express or implied, in this document. Uptake
shall not be liable for damages of any kind arising out of use of this document. Any discussion of potential
features is not a promise of future functionality.
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thanks!

Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Data Engineering the Startup Way - AWS Startup Day Chicago 2018

  • 1. Data Engineering the Startup Way | Dan Collins | 2018 (Copyright © 2018 Uptake, 06-Sep-18, AWS Startup Day)
  • 2. AGENDA
      • Who is Uptake?
      • Solving Hard Problems
      • Facts of Life
      • Continuous Evolution
      • Data Engineering the Startup Way
  • 4.
      • CEO and Co-founder Brad Keywell
      • President Ganesh Bell
      • ~4 years old
      • 100+ Customers
      • Two-time CNBC Disruptor 50 honoree
      • World Economic Forum Technology Pioneer
      • One of Chicago’s best workplaces for 2018 by Fortune
      • Uptake is ranked in top 25 of the 2017 “Forbes Cloud 100”
  • 7. Solving Hard Problems: Industrial AI and IoT
      • Predictive Analytics
      • Anomaly Detection
      • Label Correction
      • Applications and AI UX
  • 8. Solving Hard Problems: Industrial Data
      • Telematics
      • SCADA Systems
      • PLC / Sensor Data
      • Contextual Data
      • Resource Planning
      • Customer Relationships
      • Content Management
  • 9. Solving Hard Problems: Industrial Data is… Dirty
      • Out of Order
      • High Volatility
      • System-wide Snapshots with no deltas
      • Pre-determined Aggregation
      • Duplicated, Partitioned, Compressed
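The "out of order" and "duplicated" problems on slide 9 can be illustrated with a small reordering buffer. What follows is a minimal Python sketch under stated assumptions, not Uptake's implementation: the `Reading` type, the `reorder` function, and the `max_lateness` parameter are hypothetical names introduced here, and a duplicate is assumed to mean "same asset and same event time".

```python
import heapq
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Reading(NamedTuple):
    """Hypothetical sensor reading; field names are illustrative only."""
    event_time: int   # when the device took the reading
    asset_id: str
    value: float


def reorder(
    readings: Iterable[Reading],
    max_lateness: int,
    late: Optional[List[Reading]] = None,
) -> Iterator[Reading]:
    """Yield readings in non-decreasing event_time order, skipping duplicates.

    A reading is held in a min-heap until the highest event_time seen so far
    is at least max_lateness ahead of it. Readings that arrive further behind
    than that are not emitted; they are appended to `late` if a list is given.
    """
    heap: List[Reading] = []
    seen = set()          # grows without bound; a real system would expire old keys
    high: Optional[int] = None
    for r in readings:
        key = (r.asset_id, r.event_time)
        if key in seen:
            continue      # assumed definition of a duplicate
        seen.add(key)
        if high is not None and r.event_time < high - max_lateness:
            if late is not None:
                late.append(r)
            continue
        heapq.heappush(heap, r)
        high = r.event_time if high is None else max(high, r.event_time)
        # Everything at or below the watermark can no longer be overtaken.
        while heap and heap[0].event_time <= high - max_lateness:
            yield heapq.heappop(heap)
    # Input exhausted: flush whatever is still buffered, in order.
    while heap:
        yield heapq.heappop(heap)
```

The trade-off sits in `max_lateness`: a larger value tolerates more disorder but delays every reading by that much, while a smaller value emits sooner and diverts more readings to `late`.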
  • 10. Solving Hard Problems: Industrial Data has a past…
      • Very old systems (some > 30 years old)
      • Susceptible to policy changes over time (formatting, time, etc.)
      • Most integrations follow a standard, but not the same one
  • 11. Solving Hard Problems
  • 12. Solving Hard Problems
  • 13. Solving Hard Problems
      • ~150,000 writes/second
      • Across tenants
      • Across integrations
      [Figure: three rows of events numbered 1 to 10, in order, against time slots T1 to T10; axis label: Processing Time]
  • 14. Solving Hard Problems
      • How it really works
      • Remember, industrial data is dirty
      • We need to validate, hydrate, quarantine, and persist updates as they come in
      • We need to be consistent or our data science models lose their efficacy
      • At ~150,000 writes/second
      [Figure: the same events arriving out of order, with gaps and repeats, against time slots T1 to T10; axis label: Processing Time]
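The "validate, hydrate, quarantine, and persist" steps on slide 14 can be sketched as a tiny in-memory pipeline. This is an illustration under assumptions, not the speaker's design: the `Pipeline` class, its field names, and the validation rules are hypothetical, "hydrate" is assumed to mean enriching an update with contextual data about its asset, and plain lists stand in for the durable store and the quarantine area.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

Update = Dict[str, Any]


@dataclass
class Pipeline:
    # Hypothetical lookup table: asset_id -> contextual data used to hydrate updates.
    asset_context: Dict[str, Dict[str, Any]]
    # Stand-ins for a durable store and a quarantine area.
    store: List[Update] = field(default_factory=list)
    quarantine: List[Dict[str, Any]] = field(default_factory=list)

    def validate(self, update: Update) -> Optional[str]:
        """Return a rejection reason, or None if the update looks usable."""
        for name in ("asset_id", "event_time", "value"):
            if name not in update:
                return f"missing field: {name}"
        value = update["value"]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return "value is not numeric"
        if update["asset_id"] not in self.asset_context:
            return "unknown asset"
        return None

    def hydrate(self, update: Update) -> Update:
        """Merge contextual data into the update; the update's own fields win."""
        return {**self.asset_context[update["asset_id"]], **update}

    def process(self, update: Update) -> bool:
        """Persist a valid update; quarantine an invalid one with its reason."""
        reason = self.validate(update)
        if reason is not None:
            self.quarantine.append({"update": update, "reason": reason})
            return False
        self.store.append(self.hydrate(update))
        return True
```

Keeping rejected updates together with a reason, rather than dropping them, leaves room to replay them once the upstream integration or the validation rule is fixed.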
  • 15. Solving Hard Problems
  • 16. Solving Hard Problems
      [Figure labels: Platform Instance (repeated), Platform (repeated), “Oh my!”, “We did it!”]
  • 17. Solving Hard Problems
      [Figure labels: Shared Platform, Configured Product, Bespoke Solution, Platform, “We did it!”, More, Feature set]
  • 19. Facts of Life: “Machine Learning: The High-Interest Credit Card of Tech Debt”
  • 20. Facts of Life
      [Figure labels: What people talk about, The hard parts]
  • 21. Facts of Life: Changing Anything Changes Everything
  • 22. Facts of Life: So, to recap:
      • Take dirty data from old systems
      • Scale it to > 150,000 writes/second
      • Spin up data science models on top and balance them really carefully
      • What could go wrong? (xkcd.com/1838)
  • 24. Continuous Evolution
      1. Proof of Concept
      2. Build it
      3. Learn from it
      4. Repeat
  • 25. Continuous Evolution – Proof of Concept
      • Prototype: from works on my machine to scales in the cloud
      • We create real-world working models written in R and Python and sample data sets
      • Focus on the problem, not the infrastructure, monitoring, etc.
      • Use the “beefiest” boxes to find equilibrium
      • AWS allows you to go all in as soon as you’re ready to start
      • Quickly spin up test instances or scaffold an environment
  • 26. Continuous Evolution – Build It
      • Build out for scale
      • Account for real-world data sets on distributed systems
      • Lean on managed services and IaaS as your foundation
      • AWS managed services and elastic scaling can drastically reduce the time it takes to get up and running
      • You can be production ready very quickly
  • 27. Continuous Evolution – Build It
      [Figure labels: What people talk about, The hard parts, AWS kickstarts your data engineering here]
  • 28. Continuous Evolution – Learn from It
      • Codify patterns and encourage repeatability
      • From bespoke to baked in
      • Review trade-offs
      • Analyze compute, I/O, parallelism
      • Partition the problem space
      • The scientific method, AWS’ huge array of services, and some luck let you put hindsight to work as you build
  • 29. Continuous Evolution – Repeat
      “A program that is used and that as an implementation of its specification reflects some other reality, undergoes continual change or becomes progressively less useful. The change or decay process continues until it is judged more cost effective to replace the system with a recreated version.”
      - Meir Lehman’s law of software evolution
  • 31. Data Engineering the Startup Way
      [Figure labels: Monolith, Microservices, Platform; Data Science, Applications, Data Engineering, Platform]
      • Features and efficiency are a better fit with each iteration
      • Survival depends on flexibility and feedback
  • 32. Data Engineering the Startup Way
      1. Focus on Value
      2. Choose good abstractions
      3. Act like an enterprise
      4. Invest
      5. Be Open
  • 33. Data Engineering the Startup Way – Focus on Value
      • You have great ideas.
      • Focus on where you have value, let others solve the less interesting problems
      • Use what’s available when it’s available, check often
      • AWS and services like it can remove noise, letting you focus on where you’re most innovative
  • 34. Data Engineering the Startup Way – Choose good abstractions
      • Choose abstractions that let you take advantage of managed services
      • Don’t reinvent the wheel and don’t be afraid to change the implementation
      • Docker, microservices, test-driven development, continuous delivery, automation, etc. can all help you here
  • 35. Data Engineering the Startup Way – Act like an enterprise
      • When you use world-class, global services, you get the service levels of world-class, global services.
      • Use services to enable your two-person team to operate like the army of infra/ops they’re used to working with
      • An outage is an outage no matter how small…
  • 36. Data Engineering the Startup Way – Invest
      • Pairing really smart people with really great services gives you the flexibility to be curious while you deliver
      • Put down a foundation in your data platform and use managed services where you can
      • Craft your platform
      • Investing in your data engineering gives you repeatability and “paved roads” you can use to accelerate your delivery
  • 37. Data Engineering the Startup Way – Be Open
      • There are a lot of smart people working on really useful projects
      • Scala, Flink, Spark, Kafka, Postgres, Docker, Airflow, Kubernetes, Mesos, Kudu, Hive, Impala
      • Get involved, share back, and use open source
  • 38. Data Engineering the Startup Way – Oh, and Have Fun
      • Don’t fight change, build systems and orgs that are flexible
      • Use all the cool tech and packaged solutions to get you closer to your vision
      • And have fun!
      • There’s never been a better time to be building
  • 40. AGENDA
      • Who is Uptake?
      • Solving Hard Problems
      • Facts of Life
      • Continuous Evolution
      • Data Engineering the Startup Way
  • 41. In Summary
      • is awesome
      • There are hard problems and we’re solving them
      • You can solve your hard problems too if you try
      • AWS makes it easier, especially for startups
      • Build, Learn, Repeat
      • Have fun
  • 43. Copyright © 2018 by Uptake Technologies Inc. All rights reserved. No parts of this document may be distributed, reproduced, transmitted, or stored electronically without Uptake’s prior written permission. This document contains Uptake's confidential and proprietary information. If a pre-existing contract containing disclosure and use restrictions exists between your company and Uptake, you and your company will use the information in this document subject to the terms of the pre-existing contract. If no such pre-existing contract exists, you and your Company agree to protect the information in this document and agree not to reproduce or disclose the information in any way. Uptake makes no warranties, express or implied, in this document. Uptake shall not be liable for damages of any kind arising out of use of this document. Any discussion of potential features is not a promise of future functionality.
  • 44. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thanks!