THE INS & OUTS OF DATA TRANSFER
LOS ANGELES AWS USERS GROUP
JASON DAVIS, CEO SIMON DATA
@JASONDAVIS
DRJASONDAVIS.COM
A TYPICAL DATA ECOSYSTEM
OLTP/RDS
DATA LAKE / REDSHIFT / S3
USERS FRONTEND
ANALYTICS
BACK OFFICE
"THE BIZ"
CORE TECH
3P TECH / SAAS
CRM / ERP
EMAIL / PUSH / SMS
GRAPHS / BI
APPLICATION
A gentle introduction to data transfer & "ETL"
An overview of common failure cases
Best practices and some high level guidance
OVERVIEW
SOME TYPICAL DATA TRANSFERS
WEB ANALYTICS
"BUSINESS" REPORTING
ACQUISITION / LTV ANALYSIS
EMAIL SEGMENTATION
Product recommendations
Extract: skus, purchase / browse history, profit margins
Transform: Deep learning / recommender systems
Load: user / sku recommendations into a production DB
Inventory planning
Extract: historical sales, inventory and shipping costs
Transform: Stockage goal estimation
Load: Sku-level forecasts into an ERP system
Executive dashboard
Extract: revenue, traffic, support volume, operational data
Transform: basic aggregates
Load: pie charts, vanity metrics driven by a reporting DB
SOME MORE TYPICAL DATA TRANSFERS
ETL: the process of pulling data from one or more sources for use in another system
Extract data from one or more sources
Database, event streams, S3, Salesforce, email metrics
Transform data via aggregations, joins, filters, and/or predictive analysis
Parallel (Hadoop, Spark), In-core (Redshift), Scripts (Python, bash)
Load data into destination
Database / Redshift, S3, HDFS, SaaS, ERP, CRM, email platform, etc.
DATA TRANSFER IN 3 STEPS: EXTRACT-TRANSFORM-LOAD
E T L
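The three steps above can be sketched as three small functions. This is a minimal illustration only: the `ORDERS` rows, the `user_revenue` table, and the function names are hypothetical, and an in-memory SQLite database stands in for whatever destination a real pipeline would load into.

```python
import sqlite3
from collections import defaultdict

# Hypothetical source rows; a real extract might read a database, S3, or an API.
ORDERS = [
    {"user_id": 1, "amount": 20.0},
    {"user_id": 2, "amount": 5.5},
    {"user_id": 1, "amount": 4.5},
]


def extract(source):
    """Extract: pull raw rows from the source."""
    return list(source)


def transform(rows):
    """Transform: aggregate revenue per user."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["user_id"]] += row["amount"]
    return sorted(totals.items())


def load(conn, totals):
    """Load: write the aggregates into a destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_revenue "
        "(user_id INTEGER PRIMARY KEY, revenue REAL)"
    )
    conn.executemany("INSERT INTO user_revenue VALUES (?, ?)", totals)
    conn.commit()


def run_etl(conn, source=ORDERS):
    """Chain the three steps end to end."""
    load(conn, transform(extract(source)))


conn = sqlite3.connect(":memory:")  # stand-in destination
run_etl(conn)
result = conn.execute(
    "SELECT user_id, revenue FROM user_revenue ORDER BY user_id"
).fetchall()
print(result)
```

Keeping each step a separate function means each one can fail, be retried, and be tested on its own.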
Extraction failures
Source unavailable
Data corrupt / incomplete - upstream error
Transform failures
Resources unavailable / exceeded: OOM
Broken computation: bad math / division by zero (DBZ)
Load failures
Validation errors
Connectivity errors
Availability / bandwidth limitations
Failures can cascade in unexpected ways
MOVING DATA IS HARD: COMMON FAILURE CASES
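One way to keep these failure classes distinguishable is to tag every error with the stage that raised it. The sketch below is an assumption-laden illustration, not a prescribed design: `StageError`, `run_stage`, and the toy stage functions are all hypothetical names. The transform deliberately divides by zero on a row with no orders, the "DBZ" case listed above.

```python
class StageError(Exception):
    """Wraps any exception with the name of the pipeline stage that raised it."""

    def __init__(self, stage, cause):
        super().__init__(f"{stage} failed: {cause!r}")
        self.stage = stage
        self.cause = cause


def run_stage(stage, func, *args):
    """Run one stage and re-raise any failure tagged with the stage name."""
    try:
        return func(*args)
    except Exception as exc:
        raise StageError(stage, exc) from exc


def extract():
    # Hypothetical upstream data; the second row has zero orders.
    return [{"revenue": 100.0, "orders": 4}, {"revenue": 0.0, "orders": 0}]


def transform(rows):
    # Average order value; raises ZeroDivisionError on the second row.
    return [row["revenue"] / row["orders"] for row in rows]


def load(values):
    # Stand-in for a real write; just reports how many values it received.
    return len(values)


def run_pipeline():
    rows = run_stage("extract", extract)
    values = run_stage("transform", transform, rows)
    return run_stage("load", load, values)


failed_stage = None
failure = None
try:
    run_pipeline()
except StageError as err:
    failed_stage = err.stage
    failure = err
print(failed_stage, failure)
```

Because the load stage never runs after the transform fails, the destination is left untouched rather than partially written, which limits how far a failure can cascade.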
Maintaining state between two systems is hard
The basic problem of 1-1 syncing is hard in itself
Incremental and cursor-based extractors are all prone to failure
Failure cases are wide, varied, and data-driven
Generally require running in real-world context for an extended period
Many times failures are silent
Ensuring correctness is hard / impossible
Run times are generally long, which strains unit-testing best practices
FUNDAMENTAL CHALLENGES
=?
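A small sketch of how a cursor-based sync can fail silently, and how a reconciliation check can expose it. Everything here is hypothetical: rows are `(id, value, updated_at)` tuples, `incremental_sync` uses the largest timestamp seen as its cursor, and `fingerprint` compares the two sides by row count plus an order-independent checksum. It illustrates one failure mode (a late-arriving row with an older timestamp), not a general correctness proof.

```python
import hashlib


def fingerprint(rows):
    """Return (row_count, order-independent checksum) for an iterable of row tuples."""
    count = 0
    combined = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        # Summing digests modulo 2**256 makes the result independent of row order.
        combined = (combined + int.from_bytes(digest, "big")) % (1 << 256)
        count += 1
    return count, combined


def incremental_sync(source, dest, cursor):
    """Copy rows whose updated_at is past the cursor; return the new cursor."""
    new_rows = [row for row in source if row[2] > cursor]
    dest.extend(new_rows)
    return max([cursor] + [row[2] for row in new_rows])


source = [(1, "a", 10), (2, "b", 20)]
dest = []
cursor = incremental_sync(source, dest, 0)  # copies both rows, cursor moves to 20

# A row committed late carries an older timestamp than the cursor.
source.append((3, "c", 15))
cursor = incremental_sync(source, dest, cursor)  # no error, but row 3 is skipped

in_sync = fingerprint(source) == fingerprint(dest)
print(cursor, len(dest), in_sync)
```

The second sync reports success, yet the destination is missing a row; only the explicit comparison of the two systems surfaces the gap.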
Break your pipeline into small steps
Large SQL statements are hard to test
SQL in general is hard to unit test - it's a declarative language after all
Data flow languages such as Spark / Cascading are easier to test
Build patterns to be able to easily test real-world inputs against outputs
Timeout errors and other exceptional cases are hard to unit test in isolation
WRITE UNIT TESTS BUT TEMPER EXPECTATIONS
DATA PIPES ARE HARD TO UNIT TEST
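The advice above, small steps plus a pattern for checking recorded inputs against expected outputs, might look like the following sketch. The step functions, the `sku,quantity,unit_price` line format, and the `SAMPLE_INPUT` / `EXPECTED_OUTPUT` fixture are hypothetical; in practice the fixture would be a sample captured from real traffic.

```python
def parse_line(line):
    """Step 1: parse 'sku,quantity,unit_price' into a typed record."""
    sku, quantity, unit_price = line.strip().split(",")
    return {"sku": sku, "quantity": int(quantity), "unit_price": float(unit_price)}


def drop_refunds(records):
    """Step 2: keep only rows with a positive quantity."""
    return [r for r in records if r["quantity"] > 0]


def revenue_by_sku(records):
    """Step 3: total revenue per sku."""
    totals = {}
    for r in records:
        totals[r["sku"]] = totals.get(r["sku"], 0.0) + r["quantity"] * r["unit_price"]
    return totals


def pipeline(lines):
    """The whole pipe is just the small steps composed."""
    return revenue_by_sku(drop_refunds(parse_line(line) for line in lines))


# Hypothetical recorded sample: real-world input paired with its known-good output.
SAMPLE_INPUT = ["A1,2,10.0", "B2,-1,5.0", "A1,1,10.0\n"]
EXPECTED_OUTPUT = {"A1": 30.0}


def test_parse_line():
    assert parse_line("A1,2,10.0\n") == {"sku": "A1", "quantity": 2, "unit_price": 10.0}


def test_drop_refunds():
    assert drop_refunds([{"quantity": -1}, {"quantity": 3}]) == [{"quantity": 3}]


def test_recorded_sample():
    assert pipeline(SAMPLE_INPUT) == EXPECTED_OUTPUT
```

The test functions follow the pytest naming convention, so a runner such as pytest would collect them as written. Tests like these cover the computation but not timeouts or unavailable sources, which is why expectations should stay tempered.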
Idempotent. A unary operation (or function) is idempotent if, whenever it is applied twice to any value, it gives the same result as if it were applied once; i.e., ƒ(ƒ(x)) ≡ ƒ(x). For example, the absolute value function, where abs(abs(x)) ≡ abs(x), is idempotent.
In layman's terms: your code produces the same result whether you run it once, twice, or three or more times.
Why is this important?
Oftentimes you won't know if something was successful or not.
Solution: Idempotency allows you to "just run it again"
IDEMPOTENCY
"THINGS DON'T ALWAYS TAKE ON THE FIRST TRY...."
Start with fine-grained logs
"Measure Anything, Measure Everything" - Etsy, Code as Craft
Alert on things that are mission critical or have well-known failure characteristics
VISIBILITY: LOGGING, GRAPHING, & ALERTING
OPTIMIZE FOR TIME TO DETECTION
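A minimal sketch of fine-grained logging with one alert on a well-known failure characteristic, here a step that suddenly emits no rows. The `run_step` wrapper, the `ALERTS` list (a stand-in for a pager or chat hook), and the sample events are all hypothetical; a real setup would more likely ship these numbers to a metrics system and graph them.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

ALERTS = []  # hypothetical stand-in for a pager or chat notification


def run_step(name, func, rows, min_rows_out=1):
    """Run one step, log its row counts and duration, and alert on too few output rows."""
    start = time.monotonic()
    result = func(rows)
    elapsed = time.monotonic() - start
    logger.info(
        "step=%s rows_in=%d rows_out=%d seconds=%.3f",
        name, len(rows), len(result), elapsed,
    )
    if len(result) < min_rows_out:
        logger.error(
            "step=%s produced %d rows, expected at least %d",
            name, len(result), min_rows_out,
        )
        ALERTS.append(name)
    return result


events = [{"user": "a", "amount": 10}, {"user": "b", "amount": -3}]

# A healthy step: one row is filtered out, one survives.
valid = run_step(
    "drop_negative", lambda rows: [r for r in rows if r["amount"] >= 0], events
)

# A silently broken step: the filter matches nothing, so the output is empty.
matched = run_step(
    "join_users", lambda rows: [r for r in rows if r["user"] == "z"], valid
)
print(ALERTS)
```

Without the row-count check the second step would finish "successfully" with empty output; logging the counts for every step is what shortens the time to detection.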
THANKS
QUESTIONS?
DRJASONDAVIS.COM
EMAIL ME: JASON@SIMONDATA.COM
