Recording of the Containers #101 meetup talk: https://codefresh.io/blog/containers-101-meetup-docker-accelerates-continuous-development/
Shimon Tolts, General Manager/CTO of Data Solutions at ironSource, joined us to talk about how they leverage Docker to simplify their workflow and deliver Big Data solutions to their customers faster. He shared their experience running Docker containers in production and how they took one of their base systems, considered "the backbone of the company," and transformed it using containers.
How Docker Accelerates Continuous Development at ironSource: Containers #101 Meetup
1. Shimon Tolts
General Manager, Data Solutions
Atom
Data Pipeline Processing 200B events with
Node.js And Docker On AWS
2. About ironSource: Hypergrowth
People Reached Each Month: 800M
Apps Installed Every Minute with the ironSource Platform: 4200
Registered & Analyzed Data Events Every Month: 200B
[Chart: Jun 2015 to May 2016, vertical axis from 0 to 200B]
3. We needed a way to manage this data:
Our Business Challenge
Collect, Process, Store
4.
5. Collection
● Multi region layer - Latency based routing
● Low latency from client to Atom servers
● High Availability - AWS regions do fail!
● Storing raw data + headers upon receiving
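The last bullet, storing raw data plus headers on receipt, can be sketched roughly as follows. This is a minimal illustration, not Atom's actual code: the `RawEvent` envelope, the `receive` function, and the in-memory `sink` list are all hypothetical names standing in for whatever durable store the collection layer really writes to.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass
class RawEvent:
    """Hypothetical envelope: the untouched request body plus its headers."""
    body: bytes
    headers: dict
    received_at: float = field(default_factory=time.time)

    def to_record(self) -> str:
        # Serialize without interpreting the body, so it can be replayed later.
        return json.dumps({
            "received_at": self.received_at,
            "headers": self.headers,
            "body": self.body.decode("utf-8", errors="replace"),
        })


def receive(body: bytes, headers: dict, sink: list) -> RawEvent:
    """Persist the raw event first; parsing and enrichment happen downstream."""
    event = RawEvent(body=body, headers=dict(headers))
    sink.append(event.to_record())  # assumption: sink stands in for durable storage
    return event
```

Keeping the body and headers untouched means later stages (for example User Agent parsing) can be re-run over the same raw records if the enrichment logic changes.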
6. Data Enrichment
● Enrich data before storing in your Data Lake and/or Warehouse
○ IP to Country
○ Currency conversion
○ Decrypt data
○ User Agent parsing - OS, Browser, Device...
● Any custom logic you would like! - fully extensible
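One way to picture the enrichment steps listed on this slide is as a chain of small functions applied to each event before it is stored. The sketch below is an assumption about the shape of such a pipeline, not the Atom implementation: the lookup tables, field names, and the `enrich` helper are hypothetical, and a real system would use a GeoIP database, a live exchange-rate source, and a proper User Agent parser.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Enricher = Callable[[Event], Event]

# Hypothetical lookup tables, for illustration only.
IP_PREFIX_TO_COUNTRY = {"203.0.113.": "US", "198.51.100.": "DE"}
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1}


def ip_to_country(event: Event) -> Event:
    ip = str(event.get("ip", ""))
    country = next(
        (c for prefix, c in IP_PREFIX_TO_COUNTRY.items() if ip.startswith(prefix)),
        None,
    )
    return {**event, "country": country}


def convert_currency(event: Event) -> Event:
    rate = RATES_TO_USD.get(str(event.get("currency")))
    if rate is None or "amount" not in event:
        return event  # nothing to convert
    return {**event, "amount_usd": round(float(event["amount"]) * rate, 2)}


def parse_user_agent(event: Event) -> Event:
    # Deliberately naive substring matching; real parsing is far more involved.
    ua = str(event.get("user_agent", ""))
    os_name = "Android" if "Android" in ua else "iOS" if "iPhone" in ua else "Other"
    return {**event, "os": os_name}


def enrich(event: Event, enrichers: List[Enricher]) -> Event:
    """Apply each enricher in order; custom logic is just another function."""
    for enricher in enrichers:
        event = enricher(event)
    return event
```

Custom logic plugs in by appending another function with the same signature to the list passed to `enrich`.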
7. Data Targets
● Near real-time data insertion - 1 minute!
● Stream data to Google Storage and/or AWS S3
● Smart insertion of data into AWS Redshift
○ Set the number of parallel copies
○ Configure priority on tables
● BigQuery - Streaming data using batch file imports (saves 20% cost)
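The two Redshift settings above, a cap on parallel copies and a priority per table, could interact along these lines. This is only a sketch under stated assumptions: `plan_copy_rounds` is a hypothetical helper, and it assumes that a lower number means higher priority and that loads run in fixed rounds, which the slides do not specify.

```python
from typing import List, Tuple


def plan_copy_rounds(
    tables: List[Tuple[str, int]], max_parallel: int
) -> List[List[str]]:
    """Group pending table loads into rounds of at most max_parallel tables.

    tables: (table_name, priority) pairs; lower priority number loads first
    (an assumption made for this sketch).
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    # Sort by priority, then by name so the plan is deterministic.
    ordered = sorted(tables, key=lambda t: (t[1], t[0]))
    return [
        [name for name, _ in ordered[i:i + max_parallel]]
        for i in range(0, len(ordered), max_parallel)
    ]
```

For example, with a limit of two parallel copies, high-priority tables fill the first round and lower-priority tables wait for the next one.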
10. Docker
● Linux Container
● Save provisioning time
● Infrastructure as code
● Dev-Test-Production - identical container
● Ship easily
11. Cloud infrastructure
● Pay as you go - (grow)
● SaaS services
● Auto-scaling-groups
● DynamoDB
● RDS *SQL
● Redshift data warehouse
12. Continuous Integration
● From commit to production
● Jenkins commit hook
● Git branching model
● AWS dynamic slaves
● Unit tests
● Docker builds
● Updating live environment