Jordi Puigsegur Figueras
https://www.linkedin.com/in/jordipuigsegur/
▪ Head of Solution Architecture
Hotelbeds
▪ Course Instructor
High Scale Distributed Systems
Universitat Oberta de Catalunya
1. Who we are
2. Hotelbeds Journey
3. Survival Principles
▪ Hotelbeds Group is a leading bedbank and a business-to-business
(B2B) provider to the global travel industry.
▪ In September 2016 Hotelbeds Group was acquired by Cinven and
the Canada Pension Plan Investment Board.
▪ In June 2017 Tourico Holidays became part of Hotelbeds Group
and in October GTA also became part of the Group.
▪ Hotelbeds Group offers travel providers access to a network of over
60,000 travel sellers from around 185 source markets globally.
▪ Travel sellers have access to over 170,000 hotels, +22,000 transfer
routes, +16,000 activities and 142,000 car rental products.
▪ The technology platform handles around 2 billion data requests
per day – with peaks of up to 2.5 billion and 40,000 requests
per second – from users worldwide.
▪ +5000 employees worldwide. 210 offices globally. Biggest single
site is the head office in Palma de Mallorca where over
1500 people work.
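As a rough sanity check of the traffic figures above, the daily totals can be converted into average per-second rates. The sketch below is plain arithmetic on the quoted numbers; the class and method names are invented for illustration and are not part of any Hotelbeds system.

```java
// Back-of-the-envelope conversion of the quoted daily request volumes.
final class TrafficFigures {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    // Average requests per second implied by a daily request count.
    static double averagePerSecond(long requestsPerDay) {
        return (double) requestsPerDay / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        double typicalDay = averagePerSecond(2_000_000_000L); // roughly 23,148 req/s
        double peakDay = averagePerSecond(2_500_000_000L);    // roughly 28,935 req/s
        double peakToAverage = 40_000 / typicalDay;           // roughly 1.73

        System.out.printf("typical day average: %.0f req/s%n", typicalDay);
        System.out.printf("peak day average:    %.0f req/s%n", peakDay);
        System.out.printf("peak / average:      %.2f%n", peakToAverage);
    }
}
```

Read this way, the quoted 40,000 requests per second peak is a little under twice the average rate of a typical day, which is the kind of gap that autoscaling is meant to absorb.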
• Apitude Cloud
• 1,200M requests per day
• 20k requests per second
• Distributed Availability
• Multi-region deployments
• 2,000M requests per day
• 40k requests per second
• Transfers and Activities product
• Product update for content (FTP)
• 1M requests per day
• Cache pull service
• Proprietary format & Rules based
• Allow customers to scan inventory
• 10M requests per day
• API driven Product Distribution Strategy – Suite of APIs
• Focus on scalable and high performance platforms
• Ease of integration development driven experience
• 200M requests per day
[Timeline figure. Product labels: XML1, AIF2, XML2, APItude. Years marked: 2002, 2008, 2010, 2015, 2017, 2018.]
Three main initiatives will shape
Hotelbeds' technical evolution
2016 2017 2018
Distributed Availability:
Going Global
ATLAS+ Project:
Breaking the monolith
Apitude Migration
to cloud
▪ APITUDE is a new redesign of our APIs
▪ Live by end of 2015
▪ 30% faster than the existing API
▪ DX approach: simpler and based on new technologies
▪ Sets the ground for a modern microservice cloud-native architecture:
○ Cloud-ready services based on Spring Boot
○ Immutable deployments (rpm) and external configuration
○ New technology components: Redis, Spring Config Server
○ Focus on enabling automation
○ New “cloud-friendly” architectural patterns
▪ Big monolithic Oracle Database with most of
the company’s business logic inside.
▪ Some satellite Java services…
○ ... but all logic still in PL/SQL code
○ ~ 2500 tables, ~ 1.1 M LoC
○ Monthly releases with a full stop
of up to 1 h.
▪ The new Apitude API platform is already live,
but it is still hosted on premise.
▪ Most hotel availability requests are solved by
the legacy XML 2 Platform (on premise).
▪ Apitude rollout is just beginning.
▪ We are beginning to face serious scalability
issues.
▪ Our Oracle based platform is not going to
cope with the expected business growth.
▪ Vertical scalability is not an option …
we are already running on powerful Oracle
Exadata hardware.
▪ Some estimates leave only 18 months until
platform saturation.
▪ The main driver is to reduce latencies for
our globally distributed customer base.
▪ Availability requests are increasing
exponentially, therefore:
○ We need more flexibility to grow
and evolve new Apitude services
○ We need autoscaling to adapt to
varying loads
(day / week / seasonal)
▪ Cloud can also be a cost-saving driver
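To make the autoscaling point concrete, the sizing logic behind "adapt to varying loads" can be sketched as a simple capacity calculation. This is a minimal illustration, not how AWS Auto Scaling is configured or how Hotelbeds sizes its fleet; the method name, the per-instance capacity, and the target utilization are all hypothetical.

```java
// Hypothetical sizing rule: how many instances are needed for the observed load
// if each instance should run at or below a target share of its capacity.
final class AutoscalingSketch {

    static int desiredInstances(double observedRps,
                                double capacityPerInstanceRps,
                                double targetUtilization,
                                int minInstances,
                                int maxInstances) {
        if (capacityPerInstanceRps <= 0 || targetUtilization <= 0 || targetUtilization > 1) {
            throw new IllegalArgumentException("capacity must be > 0 and utilization in (0, 1]");
        }
        double usablePerInstance = capacityPerInstanceRps * targetUtilization;
        int needed = (int) Math.ceil(observedRps / usablePerInstance);
        // Clamp to the configured floor and ceiling of the group.
        return Math.max(minInstances, Math.min(maxInstances, needed));
    }

    public static void main(String[] args) {
        // Assumed numbers: 500 req/s per instance, run at 50% utilization.
        System.out.println(desiredInstances(20_000, 500, 0.5, 2, 200)); // daytime load
        System.out.println(desiredInstances(100, 500, 0.5, 2, 200));    // quiet period
    }
}
```

The floor keeps a minimum footprint during quiet periods, and the ceiling caps cost if load spikes beyond what was planned for.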
▪ Cloud migration strategy is mixed:
○ Migrating new cloud native components
○ Plus lift and shift of older ones
▪ Deployed in AWS - 1 region: Europe
▪ Based on IaaS deployments of binary immutable
rpms with external configuration
▪ Some managed services (ElastiCache & ELBs)
▪ Adjusting autoscaling and fine tuning takes time
▪ Good monitoring is crucial
▪ Project focused on extracting the core business logic inside our
big monolithic Oracle Database
▪ Migration of business logic to cloud-native microservices:
○ Full reengineering of backend services
○ No business involvement … transparent migration
4 Teams
50 Developers
9 Months
20 new Spring Boot services
70 % on Cloud
▪ Hybrid on-premise - cloud approach:
○ Madrid datacenter + 1 AWS region
○ Logic is moved to cloud microservices
○ Data is kept on Oracle DB (on-premise)
○ Prioritize use of cloud data
▪ Data replication on-premise - cloud becomes crucial
○ Use of Kafka (own deployment)
▪ PostgreSQL is the database of choice for microservices
○ Managed RDS instances
○ Sometimes a NoSQL approach
[Diagram labels: Booking API, Booking BL, AtlasDB, PostgreSQL; on-premise vs. on-premise + cloud]
Example booking operations:
▪ Booking List
▪ Booking Detail
▪ Hotel Availability across three regions:
○ Europe
○ North America
○ Asia
▪ Global data replication using Kafka
▪ Customers are geographically
redirected by dynamic DNS.
(check Eric Janz's talk this afternoon)
▪ Better latencies across the globe
▪ New options for growth
▪ Good monitoring becomes crucial
… for the microservices jungle!
▪ Standardization
▪ Decoupling
▪ Data replication
▪ Resilient designs
▪ Automation
▪ Microservice support ecosystem
▪ Governance
▪ We all know microservices are cool
▪ We all want to do microservices!!
▪ We all know the advantages of microservices
▪ But Microservice architectures
○ are complex
○ carry many hidden overheads
▪ In fact ...
You are going to build a distributed
system and distributed systems are hard!
▪ Programming Language: Java
▪ Parent POMs with most relevant dependencies
▪ Some (not many) libraries, e.g. metrics
▪ Standardized service archetype (Maven)
○ Ready to run in Hotelbeds ecosystem
○ Produces binary rpms (Docker images soon!)
▪ REST APIs designed following similar principles
▪ Carefully chosen set of technology components
▪ Reference architectures on when and how to use
components and libraries
▪ Technology radar
▪ Decoupling is essential to achieve
the microservices goals
▪ Good decoupled architecture ...
○ Helps scale dev teams
○ High scalability & efficiency enabler
○ Supports future features naturally
▪ Independent deployments and life cycles for each Service
▪ The API is the only point of access to the service (REST endpoint, Kafka, …)
▪ Data is private: no database access from external components
▪ Importance of clean service boundaries
▪ One rule of thumb: changes shouldn’t involve several microservices
▪ Domain-Driven Design as a very useful set of tools:
○ Focuses on domain knowledge and its representation in code
○ Focus on Strategic patterns
○ Bounded contexts as the basis for microservice boundaries
▪ Beware!
○ Service boundaries are hard to define!
○ Easy to end up with a distributed monolith / microservice spaghetti / ...
▪ Data replication between services is crucial in
a hybrid cloud / multi-[region|cloud] environment
▪ Each entity is owned by a service
▪ All the other services access the owner service
via REST API or consuming its Kafka messages
▪ Kafka messages contain exactly the same entities
as the service REST API
▪ Kafka is our message broker and basic tool to replicate data between services
○ Scalability
○ Partial order guarantees
○ Kafka “mirror maker” for moving data across locations
(check Kafka talk by Isa and Alicia tomorrow!)
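The ownership rule above (one owner per entity, with the same entity shape exposed over REST and over Kafka) can be sketched as follows. This is a minimal illustration under stated assumptions: `Hotel`, `HotelOwnerService`, and the `BiConsumer` publisher are hypothetical stand-ins, and no real Kafka or REST framework is involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical entity. The same representation is returned by the REST API
// and published as the message payload.
final class Hotel {
    final String id;
    final String name;

    Hotel(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The owner service is the only component that touches the store.
final class HotelOwnerService {
    private final Map<String, Hotel> store = new HashMap<>();  // private data
    private final BiConsumer<String, Hotel> publisher;         // stand-in for a Kafka producer

    HotelOwnerService(BiConsumer<String, Hotel> publisher) {
        this.publisher = publisher;
    }

    // Write path: persist the entity, then publish it keyed by its id.
    void save(Hotel hotel) {
        store.put(hotel.id, hotel);
        publisher.accept(hotel.id, hotel);
    }

    // Read path: what a REST GET handler for this entity would return.
    Hotel get(String id) {
        return store.get(id);
    }
}
```

Keying each message by entity id matters for the ordering point in the list: Kafka preserves order within a partition, so records with the same key are seen in the order they were written.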
Resilience: "the ability of a system
to withstand changes in its
environment and still function"
Wikipedia
▪ We need to design with resilience
in mind
▪ Favor self-healing architectures
▪ Remember! We are dealing with
distributed systems
▪ Resilience patterns: check Uwe
Friedrichsen's talks (SlideShare)
▪ Protect your services against the unexpected
○ Overloads
○ Timeouts
○ Downstream errors
○ Datacenter failures
○ etc.
▪ Protect your services even if they are only internally
exposed
▪ Focus on protecting each service individually
▪ Let good system behaviour emerge from good service
level practices
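Two of the protections listed above, against overloads and against slow downstream calls, can be sketched with the Java standard library alone. This is an illustrative sketch, not the code used at Hotelbeds; `LoadShedder` and `Timeouts` are hypothetical names, and a production service would more likely rely on a resilience library.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Rejects work immediately once maxConcurrent calls are in flight,
// instead of letting requests queue up.
final class LoadShedder {
    private final Semaphore permits;

    LoadShedder(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    <T> T call(Supplier<T> work, T whenOverloaded) {
        if (!permits.tryAcquire()) {
            return whenOverloaded;          // shed load, do not block
        }
        try {
            return work.get();
        } finally {
            permits.release();
        }
    }
}

// Bounds how long the caller waits for a result.
final class Timeouts {
    static <T> T callWithTimeout(Supplier<T> work, long timeoutMillis, T fallback) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(work);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller stops waiting. Note that cancelling a CompletableFuture
            // does not interrupt the task that is already running.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback;                // the work itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

Both guards sit inside the individual service, which matches the advice to protect each service on its own instead of relying on a system-wide mechanism.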
▪ Hystrix library provides several very useful resilience patterns:
○ Circuit breaker
○ Load shedding
○ Timeouts
○ Fallbacks
○ Retries
[Diagram labels: Distribution, 3rd party, Internal, Product, Suppliers]
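To show what the circuit breaker and fallback patterns do, here is a minimal sketch in plain Java. It is deliberately not the Hystrix API: the `CircuitBreaker` class, its thresholds, and the injected clock are assumptions made for illustration, and it does not limit the number of trial calls in the half-open state.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before opening
    private final long openMillis;        // how long to stay open before a trial call
    private final LongSupplier clock;     // injected so the timing can be tested
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;      // allow a trial call through
        }
        return state;
    }

    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get();        // fail fast, skip the downstream call
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

A caller would typically wrap each downstream dependency in its own breaker, for example `breaker.call(() -> client.fetch(id), () -> cachedValue)`, where `client` and `cachedValue` are placeholders for whatever the service actually uses.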
* from: Coordination Avoidance in Database Systems
Peter Bailis, Alan Fekete, Michael J. Franklin, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica
“Minimizing coordination, or blocking communication between
concurrently executing operations, is key to maximizing
scalability, availability, and high performance.” *
▪ Allow services / instances / threads to keep
working independently of their peers,
their dependencies and even their clients
▪ Favor push vs pull strategies
▪ Favor asynchronous vs synchronous
▪ Favor local caches vs complicated grid /
replicated caches
▪ The Reactive Manifesto
▪ Services publish entity changes to Kafka
▪ Client services can consume these
streams and keep a local memory replica
▪ Important for high transaction volume / low
latency services
▪ Kafka log compaction guarantees that at
least the latest message per key is kept
▪ Every time an instance of the client
service spins up, it can load the caches into
memory and keep listening for changes
▪ Each service instance is independent. No
communication between peers.
[Diagram labels: Product Master Data, Distribution]
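The local replica pattern described above can be sketched as a map that is rebuilt by replaying the retained log at startup and then kept current by applying each new change. This is a simplified illustration: `LocalReplica` is a hypothetical class, string keys and values stand in for real entities, and a list of entries stands in for a Kafka consumer reading a compacted topic from the beginning.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory copy of another service's entities, private to one instance.
final class LocalReplica {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Apply one change record. In a compacted Kafka topic a record with a
    // null value is a tombstone, meaning the key was deleted.
    void apply(String key, String value) {
        if (value == null) {
            cache.remove(key);
        } else {
            cache.put(key, value);
        }
    }

    // Startup: replay the retained log from the beginning, oldest first.
    // Later records for the same key overwrite earlier ones.
    void bootstrap(List<Map.Entry<String, String>> retainedLog) {
        for (Map.Entry<String, String> record : retainedLog) {
            apply(record.getKey(), record.getValue());
        }
    }

    // Reads are served from local memory, with no network call.
    String get(String key) {
        return cache.get(key);
    }

    int size() {
        return cache.size();
    }
}
```

Because every instance rebuilds its own copy from the topic, instances never need to talk to each other, which is the independence property the last bullet describes.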
▪ Dedicated Delivery & Automation team
▪ DevOps roles inside scrum teams
▪ Automated CI/CD pipelines based on GitHub Flow
▪ Infrastructure automation
▪ Infrastructure as code
▪ Automated testing is key for continuous
delivery:
○ Unit testing
○ Integration testing
○ Smoke tests & end-to-end testing using a
framework based on TestNG
▪ Service discovery
▪ External configuration: Config Server
▪ Load balancing: AWS Elastic Load Balancer
▪ Logging
▪ Metrics
▪ Service catalogue based on Enterprise
Architect and own tools:
○ Architecture baselines
○ Ownership of services
○ Dependencies
○ Targets & Transitions
▪ Dedicated Information Architecture Team
▪ Clear Process for new Services provisioning
▪ Automation Integration: No new deployments
of deprecated components
▪ IT Cost Model Tool
▪ Focused on integration of the
three companies
▪ Reorganization into a product
based company
▪ Moving more business logic
into microservices
▪ Multi-cloud
▪ Containers
▪ Keep improving our platform
○ More resilient
○ More agile
○ Better TTM
Commit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architecture
