Cross-Cluster and Cross-Datacenter Elasticsearch Replication at sahibinden.com
1. Cross-Cluster and Cross-Datacenter Elasticsearch Replication at sahibinden.com
12.07.2016
Ertuğ Karamatlı
ertug@karamatli.com
Software Architect at sahibinden.com
CMPE PhD Student at Boğaziçi University
2. Elasticsearch at sahibinden.com
● Search suggestion (first use, 2011)
● Classifieds search (10+ nodes, 1000+ QPS)
● User messaging (10+ nodes, 300+ GB, 300M+ docs)
● 10+ other clusters
4. Why Replicate Across Clusters?
2. (Very) Hot Backups
● Load balance reads
● Maintain warm caches
● Ensure backup is functional
5. Why Replicate Across Clusters?
3. Minimize Risks
● Test ES/Java/OS version upgrades
● Test configuration changes
6. Why Replicate Across Clusters?
4. Multiple Datacenters
● Low-latency synchronization
● Active-active datacenters
● Maintain fresh data in test environments
7. How to Replicate Across Clusters?
Option 1: Sync write
[Diagram: the Application writes to both ES Cluster 1 and ES Cluster 2.]
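A minimal sketch of Option 1, assuming the application simply issues the same write to each cluster before returning. `FakeCluster` and `sync_write` are hypothetical in-memory stand-ins, not Elasticsearch client calls.

```python
class FakeCluster:
    """Hypothetical in-memory stand-in for an Elasticsearch cluster."""

    def __init__(self, name):
        self.name = name
        self.docs = {}

    def index(self, doc_id, doc):
        # Store a copy so later changes to the caller's dict do not leak in.
        self.docs[doc_id] = dict(doc)


def sync_write(clusters, doc_id, doc):
    """Write the document to every cluster, one after another.

    The call returns only after all clusters have accepted the write,
    so the caller waits for the slowest cluster.
    """
    for cluster in clusters:
        cluster.index(doc_id, doc)


cluster1 = FakeCluster("ES Cluster 1")
cluster2 = FakeCluster("ES Cluster 2")
sync_write([cluster1, cluster2], 123, {"name": "Ahmet"})
```

In this sketch an exception raised by the second cluster would leave the first cluster already updated, which is the kind of divergence a synchronous dual write has to handle.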
8. How to Replicate Across Clusters?
Option 2: Async write
[Diagram: the Application writes to a Queue; a Replicator reads the Queue and writes to ES Cluster 1 and ES Cluster 2.]
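Option 2 can be sketched as follows, assuming the application only enqueues operations and a separate replicator applies them to both clusters in queue order. The `deque` stands in for the queue, and `FakeCluster` and `replicate` are hypothetical names.

```python
from collections import deque


class FakeCluster:
    """Hypothetical in-memory stand-in for an Elasticsearch cluster."""

    def __init__(self, name):
        self.name = name
        self.docs = {}

    def apply(self, action, doc_id, doc=None):
        if action == "delete":
            self.docs.pop(doc_id, None)
        else:  # treat anything else as "index"
            self.docs[doc_id] = dict(doc)


def replicate(queue, clusters):
    """Drain the queue in FIFO order and apply each operation to every cluster."""
    applied = 0
    while queue:
        action, doc_id, doc = queue.popleft()
        for cluster in clusters:
            cluster.apply(action, doc_id, doc)
        applied += 1
    return applied


queue = deque()
cluster1 = FakeCluster("ES Cluster 1")
cluster2 = FakeCluster("ES Cluster 2")

# The application never talks to the clusters directly; it only enqueues.
queue.append(("index", 1, {"title": "first"}))
queue.append(("index", 2, {"title": "second"}))
queue.append(("delete", 1, None))

applied = replicate(queue, [cluster1, cluster2])
```

Until the replicator runs, neither cluster sees the writes, so reads issued right after a write may return stale results in this model.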
9. How to Replicate Across Clusters?
Option 3: Sync master write, async slave write
[Diagram: the Application writes synchronously to ES Cluster 1 and to a Queue; a Replicator reads the Queue and writes to ES Cluster 2.]
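A minimal sketch of Option 3, assuming the application writes to the master cluster directly, records the same operation in a queue, and a replicator later replays the queue on the slave cluster. The dicts and the `write` and `replicate` helpers are hypothetical stand-ins.

```python
from collections import deque

master = {}      # stands in for ES Cluster 1 (master)
slave = {}       # stands in for ES Cluster 2 (slave)
queue = deque()  # stands in for the queue between them


def write(doc_id, doc):
    """Synchronous write to the master, then enqueue for the slave."""
    master[doc_id] = dict(doc)
    queue.append((doc_id, dict(doc)))


def replicate():
    """Replay queued writes on the slave in the order they were enqueued."""
    count = 0
    while queue:
        doc_id, doc = queue.popleft()
        slave[doc_id] = doc
        count += 1
    return count


write(123, {"name": "Ahmet"})
slave_before = dict(slave)  # the slave has not seen the write yet
replayed = replicate()
```

The master is readable immediately after `write` returns, while the slave catches up only when the replicator runs.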
12. Cross-Datacenter Replication
[Diagram: Datacenter 1 and Datacenter 2 connected over a WAN. Each datacenter has an Application, a Local Kafka, an Aggregate Kafka and a Replicator. ES Clusters 1 to 4 are spread across the two datacenters.]
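One possible reading of the diagram, sketched in memory: each application appends to its datacenter's local log, every local log is mirrored into every datacenter's aggregate log, and each replicator applies its own aggregate log to the clusters in its datacenter. The assignment of ES Clusters 1 and 2 to Datacenter 1 and 3 and 4 to Datacenter 2, the `Datacenter` class and the `mirror` function are all assumptions made for illustration; plain lists stand in for Kafka.

```python
class Datacenter:
    """Hypothetical model of one datacenter from the diagram."""

    def __init__(self, name, cluster_names):
        self.name = name
        self.local_log = []       # stands in for Local Kafka
        self.aggregate_log = []   # stands in for Aggregate Kafka
        self.clusters = {n: {} for n in cluster_names}
        self.applied = 0          # replicator position in the aggregate log

    def app_write(self, doc_id, doc):
        # The application only appends to the local log.
        self.local_log.append((doc_id, dict(doc)))

    def run_replicator(self):
        # Apply every not-yet-applied aggregate entry to all local clusters.
        for doc_id, doc in self.aggregate_log[self.applied:]:
            for docs in self.clusters.values():
                docs[doc_id] = dict(doc)
        self.applied = len(self.aggregate_log)


def mirror(datacenters, offsets):
    """Copy new local-log entries of every datacenter into every aggregate log."""
    for source in datacenters:
        for target in datacenters:
            key = (source.name, target.name)
            start = offsets.get(key, 0)
            target.aggregate_log.extend(source.local_log[start:])
            offsets[key] = len(source.local_log)


dc1 = Datacenter("Datacenter 1", ["ES Cluster 1", "ES Cluster 2"])
dc2 = Datacenter("Datacenter 2", ["ES Cluster 3", "ES Cluster 4"])
offsets = {}

dc1.app_write(1, {"title": "written in datacenter 1"})
dc2.app_write(2, {"title": "written in datacenter 2"})

mirror([dc1, dc2], offsets)
for dc in (dc1, dc2):
    dc.run_replicator()
```

In this model only the mirroring step crosses the WAN; applications and replicators talk to logs in their own datacenter.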
14. What about Performance?
[Diagram: Application, Kafka, Replicator, ES Cluster 1, ES Cluster 2]
Replicator: N servers × M threads
{
_id: 123,
name: "Ahmet"
}
10 threads
{
_id: 123,
name: "Mehmet"
}
1 thread
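The two documents above share `_id: 123`, so with several replicator threads the second write could be applied before the first. One common way to keep parallelism without reordering writes to the same document is to route each operation to a worker chosen from its document ID. This is an assumption for illustration, not necessarily what the replicator in the talk does; `worker_for` and `partition` are hypothetical names, and the document with `_id: 456` is made up.

```python
import zlib


def worker_for(doc_id, num_workers):
    """Pick a worker deterministically from the document ID."""
    # crc32 is stable across runs, unlike the built-in hash() for strings.
    return zlib.crc32(str(doc_id).encode("utf-8")) % num_workers


def partition(ops, num_workers):
    """Split operations into per-worker lanes, preserving arrival order."""
    lanes = [[] for _ in range(num_workers)]
    for op in ops:
        lanes[worker_for(op["_id"], num_workers)].append(op)
    return lanes


ops = [
    {"_id": 123, "name": "Ahmet"},
    {"_id": 456, "name": "Other"},   # made-up document for illustration
    {"_id": 123, "name": "Mehmet"},
]
lanes = partition(ops, 10)

# Both writes to document 123 land in the same lane, in their original order.
lane_123 = [op["name"] for op in lanes[worker_for(123, 10)] if op["_id"] == 123]
```

Each lane can then be consumed by its own thread, since operations in different lanes never touch the same document.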
15. What about Performance?
Queries     In-Flight Document IDs
INDEX 3     [3]
UPDATE 1    [1,3]
DELETE 2    [1,2,3]
UPDATE 3    [3]
INDEX 4     [3,4]
Checkpoint
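One way to read the table: the replicator tracks which document IDs have a query in flight, holds back a query whose document is already in flight, and only checkpoints positions before the oldest unfinished query. The sketch below follows that reading. `InFlightTracker`, its methods, and the choice to wait for all in-flight work when a query is blocked are assumptions, not the talk's implementation.

```python
class InFlightTracker:
    """Hypothetical tracker of in-flight document IDs and a safe checkpoint."""

    def __init__(self):
        self.in_flight = {}    # doc_id -> queue offset of the running query
        self.next_offset = 0   # offset after the last dispatched query

    def try_dispatch(self, offset, doc_id):
        # Refuse to run two queries for the same document concurrently.
        if doc_id in self.in_flight:
            return False
        self.in_flight[doc_id] = offset
        self.next_offset = max(self.next_offset, offset + 1)
        return True

    def complete(self, doc_id):
        del self.in_flight[doc_id]

    def checkpoint(self):
        # Everything before the oldest in-flight query is known to be done.
        if self.in_flight:
            return min(self.in_flight.values())
        return self.next_offset


queries = [("INDEX", 3), ("UPDATE", 1), ("DELETE", 2), ("UPDATE", 3), ("INDEX", 4)]
tracker = InFlightTracker()
trace = []        # in-flight IDs after each dispatch
checkpoints = []  # checkpoints taken whenever a query had to wait

for offset, (action, doc_id) in enumerate(queries):
    if not tracker.try_dispatch(offset, doc_id):
        # Simplification: wait until all running queries finish, then retry.
        for busy_id in list(tracker.in_flight):
            tracker.complete(busy_id)
        checkpoints.append(tracker.checkpoint())
        tracker.try_dispatch(offset, doc_id)
    trace.append(sorted(tracker.in_flight))
```

With the five queries from the table, `trace` reproduces the in-flight column, and the single checkpoint is taken just before the second query for document 3 is dispatched.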
16. Sync Script
How to Fix Things?
[Diagram: ES Cluster 1 and ES Cluster 2, an ID Worker for each cluster, several Index Workers and several Delete Workers, and an arrow labeled Sync Direction.]
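A minimal sketch of what such a sync script might do, assuming the ID workers list document IDs from both clusters, the index workers copy documents missing from the target, and the delete workers remove documents that exist only in the target. The dicts stand in for clusters; `plan_sync` and `sync` are hypothetical names.

```python
def plan_sync(source_ids, target_ids):
    """Compare ID sets and decide what to index into or delete from the target."""
    source_ids, target_ids = set(source_ids), set(target_ids)
    to_index = sorted(source_ids - target_ids)    # missing in the target
    to_delete = sorted(target_ids - source_ids)   # no longer in the source
    return to_index, to_delete


def sync(source, target):
    """Bring the target's set of document IDs in line with the source."""
    to_index, to_delete = plan_sync(source.keys(), target.keys())
    for doc_id in to_index:      # work for the index workers
        target[doc_id] = dict(source[doc_id])
    for doc_id in to_delete:     # work for the delete workers
        del target[doc_id]
    return to_index, to_delete


# Sync direction: cluster1 is the source, cluster2 is the target.
cluster1 = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}}
cluster2 = {2: {"name": "b"}, 4: {"name": "stale"}}

indexed, deleted = sync(cluster1, cluster2)
```

This sketch compares IDs only; detecting documents that exist on both sides with different content would need an extra field to compare, such as a version or timestamp.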