This document describes the implementation of data replication in Apache Accumulo. It motivates the need for replication to handle failures, describes how replication is implemented using write-ahead logs, and outlines future work, including replicating to other systems and improving consistency.
High level - why is replication important? Availability
Hardware failures happen, as do unexpected configuration and operations problems. Replication also lets you scale past a single data center without worrying about a single instance.
Must satisfy SLAs. Cannot just accept unplanned downtime.
Jeff Dean’s talk describes what to expect in the first year of a 1K-node cluster
The application, database, and operating-system software may also have bugs that cause unexpected unavailability.
Describe the characteristics of the replication implementation
Book-keeping of data written to tables
Interfaces and an implementation for replicating data from one Accumulo instance to another application
Asynchronous
Eventually consistent
Push data from primary to peer
Resilient to prolonged outages
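As a concrete illustration, replication is driven by configuration properties set from the Accumulo shell. The property names below follow the 1.7 replication feature; the peer name `peer1`, the table name, the instance name, the ZooKeeper host, and the peer table id are all hypothetical values for this sketch:

```
# On the primary: name this instance and define a peer
# (value format: <ReplicaSystem class>,<peer instance name>,<peer zookeepers>)
config -s replication.name=primary
config -s replication.peer.peer1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,peerInstance,zoo1:2181
# Credentials used to write to the peer
config -s replication.peer.user.peer1=replication
config -s replication.peer.password.peer1=secret
# Enable replication on a table and point it at the peer table's id
config -t mytable -s table.replication=true
config -t mytable -s table.replication.target.peer1=2
```

Because replication is asynchronous and push-based, nothing else changes for writers: mutations land in the local table as usual, and the bookkeeping described below ships them to the peer later.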
BigTable basics – what is a write-ahead log
Write-ahead logs are used to track the data that was written to a table
The WAL is the primary element in the bookkeeping system
Some bookkeeping data is written to the metadata table; most is written to the replication table
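The WAL idea itself is simple and worth pinning down before the bookkeeping details. This is a minimal stdlib-only sketch of the concept, not Accumulo's actual WAL format or classes: every mutation is appended durably to a log before it is applied in memory, so a crash can be recovered by replaying the log. That same durable, ordered record is what makes the WAL a natural unit for replication bookkeeping.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

public class WalSketch {
    private final Path log;
    private final Map<String, String> memTable = new HashMap<>();

    public WalSketch(Path log) { this.log = log; }

    public void put(String key, String value) throws IOException {
        // 1. Append to the WAL first, for durability.
        Files.write(log, (key + "\t" + value + "\n").getBytes(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        // 2. Only then apply the mutation to the in-memory table.
        memTable.put(key, value);
    }

    // Replay the log after a crash to rebuild the in-memory state.
    public static Map<String, String> recover(Path log) throws IOException {
        Map<String, String> rebuilt = new HashMap<>();
        for (String line : Files.readAllLines(log)) {
            String[] kv = line.split("\t", 2);
            rebuilt.put(kv[0], kv[1]);
        }
        return rebuilt;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("wal", ".log");
        WalSketch w = new WalSketch(log);
        w.put("row1", "a");
        w.put("row2", "b");
        // Pretend the process died; recover purely from the log.
        Map<String, String> recovered = recover(log);
        System.out.println(recovered.get("row1") + " " + recovered.get("row2"));
    }
}
```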
Leverage ZooKeeper for work assignment to tservers
Tservers track WALs used for local tables
Master makes work entries for the WALs that tservers use
Assign replication work back to the tservers
Master cleans up fully-replicated records
GC closes WALs that are no longer referenced by tservers
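The bookkeeping above amounts to tracking, per WAL file and per target, how much of the file has been shipped. A hypothetical, stdlib-only sketch of such a record follows; the real records live in Accumulo's replication table as protobuf status messages, and the field names here are simplifications: once a WAL is closed and fully replayed, its record can be dropped and the file reclaimed.

```java
import java.util.HashMap;
import java.util.Map;

public class ReplicationStatus {
    long begin;      // bytes already replicated to the peer
    long end;        // bytes known to need replication
    boolean closed;  // tserver no longer writes to this WAL

    boolean fullyReplicated() {
        return closed && begin >= end;
    }

    public static void main(String[] args) {
        Map<String, ReplicationStatus> work = new HashMap<>();
        ReplicationStatus s = new ReplicationStatus();
        s.end = 1024; // the tserver wrote 1 KB to this WAL
        work.put("hdfs://wals/wal-0001", s);

        s.begin = 1024;  // the ReplicaSystem pushed all 1 KB to the peer
        s.closed = true; // the tserver rolled over to a new WAL
        // Master pass: drop fully-replicated records so the WAL can be reclaimed.
        work.values().removeIf(ReplicationStatus::fullyReplicated);
        System.out.println(work.isEmpty()); // prints "true"
    }
}
```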
ReplicaSystem interface – does the “heavy lifting”; runs inside the tserver
AccumuloReplicaSystem – implementation that replicates to another Accumulo instance
Local tserver -> peer master
Peer master replies with peer tserver
Local tserver -> peer tserver
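A simplified, hypothetical rendering of the ReplicaSystem plug-in point, in plain Java with invented names (Accumulo's real interface passes the WAL path, a status record, and a replication target): the tserver asks the implementation to push a span of WAL data to the peer and remembers the returned offset as its new progress marker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified interface; names are not Accumulo's.
interface ReplicaSystemSketch {
    // Push WAL data starting at `begin`; return the new replicated offset.
    long replicate(String walPath, long begin, byte[] data);
}

// Toy stand-in for AccumuloReplicaSystem. In the real flow the local
// tserver asks the peer's master for a peer tserver, then replays the
// WAL edits to that tserver over RPC; here we "ship" to an in-memory sink.
class InMemoryReplicaSystem implements ReplicaSystemSketch {
    final List<byte[]> peer = new ArrayList<>();

    @Override
    public long replicate(String walPath, long begin, byte[] data) {
        peer.add(data);             // pretend this was an RPC to a peer tserver
        return begin + data.length; // everything shipped; advance the offset
    }
}

public class ReplicaSystemDemo {
    public static void main(String[] args) {
        InMemoryReplicaSystem rs = new InMemoryReplicaSystem();
        long begin = 0;
        begin = rs.replicate("hdfs://wals/wal-0001", begin, "edit1".getBytes());
        begin = rs.replicate("hdfs://wals/wal-0001", begin, "edit2".getBytes());
        System.out.println(begin); // prints "10"
    }
}
```

Returning an offset rather than a boolean is what makes the system resilient to prolonged outages: a partially shipped WAL simply resumes from the last recorded offset.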
Not just some academic adventure.
Will be available in 1.7.0 and the next version of HDP
Replicate to RDBMS or NoSQL systems
Support bulk imports – not a huge priority because they are typically easy to replicate on your own
Conditional mutation support somehow – would have to change from asynchronous to synchronous replication or introduce conflict resolution
Problem: some table configuration properties are universal while others are specific to an instance