SlideShare una empresa de Scribd logo
1 de 54
Descargar para leer sin conexión
Apache ZooKeeper
Курс «Базы данных»
Щитинин Дмитрий
Computer Science Center

11 ноября 2013 г.

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

1 / 54
Содержание
1

Introducing

2

Data Model

3

API

4

Usages

5

Architecture

6

Throughput

7

Conclusion

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

2 / 54
Introducing

Introducing
High-available (decentralized, no SPoF) and high-reliable
serivce for
Naming
Configuration management
Synchronization
Leader election
Message Queue
Notification system

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

3 / 54
Introducing

History

Google Chubby

2006, Google published a paper on "Chubby"
Google’s primary internal name service
Сommon mechanism for MapReduce, GFS, BigTable
Usages:
election a primary from redundant replicas
repository for high availabable files

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

4 / 54
Introducing

History

History
ZooKeeper is the close clone of Chubby:
2006-2008 - Developed at Yahoo!1
2008-2010 - Moved to Apache as Hadoop subproject
Jan 2011-... - Became top-level project of Apache
projects system

Commiters: Yahoo!, Cloudera, Google, Facebook,
Microsoft, ...2

1
2

http://www.sdtimes.com/content/article.aspx?ArticleID=33011
http://zookeeper.apache.org/credits.html

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

5 / 54
Introducing

Data Model

Data Model
Hierarchical tree of nodes (similar to Google Chubby)
called "znode"
znode can contain either data and subnodes
Not high-volume data storage (znode data - max
1MB)
Access by key (seq of znode’s names "/"delimieted,
like FS paths)
Watches/Notications of znode changes
znodes have version counters and other metadata

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

6 / 54
Introducing

ACID

ACID

Atomicy per node: no partial reads or writes
No rollbacks
Sequential consistency (clients see a single sequence
of operations)
Isolation per node
Durable (commit log + replication)

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

7 / 54
Introducing

CAP

CAP

Guarantees consistency (writes are linearizable - have
total order)
May not be available for writes (strict quorum based
replication of them)
Partition Tolerance

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

8 / 54
Introducing

Users

Users
3

Apache: HBase, HDFS, Hadoop MapReduce, Kafka,
Solr
Katta (distributed Lucene indexes)
Eclipse Communication Framework
Norbert, LinkedIn, partitioned routing, cluster
management
Neo4j, graph database
S4, Yahoo, stream processing
redis_failover (automatic master/slave failover)
Yandex
3

https:
//cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

9 / 54
Introducing

Users

Use in HBase

Leader Election
Ensure there is at most 1 active master at any time
Configuration Management
Store the bootstrap location
Group Membership
Discovers servers and notifies about server death

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

10 / 54
Introducing

Motivation

Motivation
Making up own protocols for coordinate is almost
failed
Distributed system architecture is hard problem
races, deadlocks, inconsistency, reliability
ZK provides tools for create correct distributed
applications:
hard problems already solved
no bycile re-inventions
using open-source well-tested service and clients

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

11 / 54
Data Model

Data Model

Hierarhical tree of nodes called "znodes"
znode contains a data and ACL
znode may contain children znodes
1MB of data for any znode
Atomic data access (reads and writes)
No append operation
znodes referenced by path - "/"delimited strings

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

12 / 54
Data Model

Data Model

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

13 / 54
Data Model

Znode Types

Znode Types
Can be persistent or ephemeral - set at cration time.
Persistent znodes
not tied to a client session
deleted explicitly by a client (any according to ACL)
may have children

Ephemeral znodes
deleted by ZK when creating client session ends
always leaf nodes - may not have subnodes (no even
ephemeral)
tied to a client session, visible to all clients (according to
ACL)
ideal for wacth distributed resources availability
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

14 / 54
Data Model

Sequential Znodes

Sequential Znodes

sequential number by ZK is a part of znode name
sequential flag sets at cration time
numbers are monotonically increasing
sequence is maintained by parent znode
used to impose a global ordering on events
we will learn how to build distributed lock on top of
them

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

15 / 54
Data Model

Watches

Watches

Allow clients to get notifications when a znode changes
set by operations on ZK
triggered by other operations
triggered once - for multiple notification client
reregisters the watch
used for quick reactions of different changes

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

16 / 54
API

Operations

Operations

Operation
create
delete
exists
getACL, setACL
getChildren
getData, setData
sync

Description
Creates a znode (the parent znode must already exist)
Deletes a znode (the znode must not have any children)
Tests whether a znode exists and retrieves its metadata
Gets/sets the ACL for a znode
Gets a list of the children of a znode
Gets/sets the data associated with a znode
Synchronizes a client’s view of a znode with ZooKeeper

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

17 / 54
API

Guarantees

Guarantees
Sequential Consistency - Updates from a client
will be applied in the order that they were sent
Atomicity - Updates either succeed or fail. No
partial results.
Single System Image - A client will see the same
view of the service regardless of the server that it
connects to
Reliability - Once an update has been applied, it
will persist from that time forward until a client
overwrites the update
Timeliness - The clients view of the system is
guaranteed to be up-to-date within a certain time
bound
Щитинин Д. А. (CompSciCenter)
Apache ZooKeeper
11 ноября 2013 г.
18 / 54
API

Guarantees

Updates

Update operations (delete, setData) are conditional:
version number of znode under update is specified
performs only if version numbers are equal
so updates are non-blocking

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

19 / 54
API

Clients

Clients
Core language bindings:
Java (org.apache.zookeeper:zookeeper:3.4.5)
C (zookeeper_st, zookeeper_mt libs)
Contrib bindings:
Perl
Python
REST clients

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

20 / 54
API

Clients

Async & sync

Each binding provides async and sync APIs.
both have same funtionality
async is preferred for event-driven programming
model
async API can offer better throughput

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

21 / 54
API

Watch triggers

Watch triggers

exists, getChildren, getData may have watches set on
them
triggered by write operations: create, delete, setData
don’t triggered by ACL operations
watch event generated includes path of modified
znode

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

22 / 54
API

Watch triggers

Operations and triggers

create znode
exists NodeCreated

create child

getData
getChildren

Щитинин Д. А. (CompSciCenter)

delete znode
NodeDeleted

delete child

NodeDeleted
NodeChildren
Chagned

NodeDeleted

Apache ZooKeeper

set data
NodeData
Changed
NodeData
Chagned

NodeChildren
Chagned

11 ноября 2013 г.

23 / 54
API

Watch triggers

Handling events
NodeCreated & NodeDeleted
contains modified znode path
NodeChildrenChange
need to call getChildren for retrieve fresh list of
children
NodeDataChanged
need to call getData for retrieve fresh znode data
Warning!
State of the znodes may have changed between receiving
the watch event and performing the read operation.
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

24 / 54
API

ACL

ACL
Available authentication types:
digest
by username and password
sasl
Kerberos 4 used
ip
IP address used

4

http://en.wikipedia.org/wiki/Kerberos_(protocol)

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

25 / 54
Usages

Out of Box usages

Out of Box usages
Provided directly by API:
Name Service
Configuration management
Provided by create, setData, getData methods
Benefits:
new nodes only need to know how to connect to
ZooKeeper and can then download all other
configuration information and determine
application can subscribe to changes in the
configuration and apply them quickly
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

26 / 54
Usages

Group Membership

Group Membership
group is represented by a node (persistent)
members of the group are ephemeral nodes under the
group node
clients call getChildren() on group node with watch
set
failed member nodes will be removed automatically
by ZK
NodeChildrenChanged notification send to clients

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

27 / 54
Usages

Lock

Lock
Obtain the lock:
define lock node - create(’_locknode_’)
call create(’_locknode_/lock-’, sequential = true, ephemeral =
true), remember created path as ’path-A’
call getChildren(’_locknode_’) without setting the watch
if ’path-A’ has lowest seq number suffix from all children client
obtains the lock and exits
(if ’path-A’ has not lowest seq number), let ’path-B’ has next lowest
seq number client calls exists() with the watch set on the ’path-B’
if exists() returns false, go to 2 else wait for a notification for
’path-B’ before going to step 2.
Release lock:
Remove node created at step 1 (i.e. ’path-A’)
Note:

node removal will only cause one client to wake up (each
node is watched by exactly one client)
no polling or timeouts

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

28 / 54
Usages

ReadWrite Lock

ReadWrite Lock
Read lock acquire:
read-A = create("_locknode_/read- sequential=true,
ephemeral=true)
getChildren("_locknode_ watch = false)
are there children with path starting with "write-"and having a lower
sequence number than ’read-A’ ?
no: acquire the read lock and exit
yes: let ’write-B’ has next lowest seq number and path starts with
’write-’
call exists(watch=true) on ’write-B’
if exists() returns false, go to 2 else wait for a notification for
’write-B’ before going to step 2

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

29 / 54
Usages

ReadWrite Lock

ReadWrite Lock
Write lock acquire:
"write-A- create("_locknode_/write- sequential=true,
ephemeral=true)
getChildren("_locknode_ watch =false)
are there children with a lower sequence number than the "write-A"?
no: acquire the write lock and exit
yes: let ’path-B’ has next lowest seq number
call exists(watch=true) on ’path-B’
if exists() returns false, go to step 2 else wait for a notification for
’path-B’

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

30 / 54
Usages

Leader Election

Leader Election
Work path - ’_election_’
Leader volunteer behavior:
z = create("_election_/n_ sequential=true, ephemeral=true), z =
n_i (i - seq number assigned for created node)
C = getChildren("_election_")
if i is the lowest seq suffix in C then current volunteer becomes a
leader
else watch for changes on "_election_/n_j where j is the largest
sequence number and j < i
When receive a notification of znode deletion:
C = getChildren("_election_")
if i is the lowest seq suffix in C then current volunteer becomes a
leader
else watch for changes on "_election_/n_j where j is the largest
sequence number and j < i
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

31 / 54
Usages

Apache Curator

Apache Curator

"Curator is a set of Java libraries that make using Apache
ZooKeeper much easier"5
initially created by Netflix6
moved too Apache ecosystem

5
6

http://curator.apache.org
http://en.wikipedia.org/wiki/Netflix

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

32 / 54
Usages

Apache Curator

Apache Curator
Consists from:
Framework - high-level API that simplifies using ZooKeeper
handles the complexity of managing connections to the ZooKeeper
cluster and retrying operations
Recipes - Implementations of some of the common ZooKeeper
"recipes"
Utilities - Various utilities that are useful when using ZooKeeper
Client - A replacement for ZooKeeper class that takes care of some
low-level housekeeping
Errors - How Curator deals with errors, connection issues,
recoverable exceptions, etc.
Extensions - Other recipes (ServiceDiscovery)

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

33 / 54
Architecture

Physical View

Physical View

leader is elected at startup and after leader failures
clients connect to exactly one server to submit requests
every server services clients:
read requests
write requests are forwarded to leader (responses are sent when a majority
of followers have persisted the change)
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

34 / 54
Architecture

Logical View

Logical View

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

35 / 54
Architecture

Logical View

Request Processor
prepares request for execution
when write request received RP calculates what the
state of the system will ater its applying and
transforms it into a transaction
transactions are idempotemt (unlike client request)
future state must be calculated: there may be
outstanding transactions that have not yet been
applied to the database

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

36 / 54
Architecture

Logical View

Atomic Broadcast

Atomic broadcast (total order broadcast) - a broadcast
messaging protocol that ensures that messages are
received reliably and in the same order by all participants.
Special protocol implementation used - Zab

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

37 / 54
Architecture

Logical View

Replicated Database

in-memory database containing the entire data tree
each replica has its own copy
for durability:
log updates to disk (replay log (a write-ahead) of
committed operations)
force writes to be on the disk before they are applied to
the in-memory database
generate periodic snapshots of the in-memory database

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

38 / 54
Architecture

Broadcast algorithms

Broadcast algorithms
A broadcast algorithm transmits a message from one process (primary) to
all other processes in a network or in a broadcast domain, including the
primary. Atomic broadcast satisfies:
Validity
If a correct process broadcasts a message, then all correct processes
will eventually deliver it
Uniform Agreement
If a process delivers a message, then all correct processes eventually
deliver that message
Uniform Integrity
For any message m, every process delivers m at most once, and only
if m was previously broadcast by the sender of m
Uniform Total Order
If processes p and q both deliver messages m and m’, then p delivers
m before m’ if and only if q delivers m before m’
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

39 / 54
Architecture

Paxos

Paxos

Traditional protocol for solving distributed consensus 7 .

7

http://en.wikipedia.org/wiki/Consensus_(computer_science)

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

40 / 54
Architecture

Paxos

Roles
Client
issues a request to the distributed system, and waits for a response
Acceptor (Voter)
Act as the fault-tolerant "memory"of the protocol. Acceptors are
collected into groups called Quorums.
Proposer
Advocates a client request, attempting to convince the Acceptors to
agree on it
Learner
Execute the request and send a response to the client)
Leader A distinguished Proposer (called the leader) to make
progress.

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

41 / 54
Architecture

Paxos

Basic Paxos

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

42 / 54
Architecture

Why not Paxos

Why not Paxos
was not originally intended for atomic broadcasting
but consensus protocols can be used for atomic
broadcasting
does not enable multiple outstanding transactions
does not require FIFO channels for communication,
so it tolerates message loss and reordering
does not support order dependency of proposed
values.
(Problem could be solved by batching multiple
transactions into a single proposal and allowing at
most one proposal at a time, but this has
performance drawbacks)
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

43 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Crash Recovery
Peers communicate by message passing. They may
crash and recover indefinitely many times.
Quorum is subset of peers that contains more than
half ones
Any two quorums have a non-empty intersection
Bidirectional channel for every pair of peers must
satisfy:
integrity
prefix

ZAB uses TCP (therefore FIFO) channels for
communication
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

44 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Transactions
State changes that the primary propagate ("broadcasts")
to the ensemble, and are represented by a pair (v, z):
v - new state
z - transactional identifier - called zxid
zxid z of a transaction is a pair (e, c), where:
e - the epoch number of the primary that generated
the transaction
c - is an integer acting as a counter

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

45 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Broadcast protocol
Peer states:
following
leading
election
Whether a peer is a follower or a leader, it executes three
Zab phases:
discovery
synchronization
broadcast
Previous to Phase 1, a peer is in state election, when it
executes a leader election algorithm.
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

46 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Phase 0: Leader election

peers have state election
no specific leader election protocol needs to be
employed
if peer p voted for peer p’, then p’ is called the
prospective leader for p

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

47 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Phase 1: Discovery

followers communicate with their prospective leader
that leader gathers information about the most
recent transactions that its followers accepted
discover the most updated sequence of accepted
transactions among a quorum, and to establish a new
epoch so that previous leaders cannot commit new
proposals

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

48 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Phase 2: Synchronization
Recovery part of the protocol
synchronize the replicas in the ensemble using the
leader’s updated history from the previous phase
leader communicates with the followers, proposing
transactions from its history. Followers acknowledge
the proposals if their own history is behind the
leader’s history.
when the leader sees acknowledgements from a
quorum, it issues a commit message to them.
At that point, the leader is said to be established and
not anymore prospective.
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

49 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Phase 3: Broadcast

peers stay in this phase until any crashes occur
perform broadcast transaction for client write
requests

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

50 / 54
Architecture

ZooKeeper Atomic Broadcast (ZAB)

Implemented protocol

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

51 / 54
Throughput

Throughput

running on servers with dual 2Ghz Xeon and two SATA 15K RPM drives
one drive was used as a dedicated log device
"servers"indicate the size of the ensemble
Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

52 / 54
Conclusion

Off Screen

Off Screen

More ZooKeeper usages
ZooKeeper cluster tuning
Error handling while native ZooKeeper client usage
Deep study of Paxos and ZAB

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

53 / 54
Conclusion

Questions?

Questions?

Thanks! Questions?

Щитинин Д. А. (CompSciCenter)

Apache ZooKeeper

11 ноября 2013 г.

54 / 54

Más contenido relacionado

La actualidad más candente

Distributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevdayDistributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevdayodnoklassniki.ru
 
WiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-TreeWiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-TreeSveta Smirnova
 
Androsia: A step ahead in securing in-memory Android application data by Sami...
Androsia: A step ahead in securing in-memory Android application data by Sami...Androsia: A step ahead in securing in-memory Android application data by Sami...
Androsia: A step ahead in securing in-memory Android application data by Sami...CODE BLUE
 
glance replicator
glance replicatorglance replicator
glance replicatoririx_jp
 
Postgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud PlatformPostgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud PlatformSungJae Yun
 
MySQL Audit using Percona audit plugin and ELK
MySQL Audit using Percona audit plugin and ELKMySQL Audit using Percona audit plugin and ELK
MySQL Audit using Percona audit plugin and ELKI Goo Lee
 
了解Oracle rac brain split resolution
了解Oracle rac brain split resolution了解Oracle rac brain split resolution
了解Oracle rac brain split resolutionmaclean liu
 
Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018 Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018 Antonios Giannopoulos
 
Эксплуатируем неэксплуатируемые уязвимости SAP
Эксплуатируем неэксплуатируемые уязвимости SAPЭксплуатируем неэксплуатируемые уязвимости SAP
Эксплуатируем неэксплуатируемые уязвимости SAPPositive Hack Days
 
Backup automation in KAKAO
Backup automation in KAKAO Backup automation in KAKAO
Backup automation in KAKAO I Goo Lee
 
Everything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap DumpsEverything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap DumpsAndrei Pangin
 
Do we need Unsafe in Java?
Do we need Unsafe in Java?Do we need Unsafe in Java?
Do we need Unsafe in Java?Andrei Pangin
 
Ricon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak PipeRicon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak PipeSusan Potter
 
The Art of JVM Profiling
The Art of JVM ProfilingThe Art of JVM Profiling
The Art of JVM ProfilingAndrei Pangin
 
Hollywood mode off: security testing at scale
Hollywood mode off: security testing at scaleHollywood mode off: security testing at scale
Hollywood mode off: security testing at scaleClaudio Criscione
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQLMark Wong
 
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014odnoklassniki.ru
 
Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門
Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門
Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門tamtam180
 
Using Apache Spark and MySQL for Data Analysis
Using Apache Spark and MySQL for Data AnalysisUsing Apache Spark and MySQL for Data Analysis
Using Apache Spark and MySQL for Data AnalysisSveta Smirnova
 

La actualidad más candente (20)

Distributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevdayDistributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevday
 
WiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-TreeWiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-Tree
 
Androsia: A step ahead in securing in-memory Android application data by Sami...
Androsia: A step ahead in securing in-memory Android application data by Sami...Androsia: A step ahead in securing in-memory Android application data by Sami...
Androsia: A step ahead in securing in-memory Android application data by Sami...
 
glance replicator
glance replicatorglance replicator
glance replicator
 
Postgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud PlatformPostgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud Platform
 
MySQL Audit using Percona audit plugin and ELK
MySQL Audit using Percona audit plugin and ELKMySQL Audit using Percona audit plugin and ELK
MySQL Audit using Percona audit plugin and ELK
 
了解Oracle rac brain split resolution
了解Oracle rac brain split resolution了解Oracle rac brain split resolution
了解Oracle rac brain split resolution
 
Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018 Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018
 
Эксплуатируем неэксплуатируемые уязвимости SAP
Эксплуатируем неэксплуатируемые уязвимости SAPЭксплуатируем неэксплуатируемые уязвимости SAP
Эксплуатируем неэксплуатируемые уязвимости SAP
 
Streaming replication
Streaming replicationStreaming replication
Streaming replication
 
Backup automation in KAKAO
Backup automation in KAKAO Backup automation in KAKAO
Backup automation in KAKAO
 
Everything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap DumpsEverything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap Dumps
 
Do we need Unsafe in Java?
Do we need Unsafe in Java?Do we need Unsafe in Java?
Do we need Unsafe in Java?
 
Ricon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak PipeRicon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak Pipe
 
The Art of JVM Profiling
The Art of JVM ProfilingThe Art of JVM Profiling
The Art of JVM Profiling
 
Hollywood mode off: security testing at scale
Hollywood mode off: security testing at scaleHollywood mode off: security testing at scale
Hollywood mode off: security testing at scale
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
 
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
 
Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門
Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門
Introduction httpClient on Java11 / Java11時代のHTTPアクセス再入門
 
Using Apache Spark and MySQL for Data Analysis
Using Apache Spark and MySQL for Data AnalysisUsing Apache Spark and MySQL for Data Analysis
Using Apache Spark and MySQL for Data Analysis
 

Destacado

Базы данных. HDFS
Базы данных. HDFSБазы данных. HDFS
Базы данных. HDFSVadim Tsesko
 
Базы данных. Haystack
Базы данных. HaystackБазы данных. Haystack
Базы данных. HaystackVadim Tsesko
 
Базы данных. HBase
Базы данных. HBaseБазы данных. HBase
Базы данных. HBaseVadim Tsesko
 
Фреймворк Akka и его использование в Яндексе
Фреймворк Akka и его использование в ЯндексеФреймворк Akka и его использование в Яндексе
Фреймворк Akka и его использование в ЯндексеVadim Tsesko
 
Базы данных. MongoDB
Базы данных. MongoDBБазы данных. MongoDB
Базы данных. MongoDBVadim Tsesko
 
Multidimensional indexing
Multidimensional indexingMultidimensional indexing
Multidimensional indexingVadim Tsesko
 
Базы данных. Lucene
Базы данных. LuceneБазы данных. Lucene
Базы данных. LuceneVadim Tsesko
 
Базы данных. Cassandra
Базы данных. CassandraБазы данных. Cassandra
Базы данных. CassandraVadim Tsesko
 
Software Transactional Memory
Software Transactional MemorySoftware Transactional Memory
Software Transactional MemoryVadim Tsesko
 
Базы данных. Distributed Commit
Базы данных. Distributed CommitБазы данных. Distributed Commit
Базы данных. Distributed CommitVadim Tsesko
 
Базы данных. Hash & Cache
Базы данных. Hash & CacheБазы данных. Hash & Cache
Базы данных. Hash & CacheVadim Tsesko
 
Базы данных. Введение
Базы данных. ВведениеБазы данных. Введение
Базы данных. ВведениеVadim Tsesko
 
Базы данных. CAP
Базы данных. CAPБазы данных. CAP
Базы данных. CAPVadim Tsesko
 
Actor model. Futures & Promises. Reactive Streams.
Actor model. Futures & Promises. Reactive Streams.Actor model. Futures & Promises. Reactive Streams.
Actor model. Futures & Promises. Reactive Streams.Vadim Tsesko
 
Потоковая фильтрация событий
Потоковая фильтрация событийПотоковая фильтрация событий
Потоковая фильтрация событийCEE-SEC(R)
 
YoctoDB в Яндекс.Вертикалях
YoctoDB в Яндекс.ВертикаляхYoctoDB в Яндекс.Вертикалях
YoctoDB в Яндекс.ВертикаляхCEE-SEC(R)
 

Destacado (18)

Базы данных. HDFS
Базы данных. HDFSБазы данных. HDFS
Базы данных. HDFS
 
Базы данных. Haystack
Базы данных. HaystackБазы данных. Haystack
Базы данных. Haystack
 
Базы данных. HBase
Базы данных. HBaseБазы данных. HBase
Базы данных. HBase
 
Фреймворк Akka и его использование в Яндексе
Фреймворк Akka и его использование в ЯндексеФреймворк Akka и его использование в Яндексе
Фреймворк Akka и его использование в Яндексе
 
Базы данных. MongoDB
Базы данных. MongoDBБазы данных. MongoDB
Базы данных. MongoDB
 
Actor model
Actor modelActor model
Actor model
 
Multidimensional indexing
Multidimensional indexingMultidimensional indexing
Multidimensional indexing
 
Actor model
Actor modelActor model
Actor model
 
Базы данных. Lucene
Базы данных. LuceneБазы данных. Lucene
Базы данных. Lucene
 
Базы данных. Cassandra
Базы данных. CassandraБазы данных. Cassandra
Базы данных. Cassandra
 
Software Transactional Memory
Software Transactional MemorySoftware Transactional Memory
Software Transactional Memory
 
Базы данных. Distributed Commit
Базы данных. Distributed CommitБазы данных. Distributed Commit
Базы данных. Distributed Commit
 
Базы данных. Hash & Cache
Базы данных. Hash & CacheБазы данных. Hash & Cache
Базы данных. Hash & Cache
 
Базы данных. Введение
Базы данных. ВведениеБазы данных. Введение
Базы данных. Введение
 
Базы данных. CAP
Базы данных. CAPБазы данных. CAP
Базы данных. CAP
 
Actor model. Futures & Promises. Reactive Streams.
Actor model. Futures & Promises. Reactive Streams.Actor model. Futures & Promises. Reactive Streams.
Actor model. Futures & Promises. Reactive Streams.
 
Потоковая фильтрация событий
Потоковая фильтрация событийПотоковая фильтрация событий
Потоковая фильтрация событий
 
YoctoDB в Яндекс.Вертикалях
YoctoDB в Яндекс.ВертикаляхYoctoDB в Яндекс.Вертикалях
YoctoDB в Яндекс.Вертикалях
 

Similar a Базы данных. ZooKeeper

Apache Eagle in Action
Apache Eagle in ActionApache Eagle in Action
Apache Eagle in ActionHao Chen
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogJoe Stein
 
Functional reactive programming
Functional reactive programmingFunctional reactive programming
Functional reactive programmingAraf Karsh Hamid
 
Neo4j Vision and Roadmap
Neo4j Vision and Roadmap Neo4j Vision and Roadmap
Neo4j Vision and Roadmap Neo4j
 
Fast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache PulsarFast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache PulsarTimothy Spann
 
Apache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayApache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayAndrei Savu
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceLucidworks (Archived)
 
Centralized logging with Flume
Centralized logging with FlumeCentralized logging with Flume
Centralized logging with FlumeRatnakar Pawar
 
Stabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out CeilometerStabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out CeilometerPradeep Kilambi
 
Stabilising the jenga tower
Stabilising the jenga towerStabilising the jenga tower
Stabilising the jenga towerGordon Chung
 
Apache Eagle: Architecture Evolvement and New Features
Apache Eagle: Architecture Evolvement and New FeaturesApache Eagle: Architecture Evolvement and New Features
Apache Eagle: Architecture Evolvement and New FeaturesHao Chen
 
OpenStack Tokyo Talk Application Data Protection Service
OpenStack Tokyo Talk Application Data Protection ServiceOpenStack Tokyo Talk Application Data Protection Service
OpenStack Tokyo Talk Application Data Protection ServiceEran Gampel
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperSaurav Haloi
 
Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin  Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin Kuberton
 
No sql storage systems an architects view
No sql storage systems   an architects viewNo sql storage systems   an architects view
No sql storage systems an architects viewAjit Bhingarkar
 
StratusLab at FOSDEM'13
StratusLab at FOSDEM'13StratusLab at FOSDEM'13
StratusLab at FOSDEM'13stratuslab
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr PerformanceLucidworks
 
MicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexus
MicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexusMicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexus
MicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexusEmily Jiang
 
Oracle Failover Database Cluster with Grid Infrastructure 12c
Oracle Failover Database Cluster with Grid Infrastructure 12cOracle Failover Database Cluster with Grid Infrastructure 12c
Oracle Failover Database Cluster with Grid Infrastructure 12cTrivadis
 

Similar a Базы данных. ZooKeeper (20)

Apache Eagle in Action
Apache Eagle in ActionApache Eagle in Action
Apache Eagle in Action
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
 
Functional reactive programming
Functional reactive programmingFunctional reactive programming
Functional reactive programming
 
Neo4j Vision and Roadmap
Neo4j Vision and Roadmap Neo4j Vision and Roadmap
Neo4j Vision and Roadmap
 
Fast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache PulsarFast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache Pulsar
 
Apache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayApache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesday
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
 
Centralized logging with Flume
Centralized logging with FlumeCentralized logging with Flume
Centralized logging with Flume
 
Stabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out CeilometerStabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out Ceilometer
 
Stabilising the jenga tower
Stabilising the jenga towerStabilising the jenga tower
Stabilising the jenga tower
 
Apache Eagle: Architecture Evolvement and New Features
Apache Eagle: Architecture Evolvement and New FeaturesApache Eagle: Architecture Evolvement and New Features
Apache Eagle: Architecture Evolvement and New Features
 
OpenStack Tokyo Talk Application Data Protection Service
OpenStack Tokyo Talk Application Data Protection ServiceOpenStack Tokyo Talk Application Data Protection Service
OpenStack Tokyo Talk Application Data Protection Service
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin  Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin
 
No sql storage systems an architects view
No sql storage systems   an architects viewNo sql storage systems   an architects view
No sql storage systems an architects view
 
StratusLab at FOSDEM'13
StratusLab at FOSDEM'13StratusLab at FOSDEM'13
StratusLab at FOSDEM'13
 
Libra Library OS
Libra Library OSLibra Library OS
Libra Library OS
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr Performance
 
MicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexus
MicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexusMicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexus
MicroProfile, Docker, Kubernetes, Istio and Open Shift lab @dev nexus
 
Oracle Failover Database Cluster with Grid Infrastructure 12c
Oracle Failover Database Cluster with Grid Infrastructure 12cOracle Failover Database Cluster with Grid Infrastructure 12c
Oracle Failover Database Cluster with Grid Infrastructure 12c
 

Último

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 

Último (20)

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 

Базы данных. ZooKeeper

  • 1. Apache ZooKeeper Курс «Базы данных» Щитинин Дмитрий Computer Science Center 11 ноября 2013 г. Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 1 / 54
  • 3. Introducing Introducing High-available (decentralized, no SPoF) and high-reliable serivce for Naming Configuration management Synchronization Leader election Message Queue Notification system Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 3 / 54
  • 4. Introducing History Google Chubby 2006, Google published a paper on "Chubby" Google’s primary internal name service Сommon mechanism for MapReduce, GFS, BigTable Usages: election a primary from redundant replicas repository for high availabable files Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 4 / 54
  • 5. Introducing History History ZooKeeper is the close clone of Chubby: 2006-2008 - Developed at Yahoo!1 2008-2010 - Moved to Apache as Hadoop subproject Jan 2011-... - Became top-level project of Apache projects system Commiters: Yahoo!, Cloudera, Google, Facebook, Microsoft, ...2 1 2 http://www.sdtimes.com/content/article.aspx?ArticleID=33011 http://zookeeper.apache.org/credits.html Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 5 / 54
  • 6. Introducing Data Model Data Model Hierarchical tree of nodes (similar to Google Chubby) called "znode" znode can contain either data and subnodes Not high-volume data storage (znode data - max 1MB) Access by key (seq of znode’s names "/"delimieted, like FS paths) Watches/Notications of znode changes znodes have version counters and other metadata Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 6 / 54
  • 7. Introducing ACID ACID Atomicy per node: no partial reads or writes No rollbacks Sequential consistency (clients see a single sequence of operations) Isolation per node Durable (commit log + replication) Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 7 / 54
  • 8. Introducing CAP CAP Guarantees consistency (writes are linearizable - have total order) May not be available for writes (strict quorum based replication of them) Partition Tolerance Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 8 / 54
  • 9. Introducing Users Users 3 Apache: HBase, HDFS, Hadoop MapReduce, Kafka, Solr Katta (distributed Lucene indexes) Eclipse Communication Framework Norbert, LinkedIn, partitioned routing, cluster management Neo4j, graph database S4, Yahoo, stream processing redis_failover (automatic master/slave failover) Yandex 3 https: //cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 9 / 54
  • 10. Introducing Users Use in HBase Leader Election Ensure there is at most 1 active master at any time Configuration Management Store the bootstrap location Group Membership Discovers servers and notifies about server death Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 10 / 54
  • 11. Introducing Motivation Motivation Making up own protocols for coordinate is almost failed Distributed system architecture is hard problem races, deadlocks, inconsistency, reliability ZK provides tools for create correct distributed applications: hard problems already solved no bycile re-inventions using open-source well-tested service and clients Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 11 / 54
  • 12. Data Model Data Model Hierarhical tree of nodes called "znodes" znode contains a data and ACL znode may contain children znodes 1MB of data for any znode Atomic data access (reads and writes) No append operation znodes referenced by path - "/"delimited strings Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 12 / 54
  • 13. Data Model Data Model Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 13 / 54
  • 14. Data Model Znode Types Znode Types Can be persistent or ephemeral - set at cration time. Persistent znodes not tied to a client session deleted explicitly by a client (any according to ACL) may have children Ephemeral znodes deleted by ZK when creating client session ends always leaf nodes - may not have subnodes (no even ephemeral) tied to a client session, visible to all clients (according to ACL) ideal for wacth distributed resources availability Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 14 / 54
  • 15. Data Model Sequential Znodes Sequential Znodes sequential number by ZK is a part of znode name sequential flag sets at cration time numbers are monotonically increasing sequence is maintained by parent znode used to impose a global ordering on events we will learn how to build distributed lock on top of them Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 15 / 54
  • 16. Data Model Watches Watches Allow clients to get notifications when a znode changes set by operations on ZK triggered by other operations triggered once - for multiple notification client reregisters the watch used for quick reactions of different changes Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 16 / 54
  • 17. API Operations Operations Operation create delete exists getACL, setACL getChildren getData, setData sync Description Creates a znode (the parent znode must already exist) Deletes a znode (the znode must not have any children) Tests whether a znode exists and retrieves its metadata Gets/sets the ACL for a znode Gets a list of the children of a znode Gets/sets the data associated with a znode Synchronizes a client’s view of a znode with ZooKeeper Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 17 / 54
  • 18. API Guarantees Guarantees Sequential Consistency - Updates from a client will be applied in the order that they were sent Atomicity - Updates either succeed or fail. No partial results. Single System Image - A client will see the same view of the service regardless of the server that it connects to Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update Timeliness - The clients view of the system is guaranteed to be up-to-date within a certain time bound Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 18 / 54
  • 19. API Guarantees Updates Update operations (delete, setData) are conditional: version number of znode under update is specified performs only if version numbers are equal so updates are non-blocking Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 19 / 54
  • 20. API Clients Clients Core language bindings: Java (org.apache.zookeeper:zookeeper:3.4.5) C (zookeeper_st, zookeeper_mt libs) Contrib bindings: Perl Python REST clients Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 20 / 54
  • 21. API Clients Async & sync Each binding provides async and sync APIs. both have same funtionality async is preferred for event-driven programming model async API can offer better throughput Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 21 / 54
  • 22. API Watch triggers Watch triggers exists, getChildren, getData may have watches set on them triggered by write operations: create, delete, setData don’t triggered by ACL operations watch event generated includes path of modified znode Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 22 / 54
  • 23. API Watch triggers Operations and triggers create znode exists NodeCreated create child getData getChildren Щитинин Д. А. (CompSciCenter) delete znode NodeDeleted delete child NodeDeleted NodeChildren Chagned NodeDeleted Apache ZooKeeper set data NodeData Changed NodeData Chagned NodeChildren Chagned 11 ноября 2013 г. 23 / 54
  • 24. API Watch triggers Handling events NodeCreated & NodeDeleted contains modified znode path NodeChildrenChange need to call getChildren for retrieve fresh list of children NodeDataChanged need to call getData for retrieve fresh znode data Warning! State of the znodes may have changed between receiving the watch event and performing the read operation. Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 24 / 54
  • 25. API ACL ACL Available authentication types: digest by username and password sasl Kerberos 4 used ip IP address used 4 http://en.wikipedia.org/wiki/Kerberos_(protocol) Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 25 / 54
  • 26. Usages Out of Box usages Out of Box usages Provided directly by API: Name Service Configuration management Provided by create, setData, getData methods Benefits: new nodes only need to know how to connect to ZooKeeper and can then download all other configuration information and determine application can subscribe to changes in the configuration and apply them quickly Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 26 / 54
  • 27. Usages Group Membership Group Membership group is represented by a node (persistent) members of the group are ephemeral nodes under the group node clients call getChildren() on group node with watch set failed member nodes will be removed automatically by ZK NodeChildrenChanged notification send to clients Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 27 / 54
  • 28. Usages Lock Lock Obtain the lock: define lock node - create(’_locknode_’) call create(’_locknode_/lock-’, sequential = true, ephemeral = true), remember created path as ’path-A’ call getChildren(’_locknode_’) without setting the watch if ’path-A’ has lowest seq number suffix from all children client obtains the lock and exits (if ’path-A’ has not lowest seq number), let ’path-B’ has next lowest seq number client calls exists() with the watch set on the ’path-B’ if exists() returns false, go to 2 else wait for a notification for ’path-B’ before going to step 2. Release lock: Remove node created at step 1 (i.e. ’path-A’) Note: node removal will only cause one client to wake up (each node is watched by exactly one client) no polling or timeouts Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 28 / 54
  • 29. Usages ReadWrite Lock ReadWrite Lock Read lock acquire: read-A = create("_locknode_/read- sequential=true, ephemeral=true) getChildren("_locknode_ watch = false) are there children with path starting with "write-"and having a lower sequence number than ’read-A’ ? no: acquire the read lock and exit yes: let ’write-B’ has next lowest seq number and path starts with ’write-’ call exists(watch=true) on ’write-B’ if exists() returns false, go to 2 else wait for a notification for ’write-B’ before going to step 2 Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 29 / 54
  • 30. Usages ReadWrite Lock ReadWrite Lock Write lock acquire: "write-A- create("_locknode_/write- sequential=true, ephemeral=true) getChildren("_locknode_ watch =false) are there children with a lower sequence number than the "write-A"? no: acquire the write lock and exit yes: let ’path-B’ has next lowest seq number call exists(watch=true) on ’path-B’ if exists() returns false, go to step 2 else wait for a notification for ’path-B’ Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 30 / 54
  • 31. Usages Leader Election Leader Election Work path - ’_election_’ Leader volunteer behavior: z = create("_election_/n_ sequential=true, ephemeral=true), z = n_i (i - seq number assigned for created node) C = getChildren("_election_") if i is the lowest seq suffix in C then current volunteer becomes a leader else watch for changes on "_election_/n_j where j is the largest sequence number and j < i When receive a notification of znode deletion: C = getChildren("_election_") if i is the lowest seq suffix in C then current volunteer becomes a leader else watch for changes on "_election_/n_j where j is the largest sequence number and j < i Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 31 / 54
  • 32. Usages Apache Curator Apache Curator "Curator is a set of Java libraries that make using Apache ZooKeeper much easier"5 initially created by Netflix6 moved too Apache ecosystem 5 6 http://curator.apache.org http://en.wikipedia.org/wiki/Netflix Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 32 / 54
  • 33. Usages Apache Curator Apache Curator Consists from: Framework - high-level API that simplifies using ZooKeeper handles the complexity of managing connections to the ZooKeeper cluster and retrying operations Recipes - Implementations of some of the common ZooKeeper "recipes" Utilities - Various utilities that are useful when using ZooKeeper Client - A replacement for ZooKeeper class that takes care of some low-level housekeeping Errors - How Curator deals with errors, connection issues, recoverable exceptions, etc. Extensions - Other recipes (ServiceDiscovery) Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 33 / 54
  • 34. Architecture Physical View Physical View leader is elected at startup and after leader failures clients connect to exactly one server to submit requests every server services clients: read requests write requests are forwarded to leader (responses are sent when a majority of followers have persisted the change) Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 34 / 54
  • 35. Architecture Logical View Logical View Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 35 / 54
  • 36. Architecture Logical View Request Processor prepares request for execution when write request received RP calculates what the state of the system will ater its applying and transforms it into a transaction transactions are idempotemt (unlike client request) future state must be calculated: there may be outstanding transactions that have not yet been applied to the database Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 36 / 54
  • 37. Architecture Logical View Atomic Broadcast Atomic broadcast (total order broadcast) - a broadcast messaging protocol that ensures that messages are received reliably and in the same order by all participants. Special protocol implementation used - Zab Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 37 / 54
  • 38. Architecture Logical View Replicated Database in-memory database containing the entire data tree each replica has its own copy for durability: log updates to disk (replay log (a write-ahead) of committed operations) force writes to be on the disk before they are applied to the in-memory database generate periodic snapshots of the in-memory database Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 38 / 54
  • 39. Architecture Broadcast algorithms Broadcast algorithms A broadcast algorithm transmits a message from one process (primary) to all other processes in a network or in a broadcast domain, including the primary. Atomic broadcast satisfies: Validity If a correct process broadcasts a message, then all correct processes will eventually deliver it Uniform Agreement If a process delivers a message, then all correct processes eventually deliver that message Uniform Integrity For any message m, every process delivers m at most once, and only if m was previously broadcast by the sender of m Uniform Total Order If processes p and q both deliver messages m and m’, then p delivers m before m’ if and only if q delivers m before m’ Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 39 / 54
  • 40. Architecture Paxos Paxos Traditional protocol for solving distributed consensus 7 . 7 http://en.wikipedia.org/wiki/Consensus_(computer_science) Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 40 / 54
  • 41. Architecture Paxos Roles Client issues a request to the distributed system, and waits for a response Acceptor (Voter) Act as the fault-tolerant "memory"of the protocol. Acceptors are collected into groups called Quorums. Proposer Advocates a client request, attempting to convince the Acceptors to agree on it Learner Execute the request and send a response to the client) Leader A distinguished Proposer (called the leader) to make progress. Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 41 / 54
  • 42. Architecture Paxos Basic Paxos Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 42 / 54
  • 43. Architecture Why not Paxos Why not Paxos was not originally intended for atomic broadcasting but consensus protocols can be used for atomic broadcasting does not enable multiple outstanding transactions does not require FIFO channels for communication, so it tolerates message loss and reordering does not support order dependency of proposed values. (Problem could be solved by batching multiple transactions into a single proposal and allowing at most one proposal at a time, but this has performance drawbacks) Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 43 / 54
  • 44. Architecture ZooKeeper Atomic Broadcast (ZAB) Crash Recovery Peers communicate by message passing. They may crash and recover indefinitely many times. Quorum is subset of peers that contains more than half ones Any two quorums have a non-empty intersection Bidirectional channel for every pair of peers must satisfy: integrity prefix ZAB uses TCP (therefore FIFO) channels for communication Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 44 / 54
  • 45. Architecture ZooKeeper Atomic Broadcast (ZAB) Transactions State changes that the primary propagate ("broadcasts") to the ensemble, and are represented by a pair (v, z): v - new state z - transactional identifier - called zxid zxid z of a transaction is a pair (e, c), where: e - the epoch number of the primary that generated the transaction c - is an integer acting as a counter Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 45 / 54
  • 46. Architecture ZooKeeper Atomic Broadcast (ZAB) Broadcast protocol Peer states: following leading election Whether a peer is a follower or a leader, it executes three Zab phases: discovery synchronization broadcast Previous to Phase 1, a peer is in state election, when it executes a leader election algorithm. Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 46 / 54
  • 47. Architecture ZooKeeper Atomic Broadcast (ZAB) Phase 0: Leader election peers have state election no specific leader election protocol needs to be employed if peer p voted for peer p’, then p’ is called the prospective leader for p Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 47 / 54
  • 48. Architecture ZooKeeper Atomic Broadcast (ZAB) Phase 1: Discovery followers communicate with their prospective leader that leader gathers information about the most recent transactions that its followers accepted discover the most updated sequence of accepted transactions among a quorum, and to establish a new epoch so that previous leaders cannot commit new proposals Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 48 / 54
  • 49. Architecture ZooKeeper Atomic Broadcast (ZAB) Phase 2: Synchronization Recovery part of the protocol synchronize the replicas in the ensemble using the leader’s updated history from the previous phase leader communicates with the followers, proposing transactions from its history. Followers acknowledge the proposals if their own history is behind the leader’s history. when the leader sees acknowledgements from a quorum, it issues a commit message to them. At that point, the leader is said to be established and not anymore prospective. Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 49 / 54
  • 50. Architecture ZooKeeper Atomic Broadcast (ZAB) Phase 3: Broadcast peers stay in this phase until any crashes occur perform broadcast transaction for client write requests Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 50 / 54
  • 51. Architecture ZooKeeper Atomic Broadcast (ZAB) Implemented protocol Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 51 / 54
  • 52. Throughput Throughput running on servers with dual 2Ghz Xeon and two SATA 15K RPM drives one drive was used as a dedicated log device "servers"indicate the size of the ensemble Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 52 / 54
  • 53. Conclusion Off Screen Off Screen More ZooKeeper usages ZooKeeper cluster tuning Error handling while native ZooKeeper client usage Deep study of Paxos and ZAB Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 53 / 54
  • 54. Conclusion Questions? Questions? Thanks! Questions? Щитинин Д. А. (CompSciCenter) Apache ZooKeeper 11 ноября 2013 г. 54 / 54