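- High performance
  - Provides asynchronous interfaces
  - Does not provide full linearizability for reads and writes: reads can be served by a replica, which improves performance in workloads with a high read-to-write ratio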
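- Versatile
  - ZooKeeper is a distributed coordination service framework, used mainly to solve the common consistency problems that applications face in a distributed cluster
  - It is like a "Swiss Army knife": it provides many basic operations, and what can be built depends largely on how the user combines them.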
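- About writes
  - zk is also based on a replicated state machine; all write operations must go through the zk leader
  - client write -> zkServer -> zab layer (similar to raft: the log entry for each write must be committed before the leader can reply)
    - The zab layer provides fault tolerance and guarantees the linearizability of write operations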
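The rule that a write is acknowledged only after it is committed can be sketched as follows. This is a deliberately simplified toy, not the real Zab protocol: the `Leader` class, its `write` method, and the `acks_from_followers` parameter are hypothetical names used only for illustration, and leader election, recovery, and networking are left out.

```python
class Leader:
    """Toy leader: a write counts as committed once a majority has logged it."""

    def __init__(self, n_servers):
        self.n_servers = n_servers
        self.committed_log = []  # committed entries, in commit order

    def write(self, entry, acks_from_followers):
        # The leader counts its own copy plus the followers that acknowledged.
        votes = 1 + acks_from_followers
        if votes * 2 > self.n_servers:
            self.committed_log.append(entry)
            return True   # only now may the leader reply to the client
        return False      # no majority: the write is not acknowledged


leader = Leader(n_servers=5)
ok = leader.write("set x=1", acks_from_followers=2)       # 3 of 5 servers
rejected = leader.write("set x=2", acks_from_followers=1)  # 2 of 5 servers
```

With five servers, the leader plus two followers form a majority, so the first write is committed while the second is not.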
- About reads
  - To improve read performance, replicas need to answer client reads, but that violates linearizability, because:
    - Replica may not be in majority, so may not have seen a completed write.
    - Replica may not yet have seen a commit for a completed write.
    - Replica may be entirely cut off from the leader (same as above).
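A minimal sketch of why a replica read can be stale, assuming a toy `Replica` class (a hypothetical name, not a ZooKeeper type) that just applies writes in zxid order:

```python
class Replica:
    """Toy replica: holds a key/value map and the last zxid it applied."""

    def __init__(self):
        self.data = {}
        self.applied_zxid = 0

    def apply(self, zxid, key, value):
        self.data[key] = value
        self.applied_zxid = zxid


leader, follower = Replica(), Replica()

# Write 1 reaches both servers.
leader.apply(1, "x", "old")
follower.apply(1, "x", "old")

# Write 2 is committed by a majority that does not include this follower.
leader.apply(2, "x", "new")

# A client reading from the lagging follower gets the old value.
stale_value = follower.data["x"]
```

The write of `"new"` has completed from the writer's point of view, yet the follower still answers `"old"`, which a linearizable system would not allow.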
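Linearizable writes
- Client writes to zk go through the leader, which guarantees the linearizability of write operations
FIFO client order
- The operations a client sends to zk are executed in the order in which that client sent them
Writes
- Each client's writes are executed by zk in the order the client issued them
- Atomicity of multi-step updates is implemented with a "ready file"
  - The rough idea: before using the data, check whether the corresponding "ready file" marker exists; use the data only if it does
  - When modifying the data, first delete the "ready file" marker, and create the "ready file" again once the modification is complete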
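The "ready file" idea can be sketched with a plain dictionary standing in for the znode tree. The helper names `update_config` and `read_config` and the `/ready` path are assumptions made for this illustration; a real client would additionally set a watch on the ready znode so that it learns about an update that starts while it is still reading.

```python
def update_config(store, new_config):
    """Writer: hide the data, rewrite it, then publish it again."""
    store.pop("/ready", None)          # 1. delete the marker
    for path, value in new_config.items():
        store[path] = value            # 2. rewrite the data
    store["/ready"] = b""              # 3. recreate the marker


def read_config(store, paths):
    """Reader: use the data only if the marker exists."""
    if "/ready" not in store:
        return None                    # an update is in progress (or never ran)
    return {p: store[p] for p in paths}


store = {}
before = read_config(store, ["/cfg/a"])
update_config(store, {"/cfg/a": 1, "/cfg/b": 2})
after = read_config(store, ["/cfg/a", "/cfg/b"])
```

Because each client's writes are applied in FIFO order, a reader that sees the recreated marker also sees every data write that preceded it.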
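Reads
1. Each client's reads are ordered consistently with that client's own sequence of reads and writes
2. Reads never go backward: the client records the highest zxid it has seen, and later reads are not served from older state
3. A client's read waits until that client's earlier writes have completed
   1. zk provides a sync operation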
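The "no backward read" rule can be illustrated with a toy client that remembers the highest zxid it has seen. The `Replica` and `Client` classes are hypothetical, and raising an error is only one possible reaction chosen for this sketch; it is not a description of how ZooKeeper itself handles a lagging server.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    applied_zxid: int = 0
    data: dict = field(default_factory=dict)


class Client:
    def __init__(self):
        self.last_zxid = 0  # highest zxid this client has observed

    def read(self, replica, key):
        # Refuse to read from a replica that is behind what we already saw.
        if replica.applied_zxid < self.last_zxid:
            raise RuntimeError("replica is behind this client")
        self.last_zxid = replica.applied_zxid
        return replica.data.get(key)


up_to_date = Replica(applied_zxid=5, data={"x": "v5"})
lagging = Replica(applied_zxid=3, data={"x": "v3"})

client = Client()
first = client.read(up_to_date, "x")  # client now remembers zxid 5
```

After this read, a call such as `client.read(lagging, "x")` raises instead of returning the older value `"v3"`.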
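- What the figure shows
  - With a write-only workload, adding servers lowers throughput (ops)
  - Moving right along the horizontal axis, adding servers raises throughput (ops)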
Other ways zk improves performance
Clients can send async writes to leader (async = don’t have to wait).
Leader batches up many requests to reduce net and disk-write overhead.
Assumes lots of active clients.
How zk serves as a general-purpose distributed coordination framework
zk structure [figure]
the state: a file-system-like tree of znodes
file names, file content, directories, path names
typical use: configuration info in znodes
set of machines that participate in the application
which machine is the primary
each znode has a version number
types of znodes:
regular
ephemeral
sequential: name + seqno
The zk API
create(path, data, flags)
exclusive – only first create indicates success
delete(path, version)
if znode.version = version, then delete
exists(path, watch)
watch=true means also send notification if path is later created/deleted
getData(path, watch)
setData(path, data, version)
if znode.version = version, then update
getChildren(path, watch)
sync()
sync then read ensures writes before sync are visible to same client’s read
client could instead submit a write
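To make the exclusive-create and version-check semantics concrete, here is a small in-memory model. `ToyZk` is a hypothetical class written for this note, not a client library; it returns `False` where a real client would report an error, and it ignores flags, watches, and sessions.

```python
class ToyZk:
    """In-memory model of create() and version-checked delete()."""

    def __init__(self):
        self.znodes = {}  # path -> (data, version)

    def create(self, path, data):
        # Exclusive: only the first create of a path succeeds.
        if path in self.znodes:
            return False
        self.znodes[path] = (data, 0)
        return True

    def delete(self, path, version):
        # Delete only if the caller's version matches the current one.
        if path not in self.znodes or self.znodes[path][1] != version:
            return False
        del self.znodes[path]
        return True


zk = ToyZk()
first_create = zk.create("/primary", "server-a")
second_create = zk.create("/primary", "server-b")  # loses the race
wrong_version = zk.delete("/primary", version=7)
right_version = zk.delete("/primary", version=0)
```

Exactly one of two competing creates succeeds, which is what lets a znode such as `/primary` act as an election or lock marker.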
API characteristics
ZooKeeper API well tuned to synchronization:
+ exclusive file creation; exactly one concurrent create returns success
+ getData()/setData(x, version) supports mini-transactions
+ sessions automate actions when clients fail (e.g. release lock on failure)
+ sequential files create order among multiple clients
+ watches – avoid polling
Some examples
Example: add one to a number stored in a ZooKeeper znode
what if the read returns stale data?
write will write the wrong value!
what if another client concurrently updates?
will one of the increments be lost?
while true:
    x, v := getData("f")
    if setData("f", x + 1, version=v):
        break
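The retry loop above can be run against a toy versioned store. `Store`, `get_data`, `set_data`, and `increment` are names invented for this sketch; the store returns `False` on a version mismatch to mirror the pseudo-code, whereas a real client library may signal the mismatch differently.

```python
class Store:
    """Toy store: each znode has data and a version bumped on every write."""

    def __init__(self):
        self.znodes = {}  # path -> (data, version)

    def get_data(self, path):
        return self.znodes[path]

    def set_data(self, path, data, version):
        _, current = self.znodes[path]
        if current != version:
            return False  # someone else wrote since we read
        self.znodes[path] = (data, current + 1)
        return True


def increment(store, path):
    # Optimistic "mini-transaction": retry until our version check passes.
    while True:
        x, v = store.get_data(path)
        if store.set_data(path, x + 1, v):
            return x + 1


store = Store()
store.znodes["f"] = (0, 0)
result = increment(store, "f")
stale_write = store.set_data("f", 100, version=0)  # version is now 1
```

A stale read is harmless here: the version it carries no longer matches, so the write is rejected and the loop reads again, and no concurrent increment is lost.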
Example: Locks without Herd Effect
(look at pseudo-code in paper, Section 2.4, page 6)
1. create a “sequential” file
2. list files
3. if no lower-numbered, lock is acquired!
4. if exists(next-lower-numbered, watch=true)
5. wait for event…
6. goto 2
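Steps 2 to 4 of the lock recipe amount to a small decision over the list of children. The function `lock_step` and the `lock-` name prefix are assumptions for this sketch; the zero-padded sequence suffixes make lexicographic order match numeric order.

```python
def lock_step(children, mine):
    """Decide whether `mine` holds the lock, or which znode to watch."""
    ordered = sorted(children)
    idx = ordered.index(mine)
    if idx == 0:
        return ("acquired", None)       # no lower-numbered file exists
    # Watch only the immediate predecessor, not the lock holder,
    # so a release wakes a single waiter instead of the whole herd.
    return ("watch", ordered[idx - 1])


children = ["lock-0000000002", "lock-0000000000", "lock-0000000001"]
holder = lock_step(children, "lock-0000000000")
waiter = lock_step(children, "lock-0000000002")
```

Here the client that created `lock-0000000002` waits on `lock-0000000001` only, so the release of `lock-0000000000` does not wake it.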
How Kafka uses zk
https://time.geekbang.org/column/article/137655?cid=100032301