"Nobody likes dealing with duplicated orders or missing payments, so it’s crucial that the data pipelines we build to process these events ensure each message is processed exactly once.
While Kafka has supported exactly-once semantics for years in its clients and stream processing libraries, Kafka Connect lacked this support for source connectors until version 3.3.
In this talk, learn how KIP-618 made exactly-once source connectors possible. Topics covered will include an overview of exactly-once support in Kafka’s client libraries, a brief refresher on the source connector API, a deep dive into some of the internal workings of Kafka Connect, and the design challenges of EOS support.
This talk assumes basic familiarity with Kafka, its client libraries, and Kafka Connect. Audience members should expect to come away with better knowledge of how to implement exactly-once source connectors and how to run Kafka Connect clusters with exactly-once support."
4. ● “Exactly-once semantics”
● “Semantics” instead of “delivery”, “guarantees”, “delivery
guarantees”, etc. (see Two Generals’ Problem)
● Levels:
○ Probably-once
○ At-least-once
○ At-most-once
○ Exactly-once
● With all else equal, exactly-once is best
● But of course, it’s the hardest to implement
5. EOS
6. Source Connectors
● Kafka stores and transmits events. Where do these events
come from, and where do they go?
● DIY producer/consumer application? Nah 👎
● Connectors: no-code (or low-code) applications to integrate
Kafka with other systems
● Sink connectors write data from Kafka to the external system
● Source connectors read data from the external system into
Kafka
8. Kafka Connect
● Distributed, horizontally-scalable, fault-tolerant ingest/export tool for Kafka
● Developers implement connectors
against the Kafka Connect API
● Cluster administrators install connectors
onto one or more Kafka Connect workers,
which combine to form a cluster
● Users can then create and manage
connectors on that cluster by submitting
JSON configurations via a REST API
● (For users) No code required!
{
  "name": "local-file-source",
  "config": {
    "connector.class": "FileStreamSource",
    "tasks.max": "1",
    "file": "test.txt",
    "topic": "connect-test"
  }
}
9. We’re going to talk about designing support for exactly-once
semantics (EOS) with source connectors developed for Kafka
Connect.
In summary…
18. Zombie fencing: actually pretty easy?
● Give each task a transactional ID derived from the name of
the connector and the task ID
○ E.g., “reddit-source-0” or “chris-ksl-3”
● Let tasks fence out older instances on startup
○ Fencing: disabling a producer from writing to Kafka
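The ID derivation above can be sketched in plain Java. The helper names below are hypothetical, and the config map is a minimal sketch of the standard transactional-producer settings; in real code it would be passed to a KafkaProducer, whose initTransactions() call bumps the epoch for the transactional ID and fences out any older producer instance using the same ID.

```java
import java.util.HashMap;
import java.util.Map;

public class FencingSketch {
    // Hypothetical helper: derive the transactional ID from the connector
    // name and the task ID, e.g. "reddit-source" + 0 -> "reddit-source-0".
    static String transactionalId(String connectorName, int taskId) {
        return connectorName + "-" + taskId;
    }

    // Producer configuration for a task; a KafkaProducer created with this
    // config that calls initTransactions() fences out older producers
    // registered under the same transactional ID.
    static Map<String, Object> producerConfig(String connectorName, int taskId) {
        Map<String, Object> config = new HashMap<>();
        config.put("transactional.id", transactionalId(connectorName, taskId));
        config.put("enable.idempotence", true);
        return config;
    }
}
```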
36. That was not a good idea
● Poor UX
○ Causes tasks to fail in between zombie fencing and end
of rebalance
○ Forcibly kills them, no chance to commit pending offsets
○ Looks like a bug to users
● Correctness issue
○ Users can manually restart failed tasks
○ Even in between zombie fencing and publishing new
task configs
○ Uh oh, a zombie task made it to the other end of the
rebalance!
37. Zombie fencing: durable task counts
● Forget the “fence then write” logic
● Instead, we explicitly track the number of to-be-fenced tasks
in the config topic with a task count record
● These serve two purposes:
○ Explicitly: if fencing is necessary, how many tasks have
to be fenced out
○ Implicitly: determine whether zombie fencing is
necessary at all
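The two purposes can be illustrated with a minimal sketch, assuming a hypothetical in-memory view of the task count record (the real record lives in the config topic, and the exact conditions KIP-618 checks are more involved):

```java
import java.util.ArrayList;
import java.util.List;

public class TaskCountSketch {
    // Implicit purpose (simplified): a round of zombie fencing is only
    // needed if a task count record exists, i.e. a previous generation of
    // tasks may still hold transactional producers. null = no record.
    static boolean fencingRequired(Integer taskCountRecord) {
        return taskCountRecord != null;
    }

    // Explicit purpose: the record says how many tasks the previous
    // generation had, so we fence exactly that many transactional IDs.
    static List<String> idsToFence(String connectorName, int taskCount) {
        List<String> ids = new ArrayList<>();
        for (int taskId = 0; taskId < taskCount; taskId++) {
            ids.add(connectorName + "-" + taskId);
        }
        return ids;
    }
}
```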
41. Laggy task startup
● Zombie fencing disables all initialized task producers from
writing to Kafka
● What if a zombie task lags and hasn’t initialized its producer
by the time zombie fencing for a new generation of tasks
takes place?
● Or, what if a task is restarted on a zombie worker after
zombie fencing takes place?
45. Caveats
● Fencing during rebalancing is not a good idea
○ Makes rebalances more brittle
○ Requires a new rebalance any time we want to restart a
task that failed due to failed zombie fencing
● Instead, we fence outside of rebalances
○ During task startup, workers issue a REST request to the
leader to perform zombie fencing for the connector
○ The leader will perform that round (if necessary), then
send back a 2XX response
○ If a non-2XX response is received, the task is marked
failed
○ Tasks can easily be restarted
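The worker-side check described above can be sketched as follows; taskMayStart is a hypothetical helper name, and the REST round trip itself is elided:

```java
public class RestFencingSketch {
    // Hypothetical worker-side check after asking the leader to perform a
    // round of zombie fencing for the connector: any 2XX status means the
    // round succeeded (or was unnecessary) and the task may start; anything
    // else marks the task failed, from where it can simply be restarted.
    static boolean taskMayStart(int httpStatus) {
        return httpStatus >= 200 && httpStatus < 300;
    }
}
```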
49. In practice (downstream readers)
● Have to filter out records from aborted transactions
● If using the Java consumer, configure with isolation.level
= read_committed
● For sink connectors, do at least one of the following:
○ Configure worker with consumer.isolation.level =
read_committed
○ Configure connector with
consumer.override.isolation.level =
read_committed (3.0.0 or later, with default
worker configuration)
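As a concrete illustration, the consumer property and the per-connector override can be written out as plain config maps; the keys are the real Kafka/Connect ones, while the helper class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class ReadCommittedConfig {
    // Settings for a downstream Java consumer: with
    // isolation.level=read_committed the consumer skips records from
    // aborted (and still-open) transactions.
    static Map<String, Object> consumerConfig() {
        Map<String, Object> config = new HashMap<>();
        config.put("isolation.level", "read_committed"); // default: read_uncommitted
        return config;
    }

    // The same setting applied to a single sink connector via the
    // consumer.override. prefix (Kafka Connect 3.0.0 or later).
    static Map<String, String> sinkConnectorOverride() {
        Map<String, String> config = new HashMap<>();
        config.put("consumer.override.isolation.level", "read_committed");
        return config;
    }
}
```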
50. In practice (writing connectors)
Have to define source offsets correctly
public abstract class SourceTask {
public abstract List<SourceRecord> poll();
}
public class SourceRecord {
public SourceRecord(Map<String, ?> sourcePartition,
Map<String, ?> sourceOffset, ...)
}
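For instance, a file-reading task might define its partition and offset maps like this; the "filename" and "position" keys are hypothetical choices for illustration, since any serializable map works. The partition identifies what is being read, the offset how far the task has read, and both are handed to the SourceRecord constructor.

```java
import java.util.Map;

public class OffsetSketch {
    // Source partition: identifies the thing being read (here, one file).
    static Map<String, ?> sourcePartition(String filename) {
        return Map.of("filename", filename);
    }

    // Source offset: identifies progress within that partition (here, a
    // byte position). Stored by the framework alongside the record.
    static Map<String, ?> sourceOffset(long position) {
        return Map.of("position", position);
    }
}
```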
51. In practice (writing connectors)
public abstract class SourceTask {
protected SourceTaskContext context;
public abstract void start(Map<String, String> props);
}
public interface SourceTaskContext {
OffsetStorageReader offsetStorageReader();
}
public interface OffsetStorageReader {
<T> Map<Map<String, T>, Map<String, Object>>
offsets(Collection<Map<String, T>> partitions);
}