Exactly-Once, Again: Adding EOS
Support for Kafka Connect Source
Connectors
Chris Egerton
Nice to meet you!
Chris Egerton
Open Source Program
Office @Aiven
Apache Kafka
committer and PMC
member
(Official bio) (Unofficial bio)
EOS
● “Exactly-once semantics”
● “Semantics” instead of “delivery”, “guarantees”, “delivery
guarantees”, etc. (see Two Generals’ Problem)
● Levels:
○ Probably-once
○ At-least-once
○ At-most-once
○ Exactly-once
● With all else equal, exactly-once is best
● But of course, it’s the hardest to implement
Source Connectors
● Kafka stores and transmits events. Where do these events
come from, and where do they go?
● DIY producer/consumer application? Nah 👎
● Connectors: no-code (or low-code) applications to integrate
Kafka with other systems
● Sink connectors write data from Kafka to the external system
● Source connectors read data from the external system into
Kafka
Kafka Connect
● Distributed, horizontally-scalable, fault-tolerant
ingest/export tool for Kafka
● Developers implement connectors
against the Kafka Connect API
● Cluster administrators install connectors
onto one or more Kafka Connect workers,
which combine to form a cluster
● Users can then create and manage
connectors on that cluster by submitting
JSON configurations via a REST API
● (For users) No code required!
{
"name": "local-file-source",
"connector.class": "FileStreamSource",
"tasks.max": "1",
"file": "test.txt",
"topic": "connect-test"
}
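As a minimal sketch of the "submit JSON via a REST API" step, the snippet below builds (but does not send) the HTTP request for a flat configuration like the one above. It relies on the Kafka Connect REST endpoint `PUT /connectors/{name}/config`; the class name `ConnectorRequests` and the worker address `http://localhost:8083` are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds the request that creates or updates a connector from a flat JSON config.
// ConnectorRequests is a hypothetical helper, not part of Kafka Connect.
final class ConnectorRequests {
    private ConnectorRequests() {}

    static HttpRequest putConfig(String workerUrl, String connectorName, String configJson) {
        return HttpRequest.newBuilder()
                // PUT /connectors/{name}/config accepts the flat config map as its body
                .uri(URI.create(workerUrl + "/connectors/" + connectorName + "/config"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(configJson))
                .build();
    }

    public static void main(String[] args) {
        String json = "{\"connector.class\": \"FileStreamSource\", \"tasks.max\": \"1\", "
                + "\"file\": \"test.txt\", \"topic\": \"connect-test\"}";
        // Assumed worker address; adjust to wherever a Connect worker is listening.
        HttpRequest request = putConfig("http://localhost:8083", "local-file-source", json);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually submit it, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running worker.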
In summary…
We’re going to talk about designing support for exactly-once
semantics (EOS) with source connectors developed for Kafka
Connect.
Challenges
1⃣ Defining source offsets
2⃣ Storing and retrieving source offsets
3⃣ Retrying delivery to Kafka
4⃣ Zombies
🧟🧟🧟🧟🧟
Source offsets, in detail
● Source connectors provide source records
● Source records come with source offsets (partition + offset)
● On startup, connectors use source offsets to know where to
resume from
● Source offsets are stored in an offsets topic by Kafka
Connect
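To make the "partition + offset" idea concrete, here is a small sketch of what those two maps might look like for a file-reading connector, and how a task could pick its resume position on startup. The key names `filename` and `position` and the class `FileOffsets` are assumptions for illustration; only plain `java.util` types are used, so no Kafka Connect dependency is needed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy helper showing the shape of source partitions and source offsets.
// Key names are hypothetical choices for this sketch.
final class FileOffsets {
    static final String FILENAME_KEY = "filename";
    static final String POSITION_KEY = "position";

    // Source partition: which slice of the external system a record came from.
    static Map<String, String> partition(String filename) {
        Map<String, String> partition = new HashMap<>();
        partition.put(FILENAME_KEY, filename);
        return partition;
    }

    // Source offset: where to continue reading within that partition.
    static Map<String, Object> offset(long position) {
        Map<String, Object> offset = new HashMap<>();
        offset.put(POSITION_KEY, position);
        return offset;
    }

    // On startup: resume from the stored offset, or from the beginning if none exists.
    static long resumePosition(Map<String, Object> storedOffset) {
        if (storedOffset == null) {
            return 0L;
        }
        Object position = storedOffset.get(POSITION_KEY);
        // Stored numbers may come back as Integer or Long, so accept any Number.
        return position instanceof Number ? ((Number) position).longValue() : 0L;
    }

    public static void main(String[] args) {
        System.out.println(partition("test.txt") + " -> " + offset(42L));
        System.out.println(resumePosition(offset(42L)));
    }
}
```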
● Defining source offsets (1⃣) is the responsibility of the connector ✅🎉
● Storing and retrieving source offsets (2⃣) is the responsibility of Kafka Connect // TODO
Exactly-once for Kafka clients
● KIP-98: Exactly Once Delivery and Transactional Messaging
○ Released in 0.11.0.0
○ Hence, “Again” in the title
● Idempotent producer: retries without duplicates (3⃣)
● Transactional producer: atomic cross-topic writes
● Transactional IDs: single writer per ID
○ Initialize transactions with a producer to fence out other producers with the same transactional ID
✅🎉
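The "single writer per ID" rule can be illustrated with a toy in-memory model. This is not Kafka's implementation: `ToyCoordinator` and its methods are invented for this sketch, which only mimics the observable behavior that initializing transactions with a transactional ID fences out earlier holders of that ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for the broker side of transactional IDs (hypothetical, for illustration).
final class ToyCoordinator {
    private final Map<String, Integer> latestEpoch = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    // Registering a transactional ID bumps its epoch, fencing any earlier holder.
    int initTransactions(String transactionalId) {
        return latestEpoch.merge(transactionalId, 1, Integer::sum);
    }

    // A write is accepted only from the most recent holder of the ID.
    boolean write(String transactionalId, int epoch, String record) {
        Integer current = latestEpoch.get(transactionalId);
        if (current == null || current != epoch) {
            return false; // unknown or fenced producer
        }
        log.add(record);
        return true;
    }

    List<String> log() {
        return List.copyOf(log);
    }

    public static void main(String[] args) {
        ToyCoordinator coordinator = new ToyCoordinator();
        int oldEpoch = coordinator.initTransactions("reddit-source-0");
        int newEpoch = coordinator.initTransactions("reddit-source-0"); // newer instance starts
        System.out.println(coordinator.write("reddit-source-0", oldEpoch, "from zombie"));   // false
        System.out.println(coordinator.write("reddit-source-0", newEpoch, "from new task")); // true
    }
}
```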
Tracking source offsets (2⃣)
Before:
● Poll connector for source records
● Write records to Kafka
● Periodically write (commit) source offsets to Kafka
● Provides at-least-once support
After:
● Begin transaction
● Poll connector for source records
● Write records immediately to Kafka (still)
● Write source offsets to Kafka
● Commit transaction
● Provides exactly-once support
✅🎉
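The difference between the two loops above comes down to records and offsets becoming visible together or not at all. The sketch below models that with an in-memory sink; `ToyTransactionalSink` and `runBatch` are hypothetical names, and a real task would use a transactional Kafka producer rather than these lists.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: records and the source offset are committed atomically, or not at all.
final class ToyTransactionalSink {
    private final List<String> committedRecords = new ArrayList<>();
    private final List<String> pendingRecords = new ArrayList<>();
    private long committedOffset = 0; // next source position to read
    private long pendingOffset = 0;

    void begin() {
        pendingRecords.clear();
        pendingOffset = committedOffset;
    }

    void write(String record, long nextOffset) {
        pendingRecords.add(record);
        pendingOffset = nextOffset;
    }

    void commit() {
        committedRecords.addAll(pendingRecords);
        committedOffset = pendingOffset;
        pendingRecords.clear();
    }

    void abort() {
        pendingRecords.clear();
        pendingOffset = committedOffset;
    }

    long committedOffset() {
        return committedOffset;
    }

    List<String> committedRecords() {
        return List.copyOf(committedRecords);
    }

    // Copies everything after the committed offset in one transaction.
    // If crashBeforeCommit is set, the transaction is aborted instead of committed.
    static void runBatch(ToyTransactionalSink sink, List<String> source, boolean crashBeforeCommit) {
        sink.begin();
        for (long i = sink.committedOffset(); i < source.size(); i++) {
            sink.write(source.get((int) i), i + 1);
        }
        if (crashBeforeCommit) {
            sink.abort();
        } else {
            sink.commit();
        }
    }
}
```

A failed batch leaves neither records nor offsets behind, so the retry starts from the same position and produces no duplicates.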
75% done!
1⃣ Defining source offsets ✅
2⃣ Storing and retrieving source offsets ✅
3⃣ Retrying delivery to Kafka ✅
4⃣ That leaves…
🧟🧟🧟
“Chekhov’s gun cabinet”
Kafka Connect: diving deeper
● Connectors generate configurations for one or more tasks
● Tasks can be spread across multiple Kafka Connect workers
● Tasks are assigned to workers during a rebalance
● A single worker, the leader, determines that assignment
● Task configurations are stored in a single-partition Kafka topic called the config topic
● For source connectors, Kafka Connect uses one producer per task to send source records to Kafka
Zombie fencing: correctness goals
Tolerated scenarios:
● (Un)graceful worker shutdown
● Arbitrary worker startups
● Arbitrarily-long, arbitrarily-placed, and arbitrarily-timed pauses
Zombie fencing: UX goals
● Minimal unnecessary interruptions (keep processing data)
● Minimal changes to connector code
● Minimal connector management API changes
Zombie fencing: actually pretty easy?
● Give each task a transactional ID derived from the name of
the connector and the task ID
○ E.g., “reddit-source-0” or “chris-ksl-3”
● Let tasks fence out older instances on startup
○ Fencing: disabling a producer from writing to Kafka
Reddit source (old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3
Reddit source (new tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3
Each new task fences out the old task with the same transactional ID
● So… does this work?
● Lol I wish 🥲
Source partition reshuffling problem
● Source connectors read from source partitions
○ Database tables, Kafka topics, etc.
● These are allocated across tasks
● That allocation can change over time
Initial allocation (2 tasks, 2 partitions): Task 0 reads P0; Task 1 reads P1
A new partition (P2) is created: Task 0 reads P0 and P2; Task 1 reads P1
An existing partition (P1) is deleted: Task 0 reads P0; Task 1 reads P2
Source partition reshuffling problem
Old tasks: Task 0 reads P0 and P2; Task 1 reads P1
New tasks: Task 1 reads P2 (and fences out old Task 1)
● New task T1 starts before old task T0 stops
● Both are assigned partition P2
● Duplicates abound!
Zombie fencing: second try
● Whenever new task configurations are read from the config topic, perform a round of zombie fencing
○ For every new task, take its transactional ID and preemptively initialize transactions
● Now it’s impossible for any older instances of newly-created tasks to be running before any newer instances are started
● Do this during rebalance
Zombie fencing: first try
Reddit source (old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3
Reddit source (new tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3
Each new task fences out the old task with the same transactional ID
Zombie fencing: second try
Reddit source (old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3
Zombie fencing round
Reddit source (new tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3
● Woohoo, we did it!
● Just kidding, put your
hand back down
Connector resizing
Four tasks: T0, T1, T2, T3
Zombie fencing round*
Three tasks: T0, T1, T2
* - based on new tasks
● Old task running after newer tasks are started
● If a partition is reassigned from old T3 to new T0/T1/T2…
● Duplicates abound
Zombie fencing: third time’s the
charm
● Instead of using the new set of task configs for our round of zombie fencing, use the old set
● Again, do this during rebalance
● Alright, this has to be enough, right?
● Well… how do we know about the old set of task configs?
Zombie leaders
● The leader of the cluster is the only one allowed to publish task configs to the config topic
● However, we don’t enforce this very strongly
● Some workers may mistakenly believe they are the leader
● Inaccurate task configs may be published in rapid succession, overwriting valid task configs
Config topic, by event:
● Starting state: my-connector (3 tasks)
● Write by zombie leader: my-connector (2 tasks)
● Write by actual leader: my-connector (1 task)
● Rebalance
● On rebalance, leader sees:
○ One new task
○ Two old tasks
● Leader should fence out three tasks
● But leader only fences out two
Zombie fencing: guarded config topic
● How do we prevent zombie leaders from accessing the config topic?
● Transactional producer to the rescue (again)!
● Is this enough?
● Are we guaranteed that the config topic can be used as a source of truth for zombie fencing now?
● 🙃🙃🙃
Leadership change
● If a leader falls out of the cluster, a new one is chosen
● How does the new leader know what needs fencing?
Leadership change (with new tasks)
Config topic, by event:
● Starting state: my-connector (3 tasks)
● Write by leader: my-connector (2 tasks)
● Rebalance (+ zombie fencing)
● Leader falls out of cluster
● Rebalance (+ new leader)
● On first rebalance, old tasks are fenced out successfully
● On second rebalance, new leader doesn’t have to do any fencing
Leadership change (with new tasks)
Config topic, by event:
● Starting state: my-connector (3 tasks)
● Write by leader: my-connector (2 tasks)
● Leader falls out of cluster (before rebalance)
● Rebalance (+ new leader)
● On rebalance, new leader has to fence out old tasks
● But how can it tell?
● The config topic looks the same in both scenarios
● Remember, we want to avoid unnecessary interruptions
Zombie fencing: fence, then write
● Current sequence of events:
○ Publish new task configs
○ Rebalance
○ Zombie fencing
○ Start new tasks
● New order:
○ Zombie fencing
○ Publish new task configs
○ Rebalance
○ Start new tasks
● This has to be it, right?
● You wish 😈
That was not a good idea
● Poor UX
○ Causes tasks to fail in between zombie fencing and end
of rebalance
○ Forcibly kills them, no chance to commit pending offsets
○ Looks like a bug to users
● Correctness issue
○ Users can manually restart failed tasks
○ Even in between zombie fencing and publishing new
task configs
○ Uh oh, a zombie task made it to the other end of the
rebalance!
Zombie fencing: durable task counts
● Forget the “fence then write” logic
● Instead, we explicitly track the number of to-be-fenced tasks
in the config topic with a task count record
● These serve two purposes:
○ Explicitly: if fencing is necessary, how many tasks have
to be fenced out
○ Implicitly: determine whether zombie fencing is
necessary
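The two purposes of the task count record can be sketched with a toy model of the config topic. The record shapes and the names `TaskCounts`, `Entry`, and `tasksToFence` are invented for this illustration; the behavior when no count record was ever written (fence zero tasks) is an assumption of the sketch, not something the slides specify.

```java
import java.util.List;
import java.util.OptionalInt;

// Toy model of task configs and task count records in a single log (hypothetical shapes).
final class TaskCounts {
    enum Kind { TASK_CONFIGS, TASK_COUNT }

    static final class Entry {
        final String connector;
        final Kind kind;
        final int tasks;

        Entry(String connector, Kind kind, int tasks) {
            this.connector = connector;
            this.kind = kind;
            this.tasks = tasks;
        }
    }

    // Returns how many task producers to fence, or empty if no fencing is needed.
    static OptionalInt tasksToFence(List<Entry> log, String connector) {
        int lastConfigs = -1;
        int lastCount = -1;
        int countValue = 0;
        for (int i = 0; i < log.size(); i++) {
            Entry entry = log.get(i);
            if (!entry.connector.equals(connector)) {
                continue;
            }
            if (entry.kind == Kind.TASK_CONFIGS) {
                lastConfigs = i;
            } else {
                lastCount = i;
                countValue = entry.tasks;
            }
        }
        // Implicit purpose: a count record newer than the task configs means
        // fencing already happened for this generation.
        if (lastCount > lastConfigs) {
            return OptionalInt.empty();
        }
        // Explicit purpose: fence as many tasks as the latest count record says.
        // With no count record at all this returns 0 (assumption of this sketch).
        return OptionalInt.of(countValue);
    }
}
```

With a log of "3 task configs, count 3, 2 task configs", this reports three tasks to fence; once a count record of 2 is appended, it reports that no fencing is needed.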
Zombie fencing: durable task counts
Config topic, by event:
● Starting state: my-connector (3 tasks), my-connector-task-count (3)
● New task configs: my-connector (2 tasks)
● Rebalance (+ zombie fencing): my-connector-task-count (2)
● On rebalance:
○ Leader fences three tasks based on latest task count record
○ Leader writes new task count of two tasks based on latest task configs
Safe to bring up tasks?
● Starting state: ✅
● New task configs: ❌
● Rebalance (+ zombie fencing): ✅
● What could possibly break this now?
● Oh yeah, what was that about task restarts?
Laggy task startup
● Zombie fencing disables all initialized task producers from
writing to Kafka
● What if a zombie task lags and hasn’t initialized its producer
by the time zombie fencing for a new generation of tasks
takes place?
● Or, what if a task is restarted on a zombie worker after
zombie fencing takes place?
Reddit source (4 old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3
Zombie fencing round
Reddit source (3 new tasks): reddit-source-0, reddit-source-1, reddit-source-2
reddit-source-3 (Task is lagging during startup)
reddit-source-3 (Task has finished startup)
Zombie fencing: check your work
● After initializing transactions for a task producer, have to make sure it’s still safe to run the task
● New sequence of events:
○ Decide to (re)start task
○ Create producer for task and initialize transactions
○ Read to end of config topic
○ If new task configs found, abort startup and abandon the task
○ Otherwise, safe to start processing data
● Have we finally done it?
🎉🎉🎉 Yes! 🎉🎉🎉
(But…)
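The "check your work" startup sequence can be sketched as a claim followed by a check. Everything here is a toy: `TaskStartup`, `ToyConfigTopic`, and the idea of a numeric generation that increases with each new set of task configs are assumptions made for this illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of "initialize transactions, then re-read the config topic" (hypothetical names).
final class TaskStartup {
    // Stand-in for the config topic: the generation grows whenever new task configs are published.
    static final class ToyConfigTopic {
        private final AtomicLong generation = new AtomicLong();

        void publishTaskConfigs() {
            generation.incrementAndGet();
        }

        long readToEnd() {
            return generation.get();
        }
    }

    // Returns true if it is safe to start processing data.
    static boolean tryStart(ToyConfigTopic topic, long generationAtAssignment, Runnable initTransactions) {
        initTransactions.run();          // claim: fence out older producers with this ID
        long latest = topic.readToEnd(); // check: did new task configs appear in the meantime?
        return latest == generationAtAssignment; // if they did, abandon the task
    }

    public static void main(String[] args) {
        ToyConfigTopic topic = new ToyConfigTopic();
        long assigned = topic.readToEnd();
        // A laggy task: new task configs are published while it is still starting up.
        boolean safe = tryStart(topic, assigned, topic::publishTaskConfigs);
        System.out.println(safe); // false
    }
}
```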
Caveats
● Fencing during rebalancing is not a good idea
○ Makes rebalances more brittle
○ Requires a new rebalance any time we want to restart a
task that failed due to failed zombie fencing
● Instead, we fence outside of rebalances
○ During task startup, workers issue a REST request to the
leader to perform zombie fencing for the connector
○ The leader will perform that round (if necessary), then
send back a 2XX response
○ If a non-2XX response is received, the task is marked
failed
○ Tasks can easily be restarted
Caveats
● Throwaway producers for initializing transactions are wasteful
● We added a new admin client API in 3.3 to do this instead
● We have to be careful about how the leader uses the transactional producer
● Similar “claim-then-check” logic to source tasks
In practice
Implementation details are boring, how do we actually use this feature?
In practice (cluster administrators)
● New clusters:
○ Use version 3.3.0 or later
○ Configure every worker with
exactly.once.source.support = enabled
● Existing clusters:
○ Rolling upgrade 1:
○ Bring all workers to 3.3.0 or later
○ Configure every worker with
exactly.once.source.support = preparing
○ Rolling upgrade 2:
○ Configure every worker with
exactly.once.source.support = enabled
In practice (downstream readers)
● Have to filter out records from aborted transactions
● If using the Java consumer, configure with isolation.level
= read_committed
● For sink connectors, do at least one of the following:
○ Configure worker with consumer.isolation.level =
read_committed
○ Configure connector with
consumer.override.isolation.level =
read_committed (3.0.0 or later, with default
worker configuration)
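For the Java consumer case, a minimal sketch of the relevant settings is shown below. It only assembles a `java.util.Properties` object, so it runs without the Kafka client library; the class name `ReadCommittedConfig`, the broker address, and the group ID are placeholders chosen for this example.

```java
import java.util.Properties;

// Builds consumer settings that hide records from aborted transactions.
// ReadCommittedConfig is a hypothetical helper for this sketch.
final class ReadCommittedConfig {
    private ReadCommittedConfig() {}

    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Only return records from committed transactions.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group ID.
        System.out.println(consumerProps("localhost:9092", "downstream-reader"));
    }
}
```

With the Kafka client library on the classpath, these properties could be passed to `new KafkaConsumer<>(props)`.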
In practice (writing connectors)
Have to define source offsets correctly
public abstract class SourceTask {
public abstract List<SourceRecord> poll();
}
public class SourceRecord {
public SourceRecord(Map<String, ?>
sourcePartition, Map<String, ?> sourceOffset, ...)
}
In practice (writing connectors)
Have to use source offsets correctly
public abstract class SourceTask {
protected SourceTaskContext context;
public abstract void start(Map<String, String> props);
}
public interface SourceTaskContext {
OffsetStorageReader offsetStorageReader();
}
public interface OffsetStorageReader {
<T> Map<Map<String, T>, Map<String, Object>>
offsets(Collection<Map<String, T>> partitions);
}
In summary
● Exactly-once is hard to implement
○ Especially handling zombie workers/zombie tasks across task reconfigurations
● Exactly-once is (hopefully) easy to use
○ If it’s not, ping me on Jira!
○ https://issues.apache.org/jira/projects/KAFKA/issues
● For all the details, check out KIP-618
○ https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors
Thank you!
Chris Egerton
Open Source Program Office @Aiven
@C0urante

A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Último (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors with Chris Egerton

  • 1. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors Chris Egerton
  • 2. Nice to meet you! Chris Egerton Open Source Program Office @Aiven Apache Kafka committer and PMC member (Official bio) (Unofficial bio)
  • 3. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors
  • 4. ● “Exactly-once semantics” ● “Semantics” instead of “delivery”, “guarantees”, “delivery guarantees”, etc. (see Two Generals’ Problem) ● Levels: ○ Probably-once ○ At-least-once ○ At-most-once ○ Exactly-once ● With all else equal, exactly-once is best ● But of course, it’s the hardest to implement EOS
  • 5. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors
  • 6. Source Connectors ● Kafka stores and transmits events. Where do these events come from, and where do they go? ● DIY producer/consumer application? Nah 👎 ● Connectors: no-code (or low-code) applications to integrate Kafka with other systems ● Sink connectors write data from Kafka to the external system ● Source connectors read data from the external system into Kafka
  • 7. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors
  • 8. Kafka Connect ● Distributed, horizontally-scalable, fault-tolerant ingest/export tool for Kafka ● Developers implement connectors against the Kafka Connect API ● Cluster administrators install connectors onto one or more Kafka Connect workers, which combine to form a cluster ● Users can then create and manage connectors on that cluster by submitting JSON configurations via a REST API ● (For users) No code required! { "name": "local-file-source", "connector.class": "FileStreamSource", "tasks.max": "1", "file": "test.txt", "topic": "connect-test" }
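A minimal sketch of what "submitting JSON configurations via a REST API" can look like from Java. It assumes a worker reachable at `http://localhost:8083` and uses the `PUT /connectors/{name}/config` endpoint; the class and method names are hypothetical, and the request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Illustrative helper (hypothetical name) that builds a connector-config request. */
public class ConnectorConfigRequest {
    /** Builds a PUT request that creates or updates the named connector's configuration. */
    static HttpRequest build(String workerUrl, String connectorName, String configJson) {
        return HttpRequest.newBuilder(URI.create(workerUrl + "/connectors/" + connectorName + "/config"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(configJson))
                .build();
    }

    public static void main(String[] args) {
        // Same keys as the slide's example; the worker URL is an assumption.
        String config = "{"
                + "\"name\": \"local-file-source\", "
                + "\"connector.class\": \"FileStreamSource\", "
                + "\"tasks.max\": \"1\", "
                + "\"file\": \"test.txt\", "
                + "\"topic\": \"connect-test\""
                + "}";
        HttpRequest request = build("http://localhost:8083", "local-file-source", config);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of passing the request to `java.net.http.HttpClient#send`.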
  • 9. We’re going to talk about designing support for exactly-once semantics (EOS) with source connectors developed for Kafka Connect. In summary…
  • 10. Challenges 1⃣ Defining source offsets 2⃣ Storing and retrieving source offsets 3⃣ Retrying delivery to Kafka 4⃣ Zombies 🧟🧟🧟🧟🧟
  • 11. Source offsets, in detail ● Source connectors provide source records ● Source records come with source offsets (partition + offset) ● On startup, connectors use source offsets to know where to resume from ● Source offsets are stored in an offsets topic by Kafka Connect ● Defining source offsets (1⃣) is the responsibility of the connector ✅🎉 ● Storing and retrieving source offsets (2⃣) is the responsibility of Kafka Connect // TODO
  • 12. Exactly-once for Kafka clients ● KIP-98: Exactly Once Delivery and Transactional Messaging ○ Released in 0.11.0.0 ○ Hence, “Again” in the title ● Idempotent producer: retries without duplicates (3⃣) ● Transactional producer: atomic cross-topic writes ● Transactional IDs: single writer per ID ○ Initialize transactions with a producer to fence out other producers with the same transactional ID ✅🎉
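The "single writer per ID" idea can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Kafka's actual coordinator logic: the `Coordinator` class and its epoch counter are hypothetical stand-ins for the broker-side bookkeeping that lets a newly initialized producer fence out older ones.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of transactional-ID fencing; all names here are hypothetical. */
public class FencingSketch {
    /** Stand-in for the broker side: tracks the latest epoch per transactional ID. */
    static class Coordinator {
        private final Map<String, Integer> epochs = new HashMap<>();

        /** Models initializing transactions: bump the epoch, which fences earlier producers. */
        int initTransactions(String transactionalId) {
            return epochs.merge(transactionalId, 1, Integer::sum);
        }

        /** A write is accepted only from the producer holding the latest epoch. */
        boolean tryWrite(String transactionalId, int epoch) {
            return epochs.getOrDefault(transactionalId, 0) == epoch;
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        int oldEpoch = coordinator.initTransactions("reddit-source-0");
        int newEpoch = coordinator.initTransactions("reddit-source-0"); // fences the old producer
        System.out.println("old producer can write: " + coordinator.tryWrite("reddit-source-0", oldEpoch));
        System.out.println("new producer can write: " + coordinator.tryWrite("reddit-source-0", newEpoch));
    }
}
```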
  • 13. Tracking source offsets (2⃣) Before: ● Poll connector for source records ● Write records to Kafka ● Periodically write (commit) source offsets to Kafka ● Provides at-least-once support After: ● Begin transaction ● Poll connector for source records ● Write records immediately to Kafka (still) ● Write source offsets to Kafka ● Commit transaction ● Provides exactly-once support ✅🎉
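The difference between the two flows shows up when a worker crashes partway through. The following toy simulation is only a sketch: it models the topic as an in-memory list as seen by a reader that only sees committed data, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy comparison of the "before" and "after" flows; not Kafka Connect's implementation. */
public class OffsetCommitSketch {
    final List<String> visibleRecords = new ArrayList<>(); // what a committed-only reader would see
    int committedOffset = 0;                                // next source position to resume from

    /** "Before": records are written right away; the offset commit happens later (or never, on a crash). */
    void runNonTransactional(List<String> source, boolean crashBeforeOffsetCommit) {
        visibleRecords.addAll(source.subList(committedOffset, source.size()));
        if (crashBeforeOffsetCommit) return;
        committedOffset = source.size();
    }

    /** "After": records and offsets are staged together and become visible atomically on commit. */
    void runTransactional(List<String> source, boolean crashBeforeCommit) {
        List<String> staged = new ArrayList<>(source.subList(committedOffset, source.size()));
        int stagedOffset = source.size();
        if (crashBeforeCommit) return; // aborted: neither records nor offsets become visible
        visibleRecords.addAll(staged);
        committedOffset = stagedOffset;
    }

    public static void main(String[] args) {
        List<String> source = List.of("a", "b", "c");

        OffsetCommitSketch before = new OffsetCommitSketch();
        before.runNonTransactional(source, true);  // crash after writing records
        before.runNonTransactional(source, false); // restart resumes from offset 0
        System.out.println("before: " + before.visibleRecords); // records appear twice

        OffsetCommitSketch after = new OffsetCommitSketch();
        after.runTransactional(source, true);      // crash before commit
        after.runTransactional(source, false);     // restart resumes from offset 0
        System.out.println("after:  " + after.visibleRecords);  // records appear once
    }
}
```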
  • 14. 75% done! 1⃣ Defining source offsets ✅ 2⃣ Storing and retrieving source offsets ✅ 3⃣ Retrying delivery to Kafka ✅ 4⃣ That leaves… 🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟 🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟 🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟
  • 15. “Chekhov’s gun cabinet” Kafka Connect: diving deeper ● Connectors generate configurations for one or more tasks ● Tasks can be spread across multiple Kafka Connect workers ● Tasks are assigned to workers during a rebalance ● A single worker, the leader, determines that assignment ● Task configurations are stored in a single-partition Kafka topic called the config topic ● For source connectors, Kafka Connect uses one producer per task to send source records to Kafka
  • 16. Zombie fencing: correctness goals Tolerated scenarios: ● (Un)graceful worker shutdown ● Arbitrary worker startups ● Arbitrarily-long, arbitrarily-placed, and arbitrarily-timed pauses
  • 17. Zombie fencing: UX goals ● Minimal unnecessary interruptions (keep processing data) ● Minimal changes to connector code ● Minimal connector management API changes
  • 18. Zombie fencing: actually pretty easy? ● Give each task a transactional ID derived from the name of the connector and the task ID ○ E.g., “reddit-source-0” or “chris-ksl-3” ● Let tasks fence out older instances on startup ○ Fencing: disabling a producer from writing to Kafka
  • 19. Zombie fencing: actually pretty easy? Reddit source (old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3. Reddit source (new tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3. Each new task fences out the old task with the same transactional ID. ● So… does this work? ● Lol I wish 🥲
  • 20. Source partition reshuffling problem ● Source connectors read from source partitions ○ Database tables, Kafka topics, etc. ● These are allocated across tasks ● That allocation can change over time
  • 21. Source partition reshuffling problem ● Initial allocation (2 tasks, 2 partitions): Task 0 has P0, Task 1 has P1 ● A new partition (P2) is created: Task 0 has P0 and P2, Task 1 has P1 ● An existing partition (P1) is deleted: Task 0 has P0, Task 1 has P2
  • 22. Source partition reshuffling problem Old tasks: Task 0 (P0, P2), Task 1 (P1). New tasks: Task 1 (P2), which fences out old Task 1. ● New task T1 starts before old task T0 stops ● Both are assigned partition P2 ● Duplicates abound!
  • 23. Zombie fencing: second try ● Whenever new task configurations are read from the config topic, perform a round of zombie fencing ○ For every new task, take its transactional ID and preemptively initialize transactions ● Now it’s impossible for any older instances of newly-created tasks to be running before any newer instances are started ● Do this during rebalance
  • 24. Zombie fencing: first try Reddit source (old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3. Reddit source (new tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3. Each new task fences out the old task with the same transactional ID.
  • 25. Zombie fencing: second try Reddit source (old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3. Reddit source (new tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3. A zombie fencing round runs between the old tasks and the new tasks. ● Woohoo, we did it! ● Just kidding, put your hand back down
  • 26. Zombie fencing: second try ● Whenever new task configurations are read from the config topic, perform a round of zombie fencing ○ For every new task, take its transactional ID and preemptively initialize transactions ● Now it’s impossible for any older instances of newly-created tasks to be running before any newer instances are started ● Do this during rebalance
  • 27. Connector resizing Four tasks (T0, T1, T2, T3) are resized to three tasks (T0, T1, T2), with a zombie fencing round based on the new tasks. ● Old task running after newer tasks are started ● If a partition is reassigned from old T3 to new T0/T1/T2… ● Duplicates abound
  • 28. Zombie fencing: third time’s the charm ● Instead of using the new set of task configs for our round of zombie fencing, use the old set ● Again, do this during rebalance ● Alright, this has to be enough, right? ● Well… how do we know about the old set of task configs?
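Why the old set of task configs matters can be shown with a few lines of code. This sketch assumes the simplified "<connector>-<task ID>" transactional ID scheme from the slides (the class and method names are hypothetical) and compares a fencing round driven by the new task count with one driven by the old task count when a connector shrinks from four tasks to three.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: which transactional IDs a fencing round would cover. */
public class FencingRoundSketch {
    /** Transactional IDs for a connector's tasks, following the "<connector>-<task ID>" example. */
    static List<String> transactionalIds(String connector, int taskCount) {
        List<String> ids = new ArrayList<>();
        for (int task = 0; task < taskCount; task++) {
            ids.add(connector + "-" + task);
        }
        return ids;
    }

    public static void main(String[] args) {
        int oldTaskCount = 4;
        int newTaskCount = 3;
        List<String> fencedUsingNewSet = transactionalIds("reddit-source", newTaskCount);
        List<String> fencedUsingOldSet = transactionalIds("reddit-source", oldTaskCount);
        // The old task with ID 3 is only covered when the round is based on the old set.
        System.out.println("new set covers reddit-source-3: " + fencedUsingNewSet.contains("reddit-source-3"));
        System.out.println("old set covers reddit-source-3: " + fencedUsingOldSet.contains("reddit-source-3"));
    }
}
```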
  • 29. Zombie leaders ● The leader of the cluster is the only one allowed to publish task configs to the config topic ● However, we don’t enforce this very strongly ● Some workers may mistakenly believe they are the leader ● Inaccurate task configs may be published in rapid succession, overwriting valid task configs
  • 30. Zombie leaders Config topic, by event: Starting state: my-connector (3 tasks) → Write by zombie leader: my-connector (2 tasks) → Write by actual leader: my-connector (1 task) → Rebalance ● On rebalance, leader sees: ○ One new task ○ Two old tasks ● Leader should fence out three tasks ● But leader only fences out two
  • 31. Zombie fencing: guarded config topic ● How do we prevent zombie leaders from accessing the config topic? ● Transactional producer to the rescue (again)! ● Is this enough? ● Are we guaranteed that the config topic can be used as a source of truth for zombie fencing now? ● 🙃🙃🙃
  • 32. Leadership change ● If a leader falls out of the cluster, a new one is chosen ● How does the new leader know what needs fencing?
  • 33. Leadership change (with new tasks) Config topic, by event: Starting state: my-connector (3 tasks) → Write by leader: my-connector (2 tasks) → Rebalance (+ zombie fencing) → Leader falls out of cluster → Rebalance (+ new leader) ● On first rebalance, old tasks are fenced out successfully ● On second rebalance, new leader doesn’t have to do any fencing
  • 34. Leadership change (with new tasks) Config topic, by event: Starting state: my-connector (3 tasks) → Write by leader: my-connector (2 tasks) → Leader falls out of cluster (before rebalance) → Rebalance (+ new leader) ● On rebalance, new leader has to fence out old tasks ● But how can it tell? ● The config topic looks the same in both scenarios ● Remember, we want to avoid unnecessary interruptions
  • 35. Zombie fencing: fence, then write ● Current sequence of events: ○ Publish new task configs ○ Rebalance ○ Zombie fencing ○ Start new tasks ● New order: ○ Zombie fencing ○ Publish new task configs ○ Rebalance ○ Start new tasks ● This has to be it, right? ● You wish 😈
  • 36. That was not a good idea ● Poor UX ○ Causes tasks to fail in between zombie fencing and end of rebalance ○ Forcibly kills them, no chance to commit pending offsets ○ Looks like a bug to users ● Correctness issue ○ Users can manually restart failed tasks ○ Even in between zombie fencing and publishing new task configs ○ Uh oh, a zombie task made it to the other end of the rebalance!
  • 37. Zombie fencing: durable task counts ● Forget the “fence then write” logic ● Instead, we explicitly track the number of to-be-fenced tasks in the config topic with a task count record ● These serve two purposes: ○ Explicitly: if fencing is necessary, how many tasks have to be fenced out ○ Implicitly: determine whether zombie fencing is necessary
  • 38. Zombie fencing: durable task counts Config topic, by event: Starting state: my-connector (3 tasks), my-connector-task-count (3) → New task configs: my-connector (2 tasks) → Rebalance (+ zombie fencing): my-connector-task-count (2) ● On rebalance: ○ Leader fences three tasks based on latest task count record ○ Leader writes new task count of two tasks based on latest task configs
  • 39. Zombie fencing: durable task counts Config topic, by event: Starting state: my-connector (3 tasks), my-connector-task-count (3) → New task configs: my-connector (2 tasks) → Rebalance (+ zombie fencing): my-connector-task-count (2) Safe to bring up tasks? Starting state ✅, after new task configs ❌, after rebalance (+ zombie fencing) ✅
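The two purposes of the task count record (how many tasks to fence, and whether fencing is needed at all) can be sketched as a scan over the config topic. This is a toy model, not the actual Kafka Connect code: `ConfigRecord` and `tasksToFence` are hypothetical names, the topic is a plain list for a single connector, and a connector with no task count record yet is assumed to have nothing to fence.

```java
import java.util.List;

/** Toy model of task count records in the config topic; all names are hypothetical. */
public class TaskCountSketch {
    /** One record for a single connector: either a set of task configs or a task count record. */
    static final class ConfigRecord {
        final boolean isTaskCount;
        final int tasks;

        private ConfigRecord(boolean isTaskCount, int tasks) {
            this.isTaskCount = isTaskCount;
            this.tasks = tasks;
        }

        static ConfigRecord taskConfigs(int tasks) { return new ConfigRecord(false, tasks); }
        static ConfigRecord taskCount(int tasks) { return new ConfigRecord(true, tasks); }
    }

    /**
     * Returns how many tasks must be fenced before new tasks may start, or 0 if it is already safe.
     * Fencing is pending when task configs were written after the latest task count record.
     */
    static int tasksToFence(List<ConfigRecord> configTopic) {
        int lastCount = 0;
        boolean configsAfterCount = false;
        for (ConfigRecord record : configTopic) {
            if (record.isTaskCount) {
                lastCount = record.tasks;
                configsAfterCount = false;
            } else {
                configsAfterCount = true;
            }
        }
        return configsAfterCount ? lastCount : 0;
    }

    public static void main(String[] args) {
        List<ConfigRecord> start = List.of(ConfigRecord.taskConfigs(3), ConfigRecord.taskCount(3));
        List<ConfigRecord> newConfigs = List.of(ConfigRecord.taskConfigs(3), ConfigRecord.taskCount(3),
                ConfigRecord.taskConfigs(2));
        List<ConfigRecord> afterFencing = List.of(ConfigRecord.taskConfigs(3), ConfigRecord.taskCount(3),
                ConfigRecord.taskConfigs(2), ConfigRecord.taskCount(2));
        System.out.println("starting state, tasks to fence: " + tasksToFence(start));
        System.out.println("new task configs, tasks to fence: " + tasksToFence(newConfigs));
        System.out.println("after fencing, tasks to fence: " + tasksToFence(afterFencing));
    }
}
```

Because the answer depends only on what is in the topic, a newly elected leader can reach the same conclusion as the old one without knowing what happened before it took over.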
  • 40. Zombie fencing: durable task counts ● What could possibly break this now? ● Oh yeah, what was that about task restarts?
  • 41. Laggy task startup ● Zombie fencing disables all initialized task producers from writing to Kafka ● What if a zombie task lags and hasn’t initialized its producer by the time zombie fencing for a new generation of tasks takes place? ● Or, what if a task is restarted on a zombie worker after zombie fencing takes place?
  • 42. Laggy task startup Reddit source (4 old tasks): reddit-source-0, reddit-source-1, reddit-source-2, reddit-source-3. Reddit source (3 new tasks): reddit-source-0, reddit-source-1, reddit-source-2. Old reddit-source-3 is lagging during startup when the zombie fencing round runs, and has finished startup afterwards.
  • 43. Zombie fencing: check your work ● After initializing transactions for a task producer, have to make sure it’s still safe to run the task ● New sequence of events: ○ Decide to (re)start task ○ Create producer for task and initialize transactions ○ Read to end of config topic ○ If new task configs found, abort startup and abandon the task ○ Otherwise, safe to start processing data ● Have we finally done it?
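The sequence above can be sketched as a small check. This is an illustration under stated assumptions rather than the real worker code: the config topic is reduced to a hypothetical generation counter that increases whenever new task configs are published, and initializing transactions is passed in as a `Runnable` so a laggy startup can be simulated.

```java
/** Toy model of the startup check; all names are hypothetical. */
public class StartupCheckSketch {
    private int configGeneration = 1; // stand-in for "latest task configs in the config topic"

    void publishNewTaskConfigs() {
        configGeneration++;
    }

    int readToEndOfConfigTopic() {
        return configGeneration;
    }

    /**
     * Returns true if the task may start processing data, false if it must be abandoned.
     * generationAtDecision is the config generation seen when the worker decided to (re)start the task.
     */
    boolean tryStart(int generationAtDecision, Runnable initTransactions) {
        initTransactions.run();                         // create producer and initialize transactions
        int latest = readToEndOfConfigTopic();          // read to end of config topic
        return latest == generationAtDecision;          // newer task configs found: abort startup
    }

    public static void main(String[] args) {
        StartupCheckSketch cluster = new StartupCheckSketch();
        int generation = cluster.readToEndOfConfigTopic();

        boolean promptStart = cluster.tryStart(generation, () -> { });
        // Simulate a laggy task: new task configs are published while it is still starting up.
        boolean laggyStart = cluster.tryStart(generation, cluster::publishNewTaskConfigs);

        System.out.println("prompt task may run: " + promptStart);
        System.out.println("laggy task may run:  " + laggyStart);
    }
}
```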
  • 45. Caveats ● Fencing during rebalancing is not a good idea ○ Makes rebalances more brittle ○ Requires a new rebalance any time we want to restart a task that failed due to failed zombie fencing ● Instead, we fence outside of rebalances ○ During task startup, workers issue a REST request to the leader to perform zombie fencing for the connector ○ The leader will perform that round (if necessary), then send back a 2XX response ○ If a non-2XX response is received, the task is marked failed ○ Tasks can easily be restarted
  • 46. Caveats ● Throwaway producers for initializing transactions is wasteful ● We added a new admin client API in 3.3 to do this instead ● We have to be careful about how the leader uses the transactional producer ● Similar “claim-then-check” logic to source tasks
  • 47. In practice Implementation details are boring, how do we actually use this feature?
  • 48. In practice (cluster administrators) ● New clusters: ○ Use version 3.3.0 or later ○ Configure every worker with exactly.once.source.support = enabled ● Existing clusters: ○ Rolling upgrade 1: bring all workers to 3.3.0 or later; configure every worker with exactly.once.source.support = preparing ○ Rolling upgrade 2: configure every worker with exactly.once.source.support = enabled
  • 49. In practice (downstream readers) ● Have to filter out records from aborted transactions ● If using the Java consumer, configure with isolation.level = read_committed ● For sink connectors, do at least one of the following: ○ Configure worker with consumer.isolation.level = read_committed ○ Configure connector with consumer.override.isolation.level = read_committed (3.0.0 or later, with default worker configuration)
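For the Java consumer case, the setting is an ordinary consumer property. The sketch below only builds a `java.util.Properties` object using the standard `bootstrap.servers`, `group.id`, and `isolation.level` keys; the helper name, broker address, and group ID are hypothetical, and a real consumer would also need deserializers configured before these properties are handed to `KafkaConsumer`.

```java
import java.util.Properties;

/** Illustrative helper (hypothetical name) for a consumer that skips aborted transactions. */
public class ReadCommittedConfig {
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Only return records from committed transactions (plus non-transactional records).
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "downstream-reader");
        System.out.println("isolation.level = " + props.getProperty("isolation.level"));
    }
}
```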
  • 50. In practice (writing connectors) Have to define source offsets correctly public abstract class SourceTask { public abstract List<SourceRecord> poll(); } public class SourceRecord { public SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset, ...) }
  • 51. In practice (writing connectors) Have to use source offsets correctly public abstract class SourceTask { protected SourceTaskContext context; public abstract void start(Map<String, String> props); } public interface SourceTaskContext { OffsetStorageReader offsetStorageReader(); } public interface OffsetStorageReader { <T> Map<Map<String, T>, Map<String, Object>> offsets(Collection<Map<String, T>> partitions); }
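What "defining" and "using" source offsets correctly means can be sketched without the Connect API. The example below is a stand-in, not a real `SourceTask`: it treats a list of lines as the external system, uses an illustrative `filename` key for the source partition and `position` key for the source offset, and resumes from whatever offset was last stored. All class, method, and key names are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy line-oriented source; it mirrors the partition/offset idea but does not use Kafka Connect types. */
public class LineSourceSketch {
    /** Stand-in for a source record: a value plus the partition and offset it came from. */
    static final class LineRecord {
        final Map<String, String> sourcePartition;
        final Map<String, Long> sourceOffset;
        final String value;

        LineRecord(Map<String, String> sourcePartition, Map<String, Long> sourceOffset, String value) {
            this.sourcePartition = sourcePartition;
            this.sourceOffset = sourceOffset;
            this.value = value;
        }
    }

    /**
     * Returns the records that have not been read yet.
     * storedOffset is the last committed offset for this file, or null on first startup.
     */
    static List<LineRecord> poll(String filename, List<String> lines, Map<String, Long> storedOffset) {
        long next = storedOffset == null ? 0L : storedOffset.getOrDefault("position", 0L);
        List<LineRecord> records = new ArrayList<>();
        for (long i = next; i < lines.size(); i++) {
            // The offset attached to each record is the position to resume from after that record.
            records.add(new LineRecord(Map.of("filename", filename), Map.of("position", i + 1), lines.get((int) i)));
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a", "b", "c");
        List<LineRecord> firstRun = poll("test.txt", lines, null);
        // Pretend only the first two records were committed before a restart.
        Map<String, Long> committed = firstRun.get(1).sourceOffset;
        List<LineRecord> afterRestart = poll("test.txt", lines, committed);
        System.out.println("first run read " + firstRun.size() + " records");
        System.out.println("after restart read " + afterRestart.size() + " record(s), starting at: "
                + afterRestart.get(0).value);
    }
}
```

The property that matters for exactly-once is that the offset stored with a record is enough, on its own, to resume immediately after that record.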
  • 52. In summary ● Exactly-once is hard to implement ○ Especially handling zombie workers/zombie tasks across task reconfigurations ● Exactly-once is (hopefully) easy to use ○ If it’s not, harass ping me on Jira! ○ https://issues.apache.org/jira/projects/KAFKA/issues ● For all the details, check out KIP-618 ○ https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors
  • 53. Thank you! Chris Egerton, Open Source Program Office @Aiven, @C0urante