Tzu-Li (Gordon) Tai
Flink Committer / PMC Member
tzulitai@apache.org
@tzulitai
Managing State in Apache Flink
Original creators of Apache Flink®
Providers of dA Platform 2, including open source Apache Flink + dA Application Manager
Users are placing more and more of their most valuable assets within Flink: their application data
Complete social network implemented using event sourcing and CQRS (Command Query Responsibility Segregation)
Classic tiered architecture: compute layer + database layer
Streaming architecture: compute + application state, with stream storage and snapshot storage (backup)
Streams and Snapshots
Asynchronous, pipelined state snapshots
● Exactly-once guarantees
● Version control
● Job reconfiguration
… want to ensure that users can fully entrust Flink with their state, and that they enjoy a good quality of life with state management
State management FAQs ...
● State declaration best practices?
● How is my state serialized and persisted?
● Can I adapt how my state is serialized?
● Can I adapt the schema / data model of my state?
Recap: Flink’s Managed State
Flink Managed State
ValueStateDescriptor<MyPojo> desc =
    new ValueStateDescriptor<>(
        "my-value-state",
        MyPojo.class
    );
ValueState<MyPojo> state =
    getRuntimeContext().getState(desc);
MyPojo p = state.value();
...
state.update(...)
...
Notes on the declaration above:
● "my-value-state": unique identifier for this managed state
● MyPojo.class: type information for the state
● getRuntimeContext().getState(desc) registers the state with the local state backend
● state.value() and state.update(...) read from and write to the local state backend
Flink Managed State (II)
Checkpoints
- Pipelined checkpoint barriers flow through the topology
- Operators asynchronously back up their state in checkpoints on a DFS
Restore
- Each operator is assigned its corresponding file handles on the DFS, from which it restores its checkpointed state
State Serialization Behaviours
JVM heap-backed state backends (MemoryStateBackend, FsStateBackend)
⇒ lazy serialization + eager deserialization
- Local reads / writes operate directly on Java objects
- Checkpoint: state is serialized on checkpoints / savepoints
- Restore: checkpointed state is deserialized to objects on restore
Out-of-core state backends (RocksDBStateBackend)
⇒ eager serialization + lazy deserialization
- De-/serialize on every local read / write
- Checkpoint: exchange of state bytes
- Restore: exchange of state bytes
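The two serialization behaviours above can be illustrated with a toy model that uses only the JDK. This is a sketch under stated assumptions, not Flink's implementation: CountingSerde, HeapLikeStore and BytesStore are hypothetical names invented for illustration, and plain strings stand in for state objects. It only shows when the serializer is invoked in each style.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy model only: every class here is hypothetical and none of it is Flink code.
class SerializationBehaviourSketch {

    /** Pretend serializer that counts how often it is invoked. */
    static final class CountingSerde {
        int serializeCalls = 0;
        int deserializeCalls = 0;

        byte[] serialize(String value) {
            serializeCalls++;
            return value.getBytes(StandardCharsets.UTF_8);
        }

        String deserialize(byte[] bytes) {
            deserializeCalls++;
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    /** Heap-style store: holds objects, serializes lazily (only for snapshots). */
    static final class HeapLikeStore {
        final CountingSerde serde = new CountingSerde();
        private final Map<String, String> objects = new HashMap<>();

        void put(String key, String value) { objects.put(key, value); }

        String get(String key) { return objects.get(key); }

        Map<String, byte[]> snapshot() {
            Map<String, byte[]> out = new HashMap<>();
            objects.forEach((k, v) -> out.put(k, serde.serialize(v)));
            return out;
        }

        // Eager deserialization: everything becomes an object again on restore.
        void restore(Map<String, byte[]> snapshot) {
            objects.clear();
            snapshot.forEach((k, v) -> objects.put(k, serde.deserialize(v)));
        }
    }

    /** Out-of-core-style store: holds bytes, de-/serializes on every access. */
    static final class BytesStore {
        final CountingSerde serde = new CountingSerde();
        private final Map<String, byte[]> bytes = new HashMap<>();

        void put(String key, String value) { bytes.put(key, serde.serialize(value)); }

        String get(String key) {
            byte[] stored = bytes.get(key);
            return stored == null ? null : serde.deserialize(stored);
        }

        // Snapshot and restore only exchange bytes; the serializer is not involved.
        Map<String, byte[]> snapshot() { return new HashMap<>(bytes); }

        void restore(Map<String, byte[]> snapshot) {
            bytes.clear();
            bytes.putAll(snapshot);
        }
    }

    public static void main(String[] args) {
        HeapLikeStore heap = new HeapLikeStore();
        BytesStore outOfCore = new BytesStore();
        for (int i = 0; i < 3; i++) {
            heap.put("k", "v" + i);
            outOfCore.put("k", "v" + i);
        }
        heap.snapshot();
        outOfCore.snapshot();
        System.out.println("heap-like serialize calls:   " + heap.serde.serializeCalls);
        System.out.println("out-of-core serialize calls: " + outOfCore.serde.serializeCalls);
    }
}
```

In this model the heap-like store pays the serialization cost once per snapshot, while the bytes store pays it on every write and read but snapshots without touching the serializer.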
State Declaration
Matters of State (I)
State Access Optimizations
● Flink supports different state “structures” for efficient state access under various access patterns
X Don’t do this: keeping a whole map inside a ValueState
ValueStateDescriptor<Map<String, MyPojo>> desc =
    new ValueStateDescriptor<>(
        "my-value-state",
        TypeInformation.of(new TypeHint<Map<String, MyPojo>>() {}));
ValueState<Map<String, MyPojo>> state =
    getRuntimeContext().getState(desc);
Map<String, MyPojo> map = state.value();
map.put("someKey", new MyPojo(...));
state.update(map);
Do this instead: MapState is optimized for random key access
MapStateDescriptor<String, MyPojo> desc =
    new MapStateDescriptor<>("my-value-state", String.class, MyPojo.class);
MapState<String, MyPojo> state =
    getRuntimeContext().getMapState(desc);
MyPojo pojoVal = state.get("someKey");
state.put("someKey", new MyPojo(...));
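Why the choice of state structure matters can be sketched with a toy cost model for a backend that serializes eagerly on every write. Everything below is an assumption made for illustration: SingleValueStore and PerEntryStore are hypothetical classes, strings stand in for MyPojo, and "bytes written" is simply the UTF-8 length of keys and values. It is not a measurement of Flink or RocksDB.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy cost model only: hypothetical classes, not Flink's state backends.
class MapStateCostSketch {

    static int sizeOf(String key, String value) {
        return key.getBytes(StandardCharsets.UTF_8).length
                + value.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Whole map kept as one value: each update re-serializes every entry. */
    static final class SingleValueStore {
        long bytesWritten = 0;
        private Map<String, String> stored = new HashMap<>();

        // Returns a copy, standing in for deserializing the whole map.
        Map<String, String> value() { return new HashMap<>(stored); }

        void update(Map<String, String> map) {
            for (Map.Entry<String, String> e : map.entrySet()) {
                bytesWritten += sizeOf(e.getKey(), e.getValue());
            }
            stored = new HashMap<>(map);
        }
    }

    /** Map kept entry by entry: an update serializes only the touched entry. */
    static final class PerEntryStore {
        long bytesWritten = 0;
        private final Map<String, String> stored = new HashMap<>();

        String get(String key) { return stored.get(key); }

        void put(String key, String value) {
            bytesWritten += sizeOf(key, value);
            stored.put(key, value);
        }
    }

    public static void main(String[] args) {
        SingleValueStore single = new SingleValueStore();
        PerEntryStore perEntry = new PerEntryStore();

        // Preload both stores with the same 1000 entries.
        Map<String, String> initial = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            initial.put("key-" + i, "value");
            perEntry.put("key-" + i, "value");
        }
        single.update(initial);
        single.bytesWritten = 0;
        perEntry.bytesWritten = 0;

        // One logical change: add a single entry.
        Map<String, String> map = single.value();
        map.put("someKey", "value");
        single.update(map);
        perEntry.put("someKey", "value");

        System.out.println("single-value bytes written: " + single.bytesWritten);
        System.out.println("per-entry bytes written:    " + perEntry.bytesWritten);
    }
}
```

In this model the cost of one update grows with the size of the map for the single-value layout, and stays proportional to the touched entry for the per-entry layout.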
Declaration Timeliness
● Register state as early as possible
• Typically, all state can be declared in the “open()” lifecycle method of operators
Late registration: the state handle is looked up inside map(), for every record
public class MyStatefulMapFunction extends RichMapFunction<String, String> {
    private static final ValueStateDescriptor<MyPojo> DESC =
        new ValueStateDescriptor<>("my-pojo-state", MyPojo.class);
    @Override
    public String map(String input) {
        MyPojo pojo = getRuntimeContext().getState(DESC).value();
        …
    }
}
Early registration: the state handle is obtained once in open() and reused in map()
public class MyStatefulMapFunction extends RichMapFunction<String, String> {
    private static final ValueStateDescriptor<MyPojo> DESC =
        new ValueStateDescriptor<>("my-pojo-state", MyPojo.class);
    private ValueState<MyPojo> state;
    @Override
    public void open(Configuration config) {
        state = getRuntimeContext().getState(DESC);
    }
    @Override
    public String map(String input) {
        MyPojo pojo = state.value();
        …
    }
}
Eager State Declaration (under discussion, release TBD)
● Under discussion in FLIP-22
● Allows the JobManager to know the declared states of a job
Proposed annotation-based declaration (under discussion, release TBD):
public class MyStatefulMapFunction extends RichMapFunction<String, String> {
    @KeyedState(
        stateId = "my-pojo-state",
        queryableStateName = "state-query-handle"
    )
    private MapState<String, MyPojo> state;
    @Override
    public String map(String input) {
        MyPojo pojo = state.get("someKey");
        state.put("someKey", new MyPojo(...));
    }
}
State Serialization
Matters of State (II)
How is my state serialized?
ValueStateDescriptor<MyPojo> desc =
    new ValueStateDescriptor<>("my-value-state", MyPojo.class);
● The provided state type class is analyzed by Flink’s own serialization stack, which produces a tailored, efficient serializer
● Supported types:
• Primitive types
• Tuples, Scala case classes
• POJOs (plain old Java objects)
● All unsupported types fall back to Kryo for serialization
Avoid Kryo for State Serde
● Kryo is generally not recommended for persisted data
• Unstable binary formats
• Unfriendly to data model changes of the state
● Serialization frameworks with schema evolution support are recommended: Avro, Thrift, etc.
● Register custom Kryo serializers that use these frameworks for your Flink job
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// register the serializer included with
// Apache Thrift as the standard serializer for your type
env.registerTypeWithKryoSerializer(MyCustomType.class, TBaseSerializer.class);
● Also see: https://goo.gl/oU9NxJ
Custom State Serialization
● Instead of supplying type information to be analyzed, directly provide a TypeSerializer
public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> {
    ...
}
ValueStateDescriptor<MyCustomType> desc =
    new ValueStateDescriptor<>("my-value-state", new MyCustomTypeSerializer());
State Migration
Matters of State (III)
State migration / evolution
● Upgrading to more efficient serialization schemes for performance improvements
● Changing the schema / data model of state types, due to evolving business logic
Upgrading State Serializers
● An upgraded serializer is either compatible with the previous one or not. If it is incompatible, state migration is required.
● Disclaimer: as of Flink 1.3, upgraded serializers must be compatible, as state migration is not yet an available feature.
Case #1: Modified state types, resulting in different Flink-generated serializers
ValueStateDescriptor<MyPojo> desc =
    new ValueStateDescriptor<>("my-value-state", MyPojo.class); // modified MyPojo type
Case #2: New custom serializer
ValueStateDescriptor<MyCustomType> desc =
    new ValueStateDescriptor<>("my-value-state", new UpgradedSerializer<MyCustomType>());
Upgrading State Serializers (II)
● On restore, new serializers are checked for compatibility against the metadata of the previous serializer (stored together with state data in savepoints).
● All Flink-generated serializers define the metadata to be persisted. Newly generated serializers at restore time are reconfigured to be compatible.
Savepoint contents: state data + serializer config metadata
Old serializer at time of savepoint: Field “A”, Field “B”
New serializer at restore time: Field “B”, Field “A”
New serializer at restore time, after reconfiguration: Field “A”, Field “B”
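The reconfiguration step above, where a new serializer with field order B, A is brought back to the old order A, B, can be sketched with a toy model. This is only an illustration under stated assumptions: FieldOrderSerializer is a hypothetical class, the "metadata" is just the list of field names in write order, and it does not reproduce how Flink's generated serializers work internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not Flink's generated serializers.
class FieldOrderReconfigurationSketch {

    /** Writes and reads field values positionally, in a fixed field order. */
    static final class FieldOrderSerializer {
        private List<String> fieldOrder;

        FieldOrderSerializer(String... fieldOrder) {
            this.fieldOrder = new ArrayList<>(Arrays.asList(fieldOrder));
        }

        /** Metadata that would be stored next to the state data in a savepoint. */
        List<String> snapshotConfiguration() {
            return new ArrayList<>(fieldOrder);
        }

        /** Adopts the old field order if both serializers cover the same fields. */
        boolean reconfigureFrom(List<String> oldFieldOrder) {
            boolean sameFields = oldFieldOrder.size() == fieldOrder.size()
                    && new HashSet<>(oldFieldOrder).equals(new HashSet<>(fieldOrder));
            if (!sameFields) {
                return false; // incompatible in this toy model
            }
            fieldOrder = new ArrayList<>(oldFieldOrder);
            return true;
        }

        List<String> serialize(Map<String, String> record) {
            List<String> out = new ArrayList<>();
            for (String field : fieldOrder) {
                out.add(record.get(field));
            }
            return out;
        }

        Map<String, String> deserialize(List<String> data) {
            Map<String, String> record = new LinkedHashMap<>();
            for (int i = 0; i < fieldOrder.size(); i++) {
                record.put(fieldOrder.get(i), data.get(i));
            }
            return record;
        }
    }

    public static void main(String[] args) {
        // Savepoint time: the old serializer writes fields in order A, B.
        FieldOrderSerializer oldSerializer = new FieldOrderSerializer("A", "B");
        Map<String, String> record = new LinkedHashMap<>();
        record.put("A", "a-value");
        record.put("B", "b-value");
        List<String> stateData = oldSerializer.serialize(record);
        List<String> metadata = oldSerializer.snapshotConfiguration();

        // Restore time: the new serializer would read fields in order B, A.
        FieldOrderSerializer newSerializer = new FieldOrderSerializer("B", "A");
        System.out.println("without reconfigure: " + newSerializer.deserialize(stateData));

        newSerializer.reconfigureFrom(metadata);
        System.out.println("after reconfigure:   " + newSerializer.deserialize(stateData));
    }
}
```

Without the reconfiguration step, the positional read in this model silently assigns the value of field A to field B, which is the kind of mismatch the persisted metadata is there to prevent.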
Upgrading State Serializers (III)
● Custom serializers must define the metadata to write: TypeSerializerConfigSnapshot
● They must also define the logic for compatibility checks
public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> {
    ...
    // metadata persisted together with the state data
    public TypeSerializerConfigSnapshot snapshotConfiguration() { ... }
    // compatibility check against the metadata of the previous serializer
    public CompatibilityResult<MyCustomType> ensureCompatibility(
            TypeSerializerConfigSnapshot configSnapshot) { ... }
}
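One way such a compatibility check could look is sketched below as a JDK-only toy model. All names are assumptions chosen to mirror the two methods above: ConfigSnapshot, VersionedSerializer, the Compatibility enum and the version-range rule are hypothetical, and Flink's actual TypeSerializerConfigSnapshot and CompatibilityResult classes are not used here.

```java
// Toy model only: hypothetical classes that mirror the idea of the two methods
// above; this is not Flink's TypeSerializer API.
class CompatibilityCheckSketch {

    enum Compatibility { COMPATIBLE, REQUIRES_MIGRATION }

    /** Metadata written next to the state data: here just a format version. */
    static final class ConfigSnapshot {
        final int formatVersion;

        ConfigSnapshot(int formatVersion) {
            this.formatVersion = formatVersion;
        }
    }

    /** A serializer that can still read a range of older format versions. */
    static final class VersionedSerializer {
        private final int formatVersion;
        private final int oldestReadableVersion;

        VersionedSerializer(int formatVersion, int oldestReadableVersion) {
            this.formatVersion = formatVersion;
            this.oldestReadableVersion = oldestReadableVersion;
        }

        /** Called when a savepoint is taken. */
        ConfigSnapshot snapshotConfiguration() {
            return new ConfigSnapshot(formatVersion);
        }

        /** Called on restore with the metadata of the previous serializer. */
        Compatibility ensureCompatibility(ConfigSnapshot previous) {
            boolean readable = previous.formatVersion >= oldestReadableVersion
                    && previous.formatVersion <= formatVersion;
            return readable ? Compatibility.COMPATIBLE : Compatibility.REQUIRES_MIGRATION;
        }
    }

    public static void main(String[] args) {
        // Savepoint written by a version-1 serializer.
        ConfigSnapshot savepointMetadata = new VersionedSerializer(1, 1).snapshotConfiguration();

        // An upgrade that still understands version 1.
        VersionedSerializer upgraded = new VersionedSerializer(2, 1);
        System.out.println("v2 (reads v1+): " + upgraded.ensureCompatibility(savepointMetadata));

        // An upgrade that dropped support for version 1.
        VersionedSerializer breaking = new VersionedSerializer(3, 2);
        System.out.println("v3 (reads v2+): " + breaking.ensureCompatibility(savepointMetadata));
    }
}
```

Per the disclaimer earlier in the deck, as of Flink 1.3 only the compatible outcome can be restored, since state migration is not yet available.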
Closing
TL;DR
● The Flink community takes state management in Flink very seriously
● We try to make sure that users will feel comfortable in placing their application state within Flink and avoid any kind of lock-in.
● Upcoming changes / features to expect related to state management: Eager State Declaration & State Migration
Thank you!
@tzulitai
@ApacheFlink
@dataArtisans
We are hiring!
data-artisans.com/careers

Flink Forward Berlin 2017: Tzu-Li (Gordon) Tai - Managing State in Apache Flink

Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...Flink Forward
 
State schema evolution for Apache Flink Applications
State schema evolution for Apache Flink ApplicationsState schema evolution for Apache Flink Applications
State schema evolution for Apache Flink ApplicationsTzu-Li (Gordon) Tai
 
Realtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQRealtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQXin Wang
 
Migrating To PostgreSQL
Migrating To PostgreSQLMigrating To PostgreSQL
Migrating To PostgreSQLGrant Fritchey
 
Extending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session ExtensionsExtending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session ExtensionsDatabricks
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...confluent
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevAMD Developer Central
 
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...confluent
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondVasia Kalavri
 
Cdcr apachecon-talk
Cdcr apachecon-talkCdcr apachecon-talk
Cdcr apachecon-talkAmrit Sarkar
 
Akka Microservices Architecture And Design
Akka Microservices Architecture And DesignAkka Microservices Architecture And Design
Akka Microservices Architecture And DesignYaroslav Tkachenko
 
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...Paris Carbone
 
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...Codemotion
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Storesconfluent
 
Building Distributed Systems in Scala
Building Distributed Systems in ScalaBuilding Distributed Systems in Scala
Building Distributed Systems in ScalaAlex Payne
 
Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?Andrzej Ludwikowski
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaData Con LA
 

Similar a Flink Forward Berlin 2017: Tzu-Li (Gordon) Tai - Managing State in Apache Flink (20)

Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
 
State schema evolution for Apache Flink Applications
State schema evolution for Apache Flink ApplicationsState schema evolution for Apache Flink Applications
State schema evolution for Apache Flink Applications
 
Realtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQRealtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQ
 
Migrating To PostgreSQL
Migrating To PostgreSQLMigrating To PostgreSQL
Migrating To PostgreSQL
 
Extending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session ExtensionsExtending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session Extensions
 
Spark streaming
Spark streamingSpark streaming
Spark streaming
 
Streamsets and spark
Streamsets and sparkStreamsets and spark
Streamsets and spark
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan Nevraev
 
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyond
 
Cdcr apachecon-talk
Cdcr apachecon-talkCdcr apachecon-talk
Cdcr apachecon-talk
 
Akka Microservices Architecture And Design
Akka Microservices Architecture And DesignAkka Microservices Architecture And Design
Akka Microservices Architecture And Design
 
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
 
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
 
Building Distributed Systems in Scala
Building Distributed Systems in ScalaBuilding Distributed Systems in Scala
Building Distributed Systems in Scala
 
Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
 

Más de Flink Forward

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...Flink Forward
 
Flink Forward Berlin 2017: Tzu-Li (Gordon) Tai - Managing State in Apache Flink

  • 1. Managing State in Apache Flink. Tzu-Li (Gordon) Tai, Flink Committer / PMC Member, tzulitai@apache.org, @tzulitai
  • 2. Original creators of Apache Flink® Providers of dA Platform 2, including open source Apache Flink + dA Application Manager
  • 3. Users are placing more and more of their most valuable assets within Flink: their application data
  • 4. @ Complete social network implemented using event sourcing and CQRS (Command Query Responsibility Segregation)
  • 5. Classic tiered architecture: database layer, compute layer. Streaming architecture: compute + application state; stream storage and snapshot storage (backup).
  • 6. Streams and Snapshots: version control, exactly-once guarantees, job reconfiguration; asynchronous, pipelined state snapshots.
  • 7. … want to ensure that users can fully entrust Flink with their state, as well as good quality of life with state management
  • 8. State management FAQs ... ● State declaration best practices? ● How is my state serialized and persisted? ● Can I adapt how my state is serialized? ● Can I adapt the schema / data model of my state?
  • 10. Flink Managed State
        ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>(
            "my-value-state",
            MyPojo.class
        );
        ValueState<MyPojo> state = getRuntimeContext().getState(desc);
        MyPojo p = state.value();
        ...
        state.update(...)
        ...
  • 11. Flink Managed State: "my-value-state" is the unique identifier for this managed state.
  • 12. Flink Managed State: MyPojo.class is the type information for the state.
  • 13. Flink Managed State: getState(desc) registers the state with the local state backend.
  • 14. Flink Managed State: state.value() reads from and state.update(...) writes to the local state backend.
  • 15. Flink Managed State (II): Checkpoints. Pipelined checkpoint barriers flow through the topology. Operators asynchronously back up their state in checkpoints on a DFS.
  • 16. Flink Managed State (II): Restore. Each operator is assigned its corresponding file handles, from which it restores its checkpointed state.
  • 17. State Serialization Behaviours: JVM heap-backed state backends (MemoryStateBackend, FsStateBackend) ⇒ lazy serialization + eager deserialization
  • 18. State Serialization Behaviours (JVM heap-backed): local reads / writes operate on Java objects.
  • 19. State Serialization Behaviours (JVM heap-backed): Checkpoint. State is serialized on checkpoints / savepoints.
  • 20. State Serialization Behaviours (JVM heap-backed): Restore. State is deserialized to objects on restore.
  • 21. State Serialization Behaviours: out-of-core state backends (RocksDBStateBackend) ⇒ eager serialization + lazy deserialization
  • 22. State Serialization Behaviours (out-of-core): de-/serialize on every local read / write.
  • 23. State Serialization Behaviours (out-of-core): Checkpoint. Exchange of state bytes.
  • 24. State Serialization Behaviours (out-of-core): Restore. Exchange of state bytes.
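The two serialization behaviours on slides 17 to 24 can be illustrated with a toy model in plain Java. This is a sketch and not Flink code: `SerializationToy`, `HeapToyBackend`, `OutOfCoreToyBackend` and the `serdeCalls` counter are hypothetical names invented here, and a UTF-8 string encoding stands in for a real state serializer.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes exist in Flink.
class SerializationToy {
    // Number of serialize/deserialize calls since the last reset.
    static int serdeCalls = 0;

    static byte[] serialize(String value) {
        serdeCalls++;
        return value.getBytes(StandardCharsets.UTF_8);
    }

    static String deserialize(byte[] bytes) {
        serdeCalls++;
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Heap-style backend: holds objects, serializes only when a snapshot is taken.
    static class HeapToyBackend {
        private final Map<String, String> objects = new HashMap<>();

        void put(String key, String value) { objects.put(key, value); }

        String get(String key) { return objects.get(key); }

        Map<String, byte[]> snapshot() {
            Map<String, byte[]> out = new HashMap<>();
            for (Map.Entry<String, String> e : objects.entrySet()) {
                out.put(e.getKey(), serialize(e.getValue()));
            }
            return out;
        }
    }

    // Out-of-core-style backend: holds bytes, de-/serializes on every access.
    static class OutOfCoreToyBackend {
        private final Map<String, byte[]> bytes = new HashMap<>();

        void put(String key, String value) { bytes.put(key, serialize(value)); }

        String get(String key) {
            byte[] stored = bytes.get(key);
            return stored == null ? null : deserialize(stored);
        }

        // A snapshot only copies the already-serialized bytes.
        Map<String, byte[]> snapshot() { return new HashMap<>(bytes); }
    }

    // Same workload on both backends: 2 writes, 3 reads, 1 snapshot.
    static int heapWorkload() {
        serdeCalls = 0;
        HeapToyBackend backend = new HeapToyBackend();
        backend.put("a", "1");
        backend.put("b", "2");
        for (int i = 0; i < 3; i++) backend.get("a");
        backend.snapshot();
        return serdeCalls;
    }

    static int outOfCoreWorkload() {
        serdeCalls = 0;
        OutOfCoreToyBackend backend = new OutOfCoreToyBackend();
        backend.put("a", "1");
        backend.put("b", "2");
        for (int i = 0; i < 3; i++) backend.get("a");
        backend.snapshot();
        return serdeCalls;
    }
}
```

In this toy, `heapWorkload()` performs 2 serde calls, both at snapshot time, while `outOfCoreWorkload()` performs 5, one per local read or write and none at snapshot time.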
  • 26. State Access Optimizations ● Flink supports different state “structures” for efficient state access for various patterns. Don't do this:
        ValueStateDescriptor<Map<String, MyPojo>> desc = new ValueStateDescriptor<>(
            "my-value-state", TypeInformation.of(new TypeHint<Map<String, MyPojo>>() {}));
        ValueState<Map<String, MyPojo>> state = getRuntimeContext().getState(desc);

        Map<String, MyPojo> map = state.value();
        map.put("someKey", new MyPojo(...));
        state.update(map);
  • 27. State Access Optimizations ● Flink supports different state “structures” for efficient state access for various patterns. Optimized for random key access:
        MapStateDescriptor<String, MyPojo> desc =
            new MapStateDescriptor<>("my-value-state", String.class, MyPojo.class);
        MapState<String, MyPojo> state = getRuntimeContext().getMapState(desc);

        MyPojo pojoVal = state.get("someKey");
        state.put("someKey", new MyPojo(...));
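A rough way to see why a map held in a single value state is costly on an eagerly serializing backend is to count the bytes written per update. The sketch below is a toy cost model in plain Java, not Flink code: `MapStateToy` and its methods are hypothetical, and the `key=value;` text encoding is only a stand-in for a real map serializer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Toy cost model only: not Flink code.
class MapStateToy {
    static byte[] utf8(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    // Encodes a whole map as "key=value;" pairs, standing in for a map serializer.
    static byte[] encodeWholeMap(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return utf8(sb.toString());
    }

    // Whole map stored as one value: each update re-serializes every entry.
    static long bytesWrittenWholeMap(int entries, int updates) {
        Map<String, String> map = new TreeMap<>();
        for (int i = 0; i < entries; i++) map.put("key" + i, "v");
        long written = 0;
        for (int u = 0; u < updates; u++) {
            map.put("key0", "w");
            written += encodeWholeMap(map).length;
        }
        return written;
    }

    // One stored entry per map key: each update writes only that key and value.
    static long bytesWrittenPerEntry(int entries, int updates) {
        Map<String, byte[]> store = new TreeMap<>();
        for (int i = 0; i < entries; i++) store.put("key" + i, utf8("v"));
        long written = 0;
        for (int u = 0; u < updates; u++) {
            byte[] value = utf8("w");
            store.put("key0", value);
            written += utf8("key0").length + value.length;
        }
        return written;
    }
}
```

With 3 entries and 1 update, the whole-map variant writes 21 bytes and the per-entry variant writes 5; the gap grows linearly with the number of entries.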
  • 28. Declaration Timeliness ● Try registering state as soon as possible • Typically, all state declaration can be done in the “open()” lifecycle of operators
  • 29. Declaration Timeliness: state is looked up on every map() call
        public class MyStatefulMapFunction extends RichMapFunction<String, String> {
            private static ValueStateDescriptor<MyPojo> DESC =
                new ValueStateDescriptor<>("my-pojo-state", MyPojo.class);

            @Override
            public String map(String input) {
                MyPojo pojo = getRuntimeContext().getState(DESC).value();
                …
            }
        }
  • 30. Declaration Timeliness: state is registered once in open()
        public class MyStatefulMapFunction extends RichMapFunction<String, String> {
            private static ValueStateDescriptor<MyPojo> DESC =
                new ValueStateDescriptor<>("my-pojo-state", MyPojo.class);

            private ValueState<MyPojo> state;

            @Override
            public void open(Configuration config) {
                state = getRuntimeContext().getState(DESC);
            }

            @Override
            public String map(String input) {
                MyPojo pojo = state.value();
                …
            }
        }
  • 31. Eager State Declaration (under discussion, release TBD) ● Under discussion with FLIP-22 ● Allows the JobManager to have knowledge on declared states of a job
  • 32. Eager State Declaration (under discussion, release TBD)
        public class MyStatefulMapFunction extends RichMapFunction<String, String> {
            @KeyedState(
                stateId = "my-pojo-state",
                queryableStateName = "state-query-handle"
            )
            private MapState<String, MyPojo> state;

            @Override
            public String map(String input) {
                MyPojo pojo = state.get("someKey");
                state.put("someKey", new MyPojo(...));
            }
        }
  • 34. How is my state serialized?
        ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>("my-value-state", MyPojo.class);
    ● The provided state type class info is analyzed by Flink’s own serialization stack, producing a tailored, efficient serializer ● Supported types: • Primitive types • Tuples, Scala case classes • POJOs (plain old Java objects) ● All non-supported types fall back to Kryo for serialization
  • 35. Avoid Kryo for State Serde ● Kryo is generally not recommended for use on persisted data • Unstable binary formats • Unfriendly to data model changes in the state ● Serialization frameworks with schema evolution support are recommended: Avro, Thrift, etc. ● Register custom Kryo serializers that use these frameworks for your Flink job
  • 36. Avoid Kryo for State Serde
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // register the serializer included with Apache Thrift
        // as the standard serializer for your type
        env.registerTypeWithKryoSerializer(MyCustomType.class, TBaseSerializer.class);
    ● Also see: https://goo.gl/oU9NxJ
  • 37. Custom State Serialization ● Instead of supplying type information to be analyzed, directly provide a TypeSerializer
        public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> {
            ...
        }

        ValueStateDescriptor<MyCustomType> desc =
            new ValueStateDescriptor<>("my-value-state", new MyCustomTypeSerializer());
  • 39. State migration / evolution ● Upgrading to more efficient serialization schemes for performance improvements ● Changing the schema / data model of state types, due to evolving business logic
  • 40. Upgrading State Serializers ● The upgraded serializer is either compatible or incompatible with the previous one. If incompatible, state migration is required. ● Disclaimer: as of Flink 1.3, upgraded serializers must be compatible, as state migration is not yet an available feature.
      Case #1: Modified state types, resulting in different Flink-generated serializers
        ValueStateDescriptor<MyPojo> desc =
            new ValueStateDescriptor<>("my-value-state", MyPojo.class); // modified MyPojo type
      Case #2: New custom serializer
        ValueStateDescriptor<MyCustomType> desc =
            new ValueStateDescriptor<>("my-value-state", new UpgradedSerializer<MyCustomType>());
  • 41. Upgrading State Serializers (II) ● On restore, new serializers are checked against the metadata of the previous serializer (stored together with state data in savepoints) for compatibility. ● All Flink-generated serializers define the metadata to be persisted. Newly generated serializers at restore time are reconfigured to be compatible. Diagram: the savepoint holds state data plus serializer config metadata. Old serializer at time of savepoint: Field “A”, Field “B”. New serializer at restore time: Field “B”, Field “A”.
  • 42. Upgrading State Serializers (II): the new serializer is reconfigured using the serializer config metadata from the savepoint.
  • 43. Upgrading State Serializers (II): after reconfiguration, the new serializer at restore time uses Field “A”, Field “B”, matching the old serializer.
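The field-order reconfiguration shown in the diagram can be mimicked with a small plain-Java toy. It is a sketch of the idea only: `SerializerReconfigToy`, `reconfigure` and `read` are hypothetical names, the "metadata" is reduced to a list of field names, and Flink's actual compatibility checks are assumed to be considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: not Flink's serializer compatibility logic.
class SerializerReconfigToy {
    // Returns the field order the new serializer should adopt, or null when
    // the field sets differ (the case that would need state migration).
    static List<String> reconfigure(List<String> persistedOrder, List<String> newOrder) {
        boolean sameFields = persistedOrder.size() == newOrder.size()
                && new HashSet<>(persistedOrder).equals(new HashSet<>(newOrder));
        return sameFields ? new ArrayList<>(persistedOrder) : null;
    }

    // Reads positional values back into a field-name-to-value record.
    static Map<String, String> read(List<String> fieldOrder, List<String> storedValues) {
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < fieldOrder.size(); i++) {
            record.put(fieldOrder.get(i), storedValues.get(i));
        }
        return record;
    }

    // Savepoint written with order [A, B]; new serializer would default to [B, A].
    static Map<String, String> restoreExample() {
        List<String> persisted = Arrays.asList("A", "B");
        List<String> fresh = Arrays.asList("B", "A");
        List<String> order = reconfigure(persisted, fresh);
        return read(order, Arrays.asList("a-value", "b-value"));
    }
}
```

Without the reconfiguration step, reading the stored values with the new serializer's default order would assign the value of Field “A” to Field “B” and vice versa.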
  • 44. Upgrading State Serializers (III) ● Custom serializers must define the metadata to write: TypeSerializerConfigSnapshot ● Also define logic for compatibility checks
        public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> {
            ...
            TypeSerializerConfigSnapshot snapshotConfiguration();
            CompatibilityResult<MyCustomType> ensureCompatibility(
                TypeSerializerConfigSnapshot configSnapshot);
        }
  • 46. TL;DR ● The Flink community takes state management in Flink very seriously ● We try to make sure that users will feel comfortable in placing their application state within Flink and avoid any kind of lock-in. ● Upcoming changes / features to expect related to state management: Eager State Declaration & State Migration