Apache Flink


Learning resourcesarrow-up-right

Article from Confluentarrow-up-right

Streaming joinsarrow-up-right

Stateful and stateless stream processingarrow-up-right with Apache Kafka and Apache Flink,


Apache Flink is a high-performance, distributed engine specifically engineered for stateful computations over both unbounded (streaming) and bounded (batch) data. It is often described as the "gold standard" for stream processing because it treats streaming as the primary use case and batch as just a special case of finite streams.

Core System Philosophy

Flink is designed to be a "good citizen" in a modern data stack. Rather than trying to be a database or a resource manager itself, it focuses on the core logic of distributed processing and delegates other responsibilities to established infrastructure:

  • Resource Management: Flink can run as a standalone cluster, but it is natively integrated with Kubernetes (K8s), YARN, and Mesos. In 2026, the industry has largely shifted toward Native Kubernetes deployments, where Flink dynamically communicates with the K8s API to spin up or tear down worker pods based on the job's needs.

  • Durable Storage: Flink does not provide its own permanent storage. It relies on distributed file systems or object stores like S3, GCS, or HDFS to store the snapshots (checkpoints) of its internal state.

  • High Availability: To prevent a single point of failure, Flink uses a leader-election mechanism. While ZooKeeper was the original standard, many modern setups now use Kubernetes-native HA, which uses K8s' internal storage for leader election.


Interaction Diagram

Here is a diagram showing the interaction between the client, the coordinator (JobManager), and the workers (TaskManagers).

spinner

Key Components

  1. JobManager (The Brain): Coordinates the execution. It decides when to take checkpoints, handles task scheduling, and manages the cluster's recovery if a worker fails.

  2. TaskManager (The Muscle): These are the worker processes that actually execute the tasks. They buffer and exchange data streams between one another. Each TaskManager provides "slots," which are the smallest unit of resource scheduling.

  3. The Client: This isn't technically part of the runtime. It’s where your code lives. It takes your program, transforms it into a JobGraph, and ships it to the JobManager.


Windowing strategies




Last updated