Interesting links - July 2025
A monthly roundup of data engineering links covering Apache Iceberg, Kafka, Debezium, Spark, and lakehouse architecture.
A technical guide on using Flink SQL to write data to Apache Iceberg tables stored on AWS S3, with metadata managed by the AWS Glue Data Catalog.
A technical guide exploring how Flink SQL handles joins and changelogs for streaming data, with practical examples using Kafka connectors.
A technical guide exploring different Flink SQL connectors and formats for ingesting and processing Debezium CDC events from Apache Kafka topics.
Explores methods for ingesting Debezium CDC events from Kafka into Apache Flink using different SQL connectors and data formats.
A technical tutorial on using Apache Flink SQL to explore and process real-time flood monitoring data from a government API, demonstrating data wrangling techniques.
A technical guide on joining two data streams using Apache Flink SQL, including code examples and practical considerations.
A technical tutorial on using the UNNEST operator in Flink SQL to explode nested arrays of sensor data into separate rows.
A technical guide on troubleshooting S3 connectivity issues in Apache Flink SQL, focusing on configuration and common pitfalls.
A developer's guide to troubleshooting common pitfalls and misconfigurations when setting up and using Apache Flink SQL with JDBC connectors.
Troubleshooting guide for resolving ClassNotFoundException errors in Apache Flink SQL by managing and locating the correct JAR files.
A hands-on guide to using different catalogs, including Apache Hive, with Flink SQL, covering installation, configuration, and practical insights.
Explains the role and types of catalogs in Apache Flink SQL, comparing them to catalogs in a traditional RDBMS and highlighting their importance in data management.
A monthly roundup of articles and resources on stream processing, Apache Flink, Kafka, and SQL for data engineering and real-time analytics.
A hands-on guide exploring the Apache Flink SQL Client, covering setup, result modes, and running basic queries in a local cluster.