Quick profiling of data in Apache Kafka using kafkacat and visidata
A guide to quickly profiling data in Apache Kafka topics using the kafkacat and visidata command-line tools for data exploration.
Robin Moffatt is a Principal DevEx Engineer and seasoned conference speaker with 15+ years of experience presenting at top events like QCon, Devoxx, Kafka Summit, and Strata. He shares insights on developer experience, distributed systems, and cloud technologies through his blog, YouTube, and public talks.
491 articles from this blog
A guide on integrating OpenSeaMap maritime data into Kibana maps for enhanced visualization of location-based data like ship tracking.
A technical guide on using kafkacat to load static CSV data into Apache Kafka topics for data enrichment, focusing on simplicity over Kafka Connect.
Using bash shell tools like kafkacat, jq, sort, and uniq to perform a GROUP BY-style analysis on data from a Kafka topic.
How to run commands as root in Docker containers that default to non-root users, using the --user flag.
Guide to deploying a self-managed Kafka Connect worker with Docker to integrate custom connectors with Confluent Cloud.
How to configure Kafka Connect to automatically create and customize topics for source connectors, including partition and replication settings.
A deep dive into Kafka Connect's Single Message Transforms (SMTs), exploring their use for data manipulation within the pipeline.
Final part of a series exploring community-built Single Message Transforms (SMTs) for Apache Kafka Connect, highlighting useful plugins.
Explains how to use Kafka Connect predicates and the Filter SMT to conditionally transform and drop messages, with a practical example of field renaming.
Explains how to use Kafka Connect's ReplaceField Single Message Transform to include, exclude, or rename fields in data streams.
How to automate publishing future-dated Hugo blog posts using scheduled GitHub Actions workflows.
Explains how to use the Cast Single Message Transform in Kafka Connect to change data types of fields in Kafka messages.
Explains how to use Kafka Connect's TimestampConverter SMT to transform timestamp fields between string and native types for proper data handling.
Explains how to use Kafka Connect's TimestampRouter SMT to dynamically route messages to time-partitioned topics based on message timestamps.
Day 6 of a series on Kafka Connect Single Message Transforms, focusing on using InsertField to add static values and Kafka metadata to messages.
Explains how to use Kafka Connect's MaskField Single Message Transform to mask sensitive data fields like credit card numbers during data ingestion.
Explains how to use Kafka Connect's RegexRouter SMT to rename topics for source connectors and target objects for sink connectors.
Explains how to use the Flatten Single Message Transform (SMT) in Kafka Connect to convert nested JSON data into a flat structure for database insertion.
Explains how to use Kafka Connect's ValueToKey and ExtractField Single Message Transforms to set message keys from data fields.