Current 22 - Session Analysis with DuckDB and Jupyter Notebook
Analyzing conference session ratings using DuckDB and Jupyter Notebooks to demonstrate data wrangling and SQL on raw CSV data.
Robin Moffatt is a Principal DevEx Engineer and seasoned conference speaker with 15+ years of experience presenting at top events like QCon, Devoxx, Kafka Summit, and Strata. He shares insights on developer experience, distributed systems, and cloud technologies through his blog, YouTube, and public talks.
491 articles from this blog
Explains the evolution from ETL to ELT in data engineering, clarifying the role of modern tools like dbt in the transformation process.
A hands-on tutorial exploring LakeFS for data versioning and branching using PySpark and Jupyter notebooks in a data engineering context.
A curated list of essential resources for data engineering, including articles, newsletters, podcasts, and tools.
Explores modern data engineering trends in 2022, focusing on analytical data storage formats, organization, and access patterns.
A data engineer explores the evolution of the data ecosystem, comparing past practices with modern tools and trends in 2022.
A workaround to customize the fields shown in Airtable's .ics calendar export, which by default only uses the primary field.
A behind-the-scenes look at how the program committee used data and tools to select talks for the Current 2022 and Kafka Summit tech conferences.
A guide on crafting effective abstracts for short, focused lightning talks at tech conferences, emphasizing clarity and a single core idea.
A program committee chair shares common mistakes in tech conference talk abstracts and provides tips for writing better submissions.
A developer advocate shares experiences and strategies for effective remote advocacy, covering virtual conferences, YouTube content, and remote engagement.
A guide to setting up automated Hugo blog draft previews using GitHub Actions and Surge.sh for collaborative review.
A developer shares essential and nice-to-have software tools for setting up a new Mac for productivity and development work.
A developer shares why Alfred App is an essential Mac productivity tool, highlighting features like clipboard history, file search, and workflows.
A guide to automating ksqlDB query deployments using bash scripts and REST endpoints, with examples for local and Confluent Cloud.
A technical guide on using the FilePulse connector to stream CSV data into a Kafka topic on Confluent Cloud.
A guide on connecting to managed ksqlDB in Confluent Cloud using the REST API and CLI, covering API key creation and setup.
A technical guide on using ksqlDB to process and transform complex JSON data from ActiveMQ via Kafka Connect, including array splitting.
A technical deep-dive into how the Kafka Connect JDBC Sink connector handles primary keys for database operations like upserts and deletes.
Troubleshooting a Kafka Connect JDBC Sink error where a TEXT/BLOB column is used as a primary key in MySQL without specifying a key length.