Real-time Data Streaming with Kafka Connect

Why Kafka Connect? You can always write your own connector to move data from Kafka to S3 or a database, for example with confluent-kafka-python, but this can be hard to maintain and error-prone. Kafka Connect simplifies the task. In this post we will: set up a local Kafka cluster, S3 storage & Kafka Connect with Docker Compose; create a Kafka topic and publish messages to it; and use Kafka Connect to create an S3 sink connector that writes the messages to S3. Run the Example: You can run the full example described in this post by executing a script that downloads the necessary files, starts the containers, creates a Kafka topic, publishes messages, and creates an S3 sink connector to write the data to S3...
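
To give a rough idea of what creating an S3 sink connector involves, here is a minimal sketch that builds the request for the Kafka Connect REST API without sending it. It is not the script from the post. It assumes the Confluent S3 sink connector is installed on the Connect worker and that the S3 storage is an S3-compatible service reachable via `store.url`. The connector name, topic, bucket, URLs and `flush.size` are hypothetical placeholders.

```python
import json
import urllib.request


def build_s3_sink_request(connect_url, name, topic, bucket, store_url):
    """Build (but do not send) the request that registers an S3 sink connector."""
    # Assumed setup: Confluent S3 sink connector writing JSON files.
    config = {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "topics": topic,
        "s3.bucket.name": bucket,
        "s3.region": "us-east-1",
        "store.url": store_url,  # endpoint of the S3-compatible storage (assumption)
        "flush.size": "3",       # records per S3 object, placeholder value
        "tasks.max": "1",
    }
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    # Kafka Connect creates a connector on POST /connectors.
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All values below are hypothetical placeholders.
request = build_s3_sink_request(
    connect_url="http://localhost:8083",
    name="s3-sink-example",
    topic="example-topic",
    bucket="example-bucket",
    store_url="http://minio:9000",
)
# To register the connector against a running Connect worker:
# urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to inspect or unit-test the configuration before any containers are running.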

July 21, 2025 · 4 min · 689 words · Andreas Lay

Ibis: Build your SQL Queries via Python ‒ One API for nearly 20 Backends

What is Ibis and Why Would You Use It? Ibis is a backend-agnostic query builder / dataframe API for Python. Think of it as an interface for dynamically generating & executing SQL queries via Python. Ibis provides a unified API for a variety of backends like Snowflake, BigQuery, DuckDB, Polars or PySpark. But why would this be useful? I mainly use it for: generating complex dynamic queries where (Jinja) templating would become too messy; writing SQL generation as self-contained, easily testable Python functions; and swapping a production OLAP database (like Snowflake) at test time for a local & version-controlled DuckDB instance containing test data. If you know SQL you already know Ibis....
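
As an illustration of "SQL generation as a testable Python function", here is a small sketch, assuming a recent Ibis release is installed. The `orders` table, its columns and both function names are hypothetical. Ibis is imported inside the functions only so that the module can be loaded where Ibis is not available.

```python
def build_revenue_query(min_amount):
    """Return an Ibis expression summing order amounts per country."""
    import ibis
    from ibis import _

    # Unbound table: only a schema, no backend connection needed (hypothetical table).
    orders = ibis.table({"country": "string", "amount": "float64"}, name="orders")
    return (
        orders.filter(_.amount >= min_amount)
        .group_by("country")
        .aggregate(total_amount=_.amount.sum())
    )


def to_duckdb_sql(expr):
    """Compile an Ibis expression to a SQL string in the DuckDB dialect."""
    import ibis

    return str(ibis.to_sql(expr, dialect="duckdb"))
```

Calling `to_duckdb_sql(build_revenue_query(100))` yields the generated SQL as a string, and passing a different `dialect` compiles the same expression for another backend.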

March 30, 2025 · 7 min · 1394 words · Andreas Lay

dbt: Programmatic Invocation via dbtRunner

Introduction: dbt is a great tool for building & organising your ELT data pipelines. When deploying dbt yourself, you can invoke it either through the dbt Core CLI or from Python via dbtRunner. I will give you an example template for the latter. You can find the full example in the Full Example section. Note: This example was built on dbt-core==1.8.3. dbtRunner may be subject to breaking changes, so there’s no guarantee the provided code works as is with other dbt versions....
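
For orientation, a minimal sketch of a programmatic invocation could look like the following. It is not the template from the post. It assumes dbt-core 1.5 or later, where `dbtRunner` is importable from `dbt.cli.main`. The helper names `build_dbt_args` and `run_dbt` are hypothetical.

```python
def build_dbt_args(command, select=None, target=None):
    """Assemble CLI-style arguments, e.g. ["run", "--select", "my_model"]."""
    args = [command]
    if select:
        args += ["--select", select]
    if target:
        args += ["--target", target]
    return args


def run_dbt(args):
    """Invoke dbt in-process and raise if the invocation did not succeed."""
    # Imported lazily so the argument helper works without dbt installed.
    from dbt.cli.main import dbtRunner

    result = dbtRunner().invoke(args)
    if not result.success:
        # result.exception is set for unhandled errors and may be None otherwise.
        raise RuntimeError(f"dbt {' '.join(args)} failed") from result.exception
    return result.result
```

Inside a dbt project directory, `run_dbt(build_dbt_args("run", select="my_model"))` would run a single model, where `my_model` is a placeholder name.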

August 6, 2024 · 9 min · 1823 words · Andreas Lay