ARIMA Models in Python: All Just Statsmodels Under The Hood?

What’s ARIMA and Why Should You Care? If you’re working with time series and need to produce forecasts, autoregressive (integrated) moving-average models (AR(I)MA) are still a good place to start. But which Python implementation should you use (if you don’t want to use R)? Recently I’ve again been looking into what the Python ecosystem has to offer for time series analysis in general and ARIMA models in particular. There are quite a few options, but you should have a rough understanding of what’s happening under the hood: are we dealing with a framework that wraps existing libraries, or with a native implementation?...

May 11, 2025 · 3 min · 615 words · Andreas Lay

Ibis: Build your SQL Queries via Python ‒ One API for nearly 20 Backends

What is Ibis and Why Would You Use It? Ibis is a backend-agnostic query builder / dataframe API for Python. Think of it as an interface for dynamically generating & executing SQL queries via Python. Ibis provides a unified API for a variety of backends like Snowflake, BigQuery, DuckDB, Polars or PySpark. But why would this be useful? I mainly use it for: generating complex dynamic queries where (Jinja) templating would become too messy; writing SQL generation as self-contained, easily testable Python functions; and swapping a production OLAP database (like Snowflake) at test time for a local, version-controlled DuckDB instance containing test data. If you know SQL you already know Ibis....

March 30, 2025 · 7 min · 1394 words · Andreas Lay

Distributed locking using Google Cloud Storage (or S3)

The Need for Distributed Locks When you want to prevent two pieces of code ‒ possibly running on different machines ‒ from running concurrently, you need a distributed lock. An easy way to implement such a lock is to leverage a cloud storage service like Google Cloud Storage (GCS) or Amazon S3. Here you can find the full code example implementing this kind of lock....

March 27, 2025 · 4 min · 785 words · Andreas Lay

dbt: Programmatic Invocation via dbtRunner

Introduction dbt is a great tool for building & organising your ELT data pipelines. When deploying dbt yourself you can invoke it either through the dbt Core CLI or from Python via dbtRunner. I will give you an example template for the latter. You can find the full example in the Full Example section. Note: This example was built on dbt-core==1.8.3. dbtRunner may be subject to breaking changes, so there’s no guarantee the provided code works as is with other dbt versions....

August 6, 2024 · 9 min · 1823 words · Andreas Lay

Hackerfluff: A Hackernews Reader built with Flutter

A Hackernews Client In an attempt to learn some frontend / mobile development I’ve decided to give Flutter a try. Flutter is a cross-platform framework developed by Google which lets us build & deploy for mobile (iOS & Android), web and desktop from the same code base. I’m an avid Hacker News reader, and they provide an API to query stories and comments, so I decided to build a Hackernews client app....

May 22, 2024 · 4 min · 789 words · Andreas Lay