Calculate the cumulative sum of a column using DuckDB

Duckdb, the (tabular) data exploration tool I use supports window operations. I recently discovered that it can also perform cumulative sums in a very efficient manner. Let us generate a toy dataset where we want to calculate the sum of one column relative to the order of another one. CREATE OR REPLACE TABLE seed AS SELECT SETSEED(0.1); -- seeding for reproducibility, creating a table to hide output -- Create a mock dataset with two integer columns CREATE OR REPLACE TABLE my_table AS SELECT #1 AS column_1, CAST(FLOOR(RANDOM() * 100) AS INT) AS column_2 FROM generate_series(1, 10); -- This generates 10 rows SELECT * FROM my_table; -- We write it to a csv for future use COPY my_table TO my_table.csv; ┌──────────┬──────────┐ │ column_1 │ column_2 │ │ int64 │ int32 │ ├──────────┼──────────┤ │ 1 │ 27 │ │ 2 │ 45 │ │ 3 │ 2 │ │ 4 │ 84 │ │ 5 │ 84 │ │ 6 │ 26 │ │ 7 │ 18 │ │ 8 │ 65 │ │ 9 │ 97 │ │ 10 │ 11 │ ├──────────┴──────────┤ │ 10 rows 2 columns │ └─────────────────────┘ If we wanted to calculate the distribution of the cumulative sum of the table we could use the OVER clause to perform the sum of column_2 in the order defined by column_1. ...

2025-10-22 · Alán F. Muñoz

A workflow for bioimaging and data exploration

One of the common challenges when analysing large bioimaging datasets is to bring it all together in one place. I usually use tools like DuckDB for database querying and copairs for selecting statistically significant subsets of the data. For one of my recent projects I built a marimo interface to explore the result of large-scale (~2TB images, ~2GB feature profiles) image-based profiles, then performs dimensionality reduction of the data, and finally retrieves back the images. This I think is the ideal workflow, one where you can be nimble and pull up the images alongside statistical analyses to be able to interpret the data structure in the biological context. The code is not yet available to the public, but you can find the demo here. ...

2025-07-30 · Alán F. Muñoz