Run multiple Python scripts in the background

To solve a multitude of challenges I have faced when processing high-throughput microscopy data, I have developed Nahual, a tool that allows me to move data across multiple Python environments that deploy deep learning models in the background. I usually keep these models “listening” in the background for the main analysis pipeline (aliby) to send them data to process. To monitor what is going on inside these scripts I use GNU screen, which allows me to detach from and reattach to these sessions whenever I need to. At some point I had to reboot my server and had to rerun all of these in independent screens. This rudimentary shell script did the job: ...

2025-08-26 · Alán F. Muñoz

Simple progress indicators with awk

I wanted a simple way to see the progress of a data processing pipeline, and the internal progress bar tools were messed up by threading. I thus decided to use the number of output files in each folder as an indicator of progress. In my case the output of tree . looks like this:

    .
    └── steps
        ├── A01_001
        │   ├── segment_nuclei
        │   │   ├── 0000.npz
        │   │   ├── 0001.npz
        │   │   ├── ...
        │   │   └── 0019.npz
        │   ├── tile
        │   │   ├── 0000.npz
        │   │   ├── 0001.npz
        │   │   ├── ...

I can get the info I need by counting the total number of files and the occurrences of the A01_001 -> P24_005 range (these are fields of view from a microscopy experiment). Using this simple find command we get all the files in the current folder. ...
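As a rough illustration of the counting idea, and not the post's own command, here is a minimal sketch that tallies .npz files per processing step. The function name count_per_step is hypothetical, and the layout (steps/<field of view>/<step>/<file>.npz) is assumed from the tree output shown in the excerpt.

```shell
# Hypothetical helper: count output files per processing step.
# Assumes paths of the form <root>/steps/<field_of_view>/<step>/<file>.npz,
# so the parent directory of each file, field NF-1 when splitting on "/",
# is the step name.
count_per_step() {
    find "${1:-.}" -type f -name '*.npz' |
        awk -F/ '
            { n[$(NF-1)]++ }                              # tally by parent directory
            END { for (s in n) printf "%s\t%d\n", s, n[s] }
        ' |
        sort                                              # awk array order is unspecified
}
```

Calling count_per_step with no argument scans the current folder. Swapping $(NF-1) for $(NF-2) would group by field of view instead of by step.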

2025-08-19 · Alán F. Muñoz

Update figure numbering

I was editing some markdown and had to insert a new figure in the middle. The problem is that this document already has explicit figure numbering (e.g., “Figure 5”), so changing tens of figures by hand felt dull. I like to run small (GNU) awk scripts for this type of task.

    # update_figures.awk
    {
        # The three-argument form of match() is a GNU awk extension.
        if (match($0, "Figure ([0-9]+)", num)) {
            if (num[1] > after) {
                gsub("Figure ([0-9]+)", "Figure " num[1] + increase_by)
            }
        }
        print $0
    }

This changes Figure X into Figure X + increase_by for every X greater than the variable “after”. And we can run it as follows: ...
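One caveat with the script above is that gsub rewrites every figure reference on a line using the number of the first match. As a hedged alternative, and not the author's method, here is a sketch that walks through each match separately using only POSIX awk features, so it does not depend on GNU awk. The function name renumber_figures is hypothetical; it reads from standard input and takes after and increase_by as positional arguments.

```shell
# Hypothetical helper: renumber_figures AFTER INCREASE_BY < input > output
# Shifts every "Figure N" with N > AFTER by INCREASE_BY, one match at a time.
renumber_figures() {
    awk -v after="$1" -v increase_by="$2" '
    {
        rest = $0; out = ""
        while (match(rest, /Figure [0-9]+/)) {
            # "Figure " is 7 characters long, so the digits start at RSTART + 7.
            n = substr(rest, RSTART + 7, RLENGTH - 7) + 0
            if (n > after) n += increase_by
            out = out substr(rest, 1, RSTART - 1) "Figure " n
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}
```

For example, renumber_figures 4 1 < draft.md > renumbered.md (file names are placeholders) would leave Figures 1 to 4 alone and shift every later figure by one.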

2025-08-19 · Alán F. Muñoz

Recursive search and replace

I needed to replace all occurrences of one pattern with another, in a case where I knew there were no ambiguous matches. This uses ripgrep, xargs and GNU sed. source.

    rg old_pattern --files-with-matches | xargs sed -i 's/old_pattern/new_pattern/g'
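The one-liner splits file names on whitespace, so paths containing spaces would break it. Below is a minimal sketch of a NUL-delimited variant. To keep it runnable where ripgrep is not installed, it swaps in GNU grep -rlZ for the file listing, which is an assumption of this sketch and not what the post uses. With ripgrep the equivalent listing would be rg --files-with-matches --null old_pattern, piped into xargs -0. The function name replace_all is hypothetical.

```shell
# Hypothetical helper: replace_all OLD NEW [DIR]
# Assumes GNU grep, GNU xargs and GNU sed. OLD is treated as a basic regular
# expression by both grep and sed, and neither OLD nor NEW may contain "/".
replace_all() {
    # -r recurse, -l list matching files only, -Z end each file name with NUL
    grep -rlZ -- "$1" "${3:-.}" |
        # -0 reads NUL-delimited names, -r skips sed when nothing matched
        xargs -0 -r sed -i "s/$1/$2/g"
}
```

Running grep -rl old_pattern first gives a preview of which files would be touched, since sed -i edits them in place.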

2025-08-12 · Alán F. Muñoz

A workflow for bioimaging and data exploration

One of the common challenges when analysing large bioimaging datasets is bringing it all together in one place. I usually use tools like DuckDB for database querying and copairs for selecting statistically significant subsets of the data. For one of my recent projects I built a marimo interface that explores the results of large-scale (~2TB images, ~2GB feature profiles) image-based profiling, performs dimensionality reduction of the data, and finally retrieves the corresponding images. This, I think, is the ideal workflow: one where you can be nimble and pull up the images alongside the statistical analyses, so that you can interpret the data structure in its biological context. The code is not yet available to the public, but you can find the demo here. ...

2025-07-30 · Alán F. Muñoz

GitHub code review on an existing code base

Create an empty branch with one empty commit

- Create new branch: git checkout --orphan review-1-target
- Reset: git reset .
- Clean branch: git clean -df
- Add empty commit: git commit --allow-empty -m 'Empty commit'

Rebase a branch to put this commit at the root

- Push to your fork: git push -u origin review-1-target
- Move to branch to review: git checkout origin/main
- Spin-off branch from here: git checkout -b review-1
- Rebase to empty branch: git rebase -i review-1-target (the empty commit must be at the start)
- Push: git push -u origin review-1

That should make a pull request possible, providing the code review tooling. source ...
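The local part of these steps can be sketched as a single function. This is an illustration under assumptions, not the original recipe verbatim. The name prepare_review and its arguments are hypothetical. It uses git rm -rf in place of git reset to empty the orphan branch, a non-interactive rebase in place of git rebase -i, and it leaves both pushes to be run by hand.

```shell
# Hypothetical helper: prepare_review BASE TARGET REVIEW
#   BASE   - the code to review, e.g. origin/main
#   TARGET - new orphan branch that will hold one empty commit
#   REVIEW - new branch carrying BASE's history on top of that empty commit
# Caution: git clean deletes untracked files, so run this in a clean clone.
prepare_review() {
    base=$1 target=$2 review=$3
    # 1. Orphan branch with a single empty commit.
    git checkout --quiet --orphan "$target" &&
    git rm -rf --quiet --ignore-unmatch . &&      # empty the index and worktree
    git clean -dfq &&
    git commit --quiet --allow-empty -m 'Empty commit' &&
    # 2. Branch off the code to review and replay it onto the empty commit.
    git checkout --quiet -b "$review" "$base" &&
    git rebase --quiet "$target"
}
```

After prepare_review origin/main review-1-target review-1, pushing both branches with git push -u origin as in the steps above should let a pull request from review-1 into review-1-target show the whole code base as a diff.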

2024-11-26 · Alán F. Muñoz