disable cyclic gc on hot ranks + add rt tuning docs by jl33-ai · Pull Request #14 · LorenFrankLab/realtime_decoder

jl33-ai · 2026-05-29T21:43:32Z

this is the OS-level tail-latency follow-up to the per-tick allocation work in #9 and #12. python's GC and the linux scheduler/IRQ defaults are two separate sources of multi-ms hiccups, and for closed-loop the tail is what matters, not the mean.

code

new `realtime_decoder/rt_tuning.py` with a `gc_paused_for_main_loop(config)` context manager. runs one `gc.collect()` up front, `gc.disable()` for the duration of the with, re-enables and collects on exit. opt out via `performance.disable_gc_in_main_loop: false` (useful when hunting reference cycles in dev).
encoder, decoder, and ripple `main_loop` methods wrap their try/except body in that context manager. once #9 (preallocate decoder hot path scratch) and #12 (preallocate encoder and ripple hot path scratch) are in, steady-state allocations are low enough that turning off automatic collection, gen-2 passes in particular, cleanly removes one common source of 50-100ms spikes.
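for reviewers who want the shape without opening the diff, here is a minimal sketch of the context manager as described above. it is not the code in `rt_tuning.py`: the nested-dict config layout and the default of `True` when the key is missing are assumptions made for illustration.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused_for_main_loop(config):
    """Pause the cyclic GC for the duration of the with block."""
    # ASSUMPTION: config is a nested dict shaped like
    # {"performance": {"disable_gc_in_main_loop": bool}}, defaulting to True.
    perf = config.get("performance", {})
    if not perf.get("disable_gc_in_main_loop", True):
        # Opted out: leave the collector alone.
        yield
        return

    gc.collect()   # one full collection up front, outside the hot loop
    gc.disable()   # stops automatic collection for every generation
    try:
        yield
    finally:
        gc.enable()
        gc.collect()  # reclaim any cycles that piled up while paused
```

note that `gc.disable()` only stops the cyclic collector. reference counting still frees acyclic objects immediately, so memory only grows while paused if the loop creates reference cycles.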

docs

new `docs/realtime_tuning.md` covering the OS recipe in the order I'd apply it:
1. don't fight the GC (already wired up here)
2. `mpiexec -bind-to hwthread` + rankfiles
3. `isolcpus` / `nohz_full` / `rcu_nocbs` on the kernel cmdline
4. IRQ affinity off the hot cores
5. SCHED_FIFO via `chrt` + `kernel.sched_rt_runtime_us=-1`
6. mlockall (with notes since stock python doesn't have a clean API)
7. PREEMPT_RT kernel
8. `cpupower frequency-set -g performance` + intel `no_turbo`
9. a launcher shell script that ties it together
10. a verification checklist (p99/p99.9 from the per-rank timing .npz)
README points at it.
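
a quick way to confirm that steps 2 and 5 actually took effect is to have each rank report its own affinity and scheduling policy at startup. the sketch below is illustrative and not part of this PR; `describe_rt_state` is a hypothetical helper name, and it relies on the Linux-only `os.sched_*` functions from the standard library.

```python
import os

# Map scheduling policy constants to readable names. hasattr() guards against
# constants that are not defined on every platform.
_POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR", "SCHED_BATCH", "SCHED_IDLE")
    if hasattr(os, name)
}


def describe_rt_state(pid=0):
    """Return CPU affinity, policy, and priority for a process (0 = caller)."""
    policy = os.sched_getscheduler(pid)
    return {
        # CPUs the process is allowed to run on; one entry if pinned to a hwthread.
        "cpus": sorted(os.sched_getaffinity(pid)),
        # Falls back to the raw number if extra flag bits are set on the policy.
        "policy": _POLICY_NAMES.get(policy, str(policy)),
        "priority": os.sched_getparam(pid).sched_priority,
    }
```

a rank launched with hwthread binding under `chrt -f` should report a single CPU and `SCHED_FIFO`; an untuned process typically reports every online CPU and `SCHED_OTHER`.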

deliberately not in this PR

no benchmark numbers in the doc. real latency numbers are setup-dependent and quoting them as universal would mislead. the doc is structured so each step is independently verifiable against the existing timing output.
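
for the verification step, the comparison that matters is tail percentiles before and after each change. a minimal sketch of that calculation follows, assuming the per-tick durations have already been pulled out of the timing .npz into a plain sequence of milliseconds. the array names inside those files are not shown here, and `percentile` / `tail_summary` are hypothetical helper names.

```python
import math


def percentile(sorted_samples, q):
    """Nearest-rank percentile: smallest sample with about q% of data at or below it."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(q / 100.0 * len(sorted_samples)))
    return sorted_samples[rank - 1]


def tail_summary(tick_durations_ms):
    """Summarise per-tick latency with the mean and the tail percentiles."""
    ordered = sorted(tick_durations_ms)
    return {
        "mean": sum(ordered) / len(ordered),
        "p99": percentile(ordered, 99.0),
        "p99.9": percentile(ordered, 99.9),
        "max": ordered[-1],
    }
```

a handful of slow ticks barely moves the mean but shows up directly in p99.9 and max, which is why the checklist looks at those and not at the average.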

no python-side mlockall implementation. the cleanest path is a tiny LD_PRELOAD shim or a C extension; both feel out of scope for one PR. happy to do it as a follow up if there's interest.

happy to gate the gc disable behind an env var instead of a config flag if you prefer; let me know.


jl33-ai wants to merge 1 commit into LorenFrankLab:main from jl33-ai:gc-discipline-and-rt-tuning-docs
