fix: stop active pipeline workers before unload to free VRAM #966
Open
Conversation
When /pipeline/load triggered a swap while a PipelineProcessor worker
was still producing frames, the unload path dropped pipeline_manager's
reference and called gc.collect()/empty_cache(), but the worker thread
kept the pipeline object alive through its closure and continued
allocating CUDA memory. The next load (e.g. longlive after ltx2) OOMed
with ~30 GiB still in use despite logging "CUDA cache cleared".
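For illustration, a minimal sketch of the failure mode; the names here (running, generate_frame) are illustrative, not the actual processor code. The point is that the worker's closure holds a strong reference to the pipeline, so dropping the manager's reference and clearing the CUDA cache cannot release the model's VRAM while the thread is still alive:

```python
import threading


def start_worker(pipeline):
    def run():
        # The closure holds a strong reference to `pipeline`, so the object
        # (and its CUDA tensors) stays alive for as long as this thread runs.
        while pipeline.running:
            pipeline.generate_frame()  # keeps allocating CUDA memory

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


# Unload path before this fix (simplified): dropping the manager's reference
# removes only one of the references, so the collector cannot free anything.
# pipelines.pop(pipeline_id)
# gc.collect()
# torch.cuda.empty_cache()   # logs "CUDA cache cleared", VRAM stays in use
```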
Add a pre-unload hook registry on PipelineManager. graph_executor
registers each processor's stop() under its node_id at creation time,
and FrameProcessor.stop() unregisters on normal teardown. The hook
fires synchronously inside _unload_pipeline_by_id_unsafe BEFORE the
pipeline reference is dropped, so the worker exits and releases its
tensors first — then gc/empty_cache can actually reclaim VRAM.
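A rough sketch of how the registry fits together, assuming simplified names and signatures; the real wiring lives in PipelineManager, graph_executor, and FrameProcessor in this PR:

```python
import gc

import torch


class PipelineManager:
    def __init__(self):
        self._pipelines = {}          # pipeline_id -> pipeline object
        self._pre_unload_hooks = {}   # node_id -> callable, e.g. a processor's stop()

    def register_pre_unload_hook(self, node_id, hook):
        # graph_executor calls this when it creates a processor for a node.
        self._pre_unload_hooks[node_id] = hook

    def unregister_pre_unload_hook(self, node_id):
        # FrameProcessor.stop() calls this on normal teardown.
        self._pre_unload_hooks.pop(node_id, None)

    def _unload_pipeline_by_id_unsafe(self, pipeline_id):
        # 1. Stop any workers that still reference the pipeline; each hook is
        #    expected to block until its worker thread has exited.
        for hook in list(self._pre_unload_hooks.values()):
            hook()
        self._pre_unload_hooks.clear()

        # 2. Only now drop the reference and reclaim VRAM.
        self._pipelines.pop(pipeline_id, None)
        gc.collect()
        torch.cuda.empty_cache()
```

Running the hooks synchronously (rather than signalling the workers and returning) is what guarantees the ordering seen in the logs: the worker is gone before gc.collect() runs.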
Verified: loading ltx2, running a session, then POSTing
/pipeline/load {longlive, passthrough} without a session stop now
succeeds. Log sequence is Unloading → PipelineProcessor stopped →
CUDA cache cleared, and the next session starts cleanly.
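For reference, the swap in that verification step can be triggered with a request along these lines. The host, port, and payload shape below are only illustrative; the real body follows whatever schema the server's /pipeline/load handler expects:

```python
import requests  # assumes the runner is reachable on localhost:8000

# Illustrative payload only: the actual /pipeline/load request schema is
# defined by the server, not by this sketch.
resp = requests.post(
    "http://localhost:8000/pipeline/load",
    json={"pipelines": ["longlive", "passthrough"]},
    timeout=60,
)
resp.raise_for_status()
```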
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Rafal Leszko <rafal@livepeer.org>
Contributor
🚀 fal.ai Preview Deployment
Livepeer Runner
Testing Livepeer Mode