fix(settings): make DATA_UPLOAD_MAX_MEMORY_SIZE env-configurable#1224
Previously hardcoded at 100 MB in `base.py`. In practice ADC workers post ML result payloads for a full batch (detection coordinates + classifications for tens of images) in a single POST, and those have been observed in the 139–321 MB range on staging for the `global_moths_2024` pipeline — well above the 100 MB ceiling.

Raising the limit in code would be both premature (the proper fix is worker-side incremental posting, tracked in #1223) and environment-specific (staging may need to tolerate today's payloads; production may want a tighter ceiling to catch regressions). Making it an env override lets each deployment tune the value without a code change and without maintaining a hot-patch on the server.

Reads from `DJANGO_DATA_UPLOAD_MAX_MEMORY_MB` (integer, in MB). Default stays at 100 MB so existing deployments see no change unless they opt in. Nginx's `client_max_body_size` still needs to be raised in lockstep on the fronting proxy — that is independently configurable and lives outside this repo.
Pull request overview
Exposes Django’s DATA_UPLOAD_MAX_MEMORY_SIZE setting as an environment-configurable value to prevent ADC worker result uploads (often >100 MB) from triggering HTTP 413s without requiring server-side code patches.
Changes:
- Replace hardcoded `DATA_UPLOAD_MAX_MEMORY_SIZE = 100 * 1024 * 1024` with an env-controlled value (`DJANGO_DATA_UPLOAD_MAX_MEMORY_MB`, default 100).
- Expand inline documentation explaining why larger uploads are needed and pointing to the longer-term worker-side fix.
Code review found 2 issues:

- `antenna/config/settings/base.py`, lines 428 to 437 in 46c7541
- `antenna/config/settings/base.py`, lines 434 to 437 in 46c7541

🤖 Generated with Claude Code
`DATA_UPLOAD_MAX_MEMORY_SIZE` does not apply to DRF JSON bodies — DRF parsers read from the raw WSGI stream, bypassing the `request.body` check where Django enforces this limit. Add `MaxSizeJSONParser` as the default JSON parser in `REST_FRAMEWORK` to enforce the same ceiling for JSON bodies via `Content-Length` before parsing begins (effective for all well-behaved clients including ADC workers; nginx `client_max_body_size` remains the hard outer limit for chunked transfers).

Also:
- Update the `DATA_UPLOAD_MAX_MEMORY_SIZE` comment to document the scope of enforcement and the relationship to `MaxSizeJSONParser` and nginx
- Remove stray `# type: ignore[no-untyped-call]` (inconsistent with all other `env.int()` calls in this file)
- Add `DJANGO_DATA_UPLOAD_MAX_MEMORY_MB` to `.envs/.production/.django-example` with rationale and nginx coupling note
- Add a commented example to `.envs/.local/.django`

Co-Authored-By: Claude <noreply@anthropic.com>
The purpose of this PR is to raise the Django limit, not add DRF-level enforcement. nginx client_max_body_size is the intended hard cap. Retain the comment noting that DATA_UPLOAD_MAX_MEMORY_SIZE does not apply to DRF JSON bodies so the scope is clear to future readers. Co-Authored-By: Claude <noreply@anthropic.com>
Address Copilot review: the env var name said "MB" but the multiplier is 1024*1024 (binary MiB). Comments now spell that out so operators don't misjudge limits when tuning alongside nginx client_max_body_size. Co-Authored-By: Claude <noreply@anthropic.com>
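To make the MiB-vs-MB distinction concrete (a quick arithmetic check, not repo code): the setting's multiplier is binary, so the configured number yields slightly more bytes than its SI reading suggests.

```python
mb = 100  # DJANGO_DATA_UPLOAD_MAX_MEMORY_MB default

binary_bytes = mb * 1024 * 1024   # what base.py computes (binary, i.e. MiB)
decimal_bytes = mb * 1000 * 1000  # what "MB" literally means (SI, decimal)

print(binary_bytes)                  # 104857600
print(decimal_bytes)                 # 100000000
print(binary_bytes - decimal_bytes)  # 4857600 bytes (~4.9% extra headroom)
```

Small at this scale, but worth spelling out when the value has to match a separately configured proxy limit.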
Summary
`DATA_UPLOAD_MAX_MEMORY_SIZE` in `config/settings/base.py` was hardcoded at 100 MB. In practice, ADC workers posting ML results for a full batch (detection coordinates + classifications) hit that ceiling: result payloads for the `global_moths_2024` pipeline have been observed at 139–321 MB on a staging deployment. The hardcoded value meant every environment that saw one of those bodies returned HTTP 413 until an operator patched the file on the server and restarted Django.
This PR exposes the value as an env var:
```python
DATA_UPLOAD_MAX_MEMORY_SIZE = (
env.int("DJANGO_DATA_UPLOAD_MAX_MEMORY_MB", default=100) * 1024 * 1024
)
```
The value is read from `DJANGO_DATA_UPLOAD_MAX_MEMORY_MB` (an integer, applied with a binary multiplier, i.e. MiB). The default stays at 100 so existing deployments are unaffected unless they opt in.
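For example, a deployment that needs to accept today's largest observed payloads could opt in via its env file (the `400` value is illustrative; the repo ships a documented example in `.envs/.production/.django-example`):

```shell
# Interpreted with a binary multiplier: 400 -> 400 * 1024 * 1024 bytes.
# Raise client_max_body_size on the fronting nginx in lockstep,
# or requests will be rejected at the proxy before Django sees them.
DJANGO_DATA_UPLOAD_MAX_MEMORY_MB=400
```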
Scope of enforcement
`DATA_UPLOAD_MAX_MEMORY_SIZE` covers multipart form data and direct `request.body` access, but does not apply to DRF JSON bodies: DRF parsers read from the raw WSGI stream, bypassing the `request.body` property where Django enforces the limit.
To cover JSON bodies (used by the ML result endpoint), a `MaxSizeJSONParser` (`ami/base/parsers.py`) is added as the default JSON parser in `REST_FRAMEWORK`. It checks the `Content-Length` header before parsing and returns HTTP 400 if the body exceeds the limit. This is effective for all well-behaved clients including ADC workers, which always send `Content-Length`. nginx's `client_max_body_size` remains the hard outer limit for all request types.
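The parser itself lives in `ami/base/parsers.py` and is not reproduced here; the core pre-parse check it performs can be sketched standalone. Names below (`parse_json_with_ceiling`, the `BodyTooLarge` stand-in for DRF's `ParseError`) are illustrative, not the repo's actual API:

```python
import json

# Illustrative ceiling; in the repo this would mirror
# DATA_UPLOAD_MAX_MEMORY_SIZE (binary multiplier, see base.py).
MAX_BYTES = 100 * 1024 * 1024


class BodyTooLarge(Exception):
    """Stand-in for DRF's ParseError, which renders as an HTTP 400."""


def parse_json_with_ceiling(meta, raw_body, max_bytes=MAX_BYTES):
    """Reject a request before any JSON parsing when its declared size
    exceeds the ceiling.

    `meta` mimics Django's request.META. A missing or malformed
    Content-Length is let through: well-behaved clients (including the
    ADC workers) always send one, and nginx client_max_body_size remains
    the hard outer limit for everything else.
    """
    declared = meta.get("CONTENT_LENGTH")
    try:
        declared = int(declared)
    except (TypeError, ValueError):
        declared = None
    if declared is not None and declared > max_bytes:
        raise BodyTooLarge(
            f"Request body ({declared} bytes) exceeds the "
            f"{max_bytes}-byte upload limit"
        )
    return json.loads(raw_body)
```

A 139 MB `Content-Length` is rejected by the header check alone, so the server never buffers or decodes the oversized body.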
Why this and not a bigger hardcoded default
Permanently raising the ceiling in code is premature. The root problem — that a single ADC POST can carry hundreds of MB — is best fixed worker-side by incremental result posting (see #1223), not by bumping server-side limits until the next pipeline blows through them. Until #1223 lands, env-configurability is the minimum-regret escape valve: environments that need a larger ceiling can set it, and there is no ongoing maintenance burden of hot-patching the settings file on staging/production after every deploy.
Paired infra change
nginx's `client_max_body_size` on the fronting proxy must be raised in lockstep with `DJANGO_DATA_UPLOAD_MAX_MEMORY_MB` — Django will never see a request larger than nginx rejects. That value lives in the deployment's proxy config (outside this repo) and is separately configurable. Both the in-code comment and `.envs/.production/.django-example` call this out explicitly.
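As an illustration of that coupling (the block below is hypothetical — the real proxy config lives outside this repo), the proxy limit is raised alongside the env var:

```nginx
server {
    # Must be >= DJANGO_DATA_UPLOAD_MAX_MEMORY_MB, or nginx returns
    # 413 before Django ever sees the request. nginx size suffixes
    # are binary too: 400m == 400 * 1024 * 1024 bytes.
    client_max_body_size 400m;
}
```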
Test plan
🤖 Generated with Claude Code