Limit NSJSONSerialization recursion depth to prevent stack overflow by DTW-Thalion · Pull Request #595 · gnustep/libs-base

DTW-Thalion · 2026-04-13T23:59:48Z

Problem

The NSJSONSerialization parser uses recursive descent for arrays and objects. Before this change, pathologically nested input — e.g. [[[...[null]...]]] at a few thousand levels — could exhaust the C stack and crash the process. There was no depth bound at all, so any caller that accepted untrusted JSON was exposed to a stack-based denial of service.

Fix

Source/NSJSONSerialization.m:

Add depth and maxDepth fields to ParserState, with doc comments describing the invariant that depth mirrors the live container stack (zero at rest, incremented exactly once on successful entry to parseArray/parseObject after the guard check, decremented exactly once on every exit path including error returns).
Add a NS_JSON_SERIALIZATION_MAX_DEPTH macro (set to 512) with a doc comment explaining the choice: RFC 8259 recommends supporting at least 100 levels, most parsers cap in the 100–1000 range, and 512 leaves comfortable head-room below the smallest thread stack GNUstep runs on.
parseArray and parseObject now refuse to descend further once the bound is reached, returning a parse error instead. Both functions maintain the depth invariant on every exit path, including the pre-existing error paths.
+JSONObjectWithData: and +JSONObjectWithStream: initialise p.depth = 0; p.maxDepth = NS_JSON_SERIALIZATION_MAX_DEPTH; before handing the state to the recursive descent.

Regression test

Tests/base/NSJSONSerialization/depth.m — 10 assertions wrapped in an NSJSONDepthTests harness class whose -performTest: method catches NSAssert-raised exceptions and surfaces them through PASS() in main(). Matches the style of Tests/base/NSJSONSerialization/tests00.m. Written in ObjC 1.0 syntax — no container literals, no boxed expressions, no blocks, no libobjc2 runtime extensions — so it builds with every Objective-C compiler GNUstep supports.

Cases:

moderatelyNestedArrayParses / moderatelyNestedObjectParses — 20-level arrays/objects parse cleanly (baseline sanity).
boundaryArrayParses / boundaryObjectParses — 512 levels at the exact boundary still parse. These nail down that the bound is >= rather than >; any future refactor that changes the comparison will break them immediately.
justOverBoundaryArrayRejected / justOverBoundaryObjectRejected — 513 levels are rejected for both arrays and objects.
pathologicallyNestedArrayRejected — 2000 levels rejected.
mixedDeepNestingRejected — 1000 alternating object/array levels rejected.
streamInterfaceRespectsDepthLimit — +JSONObjectWithStream: honours the same bound as +JSONObjectWithData:.
deepAndMalformedRejected — 600 [ with no closing brackets reports an error cleanly (depth guard or EOF handling — either is an acceptable rejection point).

Validation

Built and run against a clang 18.1.3 + libobjc2 stack on Ubuntu 24.04:

Build	Pass	Fail
With depth guard (this patch)	10	0
Depth guard replaced with `if (0)`	5	5

The 5 failing cases under the negative control are exactly the assertions that depend on the guard (513-array, 513-object, 2000-array, mixed-1000, stream-2000). The 5 passing cases are the ones that shouldn't depend on the guard (20-level pair, 512-boundary pair, deep-malformed caught by EOF handling). No flakes in either direction — each assertion cleanly discriminates the fixed and unfixed build.

ChangeLog

Entry added in ChangeLog matching the repository's GNU-style format.

* Source/NSJSONSerialization.m: Track the parser's current nesting depth through the ParserState struct. parseArray and parseObject now refuse to descend further once the bound (512 levels, exposed as NS_JSON_SERIALIZATION_MAX_DEPTH) is reached, returning a parse error instead. Pathologically nested input such as [[[...[null]...]]] with several thousand levels could previously exhaust the C stack and crash the process. The two new ParserState fields carry doc comments describing the invariant that `depth` mirrors the live container stack. * Tests/base/NSJSONSerialization/depth.m: New regression test covering ten cases via an NSJSONDepthTests harness class that wraps each per-selector case in NS_DURING/NS_HANDLER so that individual failures surface through PASS() in main(). Cases: - 20-level array and object each parse cleanly; - a document exactly at the boundary (512 levels) parses for both arrays and objects (proves the bound is `>=`, not `>`); - a document one level past the boundary (513 levels) is rejected for both arrays and objects; - 2000-level array nesting is rejected; - 1000 alternating object/array levels are rejected; - the +JSONObjectWithStream: entry point honours the same bound as +JSONObjectWithData:; - a deeply-nested but syntactically malformed document is rejected cleanly via the error out-parameter. Written in ObjC 1.0 syntax so it builds with every Objective-C compiler GNUstep supports. The test was validated both ways against a clang+libobjc2 build on Ubuntu 24.04: with the depth check in place, all ten assertions pass; with the depth check disabled, the five cases that depend on the guard (513-array, 513-object, 2000-array, mixed-1000, stream-2000) all fail, while the five control cases (20-level array/object, 512-boundary array/object, deep-malformed) still pass. This confirms each assertion discriminates the fixed and unfixed builds correctly rather than passing incidentally.

rfm

First the code changes. It compiles and looks like it works, but, while the code base already contains worse code (some written by me I'm sure), we are really trying to bring overall quality up rather than down...

The depth recording is scattered across the code, which makes it more fragile and harder to validate than necessary. I would scrap all that and wrap calls to parse collections in an inline function which would localise all the depth counting/checking.
The depth limit is confused between using a hard coded limit and a configurable one; the comments and use of a variable for the limit suggest one thing, but there's no API to configure any other limit. I think I'd use a hard coded limit in one place only (no maxDepth field) to get the simplest and most efficient code matching the API.
The NS_JSON_SERIALIZATION_MAX_DEPTH is wrong as that NS prefix would be for part of the Apple APIs and this constant doesn't belong there. There's no need for a preprocessor definition at all as it's not a value that needs to be reused in multiple places.
The error handling on reaching the limit creates an inappropriate error object suggesting that an unexpected character was found rather than a recursion limit being reached, so a better error object needs to be created.
So, while the analysis that we should have a recursion limit is undoubtedly correct, and it's great to have the issue spotted, I would discard almost every line of code change (and rewrite the comments somewhat) with the aim of having more maintainable code. In fact I may just do that.

On the regression tests. These seem to build, run, and probably cover what we want, but again the code quality is poor (the test code is at least twice as many lines as it should be). There is a completely unnecessary class, with a method to 'perform' a testcase which does absolutely nothing that the regression test framework doesn't already do. Other methods in the class each encapsulate one or two tests which raise exceptions on failure. A natural implementation would put the condition being tested in the PASS macro call directly, keeping the tested code right next to the test success/failure message. It only makes sense to separate tests into their own functions/methods when the code to perform a test is long/complex, and that really isn't the case for any of these tests.
That being said, this would serve as a good starting point/inspiration for producing a series of testcases (copy/paste the functional parts and discard all the boilerplate stuff).

rfm · 2026-04-14T16:25:51Z

I added a rewritten/simplified version of all this to master, thanks.
I'm leaving this request open in case you want to update it with code/testcases to demonstrate improvements in what Clause can generate with different prompting.

DTW-Thalion · 2026-04-14T17:02:19Z

Since master already has your implementation (292390e34), and you've generously left this open as an invitation to demonstrate what improved prompting can produce, the honest answer is to close this rather than post a revised commit. I've now read your merged version, so any "second attempt" from me would be a lossy reproduction of code already in master — not a clean-room comparison, and not useful to you. What actually matters is that your review has calibrated the template for the ~15 remaining libs-base findings in our upstream queue. Those will use PASS() inline, static C helpers, no wrapper class, GS_* prefixes (or no new macros at all), per-finding error objects at the rejection site, and single-site invariant wrappers. Closing. Thank you for the time you spent on the review and for being explicit about what "overall quality" means in libs-base.

DTW-Thalion · 2026-04-14T17:02:34Z

Hi Richard, Claude picked up your comments, reviewed them and replied. It is going to do another change to see if you are satisfied with it's template and I gave it a prompt which I hope will improve it's reasoning. I will let you know when it submits the next try. Cheers, Todd

…

On Tue, Apr 14, 2026 at 12:26 PM rfm ***@***.***> wrote: *rfm* left a comment (gnustep/libs-base#595) <#595 (comment)> I added a rewritten/simplified version of all this to master, thanks. I'm leaving this request open in case you want to update it with code/testcases to demonstrate improvements in what Clause can generate with different prompting. — Reply to this email directly, view it on GitHub <#595 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/BSTQQ7CAAODUW6NX66ZKVJD4VZRCNAVCNFSM6AAAAACXYC5N32VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DENBVGUYTKOBTHE> . You are receiving this because you authored the thread.Message ID: ***@***.***>

DTW-Thalion requested a review from rfm as a code owner April 13, 2026 23:59

rfm reviewed Apr 14, 2026

View reviewed changes

DTW-Thalion closed this Apr 14, 2026

DTW-Thalion mentioned this pull request Apr 14, 2026

Fix unsigned integer overflow in binary plist offset-table bounds check #596

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Limit NSJSONSerialization recursion depth to prevent stack overflow#595

Limit NSJSONSerialization recursion depth to prevent stack overflow#595
DTW-Thalion wants to merge 1 commit intognustep:masterfrom
DTW-Thalion:fix/json-recursion-depth-limit

DTW-Thalion commented Apr 13, 2026

Uh oh!

rfm left a comment

Uh oh!

rfm commented Apr 14, 2026

Uh oh!

DTW-Thalion commented Apr 14, 2026

Uh oh!

DTW-Thalion commented Apr 14, 2026 via email

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

Conversation

DTW-Thalion commented Apr 13, 2026

Problem

Fix

Regression test

Validation

ChangeLog

Uh oh!

rfm left a comment

Choose a reason for hiding this comment

Uh oh!

rfm commented Apr 14, 2026

Uh oh!

DTW-Thalion commented Apr 14, 2026

Uh oh!

DTW-Thalion commented Apr 14, 2026 via email

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants