Limit NSJSONSerialization recursion depth to prevent stack overflow#595
Limit NSJSONSerialization recursion depth to prevent stack overflow#595DTW-Thalion wants to merge 1 commit intognustep:masterfrom
Conversation
* Source/NSJSONSerialization.m: Track the parser's current nesting
depth through the ParserState struct. parseArray and parseObject
now refuse to descend further once the bound (512 levels, exposed
as NS_JSON_SERIALIZATION_MAX_DEPTH) is reached, returning a parse
error instead. Pathologically nested input such as
[[[...[null]...]]] with several thousand levels could previously
exhaust the C stack and crash the process. The two new
ParserState fields carry doc comments describing the invariant
that `depth` mirrors the live container stack.
* Tests/base/NSJSONSerialization/depth.m: New regression test
covering ten cases via an NSJSONDepthTests harness class that
wraps each per-selector case in NS_DURING/NS_HANDLER so that
individual failures surface through PASS() in main(). Cases:
- 20-level array and object each parse cleanly;
- a document exactly at the boundary (512 levels) parses for
both arrays and objects (proves the bound is `>=`, not `>`);
- a document one level past the boundary (513 levels) is
rejected for both arrays and objects;
- 2000-level array nesting is rejected;
- 1000 alternating object/array levels are rejected;
- the +JSONObjectWithStream: entry point honours the same
bound as +JSONObjectWithData:;
- a deeply-nested but syntactically malformed document is
rejected cleanly via the error out-parameter.
Written in ObjC 1.0 syntax so it builds with every Objective-C
compiler GNUstep supports.
The test was validated both ways against a clang+libobjc2 build on
Ubuntu 24.04: with the depth check in place, all ten assertions
pass; with the depth check disabled, the five cases that depend on
the guard (513-array, 513-object, 2000-array, mixed-1000,
stream-2000) all fail, while the five control cases (20-level
array/object, 512-boundary array/object, deep-malformed) still
pass. This confirms each assertion discriminates the fixed and
unfixed builds correctly rather than passing incidentally.
rfm
left a comment
There was a problem hiding this comment.
First the code changes. It compiles and looks like it works, but, while the code base already contains worse code (some written by me I'm sure), we are really trying to bring overall quality up rather than down...
The depth recording is scattered across the code, which makes it more fragile and harder to validate than necessary. I would scrap all that and wrap calls to parse collections in an inline function which would localise all the depth counting/checking.
The depth limit is confused between using a hard coded limit and a configurable one; the comments and use of a variable for the limit suggest one thing, but there's no API to configure any other limit. I think I'd use a hard coded limit in one place only (no maxDepth field) to get the simplest and most efficient code matching the API.
The NS_JSON_SERIALIZATION_MAX_DEPTH is wrong as that NS prefix would be for part of the Apple APIs and this constant doesn't belong there. There's no need for a preprocessor definition at all as it's not a value that needs to be reused in multiple places.
The error handling on reaching the limit creates an inappropriate error object suggesting that an unexpected character was found rather than a recursion limit being reached, so a better error object needs to be created.
So, while the analysis that we should have a recursion limit is undoubtedly correct, and it's great to have the issue spotted, I would discard almost every line of code change (and rewrite the comments somewhat) with the aim of having more maintainable code. In fact I may just do that.
On the regression tests. These seem to build, run, and probably cover what we want, but again the code quality is poor (the test code is at least twice as many lines as it should be). There is a completely unnecessary class, with a method to 'perform' a testcase which does absolutely nothing that the regression test framework doesn't already do. Other methods in the class each encapsulate one or two tests which raise exceptions on failure. A natural implementation would put the condition being tested in the PASS macro call directly, keeping the tested code right next to the test success/failure message. It only makes sense to separate tests into their own functions/methods when the code to perform a test is long/complex, and that really isn't the case for any of these tests.
That being said, this would serve as a good starting point/inspiration for producing a series of testcases (copy/paste the functional parts and discard all the boilerplate stuff).
|
I added a rewritten/simplified version of all this to master, thanks. |
|
Since master already has your implementation ( |
|
Hi Richard,
Claude picked up your comments, reviewed them and replied. It is going to
do another change to see if you are satisfied with it's template and I gave
it a prompt which I hope will improve it's reasoning.
I will let you know when it submits the next try.
Cheers,
Todd
…On Tue, Apr 14, 2026 at 12:26 PM rfm ***@***.***> wrote:
*rfm* left a comment (gnustep/libs-base#595)
<#595 (comment)>
I added a rewritten/simplified version of all this to master, thanks.
I'm leaving this request open in case you want to update it with
code/testcases to demonstrate improvements in what Clause can generate with
different prompting.
—
Reply to this email directly, view it on GitHub
<#595 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/BSTQQ7CAAODUW6NX66ZKVJD4VZRCNAVCNFSM6AAAAACXYC5N32VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DENBVGUYTKOBTHE>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Problem
The
NSJSONSerializationparser uses recursive descent for arrays and objects. Before this change, pathologically nested input — e.g.[[[...[null]...]]]at a few thousand levels — could exhaust the C stack and crash the process. There was no depth bound at all, so any caller that accepted untrusted JSON was exposed to a stack-based denial of service.Fix
Source/NSJSONSerialization.m:depthandmaxDepthfields toParserState, with doc comments describing the invariant thatdepthmirrors the live container stack (zero at rest, incremented exactly once on successful entry toparseArray/parseObjectafter the guard check, decremented exactly once on every exit path including error returns).NS_JSON_SERIALIZATION_MAX_DEPTHmacro (set to 512) with a doc comment explaining the choice: RFC 8259 recommends supporting at least 100 levels, most parsers cap in the 100–1000 range, and 512 leaves comfortable head-room below the smallest thread stack GNUstep runs on.parseArrayandparseObjectnow refuse to descend further once the bound is reached, returning a parse error instead. Both functions maintain the depth invariant on every exit path, including the pre-existing error paths.+JSONObjectWithData:and+JSONObjectWithStream:initialisep.depth = 0; p.maxDepth = NS_JSON_SERIALIZATION_MAX_DEPTH;before handing the state to the recursive descent.Regression test
Tests/base/NSJSONSerialization/depth.m— 10 assertions wrapped in anNSJSONDepthTestsharness class whose-performTest:method catchesNSAssert-raised exceptions and surfaces them throughPASS()inmain(). Matches the style ofTests/base/NSJSONSerialization/tests00.m. Written in ObjC 1.0 syntax — no container literals, no boxed expressions, no blocks, no libobjc2 runtime extensions — so it builds with every Objective-C compiler GNUstep supports.Cases:
moderatelyNestedArrayParses/moderatelyNestedObjectParses— 20-level arrays/objects parse cleanly (baseline sanity).boundaryArrayParses/boundaryObjectParses— 512 levels at the exact boundary still parse. These nail down that the bound is>=rather than>; any future refactor that changes the comparison will break them immediately.justOverBoundaryArrayRejected/justOverBoundaryObjectRejected— 513 levels are rejected for both arrays and objects.pathologicallyNestedArrayRejected— 2000 levels rejected.mixedDeepNestingRejected— 1000 alternating object/array levels rejected.streamInterfaceRespectsDepthLimit—+JSONObjectWithStream:honours the same bound as+JSONObjectWithData:.deepAndMalformedRejected— 600[with no closing brackets reports an error cleanly (depth guard or EOF handling — either is an acceptable rejection point).Validation
Built and run against a clang 18.1.3 + libobjc2 stack on Ubuntu 24.04:
if (0)The 5 failing cases under the negative control are exactly the assertions that depend on the guard (513-array, 513-object, 2000-array, mixed-1000, stream-2000). The 5 passing cases are the ones that shouldn't depend on the guard (20-level pair, 512-boundary pair, deep-malformed caught by EOF handling). No flakes in either direction — each assertion cleanly discriminates the fixed and unfixed build.
ChangeLog
Entry added in
ChangeLogmatching the repository's GNU-style format.