Fix v1 v2 v3 REST endpoint parsers#4167
Conversation
There was a problem hiding this comment.
Pull request overview
This PR hardens the v1/v2/v3 REST JSON parsing paths against excessively deep nesting by switching to RapidJSON’s iterative parsing mode and introducing explicit maximum nesting-depth limits, along with unit tests covering the new behavior.
Changes:
- Use
rapidjson::kParseIterativeFlagfor REST JSON parsing (TFS + KFS + v3 HTTP payload). - Add maximum nesting-depth enforcement for TFS/KFS tensor array parsing and v3 OpenAI JSON body validation.
- Add tests ensuring deeply nested inputs are rejected and shallow nesting remains accepted.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/rest_parser.cpp |
Adds MAX_NESTING_DEPTH, enforces depth during array parsing, and switches TFS/KFS parsing to iterative RapidJSON parse. |
src/rest_parser.hpp |
Extends KFSRestParser::parseData signature to carry recursion depth. |
src/http_rest_api_handler.cpp |
Adds a JSON depth check for v3 JSON payloads and switches parsing to iterative RapidJSON parse. |
src/test/tfs_rest_parser_row_test.cpp |
Adds nesting-depth tests for TFS row format. |
src/test/tfs_rest_parser_column_test.cpp |
Adds nesting-depth tests for TFS column format. |
src/test/kfs_rest_parser_test.cpp |
Adds nesting-depth tests for KFS REST input parsing. |
src/test/http_openai_handler_test.cpp |
Adds tests validating v3 OpenAI handler behavior for excessive JSON nesting. |
| static constexpr int MAX_NESTING_DEPTH = 100; | ||
|
|
There was a problem hiding this comment.
MAX_NESTING_DEPTH is introduced here as a TU-local magic number (100). The same limit is also hard-coded for OpenAI JSON handling (MAX_JSON_NESTING_DEPTH in http_rest_api_handler.cpp). Consider centralizing this limit (or at least reusing a shared constant) to avoid the two code paths drifting to different effective limits over time.
There was a problem hiding this comment.
I dont like this idea, we might have different limit for different API in future
| return Status(StatusCode::JSON_INVALID, "JSON body must be an object"); | ||
| } | ||
|
|
||
| if (jsonDepthExceedsLimit(*parsedJson)) { |
There was a problem hiding this comment.
Why this placement? Not at the very beginning or somewhere near the end of processing?
Do we need to do it before the first time we do a lookup?
There was a problem hiding this comment.
Removed manual depth calculation, used SAX filter instead
| TEST_F(KFSRestParserTest, NestingDepthExceeded_FP32) { | ||
| std::string request = R"({"inputs":[{"name":"input0","shape":[1],"datatype":"FP32","data":)" + makeNestedArrayJson(200) + "}]}"; | ||
| auto status = parser.parse(request.c_str()); | ||
| EXPECT_EQ(status, StatusCode::REST_COULD_NOT_PARSE_INPUT); | ||
| } | ||
|
|
||
| TEST_F(KFSRestParserTest, NestingDepthExceeded_BYTES) { | ||
| std::string request = R"({"inputs":[{"name":"input0","shape":[1],"datatype":"BYTES","data":)" + makeNestedArrayJson(200) + "}]}"; | ||
| auto status = parser.parse(request.c_str()); | ||
| EXPECT_EQ(status, StatusCode::REST_COULD_NOT_PARSE_INPUT); | ||
| } | ||
|
|
||
| TEST_F(KFSRestParserTest, NestingDepthExceeded_INT32) { | ||
| std::string request = R"({"inputs":[{"name":"input0","shape":[1],"datatype":"INT32","data":)" + makeNestedArrayJson(200) + "}]}"; | ||
| auto status = parser.parse(request.c_str()); | ||
| EXPECT_EQ(status, StatusCode::REST_COULD_NOT_PARSE_INPUT); | ||
| } |
There was a problem hiding this comment.
Just one type would be sufficient?
🛠 Summary
CVS-185843