[SPARK-56456] Improve checkError() diagnostics#55317
[SPARK-56456] Improve checkError() diagnostics#55317srielau wants to merge 3 commits intoapache:masterfrom
Conversation
Replace the sequential assert chain in checkError() with a collect-all-then-fail pattern so that all mismatches plus the complete actual exception state are reported in a single failure message, eliminating iterative guess-and-rerun cycles.
cloud-fan
left a comment
There was a problem hiding this comment.
Summary
Prior state and problem: checkError() is a test utility method used across the entire Spark test suite to verify that exceptions have the expected error condition, SQL state, parameters, and query context. It exists in two independent copies: SparkTestSuite (core) and Connect QueryTest. Both used a fail-fast pattern — the first assertion failure stops the check, requiring developers to fix one mismatch at a time and re-run.
Design approach: Replace the sequential assert chain with a collect-all-then-fail pattern using ListBuffer. All mismatches are accumulated, then reported in a single failure message that includes both the mismatches and a full dump of the actual exception state.
Implementation: Both files independently refactored with the same pattern. The failure message has two sections: "Actual Exception State" (full dump of exception fields) and "Mismatches" (list of specific differences). The matchPVals path is also improved — the old IllegalArgumentException with confusing "Missing parameter" message is replaced with clear mismatch descriptions, and a new reverse-direction check detects expected keys that the exception didn't produce.
General comments:
-
The "How was this patch tested?" section says "Added testcases" but no tests are present in this PR. Since this changes test infrastructure exercised by thousands of tests, consider adding at least a basic test that verifies the collect-all behavior — e.g., create a
SparkThrowablewith multiple wrong fields and assert that the failure message mentions all of them. -
Minor: the PR description has several typos in the "What changes" and "Why needed" sections: "improviong" → "improving", "isnetad" → "instead", "faioling" → "failing", "imcreases" → "increases".
sql/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/test/QueryTest.scala
Outdated
Show resolved
Hide resolved
…rd, add tests - Restore the dropped comment explaining the -1 convention for startIndex/stopIndex in SparkTestSuite.checkError. - Align Connect QueryTest.checkError with SparkTestSuite by adding the same -1 guard for startIndex/stopIndex, so fragment-only ExpectedContext does not cause spurious mismatches. - Add two tests in SparkThrowableSuite: one verifying that checkError reports all mismatches in a single failure message, and one verifying it succeeds when all fields match.
|
|
thanks! merge to master |
Replace the sequential assert chain in checkError() with a collect-all-then-fail pattern so that all mismatches plus the complete actual exception state are reported in a single failure message, eliminating iterative guess-and-rerun cycles.
What changes were proposed in this pull request?
We're improving checkError() to provide a detailed report on the context diff, instead off failing on the first assert.
Why are the changes needed?
This increases turn around time to fix test cases when they are newly written or logic changes.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added two tests in SparkThrowableSuite verifying the collect-all-then-fail behavior and the happy path.
Was this patch authored or co-authored using generative AI tooling?
Claude Opus 4.6