feat(spider-storage): Add ValidatedJobSubmission to hold a validated task graph and its task inputs.#320
Conversation
WalkthroughThis PR adds a ValidatedJobSubmission type that owns a TaskGraph and inputs, enforces non-empty graph and input-count invariants, and propagates the validated submission through TaskGraph creation, SharedJobControlBlock creation, ExternalJobOrchestration::register, and all affected tests. ChangesValidatedJobSubmission Wrapper & Integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/spider-storage/tests/scheduling_infra.rs (1)
443-445:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale doc comment — update to reflect the new
ValidatedJobSubmissionparameter.The phrase "submitted task graph and job inputs" describes the old separate-argument API. A contributor writing a new DB-connector factory would use this function as a template and be confused by the mismatch between the doc and the actual closure parameter.
✏️ Suggested fix
-/// The returned closure receives the submitted task graph and job inputs from [`run_workload`], -/// registers the job via [`ExternalJobOrchestration::register`], and returns the connector along -/// with the resulting [`JobId`] and [`ResourceGroupId`]. +/// The returned closure receives the validated job submission from [`run_workload`], +/// registers the job via [`ExternalJobOrchestration::register`], and returns the connector along +/// with the resulting [`JobId`] and [`ResourceGroupId`].The same applies to the module-level comment (line 71) which still describes the factory as
AsyncFnOnce() -> DbConnectorType, omitting the&ValidatedJobSubmissionargument.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/spider-storage/tests/scheduling_infra.rs` around lines 443 - 445, Update the stale doc comments to reflect that the returned closure now receives a &ValidatedJobSubmission (not separate "submitted task graph and job inputs"); specifically edit the function-level comment that mentions run_workload and ExternalJobOrchestration::register to state the closure signature accepts &ValidatedJobSubmission and that it registers the job via ExternalJobOrchestration::register and returns the connector, and also update the module-level comment (which currently describes AsyncFnOnce() -> DbConnectorType) to include the &ValidatedJobSubmission parameter so both comments match the actual closure parameter.
🧹 Nitpick comments (2)
components/spider-storage/tests/mariadb_test.rs (1)
42-44: ⚡ Quick winConsider extracting a
single_task_job_submission()helper to eliminate repeated boilerplate.The two-liner pattern:
let (graph, inputs) = single_task_graph(); let job_submission = ValidatedJobSubmission::validate(graph, inputs).expect("job submission should be valid");appears verbatim in 15+ tests. Extracting it halves the per-test setup noise and makes the intent (register a job from a standard single-task graph) explicit in one call.
♻️ Proposed helper
+/// Builds a [`ValidatedJobSubmission`] from the standard single-task graph used by DB-layer tests. +fn single_task_job_submission() -> ValidatedJobSubmission { + let (graph, inputs) = single_task_graph(); + ValidatedJobSubmission::validate(graph, inputs) + .expect("single-task job submission should be valid") +}Each test then becomes:
- let (graph, inputs) = single_task_graph(); - let job_submission = - ValidatedJobSubmission::validate(graph, inputs).expect("job submission should be valid"); + let job_submission = single_task_job_submission();Also applies to: 64-68
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/spider-storage/tests/mariadb_test.rs` around lines 42 - 44, Extract a helper function (e.g., single_task_job_submission()) that encapsulates the two-line boilerplate currently repeated: call single_task_graph() to get (graph, inputs) and then call ValidatedJobSubmission::validate(graph, inputs). The helper should return the ValidatedJobSubmission (or panic with the same expect message) so tests can replace the two-liner with a single call to single_task_job_submission(); update tests that currently use single_task_graph and ValidatedJobSubmission::validate to call this new helper instead.components/spider-storage/src/cache/task.rs (1)
1093-1131: 💤 Low valueDummy task in
build_termination_tcbis the correct workaround.
ValidatedJobSubmission::validaterequires at least one regular task (TaskGraphEmptyinvariant), so the minimal dummy task (zero inputs, zero outputs,input_sources: None) is added solely to satisfy that constraint. The dummy has no effect on the commit-task-focused assertions that follow. A brief inline comment explaining why it exists would prevent future confusion.💬 Suggested comment
+ // A non-empty regular task is required to pass `ValidatedJobSubmission::validate` + // (TaskGraphEmpty check). This dummy task has no bearing on the commit-task behaviour + // being tested. submitted .insert_task(TaskDescriptor {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/spider-storage/src/cache/task.rs` around lines 1093 - 1131, Add a brief inline comment in build_termination_tcb next to the inserted dummy TaskDescriptor (the TaskDescriptor with package "test_pkg", task_func "dummy_fn", zero inputs/outputs and input_sources: None) explaining that this dummy task is intentionally added to satisfy ValidatedJobSubmission::validate's requirement that a TaskGraph contain at least one regular task (the TaskGraphEmpty invariant) and has no effect on the commit-task-focused assertions; reference SubmittedTaskGraph and ValidatedJobSubmission::validate in the comment so future readers understand why the dummy exists.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@components/spider-storage/tests/scheduling_infra.rs`:
- Around line 443-445: Update the stale doc comments to reflect that the
returned closure now receives a &ValidatedJobSubmission (not separate "submitted
task graph and job inputs"); specifically edit the function-level comment that
mentions run_workload and ExternalJobOrchestration::register to state the
closure signature accepts &ValidatedJobSubmission and that it registers the job
via ExternalJobOrchestration::register and returns the connector, and also
update the module-level comment (which currently describes AsyncFnOnce() ->
DbConnectorType) to include the &ValidatedJobSubmission parameter so both
comments match the actual closure parameter.
---
Nitpick comments:
In `@components/spider-storage/src/cache/task.rs`:
- Around line 1093-1131: Add a brief inline comment in build_termination_tcb
next to the inserted dummy TaskDescriptor (the TaskDescriptor with package
"test_pkg", task_func "dummy_fn", zero inputs/outputs and input_sources: None)
explaining that this dummy task is intentionally added to satisfy
ValidatedJobSubmission::validate's requirement that a TaskGraph contain at least
one regular task (the TaskGraphEmpty invariant) and has no effect on the
commit-task-focused assertions; reference SubmittedTaskGraph and
ValidatedJobSubmission::validate in the comment so future readers understand why
the dummy exists.
In `@components/spider-storage/tests/mariadb_test.rs`:
- Around line 42-44: Extract a helper function (e.g.,
single_task_job_submission()) that encapsulates the two-line boilerplate
currently repeated: call single_task_graph() to get (graph, inputs) and then
call ValidatedJobSubmission::validate(graph, inputs). The helper should return
the ValidatedJobSubmission (or panic with the same expect message) so tests can
replace the two-liner with a single call to single_task_job_submission(); update
tests that currently use single_task_graph and ValidatedJobSubmission::validate
to call this new helper instead.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 360476b5-b445-404d-8ded-e82562297e3e
📒 Files selected for processing (13)
components/spider-core/src/types/io.rscomponents/spider-storage/src/cache.rscomponents/spider-storage/src/cache/error.rscomponents/spider-storage/src/cache/job.rscomponents/spider-storage/src/cache/job_submission.rscomponents/spider-storage/src/cache/task.rscomponents/spider-storage/src/db/mariadb.rscomponents/spider-storage/src/db/protocol.rscomponents/spider-storage/src/state/job_cache.rscomponents/spider-storage/src/task_instance_pool.rscomponents/spider-storage/tests/jcb_test.rscomponents/spider-storage/tests/mariadb_test.rscomponents/spider-storage/tests/scheduling_infra.rs
LinZhihao-723
left a comment
There was a problem hiding this comment.
One nit catch on the variable name otherwise lgtm.
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
…ider into validate-job-submission
There was a problem hiding this comment.
🧹 Nitpick comments (1)
components/spider-storage/src/cache/job_submission.rs (1)
79-91: 💤 Low valueOptional: drop the
SubmittedTaskGraphalias — it resolves nothing in this scope.
super::*only re-exportspubitems;TaskGraph(a non-pubuseat line 1) is not among them. Within the test module,spider_core::task::TaskGraphcan be imported under its original name without any shadowing conflict.♻️ Proposed refactor
- TaskGraph as SubmittedTaskGraph, + TaskGraph,Then update all usages from
SubmittedTaskGraph→TaskGraphin the test helper and test bodies.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/spider-storage/src/cache/job_submission.rs` around lines 79 - 91, The import alias SubmittedTaskGraph is unnecessary and unused due to scope shadowing; remove the alias from the use list and import spider_core::task::TaskGraph directly in the test module, then replace all occurrences of SubmittedTaskGraph with TaskGraph in the test helper and test bodies (references: the use line that currently declares SubmittedTaskGraph and all usages of SubmittedTaskGraph within job_submission.rs tests). Ensure super::* remains for other re-exports but does not try to rely on SubmittedTaskGraph.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@components/spider-storage/src/cache/job_submission.rs`:
- Around line 79-91: The import alias SubmittedTaskGraph is unnecessary and
unused due to scope shadowing; remove the alias from the use list and import
spider_core::task::TaskGraph directly in the test module, then replace all
occurrences of SubmittedTaskGraph with TaskGraph in the test helper and test
bodies (references: the use line that currently declares SubmittedTaskGraph and
all usages of SubmittedTaskGraph within job_submission.rs tests). Ensure
super::* remains for other re-exports but does not try to rely on
SubmittedTaskGraph.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 18a10b9a-44fb-408a-a9ef-5257224986b7
📒 Files selected for processing (6)
components/spider-storage/src/cache/job_submission.rscomponents/spider-storage/src/cache/task.rscomponents/spider-storage/src/state/job_cache.rscomponents/spider-storage/src/task_instance_pool.rscomponents/spider-storage/tests/jcb_test.rscomponents/spider-storage/tests/mariadb_test.rs
🚧 Files skipped from review as they are similar to previous changes (5)
- components/spider-storage/src/task_instance_pool.rs
- components/spider-storage/tests/jcb_test.rs
- components/spider-storage/src/state/job_cache.rs
- components/spider-storage/src/cache/task.rs
- components/spider-storage/tests/mariadb_test.rs
LinZhihao-723
left a comment
There was a problem hiding this comment.
For the PR title, how about:
feat(spider-storage): Add `ValidatedJobSubmission` to hold a pre-validated task graph and its task inputs.
ValidatedJobSubmission for task graph & inputs validation.ValidatedJobSubmission to hold a validated task graph and its task inputs.
Description
This PR adds a
ValidatedJobSubmissionwrapper type to represents a validated task graph and inputs.ValidatedJobSubmissionvalidates that:Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Refactor