feat(spider-storage): Add ServiceState and job level functions.#319

Open
sitaowang1998 wants to merge 29 commits into y-scope:storage-service-dev from sitaowang1998:job-lifecycle

Collaborator

@sitaowang1998 sitaowang1998 commented May 7, 2026

Description

This PR:

  • Adds a ServiceState object that holds all storage services.
  • Adds job-level functions in ServiceState.
  • Extracts common mock objects in test_mocks.rs.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Adds new unit tests.
  • GitHub workflows pass.

Summary by CodeRabbit

Release Notes

Refactor

  • Internal infrastructure improvements to job state management and error handling.

Tests

  • Enhanced test infrastructure with improved mock implementations for better test coverage.

Note: This release contains internal refactoring and infrastructure improvements with no direct impact on user-facing features or functionality.

@sitaowang1998 sitaowang1998 requested a review from a team as a code owner May 7, 2026 02:30
Contributor

coderabbitai Bot commented May 7, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a377121f-4bc4-4d59-aed0-85e904b8b658

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@sitaowang1998
Collaborator Author

@CodeRabbit review

Contributor

coderabbitai Bot commented May 7, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/spider-storage/src/state/job_cache.rs (1)

240-247: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

result is moved by matches! then used again in if let — compile error

matches!(result, Err(...)) expands to match result { ... }, which moves the non-Copy Result<(), StorageServerError> (the Stopping(String) and BadRequest(String) variants alone prevent Copy). The subsequent if let … = result therefore references a moved value and will not compile.

🐛 Proposed fix — borrow in `matches!`
-        assert!(
-            matches!(result, Err(StorageServerError::JobAlreadyExists(_))),
-            "insert should return JobAlreadyExists error for duplicate key"
-        );
-        if let Err(StorageServerError::JobAlreadyExists(id)) = result {
-            assert_eq!(id, job_id, "error should contain the duplicate job ID");
-        }
+        assert!(
+            matches!(&result, Err(StorageServerError::JobAlreadyExists(_))),
+            "insert should return JobAlreadyExists error for duplicate key"
+        );
+        if let Err(StorageServerError::JobAlreadyExists(id)) = result {
+            assert_eq!(id, job_id, "error should contain the duplicate job ID");
+        }

Passing &result borrows the value; Rust's match ergonomics automatically adapt the inner pattern, leaving result available for the subsequent if let.
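
As a cross-check on this finding, here is a minimal sketch using a hypothetical stand-in for `StorageServerError` (the `u64` payload and the helper function names are assumptions, not the PR's code). In this sketch both the borrowed and the by-value form compile: a `_` wildcard binds nothing and so does not move the scrutinee, whereas a by-value binding such as `id` does. It is therefore worth confirming with the compiler whether the flagged test actually fails; borrowing in `matches!` is harmless either way.

```rust
/// Hypothetical stand-in for the PR's error type; only the shape matters here.
#[derive(Debug)]
pub enum StorageServerError {
    JobAlreadyExists(u64),
    Stopping(String),
}

/// Checks the variant through a borrow, then destructures `result` by value.
pub fn duplicate_id_borrowed(result: Result<(), StorageServerError>) -> Option<u64> {
    if !matches!(&result, Err(StorageServerError::JobAlreadyExists(_))) {
        return None;
    }
    match result {
        Err(StorageServerError::JobAlreadyExists(id)) => Some(id),
        _ => None,
    }
}

/// Same check without the borrow. The `_` pattern binds nothing, so `result`
/// is still usable afterwards.
pub fn duplicate_id_by_value(result: Result<(), StorageServerError>) -> Option<u64> {
    if !matches!(result, Err(StorageServerError::JobAlreadyExists(_))) {
        return None;
    }
    match result {
        Err(StorageServerError::JobAlreadyExists(id)) => Some(id),
        _ => None,
    }
}
```

If the pattern inside `matches!` bound the payload by value (for example with a guard on a named `String` field), the borrowed form would be the one to use.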

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/spider-storage/src/state/job_cache.rs` around lines 240 - 247, The
test currently moves `result` by using `matches!(result,
Err(StorageServerError::JobAlreadyExists(_)))` then reuses `result` in the `if
let`, causing a compile error; change the `matches!` to borrow the value (e.g.,
`matches!(&result, Err(StorageServerError::JobAlreadyExists(_)))`) so `result`
from `cache.insert(...)` remains available for the subsequent `if let
Err(StorageServerError::JobAlreadyExists(id)) = result` check and assertion.

🧹 Nitpick comments (2)
components/spider-storage/src/state/test_mocks.rs (1)

119-145: ⚡ Quick win

commit_outputs discards outputs and fail discards the error message

Both methods update self.states correctly but silently drop their data payloads (_job_outputs, _error_message). Tests that assert on those values work around this by inserting directly into db.outputs / db.errors, so nothing is broken today. However, as more tests are added, this silent discard can cause confusing DbError::JobNotFound results from get_outputs / get_error even after a "successful" transition, making failures hard to diagnose.

♻️ Proposed fix — persist data in the mock
     async fn commit_outputs(
         &self,
         job_id: JobId,
-        _job_outputs: Vec<TaskOutput>,
+        job_outputs: Vec<TaskOutput>,
         _has_commit_task: bool,
     ) -> Result<(), DbError> {
         self.states.insert(job_id, JobState::Succeeded);
+        self.outputs.insert(job_id, job_outputs);
         Ok(())
     }

-    async fn fail(&self, job_id: JobId, _error_message: String) -> Result<(), DbError> {
+    async fn fail(&self, job_id: JobId, error_message: String) -> Result<(), DbError> {
         self.states.insert(job_id, JobState::Failed);
+        self.errors.insert(job_id, error_message);
         Ok(())
     }
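
The effect of persisting the payloads can be sketched in isolation. The following is a simplified, synchronous stand-in, not the PR's mock: `MockDb`, the `u64` job ID, `String` outputs, and the `Mutex<HashMap>` storage are all assumptions made to keep the example self-contained (the real trait methods are `async` and take `TaskOutput`).

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// All names below are hypothetical stand-ins for the PR's types.
pub type JobId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobState {
    Succeeded,
    Failed,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DbError {
    JobNotFound(JobId),
}

#[derive(Default)]
pub struct MockDb {
    states: Mutex<HashMap<JobId, JobState>>,
    outputs: Mutex<HashMap<JobId, Vec<String>>>,
    errors: Mutex<HashMap<JobId, String>>,
}

impl MockDb {
    /// Records the terminal state and keeps the outputs for later reads.
    pub fn commit_outputs(&self, job_id: JobId, job_outputs: Vec<String>) -> Result<(), DbError> {
        self.states.lock().unwrap().insert(job_id, JobState::Succeeded);
        self.outputs.lock().unwrap().insert(job_id, job_outputs);
        Ok(())
    }

    /// Records the terminal state and keeps the error message for later reads.
    pub fn fail(&self, job_id: JobId, error_message: String) -> Result<(), DbError> {
        self.states.lock().unwrap().insert(job_id, JobState::Failed);
        self.errors.lock().unwrap().insert(job_id, error_message);
        Ok(())
    }

    pub fn get_outputs(&self, job_id: JobId) -> Result<Vec<String>, DbError> {
        let outputs = self.outputs.lock().unwrap();
        outputs.get(&job_id).cloned().ok_or(DbError::JobNotFound(job_id))
    }

    pub fn get_error(&self, job_id: JobId) -> Result<String, DbError> {
        let errors = self.errors.lock().unwrap();
        errors.get(&job_id).cloned().ok_or(DbError::JobNotFound(job_id))
    }

    pub fn state(&self, job_id: JobId) -> Option<JobState> {
        self.states.lock().unwrap().get(&job_id).copied()
    }
}
```

With the payloads stored, a test can drive a job through `commit_outputs` or `fail` and read the result back through the same interface instead of seeding the maps by hand.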
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/spider-storage/src/state/test_mocks.rs` around lines 119 - 145,
The mock methods commit_outputs and fail currently ignore their payloads; update
commit_outputs(JobId, Vec<TaskOutput>, ...) to store the provided job_outputs in
the mock's outputs map (use the real parameter name instead of _job_outputs)
when inserting JobState::Succeeded, and update fail(JobId, String) to store the
provided error message in the mock's errors map (use the real parameter name
instead of _error_message) when inserting JobState::Failed so
get_outputs/get_error return the persisted data in tests.

components/spider-storage/src/state/service.rs (1)

386-417: 💤 Low value

create_test_jcb helper is duplicated across test modules

The create_test_jcb function here is functionally identical to the one in job_cache.rs's test module (Lines 152–183). Since test_mocks.rs already centralises shared mock types, this helper could live there as well, eliminating the duplication.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/spider-storage/src/state/service.rs` around lines 386 - 417, The
create_test_jcb helper is duplicated; move the function into the shared test
module (test_mocks.rs) and have tests import it instead of defining their own.
Specifically, extract the create_test_jcb that constructs a SubmittedTaskGraph
and calls SharedJobControlBlock::create (using MockReadyQueueSender,
MockDbConnector::default(), MockTaskInstancePoolConnector) into test_mocks.rs,
remove the duplicate in this file and in job_cache.rs, and update the tests to
use the centralized create_test_jcb helper.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e5c094e3-844e-4933-8d0c-8f32c049df24

📥 Commits

Reviewing files that changed from the base of the PR and between af1df89 and 92e9cf6.

📒 Files selected for processing (5)
  • components/spider-storage/src/state.rs
  • components/spider-storage/src/state/error.rs
  • components/spider-storage/src/state/job_cache.rs
  • components/spider-storage/src/state/service.rs
  • components/spider-storage/src/state/test_mocks.rs

Contributor

coderabbitai Bot commented May 7, 2026

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details
{}

Member

@LinZhihao-723 LinZhihao-723 left a comment

Reviewed service implementation and brought up some design concerns. Will go for a detailed review in the next round.

Comment thread components/spider-storage/src/state/service.rs
pub async fn register_job(
    &self,
    resource_group_id: ResourceGroupId,
    task_graph: &spider_core::task::TaskGraph,
Member

Since this will be the serviceable layer, shouldn't the input be a serialized task graph? This call should first deserialize the task graph to make sure it's valid.
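
One way to read this suggestion is sketched below: the service entry point accepts the serialized form and rejects it before anything reaches the DB layer. Everything here is hypothetical, including the toy wire format (tasks separated by `;`, parent indices separated by `,`), the `TaskGraph` shape, and the `RegisterError` type; only the `BadRequest(String)` variant name echoes the error enum mentioned earlier in this review.

```rust
/// Hypothetical, minimal task graph: one entry per task listing its parent indices.
#[derive(Debug, PartialEq, Eq)]
pub struct TaskGraph {
    pub parents: Vec<Vec<usize>>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RegisterError {
    BadRequest(String),
}

/// Parses the illustrative format and rejects malformed or forward-referencing
/// parent indices (a parent must be an earlier task, which keeps the graph acyclic).
pub fn deserialize_task_graph(serialized: &str) -> Result<TaskGraph, RegisterError> {
    let mut parents = Vec::new();
    for (task_idx, field) in serialized.split(';').enumerate() {
        let mut task_parents = Vec::new();
        for token in field.split(',').filter(|t| !t.is_empty()) {
            let parent: usize = token.trim().parse().map_err(|_| {
                RegisterError::BadRequest(format!("task {task_idx}: invalid parent `{token}`"))
            })?;
            if parent >= task_idx {
                return Err(RegisterError::BadRequest(format!(
                    "task {task_idx}: parent {parent} is not an earlier task"
                )));
            }
            task_parents.push(parent);
        }
        parents.push(task_parents);
    }
    Ok(TaskGraph { parents })
}

/// Deserializes first; only a valid graph would be passed on to registration.
/// Returns the task count as a placeholder for the real job ID.
pub fn register_job(serialized_task_graph: &str) -> Result<usize, RegisterError> {
    let graph = deserialize_task_graph(serialized_task_graph)?;
    Ok(graph.parents.len())
}
```

The point of the shape is that callers can never hand the service an already-constructed graph that skipped validation.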

) -> Result<JobId, StorageServerError> {
    let job_id = self
        .db
        .register(resource_group_id, task_graph, &job_inputs)
Member

I realized we forgot to address #284, so the task graph and the inputs are not validated against each other before registration. We should probably fix that first, before we implement this registration method.
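
What #284 actually requires is not shown in this thread, so the following is only a guess at the general shape of such a check: validate the inputs against the graph, and register only if that succeeds. The `num_expected_inputs` field, the `ValidationError` type, and the placeholder job ID are all assumptions for illustration.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    InputCountMismatch { expected: usize, actual: usize },
}

/// Hypothetical: the graph declares how many job-level inputs it expects.
pub struct TaskGraph {
    pub num_expected_inputs: usize,
}

/// Rejects a job whose inputs do not line up with what the graph declares.
pub fn validate_inputs(
    task_graph: &TaskGraph,
    job_inputs: &[Vec<u8>],
) -> Result<(), ValidationError> {
    if job_inputs.len() != task_graph.num_expected_inputs {
        return Err(ValidationError::InputCountMismatch {
            expected: task_graph.num_expected_inputs,
            actual: job_inputs.len(),
        });
    }
    Ok(())
}

/// Validation runs before registration, so an invalid job never gets an ID.
pub fn register_job(
    task_graph: &TaskGraph,
    job_inputs: &[Vec<u8>],
) -> Result<u64, ValidationError> {
    validate_inputs(task_graph, job_inputs)?;
    Ok(1) // placeholder for the ID a real DB layer would return
}
```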

Comment thread components/spider-storage/src/state/service.rs Outdated
Comment thread components/spider-storage/src/state/service.rs Outdated
Comment thread components/spider-storage/src/state/service.rs
    &self,
    job_id: JobId,
) -> Result<Vec<TaskOutput>, StorageServerError> {
    Ok(self.db.get_outputs(job_id).await?)
Member

Collaborator Author

The job control block (JCB) does not directly store the outputs or error of the job. Should we add support for that in the JCB, or keep it as it is now?

Comment thread components/spider-storage/src/state/service.rs
Comment thread components/spider-storage/src/state/service.rs
2 participants