DO NOT MERGE!: feat(huntsman): Task executor prototyping. #301

Open
LinZhihao-723 wants to merge 8 commits into y-scope:main from LinZhihao-723:task-exec-prototyping

Conversation

LinZhihao-723 (Member) commented Apr 13, 2026

Description

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Summary by CodeRabbit

Release Notes

  • New Features
    • Added a task execution framework supporting dynamic package loading and task execution via FFI boundaries.
    • Added procedural macro support for defining and registering task functions with automatic serialization/deserialization.
    • Added wire-framed serialization for task inputs and outputs with msgpack encoding.
    • Added integration tests validating end-to-end task execution workflows.
    • Added example task package demonstrating complex number arithmetic operations.

LinZhihao-723 requested review from a team and sitaowang1998 as code owners April 13, 2026 03:13

coderabbitai Bot (Contributor) commented Apr 13, 2026

Walkthrough

A comprehensive Task Definition Language (TDL) system is introduced, enabling Rust-based task definitions with automatic serialization, C-FFI boundaries for dynamic loading, wire-format marshalling, procedural macro support, and runtime task execution via a package loader and handler registry.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Workspace Configuration | Cargo.toml | Added workspace members for five new crates: components/spider-task-executor, components/spider-tdl, components/spider-tdl-derive, examples/example-tdl-package-complex, and tests/tdl-integration. |
| Design & Architecture Documentation | claude/task-exec-prototyping/context.md, claude/task-exec-prototyping/design-doc.md, claude/task-exec-prototyping/struct-serde.md | Added three comprehensive design documents specifying the TDL system architecture, type constraints, wire format, procedural macro requirements, FFI boundaries, and two-layer serialization pipeline for task inputs/outputs. |
| Core TDL Library: Types & Primitives | components/spider-tdl/Cargo.toml, components/spider-tdl/src/lib.rs, components/spider-tdl/src/tdl_types.rs, components/spider-tdl/src/task_context.rs, components/spider-tdl/src/error.rs | Introduced base crate with type aliases (int8–int64, float, double, boolean, Bytes, List, Map), TaskContext runtime metadata struct, and TdlError enum for task-level error handling, all with msgpack serialization support. |
| Task Execution Abstractions | components/spider-tdl/src/task.rs, components/spider-tdl/src/ffi.rs, components/spider-tdl/src/register.rs | Defined Task and TaskHandler traits, type-erased TaskHandlerImpl bridge, TaskExecutionResult for FFI result transport, CArray/CByteArray for FFI array views, and register_tasks! macro generating C-FFI entry points and task registry. |
| Wire Serialization | components/spider-tdl/src/wire.rs | Implemented length-prefixed framing for task inputs/outputs with custom serde deserializer (StreamDeserializer) and serializer (TupleOutputSerializer), supporting positional field deserialization and tuple-based output encoding with comprehensive error handling. |
| Procedural Macro for Task Definition | components/spider-tdl-derive/Cargo.toml, components/spider-tdl-derive/src/lib.rs, components/spider-tdl-derive/src/task_macro.rs | Added #[task] procedural macro that parses annotated functions, validates first parameter is TaskContext, auto-wraps non-tuple returns, generates params struct, and implements the Task trait with optional custom naming. |
| Task Executor Runtime | components/spider-task-executor/Cargo.toml, components/spider-task-executor/src/lib.rs, components/spider-task-executor/src/error.rs, components/spider-task-executor/src/loader.rs | Introduced runtime package loader using libloading for dynamic shared library loading, task name resolution, duplicate detection, and task execution via resolved C-FFI symbols with result/error decoding. |
| Example Complex Number Tasks | examples/example-tdl-package-complex/Cargo.toml, examples/example-tdl-package-complex/src/lib.rs | Added example crate demonstrating five TDL tasks (complex::add, sub, mul, div, always_fail) on serializable Complex struct with arithmetic operations, zero-division error handling, and task registration. |
| Integration Tests & Build Configuration | tests/tdl-integration/Cargo.toml, tests/tdl-integration/src/lib.rs, tests/tdl-integration/tests/complex.rs, taskfiles/test.yaml | Added integration test suite validating loader behaviour, task execution, output deserialization, error propagation, and environment-variable-controlled package path; updated build task to compile example package before test execution. |
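
The length-prefixed framing summarized for wire.rs can be illustrated with a minimal, std-only sketch. The layout here (a 4-byte little-endian length followed by the payload) and the names `write_frame` and `read_frames` are assumptions made for illustration; the PR's actual wire format and msgpack payload encoding may differ.

```rust
use std::convert::TryFrom;

// Hypothetical framing: each field is a 4-byte little-endian length
// followed by that many payload bytes. Not taken from the PR's wire.rs.

/// Appends one length-prefixed frame to `out`.
fn write_frame(out: &mut Vec<u8>, payload: &[u8]) {
    let len = u32::try_from(payload.len()).expect("payload too large for a u32 length prefix");
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
}

/// Splits a buffer into its frames, rejecting truncated input.
fn read_frames(mut buf: &[u8]) -> Result<Vec<&[u8]>, String> {
    let mut frames = Vec::new();
    while !buf.is_empty() {
        if buf.len() < 4 {
            return Err(format!("truncated length prefix: {} byte(s) left", buf.len()));
        }
        let mut len_bytes = [0u8; 4];
        len_bytes.copy_from_slice(&buf[..4]);
        let len = u32::from_le_bytes(len_bytes) as usize;
        let rest = &buf[4..];
        if rest.len() < len {
            return Err(format!("truncated payload: need {}, have {}", len, rest.len()));
        }
        let (payload, remaining) = rest.split_at(len);
        frames.push(payload);
        buf = remaining;
    }
    Ok(frames)
}
```

Because every frame carries its own length, a reader can recover positional fields without knowing their types, and a short buffer is reported as an error rather than silently misparsed.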

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Executor as Task<br/>Executor
    participant Loader as TdlPackage<br/>Loader
    participant DynLib as Dynamic<br/>Library
    participant Registry as Task<br/>Registry
    
    Client->>Executor: load(shared_lib_path)
    Executor->>DynLib: dlopen(path)
    DynLib-->>Executor: library_handle
    Executor->>DynLib: __spider_tdl_package_get_name()
    DynLib-->>Executor: package_name
    Executor->>Loader: insert(package_name, library)
    Loader-->>Executor: &str reference
    
    Client->>Executor: execute_task(pkg_name, task_name, ctx, inputs)
    Executor->>Loader: get(package_name)
    Loader-->>Executor: &TdlPackage
    Executor->>DynLib: __spider_tdl_package_execute(task_name, ctx, inputs)
    DynLib->>Registry: lookup(task_name)
    Registry->>Registry: TaskHandlerImpl::execute_raw(ctx, inputs)
    Registry->>Registry: deserialize TaskContext + params
    Registry->>Registry: call user Task::execute()
    Registry->>Registry: serialize result/error
    Registry-->>Registry: ExecutionResult
    Registry-->>DynLib: TaskExecutionResult
    DynLib-->>Executor: result_bytes
    Executor->>Executor: decode ExecutionResult
    Executor-->>Client: Ok(output_bytes) or Err(TdlError)
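
The registry half of this flow (look up a handler by name, run it through a type-erased bridge, return bytes or an error) can be sketched with std only. This is an illustrative approximation, not the PR's code: the trait shapes are simplified, TaskContext is omitted, a fixed 8-byte little-endian `i64` stands in for the msgpack payload, and `Double`, `AlwaysFail`, `build_registry`, and `execute_task` are hypothetical names.

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

/// Simplified stand-in for a task: typed input and output, looked up by `NAME`.
trait Task {
    const NAME: &'static str;
    fn execute(input: i64) -> Result<i64, String>;
}

/// Object-safe interface stored in the registry: raw bytes in, raw bytes out.
trait TaskHandler {
    fn execute_raw(&self, input: &[u8]) -> Result<Vec<u8>, String>;
}

/// Type-erased bridge from a concrete `Task` to `TaskHandler`.
struct TaskHandlerImpl<T: Task>(PhantomData<T>);

impl<T: Task> TaskHandlerImpl<T> {
    fn new() -> Self {
        TaskHandlerImpl(PhantomData)
    }
}

impl<T: Task> TaskHandler for TaskHandlerImpl<T> {
    fn execute_raw(&self, input: &[u8]) -> Result<Vec<u8>, String> {
        // Assumption: one little-endian i64 replaces the real msgpack decoding.
        if input.len() != 8 {
            return Err(format!("expected 8 input bytes, got {}", input.len()));
        }
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(input);
        let output = T::execute(i64::from_le_bytes(bytes))?;
        Ok(output.to_le_bytes().to_vec())
    }
}

struct Double;
impl Task for Double {
    const NAME: &'static str = "demo::double";
    fn execute(input: i64) -> Result<i64, String> {
        input.checked_mul(2).ok_or_else(|| "overflow".to_string())
    }
}

struct AlwaysFail;
impl Task for AlwaysFail {
    const NAME: &'static str = "demo::always_fail";
    fn execute(_input: i64) -> Result<i64, String> {
        Err("always fails".to_string())
    }
}

fn build_registry() -> HashMap<&'static str, Box<dyn TaskHandler>> {
    let mut map: HashMap<&'static str, Box<dyn TaskHandler>> = HashMap::new();
    map.insert(Double::NAME, Box::new(TaskHandlerImpl::<Double>::new()));
    map.insert(AlwaysFail::NAME, Box::new(TaskHandlerImpl::<AlwaysFail>::new()));
    map
}

/// Looks up the handler by name and runs it on the raw input bytes.
fn execute_task(
    registry: &HashMap<&'static str, Box<dyn TaskHandler>>,
    name: &str,
    input: &[u8],
) -> Result<Vec<u8>, String> {
    let handler = registry
        .get(name)
        .ok_or_else(|| format!("task not found: {}", name))?;
    handler.execute_raw(input)
}
```

The bridge keeps the registry free of generics: each concrete task type is monomorphized once inside `TaskHandlerImpl<T>`, and callers only ever see `Box<dyn TaskHandler>`.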

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ⚠️ Warning | The title contains 'DO NOT MERGE!' which indicates this is a work-in-progress/draft PR that should not be merged, contradicting the purpose of a mergeable pull request. | Remove 'DO NOT MERGE!' prefix from the title. Use a clear, concise title like 'feat(huntsman): Task executor prototyping' that accurately describes the changeset without blocking markers. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
components/spider-tdl/src/error.rs (1)

51-55: Strengthen round-trip tests to assert variant identity, not just display text.

At Line 54, string comparison can pass even if the decoded variant is wrong but renders the same message.

Proposed test-strengthening diff
-#[derive(Debug, thiserror::Error, serde::Serialize, serde::Deserialize)]
+#[derive(Debug, PartialEq, Eq, thiserror::Error, serde::Serialize, serde::Deserialize)]
 pub enum TdlError {
@@
-            let original_display = original.to_string();
             let encoded = rmp_serde::to_vec(&original)?;
             let decoded: TdlError = rmp_serde::from_slice(&encoded)?;
-            assert_eq!(decoded.to_string(), original_display);
+            assert_eq!(decoded, original);
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/spider-tdl/src/error.rs` around lines 51 - 55, The current
round-trip test only compares decoded.to_string() with original.to_string(),
which can hide variant mismatches; change the assertion to verify the decoded
variant equals the original variant (e.g., use assert_eq!(decoded, original) if
TdlError implements PartialEq, or match on the TdlError enum variants to compare
discriminants/fields), updating the test that constructs original, encodes with
rmp_serde::to_vec, decodes to TdlError, and then assert variant identity for
decoded vs original instead of comparing display text.
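
To illustrate why comparing display text is weaker than comparing values, here is a small std-only sketch. `DemoError` and `lossy_round_trip` are hypothetical; the variants and Display output of the real TdlError are not shown in this review, so this only demonstrates the failure mode under the assumption that two variants can render the same message.

```rust
use std::fmt;

/// Hypothetical error type whose variants render only their message.
#[derive(Debug, Clone, PartialEq, Eq)]
enum DemoError {
    Deserialization(String),
    Execution(String),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Both variants print the same way, so the text cannot tell them apart.
        match self {
            DemoError::Deserialization(msg) | DemoError::Execution(msg) => write!(f, "{}", msg),
        }
    }
}

/// Deliberately buggy stand-in for an encode/decode round trip: it keeps the
/// message but always comes back as `Execution`.
fn lossy_round_trip(err: &DemoError) -> DemoError {
    DemoError::Execution(err.to_string())
}
```

With `original = DemoError::Deserialization(...)`, the check `decoded.to_string() == original.to_string()` passes for the buggy round trip, while `decoded == original` fails, which is the distinction the suggested `PartialEq` derive makes testable.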
components/spider-tdl-derive/src/task_macro.rs (1)

180-209: Consider documenting the intentional permissiveness of TaskContext validation.

The validation checks only the last segment of the type path (line 201), meaning it would accept foo::TaskContext, spider_tdl::TaskContext, or any other path ending in TaskContext. While this appears intentional for flexibility, a brief doc comment explaining this design choice would help future maintainers understand it's not an oversight.

📝 Suggested documentation addition
+/// Validates that the first parameter's type ends with `TaskContext`.
+///
+/// This check is intentionally permissive: it matches any type path whose final segment is
+/// `TaskContext` (e.g., `TaskContext`, `spider_tdl::TaskContext`, `crate::TaskContext`).
+/// This allows flexibility in how users import the type.
 fn validate_first_param_is_task_context(param: &FnArg) -> syn::Result<()> {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/spider-tdl-derive/src/task_macro.rs` around lines 180 - 209, Add a
doc comment to validate_first_param_is_task_context explaining that the function
intentionally only checks the last segment of the type path (the
last_segment.ident == "TaskContext" check) so it accepts TaskContext with or
without module prefixes (e.g., foo::TaskContext or spider_tdl::TaskContext) for
flexibility; update the doc above the function to state this design decision and
why full path validation was avoided.
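
The effect of checking only the final path segment can be approximated on plain strings. This is a loose analogy, not the macro's implementation: the real check operates on `syn` path segments, and `last_segment_is_task_context` is a hypothetical helper that simply splits on `::`.

```rust
/// Returns true when the final `::`-separated segment of `type_path` is
/// exactly `TaskContext`. String-based stand-in for a `syn` path check.
fn last_segment_is_task_context(type_path: &str) -> bool {
    type_path.rsplit("::").next().map(str::trim) == Some("TaskContext")
}
```

Under this rule `TaskContext`, `spider_tdl::TaskContext`, and an unrelated `foo::TaskContext` are all accepted, while `MyTaskContext` and `spider_tdl::TaskContext::Inner` are rejected. That is the permissiveness the suggested doc comment would make explicit.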
claude/task-exec-prototyping/design-doc.md (1)

33-42: Consider adding language specifiers to fenced code blocks.

Several fenced code blocks in this design document lack language specifiers (e.g., lines 33, 67, 373, 408, 486, 684, 854). For ASCII diagrams and directory structures, using text or plaintext would satisfy linters and improve rendering consistency.

📝 Example fix for the dependency graph block
-```
+```text
 spider-tdl-derive  (proc-macro: syn, quote, proc-macro2)
        │
        ▼
   spider-tdl  (rmp-serde, serde, thiserror; re-exports spider-tdl-derive)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@claude/task-exec-prototyping/design-doc.md` around lines 33 - 42, The fenced
ASCII diagrams (e.g., the dependency graph containing symbols like
spider-tdl-derive, spider-tdl, spider-executor, and example-tdl-package) lack
language specifiers; update those code fences (including the block shown and the
other similar blocks referenced) by adding a language label such as "text" or
"plaintext" after the opening ``` so linters and renderers treat them as plain
text (e.g., change ``` to ```text for each ASCII/diagram/directory structure
block).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: aee06e81-462a-4447-b153-7195e10facee

📥 Commits

Reviewing files that changed from the base of the PR and between 0d70a37 and d2cb56c.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (26)
  • Cargo.toml
  • claude/task-exec-prototyping/context.md
  • claude/task-exec-prototyping/design-doc.md
  • claude/task-exec-prototyping/struct-serde.md
  • components/spider-task-executor/Cargo.toml
  • components/spider-task-executor/src/error.rs
  • components/spider-task-executor/src/lib.rs
  • components/spider-task-executor/src/loader.rs
  • components/spider-tdl-derive/Cargo.toml
  • components/spider-tdl-derive/src/lib.rs
  • components/spider-tdl-derive/src/task_macro.rs
  • components/spider-tdl/Cargo.toml
  • components/spider-tdl/src/error.rs
  • components/spider-tdl/src/ffi.rs
  • components/spider-tdl/src/lib.rs
  • components/spider-tdl/src/register.rs
  • components/spider-tdl/src/task.rs
  • components/spider-tdl/src/task_context.rs
  • components/spider-tdl/src/tdl_types.rs
  • components/spider-tdl/src/wire.rs
  • examples/example-tdl-package-complex/Cargo.toml
  • examples/example-tdl-package-complex/src/lib.rs
  • taskfiles/test.yaml
  • tests/tdl-integration/Cargo.toml
  • tests/tdl-integration/src/lib.rs
  • tests/tdl-integration/tests/complex.rs

Comment on lines +37 to +44
            $(
                map.insert(
                    <$task as $crate::Task>::NAME,
                    Box::new($crate::TaskHandlerImpl::<$task>::new())
                        as Box<dyn $crate::TaskHandler>,
                );
            )*
            map

⚠️ Potential issue | 🟠 Major

Detect duplicate task names during registry construction.

At Line 38, HashMap::insert will silently replace an existing handler if two task types resolve to the same Task::NAME, leading to unexpected dispatch.

Proposed fail-fast guard
             $(
-                map.insert(
+                let previous = map.insert(
                     <$task as $crate::Task>::NAME,
                     Box::new($crate::TaskHandlerImpl::<$task>::new())
                         as Box<dyn $crate::TaskHandler>,
                 );
+                assert!(
+                    previous.is_none(),
+                    "duplicate task name in register_tasks!: {}",
+                    <$task as $crate::Task>::NAME
+                );
             )*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/spider-tdl/src/register.rs` around lines 37 - 44, The registry
currently uses HashMap::insert which silently overwrites handlers when two tasks
share the same <$task as $crate::Task>::NAME; change the insertion to a
fail-fast check using the map.entry(...) API (or an explicit contains_key check)
and panic (or return Err) if the name already exists, emitting a clear message
that includes the duplicate name (<$task as $crate::Task>::NAME) and the task
type (e.g. stringify!($task)) so duplicates are detected during registry
construction; update the macro block that creates the
Box::new($crate::TaskHandlerImpl::<$task>::new()) as the value to use this
guarded insertion instead of plain insert.
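
The `map.entry(...)` alternative mentioned in the prompt can be sketched as a standalone helper. `register_unique` is a hypothetical name, and returning `Err` instead of panicking is a choice made for this sketch; inside the register_tasks! macro an assertion, as in the proposed diff, may be the more practical form.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Inserts `handler` under `name`, refusing to overwrite an existing entry.
fn register_unique<V>(
    map: &mut HashMap<&'static str, V>,
    name: &'static str,
    handler: V,
) -> Result<(), String> {
    match map.entry(name) {
        // A handler is already registered under this name: keep it and report.
        Entry::Occupied(_) => Err(format!("duplicate task name: {}", name)),
        Entry::Vacant(slot) => {
            slot.insert(handler);
            Ok(())
        }
    }
}
```

Unlike `HashMap::insert`, which replaces the existing value and returns the old one, the entry-based form leaves the first registration untouched when a duplicate arrives.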

Comment thread taskfiles/test.yaml
      MARIADB_DATABASE: "{{.MARIADB_DATABASE}}"
      MARIADB_USERNAME: "{{.MARIADB_USERNAME}}"
      MARIADB_PASSWORD: "{{.MARIADB_PASSWORD}}"
      SPIDER_TDL_PACKAGE_COMPLEX: "{{.G_RUST_BUILD_DIR}}/release/libexample_tdl_package_complex.so"

⚠️ Potential issue | 🟠 Major

Use an OS-aware dynamic library suffix for SPIDER_TDL_PACKAGE_COMPLEX.

Line 217 hardcodes .so, but on macOS the shared library is built with a .dylib extension, so the path does not exist there. This breaks local test runs on macOS, which is part of the supported Unix-like dev flow.

🔧 Proposed fix
   spider-huntsman-unit-tests-executor:
     internal: true
+    vars:
+      TDL_DYLIB_EXT:
+        sh: 'case "$(uname -s)" in Darwin) echo "dylib" ;; *) echo "so" ;; esac'
     env:
       MARIADB_PORT: "{{.MARIADB_PORT}}"
       MARIADB_DATABASE: "{{.MARIADB_DATABASE}}"
       MARIADB_USERNAME: "{{.MARIADB_USERNAME}}"
       MARIADB_PASSWORD: "{{.MARIADB_PASSWORD}}"
-      SPIDER_TDL_PACKAGE_COMPLEX: "{{.G_RUST_BUILD_DIR}}/release/libexample_tdl_package_complex.so"
+      SPIDER_TDL_PACKAGE_COMPLEX: "{{.G_RUST_BUILD_DIR}}/release/libexample_tdl_package_complex.{{.TDL_DYLIB_EXT}}"

Based on learnings: development tasks/scripts under taskfiles/ are Unix-like only (Linux/macOS), so Linux-only artifact assumptions can break macOS workflows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@taskfiles/test.yaml` at line 217, replace the hardcoded ".so" suffix for
SPIDER_TDL_PACKAGE_COMPLEX with an OS-aware template or variable so macOS uses
".dylib" and Linux uses ".so"; for example compute a SPIDER_LIB_SUFFIX earlier
(or use a template conditional like {{if eq OS
"darwin"}}.dylib{{else}}.so{{end}}) and then set SPIDER_TDL_PACKAGE_COMPLEX to
"{{.G_RUST_BUILD_DIR}}/release/libexample_tdl_package_complex{{.SPIDER_LIB_SUFFIX}}"
(or use the direct conditional) so the correct dynamic library extension is
chosen at runtime.
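
On the Rust side, the same platform difference is exposed by the standard library, which could be used if the path were computed in test code rather than in the Taskfile. The sketch below is only an illustration of that option: `shared_lib_path` is a hypothetical helper, and the `release` subdirectory mirrors the path shown in the Taskfile rather than anything the tests are known to do.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};
use std::path::{Path, PathBuf};

/// Builds the expected shared-library path for a crate under `build_dir`,
/// using the platform's library prefix and suffix.
fn shared_lib_path(build_dir: &Path, crate_name: &str) -> PathBuf {
    // Cargo replaces hyphens with underscores in library artifact names.
    let file_name = format!("{}{}{}", DLL_PREFIX, crate_name.replace('-', "_"), DLL_SUFFIX);
    build_dir.join("release").join(file_name)
}
```

On Linux, `DLL_PREFIX` is `lib` and `DLL_SUFFIX` is `.so`, which matches the file name currently hardcoded in the Taskfile; on macOS the suffix is `.dylib`.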

Comment on lines +19 to +21
fn lib_path() -> Option<String> {
    std::env::var("SPIDER_TDL_PACKAGE_COMPLEX").ok()
}

⚠️ Potential issue | 🟠 Major

Avoid silent pass when integration test prerequisites are missing.

At Line 19, returning Option lets every test early-return Ok(()) when the variable is unset; this can hide broken CI wiring and silently skip coverage.

Proposed hard-fail helper (root cause fix)
-fn lib_path() -> Option<String> {
-    std::env::var("SPIDER_TDL_PACKAGE_COMPLEX").ok()
+fn lib_path() -> String {
+    std::env::var("SPIDER_TDL_PACKAGE_COMPLEX")
+        .expect("SPIDER_TDL_PACKAGE_COMPLEX is required for tdl-integration tests")
 }

Then replace each let Some(path) = lib_path() else { return Ok(()); }; with:

-    let Some(path) = lib_path() else {
-        return Ok(());
-    };
+    let path = lib_path();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tdl-integration/tests/complex.rs` around lines 19 - 21, The current
lib_path() returns Option and allows tests to silently skip when
SPIDER_TDL_PACKAGE_COMPLEX is unset; change lib_path() to return a String and
call
std::env::var("SPIDER_TDL_PACKAGE_COMPLEX").expect("SPIDER_TDL_PACKAGE_COMPLEX
must be set for integration tests") (or otherwise panic with a clear message) so
missing prerequisites hard-fail; update all call sites that currently do `let
Some(path) = lib_path() else { return Ok(()); };` to call lib_path() directly
(or let path = lib_path();) and remove the early-return pattern so the test
fails loudly when the env var is not provided.
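
A variant of the hard-fail helper that returns an error instead of panicking is sketched below, for tests that already return a `Result` and use `?`. `required_env` is a hypothetical name, and the `String` error type is an assumption; the error type used by the PR's tests is not shown here.

```rust
use std::env;

/// Reads a required environment variable, turning "unset" into an error
/// that names the variable instead of a silent skip.
fn required_env(name: &str) -> Result<String, String> {
    env::var(name).map_err(|err| format!("{} must be set for integration tests: {}", name, err))
}
```

A test would then call `let path = required_env("SPIDER_TDL_PACKAGE_COMPLEX")?;`, so a missing variable surfaces as a test failure with the variable's name in the message.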
