Skip to content

fix: address CodeRabbit findings on PRs #57 + #61 (task orchestration + security family)#62

Merged
rohitg00 merged 1 commit intomainfrom
fix/cr-wave2-task-security
Apr 29, 2026
Merged

fix: address CodeRabbit findings on PRs #57 + #61 (task orchestration + security family)#62
rohitg00 merged 1 commit intomainfrom
fix/cr-wave2-task-security

Conversation

@rohitg00
Copy link
Copy Markdown
Collaborator

@rohitg00 rohitg00 commented Apr 29, 2026

Address 16 CodeRabbit findings (12 on #57, 4 on #61) on the task-orchestration and security-family workers ported in those merges.

PR #57 — workflow / task-decomposer / approval / approval-tiers

Severity Path Fix
Critical workers/approval/src/main.rs Policy read fails closed (propagate state::get error instead of unwrap_or(Null) + return required:false)
Critical workers/workflow/src/main.rs Retry path re-runs full run_step so StepMode semantics (fanout/collect/loop/conditional) are preserved; rolls back vars/results on retry failure
Major workers/workflow/src/main.rs Sanitize agentId at ingress (consistency with rest of migration)
Major workers/workflow/src/types.rs Custom deserialize_with on Workflow.id, WorkflowStep.functionId, WorkflowStep.outputVar so stored workflows cannot smuggle malformed IDs into security calls; new test covers rejection
Major workers/approval/src/main.rs set_policy validates tools is an array of strings before persist
Major workers/approval-tiers/src/main.rs validate_tier_input rejects non-numeric cost (was silently ignored when as_f64 returned None); new tests for null/array/string/bool
Major workers/task-decomposer/src/main.rs state_list propagates errors; callers updated
Major workers/task-decomposer/src/main.rs Sanitize subtask IDs from LLM output via sanitize_id (warn+skip on malformed)
Major workers/task-decomposer/src/main.rs spawn_workers atomically claims pending leaf via state::update -> in_progress before spawning, preventing duplicate spawns on repeat
Minor workers/workflow/src/main.rs total computed from filtered runs, not pre-filter array
Minor workers/approval/src/main.rs decide() rejects unknown decision values explicitly
Minor workers/approval-tiers/src/main.rs decide_tier_request rejects unknown decision values explicitly

PR #61 — security-headers / security-map / security-zeroize / lsp-tools

Severity Path Fix
Critical workers/security-zeroize/src/main.rs Return wrapped secret id to caller (was discarded as _id)
Major workers/security-headers/src/main.rs SSRF guard assert_safe_external_url: blocks non-http(s), localhost, .local/.internal/.localhost, IPv4 loopback/private/link-local/broadcast/unspecified/multicast + AWS metadata, IPv6 loopback/unspecified/multicast/ULA/link-local; redirect::Policy::none() so attacker can't 30x to internal target
Major workers/lsp-tools/src/main.rs lsp_rename adds dryRun mode (returns wouldModify[]), pre-stages all writes before executing, reports per-file failures in failed[]
Minor workers/security-map/src/main.rs state::delete of consumed challenge is best-effort (logs+continues); avoids 500-ing successful verifies on transient delete failure while keeping replay protection (set runs first)

Workspace

Re-added security-headers, security-map, security-zeroize, skill-security to [workspace].members. They were dropped in a later channel-batch merge so weren't building. They build clean now.

Verify

  • cargo check --workspace clean
  • cargo test --workspace --release clean (new tests for non-numeric cost, IPv6 SSRF, deserialize sanitization)
  • cargo clippy -p agentos-workflow -p agentos-task-decomposer -p agentos-approval -p agentos-approval-tiers -p agentos-security-headers -p agentos-security-map -p agentos-security-zeroize -p agentos-skill-security -p agentos-lsp-tools --all-targets -- -D warnings clean

Test plan

  • CI green
  • CodeRabbit re-review confirms findings resolved

Open in Devin Review

Summary by CodeRabbit

  • New Features

    • Added dry-run capability for rename operations—preview changes without executing them.
  • Improvements

    • Enhanced URL validation for external requests (security hardening).
    • Stricter input validation across workers with clearer error messages.
    • Improved error handling and reporting in state operations.
    • Fixed retry logic in workflow execution.
    • Enhanced workflow listing with corrected pagination and totals.
    • Secret identifier now exposed in security operations response.

… + security family)

PR #57 (workflow / task-decomposer / approval / approval-tiers):
- workflow: sanitize agentId at ingress; fix retry path to re-run run_step
  (preserves StepMode semantics: fanout/collect/loop/conditional);
  count `total` from filtered runs, not pre-filter; validate
  WorkflowStep.functionId + outputVar at deserialize via deserialize_with
  so stored workflows cannot smuggle malformed IDs into security calls.
- task-decomposer: state_list now propagates errors instead of masking
  them as []; sanitize subtask IDs from LLM output (skip malformed);
  spawn_workers atomically claims pending leaf tasks via state::update
  to status=in_progress before spawning, preventing duplicate spawns
  on repeated invocations.
- approval: policy read fails closed (propagate state::get error
  instead of treating as "not required"); decide() rejects unknown
  decision values explicitly; set_policy validates tools is an array
  of strings before persisting.
- approval-tiers: validate_tier_input rejects non-numeric cost (was
  silently ignored); decide_tier_request rejects unknown decision
  values explicitly.

PR #61 (security-headers / security-map / security-zeroize / skill-security / lsp-tools):
- security-zeroize: return wrapped secret id to caller (was discarded
  as `_id`, leaving callers unable to unwrap/manage).
- security-headers: SSRF guard before HEAD — assert_safe_external_url
  blocks non-http(s) schemes, localhost, .local/.internal/.localhost
  suffixes, IPv4 loopback/private/link-local/broadcast/unspecified/
  multicast and AWS metadata range, IPv6 loopback/unspecified/multicast/
  ULA/link-local; redirect::Policy::none() so attacker cannot 30x to
  internal target.
- security-map: state::delete of consumed challenge after state::set
  on used-nonce is now best-effort; if delete fails, nonce is still
  recorded as used (replay-safe) and we log+return verified result
  rather than 500-ing the caller.
- lsp-tools: lsp_rename gains dryRun mode (returns wouldModify list
  without writing); writes are pre-staged then executed; per-file
  failures are reported in `failed[]` instead of silently dropped.

Workspace: re-add security-headers/security-map/security-zeroize/
skill-security to [workspace].members so they actually compile and
test (they were dropped in a later channel batch merge).

cargo check --workspace clean. cargo test --workspace --release: all
tests pass including new ones for non-numeric cost, IPv6 SSRF guards,
and Workflow/WorkflowStep deserialize sanitization. clippy
-D warnings clean across all 9 touched workers.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Walkthrough

This PR enhances multiple worker crates with stricter input validation, improved error handling, and new functionality. Changes include explicit decision/type validation in approval workflows, URL safety checks in security-headers, dry-run support in lsp-tools, ID sanitization across workflow processing, better state trigger error propagation, and additional validation of policy arrays and task identifiers.

Changes

Cohort / File(s) Summary
Workspace Setup
Cargo.toml
Workspace members array updated to include four additional worker crates.
Approval Workflows
workers/approval-tiers/src/main.rs, workers/approval/src/main.rs
Enhanced validation: cost field now explicitly rejects non-numeric values while accepting null; decision field changed to strict match validation accepting only "approve" and "deny"; approval policy tools input now validated as array of strings.
Task & Workflow Processing
workers/task-decomposer/src/main.rs, workers/workflow/src/main.rs, workers/workflow/src/types.rs
State list operations now return Result and propagate trigger errors; subtask IDs sanitized via sanitize_id with malformed rejection; workflow agentId sanitized on input; custom serde deserializers validate functionId, outputVar, and id fields; retry logic refactored to call run_step and restore state; pagination now counts only matching workflow runs.
Security Enhancements
workers/security-headers/Cargo.toml, workers/security-headers/src/main.rs, workers/security-map/src/main.rs, workers/security-zeroize/src/main.rs
URL safety validation added to block localhost/internal/local patterns and disallowed IP ranges; HTTP redirects disabled; state deletion errors in map_verify treated as non-fatal warnings; secret identifier now exposed in zeroize wrap result; url dependency added to security-headers.
Development Tools
workers/lsp-tools/src/main.rs
Dry-run rename support added; precomputes modified files and occurrence counts without writing; per-file write errors collected and included in response; improved read-error logging.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 Whisker-twitched, we validate with care,
Each decision checked, each URL fair,
IDs sanitized, no ghouls slip through,
Dry runs and secrets, workflows brand new!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: addressing CodeRabbit findings across multiple worker families (task orchestration and security), referencing the specific PRs (#57 and #61).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/cr-wave2-task-security

Review rate limit: 2/3 reviews remaining, refill in 20 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@workers/task-decomposer/src/main.rs`:
- Around line 419-437: The current claim via iii.trigger(TriggerRequest {
function_id: "state::update", ... }) only checks for transport errors (claim
variable) and not whether the update actually changed status from "pending", so
concurrent workers can both proceed; modify the logic in the spawn flow around
the trigger call to perform an atomic conditional update or verify the returned
state: either (A) use a conditional update variant supported by the state engine
(e.g., add an operation or flag to TriggerRequest to "compare_and_set" status
from "pending" to "in_progress") so the update fails if status != "pending", or
(B) inspect the successful claim response payload to confirm the previous status
was "pending" before proceeding (if not, log/continue). Update the code that
uses iii.trigger, TriggerRequest, the "status" field, task_id and now_ms() so
only truly-claimed tasks continue to spawn.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7dc8042f-0f64-4d40-81da-22000d963d84

📥 Commits

Reviewing files that changed from the base of the PR and between ab72977 and 25f9b13.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • Cargo.toml
  • workers/approval-tiers/src/main.rs
  • workers/approval/src/main.rs
  • workers/lsp-tools/src/main.rs
  • workers/security-headers/Cargo.toml
  • workers/security-headers/src/main.rs
  • workers/security-map/src/main.rs
  • workers/security-zeroize/src/main.rs
  • workers/task-decomposer/src/main.rs
  • workers/workflow/src/main.rs
  • workers/workflow/src/types.rs

Comment on lines +419 to +437
let claim = iii
.trigger(TriggerRequest {
function_id: "state::update".to_string(),
payload: json!({
"scope": scope,
"key": task_id,
"operations": [
{ "type": "set", "path": "status", "value": "in_progress" },
{ "type": "set", "path": "updatedAt", "value": now_ms() }
]
}),
action: None,
timeout_ms: None,
})
.await;
if let Err(e) = claim {
tracing::warn!(task_id = %task_id, error = %e, "skipping task: failed to claim before spawn");
continue;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if state::update provides any conditional/atomic semantics
rg -n -A 10 'state::update' --type rust

Repository: iii-experimental/agentos

Length of output: 19373


🏁 Script executed:

#!/bin/bash
# Get full context around lines 396-437 in task-decomposer
sed -n '390,440p' workers/task-decomposer/src/main.rs

Repository: iii-experimental/agentos

Length of output: 1808


🏁 Script executed:

#!/bin/bash
# Find the state::update handler implementation
find . -type f -name "*.rs" | xargs grep -l "state::update" | head -5
# Also search for where the handler is defined
rg -n "\"state::update\"" --type rust -B 5 -A 15 | grep -E "(fn|impl|async)" | head -20

Repository: iii-experimental/agentos

Length of output: 464


🏁 Script executed:

#!/bin/bash
# Search for state handler or update handler definition
fd -e rs | xargs grep -l "handle.*state.*update" | head -10
# Or search for the function definition
rg "fn.*state.*update" --type rust

Repository: iii-experimental/agentos

Length of output: 120


🏁 Script executed:

#!/bin/bash
# Find the state_update function and its implementation
rg -n "async fn state_update" --type rust -A 30

Repository: iii-experimental/agentos

Length of output: 1904


🏁 Script executed:

#!/bin/bash
# Search for any conditional logic in state handlers
rg -n "state::" --type rust | grep -i "condition\|compare\|swap\|if"

Repository: iii-experimental/agentos

Length of output: 584


🏁 Script executed:

#!/bin/bash
# Look for the state handler implementation in the codebase
fd -e rs | xargs grep -l "state::update" -l | xargs grep -l "fn.*handler\|async fn.*state" | head -5

Repository: iii-experimental/agentos

Length of output: 230


🏁 Script executed:

#!/bin/bash
# Search for the actual handler implementation - look for where state::update is implemented as a handler
rg -n "\"state::update\"" --type rust -B 20 | grep -E "(handler|match|state.*update)" | head -30

Repository: iii-experimental/agentos

Length of output: 2721


🏁 Script executed:

#!/bin/bash
# Look for a shared library or service that might implement state handlers
find . -type d -name "lib" -o -name "services" -o -name "handlers" -o -name "shared" | head -10

Repository: iii-experimental/agentos

Length of output: 114


🏁 Script executed:

#!/bin/bash
# Search for any condition checking in the state::update calls - look for "condition" or "expect" fields
rg "state::update" --type rust -A 8 | grep -E "(condition|expect|require|compare|swap)" | head -20

Repository: iii-experimental/agentos

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Get the directory structure to understand the project layout
ls -la

Repository: iii-experimental/agentos

Length of output: 2230


🏁 Script executed:

#!/bin/bash
# Look for state handler implementation in crates directory
find ./crates -type f -name "*.rs" | xargs grep -l "state::update\|state.*handler" | head -10

Repository: iii-experimental/agentos

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Search in the src directory
find ./src -type f -name "*.rs" | xargs grep -l "state" | head -10

Repository: iii-experimental/agentos

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Look for function definitions that handle "state::update"
rg -n "\"state::update\"" --type rust -B 5 | grep -E "if.*==.*state::update|match.*state::update" | head -10

Repository: iii-experimental/agentos

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Check the crates directory structure
ls -la ./crates/

Repository: iii-experimental/agentos

Length of output: 285


🏁 Script executed:

#!/bin/bash
# Search for documentation or comments about state::update semantics
rg -n "state::update" --type rust -B 3 | grep -E "(//|\/\*|TODO|NOTE|FIXME|atomic|condition|race)" | head -30

Repository: iii-experimental/agentos

Length of output: 479


🏁 Script executed:

#!/bin/bash
# Look for README or documentation files that explain state::update
find . -type f \( -name "*.md" -o -name "*.txt" \) | xargs grep -l "state::update\|state.*update" 2>/dev/null | head -10

Repository: iii-experimental/agentos

Length of output: 98


🏁 Script executed:

#!/bin/bash
# Check ARCHITECTURE.md for details about state handlers
head -100 ARCHITECTURE.md

Repository: iii-experimental/agentos

Length of output: 4793


🏁 Script executed:

#!/bin/bash
# Search for any condition or expect fields in state::update payloads
rg "state::update" --type rust -A 12 | grep -E "(condition|expect|compare|verify|check)" | head -20

Repository: iii-experimental/agentos

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Look at the rate-limiter code more carefully for the "roll back" pattern
sed -n '400,450p' workers/rate-limiter/src/main.rs

Repository: iii-experimental/agentos

Length of output: 2330


🏁 Script executed:

#!/bin/bash
# Search for any "upsert" or conditional fields in state::update payloads
rg "state::update" --type rust -A 15 | grep -E "(upsert|condition|conditional|if_|only_if|expect)"

Repository: iii-experimental/agentos

Length of output: 144


🏁 Script executed:

#!/bin/bash
# Look at session-replay which uses "upsert" field
sed -n '35,55p' workers/session-replay/src/main.rs

Repository: iii-experimental/agentos

Length of output: 663


🏁 Script executed:

#!/bin/bash
# Search for error handling after state::update calls - look for patterns
rg "state::update" --type rust -A 20 | grep -B 5 -A 5 "if let.*Err\|match.*claim"

Repository: iii-experimental/agentos

Length of output: 792


🏁 Script executed:

#!/bin/bash
# Check if there's any retry logic or result checking after state::update in task-decomposer
sed -n '419,460p' workers/task-decomposer/src/main.rs

Repository: iii-experimental/agentos

Length of output: 1510


🏁 Script executed:

#!/bin/bash
# Look at the full context of how the claim result is used
sed -n '415,475p' workers/task-decomposer/src/main.rs

Repository: iii-experimental/agentos

Length of output: 2083


🏁 Script executed:

#!/bin/bash
# Search for how other workers handle state::update results - compare patterns
rg "state::update" --type rust -A 25 | grep -B 2 -A 8 "\.await" | head -80

Repository: iii-experimental/agentos

Length of output: 4592


Race condition exists: status must be verified as "pending" within the same atomic operation.

The claim-before-spawn pattern is good, but the check only catches network errors, not concurrent overwrites. Between reading status at line 396 and calling state::update at line 419, another concurrent spawn_workers invocation can read the same task with status == "pending" and both workers will successfully call state::update (which applies unconditionally). The resulting state value is only checked for errors (if let Err(e) = claim), not for whether the operation actually changed the status from "pending"—meaning duplicate spawns can still occur.

Either:

  1. Use a conditional update operation (e.g., "update only if status == pending") if the state engine supports it, or
  2. Check the returned state value to confirm status was actually "pending" before spawning.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@workers/task-decomposer/src/main.rs` around lines 419 - 437, The current
claim via iii.trigger(TriggerRequest { function_id: "state::update", ... }) only
checks for transport errors (claim variable) and not whether the update actually
changed status from "pending", so concurrent workers can both proceed; modify
the logic in the spawn flow around the trigger call to perform an atomic
conditional update or verify the returned state: either (A) use a conditional
update variant supported by the state engine (e.g., add an operation or flag to
TriggerRequest to "compare_and_set" status from "pending" to "in_progress") so
the update fails if status != "pending", or (B) inspect the successful claim
response payload to confirm the previous status was "pending" before proceeding
(if not, log/continue). Update the code that uses iii.trigger, TriggerRequest,
the "status" field, task_id and now_ms() so only truly-claimed tasks continue to
spawn.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 6 additional findings in Devin Review.

Open in Devin Review

Comment on lines +35 to +87
fn assert_safe_external_url(raw: &str) -> Result<(), IIIError> {
let parsed = Url::parse(raw)
.map_err(|e| IIIError::Handler(format!("invalid URL: {e}")))?;
let scheme = parsed.scheme();
if scheme != "https" && scheme != "http" {
return Err(IIIError::Handler(format!(
"scheme not allowed: {scheme} (only http/https)"
)));
}
let host = parsed
.host_str()
.ok_or_else(|| IIIError::Handler("URL has no host".into()))?;
let host_lc = host.to_ascii_lowercase();
if host_lc == "localhost"
|| host_lc.ends_with(".localhost")
|| host_lc.ends_with(".internal")
|| host_lc.ends_with(".local")
{
return Err(IIIError::Handler(format!(
"blocked host: {host}"
)));
}
let ip_candidate = host_lc
.strip_prefix('[')
.and_then(|s| s.strip_suffix(']'))
.unwrap_or(&host_lc);
if let Ok(ip) = ip_candidate.parse::<IpAddr>() {
let blocked = match ip {
IpAddr::V4(v4) => {
v4.is_loopback()
|| v4.is_private()
|| v4.is_link_local()
|| v4.is_broadcast()
|| v4.is_unspecified()
|| v4.is_multicast()
|| v4.octets()[0] == 169 && v4.octets()[1] == 254
}
IpAddr::V6(v6) => {
v6.is_loopback()
|| v6.is_unspecified()
|| v6.is_multicast()
|| (v6.segments()[0] & 0xfe00) == 0xfc00
|| (v6.segments()[0] & 0xffc0) == 0xfe80
}
};
if blocked {
return Err(IIIError::Handler(format!(
"blocked IP literal: {host}"
)));
}
}
Ok(())
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 SSRF check lacks DNS resolution, violating project convention

The new assert_safe_external_url function only validates the URL string and IP literals, but does not resolve DNS hostnames to verify the resolved IP is not internal. An attacker can bypass SSRF protection by providing a URL like http://evil.example.com/ where the domain resolves to 127.0.0.1, 169.254.169.254 (cloud metadata), or any other internal IP.

The project convention in CLAUDE.md explicitly states: "SSRF: assertNoSsrf(url) before any outbound HTTP (DNS resolution check)". The canonical TypeScript implementation at packages/shared/src/utils.ts:58-64 performs await lookup(host) and checks the resolved address against blocked IP ranges. The new Rust function skips this step entirely, leaving the check_url endpoint (exposed via api/security/headers/check) vulnerable to DNS-based SSRF.

Prompt for agents
The assert_safe_external_url function in workers/security-headers/src/main.rs needs to add a DNS resolution step to match the project convention described in CLAUDE.md and implemented in the TypeScript assertNoSsrf function at packages/shared/src/utils.ts:39-68.

After the existing hostname/IP string checks pass, you need to resolve the hostname to an IP address and verify the resolved IP is not in any blocked range. In Rust, you can use tokio::net::lookup_host or the trust-dns-resolver crate.

The flow should be:
1. Parse URL (existing)
2. Check scheme (existing)
3. Check hostname string patterns like localhost (existing)
4. Check if host is an IP literal and block private ranges (existing)
5. NEW: If host is NOT an IP literal, resolve it via DNS and check the resolved IP(s) against the same blocked ranges (loopback, private, link-local, broadcast, unspecified, multicast, 169.254.x.x, fc00::/7, fe80::/10)

Also consider adding coverage for Carrier-grade NAT (100.64.0.0/10) and IPv4-mapped IPv6 addresses (::ffff:x.x.x.x) which the TypeScript version checks but the Rust version does not.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

})
.await
.unwrap_or(Value::Null);
.map_err(|e| IIIError::Handler(e.to_string()))?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Approval policy check changed from fail-open to fail-closed, inconsistent with codebase pattern

The state::get call for the approval policy was changed from .unwrap_or(Value::Null) to .map_err(|e| IIIError::Handler(e.to_string()))?. Previously, if the state store was unreachable or returned an error, the function would treat it as "no policy set" and return {"required": false} (fail-open). Now it propagates the error, causing all approval checks to fail during state store outages.

This is inconsistent with the same file's other state::get calls (e.g., workers/approval/src/main.rs:323 uses .unwrap_or(Value::Null)) and the broader codebase convention (workers/approval-tiers/src/main.rs:182, workers/security-map/src/main.rs:210), which all use unwrap_or for state reads where keys may not exist. The CLAUDE.md convention also says: "Error handling: safeCall(fn, fallback, { operation }) for external calls that may fail", suggesting fail-safe patterns for state reads.

Suggested change
.map_err(|e| IIIError::Handler(e.to_string()))?;
.unwrap_or(Value::Null);
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@rohitg00 rohitg00 merged commit 6c5e628 into main Apr 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant