Skip to content

feat(deployment): add shell-exec endpoint for synchronous command execution#3097

Open
jobordu wants to merge 1 commit intoakash-network:mainfrom
jobordu:feat/shell-exec-synchronous-endpoint
Open

feat(deployment): add shell-exec endpoint for synchronous command execution#3097
jobordu wants to merge 1 commit intoakash-network:mainfrom
jobordu:feat/shell-exec-synchronous-endpoint

Conversation

@jobordu
Copy link
Copy Markdown

@jobordu jobordu commented Apr 22, 2026

See PR description at jobordu#1

Closes #3079

Summary by CodeRabbit

  • New Features
    • Added shell execution API for running service-specific commands on active deployment leases; configurable timeout (1–120s, default 60) and detailed results (stdout, stderr, exit code, truncated).
  • Tests
    • Comprehensive unit tests covering controller and execution service behaviors, edge cases, parsing, timeouts, and output truncation.
  • Documentation
    • Endpoint included in OpenAPI docs and mounted in the API router.

@jobordu jobordu requested a review from a team as a code owner April 22, 2026 05:52
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 22, 2026

Warning

Rate limit exceeded

@jobordu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 35 minutes and 27 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 35 minutes and 27 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 10a2f1ea-0fd7-4def-8872-d34f5c1d7593

📥 Commits

Reviewing files that changed from the base of the PR and between 805e071 and d9d4d81.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
  • apps/api/package.json
  • apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts
  • apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts
  • apps/api/src/deployment/http-schemas/shell-exec.schema.ts
  • apps/api/src/deployment/routes/shell-exec/shell-exec.router.ts
  • apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts
  • apps/api/src/deployment/services/shell-exec/shell-exec.service.ts
  • apps/api/src/rest-app.ts
  • apps/api/src/routes/deployment/index.ts
📝 Walkthrough

Walkthrough

Adds a synchronous POST shell-exec endpoint and supporting service: WebSocket-based ShellExecService, controller with ownership/auth checks, Zod HTTP schemas, router/OpenAPI registration, tests, and ws runtime dependency + types.

Changes

Cohort / File(s) Summary
Dependencies
apps/api/package.json
Add runtime ws and dev @types/ws.
Shell Exec Service & Helpers
apps/api/src/deployment/services/shell-exec/shell-exec.service.ts, apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts
New singleton service to build provider WebSocket URL, manage WS lifecycle, parse JSON/binary frames, enforce MAX_OUTPUT_SIZE truncation, timeout behavior, and return structured result; comprehensive unit tests covering parsing, URL encoding, truncation, timeouts, and WS events.
Controller & Tests
apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts, .../shell-exec.controller.spec.ts
New controller with Protected decorator, deployment/lease ownership and state checks, wallet/provider lookup and JWT scope resolution, invokes ShellExecService; tests cover failure and success paths.
HTTP Schemas
apps/api/src/deployment/http-schemas/shell-exec.schema.ts
Add Zod schemas and inferred types: params (dseq,gseq,oseq), request (command,service,timeout), and response (stdout,stderr,exitCode,truncated).
Routing & Integration
apps/api/src/deployment/routes/shell-exec/shell-exec.router.ts, apps/api/src/rest-app.ts, apps/api/src/routes/deployment/index.ts
New OpenAPI-backed POST route at /v1/deployments/{dseq}/leases/{gseq}/{oseq}/shell-exec; route registered in rest-app and deployment routes; OpenAPI metadata and error responses added.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Assessment against linked issues

Objective Addressed Explanation
Endpoint route structure, request/response schemas, and OpenAPI docs [#3079]
Authentication/authorization (bearer or API key) and ownership checks for deployment [#3079]
Command execution via provider-proxy with timeout handling and 1MB truncation flag [#3079]
Comprehensive tests for service and controller (#3079 expectations)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (3)
apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts (1)

130-182: setup() mutates the global DI container — tests are not isolated.

container.register(AuthService, ...) on every setup() call keeps stacking registrations on the root tsyringe container, and there's no teardown. Running this spec alongside other specs that resolve AuthService will pollute state. Use a child container (container.createChildContainer()) per test, or call container.clearInstances() / container.reset() in an afterEach.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts`
around lines 130 - 182, The setup() helper is mutating the global tsyringe
container by calling container.register(AuthService, ...) which leaks DI state
across tests; change setup() to create a fresh child container via
container.createChildContainer() and perform registrations on that child (or
alternatively call container.clearInstances() / container.reset() in an
afterEach) so each test uses an isolated container; update references in setup()
to register AuthService on the child container (and resolve services from the
child) or add an afterEach that invokes
container.clearInstances()/container.reset() to remove accumulated
registrations.
apps/api/src/deployment/services/shell-exec/shell-exec.service.ts (2)

121-123: Silent catches + no logging hide real problems.

catch {} at Line 121 and the empty-handled error paths drop every parse/protocol anomaly on the floor. Per coding guidelines, backend code should use LoggerService (not console.*) — here it's using nothing at all. At minimum, log parse failures at warn/debug with the raw payload length (not contents, to avoid leaking secrets) so operational issues are diagnosable.

As per coding guidelines: "Use LoggerService instead of console.log/warn/error for logging".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/services/shell-exec/shell-exec.service.ts` around
lines 121 - 123, The empty catch block in shell-exec.service.ts silently
swallows parse/protocol errors; replace it by injecting/using the service
LoggerService (or the existing logger instance) inside the catch of the parsing
logic (the catch following the parse/deserialize in the ShellExecService method)
and log a warning or debug-level entry that includes the event context and the
raw payload length (not contents) — e.g., logger.warn or logger.debug with a
message like "Failed to parse shell exec payload" plus payload.length and any
non-sensitive identifiers — then rethrow or handle as appropriate per current
flow; do not use console.*.

159-174: Document or fix the space-splitting limitation in shell command parsing.

The buildShellUrl method intentionally splits commands on spaces only (not with proper shell tokenization). This is by design based on the test suite, but causes correctness issues: "echo 'hello world'" becomes three URL parameters (cmd0=echo, cmd1='hello', cmd2=world'), losing the quoting and breaking arguments containing spaces—likely in this endpoint since it targets post-deploy secret injection (paths, JSON blobs, cert PEMs).

Two options:

  1. Preferred: Change command from string to string[] in the API contract, removing the need to split and matching kubectl exec semantics.
  2. If keeping string input: Document this space-splitting behavior explicitly in the OpenAPI description so API consumers understand the limitation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/services/shell-exec/shell-exec.service.ts` around
lines 159 - 174, The buildShellUrl currently splits ShellExecInput.command by
spaces, breaking quoted/space-containing args; change the API contract so
ShellExecInput.command is string[] (matching kubectl exec semantics) and update
buildShellUrl to accept that array (encode each element as &cmd{i}=...),
removing the string-splitting logic; update all call sites, tests, and the
OpenAPI/DTO/schema for the shell exec endpoint to reflect command: string[] and
adjust any validation/serialization accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/package.json`:
- Line 109: The package.json currently pins the "ws" dependency at "7.5.9" which
contains CVE-2024-37890; update the "ws" entry to a secure version (minimum
"7.5.10" or preferably "^8.20.0" / "^8") in package.json, then regenerate the
lockfile (run npm install or yarn install), commit the updated package.json and
lockfile, and run the test suite and npm/yarn audit/scan to verify no remaining
vulnerabilities; locate the "ws": "7.5.9" line in package.json to make this
change.

In `@apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts`:
- Around line 57-59: Replace the direct assert(false, 502, result.val) call in
the shell-exec controller with a safe mapping: inspect the result (the variable
named result returned from the shell exec service), convert known failure
strings into stable user-facing messages (e.g., "Command timed out", "Connection
lost", "Execution failed") and use assert or throw with that sanitized message
and 502, while sending the raw result.val to the LoggerService (or this.logger)
at error level; ensure any unknown/unmapped result.val is logged but replaced
with a generic message like "Remote execution error" before returning to the
client.
- Around line 29-43: The code only checks lease.state but not deployment.state
and calls toProviderAuth before confirming the provider exists; update the
handler to first assert deployment.state is not "closed" (e.g., check
deployment.state !== "closed" similar to lease.state check) and reorder logic so
you call providerService.getProvider(providerAddress) and assert(providerInfo,
404, "Provider not found") before calling providerService.toProviderAuth(...).
This prevents minting a JWT for a non-existent provider and ensures closed
deployments are rejected.

In `@apps/api/src/deployment/http-schemas/shell-exec.schema.ts`:
- Around line 3-13: Update ShellExecParamsSchema and ShellExecRequestSchema:
change gseq and oseq to z.coerce.number().int().nonnegative() so path params
coerce from strings and reject negatives; add dseq: z.string().regex(/^\d+$/) to
only allow digit sequences; tighten command to include .max(4096); make timeout
an integer by adding .int() (keep .min(1).max(120).default(60)); and tighten
service with a stricter regex (e.g.,
/^[a-zA-Z0-9][a-zA-Z0-9._-]{0,251}[a-zA-Z0-9]?$/ or at minimum reject spaces) on
ShellExecRequestSchema to prevent invalid service names.

In `@apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts`:
- Around line 765-773: The test is dead because it only checks trivial
construction; replace it by exercising ShellExecService.execute() with a
negative timeout to assert timeout behavior: create the service and input via
createShellExecInput({timeout: -100}), switch to fake timers
(jest.useFakeTimers()), call service.execute(input) and capture the returned
promise, advance timers by 0 (or run pending timers) to trigger the immediate
timeout, then assert the promise resolves/rejects with the expected Err("Command
timed out") (or appropriate error shape). Ensure to clean up timers
(jest.useRealTimers()) and remove the two trivial expect(...) lines currently in
the test.
- Line 17: Tests are using (service as any).buildShellUrl and parseShellMessage
which violates the "no any" rule and couples tests to private helpers; either
export the pure helper functions buildShellUrl and parseShellMessage from the
shell-exec service module and update tests to import and assert those functions
directly with proper TypeScript types, or remove the private-helper unit tests
and rely on the existing execute(...) integration tests to cover behavior;
ensure no casts to any remain (replace with correct interfaces/types) and update
spec references from (service as any) to the exported helper names or to only
call service.execute when adopting the integration approach.
- Around line 874-912: Tests are asserting undesirable behaviors; update
ShellExecService.execute and related helpers to: 1) treat WebSocket "error"
events as failures by resolving Err with a sanitized error string (e.g., wrap
provider errors like ECONNREFUSED into Err("<sanitized connection error>"))
instead of Ok(exitCode=1) — change the error handler in ShellExecService (the
WebSocket "error" listener) to reject/return Err; 2) enforce a single combined
output cap (stdout+stderr) instead of per-stream caps by changing the
buffering/truncation logic in whatever collects stream chunks (look for
functions/variables handling MAX_OUTPUT_SIZE and per-stream tracking in
ShellExecService or its output buffer helpers) to drop or truncate incoming data
once the combined size reaches MAX_OUTPUT_SIZE; and 3) pick consistent semantics
for "no exit code received" (either always Err("Connection closed without exit
code") or always Ok({ exitCode: 1 })) and make the close/closed message handling
in ShellExecService (the close/message handlers that check for
exit_code/message.closed) follow that chosen behavior, then update/add one-line
comments in the corresponding tests to reflect intentional behavior if you keep
the non-obvious choices.

In `@apps/api/src/deployment/services/shell-exec/shell-exec.service.ts`:
- Around line 114-120: The handler that currently checks if (message.closed ||
message.error) and sets exitCode = 1 discards message.error; update the block in
shell-exec.service.ts (the message handling code around exitCode, timeoutId, ws)
to capture and surface server-reported errors by storing message.error (e.g.,
into a local serverError variable or appending it to the accumulated stderr
buffer) and include that text in the final rejection or stderr output instead of
silently returning exit code 1; ensure you still clearTimeout(timeoutId) and
call ws.close(), but propagate message.error with a clear marker so callers can
distinguish proxy-reported errors from normal process exits (reference symbols:
message, exitCode, timeoutId, ws).
- Around line 126-155: The ws "error" handler currently resolves an Ok with
exitCode=1 which masks auth/connection failures; change it to resolve an Err
containing the Error (or reject) instead of Ok so callers can distinguish
network/auth errors (update the ws.on("error", ...) handler to return Err(error)
using the same Result shape), and add a handler for ws.on("unexpected-response",
(req, res) => ...) that captures the HTTP statusCode and response body (or at
least statusCode) and resolves Err with that info; keep the existing close
handler logic (which uses exitCode/Err("Connection closed without exit code"))
intact but ensure you do not double-resolve by guarding with a resolved flag or
checking if exitCode/previous result was set before calling resolve.
- Around line 70-103: The truncation logic allows later chunks to be appended
after a drop, so change the handling in shell-exec.service.ts (inside the
message parsing branches that use stdout, stderr, MAX_OUTPUT_SIZE) to make the
truncated flag "sticky": before attempting any append in both the JSON branch
(where parsed.type === "data") and the binary branch (where message.message is
an array and firstByte is used), first check whether truncated is already true
and skip processing; when a chunk would overflow, set truncated = true and stop
appending any further data (do not allow subsequent smaller chunks to pass the
length checks), and optionally trigger early termination of the read/socket
(close the socket or break the read loop) to avoid buffering remaining stream
data in memory.
- Around line 186-201: parseShellMessage currently hardcodes stream: "stdout"
for JSON-framed messages; update it to honor any stream discriminator in the
parsed payload so stderr JSON messages aren't mislabelled. In parseShellMessage,
after JSON.parse and confirming parsed.message is a non-empty string, inspect
parsed.stream (and if applicable numeric codes like 2) and map it to "stderr" or
"stdout" accordingly (default to "stdout" if missing/unknown) before returning {
type: "data", data: parsed.message, stream: ... }; keep existing exit_code
handling intact.

---

Nitpick comments:
In
`@apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts`:
- Around line 130-182: The setup() helper is mutating the global tsyringe
container by calling container.register(AuthService, ...) which leaks DI state
across tests; change setup() to create a fresh child container via
container.createChildContainer() and perform registrations on that child (or
alternatively call container.clearInstances() / container.reset() in an
afterEach) so each test uses an isolated container; update references in setup()
to register AuthService on the child container (and resolve services from the
child) or add an afterEach that invokes
container.clearInstances()/container.reset() to remove accumulated
registrations.

In `@apps/api/src/deployment/services/shell-exec/shell-exec.service.ts`:
- Around line 121-123: The empty catch block in shell-exec.service.ts silently
swallows parse/protocol errors; replace it by injecting/using the service
LoggerService (or the existing logger instance) inside the catch of the parsing
logic (the catch following the parse/deserialize in the ShellExecService method)
and log a warning or debug-level entry that includes the event context and the
raw payload length (not contents) — e.g., logger.warn or logger.debug with a
message like "Failed to parse shell exec payload" plus payload.length and any
non-sensitive identifiers — then rethrow or handle as appropriate per current
flow; do not use console.*.
- Around line 159-174: The buildShellUrl currently splits ShellExecInput.command
by spaces, breaking quoted/space-containing args; change the API contract so
ShellExecInput.command is string[] (matching kubectl exec semantics) and update
buildShellUrl to accept that array (encode each element as &cmd{i}=...),
removing the string-splitting logic; update all call sites, tests, and the
OpenAPI/DTO/schema for the shell exec endpoint to reflect command: string[] and
adjust any validation/serialization accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6981f1f1-d270-42d7-86c4-a1fec3bd6a10

📥 Commits

Reviewing files that changed from the base of the PR and between 4ea68c4 and b8c6460.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
  • apps/api/package.json
  • apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts
  • apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts
  • apps/api/src/deployment/http-schemas/shell-exec.schema.ts
  • apps/api/src/deployment/routes/shell-exec/shell-exec.router.ts
  • apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts
  • apps/api/src/deployment/services/shell-exec/shell-exec.service.ts
  • apps/api/src/rest-app.ts
  • apps/api/src/routes/deployment/index.ts

Comment thread apps/api/package.json Outdated
describe(ShellExecController.name, () => {
it("throws 404 when deployment not found", async () => {
const { controller, deploymentReaderService } = setup();
deploymentReaderService.findByUserIdAndDseq.mockResolvedValue(undefined as never);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Avoid as never/as any casts in mocks — breaks the type safety contract.

The mockResolvedValue(... as never) pattern hides real shape mismatches. Prefer typed factory helpers (e.g., a createDeployment() seeder returning the correct GetDeploymentResponse["data"] shape) so a future change in DeploymentReaderService.findByUserIdAndDseq return type fails the test loudly. Same for walletReaderService.getWalletByUserId and providerService.getProvider.

As per coding guidelines: "Never use type any or cast to type any. Always define the proper TypeScript types."

Also applies to: 94-94, 175-179

Comment on lines +29 to +43
const lease = deployment.leases.find(l => l.id.gseq === input.gseq && l.id.oseq === input.oseq);

assert(lease, 404, "Lease not found");
assert(lease.id.provider, 500, "Lease provider address not found");
assert(lease.state === "active", 400, "Lease is not active");

const providerAddress = lease.id.provider;

const wallet = await this.walletReaderService.getWalletByUserId(userId);

const auth = await this.providerService.toProviderAuth({ walletId: wallet.id, provider: providerAddress }, ["shell"]);

const providerInfo = await this.providerService.getProvider(providerAddress);

assert(providerInfo, 404, "Provider not found");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing deployment-state check + wasted JWT minting before provider lookup.

  1. You only validate the lease state === "active". A deployment can be closed while its lease row is stale; also check deployment.state (matching how the issue restricts to live deployments). At minimum reject if deployment.state === "closed".
  2. toProviderAuth(...) signs a JWT before confirming the provider actually exists. If providerInfo ends up null, you've done unnecessary crypto work and potentially handed a shell-scoped JWT to a caller targeting a non-existent provider. Reorder: getProvider and assert(providerInfo, ...) first, then toProviderAuth.
Proposed diff
-    assert(lease, 404, "Lease not found");
-    assert(lease.id.provider, 500, "Lease provider address not found");
-    assert(lease.state === "active", 400, "Lease is not active");
-
-    const providerAddress = lease.id.provider;
-
-    const wallet = await this.walletReaderService.getWalletByUserId(userId);
-
-    const auth = await this.providerService.toProviderAuth({ walletId: wallet.id, provider: providerAddress }, ["shell"]);
-
-    const providerInfo = await this.providerService.getProvider(providerAddress);
-
-    assert(providerInfo, 404, "Provider not found");
+    assert(lease, 404, "Lease not found");
+    assert(lease.id.provider, 500, "Lease provider address not found");
+    assert(lease.state === "active", 400, "Lease is not active");
+
+    const providerAddress = lease.id.provider;
+    const providerInfo = await this.providerService.getProvider(providerAddress);
+    assert(providerInfo, 404, "Provider not found");
+
+    const wallet = await this.walletReaderService.getWalletByUserId(userId);
+    const auth = await this.providerService.toProviderAuth({ walletId: wallet.id, provider: providerAddress }, ["shell"]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts`
around lines 29 - 43, The code only checks lease.state but not deployment.state
and calls toProviderAuth before confirming the provider exists; update the
handler to first assert deployment.state is not "closed" (e.g., check
deployment.state !== "closed" similar to lease.state check) and reorder logic so
you call providerService.getProvider(providerAddress) and assert(providerInfo,
404, "Provider not found") before calling providerService.toProviderAuth(...).
This prevents minting a JWT for a non-existent provider and ensures closed
deployments are rejected.

Comment on lines +57 to +59
if (!result.ok) {
assert(false, 502, result.val);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

assert(false, 502, result.val) can leak internal error strings to clients.

result.val is whatever string the service returns — today it's benign ("Command timed out", "Connection closed without exit code"), but any future change that stuffs a provider-proxy stacktrace, hostname, or Node error message into that string gets piped straight into the 502 response body. Map known failure modes to stable, user-facing messages and log the raw reason via LoggerService instead.

Proposed diff
-    if (!result.ok) {
-      assert(false, 502, result.val);
-    }
+    if (!result.ok) {
+      this.logger.warn({ event: "shell-exec-failed", reason: result.val, providerAddress, dseq: input.dseq });
+      const publicMessage = result.val === "Command timed out" ? "Command timed out" : "Shell execution failed";
+      assert(false, 502, publicMessage);
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!result.ok) {
assert(false, 502, result.val);
}
if (!result.ok) {
this.logger.warn({ event: "shell-exec-failed", reason: result.val, providerAddress, dseq: input.dseq });
const publicMessage = result.val === "Command timed out" ? "Command timed out" : "Shell execution failed";
assert(false, 502, publicMessage);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts`
around lines 57 - 59, Replace the direct assert(false, 502, result.val) call in
the shell-exec controller with a safe mapping: inspect the result (the variable
named result returned from the shell exec service), convert known failure
strings into stable user-facing messages (e.g., "Command timed out", "Connection
lost", "Execution failed") and use assert or throw with that sanitized message
and 502, while sending the raw result.val to the LoggerService (or this.logger)
at error level; ensure any unknown/unmapped result.val is logged but replaced
with a generic message like "Remote execution error" before returning to the
client.

Comment on lines +3 to +13
export const ShellExecParamsSchema = z.object({
dseq: z.string(),
gseq: z.number(),
oseq: z.number()
});

export const ShellExecRequestSchema = z.object({
command: z.string().min(1),
service: z.string().min(1),
timeout: z.number().min(1).max(120).default(60)
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In @hono/zod-openapi 0.18.x, are URL path parameters validated as strings by default, or does z.number() auto-coerce for path params?

💡 Result:

In @hono/zod-openapi 0.18.x, URL path parameters are validated as strings by default. Hono extracts path parameters as strings (automatically URL-decoded), and the official documentation and examples explicitly use z.string for path params like {id}. z.number does not auto-coerce for path params. Multiple GitHub issues confirm that using z.number directly on path params causes type inference issues or validation failures because the input is a string. Developers must use explicit coercion like z.coerce.number, z.string.pipe(z.coerce.number), z.preprocess(...), or similar transforms to convert the string input to a number. This behavior holds across versions, including 0.18.x, as path param extraction in Hono is consistently string-based.

Citations:


🏁 Script executed:

# Find where ShellExecParamsSchema is used
fd -t f "\.ts$" apps/api | xargs rg "ShellExecParamsSchema" -l

Repository: akash-network/console

Length of output: 185


🏁 Script executed:

# Search for route definitions using ShellExecParamsSchema
rg "ShellExecParamsSchema" -A 5 -B 5

Repository: akash-network/console

Length of output: 3782


🏁 Script executed:

# Find shell-exec related service code and buildShellUrl
rg "buildShellUrl|ShellExec" -t ts --max-count 20

Repository: akash-network/console

Length of output: 6628


🏁 Script executed:

# Check test files for shell-exec
fd -t f -i "shell.*exec.*test\|test.*shell.*exec" apps/

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Look for schema patterns in the codebase for similar validation
rg "z\.string\(\)\.regex" apps/api/src --max-count 10

Repository: akash-network/console

Length of output: 399


🏁 Script executed:

# Read the shell-exec service test file to find references to dseq: "foo/bar/baz"
cat -n apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts | head -150

Repository: akash-network/console

Length of output: 6273


🏁 Script executed:

# Read the buildShellUrl implementation in the service
cat -n apps/api/src/deployment/services/shell-exec/shell-exec.service.ts

Repository: akash-network/console

Length of output: 8112


🏁 Script executed:

# Check how dseq, gseq, oseq are used in the controller
cat -n apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts

Repository: akash-network/console

Length of output: 2984


🏁 Script executed:

# Check if there's a test with dseq: "foo/bar/baz" (continuation of spec file)
tail -100 apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts

Repository: akash-network/console

Length of output: 3490


🏁 Script executed:

# Look for the createShellExecInput helper function
rg "createShellExecInput" -A 10 apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts

Repository: akash-network/console

Length of output: 23237


🏁 Script executed:

# Verify how the router passes validated params to controller
cat -n apps/api/src/deployment/routes/shell-exec/shell-exec.router.ts | grep -A 3 "routeShellExec"

Repository: akash-network/console

Length of output: 334


Path parameters must use z.coerce.number() and numeric fields need additional constraints.

  1. gseq and oseq are URL path parameters but defined as z.number()@hono/zod-openapi does not auto-coerce path params from strings. Every request will fail validation. Use z.coerce.number().int().nonnegative() instead.

  2. dseq has no validation; test evidence shows dseq: "foo/bar/baz" reaches buildShellUrl. Add z.string().regex(/^\d+$/) to reject malformed values at the boundary.

  3. gseq and oseq should be .nonnegative() — test cases explicitly document that negative values (e.g., -1, -5) pass through with note "(no validation)".

  4. command lacks a .max() limit — an API token holder can POST multi-MB payloads. Add .max(4096) or a realistic ceiling aligned with the 1 MB response cap.

  5. timeout should be .int() — fractional seconds (e.g., 60.5) are currently allowed but meaningless.

  6. service should be stricter to match typical k8s service naming (alphanumeric, hyphens, underscores, no spaces/slashes). Consider .regex(/^[a-zA-Z0-9][a-zA-Z0-9._-]{0,251}[a-zA-Z0-9]?$/) or at minimum reject spaces.

Proposed diff
 export const ShellExecParamsSchema = z.object({
-  dseq: z.string(),
-  gseq: z.number(),
-  oseq: z.number()
+  dseq: z.string().regex(/^\d+$/, "dseq must be numeric"),
+  gseq: z.coerce.number().int().nonnegative(),
+  oseq: z.coerce.number().int().nonnegative()
 });

 export const ShellExecRequestSchema = z.object({
-  command: z.string().min(1),
+  command: z.string().min(1).max(4096),
-  service: z.string().min(1),
+  service: z.string().min(1).max(253),
-  timeout: z.number().min(1).max(120).default(60)
+  timeout: z.number().int().min(1).max(120).default(60)
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export const ShellExecParamsSchema = z.object({
dseq: z.string(),
gseq: z.number(),
oseq: z.number()
});
export const ShellExecRequestSchema = z.object({
command: z.string().min(1),
service: z.string().min(1),
timeout: z.number().min(1).max(120).default(60)
});
export const ShellExecParamsSchema = z.object({
dseq: z.string().regex(/^\d+$/, "dseq must be numeric"),
gseq: z.coerce.number().int().nonnegative(),
oseq: z.coerce.number().int().nonnegative()
});
export const ShellExecRequestSchema = z.object({
command: z.string().min(1).max(4096),
service: z.string().min(1).max(253),
timeout: z.number().int().min(1).max(120).default(60)
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/http-schemas/shell-exec.schema.ts` around lines 3 -
13, Update ShellExecParamsSchema and ShellExecRequestSchema: change gseq and
oseq to z.coerce.number().int().nonnegative() so path params coerce from strings
and reject negatives; add dseq: z.string().regex(/^\d+$/) to only allow digit
sequences; tighten command to include .max(4096); make timeout an integer by
adding .int() (keep .min(1).max(120).default(60)); and tighten service with a
stricter regex (e.g., /^[a-zA-Z0-9][a-zA-Z0-9._-]{0,251}[a-zA-Z0-9]?$/ or at
minimum reject spaces) on ShellExecRequestSchema to prevent invalid service
names.

Comment thread apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts Outdated
Comment thread apps/api/src/deployment/services/shell-exec/shell-exec.service.ts
Comment thread apps/api/src/deployment/services/shell-exec/shell-exec.service.ts
Comment thread apps/api/src/deployment/services/shell-exec/shell-exec.service.ts Outdated
Comment on lines +186 to +201
private parseShellMessage(message: string): { type: "data" | "exit_code"; data?: string; stream?: "stdout" | "stderr"; exit_code?: number } | null {
if (message.startsWith("{")) {
try {
const parsed = JSON.parse(message);
if (typeof parsed.exit_code === "number") {
return { type: "exit_code", exit_code: parsed.exit_code };
}
if (typeof parsed.message === "string" && parsed.message.length > 0) {
return { type: "data", data: parsed.message, stream: "stdout" };
}
} catch {
return null;
}
}
return null;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect provider-proxy shell message producers to confirm JSON vs binary framing and whether JSON carries a stream indicator.
fd -t f -e ts -e js . apps/provider-proxy | xargs rg -nC3 -e 'exit_code' -e '"stdout"' -e '"stderr"' -e 'shell' 2>/dev/null | head -200

Repository: akash-network/console

Length of output: 3963


🏁 Script executed:

cat -n apps/api/src/deployment/services/shell-exec/shell-exec.service.ts | head -250

Repository: akash-network/console

Length of output: 8112


🏁 Script executed:

# Search for provider-proxy shell message construction to understand JSON format
rg -n 'exit_code|message' apps/provider-proxy/src --type ts -A 2 -B 2 | grep -A 5 -B 5 -i 'json\|shell' | head -100

Repository: akash-network/console

Length of output: 7354


🏁 Script executed:

# Find where socketMessage is constructed/received in provider-proxy shell handling
rg -n 'socketMessage' apps/provider-proxy/src --type ts -B 5 -A 5

Repository: akash-network/console

Length of output: 2035


🏁 Script executed:

# Search for provider shell endpoint or message format documentation
fd -t f -e ts -e md apps/provider-proxy | xargs rg -l 'shell' | head -5

Repository: akash-network/console

Length of output: 631


🏁 Script executed:

# Search for how provider sends shell messages - look for Kubelet or exec integration
rg -n 'exit_code|stdout|stderr' apps/provider-proxy/src --type ts | grep -i 'shell\|exec' -A 3 -B 3

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Check if there's any JSON message format with stream field in the entire codebase
rg -n '"stream"' apps --type ts

Repository: akash-network/console

Length of output: 820


🏁 Script executed:

# Look for where this data comes from - check if there's any Kubernetes exec interface or test data
rg -n 'exit_code.*message\|message.*exit_code' apps --type ts -B 3 -A 3 | head -80

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Check tests to see what message format is used
fd -t f -e spec.ts -e test.ts apps/api/src/deployment | xargs cat | grep -A 10 -B 10 'parseShellMessage\|shell' 2>/dev/null | head -100

Repository: akash-network/console

Length of output: 462


🏁 Script executed:

# Search for test files related to shell-exec
fd --full-path '*shell-exec*' . --type f

Repository: akash-network/console

Length of output: 356


🏁 Script executed:

# Look for any usage or examples of parseShellMessage or the JSON message format
rg -n 'parseShellMessage\|exit_code.*message' apps/api --type ts -B 5 -A 5

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Check if provider ever sends stream field in JSON payload - look at actual payload examples
rg -n 'stream.*stdout\|stream.*stderr' apps --type ts

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Look directly at the WebSocket message handling to see both framings together
sed -n '54,124p' apps/api/src/deployment/services/shell-exec/shell-exec.service.ts

Repository: akash-network/console

Length of output: 2613


🏁 Script executed:

# Search for any documentation or comments about the message format
rg -n 'JSON\|binary\|framing' apps/api/src/deployment/services/shell-exec/ --type ts

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Check provider-proxy Kubernetes exec integration to understand the full message flow
rg -n 'Kubernetes\|kubectl\|exec\|Kubelet' apps/provider-proxy/src --type ts | head -40

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Look at what socketMessage contains from provider - is it ever a parsed JSON object?
rg -n 'socketMessage.*JSON\|JSON.*socketMessage' apps/provider-proxy/src --type ts -B 3 -A 3

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Check if there are any integration tests that show actual message examples
fd --full-path '*/shell*' apps/api --type f -name '*.spec.ts' -o -name '*.test.ts'

Repository: akash-network/console

Length of output: 295


🏁 Script executed:

# Look at provider-proxy message wrapping more carefully - check entire proxyMessageToProvider flow
rg -n 'proxyMessageToProvider' apps/provider-proxy/src --type ts -A 30 | head -80

Repository: akash-network/console

Length of output: 4278


🏁 Script executed:

# Examine the binary framing comment to understand the protocol better
sed -n '83,111p' apps/api/src/deployment/services/shell-exec/shell-exec.service.ts

Repository: akash-network/console

Length of output: 1220


🏁 Script executed:

# Check if there are any comments or README about message format in shell-exec
cat apps/api/src/deployment/services/shell-exec/README.md 2>/dev/null || echo "No README found"

Repository: akash-network/console

Length of output: 80


🏁 Script executed:

# Look for any Kubernetes exec protocol documentation in comments or code
rg -n 'firstByte.*2\|stderr\|binary' apps/api/src/deployment/services/shell-exec/ --type ts

Repository: akash-network/console

Length of output: 47


🏁 Script executed:

# Final check: look for any test data or examples showing the JSON message format with/without stream field
find apps -type f \( -name '*.spec.ts' -o -name '*.test.ts' -o -name '*.json' \) -exec grep -l 'shell\|exit_code' {} \; 2>/dev/null | head -10

Repository: akash-network/console

Length of output: 351


🏁 Script executed:

# Look at the exact condition check to see if stream field could ever be non-stdout
sed -n '186,201p' apps/api/src/deployment/services/shell-exec/shell-exec.service.ts

Repository: akash-network/console

Length of output: 673


🏁 Script executed:

# Check the shell-exec service test to see what message formats are tested
cat apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts

Repository: akash-network/console

Length of output: 50377


parseShellMessage hardcodes stream: "stdout" — if stderr ever arrives via JSON framing, it will be mislabeled.

The JSON handler at line 194 unconditionally assigns stream: "stdout" to all JSON messages with a message field. The binary path (lines 91–103) correctly distinguishes streams using firstByte (0/1 = stdout, 2 = stderr), showing both framings coexist. If provider-proxy ever sends stderr through the JSON path, that output would silently merge into stdout instead of being captured separately.

Confirm the provider-proxy contract: does JSON-framed shell output ever carry stderr, and if so, whether a stream discriminator field exists in the payload to honor here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/services/shell-exec/shell-exec.service.ts` around
lines 186 - 201, parseShellMessage currently hardcodes stream: "stdout" for
JSON-framed messages; update it to honor any stream discriminator in the parsed
payload so stderr JSON messages aren't mislabelled. In parseShellMessage, after
JSON.parse and confirming parsed.message is a non-empty string, inspect
parsed.stream (and if applicable numeric codes like 2) and map it to "stderr" or
"stdout" accordingly (default to "stdout" if missing/unknown) before returning {
type: "data", data: parsed.message, stream: ... }; keep existing exit_code
handling intact.

@jobordu jobordu force-pushed the feat/shell-exec-synchronous-endpoint branch from 9eebcd2 to 805e071 Compare April 22, 2026 10:03
@jobordu
Copy link
Copy Markdown
Author

jobordu commented Apr 22, 2026

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 22, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts (1)

2-6: Avoid mutating the global DI container in this unit test.

ShellExecController is constructed manually with authService, so registering AuthService in the root tsyringe container is unnecessary and can leak state into unrelated specs. Remove the registration/import unless this test resolves the controller from the container.

♻️ Proposed cleanup
-import { container } from "tsyringe";
 import { describe, expect, it } from "vitest";
 import { mock } from "vitest-mock-extended";
 
-import { AuthService } from "@src/auth/services/auth.service";
+import type { AuthService } from "@src/auth/services/auth.service";
@@
-    container.register(AuthService, { useValue: authService });
-
     const controller = new ShellExecController(deploymentReaderService, providerService, shellExecService, authService, walletReaderService);

Also applies to: 168-168

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts`
around lines 2 - 6, The test is mutating the global tsyringe container by
registering AuthService even though ShellExecController is constructed manually
with a local mock; remove the unnecessary import/registration of AuthService and
any calls to container.register/registerInstance in
shell-exec.controller.spec.ts so the spec only uses the local mock
(mock<AuthService>()) and the manual constructor for ShellExecController to
avoid leaking DI state into other tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts`:
- Around line 970-995: The test currently asserts that a frame starting with
marker 0 carries stdout, but per protocol marker 1 = stdout and marker 0 is an
exit sentinel (or should be discarded for payloads), so update the test (or
service) to treat any incoming frame whose first byte is 0 and which also
contains payload as ignored/protocol error; specifically adjust the
ShellExecService.execute handling (and this spec) so the mock message
Buffer.from(JSON.stringify({ type: "websocket", message: [0,72,101,108,108,111]
})) is ignored and the final assertion checks that (result.val as
ShellExecOutput).stdout is "" (or that an appropriate protocol error is
surfaced), leaving other frames like marker 1 to populate stdout as before;
ensure references to ShellExecService.execute, createShellExecInput, and
ShellExecOutput are used when locating and updating the code/tests.

---

Nitpick comments:
In
`@apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts`:
- Around line 2-6: The test is mutating the global tsyringe container by
registering AuthService even though ShellExecController is constructed manually
with a local mock; remove the unnecessary import/registration of AuthService and
any calls to container.register/registerInstance in
shell-exec.controller.spec.ts so the spec only uses the local mock
(mock<AuthService>()) and the manual constructor for ShellExecController to
avoid leaking DI state into other tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9a713d4b-f484-4a14-b6db-b801363558f2

📥 Commits

Reviewing files that changed from the base of the PR and between b8c6460 and 805e071.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
  • apps/api/package.json
  • apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.spec.ts
  • apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts
  • apps/api/src/deployment/http-schemas/shell-exec.schema.ts
  • apps/api/src/deployment/routes/shell-exec/shell-exec.router.ts
  • apps/api/src/deployment/services/shell-exec/shell-exec.service.spec.ts
  • apps/api/src/deployment/services/shell-exec/shell-exec.service.ts
  • apps/api/src/rest-app.ts
  • apps/api/src/routes/deployment/index.ts
✅ Files skipped from review due to trivial changes (3)
  • apps/api/package.json
  • apps/api/src/routes/deployment/index.ts
  • apps/api/src/deployment/http-schemas/shell-exec.schema.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • apps/api/src/deployment/routes/shell-exec/shell-exec.router.ts
  • apps/api/src/deployment/services/shell-exec/shell-exec.service.ts
  • apps/api/src/deployment/controllers/shell-exec/shell-exec.controller.ts

…cution

- Add POST /v1/deployments/{dseq}/leases/{gseq}/{oseq}/shell-exec endpoint
- Use GranularAccess JWT with shell scope for authentication
- WebSocket client for provider-proxy shell endpoint communication
- CASL-based authorization with deployment ownership validation
- 129 tests including adversarial hardening coverage

Closes akash-network#3079
@jobordu jobordu force-pushed the feat/shell-exec-synchronous-endpoint branch from 805e071 to d9d4d81 Compare April 22, 2026 10:27
@jobordu
Copy link
Copy Markdown
Author

jobordu commented Apr 22, 2026

@baktun14 Ready !

btw, tu es Québécois ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add synchronous shell-exec endpoint for API-token access

1 participant