fix(strands-command): return full PR diffs and offload large tool results #63
yonib05 wants to merge 6 commits into
get_pr_files capped every file's patch at 50 lines, so the reviewer silently lost the tail of any larger change and reasoned about code it never saw. Replace the per-file line cap with a total character budget: inline full patches until the budget is reached, then list remaining files for on-demand fetch instead of truncating mid-file.
…ping it A file whose patch alone exceeds the budget previously produced no inline diff at all, leaving the reviewer dependent on fetching it. Inline a line-trimmed head slice when enough budget remains so there is always some inline signal, and note the omitted remainder is fetchable. Adds tests for the partial-head and mixed inline/overflow cases.
Guard the partial-head branch so a patch starting with a newline (whose line-boundary trim leaves an empty head) is fully deferred instead of emitting an empty "Diff (head only):" block. Add tests covering the full-deferral branch, the exact budget boundary, and the empty-head edge, merge the two near-duplicate over-budget tests, and note the accepted file-order dependence of the diff budget.
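The budgeting described in the three commit messages above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name `render_pr_files`, the `budget` and `min_head` parameters, and the section text are assumptions; only the variable names `head`, `used`, `omitted`, and `overflow` come from the reviewed snippet.

```python
def render_pr_files(files, budget=100, min_head=40):
    """Inline patches up to a total character budget (hypothetical helper).

    Returns (sections, overflow): rendered inline sections, and the list of
    files that were deferred or only partially shown.
    """
    used = 0
    sections = []
    overflow = []
    for f in files:
        filename = f["filename"]
        patch = f.get("patch", "")
        stats = f"+{f['additions']} -{f['deletions']}"
        remaining = budget - used

        # Whole patch fits in what is left of the budget: inline it fully.
        if len(patch) <= remaining:
            used += len(patch)
            sections.append(f"{filename} ({stats})\nDiff:\n{patch}")
            continue

        # Otherwise try a head slice, trimmed back to the last line boundary.
        head = ""
        if remaining >= min_head:
            cut = patch.rfind("\n", 0, remaining)
            head = patch[:cut] if cut > 0 else ""

        if head:
            used += len(head)
            omitted = len(patch) - len(head)
            sections.append(
                f"{filename} ({stats})\nDiff (head only):\n{head}\n"
                f"[{omitted} characters omitted]"
            )
            overflow.append(f"{filename} ({stats}, partially shown)")
        else:
            # Empty head (e.g. patch starts with a newline): defer fully.
            overflow.append(f"{filename} ({stats})")
    return sections, overflow
```

Because the budget is consumed in list order, which files end up inlined depends on the order the files arrive in, which is the file-order dependence the last commit message accepts.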
if head:
    used += len(head)
    omitted = len(patch) - len(head)
    overflow.append(f"{filename} (+{additions} -{deletions}, partially shown)")
Can we write the overflow to disk and let the agent know where it is?
@tool
@log_inputs
def get_pr_files(pr_number: int, repo: str | None = None) -> str:
This is PR output?
Can we just leverage the Strands Tool Offloader here and let it manage the truncation?
Good call, will swap. It should simplify a lot.
Done in 52c3ecf. Swapped the bespoke budgeting for the SDK's ContextOffloader plugin (wired in agent_runner.py with InMemoryStorage). get_pr_files now just returns the full diff, and the plugin offloads any oversized tool result — not only diffs but file reads and shell output too — registering a retrieval tool so the agent fetches full content on demand. Net: ~60 lines of truncation logic + its constants removed, and truncation is handled generically for every tool.
…oader Replace the bespoke per-call diff budgeting in get_pr_files with the SDK's ContextOffloader plugin. get_pr_files now returns the full untruncated diff; the plugin offloads any oversized tool result (diffs, file reads, shell output) to storage and registers a retrieval tool so the agent fetches the full content on demand. This handles truncation generically for every tool rather than just this one, per review feedback.
… sessions resolve references
…reviewed GitHub list endpoints cap at 100 items per page, so reading a single response silently dropped changed files past the first page on large PRs -- the reviewer would never see them. Add a paginated GET helper that follows Link rel="next" (bounded by a page cap) and use it for the PR files list.
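A paginated GET of the kind this commit describes might look like the sketch below. It is an assumption-laden illustration rather than the PR's helper: `get_all_pages`, `next_page_url`, the injected `fetch` callable, and the default `max_pages` value are all hypothetical, chosen so the loop can be exercised without network access.

```python
import re
from typing import Callable, Optional

# Hypothetical transport: fetch(url) -> (JSON items on that page, Link header or None).
Fetch = Callable[[str], tuple[list[dict], Optional[str]]]

_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="next"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, if there is one."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def get_all_pages(url: str, fetch: Fetch, max_pages: int = 30) -> list[dict]:
    """Collect items from every page, stopping at max_pages as a safety cap."""
    items: list[dict] = []
    next_url: Optional[str] = url
    pages = 0
    while next_url and pages < max_pages:
        page_items, link_header = fetch(next_url)
        items.extend(page_items)
        next_url = next_page_url(link_header)
        pages += 1
    return items
```

The page cap keeps a malformed or cyclic `Link` header from looping forever, at the cost of truncating extremely large lists once the cap is hit.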
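Problem
get_pr_files had two ways of hiding code from the reviewer. First, each file's patch was capped at 50 lines (# Limit diff size to avoid overwhelming output), so issues below line 50 were invisible. Second, only the first page of the PR files list was read, and GitHub list endpoints cap at 100 items per page, so changed files past the first page were silently dropped on large PRs.
Change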
get_pr_files now returns each file's full, untruncated diff, and follows pagination (new _github_get_all_pages helper, which follows Link rel="next" with a page-count safety cap) so every changed file is included.
Oversized tool results are handled by the SDK's ContextOffloader plugin (agent_runner.py). When a result exceeds the token threshold, the plugin offloads it to storage, replaces it in-context with a preview + reference, and registers a retrieve_offloaded_content tool so the agent pulls the full content on demand.
This handles oversized results generically for every tool (diffs, file reads, shell output), not just get_pr_files, and removes ~60 lines of bespoke per-call budgeting/head-slicing logic.
Storage backend
Uses S3Storage, reusing the bucket already configured for session management. The offloaded result is replaced in the conversation by a storage reference that the session manager persists; a resumed session in a later process must still resolve that reference, which in-memory storage could not guarantee.
Tests
tests/test_github_tools.py (get_pr_files):
Full unit suite: 38 passed.
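For readers unfamiliar with the offloading pattern described under Change, here is a generic sketch of the idea. It does not use or reproduce the SDK's ContextOffloader API: `DictStore`, `offload_if_large`, and the character-based `max_chars` threshold (standing in for the plugin's token threshold) are hypothetical, and the retrieval function only borrows the name of the tool mentioned in the description.

```python
import uuid


class DictStore:
    """Hypothetical in-process store; a durable backend would replace this."""

    def __init__(self):
        self._blobs = {}

    def put(self, content: str) -> str:
        ref = uuid.uuid4().hex
        self._blobs[ref] = content
        return ref

    def get(self, ref: str) -> str:
        return self._blobs[ref]


def offload_if_large(result: str, store: DictStore,
                     max_chars: int = 1000, preview_chars: int = 200) -> str:
    """Return small results unchanged; swap large ones for a preview + reference."""
    if len(result) <= max_chars:
        return result
    ref = store.put(result)
    preview = result[:preview_chars]
    return f"{preview}\n[offloaded {len(result)} chars; ref={ref}]"


def retrieve_offloaded_content(ref: str, store: DictStore) -> str:
    """Stand-in for the retrieval tool: fetch the full content by reference."""
    return store.get(ref)
```

A store like `DictStore` loses its contents when the process exits, which is the limitation the Storage backend section cites as the reason for choosing a persistent backend.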