feat(drive): add +list-comments shortcut with anchor-aware filtering#1114
feat(drive): add +list-comments shortcut with anchor-aware filtering#1114kyalpha313 wants to merge 4 commits into
Conversation
📝 WalkthroughWalkthroughAdds a new ChangesDrive +list-comments Shortcut Implementation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@shortcuts/drive/drive_list_comments_test.go`:
- Around line 458-593: Both tests
(TestDriveListComments_E2E_FiltersOrphanedByDefault and
TestDriveListComments_E2E_IncludeOrphanedKeepsAll) rely on global config and
must isolate config state; add t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir())
at the start of each test (or in a shared test helper called before
cmdutil.TestFactory/mountAndRunDrive) so the tests use a temporary
LARKSUITE_CLI_CONFIG_DIR and avoid cross-test state leakage.
- Around line 86-132: Replace uses of common.TestNewRuntimeContext(...) in these
tests with the repo-standard test factory cmdutil.TestFactory(t,
&core.CliConfig{}); specifically, when building the runtime for
validateListComments calls (tests referencing newListCommentsCmd and
validateListComments), call cmdutil.TestFactory(t, &core.CliConfig{}) to obtain
the test factory/runtime instead of common.TestNewRuntimeContext, and update any
variable names as needed so validateListComments receives the factory/runtime
produced by cmdutil.TestFactory.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 177b228c-d297-4257-8756-8d8f50e92a48
📒 Files selected for processing (4)
shortcuts/drive/drive_list_comments.goshortcuts/drive/drive_list_comments_test.goshortcuts/drive/shortcuts.goshortcuts/drive/shortcuts_test.go
Add t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir()) to the two httpmock E2E
tests for drive +list-comments, matching the pattern used by
drive_inspect_test.go and AGENTS.md guidance for config-state isolation.
Addresses CodeRabbit review on larksuite#1114.
| hasStructuralAnchor := strings.Contains(docXML, structuralAnchorTag) | ||
| // Approximate position of the FIRST sticky-note anchor in normalized space. | ||
| // All sticky-anchored comments share this position for sorting purposes. | ||
| structuralPos := projectRawPosToNormalized(docXML, normalized, structuralAnchorTag) |
There was a problem hiding this comment.
This computes a single structuralPos from the first <readonly-block> and then reuses that same position for every sticky-note comment. As a result, multiple structural comments do not actually sort by document position under the default order=anchor; they all tie on the same anchor position and then fall back to create_time. That does not match the PR description about approximating sticky-note order from XML position. If we cannot map individual sticky-note comments to distinct structural anchors, we should either avoid claiming anchor-order fidelity for them or make this limitation explicit in output and tests.
| return "" | ||
| } | ||
| decoded := html.UnescapeString(firstLine) | ||
| return whitespacePattern.ReplaceAllString(decoded, "") |
There was a problem hiding this comment.
Rich-text quotes containing inline markup will be misclassified here. The PR description says quote matching should strip HTML tags before comparison, but normalizeQuoteNeedle only unescapes entities and removes whitespace. If the comments API returns formatting tags inside quote (for example <b> or <i>), the needle will still contain markup and fail to match the tag-stripped document text, causing a still-valid comment to be classified as orphaned and filtered out by default. Please strip tags here as well, and add a test where the quote itself contains inline markup.
|
感谢这个PR,看了下实现,现在是根据评论的quota去匹配正文的,定位在xml内容的起始位置,有几个问题:
另外,我们下周会在评论接口返回评论所在的blockID位置,这个会更加准确,也能覆盖block评论场景,可以等下我们下周的版本。 |
|
@wittam-01 感谢 review 和详细反馈,这两点都很到位:
听到下周接口会返回 blockID,这才是真正的根治方案——可以唯一定位锚、自然覆盖 block 评论场景,连 fetch 正文都省了。我先把这个 PR 转 draft 暂停,等新接口上线后用 blockID 重写,代码也能砍掉一大半。 不用提前给我看新字段 schema,下周看到了再说。期间如果有什么我能帮忙的(比如新接口上线时优先试一下、或者拆其它周边小工具),随时 ping 我。 |
Add t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir()) to the two httpmock E2E
tests for drive +list-comments, matching the pattern used by
drive_inspect_test.go and AGENTS.md guidance for config-state isolation.
Addresses CodeRabbit review on larksuite#1114.
…tests
Revert the t.Setenv("LARKSUITE_CLI_CONFIG_DIR", ...) added in 64e65b6. Per the
repo learning from PR larksuite#343, tests built via cmdutil.TestFactory(t, config) use
an in-memory config closure and never touch the filesystem, so isolating
CONFIG_DIR has no effect. Only the real NewDefault() factory path needs it.
Verified: cmdutil.TestFactory sets Config: func() (*core.CliConfig, error) {
return config, nil } — no filesystem access.
…#1258 Replaces the client-side quote/text-matching orphan detection with the server-supplied anchor identifiers from `need_relation=true`, following the field design documented in larksuite#1258. Behavior: - list comments with `need_relation=true` to obtain `relation.relation` (containing `positionInfo.blockID`) and `content_deleted` - fetch docs XML with `export_block_id=true`; build a block-id index for stable anchor ordering (replaces the prior text-position projection that broke on multi-occurrence and tag-broken anchors) - use `content_deleted` as the orphan signal (resolves the prior "short anchor multi-occurrence" false-positive limitation) - use `parent_type` / `parent_token` to place embedded-resource comments (Sheet / Base / Whiteboard); intentionally downgrade these to `parent_resource_exact` rather than claiming exact internal ordering, since docs XML only identifies the embedding block Output contract additions: - `anchor_block_id`: per-comment anchor (from positionInfo.blockID) - `location_accuracy`: relation_exact | parent_resource_exact | weak_inferred | content_deleted | whole_document - `content_deleted`: server-side orphan signal Tests: - unit: relation parsing, malformed-JSON fallback, content-deleted orphan classification, relation-exact / parent-resource-exact anchoring, embedded-relation downgrade, sorted output, dry-run request shape, httpmock E2E - dry-run E2E (`tests/cli_e2e/drive/drive_list_comments_dryrun_test.go`): asserts `need_relation=true`, `need_reaction=true`, `is_solved=false`, and fetch body `format=xml` with `export_block_id=true` - opt-in live E2E (`tests/cli_e2e/drive/drive_list_comments_workflow_test.go`): gated by `LARK_DRIVE_LIST_COMMENTS_E2E=1`; user identity; creates a docx, adds a block comment, lists, asserts relation metadata, cleans up - coverage doc updated to 11 / 32 = 34.4% Live validation against a real docx: composition now matches the Lark UI side panel for 21 of 22 originally observed comments (the prior algorithm had a 2-item swap error from text-matching false positives); `[Sticky note]` and embedded `<callout>` comments now land at their correct positions.
b0267d3 to
b86407a
Compare
|
@wittam-01 进展同步: 已按 #1258 的字段设计把 PR #1114 重写并 force-push(基于刚 rebase 到
实测某线上 docx(原 29 条):新算法 22 valid + 2 structural + 8 orphan,与 UI 侧栏匹配 21/22(剩余 1 条推测是文档编辑后 block 被删, PR 暂时仍是 draft——等 #1258 合入(或你确认这套字段假设稳定)后再 mark ready,以防字段还有调整。CI 跑通后再来更新这条 thread。 不急,等你方便回。 |
我这边的PR预计明天会合入,还在验证中。你这边的PR还需要吗,现在已经可以使用lark-cli drive file.comments list拿到评论列表了 |
Summary
新增
drive +list-commentsshortcut——以智能默认返回 docx 评论,对齐用户在飞书 UI 侧栏看到的评论卡片。详见 #1111 的背景。本 PR 已基于 #1258 文档化的字段(
need_relation=true/relation.relation.positionInfo.blockID/content_deleted/parent_type/parent_token)重写实现,取代了早期"客户端 quote 文本匹配 + 字节偏移投影"的曲线方案。CLI 形态
输出契约
复用
drive file.comments list原响应,每条评论多五个派生字段:{ "comment_id": "...", "quote": "...", "is_solved": false, "is_whole": false, "anchor_state": "valid" | "structural" | "orphaned", "anchor_position": 12, "anchor_block_id": "...", // 从 positionInfo.blockID 解析 "content_deleted": false, // 服务端 orphan 信号 "location_accuracy": "relation_exact" // 见下表 | "parent_resource_exact" | "weak_inferred" | "content_deleted" | "whole_document", "reply_list": { "replies": [ { ..., "reactions": [...] } ] } }外层附
counts: {total, valid, structural, orphaned}与file_token/wiki_token,便于上层断言。定位准确度分级
relation_exactpositionInfo.blockID能在docs +fetch --detail with-ids的 block 树里找到parent_resource_exactcontent_deletedcontent_deleted=truewhole_documentis_whole=true的全文评论weak_inferred内部流程
--doc,wiki URL 通过wiki/v2/spaces/get_node解包到 docx token。GET /open-apis/drive/v1/files/:file_token/comments,默认带need_relation=true+is_solved=false+need_reaction=true。POST /open-apis/docs_ai/v1/documents/:token/fetch拉取正文 XML(format=xml,export_option.export_block_id=true),构建 block ID 索引用于稳定排序。relation.content_deleted=true→orphaned+location_accuracy=content_deletedrelation.relation(嵌套 JSON 字符串)取positionInfo.blockID→ 命中 block 树 →valid+relation_exact,否则parent_resource_exact或降级parent_type∈ {SHEET_BLOCK, BITABLE_BLOCK, WHITEBOARD_BLOCK} → 用parent_token匹配嵌入 block,降级为structural+parent_resource_exactis_whole=true→valid+whole_document--order=created时改为按创建时间。实测验证(线上真实 docx)
某 docx(原始 22 条用户可见 + 7 条孤立锚 = 29 条):
已知错误误判 valid、P2/P3 事件聚类触发误判 orphan)产品推测是文档编辑后该 block 已删)[Sticky note]#19(用户 UI 是#17,微差源于新评论挤压)体验驱动多行 callout#2(取首个文本出现位置)#18已知错误(短锚多处出现)content_deleted=true正确归入 orphan已知限制
parent_resource_exact而非声称内部精确顺序——由 sheet / bitable / whiteboard skill 下钻负责file_type=docx);其它 file_type 暂不通过need_relation精确定位,需走资源对应 skill为什么是新 shortcut 而非给原命令加 flag
drive file.comments list现有行为 / 输出结构,避免破坏依赖原始计数的脚本。shortcuts/drive下唯一与评论有关的 shortcut 是+add-comment;新增+list-comments为后续+resolve-comment/+update-reply等读改类 shortcut 建立命名 pattern。+search/+inspect等先例,默认就给"用户最常想要的那个口径"。Test Plan
go test -race -count=1 ./shortcuts/drive/...go test -race -count=1 ./tests/cli_e2e/drive -run TestDriveListComments(dry-run E2E)./lark-cli drive +list-comments --doc <real-docx-url>→ 22 valid + 2 structural,组成与 UI 侧栏一致(细节见上"实测验证")gofmt -l shortcuts/drive/drive_list_comments.go shortcuts/drive/drive_list_comments_test.go tests/cli_e2e/drive/drive_list_comments_*无输出go vet ./shortcuts/drive/go mod tidy不改 go.mod / go.sumgolangci-lint run --new-from-rev=origin/main无新增 issueLARK_DRIVE_LIST_COMMENTS_E2E=1 go test ./tests/cli_e2e/drive -run TestDriveListCommentsWorkflow -v(user 身份)通过,含创建临时 docx → 加 block 评论 → list 断言 relation → 清理Related Issues
🤖 Generated with Claude Code