Skip to content

feat(multi-user): port park-cli multi-user surface (Users[]/CurrentUs…#1257

Open
kyeliu99 wants to merge 1 commit into
larksuite:mainfrom
kyeliu99:lrz/feat4multi-user
Open

feat(multi-user): port park-cli multi-user surface (Users[]/CurrentUs…#1257
kyeliu99 wants to merge 1 commit into
larksuite:mainfrom
kyeliu99:lrz/feat4multi-user

Conversation

@kyeliu99
Copy link
Copy Markdown

@kyeliu99 kyeliu99 commented Jun 4, 2026

Summary / 概述

EN: Port the multi-user surface from park-cli into lark-cli — per-profile Users[] with an active CurrentUser, a holder-mismatch gate around auth login, R2 schema-version-aware config I/O,
and a sidecar/user-index sweep on every UAT remove path. No park-cli changes; existing single-user behavior preserved.

中文: 将 park-cli 的 multi-user 能力适配进 lark-cli —— 每个 profile 持有 Users[] 与一个 active CurrentUser,在 auth login 上加身份持有者(holder)校验门,配置读写感知 R2 schema 版本,并在所有
UAT 删除路径上同步清理 sidecar 与 user index。不改动 park-cli,不破坏现有单用户行为。

What's in / 主要变更

EN

  • Identity model: AppConfig gains Users []AppUser + CurrentUser; resolution priority is --user > LARKSUITE_CLI_OPEN_ID > CurrentUser > Users[0].
  • Holder gate: auth login validates the freshly-authorized open_id against the operator-named (or implied) holder. Hard-abort on --user/env mismatch (typo / phishing guard); soft stderr
    advisory + structured holder_mismatch_warning JSON field on implied-holder mismatch (legacy "logout-and-login-as-someone-else" stays unbroken).
  • R2 forward-compat: core.LoadMultiAppConfig returns ConfigError end-to-end; bind/show/list/auth paths preserve transparency. Bind takes the SingleUser flock on the target workspace;
    read-only show skips it.
  • Sweep discipline: auth logout / users remove / profile remove / config bind rebind all sweep keychain UAT, sidecar, and user_index.json together; no orphans.
  • NDJSON observability: auth login --json emits a documented NDJSON stream; authorization_complete carries typed holder_mismatch_warning so headless agents can branch without parsing
    stderr.
  • Security: human-readable strings copied into the JSON message field are sanitized (ANSI/C0/DEL/dangerous Unicode); typed identity fields stay raw bytes for programmatic consumers.
  • What's NOT here: per-user DEK rekey (Step 12) — declined because park-cli's KEK design assumes a long-running agent process while lark-cli is one-shot.
    memo.

中文

  • 身份模型: AppConfig 增加 Users []AppUser + CurrentUser;解析优先级 --user > LARKSUITE_CLI_OPEN_ID > CurrentUser > Users[0]。
  • Holder gate: auth login 校验本次授权 open_id 与操作者声明(或隐含)的 holder。--user / env 不一致硬中止(防止打错或钓鱼跳转);隐含 holder(来自 CurrentUser)不一致只发软告警(stderr WARN + JSON
    holder_mismatch_warning 结构化字段),保留旧版"登出再登另一人"工作流。
  • R2 前向兼容: core.LoadMultiAppConfig 全程返回 ConfigError;bind/show/list/auth 各路径透传,不被吞掉。bind 在目标 workspace 上 flock,只读 show 跳过 flock。
  • 删除路径清扫: auth logout / users remove / profile remove / config bind rebind 同时清理 keychain UAT、sidecar 与 user_index.json,不留孤儿。
  • NDJSON 可观测: auth login --json 是 NDJSON 流(契约写进 SKILL.md);authorization_complete 行带 holder_mismatch_warning 类型化字段,headless agent 不用解析 stderr 即可分支。
  • 安全: 进入 JSON message 的人类可读字符串经过净化(ANSI/C0/DEL/危险 Unicode);类型化身份字段保持原始字节,留给程序消费者自行处理。
  • 故意不做: per-user DEK rekey(Step 12)—— park-cli 的 KEK 设计依赖长进程 agent,lark-cli 是 one-shot,适配会破坏脚本化体验且无安全收益。

Summary by CodeRabbit

  • New Features

    • Added multi-user support: manage multiple login credentials per profile with auth users list, auth users use, and auth users logout commands
    • Added --user flag and LARKSUITE_CLI_OPEN_ID environment variable for user selection during login and commands
    • Introduced login holder verification with advisory warnings when re-logging a different user to prevent accidental active user changes
  • Bug Fixes

    • Fixed concurrent configuration access across login, logout, and profile operations
    • Fixed keychain and sidecar artifact cleanup when switching or removing users
    • Fixed profile fallback behavior to prevent unintended command routing
  • Documentation

    • Added JSON output contract documentation for auth login --json including holder mismatch warnings

…er, holder-gate, R2-aware config, sidecar sweep) into lark-cli
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


liurunzhou seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Introduces multi-user auth: core schema/versioning, new local auth storage (KV, tokens, locks), user index/profile, migration, holder mismatch gate, users subcommands (list/use/logout), login/logout refactors with JSON warnings, config/profile commands serialized with locks, credential/bootstrap user override, and consistent R2 error/hint routing.

Changes

Multi-user auth + storage + migration + CLI integration

Layer / File(s) Summary
Complete multi-user stack and commands
internal/core/*, internal/auth/*, internal/migrate/*, internal/credential/*, internal/cmdutil/*, cmd/auth/*, cmd/config/*, cmd/profile/*, cmd/global_flags.go, cmd/bootstrap.go, internal/envvars/envvars.go, internal/validate/sanitize.go, skills/lark-shared/SKILL.md
Adds multi-user schema/versioning, local storage (KV/tokens/locks), user index/profile, migration and once-at-bootstrap execution, credential/user override wiring, auth holder gate and users subcommands, login/logout refactors with JSON holder warning, config/profile commands serialized with file locks and R2 error passthrough, extensive tests.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant CLI
  participant CoreConfig
  participant LocalAuth
  participant OAuth
  User->>CLI: auth login --user/--json
  CLI->>CoreConfig: Resolve profile (login mode)
  CLI->>OAuth: Device flow (authorize + token)
  OAuth-->>CLI: Access/refresh token
  CLI->>OAuth: GetUserInfo (open_id/union_id/name)
  CLI->>CoreConfig: Resolve holder (flag/env/current)
  CLI->>CLI: Verify holder (warn or abort)
  CLI->>LocalAuth: Store tokens (rollback on failure)
  CLI->>LocalAuth: Upsert user profile + user index (under lock)
  CLI-->>User: Success JSON (+holder_mismatch_warning when soft)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • larksuite/cli#252 — Touches auth login flow and selection logic in cmd/auth/login.go similar to this PR’s login refactor.
  • larksuite/cli#728 — Aligns “not configured” hinting used by auth list paths updated here.
  • larksuite/cli#515 — Shares config bind and workspace isolation areas affected by this PR’s bind R2/locking changes.

Suggested labels

domain/mail, domain/im, size/L

Suggested reviewers

  • liangshuo-1

Poem

A rabbit taps the lock, tick-tock—no race,
Multi-burrows mapped for each user’s place.
Tokens tucked safe, profiles in a row,
Holder warns softly: “mind where you go.”
Schemas hop forward, hints light the trail—
JSON squeaks “all set!” with a thump of the tail. 🐇✨

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

@github-actions github-actions Bot added the size/XL Architecture-level or global-impact change label Jun 4, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 20

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/core/config.go (1)

188-214: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restore the "default" profile alias in app lookup helpers.

This drops the existing FindApp("default") contract, so selector paths that still rely on "default" resolving to the active profile will now miss or treat it as a literal name. FindAppIndex should stay in sync too.

Suggested fix
 func (m *MultiAppConfig) FindApp(name string) *AppConfig {
+	if name == "default" && m.CurrentApp != "" && m.CurrentApp != "default" {
+		if app := m.FindApp(m.CurrentApp); app != nil {
+			return app
+		}
+	}
 	for i := range m.Apps {
 		if m.Apps[i].Name != "" && m.Apps[i].Name == name {
 			return &m.Apps[i]
 		}
 	}
@@
 func (m *MultiAppConfig) FindAppIndex(name string) int {
+	if name == "default" && m.CurrentApp != "" && m.CurrentApp != "default" {
+		return m.FindAppIndex(m.CurrentApp)
+	}
 	for i := range m.Apps {
 		if m.Apps[i].Name != "" && m.Apps[i].Name == name {
 			return i
 		}
 	}

Based on learnings, in larksuite/cli internal/core/config.go, MultiAppConfig.FindApp treats the string "default" as an alias for the currently active app.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/core/config.go` around lines 188 - 214, FindApp and FindAppIndex
dropped support for the special alias "default"; restore it by detecting if the
incoming name == "default" at the top of both MultiAppConfig.FindApp and
MultiAppConfig.FindAppIndex and replacing name with the currently active app
identifier (use the MultiAppConfig field that holds the active app, e.g.
m.Active or m.ActiveApp) before running the existing loops over m.Apps; ensure
both functions use the same replacement logic so they remain in sync.
cmd/auth/login.go (1)

433-438: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Mirror the immediate-path JSON failure contract in --device-code.

When opts.JSON is true, this branch currently returns errs.NewAuthenticationError(...), so the root handler emits a typed error envelope on stderr. The immediate device-flow path already writes an authorization_failed event to stdout and returns output.ErrBare(output.ExitAuth). That asymmetry breaks the documented NDJSON contract for the headless --no-wait--device-code flow and corrupts 2>&1 pipelines that expect JSON only on stdout.

Suggested fix
 	if !result.OK {
 		if shouldRemoveLoginRequestedScope(result) {
 			cleanupRequestedScope()
 		}
+		if opts.JSON {
+			enc := json.NewEncoder(f.IOStreams.Out)
+			enc.SetEscapeHTML(false)
+			if err := enc.Encode(map[string]interface{}{
+				"event": "authorization_failed",
+				"error": result.Message,
+			}); err != nil {
+				return errs.NewInternalError(errs.SubtypeSDKError, "failed to write JSON output: %v", err).WithCause(err)
+			}
+			return output.ErrBare(output.ExitAuth)
+		}
 		return errs.NewAuthenticationError(errs.SubtypeUnknown, "authorization failed: %s", result.Message)
 	}

As per coding guidelines, **/*.go: stdout is data (JSON envelopes), stderr is everything else (progress, warnings, hints). Never mix them as it corrupts pipe chains.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/login.go` around lines 433 - 438, When opts.JSON is true and the
authorization branch sees !result.OK, don't return errs.NewAuthenticationError
(which sends a typed error on stderr); instead mirror the device-code
immediate-path: if shouldRemoveLoginRequestedScope(result) call
cleanupRequestedScope(), then emit the "authorization_failed" NDJSON envelope to
stdout (same envelope shape used by the device-code path) and return
output.ErrBare(output.ExitAuth). For non-JSON paths keep the existing
errs.NewAuthenticationError behavior. Locate this logic around the !result.OK
check in login.go and reuse the same event-formatting and output.ErrBare return
used by the device-code flow.
🧹 Nitpick comments (4)
internal/auth/purge_test.go (1)

175-192: ⚡ Quick win

Exercise a real failing leg here instead of asserting on a literal.

This test never makes PurgeUserArtifacts build an aggregated error; it only proves that a hand-written string contains "; ". If the implementation regresses to returning the first error only, this still stays green. A tiny fake Root/KV that forces DeleteUserProfileFor or DeleteUser to fail would lock the actual contract instead of a placeholder string.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/auth/purge_test.go` around lines 175 - 192, The test currently only
asserts a hand-written string and doesn't exercise PurgeUserArtifacts's real
error-aggregation path; replace the placeholder check by constructing a failing
Root/KV (or small fake) that makes DeleteUserProfileFor and DeleteUser return
distinct errors, call PurgeUserArtifacts(root, "cli_app", "ou_alice") against
that fake root (after keyring.MockInit()), and assert the returned error is
non-nil and its Error() contains both component messages joined with "; " to
lock the aggregation behavior of PurgeUserArtifacts and the
DeleteUserProfileFor/DeleteUser legs.
cmd/auth/users_use_profile_fallback_test.go (1)

59-64: ⚡ Quick win

Use cmdutil.TestFactory for these command-path tests.

These cases call authUsersUseRun, so the direct &cmdutil.Factory{...} setup is bypassing the standard unit-test factory wiring. Switching to cmdutil.TestFactory(t, config) will keep these tests aligned with the rest of the suite as Factory evolves.

As per coding guidelines, Use cmdutil.TestFactory(t, config) for test factories in unit tests.

Also applies to: 110-113

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/users_use_profile_fallback_test.go` around lines 59 - 64, Replace
the manual test factory construction that creates a &cmdutil.Factory (setting
IOStreams and Invocation.Profile) with the standard helper
cmdutil.TestFactory(t, config) so the authUsersUseRun tests use the same test
wiring as the suite; locate the test setup where streams := &cmdutil.IOStreams
and f := &cmdutil.Factory{IOStreams: streams, Invocation:
cmdutil.InvocationContext{Profile: "ghost"}} and instantiate f via
cmdutil.TestFactory(t, /* appropriate config map or nil */) and then set the
profile on the returned Factory (or pass it via the config) before calling
authUsersUseRun to ensure consistent factory behavior across tests.
cmd/auth/users_test.go (1)

24-29: ⚡ Quick win

Use the shared fake-home helper in this test fixture.

This helper only overrides HOME, so Windows runs can still resolve USERPROFILE and leak auth/storage state outside the temp dir. Please switch this to the repo’s setFakeOSHome(t, dir) helper.

Based on learnings, use a shared setFakeOSHome(t, dir) helper that sets both the HOME and USERPROFILE environment variables via t.Setenv.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/users_test.go` around lines 24 - 29, In stubUsersConfig replace the
direct t.Setenv("HOME", t.TempDir()) call with the repo helper that sets both
HOME and USERPROFILE: create a tempDir := t.TempDir() (or reuse the existing
temp dir used for LARKSUITE_CLI_CONFIG_DIR), then call setFakeOSHome(t,
tempDir); this ensures both HOME and USERPROFILE are set for Windows CI while
leaving the rest of stubUsersConfig (including keyring.MockInit and
LARKSUITE_CLI_CONFIG_DIR) unchanged.
cmd/auth/users_logout_profile_fallback_test.go (1)

47-52: ⚡ Quick win

Prefer cmdutil.TestFactory over hand-rolled factories in these tests.

These tests execute config/auth command paths, so &cmdutil.Factory{...} can drift from the canonical wiring the rest of the suite gets for IO, keychain, and future factory defaults. cmdutil.TestFactory(t, config) is the safer fixture here, and you can still override Invocation.Profile or buffers afterward.

As per coding guidelines, Use cmdutil.TestFactory(t, config) for test factories in unit tests.

Also applies to: 91-94, 129-132

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/users_logout_profile_fallback_test.go` around lines 47 - 52, Replace
the hand-constructed factory in the test with the canonical test fixture: use
cmdutil.TestFactory(t, config) instead of &cmdutil.Factory{...}; after creating
the test factory you can set its IOStreams buffers and override
Invocation.Profile (e.g. set f.IOStreams = streams and f.Invocation.Profile =
"ghost") so the test uses the standard wiring for IO, keychain and other
defaults; apply the same change for the other occurrences referenced (lines
around 91-94 and 129-132) to ensure consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/auth/login_holder.go`:
- Around line 123-176: In verifyHolder, sanitize holderOpenId and freshOpenId
before using them in human-facing strings: call the existing display-sanitizer
(or add one, e.g., sanitizeDisplay) on the IDs and on formatHolderLabel inputs
and use those sanitized values when building hints, the errs.NewConfigError
messages, and the soft-warning message string; keep the original raw
holderOpenId/freshOpenId in the holderMismatchWarning struct fields and in any
structured/error fields, but never interpolate the raw IDs into message or hint
text (update the code paths that call errs.NewConfigError, the "flag" and "env"
hint construction, and the "" branch message construction to use the sanitized
variants).

In `@cmd/auth/login_postflock_r2_test.go`:
- Around line 35-37: The test currently sets LARKSUITE_CLI_CONFIG_DIR and HOME
but not USERPROFILE, which makes Windows runs nondeterministic; update the setup
in cmd/auth/login_postflock_r2_test.go so the fake home is deterministic by
either calling the shared helper setFakeOSHome(t, dir) or by adding
t.Setenv("USERPROFILE", dir) immediately after t.Setenv("HOME", ...); ensure you
still set LARKSUITE_CLI_CONFIG_DIR to dir.

In `@cmd/auth/logout_test.go`:
- Around line 206-210: The test currently discards errors from setup calls
(larkauth.SetStoredToken, larkauth.SaveUserProfileFor,
larkauth.RecordUserActivity) which can hide fixture failures; change each
ignored call to capture its error and call t.Fatalf on non-nil (e.g. if err :=
larkauth.SetStoredToken(...); err != nil { t.Fatalf("SetStoredToken failed: %v",
err) }) and do the same for SaveUserProfileFor and RecordUserActivity (keep ctx
:= larkauth.ForUser(u.app, u.oid) as-is).
- Around line 22-28: Replace direct t.Setenv("HOME", ...) usage with the repo
test helper setFakeOSHome(t, dir) so both Unix and Windows home envs are
consistently set; in this file update stubLogoutConfig (and the other similar
call sites where HOME is set: the other setup helpers around the second and
third occurrences) to call setFakeOSHome(t, cfgDir) (or the appropriate temp
dir) after keyring.MockInit(), ensuring tests use the shared helper that sets
HOME and USERPROFILE.

In `@cmd/auth/logout.go`:
- Around line 78-79: The current code captures profileName from
app.ProfileName() before acquiring the flock, which can become stale; instead,
only snapshot an explicit --profile flag (detect the flag source), and for the
normal path recompute the target app after the lock using the locked multi state
(i.e., call FindAppIndex against the post-lock multi to select the active
profile). Concretely: remove/avoid using the pre-lock profileName for the
non-explicit case, detect whether a profile was explicitly passed, and if not
call multi (the locked snapshot) to resolve the active app before calling
FindAppIndex and performing the logout operations.

In `@cmd/auth/users_logout.go`:
- Around line 125-139: The removal of the keychain token
(larkauth.RemoveStoredToken) happens before persisting the updated multi-app
config (core.SaveMultiAppConfig), which can leave config.json pointing to a user
with no token if the save fails; change the sequence so you remove the victim
from app.Users and clear app.CurrentUser as currently done, then call
core.SaveMultiAppConfig(multi) first and only after a successful save call
larkauth.RemoveStoredToken(app.AppId, victim.UserOpenId); keep the existing
warning behavior if RemoveStoredToken fails (log to f.IOStreams.ErrOut) but do
not remove the token unless the config save succeeded.

In `@cmd/auth/users_use.go`:
- Around line 103-108: The post-lock reload can return nil/empty causing a panic
when dereferencing multi; after calling core.LoadMultiAppConfig() check that
multi is non-nil and that multi.FindAppIndex(app.ProfileName()) is valid before
using it, and if not return
core.PassThroughOrNotConfigured(errOrNewNotConfiguredError) (i.e., guard the
dereference of multi and handle the not-configured path), updating the block
around LoadMultiAppConfig, multi, FindAppIndex and the subsequent use of idx
accordingly.

In `@cmd/bootstrap_user_test.go`:
- Around line 19-199: Tests call BootstrapInvocationContext which triggers
runMigrationOnce; set t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir()) at the
start of each test (or in clearUserEnv helper) to isolate config state, and
stub/reset the migration sync.Once by replacing runMigrationOnce with a no-op
per test or resetting the migrateOnce variable before invoking
BootstrapInvocationContext so each test starts with a fresh migration state;
reference BootstrapInvocationContext, runMigrationOnce and migrateOnce to locate
where to apply the env setup and the stub/reset.

In `@cmd/config/bind_lock_path_test.go`:
- Around line 47-48: Replace direct t.Setenv("HOME", ...) calls in the tests
with a helper that sets both HOME and USERPROFILE to the same temp dir;
implement a helper like setFakeOSHome(t testing.TB, dir string) that calls
t.Setenv("HOME", dir) and t.Setenv("USERPROFILE", dir), then use
setFakeOSHome(t, t.TempDir()) in place of t.Setenv("HOME", ...) in the test body
(also update the other occurrences around the second fixture at the later
lines). Ensure the existing t.Setenv("LARKSUITE_CLI_CONFIG_DIR", configDir)
remains unchanged.

In `@cmd/config/bind_r2_test.go`:
- Around line 35-45: The test currently hardcodes a future schema version as the
literal 2 when writing the hermes fixture (see hermesDir/ hermesConfigPath setup
in bind_r2_test.go); change the fixture to compute the future version as
core.CurrentSchemaVersion+1 and use that numeric value in the JSON payload and
any corresponding assertion messages (also update the other fixture at the
second occurrence around lines 73-75) so the test remains forward-compatible
when CurrentSchemaVersion is bumped. Use core.CurrentSchemaVersion+1 (formatted
into the JSON string or built with a small struct marshaled to JSON) wherever
the literal 2 is used for the “future” schema version and update assertions that
mention the version accordingly.
- Around line 50-58: The test sets up HERMES_HOME and calls configBindRun with
BindOptions{Factory: f, Source: "hermes"} but does not clear ambient agent env
vars that finalizeSource checks, so a CI/dev environment with OPENCLAW_* or
LARK_CHANNEL_* will cause a source-mismatch before the R2 path runs; before
writing the .env and invoking configBindRun (and similarly in the block around
lines 112-120), explicitly unset or clear any detection env used by
finalizeSource (e.g., remove OPENCLAW_* and LARK_CHANNEL_* variables from
os.Environ via os.Unsetenv for each OPENCLAW_* and LARK_CHANNEL_* key) so the
test environment deterministically pins Hermes and exercises the R2 path when
calling configBindRun/BindOptions and finalizeSource.

In `@cmd/config/bind_uat_cleanup_test.go`:
- Around line 33-180: Tests call cleanupKeychainFromData which uses
larkauth.NewLocalRoot(core.GetConfigDir()) and therefore mutates the real config
root; isolate the config dir in each test by setting the env var before calling
cleanupKeychainFromData (e.g. call t.Setenv("LARKSUITE_CLI_CONFIG_DIR",
t.TempDir()) at the top of each test) and keep keyring.MockInit() as well, or
add a small helper (e.g. initTestEnv or similar) that performs both
keyring.MockInit() and t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir()) then
use that helper in all tests in this file.

In `@cmd/config/init_flock_test.go`:
- Around line 41-42: These tests call
core.SetCurrentWorkspace(core.WorkspaceLocal) and never restore global state;
capture the previous workspace (e.g., prev := core.CurrentWorkspace()) before
changing it and register a cleanup to restore it (t.Cleanup(func(){
core.SetCurrentWorkspace(prev) })) immediately after setting the env and before
mutating state; apply this pattern in each test/subtest that calls
core.SetCurrentWorkspace (including the occurrences around the lines shown) so
global workspace is reset after the test.

In `@internal/auth/context.go`:
- Around line 69-113: The sanitizer sanitizeOpenIdForPath used by
AuthContext.sanitizedUserOpenId and sanitizedAppId is lossy and can alias
different IDs; change it to produce a filesystem-safe but collision-resistant
segment by preserving the current sanitized prefix and appending a stable hash
suffix (e.g., "-" + hex of SHA256 of the original trimmed string) so the result
remains safe for joins but is unique per original ID; keep the empty-string ->
"_" behavior and the same allowed characters for the prefix, and update any
callers that rely on sanitizeOpenIdForPath (e.g., storage path resolution) to
use the new non-lossy output.
- Around line 43-47: ForUser currently returns an AuthContext with an empty
appId but a non-empty userOpenId which creates an invalid state; change ForUser
to detect when strings.TrimSpace(appId) == "" and in that case return the
zero-value AuthContext (e.g. return AuthContext{}), otherwise return the trimmed
values as before. Update the ForUser function so empty/blank appId collapses to
the single-user (zero) context.

In `@internal/auth/storage.go`:
- Around line 346-373: The looksLikeJSON check in keychainTokenStore.saveSlot is
too weak (it only checks first non-space byte via looksLikeJSON) and lets
malformed JSON like "{not json" be persisted; replace the heuristic with a
proper JSON validation (e.g., call json.Valid(envelope) or attempt
json.Unmarshal into an interface{}) inside saveSlot (or update looksLikeJSON to
use json.Valid) and return a clear error (e.g., "auth: token envelope is not
valid JSON") when validation fails before calling keychain.Set.

In `@internal/auth/user_index.go`:
- Around line 142-145: The code in the json.Unmarshal error path writes a raw
warning to os.Stderr (fmt.Fprintf(os.Stderr,...)) which is forbidden from
internal/; change the logic in the UserIndex load/parse path to emit the warning
via a caller-supplied writer or logger instead (e.g., accept an io.Writer or
logging hook on the UserIndex loader or package-level variable and use that
instead of os.Stderr), update the error branch that currently calls fmt.Fprintf
to call the provided writer/log function with the same message and retain the
same return (UserIndex{Users: map[string]UserIndexEntry{}}, nil).

In `@internal/core/config.go`:
- Around line 375-383: The error message produced when app == nil in
CurrentAppConfig currently reports profile %q using profileOverride, which can
be empty and hide a stale raw.CurrentApp selector; update the ConfigError
construction in the app == nil branch (the block that returns &ConfigError{...})
to detect when profileOverride == "" and instead include raw.CurrentApp (or both
raw.CurrentApp and profileOverride) in the Message/Hint so the error reports the
dangling CurrentApp value; adjust the Message and/or Hint fields within that
ConfigError to clearly show the stale selector and available profiles.

In `@internal/migrate/migrate.go`:
- Around line 47-50: The current check collapses all failures from
core.LoadMultiAppConfig() into a noOp; change it to propagate actual errors and
only no-op for the benign "no config yet" case (or explicit
forward-incompatible/already-current cases handled later). Concretely, update
the logic around core.LoadMultiAppConfig() so that if err != nil you
return/propagate that err (instead of returning noOp), and only return noOp when
err == nil and multi == nil (or when your explicit schema-version checks later
determine a safe no-op); apply the same fix to the similar branch around lines
68-70 so parse/permission/corruption errors are not swallowed.

In `@internal/validate/sanitize.go`:
- Around line 17-42: The comment overstates OSC handling: SanitizeForTerminal
currently strips OSC sequences terminated by BEL (ESC ] ... BEL) but not the ST
form (ESC ] ... ESC \), so the C0 sweep can remove the ESC and leave the OSC
payload visible; update the ansiEscape logic used by SanitizeForTerminal to
recognize and consume both OSC terminators (BEL and the two-byte ST sequence ESC
followed by backslash) — e.g. extend the existing regex/parse in ansiEscape to
match ESC ] .*? (BEL|ESC\\) so the entire OSC payload is removed and not left
for the C0 sweep; alternatively, if you choose not to change behavior, shorten
the comment to avoid claiming full OSC coverage.

---

Outside diff comments:
In `@cmd/auth/login.go`:
- Around line 433-438: When opts.JSON is true and the authorization branch sees
!result.OK, don't return errs.NewAuthenticationError (which sends a typed error
on stderr); instead mirror the device-code immediate-path: if
shouldRemoveLoginRequestedScope(result) call cleanupRequestedScope(), then emit
the "authorization_failed" NDJSON envelope to stdout (same envelope shape used
by the device-code path) and return output.ErrBare(output.ExitAuth). For
non-JSON paths keep the existing errs.NewAuthenticationError behavior. Locate
this logic around the !result.OK check in login.go and reuse the same
event-formatting and output.ErrBare return used by the device-code flow.

In `@internal/core/config.go`:
- Around line 188-214: FindApp and FindAppIndex dropped support for the special
alias "default"; restore it by detecting if the incoming name == "default" at
the top of both MultiAppConfig.FindApp and MultiAppConfig.FindAppIndex and
replacing name with the currently active app identifier (use the MultiAppConfig
field that holds the active app, e.g. m.Active or m.ActiveApp) before running
the existing loops over m.Apps; ensure both functions use the same replacement
logic so they remain in sync.

---

Nitpick comments:
In `@cmd/auth/users_logout_profile_fallback_test.go`:
- Around line 47-52: Replace the hand-constructed factory in the test with the
canonical test fixture: use cmdutil.TestFactory(t, config) instead of
&cmdutil.Factory{...}; after creating the test factory you can set its IOStreams
buffers and override Invocation.Profile (e.g. set f.IOStreams = streams and
f.Invocation.Profile = "ghost") so the test uses the standard wiring for IO,
keychain and other defaults; apply the same change for the other occurrences
referenced (lines around 91-94 and 129-132) to ensure consistency.

In `@cmd/auth/users_test.go`:
- Around line 24-29: In stubUsersConfig replace the direct t.Setenv("HOME",
t.TempDir()) call with the repo helper that sets both HOME and USERPROFILE:
create a tempDir := t.TempDir() (or reuse the existing temp dir used for
LARKSUITE_CLI_CONFIG_DIR), then call setFakeOSHome(t, tempDir); this ensures
both HOME and USERPROFILE are set for Windows CI while leaving the rest of
stubUsersConfig (including keyring.MockInit and LARKSUITE_CLI_CONFIG_DIR)
unchanged.

In `@cmd/auth/users_use_profile_fallback_test.go`:
- Around line 59-64: Replace the manual test factory construction that creates a
&cmdutil.Factory (setting IOStreams and Invocation.Profile) with the standard
helper cmdutil.TestFactory(t, config) so the authUsersUseRun tests use the same
test wiring as the suite; locate the test setup where streams :=
&cmdutil.IOStreams and f := &cmdutil.Factory{IOStreams: streams, Invocation:
cmdutil.InvocationContext{Profile: "ghost"}} and instantiate f via
cmdutil.TestFactory(t, /* appropriate config map or nil */) and then set the
profile on the returned Factory (or pass it via the config) before calling
authUsersUseRun to ensure consistent factory behavior across tests.

In `@internal/auth/purge_test.go`:
- Around line 175-192: The test currently only asserts a hand-written string and
doesn't exercise PurgeUserArtifacts's real error-aggregation path; replace the
placeholder check by constructing a failing Root/KV (or small fake) that makes
DeleteUserProfileFor and DeleteUser return distinct errors, call
PurgeUserArtifacts(root, "cli_app", "ou_alice") against that fake root (after
keyring.MockInit()), and assert the returned error is non-nil and its Error()
contains both component messages joined with "; " to lock the aggregation
behavior of PurgeUserArtifacts and the DeleteUserProfileFor/DeleteUser legs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 460fbb10-b643-4c14-b8d0-0298b41c4426

📥 Commits

Reviewing files that changed from the base of the PR and between 8ce3879 and d81b436.

📒 Files selected for processing (89)
  • cmd/auth/auth.go
  • cmd/auth/list.go
  • cmd/auth/list_r2_test.go
  • cmd/auth/login.go
  • cmd/auth/login_config_test.go
  • cmd/auth/login_holder.go
  • cmd/auth/login_holder_gate_test.go
  • cmd/auth/login_holder_sanitize_test.go
  • cmd/auth/login_holder_test.go
  • cmd/auth/login_holder_username_test.go
  • cmd/auth/login_legacy_workflow_test.go
  • cmd/auth/login_new_user_test.go
  • cmd/auth/login_postflock_r2_test.go
  • cmd/auth/login_result.go
  • cmd/auth/login_result_test.go
  • cmd/auth/login_step8_test.go
  • cmd/auth/login_test.go
  • cmd/auth/logout.go
  • cmd/auth/logout_test.go
  • cmd/auth/users.go
  • cmd/auth/users_list.go
  • cmd/auth/users_logout.go
  • cmd/auth/users_logout_active_user_test.go
  • cmd/auth/users_logout_profile_fallback_test.go
  • cmd/auth/users_test.go
  • cmd/auth/users_use.go
  • cmd/auth/users_use_profile_fallback_test.go
  • cmd/bootstrap.go
  • cmd/bootstrap_user_test.go
  • cmd/config/bind.go
  • cmd/config/bind_lock_path_test.go
  • cmd/config/bind_r2_test.go
  • cmd/config/bind_uat_cleanup_test.go
  • cmd/config/config_test.go
  • cmd/config/default_as.go
  • cmd/config/init.go
  • cmd/config/init_currentuser_test.go
  • cmd/config/init_flock_test.go
  • cmd/config/remove.go
  • cmd/config/remove_sweep_test.go
  • cmd/config/show.go
  • cmd/config/show_path_lock_skip_test.go
  • cmd/config/show_r2_test.go
  • cmd/config/strict_mode.go
  • cmd/global_flags.go
  • cmd/global_flags_test.go
  • cmd/profile/add.go
  • cmd/profile/list.go
  • cmd/profile/profile_list_currentuser_test.go
  • cmd/profile/profile_remove_self_toggle_test.go
  • cmd/profile/profile_test.go
  • cmd/profile/remove.go
  • cmd/profile/remove_sweep_test.go
  • cmd/profile/rename.go
  • cmd/profile/use.go
  • internal/auth/context.go
  • internal/auth/context_test.go
  • internal/auth/purge.go
  • internal/auth/purge_test.go
  • internal/auth/storage.go
  • internal/auth/storage_test.go
  • internal/auth/user_index.go
  • internal/auth/user_index_test.go
  • internal/auth/user_profile.go
  • internal/auth/user_profile_test.go
  • internal/cmdutil/factory.go
  • internal/cmdutil/factory_default.go
  • internal/cmdutil/factory_default_step7_test.go
  • internal/core/config.go
  • internal/core/config_multiuser_test.go
  • internal/core/config_step6_test.go
  • internal/core/config_test.go
  • internal/core/errors.go
  • internal/core/notconfigured.go
  • internal/core/passthrough_or_notconfigured_test.go
  • internal/core/require_config_r2_test.go
  • internal/core/resolve_login_test.go
  • internal/credential/decorate_user_resolution_test.go
  • internal/credential/default_provider.go
  • internal/credential/default_provider_r2_test.go
  • internal/credential/default_provider_step7_test.go
  • internal/credential/integration_test.go
  • internal/envvars/envvars.go
  • internal/errcompat/promote.go
  • internal/errcompat/promote_r2_test.go
  • internal/migrate/migrate.go
  • internal/migrate/migrate_test.go
  • internal/validate/sanitize.go
  • skills/lark-shared/SKILL.md

Comment thread cmd/auth/login_holder.go
Comment on lines +123 to +176
func verifyHolder(holderOpenId, holderUserName, holderSource, freshOpenId, freshUserName string) (*holderMismatchWarning, error) {
if holderOpenId == "" || holderOpenId == freshOpenId {
return nil, nil
}

switch holderSource {
case "flag":
hint := "you passed --user " + holderOpenId + " but the device you authorized was " + freshOpenId +
". Re-run with `--user " + freshOpenId + "` to register the user you actually authorized," +
" or re-authorize on a device signed in as " + holderOpenId + "."
return nil, errs.NewConfigError(errs.SubtypeInvalidArgument,
"login holder mismatch: requested user %q but authorized user is %q",
holderOpenId, freshOpenId).
WithHint(hint)
case "env":
hint := "LARKSUITE_CLI_OPEN_ID is set to " + holderOpenId + " but the device you authorized was " + freshOpenId +
" — unset the env var or re-run with `--user " + freshOpenId + "`."
return nil, errs.NewConfigError(errs.SubtypeInvalidArgument,
"login holder mismatch: requested user %q but authorized user is %q",
holderOpenId, freshOpenId).
WithHint(hint)
case "":
// Implied holder (AppConfig.CurrentUser). Soft mismatch — emit a
// warning and let login proceed. The Message must:
// 1. NOT recommend "re-run without --user" (the operator already
// did not pass --user; that hint is self-contradictory).
// 2. Tell the operator the new active-user semantics: the fresh
// user is appended, the active user is unchanged, and the
// explicit switch command.
// 3. Render "Alice (ou_alice)" when display names are available
// so the human reader can map open_ids to people; fall back
// to bare open_id when the name is unknown.
holderLabel := formatHolderLabel(holderUserName, holderOpenId)
freshLabel := formatHolderLabel(freshUserName, freshOpenId)
message := "[lark-cli] [WARN] auth login: the active profile's currentUser is " + holderLabel +
" but the device you authorized was " + freshLabel +
". Login will proceed and add " + freshLabel + " to the profile;" +
" the active user stays " + holderLabel +
". Run `lark-cli auth users use " + freshOpenId + "` to switch the active user."
return &holderMismatchWarning{
HolderOpenId: holderOpenId,
HolderUserName: holderUserName,
FreshOpenId: freshOpenId,
FreshUserName: freshUserName,
Message: message,
}, nil
default:
// Fail-closed: an unknown source is a programmer bug at the
// resolver, not the operator's fault. Surface it as an internal
// error so it gets fixed instead of silently downgrading the
// gate's safety property.
return nil, errs.NewInternalError(errs.SubtypeUnknown,
"verifyHolder: unknown holderSource %q (expected \"\", \"flag\", or \"env\") — cannot decide whether to abort or advise; refusing to proceed",
holderSource)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize open_id before interpolating it into human-readable messages.

These branches still embed raw holderOpenId / freshOpenId into ConfigError.Message, Hint, and the soft advisory text. When lookup misses, holderOpenId can come straight from --user or LARKSUITE_CLI_OPEN_ID, so ANSI/C0/RTL bytes survive into the human-readable stderr/JSON message fields this PR is otherwise sanitizing.

Keep the raw IDs in structured fields only, and sanitize the display copy before building message strings.

Also applies to: 201-205

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/login_holder.go` around lines 123 - 176, In verifyHolder, sanitize
holderOpenId and freshOpenId before using them in human-facing strings: call the
existing display-sanitizer (or add one, e.g., sanitizeDisplay) on the IDs and on
formatHolderLabel inputs and use those sanitized values when building hints, the
errs.NewConfigError messages, and the soft-warning message string; keep the
original raw holderOpenId/freshOpenId in the holderMismatchWarning struct fields
and in any structured/error fields, but never interpolate the raw IDs into
message or hint text (update the code paths that call errs.NewConfigError, the
"flag" and "env" hint construction, and the "" branch message construction to
use the sanitized variants).

Comment on lines +35 to +37
dir := t.TempDir()
t.Setenv("LARKSUITE_CLI_CONFIG_DIR", dir)
t.Setenv("HOME", t.TempDir())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Set USERPROFILE alongside HOME here.

Setting only HOME leaves Windows runs machine-dependent if any code in this path resolves the OS home directory. Please use the shared helper for fake home setup, or set both env vars explicitly.

Based on learnings: use a shared setFakeOSHome(t, dir) helper that sets both HOME and USERPROFILE via t.Setenv; relying on HOME alone leaves Windows nondeterministic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/login_postflock_r2_test.go` around lines 35 - 37, The test currently
sets LARKSUITE_CLI_CONFIG_DIR and HOME but not USERPROFILE, which makes Windows
runs nondeterministic; update the setup in cmd/auth/login_postflock_r2_test.go
so the fake home is deterministic by either calling the shared helper
setFakeOSHome(t, dir) or by adding t.Setenv("USERPROFILE", dir) immediately
after t.Setenv("HOME", ...); ensure you still set LARKSUITE_CLI_CONFIG_DIR to
dir.

Comment thread cmd/auth/logout_test.go
Comment on lines +22 to +28
func stubLogoutConfig(t *testing.T, currentUser string, users []core.AppUser) (*cmdutil.Factory, string, string) {
t.Helper()
keyring.MockInit()
cfgDir := t.TempDir()
t.Setenv("LARKSUITE_CLI_CONFIG_DIR", cfgDir)
t.Setenv("HOME", t.TempDir())

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the shared fake-home helper here.

These setups only override HOME. On Windows, os.UserHomeDir can still resolve through USERPROFILE/HOMEDRIVE/HOMEPATH, so the keyring/file paths stay machine-dependent and the tests can flap. Please switch these call sites to the repo’s setFakeOSHome(t, dir) helper. Based on learnings: "In larksuite/cli Go tests for home-directory resolution ... use a shared setFakeOSHome(t, dir) test helper that sets both the HOME (Unix) and USERPROFILE (Windows) environment variables via t.Setenv."

Also applies to: 87-91, 177-181

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/logout_test.go` around lines 22 - 28, Replace direct
t.Setenv("HOME", ...) usage with the repo test helper setFakeOSHome(t, dir) so
both Unix and Windows home envs are consistently set; in this file update
stubLogoutConfig (and the other similar call sites where HOME is set: the other
setup helpers around the second and third occurrences) to call setFakeOSHome(t,
cfgDir) (or the appropriate temp dir) after keyring.MockInit(), ensuring tests
use the shared helper that sets HOME and USERPROFILE.

Comment thread cmd/auth/logout_test.go
Comment on lines +206 to +210
_ = larkauth.SetStoredToken(&larkauth.StoredUAToken{AppId: u.app, UserOpenId: u.oid, AccessToken: "tok"})
ctx := larkauth.ForUser(u.app, u.oid)
now := time.Now().UTC()
_ = larkauth.SaveUserProfileFor(root, ctx, larkauth.UserProfile{UserOpenId: u.oid, UserName: u.name, CachedAt: now, FirstAuthAt: now})
_ = larkauth.RecordUserActivity(root, ctx, nil)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fail fast on fixture setup errors in the preservation test.

These writes are part of the test preconditions, but their errors are discarded. If one of them fails, the test can still “pass” by never creating Carol’s state in the first place. Please assert each error with t.Fatalf(...) so the preservation check is meaningful.

🔧 Suggested fix
-		_ = larkauth.SetStoredToken(&larkauth.StoredUAToken{AppId: u.app, UserOpenId: u.oid, AccessToken: "tok"})
+		if err := larkauth.SetStoredToken(&larkauth.StoredUAToken{AppId: u.app, UserOpenId: u.oid, AccessToken: "tok"}); err != nil {
+			t.Fatalf("SetStoredToken(%s): %v", u.oid, err)
+		}
 		ctx := larkauth.ForUser(u.app, u.oid)
 		now := time.Now().UTC()
-		_ = larkauth.SaveUserProfileFor(root, ctx, larkauth.UserProfile{UserOpenId: u.oid, UserName: u.name, CachedAt: now, FirstAuthAt: now})
-		_ = larkauth.RecordUserActivity(root, ctx, nil)
+		if err := larkauth.SaveUserProfileFor(root, ctx, larkauth.UserProfile{UserOpenId: u.oid, UserName: u.name, CachedAt: now, FirstAuthAt: now}); err != nil {
+			t.Fatalf("SaveUserProfileFor(%s): %v", u.oid, err)
+		}
+		if err := larkauth.RecordUserActivity(root, ctx, nil); err != nil {
+			t.Fatalf("RecordUserActivity(%s): %v", u.oid, err)
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/logout_test.go` around lines 206 - 210, The test currently discards
errors from setup calls (larkauth.SetStoredToken, larkauth.SaveUserProfileFor,
larkauth.RecordUserActivity) which can hide fixture failures; change each
ignored call to capture its error and call t.Fatalf on non-nil (e.g. if err :=
larkauth.SetStoredToken(...); err != nil { t.Fatalf("SetStoredToken failed: %v",
err) }) and do the same for SaveUserProfileFor and RecordUserActivity (keep ctx
:= larkauth.ForUser(u.app, u.oid) as-is).

Comment thread cmd/auth/logout.go
Comment on lines +78 to 79
profileName := app.ProfileName()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve the implicit target profile after acquiring the flock.

profileName is captured from the unlocked preflight read. If another command switches the current profile before Line 93 reloads under lock, FindAppIndex(profileName) will wipe the previously active profile instead of the profile that is active in the locked snapshot. Only snapshot an explicit --profile; otherwise recompute the target app from the post-lock multi.

Also applies to: 93-111

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/auth/logout.go` around lines 78 - 79, The current code captures
profileName from app.ProfileName() before acquiring the flock, which can become
stale; instead, only snapshot an explicit --profile flag (detect the flag
source), and for the normal path recompute the target app after the lock using
the locked multi state (i.e., call FindAppIndex against the post-lock multi to
select the active profile). Concretely: remove/avoid using the pre-lock
profileName for the non-explicit case, detect whether a profile was explicitly
passed, and if not call multi (the locked snapshot) to resolve the active app
before calling FindAppIndex and performing the logout operations.

Comment thread internal/auth/storage.go
Comment on lines +346 to +373
func (s *keychainTokenStore) saveSlot(_ slotKind, envelope []byte) error {
key, err := s.accountKeyFor()
if err != nil {
return err
}
// Cheap envelope sanity check: keychain backends accept any
// string, but envelopes are always JSON, so a caller passing raw
// garbage gets a typed error rather than a silent write.
if !looksLikeJSON(envelope) {
return fmt.Errorf("auth: token envelope is not JSON")
}
return keychain.Set(keychain.LarkCliService, key, string(envelope))
}

// looksLikeJSON skips a full json.Valid call: this is a programmer-error
// guard, not validation.
func looksLikeJSON(b []byte) bool {
for _, c := range b {
switch c {
case ' ', '\t', '\r', '\n':
continue
case '{':
return true
default:
return false
}
}
return false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject malformed JSON before persisting token envelopes.

looksLikeJSON only checks that the first non-space byte is {, so malformed payloads like {not json pass and get written to the keychain. That violates the “JSON envelope” contract and turns a bad write into a later parse/read failure.

Proposed fix
 func (s *keychainTokenStore) saveSlot(_ slotKind, envelope []byte) error {
 	key, err := s.accountKeyFor()
 	if err != nil {
 		return err
 	}
-	// Cheap envelope sanity check: keychain backends accept any
-	// string, but envelopes are always JSON, so a caller passing raw
-	// garbage gets a typed error rather than a silent write.
-	if !looksLikeJSON(envelope) {
-		return fmt.Errorf("auth: token envelope is not JSON")
+	// Envelopes must be JSON objects, not just "object-shaped" text.
+	if !looksLikeJSONObject(envelope) {
+		return fmt.Errorf("auth: token envelope must be a JSON object")
 	}
 	return keychain.Set(keychain.LarkCliService, key, string(envelope))
 }
 
-// looksLikeJSON skips a full json.Valid call: this is a programmer-error
-// guard, not validation.
-func looksLikeJSON(b []byte) bool {
+func looksLikeJSONObject(b []byte) bool {
+	if !json.Valid(b) {
+		return false
+	}
 	for _, c := range b {
 		switch c {
 		case ' ', '\t', '\r', '\n':
 			continue
 		case '{':
 			return true
 		default:
 			return false
 		}
 	}
 	return false
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (s *keychainTokenStore) saveSlot(_ slotKind, envelope []byte) error {
key, err := s.accountKeyFor()
if err != nil {
return err
}
// Cheap envelope sanity check: keychain backends accept any
// string, but envelopes are always JSON, so a caller passing raw
// garbage gets a typed error rather than a silent write.
if !looksLikeJSON(envelope) {
return fmt.Errorf("auth: token envelope is not JSON")
}
return keychain.Set(keychain.LarkCliService, key, string(envelope))
}
// looksLikeJSON skips a full json.Valid call: this is a programmer-error
// guard, not validation.
func looksLikeJSON(b []byte) bool {
for _, c := range b {
switch c {
case ' ', '\t', '\r', '\n':
continue
case '{':
return true
default:
return false
}
}
return false
func (s *keychainTokenStore) saveSlot(_ slotKind, envelope []byte) error {
key, err := s.accountKeyFor()
if err != nil {
return err
}
// Envelopes must be JSON objects, not just "object-shaped" text.
if !looksLikeJSONObject(envelope) {
return fmt.Errorf("auth: token envelope must be a JSON object")
}
return keychain.Set(keychain.LarkCliService, key, string(envelope))
}
func looksLikeJSONObject(b []byte) bool {
if !json.Valid(b) {
return false
}
for _, c := range b {
switch c {
case ' ', '\t', '\r', '\n':
continue
case '{':
return true
default:
return false
}
}
return false
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/auth/storage.go` around lines 346 - 373, The looksLikeJSON check in
keychainTokenStore.saveSlot is too weak (it only checks first non-space byte via
looksLikeJSON) and lets malformed JSON like "{not json" be persisted; replace
the heuristic with a proper JSON validation (e.g., call json.Valid(envelope) or
attempt json.Unmarshal into an interface{}) inside saveSlot (or update
looksLikeJSON to use json.Valid) and return a clear error (e.g., "auth: token
envelope is not valid JSON") when validation fails before calling keychain.Set.

Comment on lines +142 to +145
if err := json.Unmarshal(data, &idx); err != nil {
// Corrupt: surface on stderr, return empty so the next Record reseeds.
fmt.Fprintf(os.Stderr, "[lark-cli] [WARN] auth: user index corrupt, rebuilding empty: %v\n", err)
return UserIndex{Users: map[string]UserIndexEntry{}}, nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't write raw warnings to os.Stderr from internal/.

This path bypasses the command-layer stderr envelope contract and is likely to trip the repo's forbidigo rule for internal/ packages. Please route the warning through a caller-supplied writer/log hook instead of emitting directly here.

Based on learnings, os.Stderr is forbidden in internal/ packages and should not be used for ad-hoc warnings there.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/auth/user_index.go` around lines 142 - 145, The code in the
json.Unmarshal error path writes a raw warning to os.Stderr
(fmt.Fprintf(os.Stderr,...)) which is forbidden from internal/; change the logic
in the UserIndex load/parse path to emit the warning via a caller-supplied
writer or logger instead (e.g., accept an io.Writer or logging hook on the
UserIndex loader or package-level variable and use that instead of os.Stderr),
update the error branch that currently calls fmt.Fprintf to call the provided
writer/log function with the same message and retain the same return
(UserIndex{Users: map[string]UserIndexEntry{}}, nil).

Comment thread internal/core/config.go
Comment on lines 375 to 383
app := raw.CurrentAppConfig(profileOverride)
if app == nil {
return nil, &ConfigError{
Code: 3,
Type: "config",
Message: fmt.Sprintf("profile %q not found", profileOverride),
Hint: fmt.Sprintf("available profiles: %s", formatProfileNames(raw.ProfileNames())),
Rung: RungProfile,
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Report the dangling CurrentApp value instead of profile "" not found.

When profileOverride is empty and raw.CurrentApp points at a removed profile, this path returns profile "" not found. That hides the stale selector the operator actually needs to fix and makes the remediation misleading.

Suggested fix
 	app := raw.CurrentAppConfig(profileOverride)
 	if app == nil {
+		missingProfile := profileOverride
+		if missingProfile == "" {
+			missingProfile = raw.CurrentApp
+		}
 		return nil, &ConfigError{
 			Code:    3,
 			Type:    "config",
-			Message: fmt.Sprintf("profile %q not found", profileOverride),
+			Message: fmt.Sprintf("profile %q not found", missingProfile),
 			Hint:    fmt.Sprintf("available profiles: %s", formatProfileNames(raw.ProfileNames())),
 			Rung:    RungProfile,
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/core/config.go` around lines 375 - 383, The error message produced
when app == nil in CurrentAppConfig currently reports profile %q using
profileOverride, which can be empty and hide a stale raw.CurrentApp selector;
update the ConfigError construction in the app == nil branch (the block that
returns &ConfigError{...}) to detect when profileOverride == "" and instead
include raw.CurrentApp (or both raw.CurrentApp and profileOverride) in the
Message/Hint so the error reports the dangling CurrentApp value; adjust the
Message and/or Hint fields within that ConfigError to clearly show the stale
selector and available profiles.

Comment on lines +47 to +50
multi, err := core.LoadMultiAppConfig()
if err != nil || multi == nil {
// No config yet — first `auth login` will stamp SchemaVersion.
return noOp
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't collapse every config-load failure into noOp.

These branches currently swallow parse/permission/corruption errors from core.LoadMultiAppConfig() the same way as "no config yet". That loses the R2 error-transparency contract and makes real config failures disappear instead of reaching the bootstrap logger. Please no-op only for the specific benign cases (missing config / already-current / explicitly forward-incompatible), and propagate the rest.

Also applies to: 68-70

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/migrate/migrate.go` around lines 47 - 50, The current check
collapses all failures from core.LoadMultiAppConfig() into a noOp; change it to
propagate actual errors and only no-op for the benign "no config yet" case (or
explicit forward-incompatible/already-current cases handled later). Concretely,
update the logic around core.LoadMultiAppConfig() so that if err != nil you
return/propagate that err (instead of returning noOp), and only return noOp when
err == nil and multi == nil (or when your explicit schema-version checks later
determine a safe no-op); apply the same fix to the similar branch around lines
68-70 so parse/permission/corruption errors are not swallowed.

Comment on lines 17 to +42
// SanitizeForTerminal strips ANSI escape sequences, C0 control characters
// (except \n and \t), and dangerous Unicode from text, preserving the actual
// readable content. It should be applied to table format output and stderr
// messages, but NOT to json/ndjson output where programmatic consumers need
// the raw data.
// (except \n and \t), DEL (0x7f), and dangerous Unicode (zero-width joiners,
// RTL overrides, etc.) from text, preserving the readable content.
//
// API responses may contain injected ANSI sequences that clear the screen,
// fake a colored "OK" status, or change the terminal title. In AI Agent
// scenarios, such injections can also pollute the LLM's context window
// with misleading output.
// Apply this anywhere a string is destined for a terminal — table-format
// stdout, every stderr message, and ALSO any human-readable string that is
// embedded inside a JSON/NDJSON payload for compatibility (e.g. a `message`
// field that mirrors what stderr just printed). The rule is "sanitize on
// write to a TTY," not "sanitize before json.Marshal": a JSON consumer that
// pretty-prints the payload still surfaces those bytes to a terminal, and
// the sanitization is a one-way transform that strips only renderer-control
// codes — readable characters, including Unicode letters, are preserved.
//
// Do NOT apply to typed identity / data fields whose programmatic consumers
// need raw bytes (e.g. `holder_user_name`, `requested[]`, `granted[]`):
// those carry data, and any escape-stripping there is the consumer's job
// once they know what context they will render to. JSON's own escaping
// already protects byte-level transport for the wire format.
//
// API responses (and persisted MultiAppConfig values originally sourced
// from API responses) may contain injected ANSI sequences that clear the
// screen, fake a colored "OK" status, or change the terminal title. In AI
// Agent scenarios, such injections can also pollute the LLM's context
// window with misleading output. The sanitize-on-write boundary keeps the
// data layer pristine while preventing any TTY-bound rendering surface
// from being hijacked.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The new sanitizer contract overstates OSC coverage.

SanitizeForTerminal currently strips ESC ] ... BEL, but not the other common OSC terminator form ESC ] ... ESC \. In that case the C0 sweep removes the ESC bytes and leaves the OSC payload behind as visible junk, so “strips ANSI escape sequences” is not quite true yet. Either extend ansiEscape to consume the ST form too, or narrow this new comment.

Possible regex update
-var ansiEscape = regexp.MustCompile(`\x1b\[[0-9;?>=!]*[a-zA-Z]|\x1b\][^\x07]*\x07`)
+var ansiEscape = regexp.MustCompile(`\x1b\[[0-9;?>=!]*[a-zA-Z]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)`)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/validate/sanitize.go` around lines 17 - 42, The comment overstates
OSC handling: SanitizeForTerminal currently strips OSC sequences terminated by
BEL (ESC ] ... BEL) but not the ST form (ESC ] ... ESC \), so the C0 sweep can
remove the ESC and leave the OSC payload visible; update the ansiEscape logic
used by SanitizeForTerminal to recognize and consume both OSC terminators (BEL
and the two-byte ST sequence ESC followed by backslash) — e.g. extend the
existing regex/parse in ansiEscape to match ESC ] .*? (BEL|ESC\\) so the entire
OSC payload is removed and not left for the C0 sweep; alternatively, if you
choose not to change behavior, shorten the comment to avoid claiming full OSC
coverage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Architecture-level or global-impact change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants