feat(hooks): add local session telemetry hooks #2008
    registry.parent.mkdir(parents=True, exist_ok=True)
    with open(registry, "w", encoding="utf-8") as handle:
        handle.write("".join(d + "\n" for d in live))
except OSError:
26a676d to d20402f
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2008 +/- ##
==========================================
- Coverage 80.82% 80.68% -0.14%
==========================================
Files 117 117
Lines 19095 19147 +52
==========================================
+ Hits 15433 15449 +16
- Misses 3662 3698 +36
bindsi left a comment
Approved: the local session telemetry hooks are opt-in and the packaging/generated outputs are consistent. I did not find actionable security, privacy, or generated-artifact issues.
katriendg left a comment
Review summary — local session telemetry hooks
Thank you for this very interesting addition and functionality to the repo. The telemetry hook implementation shows real care: cross-platform bash + PowerShell variants, a shared Python core with unit tests and an Atheris fuzz harness, an XSS-defended HTML report, and clear opt-in/opt-out via HVE_TELEMETRY=1 / a gitignored .hve-telemetry marker. The collection schema, plugin generator, and docs are a strong start at making "hooks" a first-class artifact type.
The main thing this PR needs before it can land is some repository groundwork so hooks fit cleanly into how HVE Core builds, packages, and distributes artifacts. Hooks are brand new to our system, so several seams that already exist for agents/prompts/instructions/skills don't yet exist for hooks — most importantly the collection-scoped folder layout, the clone-based installer wiring (chat.hookFilesLocations), a hook manifest schema wired into linting, and a decision about the VS Code extension distribution channel. None of this is a problem with your hook logic; it's about wiring the new type through the existing build/docs machinery.
One option worth considering: this groundwork is really a repo-capability change that is somewhat separable from the telemetry hook itself. You could land the hooks-support groundwork as its own focused PR, then rebase this telemetry hook PR to stack on top once that merges — keeping each PR smaller and letting the platform changes be reviewed independently. Entirely your call.
This PR also needs to be linked to a tracking issue; if one doesn't exist yet, please create it and link this PR to it.
All findings are validated against the official VS Code documentation: Agent hooks in Visual Studio Code (Preview) (page last edited 2026-06-10). Key facts: chat.hookFilesLocations default includes .github/hooks and loads all *.json in a configured folder (relative/~ paths only, no **); there are exactly eight events (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PreCompact, SubagentStart, SubagentStop, Stop — no SessionEnd); VS Code auto-converts lowerCamelCase CLI event names to PascalCase and maps bash→osx/linux, powershell→windows, so a single CLI-format manifest works in both CLI and VS Code; and there is no contributes.* extension API for standalone hooks.
Inline comments cover the items on files in this diff. Four findings touch files not changed here and are captured below.
Comment 3 — High — REQUIRED — .github/skills/installer/hve-core-installer/SKILL.md
The settings template configures chat.agentFilesLocations, chat.promptFilesLocations, chat.instructionsFilesLocations, and chat.agentSkillsLocations, but not chat.hookFilesLocations. The default value of that setting only covers the workspace .github/hooks; for clone-based installs hve-core lives at a non-workspace prefix (peer dir, .hve-core/, mount), so the hook is never discovered. Please add a chat.hookFilesLocations block enumerating each installed collection's hook subfolder, e.g. "<PREFIX>/.github/hooks/shared": true, and extend the "Settings Configuration" prose to include .github/hooks/. ** globs are unsupported here, matching the existing note for the other chat.*Locations settings.
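The requested settings block can be sketched as follows. This is a minimal illustration, not the installer's actual template logic: `hook_locations` is a hypothetical helper, and the `../hve-core` prefix and the `shared` collection name are example values standing in for whatever the installer resolves.

```python
import json


def hook_locations(prefix: str, collections: list[str]) -> dict[str, bool]:
    """Map each installed collection's hook folder to True.

    Hypothetical helper: one explicit entry per collection, because the
    setting does not accept ** globs.
    """
    root = prefix.rstrip("/")
    return {f"{root}/.github/hooks/{name}": True for name in sorted(set(collections))}


# Example values only: a peer-directory clone with one collection installed.
settings = {"chat.hookFilesLocations": hook_locations("../hve-core", ["shared"])}
print(json.dumps(settings, indent=2))
```

Enumerating one entry per collection keeps hooks opt-in at the same granularity as the other chat.*Locations settings.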
Comment 4 — High — Decision required — scripts/extension/Prepare-Extension.ps1, Package-Extension.ps1, extension/templates/package.template.json
The marketplace extension never ships this hook: Update-PackageJsonContributes and Copy-CollectionArtifacts handle only the four existing kinds, and package.template.json contributes is {}. More fundamentally, the docs list no contributes.* extension API for standalone hooks — only Workspace, User, custom-agent frontmatter, and Plugin. Please decide and document the intended channel: (a) deliver via agent-scoped hooks: frontmatter in a bundled agent, (b) declare hooks CLI-plugin + clone-only (and say so in the docs), or (c) have the extension write chat.hookFilesLocations on activation. Whatever the choice, the README/docs should set the right expectation so users aren't surprised the hook is inert after a marketplace install.
Comment 7 — Medium — scripts/linting/schemas/ + schema-mapping.json
Every other AI artifact type is schema-validated, but hooks are not. Markdown-frontmatter artifacts (.agent.md, .prompt.md, .instructions.md, SKILL.md) each have a *-frontmatter.schema.json registered in schema-mapping.json and enforced by Validate-MarkdownFrontmatter.ps1 (npm run lint:frontmatter); JSON manifests like collections and marketplace have standalone *-manifest.schema.json. Since the hook manifest is JSON, the analogous addition is a new scripts/linting/schemas/hook-manifest.schema.json constraining the manifest to the eight valid VS Code events and documented command properties (type, command, windows/linux/osx, cwd, env, timeout), rejecting the same event declared in both CLI-lowercase and PascalCase form. Wire it into the JSON/schema linting so .github/hooks/**/*.json is validated in CI.
Comment 8 — Low — .github/PULL_REQUEST_TEMPLATE.md + docs/contributing/
The "AI Artifacts" checklist in the PR template doesn't include a hook entry, and ai-artifacts-common.md / the project-structure section of .github/copilot-instructions.md aren't updated to register hooks as a first-class type. The new docs/contributing/hooks.md and README row are a good start; please complete the onboarding surface so future hook contributions have a consistent path.
Comment 5 — Medium — plugins/*/.github/plugin/plugin.json
The generated plugin manifest emits "hooks": "hooks/telemetry.json" (a single string), and PluginHelpers warns when more than one hook exists. The plugin hook format references hooks.json or hooks/hooks.json. Please confirm the plugin hooks field contract (filename and single-vs-multiple) against the plugin format docs, and align the generator so multiple collections each contributing a hook are representable.
Comment 6 — Medium — scripts/collections/Modules/CollectionHelpers.psm1 + scripts/plugins/Modules/PluginHelpers.psm1
The Python core is well-tested, but the new 'hook' branches in CollectionHelpers.psm1, PluginHelpers.psm1, and Generate-Plugins.ps1 have no Pester coverage. Please extend scripts/tests/collections/CollectionHelpers.Tests.ps1 and scripts/tests/plugins/PluginHelpers.Tests.ps1 to cover hook discovery, artifact-key generation, and the new plugin-manifest path rewriting.
Recommended next steps (ordered)
1. Adopt collection-scoped hook layout (REQUIRED). Move to .github/hooks/<collection>/*.json; update discovery (Get-ArtifactFiles one level down), Test-HveCoreRepoRelativePath (add hooks), and collections/*.collection.yml.
2. De-duplicate the event manifest (REQUIRED). Keep one CLI-format block; remove PascalCase duplicates; fix userPromptSubmitted/sessionEnd.
3. Wire the installer (REQUIRED). Add a chat.hookFilesLocations block enumerating installed collection hook folders.
4. Settle the extension channel. Agent-scoped frontmatter vs CLI-plugin/clone-only vs activation-time settings write; update packaging and docs.
5. Confirm the plugin hooks contract (filename + multiplicity) and align the generator.
6. Add a hook manifest schema + linting mirroring other AI artifacts.
7. Add CI guardrails + tests. Pester coverage for the new hook branches; re-run lint:collections-metadata, lint:marketplace, lint:frontmatter, plugin:generate, validate:skills, test:ps.
8. Complete artifact-type onboarding. PR template entry, ai-artifacts-common.md, copilot-instructions project-structure.
Suggested split: items 1, 3, 5, 6, 7, 8 are repo-capability groundwork that can land standalone; items 2 and 4 are specific to this telemetry hook.
Security positives noted: opt-in/off by default, traversal-safe session ids (_is_safe_sid), quoted launcher generation, safe type="application/json" data embedding, allow-list cleanup. See inline security comments (SEC-1/2/3).
@@ -0,0 +1,142 @@
{
  "version": 1,
Comment 1 — High — REQUIRED: dual-cased events double-fire in VS Code.
This manifest declares each event twice — once in Copilot CLI lowercase form (preToolUse, postToolUse, …) and once in VS Code PascalCase form (PreToolUse, PostToolUse, …). Per the official docs, VS Code automatically converts the lowercase CLI names to PascalCase and maps bash→osx/linux and powershell→windows. As a result VS Code will register and fire the collector twice for every event (duplicate JSONL rows, doubled process spawns). Please keep a single CLI-format block (lowercase keys with bash/powershell) and remove the PascalCase duplicates.
Also: userPromptSubmitted converts to UserPromptSubmitted, which is not a valid event (correct is UserPromptSubmit); sessionEnd has no VS Code event (use Stop). The ConvertFrom-Json -AsHashtable workaround in CollectionHelpers.psm1 only exists to tolerate these case-colliding keys and can be removed once the duplicates are gone.
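A quick way to see both problems is to apply the documented name conversion to the manifest's event keys and look for collisions and invalid results. The sketch below is illustrative only: `to_pascal` and `check_events` are hypothetical helpers, and the conversion is assumed to be a simple first-letter uppercase, which matches the examples quoted above.

```python
# The eight events listed in the VS Code documentation cited in this review.
VALID_EVENTS = {
    "SessionStart", "UserPromptSubmit", "PreToolUse", "PostToolUse",
    "PreCompact", "SubagentStart", "SubagentStop", "Stop",
}


def to_pascal(event: str) -> str:
    """Assumed conversion: uppercase the first character of a CLI event name."""
    return event[:1].upper() + event[1:]


def check_events(event_keys: list[str]) -> list[str]:
    """Report keys that are invalid or that collide after conversion."""
    problems = []
    seen: dict[str, str] = {}
    for key in event_keys:
        normalized = to_pascal(key)
        if normalized not in VALID_EVENTS:
            problems.append(f"{key}: {normalized} is not a VS Code hook event")
        if normalized in seen:
            problems.append(f"{key}: duplicates {seen[normalized]} after conversion")
        else:
            seen[normalized] = key
    return problems


for problem in check_events(
    ["preToolUse", "PreToolUse", "userPromptSubmitted", "sessionEnd", "stop"]
):
    print(problem)
```

A check along these lines could also back the manifest lint, since it rejects the same event declared in both casings.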
@@ -0,0 +1,142 @@
{
Comment 2 — High — REQUIRED: hooks must be collection-scoped, not flat.
Hooks are placed flat at .github/hooks/telemetry.json, but every other HVE artifact type is collection-scoped (.github/agents/<collection>/, .github/skills/<collection>/, …). This is a requirement, for two doc-backed reasons:
- The installer activates artifacts per chosen collection by adding each collection subfolder to the relevant chat.*Locations setting. VS Code loads all *.json files in a configured hook folder, so to preserve per-collection opt-in each collection's hooks must live under .github/hooks/<collection>/ and the installer adds exactly <PREFIX>/.github/hooks/<collection>. A flat folder forces every collection's hooks to load together.
- Repo convention treats flat (non-collection) artifacts as repo-specific and excludes them from distribution. Test-HveCoreRepoRelativePath already excludes flat agents|instructions|prompts|skills but not hooks, so a flat hook is inconsistently treated as distributable.
Please move to .github/hooks/<collection>/telemetry.json (e.g. .github/hooks/shared/), keep the implementation scripts in a sibling subfolder, update Get-ArtifactFiles to discover one collection level down, add hooks to Test-HveCoreRepoRelativePath, and update the collections/*.collection.yml item paths to match.
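The flat-versus-scoped distinction can be expressed as a small path check. This is a Python stand-in for illustration, not the PowerShell Test-HveCoreRepoRelativePath or Get-ArtifactFiles logic; `hook_collection` is a hypothetical name.

```python
from pathlib import PurePosixPath
from typing import Optional


def hook_collection(repo_relative_path: str) -> Optional[str]:
    """Return the collection name for a collection-scoped hook manifest.

    Returns None for flat manifests such as .github/hooks/telemetry.json,
    which repo convention treats as repo-specific rather than distributable.
    """
    parts = PurePosixPath(repo_relative_path).parts
    scoped = (
        len(parts) == 4
        and parts[:2] == (".github", "hooks")
        and parts[3].endswith(".json")
    )
    return parts[2] if scoped else None


print(hook_collection(".github/hooks/shared/telemetry.json"))
print(hook_collection(".github/hooks/telemetry.json"))
```

Requiring exactly one directory level between hooks/ and the manifest mirrors the "one collection level down" discovery rule requested above.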
const skills = [...s.skills].map(sk => `<span class="tag tag-skill">${sk}</span>`).join('') || dim('none loaded');
const topTools = Object.entries(s.tools).sort((a,b) => b[1] - a[1]).slice(0, 8)
  .map(([t,c]) => `<span class="tag tag-tool">${t} (${c})</span>`).join('') || dim('none');
SEC-1 — Medium: DOM-based HTML injection (stored XSS) in the rendered report.
Every render function builds markup with template literals assigned to innerHTML without HTML-escaping the interpolated values — e.g. subagent names rendered as `<span class="tag tag-agent">${a}</span>` here, and the working directory interpolated into both text and a title="${s.cwd}" attribute. The <\/-escaped JSON embedding only protects the data transport into the page; at render time these strings are written as live HTML. Several fields are model- or externally-influenced: the subagent label comes from runSubagent's free-form description, and tool / instruction / skill names can originate from MCP servers or third-party collections. A value like <img src=x onerror=...> in any field executes when the local report is opened. Please add a single esc() helper (escape & < > " ') applied to every dynamic interpolation (or switch to textContent/setAttribute). I'd treat this as a genuine code-quality blocker independent of the distribution work.
  local telemetry_dir="${HVE_TELEMETRY_DIR:-$repo_root/.copilot-tracking/telemetry}"
  mkdir -p "$telemetry_dir" "$telemetry_dir/.stacks"

  # Dump raw input for diagnostics (first 5 events only)
SEC-2 — Low/Medium: sensitive payload capture in plaintext.
raw-input.jsonl stores the first five hook events verbatim, which for PreToolUse includes the full tool_input (file contents being written, shell command strings) and for UserPromptSubmit the full prompt; sessions-*.jsonl additionally stores the first 200 chars of every prompt. These are written in plaintext to the gitignored .copilot-tracking/telemetry/ (so not committed) and to user-level ~/.hve / ~/.copilot. The risk is local-disk exposure, not a git leak, but prompt and tool-input payloads can contain secrets. I recommend documenting the capture explicitly (TRANSPARENCY-NOTE / docs/contributing/hooks.md) and either redacting the raw-input dump or making it separately opt-in, since the processed events already provide the diagnostic signal.
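One possible shape for the redaction and the separate opt-in is sketched below. It is an assumption-laden illustration, not the PR's collector code: the `HVE_TELEMETRY_RAW` variable, the helper names, and the sample event layout are all hypothetical; only the `tool_input` and `prompt` field names come from the comment above.

```python
import json

# Field names taken from the review comment above.
SENSITIVE_KEYS = {"tool_input", "prompt"}
# Hypothetical opt-in switch for the verbatim dump; not part of the PR.
RAW_DUMP_ENV = "HVE_TELEMETRY_RAW"


def raw_dump_enabled(env: dict) -> bool:
    """The verbatim dump stays off unless the extra switch is set to 1."""
    return env.get(RAW_DUMP_ENV) == "1"


def redact_event(event: dict) -> dict:
    """Copy the event, replacing sensitive payloads with a size marker."""
    cleaned = {}
    for key, value in event.items():
        if key in SENSITIVE_KEYS:
            size = len(json.dumps(value))
            cleaned[key] = f"[redacted, {size} chars serialized]"
        else:
            cleaned[key] = value
    return cleaned


# Sample event with an illustrative layout.
event = {"event": "UserPromptSubmit", "prompt": "deploy with token=abc123"}
line = json.dumps(event if raw_dump_enabled({}) else redact_event(event))
print(line)
```

Keeping the payload size in the marker preserves some diagnostic value without writing the payload itself to disk.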
def _mode_clean(all_dirs: bool, dry_run: bool) -> int:
    """Remove telemetry artifacts from the current store.
SEC-3 — Low: cleanup honors a user-writable registry.
clean --all-dirs iterates every path in ~/.hve/telemetry-dirs.txt and removes the known artifact names from each. Removal is constrained to a fixed allow-list (raw-input.jsonl, sessions-*.jsonl, .stacks, report.generated.html), so a tampered registry can at most delete those specific names in an attacker-chosen directory — not arbitrary files. Low risk given the registry is user-owned; noted for completeness.
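The bounding effect of the allow-list can be illustrated with a short dry-run sketch. This is not the PR's `_mode_clean` implementation; `removable` is a hypothetical helper that only lists candidates, using the four artifact names quoted above.

```python
import tempfile
from fnmatch import fnmatchcase
from pathlib import Path

# Artifact names quoted in the comment above.
ALLOWED = ("raw-input.jsonl", "sessions-*.jsonl", ".stacks", "report.generated.html")


def removable(directory: Path) -> list[Path]:
    """List direct children whose names match the allow-list (dry run only)."""
    if not directory.is_dir():
        return []
    return sorted(
        child
        for child in directory.iterdir()
        if any(fnmatchcase(child.name, pattern) for pattern in ALLOWED)
    )


# Demo in a throwaway directory: unrelated files are never selected.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    for name in ("raw-input.jsonl", "sessions-2026.jsonl", "notes.txt"):
        (store / name).write_text("", encoding="utf-8")
    (store / ".stacks").mkdir()
    demo_names = [path.name for path in removable(store)]
    print(demo_names)
```

Because matching is done on the child's name only, a tampered registry entry can change which directory is scanned but not which names are eligible.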
Pull Request
Description
Adds an opt-in local telemetry hooks system that records GitHub Copilot session lifecycle events to local JSONL files for self-service analysis and troubleshooting, plus a self-contained HTML report generator. Telemetry is strictly local and opt-in: it stays in no-op mode unless enabled via the HVE_TELEMETRY=1 environment variable or a .hve-telemetry repository marker file.
Highlights:
- Hook manifest (.github/hooks/telemetry.json) wiring collector scripts to Copilot lifecycle events (session start, prompt submit, pre/post tool use, subagent start/stop, agent stop, session end, pre-compact).
- Bash (.sh) and PowerShell (.ps1) variants, backed by a shared Python core (_telemetry_core.py) with unit tests and an Atheris fuzz harness.
- hve-core and hve-core-all plugins surface the telemetry hook (symlinked back to the canonical .github/hooks/telemetry source).
- docs/customization/local-telemetry.md and docs/contributing/hooks.md.
Related Issue(s)
Type of Change
Select all that apply:
Code & Documentation:
Infrastructure & Configuration:
AI Artifacts:
- prompt-builder agent and addressed all feedback
- (.github/instructions/*.instructions.md)
- (.github/prompts/*.prompt.md)
- (.github/agents/*.agent.md)
- (.github/skills/*/SKILL.md)
Other:
- (.ps1, .sh, .py)
Testing
- lint:md, lint:frontmatter, spell-check, validate:copyright (221/221), validate:skills, lint:collections-metadata, lint:marketplace: all pass.
- ruff (Python lint): all checks pass.
- pytest for the telemetry core: 40 passed.
- shellcheck on the hook .sh scripts: clean.
- npm run plugin:generate; regenerated plugins/ output is in sync (the freshness check produces no drift).
Checklist
Required Checks
Required Automated Checks
The following validation commands must pass before merging:
- npm run lint:md
- npm run spell-check
- npm run lint:frontmatter
- npm run validate:skills
- npm run lint:md-links
- npm run lint:ps
- npm run plugin:generate
- npm run docs:test
Security Considerations
Additional Notes
- .hve-telemetry opt-in marker is gitignored and no data is transmitted.
- plugins/ changes are generated output (npm run plugin:generate); the hooks/telemetry entries are symlinks to the canonical .github/hooks/telemetry.