feat(hooks): add local session telemetry hooks #2008

Open
vyta wants to merge 2 commits into microsoft:main from vyta:feat/local-telemetry-hooks

vyta commented Jun 15, 2026

Pull Request

Description

Adds an opt-in local telemetry hooks system that records GitHub Copilot session lifecycle events to local JSONL files for self-service analysis and troubleshooting, plus a self-contained HTML report generator. Telemetry is strictly local and opt-in: it stays in no-op mode unless enabled via the HVE_TELEMETRY=1 environment variable or a .hve-telemetry repository marker file.

Highlights:

  • New hook manifest (.github/hooks/telemetry.json) wiring collector scripts to Copilot lifecycle events (session start, prompt submit, pre/post tool use, subagent start/stop, agent stop, session end, pre-compact).
  • Cross-platform collector, report generator, and cleanup scripts with both bash (.sh) and PowerShell (.ps1) variants, backed by a shared Python core (_telemetry_core.py) with unit tests and an Atheris fuzz harness.
  • Plugin generator support for a new "Hooks" artifact type, so hve-core and hve-core-all plugins surface the telemetry hook (symlinked back to the canonical .github/hooks/telemetry source).
  • Documentation under docs/customization/local-telemetry.md and docs/contributing/hooks.md.
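
The opt-in gate and JSONL recording described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `_telemetry_core.py`: the function names `telemetry_enabled`, `append_event`, and `record`, and the event fields, are hypothetical. Only the `HVE_TELEMETRY=1` variable and the `.hve-telemetry` marker file come from the description.

```python
import json
import os
from pathlib import Path


def telemetry_enabled(repo_root: Path, env=None) -> bool:
    """Return True only when the user has explicitly opted in."""
    env = os.environ if env is None else env
    # Either signal is enough: the environment variable or the marker file.
    return env.get("HVE_TELEMETRY") == "1" or (repo_root / ".hve-telemetry").is_file()


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line (JSONL)."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, separators=(",", ":")) + "\n")


def record(repo_root: Path, log_path: Path, event: dict, env=None) -> bool:
    """Stay a no-op unless telemetry is enabled; report whether a row was written."""
    if not telemetry_enabled(repo_root, env):
        return False
    append_event(log_path, event)
    return True
```

Passing `env` explicitly keeps the gate easy to test without mutating the real process environment.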

Related Issue(s)

Type of Change

Select all that apply:

Code & Documentation:

  • Bug fix (non-breaking change fixing an issue)
  • New feature (non-breaking change adding functionality)
  • Breaking change (fix or feature causing existing functionality to change)
  • Documentation update

Infrastructure & Configuration:

  • GitHub Actions workflow
  • Linting configuration (markdown, PowerShell, etc.)
  • Security configuration
  • DevContainer configuration
  • Dependency update

AI Artifacts:

  • Reviewed contribution with prompt-builder agent and addressed all feedback
  • Copilot instructions (.github/instructions/*.instructions.md)
  • Copilot prompt (.github/prompts/*.prompt.md)
  • Copilot agent (.github/agents/*.agent.md)
  • Copilot skill (.github/skills/*/SKILL.md)

Note for AI Artifact Contributors:

  • Agents: Research, indexing/referencing other projects (using standard VS Code GitHub Copilot/MCP tools), planning, and general implementation agents likely already exist. Review .github/agents/ before creating new ones.
  • Skills: Must include both bash and PowerShell scripts. See Skills.
  • Model Versions: Only contributions targeting the latest Anthropic and OpenAI models will be accepted. Older model versions (e.g., GPT-3.5, Claude 3) will be rejected.
  • See Agents Not Accepted and Model Version Requirements.

Other:

  • Script/automation (.ps1, .sh, .py)
  • Other (please describe):

Testing

  • Installed hook locally to collect and view telemetry via Copilot CLI and Copilot in VS Code
  • Ran the full relevant validation suite (PowerShell 7, Node 24, uv, shellcheck, pinned PS modules):
    • lint:md, lint:frontmatter, spell-check, validate:copyright (221/221), validate:skills, lint:collections-metadata, lint:marketplace — all pass.
    • ruff (Python lint) — all checks pass.
    • pytest for the telemetry core — 40 passed.
    • shellcheck on the hook .sh scripts — clean.
  • Ran npm run plugin:generate; regenerated plugins/ output is in sync (the freshness check produces no drift).

Checklist

Required Checks

  • Documentation is updated (if applicable)
  • Files follow existing naming conventions
  • Changes are backwards compatible (if applicable)
  • Tests added for new functionality (if applicable)

Required Automated Checks

The following validation commands must pass before merging:

  • Markdown linting: npm run lint:md
  • Spell checking: npm run spell-check
  • Frontmatter validation: npm run lint:frontmatter
  • Skill structure validation: npm run validate:skills
  • Link validation: npm run lint:md-links
  • PowerShell analysis: npm run lint:ps
  • Plugin freshness: npm run plugin:generate
  • Docusaurus tests: npm run docs:test

Security Considerations

  • This PR does not contain any sensitive or NDA information
  • Any new dependencies have been reviewed for security issues
  • Security-related scripts follow the principle of least privilege

Additional Notes

  • Telemetry data is written locally only; the .hve-telemetry opt-in marker is gitignored and no data is transmitted.
  • plugins/ changes are generated output (npm run plugin:generate); the hooks/telemetry entries are symlinks to the canonical .github/hooks/telemetry.

vyta requested a review from a team as a code owner June 15, 2026 22:38
vyta changed the title from "Feat/local telemetry hooks" to "feat(hooks): add local session telemetry hooks" Jun 15, 2026
registry.parent.mkdir(parents=True, exist_ok=True)
with open(registry, "w", encoding="utf-8") as handle:
handle.write("".join(d + "\n" for d in live))
except OSError:
vyta force-pushed the feat/local-telemetry-hooks branch from 26a676d to d20402f June 15, 2026 22:40
@codecov-commenter

Codecov Report

❌ Patch coverage is 41.66667% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.68%. Comparing base (f3fa829) to head (d20402f).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| scripts/plugins/Modules/PluginHelpers.psm1 | 42.10% | 22 Missing ⚠️ |
| scripts/collections/Modules/CollectionHelpers.psm1 | 20.00% | 12 Missing ⚠️ |
| scripts/plugins/Generate-Plugins.ps1 | 85.71% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2008      +/-   ##
==========================================
- Coverage   80.82%   80.68%   -0.14%     
==========================================
  Files         117      117              
  Lines       19095    19147      +52     
==========================================
+ Hits        15433    15449      +16     
- Misses       3662     3698      +36     

| Flag | Coverage Δ |
| --- | --- |
| pester | 84.31% <41.66%> (-0.34%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| scripts/plugins/Generate-Plugins.ps1 | 93.37% <85.71%> (-0.35%) ⬇️ |
| scripts/collections/Modules/CollectionHelpers.psm1 | 93.23% <20.00%> (-5.73%) ⬇️ |
| scripts/plugins/Modules/PluginHelpers.psm1 | 88.30% <42.10%> (-6.26%) ⬇️ |

... and 1 file with indirect coverage changes

bindsi left a comment

Member

Approved: the local session telemetry hooks are opt-in and the packaging/generated outputs are consistent. I did not find actionable security, privacy, or generated-artifact issues.

katriendg left a comment

Contributor

Review summary — local session telemetry hooks

Thank you for this very interesting addition and functionality to the repo. The telemetry hook implementation shows real care: cross-platform bash + PowerShell variants, a shared Python core with unit tests and an Atheris fuzz harness, an XSS-defended HTML report, and clear opt-in/opt-out via HVE_TELEMETRY=1 / a gitignored .hve-telemetry marker. The collection schema, plugin generator, and docs are a strong start at making "hooks" a first-class artifact type.

The main thing this PR needs before it can land is some repository groundwork so hooks fit cleanly into how HVE Core builds, packages, and distributes artifacts. Hooks are brand new to our system, so several seams that already exist for agents/prompts/instructions/skills don't yet exist for hooks — most importantly the collection-scoped folder layout, the clone-based installer wiring (chat.hookFilesLocations), a hook manifest schema wired into linting, and a decision about the VS Code extension distribution channel. None of this is a problem with your hook logic; it's about wiring the new type through the existing build/docs machinery.

One option worth considering: this groundwork is really a repo-capability change that is somewhat separable from the telemetry hook itself. You could land the hooks-support groundwork as its own focused PR, then rebase this telemetry hook PR to stack on top once that merges — keeping each PR smaller and letting the platform changes be reviewed independently. Entirely your call.

This PR also needs to be linked to a tracking issue; if one doesn't exist yet, please create it and link this PR to it.

All findings are validated against the official VS Code documentation: Agent hooks in Visual Studio Code (Preview) (page last edited 2026-06-10). Key facts: the chat.hookFilesLocations default includes .github/hooks, and VS Code loads all *.json files in a configured folder (relative or ~ paths only, no **); there are exactly eight events (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PreCompact, SubagentStart, SubagentStop, Stop; there is no SessionEnd); VS Code auto-converts lowerCamelCase CLI event names to PascalCase and maps bash to osx/linux and powershell to windows, so a single CLI-format manifest works in both the CLI and VS Code; and there is no contributes.* extension API for standalone hooks.

Inline comments cover the items on files in this diff. Four findings touch files not changed here and are captured below.


Comment 3 — High — REQUIRED — .github/skills/installer/hve-core-installer/SKILL.md

The settings template configures chat.agentFilesLocations, chat.promptFilesLocations, chat.instructionsFilesLocations, and chat.agentSkillsLocations, but not chat.hookFilesLocations. The default value of that setting only covers the workspace .github/hooks; for clone-based installs hve-core lives at a non-workspace prefix (peer dir, .hve-core/, mount), so the hook is never discovered. Please add a chat.hookFilesLocations block enumerating each installed collection's hook subfolder, e.g. "<PREFIX>/.github/hooks/shared": true, and extend the "Settings Configuration" prose to include .github/hooks/. ** globs are unsupported here, matching the existing note for the other chat.*Locations settings.
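
To make the requested settings block concrete, here is a small sketch that builds the fragment for a given install prefix and list of installed collections. The helper name `hook_locations` and the example prefix `../hve-core` are hypothetical; the setting name and the `<PREFIX>/.github/hooks/<collection>` shape come from this comment.

```python
import json


def hook_locations(prefix: str, collections: list[str]) -> dict:
    """Build a chat.hookFilesLocations fragment with one entry per collection."""
    prefix = prefix.rstrip("/")
    for name in collections:
        # Globs are unsupported in this setting, so reject them early.
        if "*" in name:
            raise ValueError(f"glob not allowed in collection name: {name!r}")
    return {
        "chat.hookFilesLocations": {
            f"{prefix}/.github/hooks/{name}": True for name in collections
        }
    }


if __name__ == "__main__":
    print(json.dumps(hook_locations("../hve-core", ["shared"]), indent=2))
```

Enumerating one explicit folder per collection is what preserves per-collection opt-in, since every *.json file in a configured folder is loaded.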

Comment 4 — High — Decision required — scripts/extension/Prepare-Extension.ps1, Package-Extension.ps1, extension/templates/package.template.json

The marketplace extension never ships this hook: Update-PackageJsonContributes and Copy-CollectionArtifacts handle only the four existing kinds, and package.template.json contributes is {}. More fundamentally, the docs list no contributes.* extension API for standalone hooks — only Workspace, User, custom-agent frontmatter, and Plugin. Please decide and document the intended channel: (a) deliver via agent-scoped hooks: frontmatter in a bundled agent, (b) declare hooks CLI-plugin + clone-only (and say so in the docs), or (c) have the extension write chat.hookFilesLocations on activation. Whatever the choice, the README/docs should set the right expectation so users aren't surprised the hook is inert after a marketplace install.

Comment 7 — Medium — scripts/linting/schemas/ + schema-mapping.json

Every other AI artifact type is schema-validated, but hooks are not. Markdown-frontmatter artifacts (.agent.md, .prompt.md, .instructions.md, SKILL.md) each have a *-frontmatter.schema.json registered in schema-mapping.json and enforced by Validate-MarkdownFrontmatter.ps1 (npm run lint:frontmatter); JSON manifests like collections and marketplace have standalone *-manifest.schema.json. Since the hook manifest is JSON, the analogous addition is a new scripts/linting/schemas/hook-manifest.schema.json constraining the manifest to the eight valid VS Code events and documented command properties (type, command, windows/linux/osx, cwd, env, timeout), rejecting the same event declared in both CLI-lowercase and PascalCase form. Wire it into the JSON/schema linting so .github/hooks/**/*.json is validated in CI.
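
As a rough picture of what such a schema would enforce, the checks can be sketched in plain Python. This is a stand-in for illustration, not the proposed hook-manifest.schema.json: the manifest shape (a `hooks` object mapping event names to lists of command objects) and the inclusion of the CLI-format `bash`/`powershell` keys in the allowed set are assumptions, and `validate_manifest` is a hypothetical name.

```python
# The eight valid VS Code events listed in the review summary.
VSCODE_EVENTS = {
    "SessionStart", "UserPromptSubmit", "PreToolUse", "PostToolUse",
    "PreCompact", "SubagentStart", "SubagentStop", "Stop",
}
# Documented command properties, plus the CLI-format keys (assumed to be allowed).
COMMAND_KEYS = {
    "type", "command", "windows", "linux", "osx", "cwd", "env", "timeout",
    "bash", "powershell",
}


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    seen = {}  # normalized PascalCase name -> key as first written
    for key, commands in manifest.get("hooks", {}).items():
        normalized = key[:1].upper() + key[1:]
        if normalized not in VSCODE_EVENTS:
            problems.append(f"unknown event: {key}")
        if normalized in seen:
            # Same event declared in both CLI-lowercase and PascalCase form.
            problems.append(f"event declared twice: {seen[normalized]} and {key}")
        seen.setdefault(normalized, key)
        for index, command in enumerate(commands):
            extra = sorted(set(command) - COMMAND_KEYS)
            if extra:
                problems.append(f"{key}[{index}]: unexpected properties {extra}")
    return problems
```

A real JSON Schema would express the same constraints declaratively; the case-collision rule is the one part that may need a custom check, since it compares keys against each other.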

Comment 8 — Low — .github/PULL_REQUEST_TEMPLATE.md + docs/contributing/

The "AI Artifacts" checklist in the PR template doesn't include a hook entry, and ai-artifacts-common.md / the project-structure section of .github/copilot-instructions.md aren't updated to register hooks as a first-class type. The new docs/contributing/hooks.md and README row are a good start; please complete the onboarding surface so future hook contributions have a consistent path.

Comment 5 — Medium — plugins/*/.github/plugin/plugin.json

The generated plugin manifest emits "hooks": "hooks/telemetry.json" (a single string), and PluginHelpers warns when more than one hook exists. The plugin hook format references hooks.json or hooks/hooks.json. Please confirm the plugin hooks field contract (filename and single-vs-multiple) against the plugin format docs, and align the generator so multiple collections each contributing a hook are representable.

Comment 6 — Medium — scripts/collections/Modules/CollectionHelpers.psm1 + scripts/plugins/Modules/PluginHelpers.psm1

The Python core is well-tested, but the new 'hook' branches in CollectionHelpers.psm1, PluginHelpers.psm1, and Generate-Plugins.ps1 have no Pester coverage. Please extend scripts/tests/collections/CollectionHelpers.Tests.ps1 and scripts/tests/plugins/PluginHelpers.Tests.ps1 to cover hook discovery, artifact-key generation, and the new plugin-manifest path rewriting.


Recommended next steps (ordered)

  1. Adopt collection-scoped hook layout (REQUIRED). Move to .github/hooks/<collection>/*.json; update discovery (Get-ArtifactFiles one level down), Test-HveCoreRepoRelativePath (add hooks), and collections/*.collection.yml.
  2. De-duplicate the event manifest (REQUIRED). Keep one CLI-format block; remove PascalCase duplicates; fix userPromptSubmitted/sessionEnd.
  3. Wire the installer (REQUIRED). Add a chat.hookFilesLocations block enumerating installed collection hook folders.
  4. Settle the extension channel. Agent-scoped frontmatter vs CLI-plugin/clone-only vs activation-time settings write; update packaging and docs.
  5. Confirm the plugin hooks contract (filename + multiplicity) and align the generator.
  6. Add a hook manifest schema + linting mirroring other AI artifacts.
  7. Add CI guardrails + tests. Pester coverage for the new hook branches; re-run lint:collections-metadata, lint:marketplace, lint:frontmatter, plugin:generate, validate:skills, test:ps.
  8. Complete artifact-type onboarding. PR template entry, ai-artifacts-common.md, copilot-instructions project-structure.

Suggested split: items 1, 3, 5, 6, 7, 8 are repo-capability groundwork that can land standalone; items 2 and 4 are specific to this telemetry hook.

Security positives noted: opt-in/off by default, traversal-safe session ids (_is_safe_sid), quoted launcher generation, safe type="application/json" data embedding, allow-list cleanup. See inline security comments (SEC-1/2/3).

@@ -0,0 +1,142 @@
{
"version": 1,

Contributor

Comment 1 — High — REQUIRED: dual-cased events double-fire in VS Code.

This manifest declares each event twice: once in Copilot CLI lowercase form (preToolUse, postToolUse, …) and once in VS Code PascalCase form (PreToolUse, PostToolUse, …). Per the official docs, VS Code automatically converts the lowercase CLI names to PascalCase and maps bash to osx/linux and powershell to windows. As a result, VS Code will register and fire the collector twice for every event (duplicate JSONL rows, doubled process spawns). Please keep a single CLI-format block (lowercase keys with bash/powershell) and remove the PascalCase duplicates.

Also: userPromptSubmitted converts to UserPromptSubmitted, which is not a valid event (correct is UserPromptSubmit); sessionEnd has no VS Code event (use Stop). The ConvertFrom-Json -AsHashtable workaround in CollectionHelpers.psm1 only exists to tolerate these case-colliding keys and can be removed once the duplicates are gone.
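
A small sketch of why the duplicates double-fire and why the two names above end up invalid. It assumes the conversion simply upper-cases the first letter, which matches the examples in this comment but is not taken from VS Code's source; all function names are hypothetical.

```python
from collections import Counter

# The eight VS Code hook events named in the review summary.
VSCODE_EVENTS = frozenset({
    "SessionStart", "UserPromptSubmit", "PreToolUse", "PostToolUse",
    "PreCompact", "SubagentStart", "SubagentStop", "Stop",
})


def to_vscode_event(cli_name: str) -> str:
    """Assumed conversion rule: upper-case the first letter only."""
    return cli_name[:1].upper() + cli_name[1:]


def registrations(manifest_keys: list[str]) -> Counter:
    """Count the handlers each converted event name would end up with."""
    return Counter(to_vscode_event(key) for key in manifest_keys)


def invalid_events(manifest_keys: list[str]) -> list[str]:
    """Return the keys whose converted name is not a VS Code event."""
    return [key for key in manifest_keys if to_vscode_event(key) not in VSCODE_EVENTS]
```

With both `preToolUse` and `PreToolUse` present, `registrations` reports two handlers for `PreToolUse`, which corresponds to the duplicate JSONL rows described above.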

@@ -0,0 +1,142 @@
{

Contributor

Comment 2 — High — REQUIRED: hooks must be collection-scoped, not flat.

Hooks are placed flat at .github/hooks/telemetry.json, but every other HVE artifact type is collection-scoped (.github/agents/<collection>/, .github/skills/<collection>/, …). This is a requirement, for two doc-backed reasons:

  1. The installer activates artifacts per chosen collection by adding each collection subfolder to the relevant chat.*Locations setting. VS Code loads all *.json files in a configured hook folder, so to preserve per-collection opt-in each collection's hooks must live under .github/hooks/<collection>/ and the installer adds exactly <PREFIX>/.github/hooks/<collection>. A flat folder forces every collection's hooks to load together.
  2. Repo convention treats flat (non-collection) artifacts as repo-specific and excludes them from distribution. Test-HveCoreRepoRelativePath already excludes flat agents|instructions|prompts|skills but not hooks, so a flat hook is inconsistently treated as distributable.

Please move to .github/hooks/<collection>/telemetry.json (e.g. .github/hooks/shared/), keep the implementation scripts in a sibling subfolder, update Get-ArtifactFiles to discover one collection level down, add hooks to Test-HveCoreRepoRelativePath, and update the collections/*.collection.yml item paths to match.
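
The "one collection level down" discovery rule could look roughly like this. It is a Python illustration only: the real change would be in the PowerShell Get-ArtifactFiles helper, and `discover_hooks` is a hypothetical name.

```python
from pathlib import Path


def discover_hooks(repo_root: Path) -> dict[str, list[Path]]:
    """Map each collection name to its hook manifests under .github/hooks/<collection>/."""
    hooks_dir = repo_root / ".github" / "hooks"
    found: dict[str, list[Path]] = {}
    if not hooks_dir.is_dir():
        return found
    for collection in sorted(p for p in hooks_dir.iterdir() if p.is_dir()):
        # Non-recursive on purpose: implementation scripts live in a deeper subfolder.
        manifests = sorted(collection.glob("*.json"))
        if manifests:
            found[collection.name] = manifests
    # Flat files such as .github/hooks/telemetry.json are skipped as repo-specific.
    return found
```

Keeping the glob non-recursive means JSON files that sit next to the implementation scripts are never mistaken for manifests.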

const skills = [...s.skills].map(sk => `<span class="tag tag-skill">${sk}</span>`).join('') || dim('none loaded');
const topTools = Object.entries(s.tools).sort((a,b) => b[1] - a[1]).slice(0, 8)
.map(([t,c]) => `<span class="tag tag-tool">${t} (${c})</span>`).join('') || dim('none');

Contributor

SEC-1 — Medium: DOM-based HTML injection (stored XSS) in the rendered report.

Every render function builds markup with template literals assigned to innerHTML without HTML-escaping the interpolated values — e.g. subagent names rendered as `<span class="tag tag-agent">${a}</span>` here, and the working directory interpolated into both text and a title="${s.cwd}" attribute. The <\/-escaped JSON embedding only protects the data transport into the page; at render time these strings are written as live HTML. Several fields are model- or externally-influenced: the subagent label comes from runSubagent's free-form description, and tool / instruction / skill names can originate from MCP servers or third-party collections. A value like <img src=x onerror=...> in any field executes when the local report is opened. Please add a single esc() helper (escape & < > " ') applied to every dynamic interpolation (or switch to textContent/setAttribute). I'd treat this as a genuine code-quality blocker independent of the distribution work.
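
The escaping rule itself is small. The report renders in JavaScript, so the sketch below only illustrates the rule in Python (the language of the shared core) and would need porting to the report's script; `esc` mirrors the helper name suggested above, and `tag` is a hypothetical example renderer.

```python
import html


def esc(value) -> str:
    """Escape & < > " ' so the value is inert in both text and attribute contexts."""
    return html.escape(str(value), quote=True)


def tag(css_class: str, label) -> str:
    # Every dynamic value passes through esc() before it is interpolated.
    return f'<span class="tag {esc(css_class)}">{esc(label)}</span>'


if __name__ == "__main__":
    print(tag("tag-agent", "<img src=x onerror=alert(1)>"))
```

Escaping quotes as well as angle brackets matters because some values, such as the working directory, are also placed inside attributes.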

local telemetry_dir="${HVE_TELEMETRY_DIR:-$repo_root/.copilot-tracking/telemetry}"
mkdir -p "$telemetry_dir" "$telemetry_dir/.stacks"

# Dump raw input for diagnostics (first 5 events only)

Contributor

SEC-2 — Low/Medium: sensitive payload capture in plaintext.

raw-input.jsonl stores the first five hook events verbatim, which for PreToolUse includes the full tool_input (file contents being written, shell command strings) and for UserPromptSubmit the full prompt; sessions-*.jsonl additionally stores the first 200 chars of every prompt. These are written in plaintext to the gitignored .copilot-tracking/telemetry/ (so not committed) and to user-level ~/.hve / ~/.copilot. Risk is local-disk exposure, not a git leak, but prompt and tool-input payloads can contain secrets. Recommend documenting the capture explicitly (TRANSPARENCY-NOTE / docs/contributing/hooks.md) and consider redacting or making the raw-input dump separately opt-in, since the processed events already provide the diagnostic signal.
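
One possible shape for that recommendation: redact by default and write verbatim payloads only behind a second, separate opt-in. The variable name `HVE_TELEMETRY_RAW`, the `prompt` field name, and the function names are hypothetical; `tool_input` is the field mentioned above.

```python
import os

# Payload fields treated as sensitive; "prompt" is an assumed field name.
SENSITIVE_KEYS = {"tool_input", "prompt"}


def redact(event: dict) -> dict:
    """Replace sensitive payload values with a length-only placeholder."""
    cleaned = {}
    for key, value in event.items():
        if key in SENSITIVE_KEYS:
            cleaned[key] = f"<redacted {len(str(value))} chars>"
        else:
            cleaned[key] = value
    return cleaned


def raw_dump_enabled(env=None) -> bool:
    """Second opt-in, independent of the main telemetry switch (hypothetical variable)."""
    env = os.environ if env is None else env
    return env.get("HVE_TELEMETRY_RAW") == "1"


def prepare_raw_row(event: dict, env=None) -> dict:
    """Return the row to store: verbatim only when the raw dump is opted in."""
    return event if raw_dump_enabled(env) else redact(event)
```

Keeping the payload length preserves some diagnostic signal without storing the content.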



def _mode_clean(all_dirs: bool, dry_run: bool) -> int:
"""Remove telemetry artifacts from the current store.

Contributor

SEC-3 — Low: cleanup honors a user-writable registry.

clean --all-dirs iterates every path in ~/.hve/telemetry-dirs.txt and removes the known artifact names from each. Removal is constrained to a fixed allow-list (raw-input.jsonl, sessions-*.jsonl, .stacks, report.generated.html), so a tampered registry can at most delete those specific names in an attacker-chosen directory — not arbitrary files. Low risk given the registry is user-owned; noted for completeness.
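
The bounded-deletion property can be illustrated with a short sketch. This is not the PR's `_mode_clean`: `clean_dir` is a hypothetical helper, and only the four allow-listed names are taken from this comment.

```python
import shutil
from fnmatch import fnmatch
from pathlib import Path

# The fixed allow-list named in the review comment.
ALLOW_LIST = ("raw-input.jsonl", "sessions-*.jsonl", ".stacks", "report.generated.html")


def clean_dir(directory: Path, dry_run: bool = False) -> list[str]:
    """Remove only allow-listed direct children; return the names that matched."""
    removed = []
    if not directory.is_dir():
        return removed
    for entry in sorted(directory.iterdir()):
        if not any(fnmatch(entry.name, pattern) for pattern in ALLOW_LIST):
            continue  # anything else in the directory is left untouched
        if not dry_run:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
        removed.append(entry.name)
    return removed
```

Because matching is against direct child names only, a tampered registry entry can redirect the cleanup to another directory but cannot widen what is deleted there.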
