feat: add operational counters for sampling, requirements, and tools (#467) #883
ajbozarth wants to merge 2 commits into generative-computing:main from
Conversation
…enerative-computing#467) Adds six new OpenTelemetry counters giving operators visibility into retry behaviour, validation failure rates, and tool call health: mellea.sampling.attempts/successes/failures, mellea.requirement.checks/failures, and mellea.tool.calls. Follows the established lazy-init globals + record_* helpers + Plugin hooks pattern. Extends SamplingIterationPayload and SamplingLoopEndPayload with a strategy_name field so plugins can tag counters by strategy class.
Assisted-by: Claude Code
Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
The PR description has been updated. Please fill out the template for your PR to be reviewed.
PeriodicExportingMetricReader's background thread (60 s default export interval) would fire after pytest had already closed stdout whenever the suite ran longer than 60 s, causing "I/O operation on closed file" and OTLP UNAVAILABLE errors.
Assisted-by: Claude Code
Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
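A stdlib-only sketch of the failure mode described above (not mellea's actual code): a periodic exporter thread keeps waking up after the test runner has torn down stdio, so anything it prints raises "I/O operation on closed file". Stopping the thread deterministically before teardown, the way the commit does for the metric reader, avoids it. All names here are illustrative.

```python
import threading

class PeriodicExporter:
    """Toy stand-in for a PeriodicExportingMetricReader background thread."""

    def __init__(self, interval: float):
        self._stop = threading.Event()
        self._thread = threading.Thread(
            target=self._run, args=(interval,), daemon=True
        )

    def start(self) -> None:
        self._thread.start()

    def _run(self, interval: float) -> None:
        # Wakes up every `interval` seconds until shutdown is requested.
        while not self._stop.wait(interval):
            print("exporting metrics")  # would fail once stdout is closed

    def shutdown(self) -> None:
        # The fix: stop and join the thread before the process closes stdio,
        # e.g. from a session-scoped pytest fixture's teardown.
        self._stop.set()
        self._thread.join()

exporter = PeriodicExporter(interval=0.01)
exporter.start()
exporter.shutdown()
```

In the real suite the same shape applies: call the provider/reader's shutdown in test teardown so the export thread can never outlive pytest's captured streams.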
This is parallel work to #882, which adds the other remaining metrics telemetry; whichever PR merges second will need a manual merge to resolve the overlapping diffs.
This looks good to me as the implementation of issue #467. I do wonder how useful these counters will be: they are grouped only by strategy and requirement, and may not give much insight without finer-grained groupings.
akihikokuroda
left a comment
LGTM. I have a general comment about the issue that this PR implements.
@jakelorocco @psschwei I actually wondered about this while implementing. I thought the idea was interesting and valuable, but my work did not create an example of how it could be used (which maybe I should add?)
Type of PR
Misc PR
Description
Adds six new OpenTelemetry counters as part of the telemetry epic (#443),
giving operators visibility into retry behaviour, validation failure rates,
and tool call health alongside the existing LLM-level metrics:
- mellea.sampling.attempts / mellea.sampling.successes / mellea.sampling.failures — tagged by strategy class name
- mellea.requirement.checks / mellea.requirement.failures — tagged by requirement class name and failure reason
- mellea.tool.calls — tagged by tool name and status ("success"/"failure")

Follows the established plugin pattern: lazy-init counter globals + record_* helpers in metrics.py, consumed by three new Plugin classes in metrics_plugins.py (SamplingMetricsPlugin, RequirementMetricsPlugin, ToolMetricsPlugin) that hook into existing hook events in FIRE_AND_FORGET mode. No metric calls in business-logic files.
Also extends SamplingIterationPayload and SamplingLoopEndPayload with a strategy_name field (backward-compatible, defaults to "") so plugins can tag counters with the strategy class name.
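A hedged sketch of why the payload extension is backward-compatible: a new field with a "" default means old construction sites that never pass strategy_name keep working unchanged. The iteration field is an illustrative stand-in for the payload's pre-existing fields, not the actual definition.

```python
from dataclasses import dataclass

@dataclass
class SamplingIterationPayload:
    iteration: int = 0       # illustrative pre-existing field
    strategy_name: str = ""  # new field: lets plugins tag counters by strategy

# Old call sites that predate the field still construct fine:
old_style = SamplingIterationPayload(iteration=3)
assert old_style.strategy_name == ""

# New call sites can tag the payload with the strategy class name:
new_style = SamplingIterationPayload(iteration=3, strategy_name="RejectionSamplingStrategy")
```

Defaulting to "" rather than None keeps the field's type a plain str, so plugins can use it directly as a metric attribute without a null check.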
Testing
Attribution