Skip to content

feat: Dual emitting timer and histogram metrics#1048

Open
neil-xie wants to merge 6 commits intocadence-workflow:masterfrom
neil-xie:timer_metrics_migration
Open

feat: Dual emitting timer and histogram metrics#1048
neil-xie wants to merge 6 commits intocadence-workflow:masterfrom
neil-xie:timer_metrics_migration

Conversation

@neil-xie
Copy link
Copy Markdown
Member

What changed?
Start dual emitting histogram metrics as part of timer -> histogram metrics migration
Added migration doc to example the changes

Why?
Time -> histogram migration
cadence-workflow/cadence#7741

How did you test it?
Unit test

Potential risks
Backward compatibility, only risk is the metrics storage usage increase

Release notes

Documentation Changes

Duration.ofNanos(TimeUnit.SECONDS.toNanos(100)));

/**
* High-resolution bucket configuration for long-running operations.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high cardinality

Duration.ofNanos(TimeUnit.HOURS.toNanos(24)));

/**
* Medium-resolution bucket configuration for long-running operations.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: mid cardinality

com.uber.m3.util.Duration d = com.uber.m3.util.Duration.ofNanos(nanoTime - wfStartTimeNanos);
metricsScope.timer(MetricsType.WORKFLOW_E2E_LATENCY).record(d);
MetricsEmit.emitLatency(
metricsScope, MetricsType.WORKFLOW_E2E_LATENCY, d, HistogramBuckets.HIGH_1MS_24H);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this also what we use for Go SDK? A few workflows may have E2E latency > 24hrs.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In go, we used the algorithmic exponential buckets with headroom, last bucket is ~74.5h. Here I have to manually define the buckets. I also asked in the dev channel, does user still care when the workflow was running for more than 1 day?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can stamp to un-block you. We could add more buckets later.

Comment thread src/test/java/com/uber/cadence/internal/worker/WorkflowPollTaskTest.java Outdated
Signed-off-by: Neil Xie <neil.xie@uber.com>
Signed-off-by: Neil Xie <neil.xie@uber.com>
Signed-off-by: Neil Xie <neil.xie@uber.com>
Signed-off-by: Neil Xie <neil.xie@uber.com>
Signed-off-by: Neil Xie <neil.xie@uber.com>
Signed-off-by: Neil Xie <neil.xie@uber.com>
@neil-xie neil-xie force-pushed the timer_metrics_migration branch from a270288 to 892840e Compare April 16, 2026 21:52
@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Apr 16, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Implements dual-emitting timer and histogram metrics, resolving the incorrect metric naming convention found in test mocks. No outstanding issues remain.

✅ 1 resolved
Bug: Test histogram mocks use wrong metric name (missing _ns suffix)

📄 src/test/java/com/uber/cadence/internal/worker/WorkflowPollTaskTest.java:56-58 📄 src/test/java/com/uber/cadence/internal/worker/WorkflowPollTaskTest.java:111-113 📄 src/test/java/com/uber/cadence/internal/worker/WorkflowPollTaskTest.java:125-126 📄 src/test/java/com/uber/cadence/internal/worker/ActivityPollTaskTest.java:55-57 📄 src/test/java/com/uber/cadence/internal/worker/ActivityPollTaskTest.java:93-94 📄 src/main/java/com/uber/cadence/internal/metrics/MetricsEmit.java:112 📄 src/main/java/com/uber/cadence/internal/metrics/MetricsEmit.java:140 📄 src/main/java/com/uber/cadence/internal/metrics/MetricsEmit.java:170
The histogram mocks in WorkflowPollTaskTest and ActivityPollTaskTest stub scope.histogram(eq(MetricsType.DECISION_POLL_LATENCY), any()), but MetricsEmit internally calls scope.histogram(name + "_ns", buckets). Since the eq() matcher expects the name without _ns, the mock won't match, Mockito returns null, and null.start() will throw a NullPointerException at runtime.

The same issue affects every histogram mock in these two test files: DECISION_POLL_LATENCY, DECISION_SCHEDULED_TO_START_LATENCY, and ACTIVITY_POLL_LATENCY.

Note: MetricsEmitTest avoids this because it mocks with eq("test-metric_ns") which correctly includes the suffix.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants