Emit metric tracking empty responses from prometheus by aliaqel-stripe · Pull Request #7671 · kedacore/keda

aliaqel-stripe · 2026-04-20T18:20:19Z

We'd like a way to monitor the number of KEDA errors caused by empty responses from Prometheus after enabling the ignoreNullValues flag for most of our Prometheus triggers.

Right now this error gets logged, but the error metric that KEDA emits is generic and doesn't differentiate by error type.

The metric keda_scaler_empty_upstream_responses_total is labeled with namespace, scaledObject, and triggerName so operators can identify which scaler is producing empty upstream responses.

Tests

E2e tests have been added to tests/sequential/prometheus_metrics/ and tests/sequential/opentelemetry_metrics/ that deploy a real Prometheus instance and verify the metric is emitted with the correct labels when a query returns an empty result.

Checklist

~~When introducing a new scaler, I agree with the scaling governance policy~~ N/A
I have verified that my change is according to the deprecations & breaking changes policy
Tests have been added
Changelog has been updated and is aligned with our changelog requirements
A PR is opened to update our Helm chart (repo) (if applicable, i.e. when deployment manifests are modified) N/A
A PR is opened to update the documentation on (repo): Document keda_scaler_empty_upstream_responses_total metric keda-docs#1739
Commits are signed with Developer Certificate of Origin (DCO - learn more)

Fixes #7062

Signed-off-by: Daniele Rolando <drolando@stripe.com>

Co-authored-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es> Signed-off-by: drolando-stripe <102543345+drolando-stripe@users.noreply.github.com>

github-actions · 2026-04-20T18:20:30Z

Thank you for your contribution! 🙏

Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.

While you are waiting, make sure to:

Add an entry in our changelog in alphabetical order and link related issue
Update the documentation, if needed
Add unit & e2e tests for your changes
GitHub checks are passing
Is the DCO check failing? Here is how you can fix DCO issues

Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient.

Learn more about our contribution guide.

snyk-io · 2026-04-20T18:20:36Z

✅ Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|---|
| ✅ | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

…nse metric

Add labels to keda_scaler_empty_upstream_responses_total so operators can identify which scaler is producing empty upstream responses. Also add e2e tests for both Prometheus and OpenTelemetry metric backends.

Signed-off-by: Ali Aqel <aliaqel@stripe.com>

wozniakjan

minor nits below for your consideration

wozniakjan · 2026-04-22T11:00:56Z

+		logger:             logger,
+		scalableObjectName: config.ScalableObjectName,
+		scalableObjectNS:   config.ScalableObjectNamespace,
+		triggerName:        config.TriggerName,


triggerName can be an empty string. Would it make sense to add triggerIndex too? That should be readily available from the same config struct.

or metric_name?

adding metric_name

Trigger index is kind of useless because it's just a number, and when you have 2000+ scaled objects in a cluster it doesn't give any useful info.

I wonder if we should explore removing it from other metrics?

wozniakjan · 2026-04-22T11:02:08Z

 		if s.metadata.IgnoreNullValues {
 			return 0, nil
 		}
+		metricscollector.RecordEmptyUpstreamResponse(s.scalableObjectNS, s.scalableObjectName, s.triggerName)


Is it desired to record the metric even when IgnoreNullValues is set to true? The value of IgnoreNullValues can be added as another label so users can filter for the case they care about. This could surface broken Prometheus queries that have been masked by IgnoreNullValues.

Adding ignoreNullValues as a label.
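A minimal, standalone sketch of the ordering suggested in this thread: count the empty response first, then apply the ignoreNullValues guard, and carry the flag as a label so both cases can be filtered. This is not KEDA code. `labels`, `emptyResponseCounter`, `errEmptyResult`, and `handleEmptyResult` are hypothetical names, and an in-memory map stands in for the real metrics collector.

```go
package main

import (
	"errors"
	"fmt"
)

// labels is a hypothetical label set. A struct with only comparable
// fields can be used directly as a map key.
type labels struct {
	namespace        string
	scaledResource   string
	triggerName      string
	ignoreNullValues bool
}

// emptyResponseCounter is an in-memory stand-in for a labeled counter.
type emptyResponseCounter map[labels]int

// errEmptyResult stands in for the scaler's "result is empty" error.
var errEmptyResult = errors.New("prometheus result is empty")

// handleEmptyResult increments the counter before checking the guard,
// so empty responses masked by ignoreNullValues are still counted.
func handleEmptyResult(c emptyResponseCounter, l labels) (float64, error) {
	c[l]++
	if l.ignoreNullValues {
		return 0, nil
	}
	return -1, errEmptyResult
}

func main() {
	c := emptyResponseCounter{}
	masked := labels{namespace: "default", scaledResource: "my-app", triggerName: "prom", ignoreNullValues: true}
	strict := masked
	strict.ignoreNullValues = false

	v, err := handleEmptyResult(c, masked)
	fmt.Println(v, err, c[masked])
	v, err = handleEmptyResult(c, strict)
	fmt.Println(v, err, c[strict])
}
```

Because the flag is part of the label set, the masked and unmasked cases land in separate series, which is what makes filtering on it possible.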

wozniakjan · 2026-04-22T11:05:12Z

/run-e2e prometheus
Update: You can check the progress here

Copilot

Pull request overview

Adds a dedicated metric to track Prometheus-scaler empty query responses so operators can distinguish this failure mode from generic scaler errors, along with sequential e2e coverage validating labels/attributes for both Prometheus and OpenTelemetry metric pipelines.

Changes:

Add keda_scaler_empty_upstream_responses_total (Prometheus) and keda.scaler.empty.upstream.responses (OpenTelemetry) counters with namespace/scaledObject/triggerName labeling.
Record the counter from the Prometheus scaler when queries return empty results (when ignoreNullValues=false).
Extend sequential e2e tests to deploy a Prometheus instance returning empty results and assert the metric is emitted with expected labels.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file

| File | Description |
|---|---|
| `pkg/scalers/prometheus_scaler.go` | Records the new “empty upstream response” metric when Prometheus query results/value arrays are empty. |
| `pkg/metricscollector/prommetrics.go` | Defines/registers the new Prometheus CounterVec and exposes a recorder method. |
| `pkg/metricscollector/opentelemetry.go` | Defines/registers the new OTel counter and records it with attributes. |
| `pkg/metricscollector/metricscollectors.go` | Extends the collector interface and adds a dispatcher function for the new metric. |
| `tests/sequential/prometheus_metrics/prometheus_metrics_test.go` | Adds sequential test validating the Prometheus-exported metric and labels. |
| `tests/sequential/opentelemetry_metrics/opentelemetry_metrics_test.go` | Adds sequential test validating the Prometheus-exported view of OTel metric and labels. |
| `CHANGELOG.md` | Documents the new Prometheus scaler metric in the Unreleased Improvements section. |

- Rename scaledObject label -> scaledResource, add metricName, resourceType, and ignoreNullValues labels to keda_scaler_empty_upstream_responses_total
- Record metric unconditionally (before IgnoreNullValues guard) so masked empty responses are also visible, with ignoreNullValues label for filtering
- Fix CHANGELOG: reference issue kedacore#7062 instead of PR kedacore#7060, capitalize Prometheus
- Update e2e tests to assert new labels

Signed-off-by: Ali Aqel <aliaqel@stripe.com>

aliaqel-stripe · 2026-04-23T20:39:48Z

@rickbrouwer ready for 2nd review.

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

pkg/scalers/prometheus_scaler.go:273

ExecutePromQuery treats any response with len(result.Data.Result)==0 as an "empty upstream response" and now records the empty-upstream metric. However the Prometheus HTTP API can return HTTP 200 with JSON status: "error" (e.g., invalid query), where data.result will also be empty in this struct. That would incorrectly increment the empty-upstream counter for query errors. Consider checking result.Status (and/or modeling the errorType/error fields) and returning an error before recording the empty-upstream metric unless status == "success".

	var v float64 = -1

	// allow for zero element or single element result sets
	if len(result.Data.Result) == 0 {
		metricscollector.RecordEmptyUpstreamResponse(s.scalableObjectNS, s.scalableObjectName, s.triggerName, s.metricName, s.resourceType, s.metadata.IgnoreNullValues)
		if s.metadata.IgnoreNullValues {
			return 0, nil
		}
		return -1, fmt.Errorf("prometheus metrics 'prometheus' target may be lost, the result is empty")
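
The status check suggested in this comment could look roughly like the sketch below. It is a standalone illustration and does not reflect the scaler's actual types: `promResponse` and `classify` are hypothetical names, and the struct models only the `status`, `errorType`, `error`, and `data.result` fields of a Prometheus HTTP API query response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promResponse models only the fields of a Prometheus HTTP API query
// response that this sketch needs.
type promResponse struct {
	Status    string `json:"status"`
	ErrorType string `json:"errorType"`
	Error     string `json:"error"`
	Data      struct {
		Result []json.RawMessage `json:"result"`
	} `json:"data"`
}

// classify reports whether body is a successful response with an empty
// result set. A response whose status is not "success" yields an error
// instead, so it would not be counted as an empty upstream response.
func classify(body []byte) (bool, error) {
	var r promResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return false, fmt.Errorf("decoding response: %w", err)
	}
	if r.Status != "success" {
		return false, fmt.Errorf("prometheus query failed: %s: %s", r.ErrorType, r.Error)
	}
	return len(r.Data.Result) == 0, nil
}

func main() {
	bodies := []string{
		`{"status":"success","data":{"resultType":"vector","result":[]}}`,
		`{"status":"error","errorType":"bad_data","error":"parse error"}`,
	}
	for _, b := range bodies {
		empty, err := classify([]byte(b))
		fmt.Println(empty, err)
	}
}
```

With a check like this, the empty-upstream counter would only be incremented when `classify` returns `true` with a nil error.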

Copilot · 2026-04-24T14:31:36Z

+	time.Sleep(15 * time.Second)
+
+	family := fetchAndParsePrometheusMetrics(t, fmt.Sprintf("curl --insecure %s", kedaOperatorCollectorPrometheusExportURL))
+	val, ok := family["keda_scaler_empty_upstream_responses_total"]
+	assert.True(t, ok, "keda_scaler_empty_upstream_responses_total not available")
+	if ok {


This test relies on a fixed time.Sleep(15 * time.Second) before scraping metrics. On slower clusters/CI runs the metric may not be exported yet, causing intermittent failures. Consider polling until keda_scaler_empty_upstream_responses_total is present with the expected labels (with an overall timeout), similar to how the Prometheus-metrics e2e test waits for metrics to appear.

JorTurFer · 2026-04-26T16:09:36Z

Could you add another PR to docs adding the OTel metric? I see that the Prometheus one is already merged, but OTel is pending 😅

aliaqel-stripe · 2026-04-26T16:47:45Z

> Could you add another PR to docs adding the OTel metric? I see that the Prometheus one is already merged, but OTel is pending 😅

kedacore/keda-docs#1751

Signed-off-by: Ali Aqel <aliaqel@stripe.com>

Signed-off-by: aliaqel-stripe <120822631+aliaqel-stripe@users.noreply.github.com>

rickbrouwer · 2026-05-07T05:47:44Z

/run-e2e prometheus
Update: You can check the progress here

drolando-stripe and others added 5 commits April 19, 2026 23:39

Emit metric tracking empty responses from prometheus

6144e69

Signed-off-by: Daniele Rolando <drolando@stripe.com>

update comment

8accfd2

Signed-off-by: Daniele Rolando <drolando@stripe.com>

rename to empty_upstream_responses_total

995d9a9

Signed-off-by: Daniele Rolando <drolando@stripe.com>

Update pkg/metricscollector/prommetrics.go

d80bede

Co-authored-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es> Signed-off-by: drolando-stripe <102543345+drolando-stripe@users.noreply.github.com>

Update pkg/metricscollector/opentelemetry.go

cc8f1d7

Co-authored-by: Jorge Turrado Ferrero <Jorge_turrado@hotmail.es> Signed-off-by: drolando-stripe <102543345+drolando-stripe@users.noreply.github.com>

aliaqel-stripe requested a review from a team as a code owner April 20, 2026 18:20

keda-automation requested a review from a team April 20, 2026 18:20

aliaqel-stripe added 2 commits April 20, 2026 18:25

Fix gci formatting

2edc55e

Signed-off-by: Ali Aqel <aliaqel@stripe.com>

aliaqel-stripe force-pushed the drolando/add_empty_response_metric branch from 5840abf to 2edc55e Compare April 20, 2026 18:25

Merge branch 'main' into drolando/add_empty_response_metric

2825c7b

aliaqel-stripe mentioned this pull request Apr 20, 2026

Document keda_scaler_empty_upstream_responses_total metric kedacore/keda-docs#1739

Merged

wozniakjan mentioned this pull request Apr 22, 2026

Release: 2.20 #7435

Open

22 tasks

wozniakjan approved these changes Apr 22, 2026

wozniakjan added the Awaiting/2nd-approval label (This PR needs one more approval review) Apr 22, 2026

wozniakjan requested a review from Copilot April 22, 2026 11:05

Copilot started reviewing on behalf of wozniakjan April 22, 2026 11:05 View session

Copilot AI reviewed Apr 22, 2026

Comment thread pkg/metricscollector/prommetrics.go Outdated

Comment thread CHANGELOG.md Outdated

rickbrouwer added the waiting-author-response label (All PR's or Issues where we are waiting for a response from the author) Apr 22, 2026

keda-automation requested a review from a team April 22, 2026 16:30

Merge branch 'main' into drolando/add_empty_response_metric

a5034f7

rickbrouwer removed the waiting-author-response label (All PR's or Issues where we are waiting for a response from the author) Apr 24, 2026

rickbrouwer requested a review from Copilot April 24, 2026 14:26

Copilot started reviewing on behalf of rickbrouwer April 24, 2026 14:27 View session

Copilot AI reviewed Apr 24, 2026

JorTurFer approved these changes Apr 26, 2026

aliaqel-stripe mentioned this pull request Apr 26, 2026

Document OTEL empty upstream response metric kedacore/keda-docs#1751

Open

fix: use isScaledObject for OTEL empty upstream metric

cab05e6

Signed-off-by: Ali Aqel <aliaqel@stripe.com>

keda-automation requested a review from a team April 28, 2026 01:18

Merge branch 'main' into drolando/add_empty_response_metric

41d65ff

Signed-off-by: aliaqel-stripe <120822631+aliaqel-stripe@users.noreply.github.com>

rickbrouwer added the ok-to-merge (This PR can be merged) and waiting-for-e2e labels and removed the Awaiting/2nd-approval (This PR needs one more approval review) and waiting-for-e2e labels May 7, 2026

6 participants
