feat(temporal): add composite metric (backlog + running workflow count)#7460
feat(temporal): add composite metric (backlog + running workflow count)#7460Sanil2108 wants to merge 1 commit intokedacore:mainfrom
Conversation
|
Thank you for your contribution! 🙏 Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer as expected. While you are waiting, make sure to:
Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient. Learn more about our contribution guide. |
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
|
I think it's mainly solved by running this command from root so the schema's are created automatically: And the DCO of course :) |
782db31 to
3f6bb6c
Compare
3f6bb6c to
6d5c27f
Compare
|
Relevant Keda docs PR - kedacore/keda-docs#1714 |
cc169f8 to
bfe999b
Compare
|
The tests are failing, is it possible they are flaky? They sometimes pass and sometimes fail |
Yeah, that is a flaky test. It would be nice if it were solved soon. |
@rickbrouwer Understood, I can't rerun it as well, what should I do here? |
I will keep re-running it until it's green |
|
@rickbrouwer I think they are all green now, PTAL |
|
Is there anything blocking this PR from being merged? It would be really helpful for me and I've been watching it every day for 2 weeks :/ |
rickbrouwer
left a comment
There was a problem hiding this comment.
Can you add a validation in Validate() that returns an error when workflowTaskQueueForCount is set but includeRunningWorkflowCount is false?
Without this, a user who sets workflowTaskQueueForCount but forgets to enable includeRunningWorkflowCount will silently get no effect from the parameter.
|
We're also waiting on this — we run Temporal workers on EKS with KEDA and hit the same premature scale-down issue during long-running sequential workflows. The |
2882049 to
55d8a82
Compare
@rickbrouwer Done |
|
Hi @rickbrouwer — just checking in. I've addressed all the review feedback (including the validation in |
| if err != nil { | ||
| s.logger.Error(err, "failed to get running workflow count, skipping for activity check") | ||
| } else { | ||
| isActive = runningCount > 0 |
There was a problem hiding this comment.
is this intentionally 0 or should it be > s.metadata.ActivationTargetQueueSize?
rickbrouwer
left a comment
There was a problem hiding this comment.
We're almost there. I see that DCO is still failing and that some unit tests are still failing. Could you fix this?
Add support for including running workflow count alongside task queue backlog as the scaling metric, so workers don't get prematurely scaled down when the backlog is frequently drained but workflows are still executing. New metadata fields: - includeRunningWorkflowCount (default: false) — when true, scale by backlog + number of running workflow executions. - workflowTaskQueueForCount (optional) — scope the running-count query by workflow task queue, useful when scaling activity workers whose task queue differs from the workflow task queue. Running-count query uses CountWorkflow with visibility filter ExecutionStatus = 'Running' [ AND TaskQueue = '<workflowTaskQueueForCount>' ]. On CountWorkflow failure the scaler falls back to backlog-only to avoid scale-to-zero on transient errors. Validate() rejects workflowTaskQueueForCount without includeRunningWorkflowCount so the parameter can't be silently ignored. Unit tests cover new metadata parsing, the composite path, and the validation rule. Ref: kedacore#7459 Signed-off-by: Sanil2108 <sanilkhurana7@gmail.com>
09fdd74 to
3d8526d
Compare
|
@rickbrouwer apologies for the churn — I rebased onto latest main and squashed into a single clean commit:
Local |
|
We will first wait until #7672 has merged in terms of order. |
|
Sounds good, @rickbrouwer — will hold off until #7672 lands. Let me know if I should rebase after it merges, or if you'd prefer I leave the branch as-is. |
Provide a description of what has been changed
Checklist
make generate-scalers-schemahas been run to update any outdated generated filesFixes #7459
Docs: kedacore/keda-docs#1714