perf: major speed up when querying jobs by tags#429

Open
michaeladler wants to merge 12 commits into siemens:main from michaeladler:feat/tags

@michaeladler michaeladler commented Apr 2, 2026

Description

Computing the total field of the pagination object returned by /jobs was very inefficient: the cost grew with the number of queried tags.

This patch series contains the following two optimizations:

  1. Disable pagination metadata by default as this is typically only needed by UI clients. Clients relying on that information can append the new query parameter pagination=true to include the pagination metadata in responses.
  2. The old ent-generated code looped over each tag and added a separate HasTagsWith predicate, producing one correlated IN subquery per tag:
  WHERE job.id IN (SELECT tag_jobs.job_id FROM tag_jobs
    JOIN tag ON ... WHERE tag.name = 'TAG1')
  AND job.id IN (SELECT tag_jobs.job_id FROM tag_jobs
    JOIN tag ON ... WHERE tag.name = 'TAG2')

Each subquery performs an independent scan of the tag_jobs table, which is expensive when the jobs table is large.

Replace this with a single explicit JOIN on tag_jobs and tag, filtering all requested tags in one IN clause:

  FROM job
  JOIN tag_jobs ON job.id = tag_jobs.job_id
  JOIN tag ON tag_jobs.tag_id = tag.id
  WHERE tag.name IN ('TAG1', 'TAG2')

The JOIN allows the database to resolve the tag filter in a single pass.
Add a database index on tag_jobs(job_id) so the join can use an index lookup instead of a sequential scan.
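
To make the rewrite concrete, here is a minimal sketch against an in-memory SQLite database. Python's sqlite3 module is used only for illustration; it is not how wfx issues its queries. The three-table schema and the index name tag_jobs_job_id are assumptions inferred from the SQL above. One caveat: a bare IN clause matches jobs carrying any of the listed tags, so the sketch adds GROUP BY/HAVING to keep the all-tags behaviour of the per-tag subqueries. That clause is an assumption and is not stated in this PR.

```python
import sqlite3


def make_db():
    """Create a tiny in-memory database with an assumed job/tag schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE job (id INTEGER PRIMARY KEY);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE tag_jobs (
            tag_id INTEGER REFERENCES tag(id),
            job_id INTEGER REFERENCES job(id),
            PRIMARY KEY (tag_id, job_id)
        );
        -- Index described above; the name is hypothetical.
        CREATE INDEX tag_jobs_job_id ON tag_jobs (job_id);
    """)
    conn.executemany("INSERT INTO job (id) VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO tag (id, name) VALUES (?, ?)",
                     [(1, "TAG1"), (2, "TAG2")])
    # Job 1 has both tags, job 2 only TAG1, job 3 has none.
    conn.executemany("INSERT INTO tag_jobs (tag_id, job_id) VALUES (?, ?)",
                     [(1, 1), (2, 1), (1, 2)])
    return conn


def jobs_by_subqueries(conn, tags):
    """Old shape: one IN subquery per tag (tags must be non-empty)."""
    clause = " AND ".join(
        "job.id IN (SELECT tag_jobs.job_id FROM tag_jobs "
        "JOIN tag ON tag_jobs.tag_id = tag.id WHERE tag.name = ?)"
        for _ in tags
    )
    sql = f"SELECT job.id FROM job WHERE {clause} ORDER BY job.id"
    return [row[0] for row in conn.execute(sql, list(tags))]


def jobs_by_join(conn, tags):
    """New shape: one JOIN and one IN clause (tags must be non-empty).

    The HAVING clause is an assumption added here so that a job must
    carry every requested tag, as in the subquery version.
    """
    placeholders = ", ".join("?" for _ in tags)
    sql = (
        "SELECT job.id FROM job "
        "JOIN tag_jobs ON job.id = tag_jobs.job_id "
        "JOIN tag ON tag_jobs.tag_id = tag.id "
        f"WHERE tag.name IN ({placeholders}) "
        "GROUP BY job.id "
        "HAVING COUNT(DISTINCT tag.name) = ? "
        "ORDER BY job.id"
    )
    params = list(tags) + [len(set(tags))]
    return [row[0] for row in conn.execute(sql, params)]
```

With this sample data both functions return the same job IDs for one tag and for two tags, while the join version touches tag_jobs once instead of once per tag.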

Benchmarks

I used the enhanced wfx-loadtest to populate a locally running PostgreSQL database with 1 million jobs, each having two tags.

  • wfx 0.5.0: Querying for one tag took approximately 3 seconds, while querying for two tags exceeded 10 seconds (hitting the wfxctl timeout).
  • Optimized, with pagination enabled: Queries consistently took ~1 second, regardless of the number of tags.
  • Optimized, with pagination disabled: Queries completed in ~25 ms.

Issues Addressed

List and link all the issues addressed by this PR.

Change Type

Please select the relevant options:

  • Bug fix (non-breaking change that resolves an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist

  • I have read the CONTRIBUTING document.
  • My changes adhere to the established code style, patterns, and best practices.
  • I have added tests that demonstrate the effectiveness of my changes.
  • I have updated the documentation accordingly (if applicable).
  • I have added an entry in the CHANGELOG to document my changes (if applicable).

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 60.46512% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.73%. Comparing base (c06da84) to head (0d83d67).

Files with missing lines                        Patch %   Lines
cmd/wfx/cmd/config/appconfig.go                 7.69%     11 Missing and 1 partial ⚠️
internal/persistence/entgo/workflow_query.go    28.57%    9 Missing and 1 partial ⚠️
internal/server/server_collection.go            0.00%     5 Missing ⚠️
api/wfx.go                                      0.00%     4 Missing ⚠️
internal/persistence/entgo/job_query.go         89.28%    2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #429      +/-   ##
==========================================
- Coverage   73.90%   73.73%   -0.18%     
==========================================
  Files          96       96              
  Lines        4055     4059       +4     
==========================================
- Hits         2997     2993       -4     
- Misses        828      839      +11     
+ Partials      230      227       -3     

@michaeladler michaeladler self-assigned this Apr 2, 2026
@michaeladler michaeladler force-pushed the feat/tags branch 4 times, most recently from ef187e2 to 56b1473 Compare April 2, 2026 15:21
Comment thread internal/persistence/entgo/mysql.go
Comment thread cmd/wfx/cmd/config/appconfig.go
Comment thread cmd/wfx/cmd/config/appconfig.go Outdated
Comment thread cmd/wfx/cmd/config/appconfig.go Outdated
Enable logging of all SQL queries when the log level is set to trace.
This is useful for identifying slow or inefficient queries during
development and debugging, e.g. to analyze N+1 query problems.

Signed-off-by: Michael Adler <michael.adler@siemens.com>
Signed-off-by: Michael Adler <michael.adler@siemens.com>
Signed-off-by: Michael Adler <michael.adler@siemens.com>
Signed-off-by: Michael Adler <michael.adler@siemens.com>
Signed-off-by: Michael Adler <michael.adler@siemens.com>
Previously this was ui/priv, now it's in ui/dist.

Signed-off-by: Michael Adler <michael.adler@siemens.com>
Signed-off-by: Michael Adler <michael.adler@siemens.com>
Introduce a 'populate' command to easily fill the database with sample
data. This is useful for reproducing performance issues or testing
scenarios that require a non-empty database.

Signed-off-by: Michael Adler <michael.adler@siemens.com>
The old ent-generated code looped over each tag and added a separate
HasTagsWith predicate, producing one correlated IN subquery per tag:

  WHERE job.id IN (SELECT tag_jobs.job_id FROM tag_jobs
    JOIN tag ON ... WHERE tag.name = 'TAG1')
  AND job.id IN (SELECT tag_jobs.job_id FROM tag_jobs
    JOIN tag ON ... WHERE tag.name = 'TAG2')

Each subquery performs an independent scan of the tag_jobs table, which
is expensive when the jobs table is large.

Replace this with a single explicit JOIN on tag_jobs and tag, filtering
all requested tags in one IN clause:

  FROM job
  JOIN tag_jobs ON job.id = tag_jobs.job_id
  JOIN tag ON tag_jobs.tag_id = tag.id
  WHERE tag.name IN ('TAG1', 'TAG2')

The JOIN allows the database to resolve the tag filter in a single pass.
Add a database index on tag_jobs(job_id) for MySQL, PostgreSQL, and
SQLite so the join can use an index lookup instead of a sequential scan.
Use DISTINCT to deduplicate rows introduced by the join.

Signed-off-by: Michael Adler <michael.adler@siemens.com>
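
The need for DISTINCT can be seen in a small sketch: the join yields one row per matching (job, tag) pair, so a job with two matching tags shows up twice. The schema below is an assumption inferred from the SQL in the commit message, and SQLite via Python's sqlite3 is used purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (id INTEGER PRIMARY KEY);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE tag_jobs (tag_id INTEGER, job_id INTEGER,
                           PRIMARY KEY (tag_id, job_id));
    INSERT INTO job (id) VALUES (1), (2);
    INSERT INTO tag (id, name) VALUES (1, 'TAG1'), (2, 'TAG2');
    -- Job 1 carries both tags, job 2 only TAG1.
    INSERT INTO tag_jobs (tag_id, job_id) VALUES (1, 1), (2, 1), (1, 2);
""")


def job_ids(conn, distinct):
    """Run the joined tag filter with or without DISTINCT."""
    select = "SELECT DISTINCT" if distinct else "SELECT"
    sql = (
        f"{select} job.id FROM job "
        "JOIN tag_jobs ON job.id = tag_jobs.job_id "
        "JOIN tag ON tag_jobs.tag_id = tag.id "
        "WHERE tag.name IN ('TAG1', 'TAG2') "
        "ORDER BY job.id"
    )
    return [row[0] for row in conn.execute(sql)]


print(job_ids(conn, distinct=False))  # [1, 1, 2]: job 1 once per matching tag
print(job_ids(conn, distinct=True))   # [1, 2]
```

Without deduplication, any row count or page computed on top of the join would count job 1 twice.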
Signed-off-by: Michael Adler <michael.adler@siemens.com>
Add a `pagination` boolean query parameter to the GET /jobs and GET
/workflows endpoints. When not set (default: false), the pagination
object is omitted from the response, reducing payload size for clients
that don't need it.

Signed-off-by: Michael Adler <michael.adler@siemens.com>
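
From a client's point of view, the opt-in might look like the following sketch. The helper names and the example base URL are hypothetical; only the /jobs path, the pagination=true query parameter, and the pagination object with its total field come from this PR.

```python
from urllib.parse import urlencode


def jobs_url(base_url, want_pagination=False):
    """Build the GET /jobs URL, requesting pagination metadata only on demand."""
    params = {}
    if want_pagination:
        params["pagination"] = "true"
    query = urlencode(params)
    return f"{base_url}/jobs" + (f"?{query}" if query else "")


def total_or_none(body):
    """Return pagination.total from a decoded response body, if present.

    With the new default the pagination object is omitted, so callers
    must be prepared for it to be missing.
    """
    pagination = body.get("pagination")
    return None if pagination is None else pagination.get("total")


# Hypothetical base URL, for illustration only.
print(jobs_url("http://localhost:8080"))
print(jobs_url("http://localhost:8080", want_pagination=True))
```

Clients that never read the total can keep calling the endpoint unchanged; clients such as a UI that render page counts would pass want_pagination=True.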
@michaeladler michaeladler force-pushed the feat/tags branch 2 times, most recently from b552c74 to 7bfc656 Compare April 17, 2026 09:51
This removes the retry loops for storage initialization and creating
network listeners. These are unnecessary in both common scenarios:

- Developer use: fast failure with a clear error is more useful than
  silently retrying for minutes.
- Production: service managers (systemd, k8s) already handle restarts
  with proper backoff and observability.

Fail fast and let the caller decide how to recover.

Signed-off-by: Michael Adler <michael.adler@siemens.com>
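
The fail-fast behaviour described in this commit can be illustrated with a small language-neutral sketch (written in Python here; wfx itself is not implemented this way). The function and parameter names are hypothetical. Each startup step runs exactly once, and the first failure is reported and turned into a non-zero exit code so that a supervisor such as systemd or k8s can decide whether to restart.

```python
import sys


def start(open_storage, listen):
    """Run each startup step once and return a process exit code.

    There is deliberately no retry loop: the first error is reported
    and the caller (or the service manager) decides how to recover.
    """
    steps = (("storage", open_storage), ("listener", listen))
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            print(f"{name} initialization failed: {exc}", file=sys.stderr)
            return 1
    return 0
```

A real entry point would pass the result to sys.exit so the failure is visible to whatever launched the process.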