[Haproxy] Add Prometheus metrics data stream #18439
giorgi-imerlishvili-elastic wants to merge 2 commits into elastic:main
  - remove:
      field: prometheus.labels
      ignore_missing: true
🔴 Critical ingest_pipeline/default.yml:7
The ingest pipeline unconditionally removes prometheus.labels at lines 7-9, but fields.yml defines prometheus.labels.instance, prometheus.labels.job, prometheus.labels.proxy, and prometheus.labels.state as dimension: true. Since this data stream uses index_mode: "time_series", removing these labels causes metrics from different HAProxy proxies, servers, and states to collapse into the same time series, resulting in silent data loss. Consider removing only unwanted labels while preserving the dimension-critical ones.
- - remove:
-     field: prometheus.labels
-     ignore_missing: true
Also found in 1 other location(s):
packages/haproxy/data_stream/metrics/fields/fields.yml:5
The ingest pipeline in `default.yml` removes `prometheus.labels` entirely via a `remove` processor, but lines 5-24 of `fields.yml` define `prometheus.labels.instance`, `prometheus.labels.job`, `prometheus.labels.proxy`, and `prometheus.labels.state` as `dimension: true` fields. With `index_mode: "time_series"` configured in `manifest.yml`, these dimensions are critical for uniquely identifying time series. Because the pipeline deletes them before indexing, metrics from different HAProxy proxies, servers, and states will collapse into the same time series, causing data loss or incorrect metric values. For example, `haproxy_backend_current_sessions` for proxy "web" and proxy "api" would be indistinguishable.
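A minimal sketch of the narrower removal suggested above, assuming the package only needs to drop a few labels that `fields.yml` does not map. The label names `unwanted_label_a` and `unwanted_label_b` are hypothetical placeholders, not labels known to be exported by HAProxy. The dimension fields (`instance`, `job`, `proxy`, `state`) are deliberately left in place.

```yaml
processors:
  # Drop only labels that are not declared as dimensions in fields.yml.
  # Both label names below are hypothetical placeholders.
  - remove:
      field:
        - prometheus.labels.unwanted_label_a
        - prometheus.labels.unwanted_label_b
      ignore_missing: true
```

Listing fields explicitly keeps the dimensions that identify each time series intact, at the cost of having to extend the list if further unwanted labels appear.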
    - name: state
      type: keyword
      dimension: true
      description: HAProxy state label exported with metrics.
🟠 High fields/fields.yml:24
Server-level metrics (haproxy_server_*) include a server label and listener metrics (haproxy_listener_*) include a listener label, but neither label is declared in prometheus.labels with dimension: true. In TSDB time_series mode, metrics from different servers or listeners sharing the same proxy and state labels will collapse into a single time series, causing data loss or aggregation errors.
+ - name: server
+   type: keyword
+   dimension: true
+   description: HAProxy server label exported with metrics.
+ - name: listener
+   type: keyword
+   dimension: true
+   description: HAProxy listener label exported with metrics.
      type: long
      metric_type: counter
      description: >-
        Number of bytes submitted to the HTTP compressor in this worker process over the last second
    - name: haproxy_process_http_comp_bytes_out_total
      type: long
      metric_type: counter
      description: >-
        Number of bytes emitted by the HTTP compressor in this worker process over the last second
    - name: haproxy_process_idle_time_percent
🟡 Medium fields/fields.yml:615
haproxy_process_http_comp_bytes_in_total and haproxy_process_http_comp_bytes_out_total are declared as metric_type: counter but describe per-second byte rates (
- - name: haproxy_process_http_comp_bytes_in_total
-   type: long
-   metric_type: counter
-   description: >-
-     Number of bytes submitted to the HTTP compressor in this worker process over the last second
- - name: haproxy_process_http_comp_bytes_out_total
-   type: long
-   metric_type: counter
-   description: >-
-     Number of bytes emitted by the HTTP compressor in this worker process over the last second
+ - name: haproxy_process_http_comp_bytes_in
+   type: long
+   metric_type: gauge
+   description: >-
+     Number of bytes submitted to the HTTP compressor in this worker process over the last second
+ - name: haproxy_process_http_comp_bytes_out
+   type: long
+   metric_type: gauge
+   description: >-
+     Number of bytes emitted by the HTTP compressor in this worker process over the last second
💚 Build Succeeded
Proposed commit message
Adds a metrics data stream for the metrics HAProxy exposes via its Prometheus endpoint.
Checklist
changelog.yml file.
Author's Checklist
How to test this PR locally
Related issues
Screenshots