improve: health probes showcase & docs to operations #3291
Merged
35 commits
daeee6b
improve: metrics processing to showcase all operations
csviri 65e9984
wip
csviri 7345b47
wip
csviri 7db8e88
wip
csviri efb84c4
wip
csviri 587c308
fix handling empty yaml config file with a comment only
csviri 3a24a6b
wip
csviri 37e81d6
wip
csviri 5cbc3f6
wip
csviri ec45729
wip
csviri 96bc016
wip
csviri cf6c880
wip
csviri 327b1d0
wip
csviri 9a11f07
wip
csviri c1bfc77
wip
csviri 78fa979
wip
csviri 378719c
wip
csviri f01aab2
wip
csviri a143ace
wip
csviri e0b608d
wip
csviri e1d2800
wip
csviri 2803f3b
wip
csviri 6355736
wip
csviri 1bebe64
wip
csviri d27b563
wip
csviri dde1d97
Update operator-framework-junit/src/main/java/io/javaoperatorsdk/oper…
csviri c3e67a3
Update docs/content/en/docs/documentation/operations/helm-chart.md
csviri adb41b8
Update helm/generic-helm-chart/values.yaml
csviri 623cfe6
Update operator-framework-junit/src/main/java/io/javaoperatorsdk/oper…
csviri c04275c
wip
csviri 7e595bc
wip
csviri dcf610b
wip
csviri 9a0a405
wip
csviri df8bb9e
health endpoint naming
csviri 64b690b
wip
csviri
111 changes: 111 additions & 0 deletions
docs/content/en/docs/documentation/operations/health-probes.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| --- | ||
| title: Health Probes | ||
| weight: 85 | ||
| --- | ||
|
|
||
| Operators running in Kubernetes should expose health probe endpoints so that the kubelet can detect startup | ||
| failures and runtime degradation. JOSDK provides the building blocks through its | ||
| [`RuntimeInfo`](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/RuntimeInfo.java) | ||
| API. | ||
|
|
||
| ## RuntimeInfo | ||
|
|
||
| `RuntimeInfo` is available via `operator.getRuntimeInfo()` and exposes: | ||
|
|
||
| | Method | Purpose | | ||
| |---|---| | ||
| | `isStarted()` | `true` once the operator and all its controllers have fully started | | ||
| | `allEventSourcesAreHealthy()` | `true` when every registered event source (informers, polling sources, etc.) reports a healthy status | | ||
| | `unhealthyEventSources()` | returns a map of controller name → unhealthy event sources, useful for diagnostics | | ||
| | `unhealthyInformerWrappingEventSourceHealthIndicator()` | returns a map of controller name → unhealthy informer-wrapping event sources, each exposing per-informer details via `InformerHealthIndicator` (`hasSynced()`, `isWatching()`, `isRunning()`, `getTargetNamespace()`) | | ||
|
|
||
| In most cases a single readiness probe backed by `allEventSourcesAreHealthy()` is sufficient: before the | ||
| operator has fully started the informers will not have synced yet, so the check naturally covers the startup | ||
| case as well. Once running, it detects runtime degradation such as a lost watch connection. | ||
|
|
||
| ### Fine-Grained Informer Diagnostics | ||
|
|
||
| For advanced use cases — such as exposing per-informer health in a diagnostic endpoint or logging which | ||
| specific namespace lost its watch — `unhealthyInformerWrappingEventSourceHealthIndicator()` gives access to | ||
| individual `InformerHealthIndicator` instances. Each indicator exposes `hasSynced()`, `isWatching()`, | ||
| `isRunning()`, and `getTargetNamespace()`. This is typically not needed for a standard health probe but can | ||
| be valuable for operational dashboards or troubleshooting. | ||
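
As a rough illustration of such a diagnostic dump, the sketch below flattens a nested map into one log line per unhealthy informer. It deliberately does not depend on JOSDK: `InformerHealth` is a local stand-in that mirrors only the four methods named above, and the `controller -> event source -> informers` map shape, the class name, and the `describe` helper are assumptions made for this example. Check the actual return type of `unhealthyInformerWrappingEventSourceHealthIndicator()` before adapting it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class InformerDiagnosticsSketch {

  // Local stand-in for the four methods the text lists on InformerHealthIndicator.
  interface InformerHealth {
    boolean hasSynced();

    boolean isWatching();

    boolean isRunning();

    String getTargetNamespace();
  }

  // Flattens controller -> event source -> informers into one line per informer.
  // The nesting used here is an assumption, not the SDK's exact return type.
  static List<String> describe(Map<String, Map<String, List<InformerHealth>>> unhealthy) {
    List<String> lines = new ArrayList<>();
    unhealthy.forEach((controller, sources) ->
        sources.forEach((source, informers) ->
            informers.forEach(i -> lines.add(String.format(
                "controller=%s source=%s namespace=%s synced=%b watching=%b running=%b",
                controller, source, i.getTargetNamespace(),
                i.hasSynced(), i.isWatching(), i.isRunning())))));
    return lines;
  }
}
```

The resulting lines can be written to the operator log or returned from a separate diagnostic endpoint, leaving the probe endpoint itself as a simple pass/fail check.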
|
|
||
| ## Setting Up a Probe Endpoint | ||
|
|
||
| The example below uses [Jetty](https://eclipse.dev/jetty/) to expose a `/healthz` endpoint. Any HTTP | ||
| server library works — the key is calling the `RuntimeInfo` methods to determine the response code. | ||
|
|
||
| ```java | ||
| import org.eclipse.jetty.server.Server; | ||
| import org.eclipse.jetty.server.handler.ContextHandler; | ||
|
|
||
| Operator operator = new Operator(); | ||
| operator.register(new MyReconciler()); | ||
|
|
||
| // start the health server before the operator so probes can be queried during startup | ||
| var health = new ContextHandler(new HealthHandler(operator), "/healthz"); | ||
| Server server = new Server(8080); | ||
| server.setHandler(health); | ||
| server.start(); | ||
|
|
||
| operator.start(); | ||
| ``` | ||
|
|
||
| Where `HealthHandler` extends `org.eclipse.jetty.server.Handler.Abstract` and checks | ||
| `operator.getRuntimeInfo().allEventSourcesAreHealthy()`. | ||
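
The status-code decision itself is small. The sketch below shows one way to write it, using the JDK's built-in `com.sun.net.httpserver.HttpServer` rather than Jetty so that it runs without extra dependencies. The class name `HealthEndpointSketch`, the `statusFor` helper, and the `BooleanSupplier` parameter are illustrative assumptions and not part of JOSDK; in an operator the supplier would delegate to `operator.getRuntimeInfo().allEventSourcesAreHealthy()`.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.function.BooleanSupplier;

class HealthEndpointSketch {

  // Maps the health flag to an HTTP status: 200 when healthy, 503 otherwise.
  static int statusFor(boolean healthy) {
    return healthy ? 200 : 503;
  }

  // Starts a JDK HttpServer that answers /healthz based on the supplied check.
  // In an operator the supplier would be
  // () -> operator.getRuntimeInfo().allEventSourcesAreHealthy().
  static HttpServer start(int port, BooleanSupplier healthy) throws IOException {
    HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
    server.createContext("/healthz", exchange -> {
      int status = statusFor(healthy.getAsBoolean());
      exchange.sendResponseHeaders(status, -1); // -1 means no response body
      exchange.close();
    });
    server.start();
    return server;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical stand-in check that always reports healthy.
    HttpServer server = start(8080, () -> true);
    System.out.println("Serving /healthz on port " + server.getAddress().getPort());
  }
}
```

The kubelet treats any status from 200 up to but not including 400 as success, so returning 503 while event sources are unhealthy is enough to fail the probe.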
|
|
||
| See the | ||
| [`operations` sample operator](https://github.com/java-operator-sdk/java-operator-sdk/tree/main/sample-operators/operations) | ||
| for a complete working example. | ||
|
|
||
| ## Kubernetes Deployment Configuration | ||
|
|
||
| Once your operator exposes the probe endpoint, configure probes in your Deployment manifest. Both the | ||
| startup and readiness probes can point to the same `/healthz` endpoint — the startup probe simply uses a | ||
| higher `failureThreshold` to give the operator time to initialize: | ||
|
|
||
| ```yaml | ||
| containers: | ||
|   - name: operator | ||
|     ports: | ||
|       - name: probes | ||
|         containerPort: 8080 | ||
|     startupProbe: | ||
|       httpGet: | ||
|         path: /healthz | ||
|         port: probes | ||
|       initialDelaySeconds: 1 | ||
|       periodSeconds: 3 | ||
|       failureThreshold: 20 | ||
|     readinessProbe: | ||
|       httpGet: | ||
|         path: /healthz | ||
|         port: probes | ||
|       initialDelaySeconds: 5 | ||
|       periodSeconds: 5 | ||
|       failureThreshold: 3 | ||
| ``` | ||
|
|
||
| The startup probe gives the operator time to start (up to ~60 s with the settings above). Once the startup | ||
| probe succeeds, the readiness probe takes over and will mark the pod as not-ready if any event source | ||
| becomes unhealthy. | ||
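
The ~60 s figure follows from the probe settings. The sketch below works through the arithmetic, treating `initialDelaySeconds + periodSeconds × failureThreshold` as a rough upper bound on how long a never-succeeding probe is tolerated. This is an approximation that ignores `timeoutSeconds` and kubelet scheduling jitter, and the class and method names are made up for this example.

```java
class ProbeBudgetSketch {

  // Approximate number of seconds before the kubelet gives up on a probe that never succeeds.
  static int maxSeconds(int initialDelaySeconds, int periodSeconds, int failureThreshold) {
    return initialDelaySeconds + periodSeconds * failureThreshold;
  }

  public static void main(String[] args) {
    // Startup probe settings from the manifest above: 1 + 3 * 20
    System.out.println(maxSeconds(1, 3, 20)); // prints 61
  }
}
```

If the operator watches many namespaces or large resource sets and needs longer to sync, raising `failureThreshold` on the startup probe extends this budget without slowing down detection once the pod is running.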
|
|
||
| ## Helm Chart Support | ||
|
|
||
| The [generic Helm chart](/docs/documentation/operations/helm-chart) supports health probes out of the box. | ||
| Enable them in your `values.yaml`: | ||
|
|
||
| ```yaml | ||
| probes: | ||
|   port: 8080 | ||
|   startup: | ||
|     enabled: true | ||
|     path: /healthz | ||
|   readiness: | ||
|     enabled: true | ||
|     path: /healthz | ||
| ``` | ||
|
|
||
| All probe timing parameters (`initialDelaySeconds`, `periodSeconds`, `failureThreshold`) have sensible | ||
| defaults and can be overridden. | ||