
Interceptor: wedge-aware health endpoint + livenessProbe #25

@hhuuggoo


The interceptor's /healthz is currently a static 200: handleHealth (internal/proxy/proxy.go:161) just calls w.WriteHeader(http.StatusOK). It does NOT detect a wedged-but-alive proxy: a deadlock, conn-pool exhaustion, or a hung upstream/resolver leaves the process answering /healthz with 200 while serving no real inference traffic.

Need a health endpoint that reflects real serving health (e.g. resolver reachable, not stuck on a saturated worker pool / pending-request high-water mark), THEN a chart livenessProbe pointed at it.

A livenessProbe against today's static /healthz would be a placebo: it would restart only on a fully-dead process, which the kubelet already detects. So the chart deliberately ships NO interceptor livenessProbe until this lands. (The readinessProbe on /healthz is fine: it gates RollingUpdate on the process being up.)

Tracked follow-up from the phoebe Helm chart review (saturn-k8s PR #981); Ben-ruled a follow-up, not a chart blocker.
