fix(metricsservice): rebuild CA pool on rotation instead of appending #7713
alliasgher wants to merge 3 commits into kedacore:main
On every Kubernetes Secret rotation event, the old code called certPool.AppendCertsFromPEM on the same long-lived pool. Because x509.CertPool has no way to remove entries, the pool grows unboundedly for the lifetime of the operator pod.

Fix:
- Introduce a currentPool variable (protected by the existing certMutex) that is replaced wholesale on each rotation.
- Server side: use GetConfigForClient so every new TLS handshake picks up the latest pool without a data race on tls.Config fields.
- Client side: set InsecureSkipVerify+VerifyPeerCertificate so the fresh pool is read under the lock on each connection.

Signed-off-by: alliasgher <alliasgher123@gmail.com>
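For readers following along, here is a minimal sketch of the rebuild-and-swap idea together with the server-side `GetConfigForClient` hook. It is not the code in this PR: `caStore`, `Rotate`, `Current`, `ServerConfig`, and the `selfSignedCAPEM` demo helper are hypothetical names chosen for illustration, and the real change keeps its state in `currentPool` guarded by `certMutex`.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"math/big"
	"sync"
	"time"
)

// caStore is a hypothetical holder for the CA pool that is currently trusted.
type caStore struct {
	mu   sync.RWMutex
	pool *x509.CertPool
}

// Rotate builds a brand-new pool from pemBytes and swaps it in, so
// certificates from earlier rotations are dropped instead of accumulating.
func (s *caStore) Rotate(pemBytes []byte) error {
	fresh := x509.NewCertPool()
	if !fresh.AppendCertsFromPEM(pemBytes) {
		return errors.New("no valid certificates found in CA bundle")
	}
	s.mu.Lock()
	s.pool = fresh
	s.mu.Unlock()
	return nil
}

// Current returns the pool installed by the most recent successful Rotate.
func (s *caStore) Current() *x509.CertPool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.pool
}

// ServerConfig returns a config whose GetConfigForClient callback hands every
// new handshake a clone of base carrying the latest client CA pool. A nil base
// is treated as an empty config.
func (s *caStore) ServerConfig(base *tls.Config) *tls.Config {
	if base == nil {
		base = &tls.Config{}
	}
	outer := base.Clone()
	outer.GetConfigForClient = func(*tls.ClientHelloInfo) (*tls.Config, error) {
		cfg := base.Clone()
		cfg.ClientCAs = s.Current()
		cfg.ClientAuth = tls.RequireAndVerifyClientCert
		return cfg, nil
	}
	return outer
}

// selfSignedCAPEM is a demo-only helper that creates a throwaway CA bundle.
func selfSignedCAPEM(commonName string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(time.Now().UnixNano()),
		Subject:               pkix.Name{CommonName: commonName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}
```

In this sketch a Secret watcher would call `Rotate` with the new bundle, and the listener would be built once from `ServerConfig`. Connections that are already established keep whatever pool their handshake saw; only new handshakes observe the swap.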
✅ Snyk checks have passed. No issues have been found so far.
Thank you for your contribution! 🙏 We will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.
Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have completed successfully, the PR may be merged at a later date. Please be patient. Learn more in our contribution guide.
Signed-off-by: alliasgher <alliasgher123@gmail.com>
/run-e2e internal-
JorTurFer left a comment:
Could you update the changelog to include this fix?
Pull request overview
This PR addresses the metrics service mTLS certificate-rotation path so CA bundles do not accumulate indefinitely in memory. It updates the TLS credential loader in pkg/metricsservice to rebuild CA pools on secret rotation and to wire the refreshed pool into new server/client connections.
Changes:
- Rebuild the CA pool from scratch on each Secret rotation instead of appending PEMs to a long-lived x509.CertPool.
- Add dynamic server-side TLS config selection so new handshakes can observe the latest client CA pool.
- Add client-side custom certificate verification intended to consult the current CA pool on each connection.
    // VerifyConnection is called after the TLS handshake with the peer's
    // verified chain available; we use it to re-check against currentPool so
    // that a freshly-rotated CA is honoured without setting InsecureSkipVerify.
    config.RootCAs = certPool
    config.VerifyConnection = func(cs tls.ConnectionState) error {
        certMutex.RLock()
        pool := currentPool
        certMutex.RUnlock()

    certMutex.Lock()
    currentPool = newPool
    certMutex.Unlock()

        newPool = x509.NewCertPool()
    }
    if !newPool.AppendCertsFromPEM(pemClientCA) {
        log.Error(err, "failed to add client CA's certificate")
duplicated #7700
Signed-off-by: alliasgher <alliasgher123@gmail.com>
Fixes #7691
On every Kubernetes Secret rotation event the existing code calls `certPool.AppendCertsFromPEM` on the same long-lived pool. Because `x509.CertPool` has no mechanism to remove entries, the pool grows unboundedly for the lifetime of the operator pod.

Changes:
- Introduce a `currentPool` variable (protected by the existing `certMutex`) that is replaced wholesale on each rotation instead of appended to.
- Server side: use a `GetConfigForClient` callback so every new TLS handshake picks up the latest pool without a data race on `tls.Config` fields.
- Client side: set `InsecureSkipVerify`+`VerifyPeerCertificate` so the fresh pool is read under the lock on each connection (the `gosec` nolint comment explains why this is safe: real verification happens in `VerifyPeerCertificate`).
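The client-side item can be illustrated with a rough sketch, under the assumption that the verification callback has to redo chain and hostname checks itself once `InsecureSkipVerify` is set. The names `poolHolder`, `verifyPeer`, `clientConfig`, and the demo helper `selfSignedServer` are hypothetical and do not come from this PR.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"math/big"
	"sync"
	"time"
)

// poolHolder is a hypothetical stand-in for currentPool guarded by certMutex.
type poolHolder struct {
	mu   sync.RWMutex
	pool *x509.CertPool
}

func (h *poolHolder) Set(p *x509.CertPool) { h.mu.Lock(); h.pool = p; h.mu.Unlock() }

func (h *poolHolder) Get() *x509.CertPool { h.mu.RLock(); defer h.mu.RUnlock(); return h.pool }

// verifyPeer returns a VerifyPeerCertificate callback that validates the
// presented chain and serverName against whatever pool is current at call time.
func verifyPeer(h *poolHolder, serverName string) func([][]byte, [][]*x509.Certificate) error {
	return func(rawCerts [][]byte, _ [][]*x509.Certificate) error {
		roots := h.Get()
		if roots == nil {
			// A nil Roots would fall back to the system pool, so refuse instead.
			return errors.New("no CA pool loaded")
		}
		if len(rawCerts) == 0 {
			return errors.New("peer presented no certificate")
		}
		certs := make([]*x509.Certificate, 0, len(rawCerts))
		for _, raw := range rawCerts {
			c, err := x509.ParseCertificate(raw)
			if err != nil {
				return err
			}
			certs = append(certs, c)
		}
		intermediates := x509.NewCertPool()
		for _, c := range certs[1:] {
			intermediates.AddCert(c)
		}
		_, err := certs[0].Verify(x509.VerifyOptions{
			Roots:         roots,
			Intermediates: intermediates,
			DNSName:       serverName,
			KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		})
		return err
	}
}

// clientConfig disables the built-in verification and delegates to verifyPeer.
func clientConfig(h *poolHolder, serverName string) *tls.Config {
	return &tls.Config{
		ServerName:            serverName,
		InsecureSkipVerify:    true, // safe only because verifyPeer performs the real checks
		VerifyPeerCertificate: verifyPeer(h, serverName),
	}
}

// selfSignedServer is a demo-only helper: it returns a throwaway self-signed
// certificate (DER) for dnsName and a pool that trusts exactly that certificate.
func selfSignedServer(dnsName string) ([]byte, *x509.CertPool, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: dnsName},
		DNSNames:              []string{dnsName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, nil, err
	}
	pool := x509.NewCertPool()
	pool.AddCert(cert)
	return der, pool, nil
}
```

One caveat worth keeping in mind with this pattern: per the `crypto/tls` documentation, `VerifyPeerCertificate` is not invoked on resumed connections, whereas `VerifyConnection` runs on every connection, so the choice between the two hooks affects when a rotated CA takes effect.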