Fix possible 100% CPU loop in CivetWeb by DL6ER · Pull Request #2882 · pi-hole/FTL

DL6ER · 2026-05-06T19:33:17Z

What does this implement/fix?

Add a tiny backoff to avoid tight retry loops on idle HTTPS keep-alive connections.

Related issue or feature (if applicable): N/A

Pull request in docs with documentation (if applicable): N/A

By submitting this pull request, I confirm the following:

I have read and understood the contributors guide, as well as this entire template. I understand which branch to base my commits and Pull Requests against.
I have commented my proposed changes within the code.
I am willing to help maintain this change if there are issues with it later.
It is compatible with the EUPL 1.2 license
I have squashed any insignificant commits. (git rebase)

Checklist:

The code change is tested and works locally.
I based my code and PRs against the repositories development branch.
I signed off all commits. Pi-hole enforces the DCO for all contributions
I signed all my commits. Pi-hole requires signatures to verify authorship
I have read the above and my PR is ready for review.

…connections Signed-off-by: Dominik <dl6er@dl6er.de>

DL6ER · 2026-05-11T19:54:57Z

TODO: Need to submit a PR upstream

Copilot

Pull request overview

This pull request introduces a small configurable backoff for non-blocking mbedTLS operations in CivetWeb to prevent tight retry loops that can otherwise drive a worker thread to 100% CPU on idle/keep-alive HTTPS connections.

Changes:

Define MG_MBEDTLS_WANT_RETRY_DELAY_MS (default: 5ms) as a tunable backoff interval.
Sleep briefly in the mbedTLS handshake loop when WANT_READ/WRITE (or async-in-progress) is returned.
Sleep briefly in the mbedTLS read path when WANT_READ/WRITE (or async-in-progress) is returned after a poll-readability event.
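The retry-with-backoff pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: only the macro name `MG_MBEDTLS_WANT_RETRY_DELAY_MS` and its 5 ms default come from the overview. The `demo_*` names, the `DEMO_WANT_*` codes (stand-ins for mbedTLS's `MBEDTLS_ERR_SSL_WANT_READ` / `MBEDTLS_ERR_SSL_WANT_WRITE`) and the retry cap are assumptions made so the sketch builds without mbedTLS.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#endif
#include <assert.h>
#include <time.h>

/* Macro name and 5 ms default are taken from the PR overview. */
#ifndef MG_MBEDTLS_WANT_RETRY_DELAY_MS
#define MG_MBEDTLS_WANT_RETRY_DELAY_MS 5
#endif

/* Hypothetical stand-ins for the mbedTLS WANT_READ / WANT_WRITE codes. */
enum { DEMO_WANT_READ = -1, DEMO_WANT_WRITE = -2 };

/* Hypothetical fake TLS session: reports "want read" a fixed number of
 * times, then delivers data. */
struct demo_tls {
    int want_reads_left;
    int calls;
};

static int demo_tls_read(struct demo_tls *s)
{
    s->calls++;
    if (s->want_reads_left > 0) {
        s->want_reads_left--;
        return DEMO_WANT_READ;
    }
    return 16; /* pretend 16 bytes were read */
}

static void demo_sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Retry the read, sleeping briefly after each WANT_* result instead of
 * spinning. Returns the read result, or the last WANT_* code once
 * max_retries sleeps have been spent. *sleeps counts the backoffs taken. */
static int demo_read_with_backoff(struct demo_tls *s, int max_retries,
                                  int *sleeps)
{
    int rc;
    assert(max_retries >= 0);
    *sleeps = 0;
    for (;;) {
        rc = demo_tls_read(s);
        if (rc != DEMO_WANT_READ && rc != DEMO_WANT_WRITE)
            return rc; /* data, EOF, or a hard error */
        if (*sleeps >= max_retries)
            return rc; /* give up, let the caller decide */
        demo_sleep_ms(MG_MBEDTLS_WANT_RETRY_DELAY_MS);
        (*sleeps)++;
    }
}
```

With a session that reports "want read" three times, `demo_read_with_backoff(&s, 10, &sleeps)` sleeps three times and makes four read attempts in total. Without the sleep, the same loop would retry as fast as the CPU allows.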

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/webserver/civetweb/mod_mbedtls.inl` | Adds a tiny sleep in the mbedTLS handshake retry loop to avoid spinning on non-blocking sockets. |
| `src/webserver/civetweb/civetweb.c` | Introduces the backoff macro and applies it in the mbedTLS read path when `WANT_*` is returned. |

yubiuser

Shouldn't a change like this be a patch under https://github.com/pi-hole/FTL/tree/master/patch/civetweb?

rdwebdesign · 2026-05-11T20:20:40Z

I think this is the intention:

TODO: Need to submit a PR upstream

gkuchta · 2026-05-24T19:40:33Z

FWIW, I think I may have run into this after issuing a request to the admin UI (clean session; I was just going to add a blocklist entry). I saw a single pihole-FTL thread spike to 100% CPU and stay there. From strace:

3185 20:44:39.629235 <... select resumed>) = 0 (Timeout) <0.048528>
3185 20:44:39.629253 select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=100000} <unfinished ...>
3191 20:44:39.629265 poll([{fd=35, events=POLLIN}, {fd=36, events=POLLIN}, {fd=37, events=POLLIN}, {fd=38, events=POLLIN}, {fd=34, events=POLLIN}], 5, 2000) = 4 ([{fd=35, revents=POLLIN}, {fd=36, revents=POLLIN}, {fd=37, revents=POLLIN}, {fd=38, revents=POLLIN}]) <0.000007>
3191 20:44:39.629292 poll([{fd=35, events=POLLIN}, {fd=36, events=POLLIN}, {fd=37, events=POLLIN}, {fd=38, events=POLLIN}, {fd=34, events=POLLIN}], 5, 2000) = 4 ([{fd=35, revents=POLLIN}, {fd=36, revents=POLLIN}, {fd=37, revents=POLLIN}, {fd=38, revents=POLLIN}]) <0.000007>
3191 20:44:39.629319 poll([{fd=35, events=POLLIN}, {fd=36, events=POLLIN}, {fd=37, events=POLLIN}, {fd=38, events=POLLIN}, {fd=34, events=POLLIN}], 5, 2000) = 4 ([{fd=35, revents=POLLIN}, {fd=36, revents=POLLIN}, {fd=37, revents=POLLIN}, {fd=38, revents=POLLIN}]) <0.000007>

fd35 = 0.0.0.0:80
fd36 = 0.0.0.0:443
fd37 = [::]:80
fd38 = [::]:443
poll() returns POLLIN for all four listener fds in ~7 µs
FTL performs no accept/read/write (or any other calls) between polls; it is basically just an infinite poll() loop.
Recv-Q remains nonzero (it was at 2) on the listeners
No inbound 80/443 traffic observed via tcpdump
DNS resolution continued without interruption

If I run into it again I can try to get some more useful info via gdb or something, but it's just my home network so I just dumped what info I could and HUP'd the process

DL6ER · 2026-06-13T15:57:10Z

Sorry for the long delay in replying - real life has been really busy lately.

Thanks for the detailed dump - that's genuinely useful.

One thing stands out though: the strace shows the spinning thread sitting in a tight poll() loop over the listener sockets (fd 35–38 = your :80/:443 listeners), returning POLLIN on all of them but never calling accept(), with Recv-Q=2 pending. That's the master/accept thread, whereas this PR fixes a busy-spin in the mbedTLS read/handshake path on connections that have already been accepted (pull_inner / mbed_ssl_handshake).

So as it stands this looks like it may be a related but distinct loop rather than the exact one this PR targets — possibly the accept loop or the worker queue getting wedged. It's plausible the two are connected (e.g. workers stuck spinning never return to the pool and starve the accept side), but the trace only shows one hot thread and it's the master on the listeners, so I can't tie it to the TLS path from this alone.
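The strace pattern is consistent with how level-triggered poll() behaves on a listening socket: as long as a connection sits in the accept queue and nobody calls accept(), every poll() returns POLLIN immediately. The following is a minimal, self-contained sketch of that behaviour on a loopback listener. It is not FTL or CivetWeb code, all `demo_*` names are hypothetical, and it says nothing about why accept() was not being called in the reported case.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a loopback listener on an ephemeral port and
 * report the bound address back through *addr. */
static int demo_listen_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0; /* let the kernel pick a free port */
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) != 0 ||
        listen(fd, 8) != 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Poll the listener `rounds` times without calling accept() and return how
 * many rounds reported POLLIN. */
static int demo_count_ready_polls(int listener, int rounds, int timeout_ms)
{
    int ready = 0;
    for (int i = 0; i < rounds; i++) {
        struct pollfd pfd = { .fd = listener, .events = POLLIN, .revents = 0 };
        if (poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN))
            ready++;
    }
    return ready;
}

/* Queue one connection, poll before and after accept(). Returns 0 on
 * success and fills in both counts. */
static int demo_poll_without_accept(int *ready_before, int *ready_after)
{
    struct sockaddr_in addr;
    int listener = demo_listen_loopback(&addr);
    if (listener < 0)
        return -1;
    int client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0 ||
        connect(client, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        if (client >= 0)
            close(client);
        close(listener);
        return -1;
    }
    /* The connection is pending in the accept queue: each poll() reports
     * POLLIN again because nothing drains the queue. */
    *ready_before = demo_count_ready_polls(listener, 3, 1000);
    int conn = accept(listener, NULL, NULL);
    /* Queue drained: poll() with a zero timeout now reports nothing. */
    *ready_after = demo_count_ready_polls(listener, 3, 0);
    if (conn >= 0)
        close(conn);
    close(client);
    close(listener);
    return conn >= 0 ? 0 : -1;
}
```

In this sketch all three polls before accept() report the listener as readable, and none do afterwards, which is why a thread that polls but never accepts ends up spinning.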

If you hit it again, the one thing that would nail it down is a backtrace of the 100%-CPU thread under gdb:

gdb -p <pid>
(gdb) info threads
(gdb) thread apply all bt

(or at least bt for the hot LWP). That'll tell us immediately whether it's the mbedTLS loop this PR addresses or the accept side. The Recv-Q nonzero + no accept() symptom in particular makes me suspect the latter.

Appreciate you grabbing what you could before HUP'ing it!

Try tiny backoff to avoid tight retry loops on idle HTTPS keep-alive …

1fd5e9a

…connections Signed-off-by: Dominik <dl6er@dl6er.de>

DL6ER added the CivetWeb bug label May 6, 2026

DL6ER mentioned this pull request May 6, 2026

FTL process takes 100% of CPU #2490

Closed

DL6ER marked this pull request as ready for review May 11, 2026 19:54

DL6ER requested a review from a team as a code owner May 11, 2026 19:54

Copilot AI review requested due to automatic review settings May 11, 2026 19:54

Copilot started reviewing on behalf of DL6ER May 11, 2026 19:55

Copilot AI reviewed May 11, 2026

yubiuser reviewed May 11, 2026

DL6ER wants to merge 1 commit into development from fix/spinning-civet

5 participants
