
fix(entrypoint): pipe *.sql into isql to avoid 1-byte read amplification#41

Merged
fdcastel merged 2 commits into FirebirdSQL:master from fdcastel:fix/init-db-pipe-perf
May 1, 2026

Conversation


@fdcastel fdcastel commented May 1, 2026

Summary

Fixes #40. Restores the v1-style `cat "$f" | process_sql` for plain SQL files in `init_db()`, in place of the v2 redirect form `process_sql < "$f"`.

Root cause

`isql` reads stdin one byte per `read()` syscall. The cause isn't a missing stdio buffer: `isql` is statically linked against the bundled editline from `extern/editline/`, and its `read_char()` issues a raw `read(fd, _, 1)` per character, bypassing stdio entirely (so e.g. `setvbuf(stdin, …)` would not help). With `cat | isql`, those byte-reads come from a kernel pipe (in-memory, lock-free); with `isql < file`, every byte-read goes through the regular-file path (`i_rwsem`, atime updates, the filesystem layer).
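The effect is easy to reproduce without `isql`: `dd bs=1` also issues one `read()` per byte, so it can stand in for the editline loop (a sketch; file name and size are illustrative, and timings vary by host):

```shell
# Stand-in repro for isql's byte-at-a-time stdin loop, using dd bs=1.
printf 'x%.0s' $(seq 1 100000) > schema.sql   # ~100 KB stand-in for the schema

# v2 form: stdin is a regular-file FD -> each 1-byte read() takes the FS path
time dd bs=1 of=/dev/null < schema.sql 2>/dev/null

# v1 form (this PR): stdin is a kernel pipe -> 1-byte read()s stay in memory
time cat schema.sql | dd bs=1 of=/dev/null 2>/dev/null
```

Both commands perform the same number of 1-byte reads; only the object behind FD 0 differs.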

`strace -c` against the `example.fdb` repro from #40 (~110 KB schema): `isql` issues ~95,600 single-byte reads on FD 0 in both modes. The syscall count is identical; the per-call latency is what differs.

Why the magnitude is host-dependent

On native disk (no FS layering), pipe vs regular-file `read()` differ by tenths of a microsecond, so the total init delta is small (~50 ms locally, ~25% of init time). But on layered or remote filesystems — Docker Desktop bind mounts on macOS/Windows (gRPC FUSE / virtiofs), NFS, sshfs, FUSE-overlay — every `read()` round-trips through a userland helper and adds tens to hundreds of microseconds. Multiplied by ~95,600 syscalls, that turns into seconds.
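A back-of-envelope check on that multiplication (a sketch; the ~95,600 figure is the strace count above, and the result is only a floor, since stdin is not the only regular-file FD being read):

```shell
# Floor on the extra wall time from the stdin byte-reads alone:
# reads x per-read latency.
reads=95600
for us in 25 50 100; do
  echo "${us} us/read -> at least $(( reads * us / 1000 )) ms extra"
done
# -> at least 2390 ms, 4780 ms and 9560 ms respectively
```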

Demonstrated by injecting per-`read()` latency via an `LD_PRELOAD` shim (regular-file FDs only, mimicking what FUSE/virtiofs add over the kernel-pipe path):

| injected per-read delay | `cat \| isql` | `isql < file` |
|-------------------------|---------------|---------------|
| 0 μs (native)           | 0.17 s        | 0.22 s        |
| 25 μs                   | 0.32 s        | 7.56 s        |
| 50 μs                   | 0.37 s        | 9.95 s        |
| 100 μs                  | 0.45 s        | 14.75 s       |

`cat | isql` barely moves because `cat` reads the file in ~10 large chunks, amortising the FS overhead; `isql < file` scales linearly with per-syscall cost. The 25–50 μs rows are the realistic regime for Docker Desktop bind mounts and match the 8–9 s reported in #40.
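The injector itself is not part of this PR; a minimal reconstruction of the idea (the file name, the `READ_DELAY_US` variable, and the exact checks are assumptions) looks like:

```shell
# Hypothetical LD_PRELOAD shim: delay read() on regular-file FDs only, so
# kernel-pipe reads (the cat | isql path) are left untouched.
cat > slowread.c <<'EOF'
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

ssize_t read(int fd, void *buf, size_t count) {
    static ssize_t (*real_read)(int, void *, size_t);
    if (!real_read)
        real_read = (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");
    struct stat st;
    const char *delay = getenv("READ_DELAY_US");
    if (delay && fstat(fd, &st) == 0 && S_ISREG(st.st_mode))
        usleep((useconds_t)atoi(delay));   /* regular files only, not pipes */
    return real_read(fd, buf, count);
}
EOF
cc -shared -fPIC -o slowread.so slowread.c -ldl

# Usage sketch (paths illustrative):
#   READ_DELAY_US=50 LD_PRELOAD="$PWD/slowread.so" isql ... < schema.sql
```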

Verification

Ran the user's exact repro from #40 (compose.yaml + 01-schema.sql) on native ZFS:

| variant                                      | init.d phase |
|----------------------------------------------|--------------|
| v1 `5.0.3` (digest-pinned, fast)             | ~0.18 s      |
| v2 `5.0.3-bookworm` (slow, master)           | ~0.23 s      |
| v2 `5.0.3-bookworm` + v1 entrypoint mounted  | ~0.18 s      |
| v2 `5.0.3-bookworm` + this fix               | ~0.18 s      |

Native-disk fix is small (~50 ms) — expected. The win comes on the FS shapes that amplify per-syscall cost; the synthetic-latency table above shows the fix collapses 8–15 s back to sub-second.

Why pipe (not isql -i)

Three options were tested at 0 μs delay:

- `cat "$f" | process_sql` — ~0.175 s (this PR)
- `process_sql < "$f"` — ~0.223 s (current master, regressed)
- `process_sql -i "$f"` — ~0.166 s (slightly faster still)

`isql -i` is marginally faster because it `fopen()`s the file itself and gets default stdio buffering, but switching to it changes `process_sql`'s call shape (it would no longer be uniform with the compressed cases) and would require touching the function signature. The pipe form is the minimal, surgical revert and matches the existing shape of `*.sql.gz` / `*.sql.xz` / `*.sql.zst` (which already use a decompressor pipeline).
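To make that "existing shape" concrete, here is a sketch of the `init_db()` dispatch with this revert applied (the `run_init_file` wrapper and the `process_sql` stub are assumptions for illustration; only the `*.sql` arm is what this PR changes):

```shell
# Sketch of the init_db() case arms after this PR. process_sql is a stub
# standing in for the real isql wrapper; it just counts the bytes it is fed.
process_sql() { wc -c | tr -d ' '; }

run_init_file() {
    f=$1
    case "$f" in
        *.sql)     cat "$f" | process_sql ;;       # this PR: kernel-pipe stdin
        *.sql.gz)  gunzip -c "$f" | process_sql ;; # compressed arms already pipe
        *.sql.xz)  xz -dc "$f" | process_sql ;;
        *.sql.zst) zstd -dc "$f" | process_sql ;;
        *)         echo "ignoring $f" ;;
    esac
}
```

Every arm now feeds `process_sql` through a pipe, which is why the one-line revert keeps the function's call shape uniform.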

Changes

- `src/entrypoint.sh`: one-line revert in `init_db()` for the `*.sql` case.
- `generated/*/*/entrypoint.sh`: regenerated (byte-identical to `src/entrypoint.sh`).
- `DECISIONS.md`: new D-016 recording the rationale.

Test plan

…ification

Reading the init.d schema via `process_sql < "$f"` issues one read syscall
per byte against a regular file FD, because isql does not set up stdio
buffering on stdin. On native disk that is a ~25% cost on init-time; on
layered or remote filesystems (Docker Desktop bind mounts, gRPC FUSE,
virtiofs, NFS) the per-syscall overhead amplifies into 10x+ slowdowns.

Restoring the v1 `cat "$f" | process_sql` form moves those byte-reads
through a kernel pipe (in-memory, lock-free) and matches the existing
shape of the compressed cases (`*.sql.gz`, `*.sql.xz`, `*.sql.zst`).

Verified against the example.fdb repro from issue FirebirdSQL#40: v1 image and v2
image with this fix both complete the init.d phase in ~0.18s; v2 without
the fix is ~0.23s on native disk.

Records the rationale as D-016. Closes FirebirdSQL#40.
@fdcastel fdcastel marked this pull request as draft May 1, 2026 04:08
@fdcastel fdcastel self-assigned this May 1, 2026


Development

Successfully merging this pull request may close these issues.

latest 5 build causes our initialisation script to take around 18 times longer than previous version
