
perf: replace per-pixel GdkPixbuf allocation with direct Cairo buffer #688

Open
killerdevildog wants to merge 1 commit into maoschanz:master from killerdevildog:optimize/pixbuf-alloc-per-pixel-read

@killerdevildog
benchmarks.patch

perf: Replace per-pixel GdkPixbuf allocation with direct Cairo buffer reads

What

utilities_get_rgba_for_xy() reads a single pixel's color from a Cairo surface. It's used by the paint bucket, color picker, and eraser tools. The paint bucket's utilities_get_magic_path() calls it up to 50,000 times per fill operation.

The old implementation called Gdk.pixbuf_get_from_surface(surface, x, y, 1, 1) — which allocates a full GdkPixbuf GObject on the heap just to read one pixel. That allocation overhead dominated the cost.

What changed

Replaced the GdkPixbuf allocation with a direct read from the Cairo surface's pixel buffer via surface.get_data(). A new _read_pixel_rgba() helper handles the BGRA→RGBA byte reorder and alpha un-premultiplication that GdkPixbuf was doing internally.
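
As a rough illustration of the reorder and un-premultiplication described above, here is a minimal sketch. It is not the PR's actual helper: the name `read_pixel_rgba`, its signature, and the rounding rule are assumptions for illustration. It also assumes a little-endian host, where a Cairo ARGB32 pixel sits in memory as B, G, R, A, and it takes the buffer and stride as plain arguments so it can be run without Cairo installed.

```python
def read_pixel_rgba(data, stride, x, y):
    """Return (r, g, b, a) for pixel (x, y) of an ARGB32-style buffer.

    Hypothetical helper. Assumes 4 bytes per pixel stored as B, G, R, A
    (little-endian ARGB32) with color channels premultiplied by alpha.
    """
    offset = y * stride + x * 4
    # Slicing works for bytes, bytearray, and memoryview alike.
    b, g, r, a = data[offset:offset + 4]
    if a == 0:
        # Fully transparent: the color channels carry no information.
        return (0, 0, 0, 0)
    if a < 255:
        # Undo premultiplication, rounding to nearest and clamping to 255.
        r = min(255, (r * 255 + a // 2) // a)
        g = min(255, (g * 255 + a // 2) // a)
        b = min(255, (b * 255 + a // 2) // a)
    return (r, g, b, a)


# A 2x2 test image with a stride of 8 bytes per row.
demo = bytearray(16)
demo[4:8] = bytes([0, 0, 128, 128])     # pixel (1, 0): half-transparent red
demo[8:12] = bytes([10, 20, 30, 255])   # pixel (0, 1): opaque, B=10 G=20 R=30
half_red = read_pixel_rgba(demo, 8, 1, 0)
opaque = read_pixel_rgba(demo, 8, 0, 1)
```

With a real surface, `data` would come from `surface.get_data()` and `stride` from `surface.get_stride()`.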

In utilities_get_magic_path(), the surface buffer is now cached once at function entry instead of being re-acquired on every iteration of the hot loop.
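
The caching pattern can be sketched as follows. This is a simplified stand-in rather than the code in `utilities_get_magic_path()`: the function name `read_many_pixels` is hypothetical, and it only assumes an object exposing `get_data()`, `get_stride()`, `get_width()`, and `get_height()`, as a Cairo image surface does.

```python
def read_many_pixels(surface, coords):
    """Read raw 4-byte pixels for many coordinates with one buffer lookup.

    Hypothetical helper for illustration. Returns None for coordinates
    that fall outside the surface.
    """
    # Hoisted out of the loop: these stay valid while the surface is only read.
    data = surface.get_data()
    stride = surface.get_stride()
    width = surface.get_width()
    height = surface.get_height()

    pixels = []
    for x, y in coords:
        if 0 <= x < width and 0 <= y < height:
            offset = y * stride + x * 4
            pixels.append(bytes(data[offset:offset + 4]))
        else:
            pixels.append(None)
    return pixels
```

The point of the design is that the per-iteration work shrinks to an offset computation and a 4-byte slice; nothing surface-related is looked up inside the loop.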

Also fixed an off-by-one in the bounds check (x > width → x >= width).
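
To see why the comparison matters: valid column indices run from 0 to width - 1, so a rejection test of `x > width` lets `x == width` through. The sketch below isolates just that comparison. The function names are hypothetical, and the tightly packed stride is an assumption made to show where such a read would land in a raw buffer.

```python
def rejects_old(x, width):
    # Comparison before the fix: x == width is not rejected.
    return x > width


def rejects_new(x, width):
    # Comparison after the fix: every x past the last column is rejected.
    return x >= width


width = 4
stride = width * 4          # assumed: rows tightly packed, 4 bytes per pixel
x, y = width, 0             # one column past the right edge of row 0
offset = y * stride + x * 4

# With this layout the offset equals the start of row 1, so an unchecked
# read would silently return a pixel from the wrong row.
lands_on_next_row = offset == (y + 1) * stride
```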

Benchmark results

Measured with a microbenchmark on Python 3.12.3 / Cairo 1.18.0 / GTK 3 / Linux:

| Scenario | Before | After | Speedup |
|---|---|---|---|
| Single pixel read (median) | 3.5 μs | 0.78 μs | 4.5× |
| 10,000 consecutive reads | 36.4 ms | 7.1 ms | 5.1× |
| Paint bucket worst case (50K reads, projected) | ~180 ms | ~36 ms | ~5× |
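
The projected row is consistent with scaling the 10,000-read measurement linearly to 50,000 reads. Whether the figures were derived exactly this way is an assumption; the short check below just reproduces the arithmetic.

```python
READS_MEASURED = 10_000
READS_WORST_CASE = 50_000

before_ms = 36.4   # measured time for 10,000 reads, old implementation
after_ms = 7.1     # measured time for 10,000 reads, new implementation

# Linear scaling from the measured batch to the worst-case read count.
scale = READS_WORST_CASE / READS_MEASURED
projected_before_ms = before_ms * scale
projected_after_ms = after_ms * scale
speedup = before_ms / after_ms

print(f"before: {projected_before_ms:.1f} ms")
print(f"after:  {projected_after_ms:.1f} ms")
print(f"speedup: {speedup:.1f}x")
```

This gives about 182 ms and 35.5 ms, which round to the ~180 ms and ~36 ms in the table.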

No behavioral changes — same visual output, same color values returned.

… reads

Replace Gdk.pixbuf_get_from_surface() single-pixel calls with direct
reads from the Cairo surface buffer via surface.get_data(). This avoids
allocating a temporary GdkPixbuf for every pixel read.

- Add _read_pixel_rgba() helper that reads BGRA from Cairo ARGB32
  buffer, handles alpha un-premultiplication, and returns RGBA bytes
- Update utilities_get_rgba_for_xy() to use the new direct read path
- Cache buffer data once in utilities_get_magic_path() instead of
  re-acquiring it per pixel in the ~50K-iteration boundary loop
- Fix off-by-one in bounds check (> to >=)

Measured ~4.5-5x speedup for pixel reads (3.5 us -> 0.78 us per read).
@killerdevildog
Author

Closes #689

This PR implements the fix described in #689 — replaces per-pixel GdkPixbuf allocation with direct Cairo buffer reads, achieving a 4.5–5.1× speedup in pixel reads.
