[client] Implement adaptive fetch rate control for LogScanner#3007
swuferhong wants to merge 1 commit into apache:main
Conversation
fresh-borzoni left a comment
@swuferhong Cool feature, really like it.
I left some comments, PTAL 🙏
Also, we might want an integration test for this, since it's a pretty end-to-end feature flow.
```java
private static final Logger LOG = LoggerFactory.getLogger(BucketFetchRateController.class);

/** Maximum exponent for the exponential backoff (2^5 = 32). */
private static final int MAX_BACKOFF_SHIFT = 5;
```
We also have max_skip_rounds, so with this we get a weird situation where the user expects that config to take effect, but it's silently capped at 32.
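To make the cap concrete, here's a minimal sketch of the interplay (the standalone class and `skipRounds` helper are hypothetical; the formula mirrors the diff): even with max_skip_rounds set to 100, the effective backoff never exceeds 2^5 = 32.

```java
public class BackoffCapDemo {
    // From the diff: maximum exponent for the exponential backoff (2^5 = 32).
    private static final int MAX_BACKOFF_SHIFT = 5;

    /** Skip rounds armed after the n-th consecutive empty fetch (n >= 1). */
    static int skipRounds(int consecutiveEmptyFetches, int maxSkipRounds) {
        int shift = Math.min(consecutiveEmptyFetches - 1, MAX_BACKOFF_SHIFT);
        return Math.min(1 << shift, maxSkipRounds);
    }

    public static void main(String[] args) {
        // The user configures max_skip_rounds = 100, but the shift cap wins.
        System.out.println(skipRounds(10, 100)); // prints 32, not 100
    }
}
```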
```java
 * @param tableBucket the bucket to check
 * @return {@code true} if the bucket should be fetched in this round
 */
boolean shouldFetch(TableBucket tableBucket) {
```
Is it intentional that a single empty fetch arms a skip?
A streaming scanner that has caught up to the HW alternates between "batch" and "empty at HW" every poll, so this throttles the empty half even though the bucket is active.
Worst-case new-data latency becomes ~(max-skip × poll interval).
If that's by design, it's worth a comment in the Javadoc.
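To illustrate the worst case, here's a small simulation under the semantics shown in the diff hunks (the harness class, method names, and idle-then-active scenario are hypothetical; the two state transitions mirror the diff): a bucket that has been idle long enough to saturate the backoff makes newly arrived data wait up to maxSkipRounds polls.

```java
public class SkipLatencyDemo {
    private static final int MAX_BACKOFF_SHIFT = 5;

    int consecutiveEmptyFetches;
    int remainingSkipRounds;

    /** Mirrors shouldFetch(): consume one pending skip round, if any. */
    boolean shouldFetch() {
        if (remainingSkipRounds > 0) {
            remainingSkipRounds--;
            return false;
        }
        return true;
    }

    /** Mirrors result handling: an empty fetch grows the backoff, data resets it. */
    void onFetchResult(boolean hasRecords, int maxSkipRounds) {
        if (hasRecords) {
            consecutiveEmptyFetches = 0;
            remainingSkipRounds = 0;
        } else {
            consecutiveEmptyFetches++;
            int shift = Math.min(consecutiveEmptyFetches - 1, MAX_BACKOFF_SHIFT);
            remainingSkipRounds = Math.min(1 << shift, maxSkipRounds);
        }
    }

    /** Simulates idle polls, then counts skipped polls once data is available. */
    static int latencyAfterIdle(int idlePolls, int maxSkipRounds) {
        SkipLatencyDemo s = new SkipLatencyDemo();
        for (int i = 0; i < idlePolls; i++) {
            // Whenever the idle bucket is actually fetched, it comes back empty.
            if (s.shouldFetch()) {
                s.onFetchResult(false, maxSkipRounds);
            }
        }
        // Data is now available at the broker; count polls until we fetch again.
        int skipped = 0;
        while (!s.shouldFetch()) {
            skipped++;
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Data arriving right after an empty fetch at saturated backoff
        // waits the full maxSkipRounds polls.
        System.out.println(latencyAfterIdle(70, 32)); // prints 32
    }
}
```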
```java
}

/** Removes the tracking state for the given bucket. */
void removeBucket(TableBucket tableBucket) {
```
Do we actually call this anywhere? If it's never called, we have a state leak.
```java
}

/** Resets all tracking state. */
void reset() {
```
```java
LogRecords logRecords = fetchResultForBucket.recordsOrEmpty();
boolean hasRecords = !MemoryLogRecords.EMPTY.equals(logRecords);
if (hasRecords) {
    hasData = !MemoryLogRecords.EMPTY.equals(logRecords);
```
```java
if (readyForFetchCount == 0) {
    if (skippedByAdaptiveFetch > 0) {
        LOG.info(
```
It might be very noisy, maybe debug level?
```java
    return Collections.emptyMap();
} else {
    if (skippedByAdaptiveFetch > 0) {
        LOG.info(
```
```java
state.consecutiveEmptyFetches++;
int shift = Math.min(state.consecutiveEmptyFetches - 1, MAX_BACKOFF_SHIFT);
state.remainingSkipRounds = Math.min(1 << shift, maxSkipRounds);
LOG.info(
```
ditto about logging noise
```java
bucketStates.computeIfAbsent(tableBucket, k -> new BucketFetchState());
if (hasRecords) {
    if (state.consecutiveEmptyFetches > 0) {
        LOG.info(
```
Purpose
Linked issue: close #3006
For partitioned tables with many inactive partitions, the LogFetcher previously sent fetch requests to all subscribed buckets at the same frequency, wasting CPU and network resources on empty partitions.
This PR introduces BucketFetchRateController, which uses exponential backoff to reduce the fetch frequency of buckets that consistently return no data. Buckets that return data are always fetched at full frequency, and a single non-empty fetch immediately resets the backoff.
Backoff schedule: 1, 2, 4, 8, 16, 32 rounds (configurable max).
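The stated schedule can be reproduced with a minimal sketch (the standalone class and helper name are hypothetical; the formula follows the diff in this PR):

```java
public class BackoffScheduleDemo {
    private static final int MAX_BACKOFF_SHIFT = 5; // 2^5 = 32

    /** Skip rounds armed after the n-th consecutive empty fetch (n >= 1). */
    static int skipRoundsAfter(int n, int maxSkipRounds) {
        return Math.min(1 << Math.min(n - 1, MAX_BACKOFF_SHIFT), maxSkipRounds);
    }

    public static void main(String[] args) {
        StringBuilder schedule = new StringBuilder();
        for (int n = 1; n <= 7; n++) {
            schedule.append(skipRoundsAfter(n, 32)).append(n < 7 ? ", " : "");
        }
        // The backoff doubles per empty round, then stays at the cap.
        System.out.println(schedule); // prints 1, 2, 4, 8, 16, 32, 32
    }
}
```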
New config options:
Brief change log
Tests
API and Format
Documentation