Fix(raster): cancel in-flight raster tile requests when source URL changes - issue #7149 #7214

Open
pcardinal wants to merge 17 commits into maplibre:main from pcardinal:issue_7149

Conversation

@pcardinal
Contributor

Launch Checklist

  • Confirm your changes do not include backports from Mapbox projects (unless with compliant license) - if you are not sure about this, please ask!
  • Briefly describe the changes in this PR.
  • Link to related issues.
  • Include before/after visuals or gifs if this PR includes visual changes.
  • Write tests for all new functionality.
  • Document any changes to public APIs.
  • Post benchmark scores.
  • Add an entry to CHANGELOG.md under the ## main section.

Summary

  • This PR fixes a raster source lifecycle issue where calling RasterTileSource.setUrl could leave old tile requests running.
  • When the URL changes, in-flight tile requests are now aborted deterministically through TileManager.abortAllRequests, with cache-side support in TileCache.
  • The change ensures in-flight raster requests are cancelled before reloading source metadata.

Root cause
When a raster source URL was updated, the source reload path did not consistently cancel all active requests tied to the previous URL/state.

Changes included

  • Updated RasterTileSource.setUrl to trigger request cancellation through the tile manager before reload.
  • Updated request-cancellation behavior in TileManager.abortAllRequests.
  • Updated cache-level request cancellation in TileCache.abortAllRequests to ensure out-of-view pending raster requests are aborted.
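
The cancel-before-reload ordering listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: SketchTile, SketchTileManager, and SketchRasterSource are hypothetical stand-ins, the example URLs are placeholders, and the metadata reload is stubbed with a counter.

```typescript
// Hypothetical, simplified stand-ins; these are not the real MapLibre classes.
interface SketchTile {
    id: string;
    aborted: boolean;
    abortController: AbortController | null;
}

class SketchTileManager {
    tiles = new Map<string, SketchTile>();

    // Register a tile whose request is still in flight.
    startRequest(id: string): SketchTile {
        const tile: SketchTile = {id, aborted: false, abortController: new AbortController()};
        this.tiles.set(id, tile);
        return tile;
    }

    // Abort every tile that still has a pending request.
    abortAllRequests(): void {
        this.tiles.forEach((tile) => {
            if (!tile.abortController) return;
            tile.aborted = true;
            tile.abortController.abort();
            tile.abortController = null;
        });
    }
}

class SketchRasterSource {
    url: string;
    metadataReloads = 0;
    manager: SketchTileManager;

    constructor(url: string, manager: SketchTileManager) {
        this.url = url;
        this.manager = manager;
    }

    setUrl(url: string): this {
        this.url = url;
        // Cancel requests issued for the old URL first...
        this.manager.abortAllRequests();
        // ...then reload the source metadata (stubbed here as a counter).
        this.metadataReloads++;
        return this;
    }
}

const demoManager = new SketchTileManager();
const pendingTile = demoManager.startRequest('0/0/0');
const pendingSignal = pendingTile.abortController!.signal;
const demoSource = new SketchRasterSource('https://example.com/old.json', demoManager);
demoSource.setUrl('https://example.com/new.json');
console.log(pendingTile.aborted, pendingSignal.aborted); // prints "true true"
```

The signal is captured before setUrl runs because the sketch clears the controller reference once the request has been aborted.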

Tests added/updated

  • raster_tile_source.test.ts - setUrl aborts in-flight raster tile requests through TileManager
  • tile_manager.test.ts - coverage for TileManager.abortAllRequests
  • tile_cache.test.ts - coverage for TileCache.abortAllRequests

How to validate locally
npm run test-unit -- src/source/raster_tile_source.test.ts
npm run test-unit -- src/tile/tile_manager.test.ts
npm run test-unit -- src/tile/tile_cache.test.ts

Impact

  • No public API changes.
  • Internal behavior change only for raster request cancellation on URL update.
  • Prevents stale requests/flicker and reduces unnecessary network activity.

Issue
Fixes #7149

@codecov

codecov bot commented Mar 3, 2026

Codecov Report

❌ Patch coverage is 87.09677% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.73%. Comparing base (0cac8ee) to head (266ef76).
⚠️ Report is 4 commits behind head on main.

Files with missing lines            Patch %   Lines
src/source/vector_tile_source.ts    83.78%    6 Missing ⚠️
src/tile/tile_manager.ts            87.50%    2 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7214      +/-   ##
==========================================
+ Coverage   92.45%   92.73%   +0.28%     
==========================================
  Files         288      288              
  Lines       24043    24095      +52     
  Branches     5093     5107      +14     
==========================================
+ Hits        22228    22344     +116     
+ Misses       1815     1751      -64     

@pcardinal
Contributor Author

pcardinal commented Mar 7, 2026

Vector and raster tiles are both addressed.
If this works, the title of this issue may need to be changed.

@HarelM
Collaborator

HarelM commented Mar 7, 2026

Thanks for the time spent on this!
I haven't reviewed the test code yet, let's make sure we address the production code comments first.

@HarelM
Collaborator

HarelM commented Mar 10, 2026

I haven't reviewed the tests yet as I want to make sure the business logic is correct first.
Once the code looks good I'll do a second review for the tests.
Thanks for taking the time to open this PR!

pcardinal force-pushed the issue_7149 branch 4 times, most recently from e3f9e81 to d2703a8 on March 10, 2026 16:14
}

// Keep all callers waiting for a reload while this tile is already loading.
// Resolve/reject cascades so every queued caller settles when the queued reload completes.
Collaborator

This is about the third to fifth time I'm reading this, and it still isn't clear what's happening here.
I think the comment explains it, but I still find this code cumbersome.
Is there a better way to manage a queue than this obscure promise-inside-a-promise construct?
It might also be better to rename tile.reloadPromise to something like reloadResolveReject, as it's not really a promise but something that can resolve or reject a promise, and I think that adds to the confusion (to be fair, I think I gave it that "bad" name).

Contributor Author

Thanks for calling this out. I agree the previous implementation was hard to read.

I refactored the reload-while-loading flow to use an explicit queue of waiters instead of chaining resolve/reject callbacks through nested promises. This makes the behavior easier to follow:

  • when a tile is already loading, each caller is added to a queued waiters list
  • when loading finishes, we run at most one queued reload
  • then we resolve or reject all queued callers together

I also renamed the tile field from reloadPromise to queuedReloadWaiters, since it stores resolver pairs rather than an actual Promise.

Behavior is unchanged, but the control flow and naming are now much clearer. I validated this with the targeted unit tests for both:

  • single reload while loading
  • multiple queued reload callers resolved by one queued worker reload
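
For readers following the thread, the waiter-queue flow described in this comment might look roughly like the sketch below. Only the field name queuedReloadWaiters comes from the comment; SketchTile, SketchSource, loadFromWorker, and attemptLoad are hypothetical names, and the worker round trip is simulated with a resolved promise.

```typescript
// Hypothetical sketch; only the name queuedReloadWaiters comes from the PR discussion.
interface ReloadWaiter {
    resolve: () => void;
    reject: (error: unknown) => void;
}

class SketchTile {
    loading = false;
    queuedReloadWaiters: ReloadWaiter[] = [];
}

type LoadOutcome = {ok: true} | {ok: false; error: unknown};

class SketchSource {
    workerLoads = 0;

    // Stand-in for the real worker round trip.
    async loadFromWorker(): Promise<void> {
        this.workerLoads++;
        await Promise.resolve();
    }

    // Run one load and capture its outcome instead of throwing.
    async attemptLoad(): Promise<LoadOutcome> {
        try {
            await this.loadFromWorker();
            return {ok: true};
        } catch (error) {
            return {ok: false, error};
        }
    }

    async loadTile(tile: SketchTile): Promise<void> {
        if (tile.loading) {
            // Already loading: park this caller in the queue.
            return new Promise<void>((resolve, reject) => {
                tile.queuedReloadWaiters.push({resolve, reject});
            });
        }
        tile.loading = true;
        const first = await this.attemptLoad();
        if (tile.queuedReloadWaiters.length > 0) {
            // Run at most one queued reload, however many callers are waiting.
            const reload = await this.attemptLoad();
            const waiters = tile.queuedReloadWaiters;
            tile.queuedReloadWaiters = [];
            // Settle all queued callers together with the reload's outcome.
            waiters.forEach((waiter) => {
                if (reload.ok) waiter.resolve();
                else waiter.reject(reload.error);
            });
        }
        tile.loading = false;
        if (!first.ok) throw first.error;
    }
}

// Three overlapping calls: one initial load plus a single queued reload.
async function demoQueuedReload(): Promise<number> {
    const source = new SketchSource();
    const tile = new SketchTile();
    await Promise.all([source.loadTile(tile), source.loadTile(tile), source.loadTile(tile)]);
    return source.workerLoads; // 2
}
```

One simplification in this sketch: callers that arrive while the queued reload is running join the same batch and settle with it, rather than triggering a further reload.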

}

async abortTile(tile: Tile): Promise<void> {
    tile.aborted = true;
Collaborator

I wonder why aborted is not a state of the tile and instead it's a boolean field, but that is probably outside the scope of this PR...

Contributor Author

aborted is a boolean (not a TileState) because it represents a different axis of information than state:

  1. state (state: TileState;) models data/render lifecycle, not request cancellation
    It answers: “what data status does this tile currently have?” (loading, loaded, reloading, unloaded, errored, expired) in tile.ts:47.

  2. aborted (aborted: boolean;) is a transient control flag for in-flight async work
    It answers: “should the current async load result be ignored/cancelled?” It is checked right after worker response/error handling in vector_tile_source.ts:248 and vector_tile_source.ts:264, and set in abort path at vector_tile_source.ts:347.

  3. A tile can keep a meaningful render state while a request is aborted
    For example, when reloading an already renderable tile, aborting should not necessarily force a new lifecycle state; it may keep prior usable data and just stop/ignore the current fetch. The code explicitly only forces unloaded in some cases (e.g. when it had no renderable data before), see vector_tile_source.ts:249 and vector_tile_source.ts:265.

  4. This pattern is used across the manager as request-level metadata
    TileManager.abortAllRequests() marks tile.aborted = true independently of lifecycle state in tile_manager.ts:136, which reinforces that cancellation is orthogonal to the main tile state machine.

So in short: state is “what the tile is,” aborted is “what to do with the current request.” Keeping them separate avoids state explosion and preserves correct rendering behavior during cancellations.
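
A small sketch of the two axes, under stated assumptions: the state names are the ones listed in point 1, while SketchTile and onWorkerResponse are hypothetical and only approximate the check that the comment says happens after the worker responds.

```typescript
// Lifecycle states as listed in the comment above.
type TileState = 'loading' | 'loaded' | 'reloading' | 'unloaded' | 'errored' | 'expired';

// Hypothetical minimal tile; the real Tile class has many more fields.
interface SketchTile {
    state: TileState;
    aborted: boolean;
    data: string | null;
}

// Called when a (simulated) worker response arrives for the tile.
// Returns true if the response was applied, false if it was ignored.
function onWorkerResponse(tile: SketchTile, data: string): boolean {
    if (tile.aborted) {
        // The request was cancelled: ignore the result. Only a tile with
        // nothing renderable is forced back to 'unloaded'.
        if (tile.data === null) tile.state = 'unloaded';
        return false;
    }
    tile.data = data;
    tile.state = 'loaded';
    return true;
}

const freshTile: SketchTile = {state: 'loading', aborted: true, data: null};
const reloadingTile: SketchTile = {state: 'reloading', aborted: true, data: 'old pixels'};
onWorkerResponse(freshTile, 'new pixels');
onWorkerResponse(reloadingTile, 'new pixels');
console.log(freshTile.state, reloadingTile.state, reloadingTile.data);
// prints "unloaded reloading old pixels"
```

In this sketch the aborted reload leaves both the lifecycle state and the previously rendered data untouched, which is the separation the comment argues for.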

async abortTile(tile: Tile): Promise<void> {
    tile.aborted = true;
    if (!tile.hasData()) {
        tile.state = 'unloaded';
Collaborator

Wouldn't this happen as a reaction to the abort call below from the abort controller? Is it mandatory to set this here?

Contributor Author

Setting tile.state = 'unloaded' here is intentional and complementary to AbortController.abort(). The controller only cancels the in-flight async work; it does not guarantee the tile lifecycle state is updated consistently. Marking non-renderable tiles as unloaded ensures we do not keep a loading/aborted tile in an inconsistent state, while preserving renderable tiles (hasData()) to avoid flicker.
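
As an illustration of that split of responsibilities, here is a minimal sketch assuming a hypothetical SketchTile with only the fields involved. It mirrors the shape of the diff context above (set aborted, reset non-renderable tiles, then abort the controller) but is not the actual vector_tile_source.ts implementation.

```typescript
type TileState = 'loading' | 'loaded' | 'reloading' | 'unloaded' | 'errored' | 'expired';

// Hypothetical minimal tile with just the fields this sketch needs.
class SketchTile {
    state: TileState = 'loading';
    aborted = false;
    abortController: AbortController | null = new AbortController();
    data: string | null = null;

    hasData(): boolean {
        return this.data !== null;
    }
}

function abortTile(tile: SketchTile): void {
    tile.aborted = true;
    // A tile with nothing to render should not stay stuck in 'loading'.
    if (!tile.hasData()) {
        tile.state = 'unloaded';
    }
    // Cancel the in-flight work itself; this alone does not touch tile.state.
    if (tile.abortController) {
        tile.abortController.abort();
        tile.abortController = null;
    }
}

const emptyTile = new SketchTile();
const renderableTile = new SketchTile();
renderableTile.state = 'reloading';
renderableTile.data = 'old pixels';
abortTile(emptyTile);
abortTile(renderableTile);
console.log(emptyTile.state, renderableTile.state); // prints "unloaded reloading"
```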


});

test('falls back to cache miss after out-of-view cache reset', () => {
Collaborator

I think this is another test not related to this PR, right?

Contributor Author

Removed. This test seems unrelated to the PR scope since it exercises TileManager cache reset behavior on reload, not the abortPendingTileRequests path.


test('calls abortTile before unloadTile for unfinished tile', () => {
    const tileID = new OverscaledTileID(0, 0, 0, 0, 0);
    const calls: string[] = [];
Collaborator

Same as previous comment with the spies.

Contributor Author

@pcardinal pcardinal Mar 23, 2026

Which comment?

tileManager._paused = false;
tileManager.transform = new MercatorTransform() as any;

(tileManager as any)._dataHandler({dataType: 'source', abortPendingTileRequests: true});
Collaborator

Please use public APIs only; this "as any" is a sign of a bad test.

Contributor Author

I cleaned up the two neighboring unit tests to use public APIs only and removed the private/internal access patterns introduced in this thread.

What was changed:

  1. Refactored the test at tile_manager.test.ts:537 to stop calling private handlers and stop using internal state mutation.

  2. Refactored the adjacent test at tile_manager.test.ts:571 with the same approach.

  3. Replaced direct private calls and state poking with public behavior via source events:

    • onAdd(...)
    • update(transform)
    • getSource().fire(new Event('data', ...))

Validation:

  1. Targeted test run for “ignores content events after abortPendingTileRequests until metadata arrives” passed.
  2. Targeted test run for “forwards sourceDataChanged and shouldReloadTileOptions to reload” passed.

Scope note:

I only changed code authored in this conversation, as requested, and left other existing "as any" usages untouched.


test('ignores content events after abortPendingTileRequests until metadata arrives', () => {
    const tileManager = createTileManager({});
    const abortAllSpy = vi.spyOn(tileManager, 'abortAllRequests');
Collaborator

I'm not sure that's the right way to test how this class behaves. It's best to either mock other classes this class uses or check events and other public APIs.

Contributor Author

Thanks for the feedback. I updated the two latest tests to avoid asserting TileManager internals and to focus on observable behavior.

  • test('ignores content events after abortPendingTileRequests until metadata arrives')
  • test('forwards sourceDataChanged and shouldReloadTileOptions to reload')

What changed:

  1. Removed spies on TileManager methods (abortAllRequests, reload, update).

  2. Kept event-driven setup through public source data events.

  3. Asserted behavior through public/collaborator effects instead:

    • loaded() transitions after abortPendingTileRequests
    • reload behavior inferred from Source.loadTile call count changes
    • forwarding of shouldReloadTileOptions verified via Source.shouldReloadTile arguments

I reran the two updated tests, and both pass.

expect(updateSpy).toHaveBeenCalledTimes(1);
});

test('forwards sourceDataChanged and shouldReloadTileOptions to reload', () => {
Collaborator

See my previous comment, this test mocks too much of the class under test.

Contributor Author

See the previous comments on R568 and R576.

const nonLoadingTile = tileManager.addTile(new OverscaledTileID(1, 0, 1, 1, 1));
nonLoadingTile.state = 'loaded';

const abortTileSpy = vi.spyOn(tileManager._source, 'abortTile');
Collaborator

I would feel better to simply check the state of the tiles and avoid making sure some internal methods are called, as those can change in the future.

Contributor Author

Done! I've refactored both tests to remove the internal spy assertions and focus only on observable behavior: the tile states.

The tests now:

  1. First test: directly asserts that loadingTile.aborted is true and nonLoadingTile.aborted is not, checking only the public state changes.
  2. Second test: asserts that tile.aborted is true after calling abortAllRequests(), verifying the end result without coupling to internal method calls.

This approach follows the project guidelines: "Prefer assertions on public behavior (emitted events, public state, return values) over private/internal method calls." The tests are now more resilient to future implementation refactors.
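
The state-based assertion style can be illustrated with a framework-free sketch. Everything here is hypothetical: SketchTileManager is a stand-in rather than the real TileManager, expectEqual replaces the test framework's expect, and the rule that only loading tiles are aborted is an assumption made for the example.

```typescript
// Hypothetical stand-ins for the tile and manager under test.
interface SketchTile {
    state: 'loading' | 'loaded';
    aborted: boolean;
}

class SketchTileManager {
    tiles: SketchTile[] = [];

    addTile(state: SketchTile['state']): SketchTile {
        const tile: SketchTile = {state, aborted: false};
        this.tiles.push(tile);
        return tile;
    }

    // Assumption for this sketch: only tiles that are still loading are aborted.
    abortAllRequests(): void {
        this.tiles.forEach((tile) => {
            if (tile.state === 'loading') tile.aborted = true;
        });
    }
}

// Tiny assertion helper so the sketch needs no test framework.
function expectEqual<T>(actual: T, expected: T, label: string): void {
    if (actual !== expected) {
        throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
    }
}

function testAbortAllRequestsMarksOnlyLoadingTiles(): void {
    const tileManager = new SketchTileManager();
    const loadingTile = tileManager.addTile('loading');
    const nonLoadingTile = tileManager.addTile('loaded');

    tileManager.abortAllRequests();

    // Assert on the resulting tile state, not on which internal method ran.
    expectEqual(loadingTile.aborted, true, 'loadingTile.aborted');
    expectEqual(nonLoadingTile.aborted, false, 'nonLoadingTile.aborted');
}

testAbortAllRequestsMarksOnlyLoadingTiles();
```

Because nothing is spied on, the manager is free to change how it aborts tiles internally without breaking the test.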

tile.state = 'loaded';
tile.abortController = {abort: vi.fn()} as unknown as AbortController;

const abortTileSpy = vi.spyOn(tileManager._source, 'abortTile');
Collaborator

@HarelM HarelM Mar 17, 2026

If the outcome is that the tile is aborted by calling abortAllRequests I think that's a good enough test, no need to drill down to the exact method that was called.

Contributor Author

See the previous comment on line R2711.

Development

Successfully merging this pull request may close these issues.

source.setUrl does not abort in-flight raster tile requests (tilejson completed)

3 participants