executor, distsql: reduce the log of dispatching mpp tasks#68041
gengliqi wants to merge 4 commits into pingcap:master
Conversation
Signed-off-by: gengliqi <gengliqiii@gmail.com>
Skipping CI for Draft Pull Request.
Hi @gengliqi. Thanks for your PR. PRs from untrusted users cannot be marked as trusted. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
🧹 Nitpick comments (1)
pkg/executor/internal/mpp/local_mpp_coordinator.go (1)
265-277: Aggregated dispatch log: consider a couple of small polish items.

The consolidation looks good and meets the PR goal. A few optional points worth considering:

- Log-string change is a breaking signal for operators. The message changed from the per-task `"Dispatch mpp task"` to `"Dispatch mpp tasks"`. Any external log alerting/grep that matched the old string will silently go dark. Worth calling out in the release note (currently `None`) so operators can update dashboards/alerts.
- `task-count` is redundant with `IDs`. `zap.Int64s("IDs", taskIDs)` already encodes the count; `zap.Int("task-count", len(taskIDs))` duplicates that info. Fine to keep for human readability, but you can drop it if terseness is preferred.
- Field-name style is mixed (`timestamp`, `IDs`, `QueryTs`, `LocalQueryId`, `ServerID`, `addresses`, `mpp-version`, `GatherID`, `resource_group`). This mirrors the legacy log so it's not a regression, but while you're touching this line it's a cheap opportunity to pick one style (the rest of the file tends toward lower-kebab like `mpp-version`, `task-id`).
- `firstTask.StartTs` vs `c.startTS`. For a fragment dispatched by this coordinator these should always match; using `c.startTS` directly would avoid the implicit assumption and remove the need to read `firstTask` for that field.

No functional concerns; these are all nits.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/executor/internal/mpp/local_mpp_coordinator.go` around lines 265 - 277, Restore the original log message and tidy redundant/mixed fields in the consolidated dispatch log: in the log statement built in local_mpp_coordinator.go (the logutil.BgLogger().Info call referencing firstTask, taskIDs, pf.Sink, c.gatherID, rgName), change the message back to "Dispatch mpp task" to avoid breaking alerts, remove the redundant zap.Int("task-count", len(taskIDs)) since zap.Int64s("IDs", taskIDs) already conveys count, replace firstTask.StartTs with c.startTS to avoid the implicit dependency on firstTask, and normalize key names to a consistent lower-kebab style (e.g., query-ts, local-query-id, server-id, gather-id, resource-group) while keeping existing values (firstTask.MppQueryID.*, c.gatherID, rgName, addresses, plannercore.ToString(pf.Sink), firstTask.MppVersion, pf.Sink.GetCompressionMode()).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f3804ee6-fa49-4752-95de-249ffd50749a
📒 Files selected for processing (1)
pkg/executor/internal/mpp/local_mpp_coordinator.go
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@             Coverage Diff              @@
##             master     #68041    +/-  ##
================================================
- Coverage   77.7558%   77.0719%  -0.6839%
================================================
  Files          1990       1972       -18
  Lines        551768     552429      +661
================================================
- Hits         429032     425768     -3264
- Misses      121816     126649     +4833
+ Partials       920         12      -908
```
Flags with carried forward coverage won't be shown. Click here to find out more.
🧹 Nitpick comments (1)
pkg/executor/internal/mpp/local_mpp_coordinator.go (1)
265-277: Aggregated dispatch log: consider a brief comment on the invariant.

The aggregated log reuses `firstTask.StartTs`, `firstTask.MppQueryID`, and `firstTask.MppVersion` to represent the whole fragment. That is correct because all tasks produced for a single fragment share the same query/gather metadata, but it's not self-evident to a future reader skimming this method. A one-line comment above the log (e.g., "all tasks in the fragment share StartTs/MppQueryID/MppVersion") would make the intent obvious and prevent someone from later "fixing" this to a per-task loop.

Also note the positional correspondence between `taskIDs[i]` and `addresses[i]` is implicit; since this log is intended for operators rather than structured parsing, that's acceptable. Just worth being aware of if downstream tooling ever starts consuming these fields.

✍️ Suggested clarifying comment

```diff
 if len(tasks) > 0 {
+	// All tasks within a fragment share the same query-level metadata
+	// (StartTs/MppQueryID/MppVersion), so it's safe to take them from tasks[0].
+	// taskIDs[i] corresponds positionally to addresses[i].
 	firstTask := tasks[0]
 	logutil.BgLogger().Info("Dispatch mpp tasks",
 		zap.Uint64("timestamp", firstTask.StartTs),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/executor/internal/mpp/local_mpp_coordinator.go` around lines 265 - 277, Add a one-line clarifying comment immediately above the aggregated dispatch log in LocalMPPCoordinator (the block using firstTask, tasks, taskIDs, addresses, and c.gatherID) stating the invariant that all tasks in this fragment share the same StartTs, MppQueryID, and MppVersion so using firstTask values is intentional; also note that taskIDs[index] corresponds positionally to addresses[index] to avoid future accidental per-task refactors or misinterpretation of the logged arrays.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f45b6296-ebb4-4063-bce9-6312f5dd607a
📒 Files selected for processing (1)
pkg/executor/internal/mpp/local_mpp_coordinator.go
/cc @windtalker

/cc @xzhangxian1008
```go
if config.GetGlobalConfig().DisaggregatedTiFlash {
	computeStores, getStoreErr := cache.GetTiFlashComputeStores(
		backoff.NewBackoffer(ctx, copr.CopNextMaxBackoff).TiKVBackoffer())
	if getStoreErr == nil {
```
what happens if getStoreErr is not nil?
Traffic that could cross zones might be counted as traffic within the same zone, and the store ID in the logs might be 0.
[APPROVAL NOTIFIER] This PR is NOT APPROVED.

This pull-request has been approved by: xzhangxian1008. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
[LGTM Timeline notifier] Timeline:
Signed-off-by: gengliqi <gengliqiii@gmail.com>
♻️ Duplicate comments (1)
pkg/executor/internal/mpp/local_mpp_coordinator.go (1)
988-995: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle `GetTiFlashComputeStores` failures explicitly instead of silent fallback.

When `getStoreErr != nil`, the code silently skips compute stores. That can leave `allTiFlashStoreInfo` incomplete, which later causes unknown zones / `storeID=0` in aggregated dispatch logs and zone-traffic misclassification.

Suggested fix

```diff
 if config.GetGlobalConfig().DisaggregatedTiFlash {
 	computeStores, getStoreErr := cache.GetTiFlashComputeStores(
 		backoff.NewBackoffer(ctx, copr.CopNextMaxBackoff).TiKVBackoffer())
-	if getStoreErr == nil {
+	if getStoreErr != nil {
+		logutil.BgLogger().Warn("failed to get tiflash compute stores for mpp dispatch",
+			zap.Error(getStoreErr),
+			zap.Uint64("txnStartTS", c.startTS),
+			zap.Uint64("gatherID", c.gatherID))
+	} else {
 		for _, tiflashStore := range computeStores {
 			addTiFlashStoreInfo(allTiFlashStoreInfo, tiflashStore)
 		}
 	}
 }
```

As per coding guidelines: "Keep error handling actionable and contextual; avoid silently swallowing errors."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/executor/internal/mpp/local_mpp_coordinator.go` around lines 988 - 995, The code currently swallows errors from cache.GetTiFlashComputeStores causing missing entries in allTiFlashStoreInfo; update the DisaggregatedTiFlash branch to handle getStoreErr explicitly: check if getStoreErr != nil and then either return or propagate the error (or at minimum log it with contextual information including the backoffer, ctx/region/store identifiers and the fact it's populating allTiFlashStoreInfo) before skipping the loop; keep the existing successful path that calls addTiFlashStoreInfo for each tiflashStore, but do not silently ignore failures from GetTiFlashComputeStores.
🧹 Nitpick comments (1)
pkg/executor/internal/mpp/local_mpp_coordinator.go (1)
282-293: ⚡ Quick win

Consider capping the `IDs`/`storeIDs` log payload size for very large fragments.

Aggregating per fragment is great, but logging full arrays can still create huge single log entries at scale. Consider logging the count plus truncated samples to keep log cost predictable.

Possible approach

```diff
+const maxLogged = 128
+loggedTaskIDs := taskIDs
+loggedStoreIDs := storeIDs
+truncated := false
+if len(loggedTaskIDs) > maxLogged {
+	loggedTaskIDs = loggedTaskIDs[:maxLogged]
+	loggedStoreIDs = loggedStoreIDs[:maxLogged]
+	truncated = true
+}
 logutil.BgLogger().Info("Dispatch mpp tasks",
 	zap.Uint64("timestamp", firstTask.StartTs),
-	zap.Int64s("IDs", taskIDs), zap.Uint64s("storeIDs", storeIDs),
+	zap.Int("taskCount", len(taskIDs)),
+	zap.Bool("idsTruncated", truncated),
+	zap.Int64s("IDs", loggedTaskIDs), zap.Uint64s("storeIDs", loggedStoreIDs),
 	...
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/executor/internal/mpp/local_mpp_coordinator.go` around lines 282 - 293, The "Dispatch mpp tasks" log is currently emitting entire taskIDs and storeIDs arrays (variables taskIDs and storeIDs) which can blow up log size; change the log to instead emit their counts and small truncated samples (e.g., len(taskIDs)/len(storeIDs) plus first N elements or a hash/summary) while keeping the existing context fields (firstTask.MppQueryID.*, firstTask.MppVersion, pf.Sink plan/compression, c.gatherID, rgName, sqlDigest, planDigestStr) so callers like the Dispatch mpp tasks logging site produce predictable, bounded-sized entries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8d5b971d-e2d2-4071-91c0-92e96de9eadd
📒 Files selected for processing (1)
pkg/executor/internal/mpp/local_mpp_coordinator.go
What problem does this PR solve?
Issue Number: close #68040
Problem Summary:
What changed and how does it work?
Aggregate the dispatching mpp task log from one line per task to one line per fragment.
The aggregated log now records task IDs and TiFlash store IDs instead of verbose addresses.
BTW, this PR also includes TiFlash compute nodes in the store info map in disaggregated mode, so zone-based traffic statistics will not miss them.
Run TPC-H Q9 with 2 TiFlash
Before
After
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
Note: Internal optimizations only; no visible end-user changes.