Fix non-canonical payload attestation processing#9305
Conversation
| /// limits the number of concurrent states that can be loaded into memory for the committee cache. | ||
| /// The maximum number of concurrent shuffling "promises" that can be issued. In effect, this | ||
| /// limits the number of concurrent states that can be loaded into memory for the shuffling. | ||
| /// This prevents excessive memory usage at the cost of rejecting some attestations. |
There was a problem hiding this comment.
Not directly related to this PR - may be worth revisiting this pre-tree states constant, we could probably handle more than 2 now?
|
This pull request has merge conflicts. Could you please resolve them @michaelsproul? 🙏 |
|
I suggest to split the HTTP API cache into a separate thing so we can overload the existing cache with the PTC, and always have PTC entries |
- PR #9305 wants to store PTCs in the committee cache. BUT the http API route wants to use the committee cache and insert historical committees (i.e. given state at epoch 1000, compute and store the committee for epoch 900). If we want a single cache to serve both use cases we need to: - Have entries in the committee cache that have no PTC: Makes reading PTCs from the cache not deterministic - Compute historical PTC: A bunch of complicated code that's useless Instead we can add a separate cache for the API, very simple one, that caches committees only. And have the one in the beacon chain compute and cache PTCs always. ### Performance impact Slightly additional memory cost for users of the `beacon/states/committees` route. Caching is almost equivalent, except for queries of recent committees that may already exist in the beacon chain's committee cache. ### AI disclousure This PR was written by hand 90%. Claude fixed some warp type issues Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
| beacon_chain | ||
| .shuffling_cache | ||
| .write() | ||
| .insert_committee_cache(shuffling_id.clone(), cached_shuffling)?; |
There was a problem hiding this comment.
So we only prime this cache after gloas? Why not make CachedPTCs return PreGloas for those cases?
There was a problem hiding this comment.
No. We also hit this codepath pre-Gloas, and it works because CachedPTCs::from_state will return CachedPTCs::PreGloas.
The condition that causes us to skip priming is just for the Gloas fork boundary (shuffling_epoch is Gloas, but state is not).
…-committee-cache # Conflicts: # beacon_node/beacon_chain/src/payload_attestation_verification/tests.rs
- Remove unused `BeaconChainError::MissingPtcForGloasShuffling` variant (no producers remained after the earlier cleanup). - Drop the `Result<(), BeaconChainError>` return type from `ShufflingCache::insert_committee_cache`; both match arms are infallible. Update callers in `beacon_chain.rs`, `state_advance_timer.rs`, `shuffling_cache.rs` and the unit tests accordingly. - Trim stale "Replace the committee if it's not present" comment in `insert_committee_cache`; the Committee arm is now a no-op so only the `Promise(_) | None` whimsy line remains.
CachedPTCs::try_from_state now returns Result<Option<Self>, _> and internalises the boundary rule (pre-Gloas state, Gloas shuffling epoch => Ok(None)). Callers (block import priming, state advance timer, with_cached_shuffling miss path) just skip insertion on None instead of duplicating the guard. The unit test exercises the three boundary cases against a pre-Gloas state.
Issue Addressed
Breakout from:
We currently do not handle the verification of payload attestations on non-canonical side chains, we always attempt to use the head. The included regression test demonstrates this, and there is also a fork choice compliance test in #9295 that triggers it.
Proposed Changes
This PR is a bit opinionated, but I'll explain my judgements:
The other opinionated change is in the tests. The previous tests were set up kind of nicely to avoid instantiating a
BeaconChainHarness. However they were not using mocking, which made testing the non-canonical chain case kind of infeasible. To remedy this, I've changed them to just use a beacon chain harness and create two chains using its relatively easy to use methods for doing this. The running time of the tests goes from something like 2.6s for 8 tests to 3.3s for 9 tests, which is only an increase of 0.04s/test. Negligible. Another plus to using theBeaconChainHarnessis that it avoids a bunch of the cruft to create synthetic non-mocked beacon chain bits.At the same time, I've made some attempt to improve modularity (and fit with the
GossipVerificationContext) by pulling out the guts ofwith_committee_cacheinto a new function (with_cached_shuffling) that clearly shows its dependency surface.