Gloas data column reprocess queue #9339
const MAXIMUM_QUEUED_ATTESTATIONS: usize = 16_384;

/// How many columns we keep before new ones get dropped.
const MAXIMUM_QUEUED_DATA_COLUMNS: usize = 256;
I picked 256 so we could cache up to two slots' worth of columns in the supernode case.
This is around 24 MB of storage. I concur.
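For reference, the sizing argument can be sketched as below. This is only an illustration, not the PR's code: `QueuedColumn`, `BoundedColumnQueue`, `slots_covered`, and `implied_kib_per_column` are hypothetical names, and the figure of 128 columns per slot for a supernode is an assumption rather than something stated in this thread. The queue rejects new columns once the cap is reached, matching the doc comment on the constant.

```rust
/// Cap taken from the diff under review.
const MAXIMUM_QUEUED_DATA_COLUMNS: usize = 256;

/// Assumption (not stated in the thread): a supernode custodies
/// 128 columns per slot.
const ASSUMED_COLUMNS_PER_SLOT: usize = 128;

/// Hypothetical stand-in for a queued gossip data column.
#[derive(Debug, Clone, PartialEq)]
pub struct QueuedColumn {
    pub index: u64,
}

/// Hypothetical bounded queue: once full, new columns are dropped.
#[derive(Default)]
pub struct BoundedColumnQueue {
    columns: Vec<QueuedColumn>,
}

impl BoundedColumnQueue {
    /// Returns `false` (dropping the column) when the queue is full.
    pub fn try_push(&mut self, column: QueuedColumn) -> bool {
        if self.columns.len() >= MAXIMUM_QUEUED_DATA_COLUMNS {
            return false;
        }
        self.columns.push(column);
        true
    }

    pub fn len(&self) -> usize {
        self.columns.len()
    }
}

/// Number of full slots' worth of columns that fit under the cap.
pub const fn slots_covered() -> usize {
    MAXIMUM_QUEUED_DATA_COLUMNS / ASSUMED_COLUMNS_PER_SLOT
}

/// Per-column budget in KiB implied by a total budget given in MiB.
pub const fn implied_kib_per_column(total_mib: usize) -> usize {
    total_mib * 1024 / MAXIMUM_QUEUED_DATA_COLUMNS
}
```

Under those assumptions the cap covers two slots, and taking the roughly 24 MB estimate at face value implies a budget of about 96 KiB per queued column.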
}
ReadyWork::DataColumn(QueuedGossipDataColumn { process_fn, .. }) => Self {
drop_during_sync: true,
work: Work::UnknownBlockAttestation { process_fn },
lol, funnily enough it still works, but yeah, fixed
// Queue the column for reprocessing when the block arrives.
let processor = self.clone();
let reprocess_msg =
ReprocessQueueMessage::UnknownBlockDataColumn(QueuedGossipDataColumn {
This can happen in a loop, no?
Attacker sends random columns with random block roots -> we send them to the reprocess queue -> timer expires -> we try again -> we send them to the reprocess queue again.
We might need an allow_reprocess kind of flag here, like the one we have for attestations?
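A minimal sketch of the kind of guard being suggested, under stated assumptions: `Outcome`, `handle_column`, and `attempts_until_settled` are hypothetical names for illustration, and `block_known` stands in for whatever block-root check gossip verification performs. The idea is that a column is allowed into the reprocess queue once; when it comes back out on timer expiry it is handled with `allow_reprocess = false`, so an unknown root leads to a drop instead of another round trip.

```rust
/// Hypothetical result of handling one gossip data column.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// The block is known, so the column can be verified and processed.
    Processed,
    /// The block is unknown and this is the first attempt: queue it.
    Requeued,
    /// The block is still unknown on the retry: give up.
    Dropped,
}

/// Hypothetical handler. `allow_reprocess` is `true` for a column fresh
/// off gossip and `false` for one released from the reprocess queue.
pub fn handle_column(block_known: bool, allow_reprocess: bool) -> Outcome {
    if block_known {
        Outcome::Processed
    } else if allow_reprocess {
        Outcome::Requeued
    } else {
        Outcome::Dropped
    }
}

/// Drives a column whose block is unknown on arrival through the queue and
/// reports how many times it was handled and how it ended up.
pub fn attempts_until_settled(block_known_on_retry: bool) -> (usize, Outcome) {
    let mut attempts = 1;
    let first = handle_column(false, true);
    if first != Outcome::Requeued {
        return (attempts, first);
    }
    // Timer expired or block imported: retry, but never re-queue.
    attempts += 1;
    (attempts, handle_column(block_known_on_retry, false))
}
```

With this shape, a column referencing a root that never arrives is handled at most twice, which bounds the work an attacker can trigger per message.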
/// Queued backfill batches
queued_backfill_batches: Vec<QueuedBackfillBatch>,
/// Queued gossip data columns awaiting their block.
queued_gossip_data_columns: FnvHashMap<usize, (QueuedGossipDataColumn, DelayKey)>,
I'm not sure why we need to store individual indices separately.
The logic could be similar to the envelope case, where we store all columns for a given root up to the queue size. If a root is imported, we release everything under it at once. The timer can then be per block root instead of per column.
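The per-root layout proposed here could look roughly like the sketch below. It is an illustration only: `ColumnsAwaitingBlock`, `QueuedColumn`, and the `[u8; 32]` root type are hypothetical stand-ins, and the delay-queue timer is left out so the example stays self-contained. Columns are grouped under their block root, a single global cap is enforced across all roots, and `release` hands back every column for a root in one call, which is what a block import or a per-root timer expiry would trigger.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a 32-byte block root.
pub type BlockRoot = [u8; 32];

/// Hypothetical stand-in for a queued gossip data column.
#[derive(Debug, Clone, PartialEq)]
pub struct QueuedColumn {
    pub index: u64,
}

/// Columns grouped by the block root they are waiting for.
pub struct ColumnsAwaitingBlock {
    max_columns: usize,
    total: usize,
    per_root: HashMap<BlockRoot, Vec<QueuedColumn>>,
}

impl ColumnsAwaitingBlock {
    pub fn new(max_columns: usize) -> Self {
        Self {
            max_columns,
            total: 0,
            per_root: HashMap::new(),
        }
    }

    /// Queues a column under its root; returns `false` if the global cap
    /// is reached and the column was dropped.
    pub fn insert(&mut self, root: BlockRoot, column: QueuedColumn) -> bool {
        if self.total >= self.max_columns {
            return false;
        }
        self.per_root.entry(root).or_default().push(column);
        self.total += 1;
        true
    }

    /// Removes and returns every column waiting on `root`, in arrival order.
    /// Intended for both block import and per-root timer expiry.
    pub fn release(&mut self, root: &BlockRoot) -> Vec<QueuedColumn> {
        let columns = self.per_root.remove(root).unwrap_or_default();
        self.total -= columns.len();
        columns
    }

    pub fn len(&self) -> usize {
        self.total
    }
}
```

One consequence of this design is that only one timer entry per root is needed, rather than one per column.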
Sorry, there was a bit of slop in this PR. I was scrambling to fix things before the devnet forked to Gloas and missed some Claude nonsense.
if let Some(oldest_root) =
self.awaiting_envelopes_per_root.keys().next().copied()
&& let Some((_envelope, delay_key)) =
self.awaiting_envelopes_per_root.remove(&oldest_root)
{
self.envelope_delay_queue.remove(&delay_key);
}
We were just dropping an arbitrary envelope here, which didn't seem right.
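To illustrate the concern: if `awaiting_envelopes_per_root` is a hash map (an assumption, the type is not shown in this snippet), `keys().next()` returns whichever key happens to come first in iteration order, not the oldest one, despite the `oldest_root` binding name. The replacement logic is not shown in this thread. One possible alternative, sketched here with hypothetical names (`FifoByRoot`, `BlockRoot`), is to track insertion order explicitly so that eviction really does remove the oldest entry.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for a 32-byte block root.
pub type BlockRoot = [u8; 32];

/// Hypothetical map keyed by block root that evicts in insertion order.
pub struct FifoByRoot<T> {
    capacity: usize,
    order: VecDeque<BlockRoot>,
    items: HashMap<BlockRoot, T>,
}

impl<T> FifoByRoot<T> {
    /// `capacity` must be at least 1.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self {
            capacity,
            order: VecDeque::new(),
            items: HashMap::new(),
        }
    }

    /// Inserts an item. If the map is full, the oldest root is evicted
    /// and returned. Re-inserting a known root replaces its item and
    /// keeps its original position in the eviction order.
    pub fn insert(&mut self, root: BlockRoot, item: T) -> Option<BlockRoot> {
        if self.items.contains_key(&root) {
            self.items.insert(root, item);
            return None;
        }
        let mut evicted = None;
        if self.items.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.items.remove(&oldest);
                evicted = Some(oldest);
            }
        }
        self.order.push_back(root);
        self.items.insert(root, item);
        evicted
    }

    /// Removes an item (for example when its block is imported).
    pub fn remove(&mut self, root: &BlockRoot) -> Option<T> {
        let item = self.items.remove(root)?;
        self.order.retain(|r| r != root);
        Some(item)
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}
```

Simply rejecting new entries when the queue is full is another option that avoids eviction altogether; which policy fits best depends on whether older or newer envelopes are more likely to be useful.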
Some required checks have failed. Could you please take a look @eserilev? 🙏
Issue Addressed
When debugging ePBS with columns, we noticed that columns arriving before their block don't pass gossip verification checks and are dropped. This PR ensures that columns arriving before the block are sent to the reprocess queue. Once their block arrives, they are reprocessed.
This isn't an issue pre-Gloas because we don't make block root checks for Fulu data columns. This allows us to gossip verify the column and send it to the DA cache before the block arrives.
I think we also need to handle this edge case for partial data columns. There's an existing TODO for that already.