Bwatch #9069
Draft
sangbida wants to merge 102 commits into ElementsProject:master from
Conversation
841b6d8 to 1958906
sangbida commented Apr 23, 2026
@@ -0,0 +1,44 @@
#include "config.h"
#include <ccan/array_size/array_size.h>
Collaborator (Author)
Call these cln-bwatch
Collaborator (Author)
Think about how we might rescan scriptpubkeys on migration; we don't really have to rescan scriptpubkeys more than once.
Like bitcoin_txid, they are special backwards-printed snowflakes. Thanks Obama!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These helper functions decode hex strings from JSON into big-endian 32-bit and 64-bit values, useful for parsing datastore entries. Exposing them in a more common space lets bwatch use them in future commits.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
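A minimal sketch of what such a decoder might look like; the names and signatures here are assumptions for illustration, not the actual helpers:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: parse one hex digit. */
static bool hex_nibble(char c, uint8_t *v)
{
	if (c >= '0' && c <= '9') { *v = c - '0'; return true; }
	if (c >= 'a' && c <= 'f') { *v = c - 'a' + 10; return true; }
	if (c >= 'A' && c <= 'F') { *v = c - 'A' + 10; return true; }
	return false;
}

/* Illustrative only: decode exactly 16 hex chars into a big-endian u64;
 * a 32-bit variant does the same over 8 chars into a uint32_t. */
static bool hex_to_be64(const char *hex, size_t len, uint64_t *out)
{
	uint64_t val = 0;
	uint8_t nib;

	if (len != 16)
		return false;
	for (size_t i = 0; i < len; i++) {
		if (!hex_nibble(hex[i], &nib))
			return false;
		val = (val << 4) | nib;
	}
	*out = val;
	return true;
}
```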
bwatch is an async block scanner that consumes blocks from bcli or any other bitcoind interface and communicates with lightningd by sending it updates. In this commit we're only introducing the plugin and some files that we will populate in future commits.
This wire file primarily contains data structures used to serialize data for storage in the datastore. bwatch has two kinds of datastore: the block history datastore and the watch datastore. For block history we store the height, the hash and the hash of the previous block. For watches we have 4 types: utxo, scriptpubkey, scid and blockdepth watches; each has its unique info stored in the datastore. The common info for all watches includes the start block and the list of owners interested in watching.
We have 4 types of watches: utxo (outpoint), scriptpubkey, scid and blockdepth. Each gets its own hash table with a key shape that makes lookups direct.
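Roughly, the per-type keys might look like this; the struct names below are illustrative stand-ins, not the actual bwatch definitions:

```c
#include <stddef.h>
#include <stdint.h>

struct outpoint_key {      /* utxo watch: fires when (txid, outnum) is spent */
	uint8_t txid[32];
	uint32_t outnum;
};

struct scriptpubkey_key {  /* fires when any output pays this script */
	size_t scriptlen;
	const uint8_t *script;
};

struct scid_key {          /* (height, txindex, outnum) packed into a u64 */
	uint64_t short_channel_id;
};

struct blockdepth_key {    /* fires each block with depth relative to start_block */
	uint32_t start_block;
};
```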
bwatch keeps a tail of recent blocks (height, hash, prev hash) so it can detect and unwind reorgs without re-fetching from bitcoind. The datastore key for each block is zero-padded to 10 digits so listdatastore returns blocks in ascending height order. On startup we replay the stored history and resume from the most recent block.
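The padding works because ten zero-padded digits make lexicographic order coincide with numeric order. A sketch of the key construction (the exact key layout is an assumption from the description above):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Build the datastore key for a block-history entry: height 842000
 * becomes "0000842000", which sorts before "0000842001" lexically,
 * so listdatastore returns blocks in ascending height order. */
static void block_history_key(char key[static 11], uint32_t height)
{
	snprintf(key, 11, "%010" PRIu32, height);
}
```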
Each watch (and its set of owners) is serialized through the wire format from the earlier commit and stored in the datastore. On startup we walk each type's prefix and reload the watches into their respective hash tables, so a restart resumes watching the same things without anyone re-registering.
bwatch_add_watch and bwatch_del_watch are the high-level entry points the RPCs (added in a later commit) use. Adding a watch that already exists merges the owner list and lowers start_block if the new request needs to scan further back, so a re-registering daemon (e.g. onchaind on restart) doesn't lose missed events. Removing a watch drops only the requesting owner; the watch itself is removed once the owner list is empty.
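The add path reduces to a merge; a sketch under the assumption that watches live in per-type hash tables (all names below are illustrative stand-ins):

```c
#include <stdint.h>

/* Illustrative stand-ins; the real watch/owner types are richer. */
struct watch {
	uint32_t start_block;
	/* ... type-specific key, owner list ... */
};
extern struct watch *find_existing_watch(const void *key);
extern struct watch *new_watch(const void *key, uint32_t start_block);
extern void add_owner(struct watch *w, const char *owner);

static struct watch *add_watch_sketch(const void *key, const char *owner,
				      uint32_t start_block)
{
	struct watch *w = find_existing_watch(key);

	if (w) {
		/* Same watch, extra owner. */
		add_owner(w, owner);
		/* A re-registering daemon may need to scan further back. */
		if (start_block < w->start_block)
			w->start_block = start_block;
		return w;
	}
	w = new_watch(key, start_block);
	add_owner(w, owner);
	return w;
}
```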
Add the chain-polling loop. A timer fires bwatch_poll_chain, which calls getchaininfo to learn bitcoind's tip; if we're behind, we fetch the next block via getrawblockbyheight, append it to the in-memory history and persist it to the datastore. After each successful persist we reschedule the timer at zero delay so we keep fetching back-to-back until we catch up to the chain tip. Once getchaininfo reports no new block, we settle into the steady-state cadence (30s by default, tunable via the --bwatch-poll-interval option). This commit only handles the happy path. Reorg detection, watchman notifications and watch matching land in subsequent commits.
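The rescheduling decision itself is tiny; a sketch with an assumed timer hook:

```c
#include <stdbool.h>

#define DEFAULT_POLL_SECONDS 30   /* --bwatch-poll-interval default */

/* Assumed for the sketch: arrange for bwatch_poll_chain to run again. */
extern void schedule_poll_in(unsigned int seconds);

static void reschedule_poll(bool caught_up_to_tip)
{
	/* Behind the tip: refetch immediately, back-to-back.
	 * At the tip: settle into the steady-state cadence. */
	schedule_poll_in(caught_up_to_tip ? DEFAULT_POLL_SECONDS : 0);
}
```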
After bwatch persists a new tip, send a block_processed RPC to watchman (lightningd) with the height and hash. bwatch only continues polling for the next block once watchman has acknowledged that it has also processed the new block height on its end. This matters for crash safety: on restart we treat watchman's height as the floor and re-fetch anything above it, so any block we acted on must be visible to watchman before we move on. If watchman isn't ready yet (e.g. lightningd still booting) the RPC errors out non-fatally; we just reschedule and retry.
When handle_block fetches the next block, validate its parent hash against our current tip. If they disagree we're seeing a reorg: pop our in-memory + persisted tip via bwatch_remove_tip, walk the history one back, and re-fetch from the new height. Each fetch may itself reorg further, so the loop naturally peels off as many stale tips as needed until the chain rejoins. After every rollback, tell watchman the new tip via revert_block_processed so its persisted height tracks bwatch's. If we crash before the ack lands, watchman's stale height will be higher than ours on restart, which retriggers the rollback. If the rollback exhausts our history (we rolled back past the oldest record we still hold) we zero current_height/current_blockhash and let the next poll re-init from bitcoind's tip. Notifying owners that their watches were reverted lands in a subsequent commit.
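The unwind loop might look roughly like this; every symbol below except bwatch_remove_tip is an illustrative stand-in:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct block_info { uint8_t hash[32], prev_hash[32]; uint32_t height; };

extern uint8_t current_blockhash[32];
extern uint32_t current_height;
extern struct block_info *fetch_block(uint32_t height);
extern void bwatch_remove_tip(void);           /* pops in-memory + persisted tip */
extern void revert_block_processed(uint32_t height);
extern bool history_exhausted(void);
extern void reset_to_bitcoind_tip(void);

static struct block_info *unwind_reorg(struct block_info *b)
{
	/* Parent mismatch means our tip was reorged away: peel tips until
	 * the chain rejoins.  Each refetch may itself mismatch again. */
	while (memcmp(b->prev_hash, current_blockhash, sizeof(current_blockhash))) {
		bwatch_remove_tip();
		revert_block_processed(current_height);
		if (history_exhausted()) {
			/* Rolled back past our oldest record: let the
			 * next poll re-init from bitcoind's tip. */
			reset_to_bitcoind_tip();
			return NULL;
		}
		b = fetch_block(current_height + 1);
	}
	return b;  /* parent matches: safe to append as the new tip */
}
```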
Add two RPCs for surfacing watches to lightningd on a new block or reorg. bwatch_send_watch_found informs lightningd of any watches that were found in the current processed block. The owner is used to disambiguate watches that may pertain to multiple subdaemons. bwatch_send_watch_revert is sent in case of a revert; it informs the owner that a previously reported watch has been rolled back. These functions get wired up in subsequent commits.
After every fetched block, walk each transaction and fire watch_found for matching scriptpubkey outputs and spent outpoints. Outputs are matched by hash lookup against scriptpubkey_watches; inputs by reconstructing the spent outpoint and looking it up in outpoint_watches.
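In outline (illustrative types and lookup names, not bwatch's actual symbols):

```c
#include <stddef.h>
#include <stdint.h>

struct tx_output { const uint8_t *script; size_t scriptlen; };
struct tx_input  { uint8_t prev_txid[32]; uint32_t prev_outnum; };
struct tx {
	size_t num_outputs, num_inputs;
	struct tx_output *outputs;
	struct tx_input *inputs;
};

extern void *scriptpubkey_lookup(const uint8_t *script, size_t len);
extern void *outpoint_lookup(const uint8_t txid[32], uint32_t outnum);
extern void send_watch_found(void *watch, const struct tx *tx);

static void scan_tx(const struct tx *tx)
{
	/* Outputs: direct hash lookup on the scriptpubkey. */
	for (size_t i = 0; i < tx->num_outputs; i++) {
		void *w = scriptpubkey_lookup(tx->outputs[i].script,
					      tx->outputs[i].scriptlen);
		if (w)
			send_watch_found(w, tx);
	}
	/* Inputs: reconstruct the spent outpoint and look it up. */
	for (size_t i = 0; i < tx->num_inputs; i++) {
		void *w = outpoint_lookup(tx->inputs[i].prev_txid,
					  tx->inputs[i].prev_outnum);
		if (w)
			send_watch_found(w, tx);
	}
}
```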
After the per-tx scriptpubkey/outpoint pass, walk every scid watch and fire watch_found for any whose encoded blockheight matches the block just processed. The watch's scid encodes the expected (txindex, outnum), so we jump straight there without scanning. If the position is out of range (txindex past the block, or outnum past the tx) we send watch_found with tx=NULL, which lightningd treats as the "not found" case.
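Per BOLT 7, a short_channel_id packs 3 bytes of block height, 3 bytes of tx index and 2 bytes of output index into a u64, so the jump target falls out of bit shifts (helper names here are illustrative):

```c
#include <stdint.h>

/* scid layout: height(24 bits) | txindex(24 bits) | outnum(16 bits) */
static uint32_t scid_blockheight(uint64_t scid) { return scid >> 40; }
static uint32_t scid_txindex(uint64_t scid) { return (scid >> 16) & 0xFFFFFF; }
static uint16_t scid_outnum(uint64_t scid) { return scid & 0xFFFF; }
```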
Subdaemons like channel_open and onchaind care about confirmation depth, not the underlying tx. Walk blockdepth_watches on every new block and send watch_found with the current depth to each owner. This is what keeps bwatch awake in environments like Greenlight, where we'd otherwise prefer to hibernate: as long as something is waiting on a confirmation milestone, the blockdepth watch holds the poll open; once it's deleted, we're free to sleep again. Depth fires before the per-tx scan so restart-marker watches get a chance to spin up subdaemons before any outpoint hits land for the same block. Watches whose start_block is ahead of the tip are stale (reorged-away, awaiting delete) and skipped.
On init, query bcli for chain name, headercount, blockcount and IBD state, then forward the result to watchman via the chaininfo RPC before bwatch starts its normal poll loop. Watchman uses this to gate any work that depends on bitcoind being synced. If bitcoind's blockcount comes back lower than our persisted tip, peel stored blocks off until they line up so watchman gets a consistent picture. During steady-state polling the same case is handled by hash-mismatch reorg detection inside handle_block; this shortcut only matters at startup, before we've fetched anything. If bcli or watchman is not yet ready, log and fall back to scheduling the poll loop anyway so init never stalls. bwatch_remove_tip is exposed in bwatch.h so the chaininfo path in bwatch_interface.c can use it.
addscriptpubkeywatch and delscriptpubkeywatch are how lightningd asks bwatch to start/stop watching an output script for a given owner.
addoutpointwatch and deloutpointwatch are how lightningd asks bwatch to start/stop watching a specific (txid, outnum) for a given owner.
addscidwatch and delscidwatch are how lightningd asks bwatch to start/stop watching a specific short_channel_id for a given owner. The scid pins the watch to one (block, txindex, outnum), so on each new block we go straight to that position rather than scanning.
addblockdepthwatch and delblockdepthwatch are how lightningd asks bwatch to start/stop a depth-tracker for a given (owner, start_block). start_block doubles as the watch key and the anchor used to compute depth = tip - start_block + 1 on every new block.
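The depth formula is the one-liner from the message; note that a watch whose start_block is the tip itself has depth 1:

```c
#include <stdint.h>

/* depth = tip - start_block + 1, e.g. start_block == tip gives depth 1. */
static uint32_t confirmation_depth(uint32_t tip, uint32_t start_block)
{
	return tip - start_block + 1;
}
```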
listwatch returns every active watch as a flat array. Each entry carries its type-specific key (scriptpubkey hex, outpoint, scid triple, or blockdepth anchor) plus the common type / start_block / owners fields, so callers can dispatch on the per-type key without parsing the type string first. Mostly used by tests and operator tooling to inspect what bwatch is currently tracking.
0c7e64a to 11a4d26
Part of the chaintopology spring clean.
The feerate samples now live alongside everything else watchman owns; chain_topology no longer needs to know about them.
Part of the chaintopology spring clean. Wrap the fee poll in a small struct fee_poll on lightningd, hardcode the cadence at 30s (BITCOIND_POLL_SECONDS, matching bwatch's default), and use topo as the bcli request ctx. The fee poll is a stopgap; eventually bwatch will push feerate updates alongside blocks. --dev-bitcoind-poll is kept as a deprecated no-op so existing test fixtures keep parsing.
Replace topology_synced() and topology_add_sync_waiter() with direct ld->bitcoind->synced checks. ld->bitcoind->synced is already driven by watchman.
lightningd's perspective of the block height should advance only when bwatch has fully delivered all the txs in a block — exactly what watchman tracks (last_processed_height).
This used to bump our height up to the bitcoin backend's headercount when chaintopology hadn't caught up yet, mostly to compute slightly tighter HTLC expiries. Bwatch now starts up in parallel to lightningd, so it may be simpler to use the blockheight provided by bwatch. A TODO for me would be to verify this using a test.
bwatch now drives block ingestion, so chain_topology has nothing to bootstrap or stop. Create the bitcoind backend and start fee polling inline at wallet-init and drop begin_topology / stop_topology / broadcast_shutdown. Rebroadcast was previously kicked off by begin_topology; trigger it from notify_feerate_change instead so RBF still chases the new feerate.
The remaining chain_topology stub does nothing useful — bwatch drives block ingestion and feerate.c owns fee polling. Remove the file, the lightningd field, the Makefile entry, and the new_topology call site. Many consumers used to get broadcast.h and feerate.h indirectly through chaintopology.h. They now include those directly, which accounts for the include-only churn across ~16 files. In particular, channel_control.c (calls rebroadcast_txs) and peer_control.c (calls broadcast_tx) lost their transitive route to broadcast.h when notification.h stopped including chaintopology.h, so both gain a direct include. Move struct txlocator to feerate.h since it has no other home.
The blanket "skip everything" hook is being unwound suite-by-suite. Replace it with an empty allowlist so subsequent commits can opt each test file in once it has been ported and verified, without having to keep re-tweaking the hook itself. Behaviour is unchanged at this commit (allowlist is empty, so everything still skips).
These tests load pre-recorded sqlite3.xz fixtures (or run the lightning-downgrade tool) that all predate the bwatch-era schema (our_outputs / our_txs and the dropped utxoset / outputs tables). Skipping them individually keeps their parent suites unblocked when we re-enable test files one at a time. TODO: regenerate or rewrite each of these fixtures against the new schema and remove the @pytest.mark.skip decorators. Files touched: test_db.py, test_invoices.py, test_runes.py, test_wallet.py, test_coinmoves.py, test_bookkeeper.py, test_downgrade.py.
Test files identical between master and the cherry-pick-blockid-helpers target — they touch nothing chaintopology / watch.c / txfilter could have affected, so they should be safe to run on top of the bwatch migration unchanged. Adds to BWATCH_MIGRATION_ALLOWLIST: test_cln_rs, test_clnrest, test_mkfunding, test_onion, test_reckless, test_renepay, test_runes TODO: revisit test_cln_lsps once the wallet migration story is complete; it currently fails on the CI builder.
When fundpsbt/addpsbtoutput derive a fresh change address, immediately register the scriptpubkey watch in bwatch so those outputs are tracked consistently before confirmation. This matches the original branch behavior and avoids reservation mismatches in wallet tests.
Test fixtures previously set rescan=1 so the wallet would re-scan the last block on startup/restart. In the bwatch world, every wallet scriptpubkey is registered as a perennial watch, and asking for a 1-block rescan on startup re-arms every per-keyindex watch and triggers a rescan loop that drops in-memory reservation state. This was visible as test_reserveinputs failing with `assert not True` after `l1.restart()`, because every UTXO's reserved_til was reset by the rescan path. Drop the default to 0 to match upstream behaviour. Tests that need an explicit rescan (e.g. test_bip86_mnemonic_recovery) opt in via `options=`.
The four wallet_datastore_{get,create,update,remove} helpers used to
require the caller to be inside a wallet transaction; otherwise the
underlying db_prepare_v2 fatals at db/utils.c:103 with "Attempting to
prepare a db_stmt outside of a transaction".
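Presumably the fix makes the helpers open a transaction themselves when the caller hasn't; a sketch of that pattern (db_begin_transaction and db_commit_transaction exist in CLN's db layer, but db_in_transaction, the inner helper and the wrapper shape are assumptions):

```c
#include <stdbool.h>
#include <stddef.h>

struct db;
extern void db_begin_transaction(struct db *db);
extern void db_commit_transaction(struct db *db);
extern bool db_in_transaction(struct db *db);   /* assumed */

/* Assumed inner helper that must run inside a transaction. */
extern void datastore_update_inner(struct db *db, const char *key,
				   const void *data, size_t len);

static void wallet_datastore_update_sketch(struct db *db, const char *key,
					   const void *data, size_t len)
{
	bool own_tx = !db_in_transaction(db);

	if (own_tx)
		db_begin_transaction(db);
	datastore_update_inner(db, key, data, len);
	if (own_tx)
		db_commit_transaction(db);
}
```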
674cff1 to 09b3f08
Two bugs broke splices on bwatch:
1. Duplicate scid update. channel_splice_watch_found already sets channel->scid, so handle_peer_splice_locked's change_scid call re-added the same scid to chanmap and aborted. Fold change_scid into depthcb_update_scid (matching the original branch), make it a no-op when the scid hasn't changed, and notify gossip from handle_peer_splice_locked directly.
2. Missing funding rawtx. The peer's splice handshake calls splice_lookup_tx, which reads our_txs. We never stored the funding tx there, so every splice failed with "channel control unable to find txid". Save it on first confirmation (annotated TX_CHANNEL_FUNDING) and on later splice / unexpected-outpoint events.
For CHANNELD_AWAITING_SPLICE, channel_funding_depth_found called channeld_tell_depth (splicing=false). channeld then treated the splice confirmation as the original funding tx, overwrote short_channel_ids[LOCAL], and aborted with "Duplicate splice_locked events detected by scid check". Use channeld_tell_splice_depth for the splice case instead.
bwatch delivers peer_got_splice_locked asynchronously, so the two peers can advance their gossip state machines a few ms apart. When the slower peer retransmits announcement_signatures, the faster peer's sent_sigs guard suppresses the response and the channel never finishes announcing. Clear sent_sigs in WAITING_FOR_MATCHING_PEER_SIGS so we always respond. ANNOUNCE_DEPTH is unaffected by the race.
Add test_splice.py, test_splicing.py, test_splicing_disconnect.py and test_splicing_insane.py to the bwatch-migration allowlist now that the splice path works under bwatch.
bwatch fires output watches before input watches, so the change
deposit can arrive before the spend withdrawal in the bookkeeper.
Group each tx's spend+change as unordered pairs in
test_script_splice_{out,in}.
Collaborator (Author)
Fix INSERT OR IGNORE for postgres for our_txs and our_outputs.
Important
26.04 FREEZE March 11th: Non-bugfix PRs not ready by this date will wait for 26.06.
RC1 is scheduled on March 23rd.
The final release is scheduled for April 15th.
Checklist
Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:
tools/lightning-downgrade