Skip to content

Safely migrate to FileSystemStoreV2#872

Open
benthecarman wants to merge 5 commits into
lightningdevkit:mainfrom
benthecarman:filesystem-v2
Open

Safely migrate to FileSystemStoreV2#872
benthecarman wants to merge 5 commits into
lightningdevkit:mainfrom
benthecarman:filesystem-v2

Conversation

@benthecarman
Copy link
Copy Markdown
Contributor

Before moving to PaginatedKVStore everywhere we need to use FileSystemStoreV2 instead of FileSystemStoreV1. This will safely migrate over to it on first start up. Also adds a test to make sure we handle it properly.

@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Apr 10, 2026

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Copy link
Copy Markdown
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I have to say for the file system store I am a bit hesitant to just kick-off auto-migration like that, given users might have messed with the store in some manual fashion (def. had users do that before)?

Comment thread src/builder.rs Outdated
Comment thread src/builder.rs Outdated
match FilesystemStoreV2::new(storage_dir_path.clone()) {
Ok(store) => Ok(store),
Err(e) if e.kind() == std::io::ErrorKind::InvalidData => {
// The directory contains v1 data, migrate to v2.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, how do we know io::ErrorKind::InvalidData corresponds to v1? Couldn't it have other reasons (eg if users misconfigured something) and we might be screwing with users data here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only time we throw the error is when we detect a v1 fs store. We shouldn't ever encounter it otherwise, it only ever thrown by the std lib when trying to parse things like strings (checking for utf8 encoding), since we just read raw bytes we should be safe.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only time we throw the error is when we detect a v1 fs store.

Hmm, right, that's the current state, but it's still bad to lean on this, as we might very well change that behavior in the future? Do we need to change the new API to return a FilesystemStore-specific error type with a V1DataFound error variant?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benthecarman Any update here? Seems like we might need to still land the API change (explicit error variant) before the LDK v0.3 cutoff?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I wrote a comment but I guess not. I researched and looked, the only way InvalidData is thrown is where we throw it explicitly. The only way for the normal std lib to throw is it when its parsing data, not just reading bytes like how we do so we should be safe. But yeah if that guarantee isn't enough, can create an error type for this.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I wrote a comment but I guess not. I researched and looked, the only way InvalidData is thrown is where we throw it explicitly.

Right, but that's not part of any API contract, nor documented. So we might very well change it at some point going forward. IMO it therefore would be preferable to at the very least clearly document this behavior or introduce a dedicated error type before we can safely lean on it here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread src/builder.rs Outdated
let mut backup_dir = storage_dir_path.clone();
backup_dir.set_file_name("fs_store_v1_backup");
fs::rename(&storage_dir_path, &backup_dir)
.map_err(|_| BuildError::KVStoreSetupFailed)?;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if we crash here the user can't recover. Can we somehow make the two renames part of the same fsync/metadata operation? Not sure?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only real way would be to implement some recovery logic. Basically if the dir doesn't exist but fs_store_v2_migrating does, we do the rename and continue.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, might be good to at least add a test that checks we can safely retry and finish the migration after restart if we somehow crash/abort. AFAIU, this is currently not the case, i.e., we wouldn't actually continue/retry.

Comment thread tests/integration_tests_rust.rs Outdated
async fn build_0_7_0_node(
bitcoind: &BitcoinD, electrsd: &ElectrsD, storage_path: String, esplora_url: String,
seed_bytes: [u8; 64],
seed_bytes: [u8; 64], use_fs_store: bool,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, please have this take and use TestConfig for stuff like that, all the bool flags in test code are getting hard to read.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@benthecarman
Copy link
Copy Markdown
Contributor Author

Hmm, I have to say for the file system store I am a bit hesitant to just kick-off auto-migration like that, given users might have messed with the store in some manual fashion (def. had users do that before)?

Yeah I understand your hesitancy, however I am not really sure what else would be the best way forward if we are going to require the PaginatedKvStore. Otherwise we would just have this function fail with old ones and the dev/user would have to figure out how to migrate themselves. Maybe that's preferable so people move to sqlite?

@tnull
Copy link
Copy Markdown
Collaborator

tnull commented Apr 14, 2026

Yeah I understand your hesitancy, however I am not really sure what else would be the best way forward if we are going to require the PaginatedKvStore.

Well, as you know I was in favor of simply deprecating FilesystemStore, but it seems that ship has sailed. We could have also deprecated support in LDK Node, but given that we now want to ship this for the next release, just ripping it out for 0.8 seems like a really short notice for any users that might be out there, so likely we don't want to do that.

Otherwise we would just have this function fail with old ones and the dev/user would have to figure out how to migrate themselves. Maybe that's preferable so people move to sqlite?

Yeah, not entirely sure what's preferable yet, but it might be an option.

@tnull
Copy link
Copy Markdown
Collaborator

tnull commented Apr 28, 2026

@benthecarman Any update here?

@benthecarman
Copy link
Copy Markdown
Contributor Author

sorry busy with conference, found time to get to this one

@tnull
Copy link
Copy Markdown
Collaborator

tnull commented Apr 29, 2026

No worries, for this one we'll need to wait for the splice-related changes to go in first anyways, see failing CI.

@benthecarman benthecarman force-pushed the filesystem-v2 branch 2 times, most recently from 1c0c6c4 to ae16ea6 Compare May 5, 2026 03:11
Copy link
Copy Markdown
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a rebase by now.

Copy link
Copy Markdown
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll ping @jkczyz as a secondary reviewer on the splice-related changes.

Comment thread src/payment/unified.rs Outdated
Comment thread src/wallet/mod.rs Outdated
Comment thread src/io/utils.rs
// Swap directories: rename v1 out of the way, move v2 into place.
let mut backup_dir = storage_dir_path.clone();
backup_dir.set_file_name("fs_store_v1_backup");
fs::rename(&storage_dir_path, &backup_dir)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex:

  • P1 /tmp/ldk-node-pr872/src/io/utils.rs:650, /tmp/ldk-node-pr872/src/builder.rs:659: the v1 to v2 migration is not crash-recoverable. After fs_store is renamed to fs_store_v1_backup, a crash or error before fs_store_v2_migrating is renamed into place leaves no fs_store. On the next startup,
    build_with_fs_store recreates an empty fs_store, FilesystemStoreV2::new succeeds, and the node can start against empty storage while the real data sits in backup/temp dirs. The migration helper should detect and recover incomplete states before creating a fresh store.

Seems we still might want to double-check we survive an ill-timed crash?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added migration defense and tests for all scenarios

Comment thread src/wallet/mod.rs Outdated
// independent ceilings, contributing up to 2 sats.
if let Some(ref mut change) = change_output {
let num_inputs = (confirmed_utxos.len() + must_spend.len()) as u64;
let per_input_rounding_surplus_sat = num_inputs.saturating_sub(1);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex:

  • P2 /tmp/ldk-node-pr872/src/wallet/mod.rs:986: the splice fee rounding adjustment adds the worst-case surplus to the change output instead of the actual BDK over-reservation. For example, at 1000 sat/kwu the per-component fee math is exact, so the actual rounding surplus is 0, but this
    still adds N + 1 sats to change. Since LDK derives value_added from inputs minus outputs/change/fees, splice_in(..., amount) can silently add less than the requested amount. This should compute the actual fee delta from the selected inputs/outputs, or cap the change bump to that delta.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up just removing this extra handling its only for an extra 2 sats, complex, and potentially error prone.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 2nd Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 3rd Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 4th Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 5th Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Comment thread tests/integration_tests_rust.rs
Comment thread src/io/utils.rs
.map_err(|_| BuildError::StoragePathAccessFailed)?;
}

match FilesystemStoreV2::new(storage_dir_path.clone()) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • [P2] Detect v1 empty-secondary namespace data — /home/tnull/workspace/ldk-node/src/io/utils.rs:640-641
    When an old fs store only has v1 entries under a non-empty primary namespace with an empty secondary namespace, such as fs_store/bdk_wallet/descriptor before any top-level LDK files have been persisted, FilesystemStoreV2::new succeeds because it only rejects top-level files. Startup then
    skips migration and V2 looks under bdk_wallet/[empty]/descriptor, silently ignoring the existing wallet state and potentially missing historical funds; please detect these direct namespace files and migrate them too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrate_kv_store_data does this already

Comment thread src/io/utils.rs
Comment thread src/lib.rs
))
.map_err(|e| {
let contribution =
funding_template.splice_out(outputs, min_feerate, max_feerate).map_err(|e| {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • [P2] Preserve exact splice amounts with prior contributions — /home/tnull/workspace/ldk-node/src/lib.rs:1702-1703
    With the updated rust-lightning API, FundingTemplate::splice_out amends any prior_contribution by appending these outputs rather than replacing the pending request. If a user retries or replaces a pending splice-out, this can withdraw the previous output plus the new splice_amount_sats, so
    the public Node API removes more than requested; opt out with without_prior_contribution or explicitly account for the prior contribution.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, we should do these updates in separate PRs, IMO, especially if there are behavioral changes to consider. This is also related to #888, though it can probably be done independently.

That said, we need to decide what API we want to expose to the user. If they already initiated a splice, then should we allow them to add / remove more in a subsequent call if the first hasn't locked? In LDK, doing so results in RBF'ing the splice with the added / remove contributions. However, even if without_prior_contribution is used (which was the previous behavior), the first one can still confirm which might be unexpected for the user.

Maybe best to disallow a splice for now if one was already negotiated to unblock you. Otherwise, we need to consider what to do about the portion that didn't lock.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe stick with this for now and decide / fix in #888?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you drop the first two commits and rebase onto https://github.com/jkczyz/ldk-node/tree/2026-05-splice-builder-api? That fails if we already have a splice and makes my life a little easier in #888 for the Cargo.toml changes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

benthecarman and others added 3 commits May 18, 2026 17:45
The new API computes its splice fee independently of the BDK coin
selection it drives, so any surplus BDK reserves on top of LDK's fee
flows into the new funding output instead of returning as change. A
splice-in therefore deposits slightly more on the channel side than
requested.

Stop double-counting the 5 WU per foreign input that BDK adds for the
empty `script_sig` byte and witness-count varint already in LDK's
`satisfaction_weight`. A small residue from BDK's per-component fee
rounding still inflates the funding output.

Co-Authored-By: Jeffrey Czyz <jkczyz@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
After the LDK splice-builder API change, `FundingTemplate::splice_in`
and `splice_out` silently reuse and amend a prior contribution when one
is present, rather than starting from scratch. ldk-node has no caller
that intentionally exercises that path today, so reject the request
explicitly until we design a dedicated RBF entry point. This preserves
the pre-upgrade behavior for back-to-back splice attempts on the same
channel.

Co-Authored-By: Claude <noreply@anthropic.com>
Point at LDK 2313bd584d2c46a50d67b8266f488c07516e0b3c and
bitcoin-payment-instructions 6bea50e3d57742cd1d8a65c32076bff55003f175.

Co-Authored-By: Claude <noreply@anthropic.com>
let expected_splice_in_fee_sat = 255;
let expected_splice_in_fee_sat = 251;
let expected_splice_in_onchain_cost_sat = 254;
let expected_splice_in_lightning_balance_sat = 4_000_003;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the extra three sats will disappear once BDK (and rust-bitcoin?) are upgraded?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it'll get rid of all 3 but maybe 1 or 2. Some of them aren't really avoidable in bdk with how things are structured

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there something to do with the segwit marker that would need a rust-bitcoin update, too?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that one was a Claude hallucination sadly. rust-bitcoin already treats empty inputs as segwit

Comment thread src/lib.rs
))
.map_err(|e| {
let contribution =
funding_template.splice_out(outputs, min_feerate, max_feerate).map_err(|e| {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you drop the first two commits and rebase onto https://github.com/jkczyz/ldk-node/tree/2026-05-splice-builder-api? That fails if we already have a splice and makes my life a little easier in #888 for the Cargo.toml changes.

Before moving to PaginatedKVStore everywhere we need to use
FileSystemStoreV2 instead of FileSystemStoreV1. This will safely migrate
over to it on first start up. Also adds a test to make sure we handle it
properly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants