-
Notifications
You must be signed in to change notification settings - Fork 138
Fix peer store management on channel closure #895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1585,6 +1585,44 @@ where | |
| } => { | ||
| log_info!(self.logger, "Channel {} closed due to: {}", channel_id, reason); | ||
|
|
||
| // If the counterparty initiated closure of their last remaining channel | ||
| // with us, remove them from the peer store so we stop trying to reconnect. | ||
| // | ||
| // If we initiated the closure, keep them in the peer store so the | ||
| // background reconnection task fires and we can complete the | ||
| // channel_reestablish recovery flow. This matters especially for LND | ||
| // peers, which need us to reconnect to recover from force-closures. | ||
| // | ||
| // We exclude `channel_id` from the remaining-channel check because LDK | ||
| // fires ChannelClosed before removing the channel from its internal list, | ||
| // so list_channels_with_counterparty still includes the closing channel. | ||
| if let Some(counterparty_node_id) = counterparty_node_id { | ||
| let counterparty_initiated = matches!( | ||
| reason, | ||
| ClosureReason::CounterpartyForceClosed { .. } | ||
| | ClosureReason::CounterpartyInitiatedCooperativeClosure | ||
| ); | ||
|
Comment on lines
+1594
to
+1599
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should flip the order here and check the closure reason first, so the read on the channel manager only happens when it's actually relevant
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's true, no point querying the channel manager if the closure reason rules out action. |
||
|
|
||
| if counterparty_initiated { | ||
| let has_other_channels = self | ||
| .channel_manager | ||
| .list_channels_with_counterparty(&counterparty_node_id) | ||
| .iter() | ||
| .any(|c| c.channel_id != channel_id); | ||
|
|
||
| if !has_other_channels { | ||
| if let Err(e) = self.peer_store.remove_peer(&counterparty_node_id) { | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We usually replay the event in case of persistence failures. While this path is not critical, we should probably do the same here to keep it consistent.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree, I've updated this to return ReplayEvent() on failure to keep it consistent with other persistence paths. |
||
| log_error!( | ||
| self.logger, | ||
| "Failed to remove peer {} from peer store: {}", | ||
| counterparty_node_id, | ||
| e | ||
| ); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| let event = Event::ChannelClosed { | ||
| channel_id, | ||
| user_channel_id: UserChannelId(user_channel_id), | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -1841,8 +1841,16 @@ impl Node { | |||||
| })?; | ||||||
| } | ||||||
|
|
||||||
| // Check if this was the last open channel, if so, forget the peer. | ||||||
| if open_channels.len() == 1 { | ||||||
| // If this was the last open channel and we're closing cooperatively, forget the peer | ||||||
| // since we have no further reason to reconnect. | ||||||
|
|
||||||
| // For force-closes we intentionally keep the peer in the store so the background reconnection | ||||||
| // task keeps firing and can drive the channel_reestablish recovery flow. | ||||||
| // This is especially important against LND peers, which don't always handle force-closure error messages correctly. | ||||||
|
|
||||||
| //Note that this means a force-closed peer is retained until the user explicitly calls Node::disconnect. | ||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit:
Suggested change
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed, Thanks! |
||||||
|
|
||||||
| if open_channels.len() == 1 && !force { | ||||||
| self.peer_store.remove_peer(&counterparty_node_id)?; | ||||||
| } | ||||||
| } | ||||||
|
|
||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment partially duplicates the rationale in the
lib.rschange. I suggest trimming it to just what this block does, and let thelib.rscomment carry the LND recovery rationaleThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
I have trimmed the comment here to focus on the behavior of this block.