Skip to content

Conversation

@joostjager
Copy link
Contributor

@joostjager joostjager commented Oct 9, 2025

PoC branch to try out persisting channels with the monitors rather than chanmgr. Goal: fix force closures.

Builds on #4218

@ldk-reviews-bot
Copy link

👋 Hi! I see this is a draft PR.
I'll wait to assign reviewers until you mark it as ready for review.
Just convert it out of draft status when you're ready for review!

@joostjager joostjager self-assigned this Oct 9, 2025
@joostjager joostjager force-pushed the chanmgr-refactor-passthrough branch 2 times, most recently from dd9d6db to c9d03d4 Compare October 30, 2025 15:06
@joostjager joostjager force-pushed the chanmgr-refactor-passthrough branch 3 times, most recently from 2eb7981 to 4eb87c1 Compare November 18, 2025 14:24
let mut failed_htlcs = Vec::new();
let channel_count: u64 = Readable::read(reader)?;

// Discard channel manager versions of channels.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we'll need to use these manager versions of channels if the monitor versions aren't set (i.e. reading a manager that was last serialized on <= 0.2). Ideal would be to have an upgrade test, i.e. in upgrade_downgrade_tests.rs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to leave this until we've got solid confidence in the approach taken? Ofc agreed that we need to implement upgrade.


/// The encoded channel data associated with this ChannelMonitor, if any.
#[cfg(feature = "safe_channels")]
pub encoded_channel: Option<Vec<u8>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you explore encoding the FundedChannel itself rather than a bag of bytes? I guess it would require propagating the signer trait parameter, which is kind of annoying?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered it, but indeed, the signer parameter is annoying. But your comment did make me think again. I will try and see what we can do with a copy of FundedChannel just for data storage. Perhaps that's useful too once we start stripping out the redundant fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I noticed while trying this is that ChannelContext contains blocked_monitor_updates: Vec<PendingChannelMonitorUpdate>. PendingChannelMonitorUpdate will contain FundedChannel snapshots. Nesting 😨

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// for the safe_channels feature.
#[cfg(feature = "safe_channels")]
if let Some(ref encoded_channel) = channel_monitor.encoded_channel {
channel_monitor.check_encoded_channel_consistency(encoded_channel);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be better to have the above docs (or duplicate them) on the method itself

Comment on lines 1432 to 1433
/// The serialized channel state as provided via the last `ChannelMonitorUpdate` or via a call to
/// [`ChannelMonitor::update_encoded_channel`].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe note that this is a FundedChannel and explain why we're encoding it here?

if let Ok(check_data) = check_res {
debug_assert!(
check_data.cur_holder_commitment_transaction_number
<= self.get_cur_holder_commitment_number(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document generally why it's okay for the channel to be behind the monitor (but not ahead)?

}

#[cfg(feature = "safe_channels")]
pub fn read_check_data<R: io::Read>(reader: &mut R) -> Result<ChannelStateCheckData, DecodeError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs on this method?

@joostjager joostjager force-pushed the chanmgr-refactor-passthrough branch from 4eb87c1 to 1adfd7d Compare November 19, 2025 09:39
@codecov
Copy link

codecov bot commented Nov 19, 2025

Codecov Report

❌ Patch coverage is 58.55856% with 46 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.29%. Comparing base (fe5d942) to head (74285c1).

Files with missing lines Patch % Lines
lightning/src/ln/channel.rs 45.88% 46 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4151      +/-   ##
==========================================
- Coverage   89.32%   89.29%   -0.03%     
==========================================
  Files         180      180              
  Lines      138730   138820      +90     
  Branches   138730   138820      +90     
==========================================
+ Hits       123914   123958      +44     
- Misses      12190    12240      +50     
+ Partials     2626     2622       -4     
Flag Coverage Δ
fuzzing 36.00% <26.47%> (+0.97%) ⬆️
tests 88.65% <58.55%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@joostjager
Copy link
Contributor Author

Added more cfg gating for safe_channels to make CI pass. Did allow unused warnings for the skipped tests, because otherwise an extreme amount of uses gating would be needed.

@joostjager joostjager changed the title Store channel alongside channel monitor Reload channelmanager with channel data from monitors Nov 19, 2025
@joostjager joostjager force-pushed the chanmgr-refactor-passthrough branch 2 times, most recently from 5967f55 to b096bf7 Compare November 19, 2025 16:08
@joostjager
Copy link
Contributor Author

If we go with unencoded channels in the monitors and monitor updates, this PR would come out along the following lines: https://github.com/lightningdevkit/rust-lightning/compare/main...joostjager:rust-lightning:chanmgr-refactor-passthrough-unencoded?expand=1

In that branch, I've also stripped away all of chanmgr::read except for the static data. This leads to 85 failing tests, which would be the base line to go work off of.

Additional trace logs to help with debugging.
Add a new `safe_channels` feature flag that gates in-development work
toward persisting channel monitors and channels atomically, preventing
them from desynchronizing and causing force closures.

This commit begins that transition by storing both pieces together and
adding consistency checks during writes. These checks mirror what the
channel manager currently validates only on reload, but performing them
earlier increases coverage and surfaces inconsistencies sooner.
@joostjager joostjager force-pushed the chanmgr-refactor-passthrough branch from e87a727 to 5859132 Compare November 27, 2025 09:46
@joostjager
Copy link
Contributor Author

Rebased PR on top of unencoded channel state in ChannelMonitorUpdates. Also switched to clean-slate tlv serialization for chan mgr with only the static fields. Total number of tests failing without any reconstruction attempts is 44. Seems not the worst starting point.

@joostjager joostjager force-pushed the chanmgr-refactor-passthrough branch from 5859132 to 9168f08 Compare November 27, 2025 10:58
@joostjager joostjager force-pushed the chanmgr-refactor-passthrough branch from 9168f08 to 74285c1 Compare November 27, 2025 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants