-
Notifications
You must be signed in to change notification settings - Fork 418
Async send: sender-side #4046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Async send: sender-side #4046
Conversation
👋 Thanks for assigning @joostjager as a reviewer! |
02eb665
to
0ca36b6
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4046 +/- ##
==========================================
+ Coverage 88.63% 88.73% +0.10%
==========================================
Files 176 176
Lines 131920 132601 +681
Branches 131920 132601 +681
==========================================
+ Hits 116927 117667 +740
+ Misses 12325 12254 -71
- Partials 2668 2680 +12
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few initial comments.
Awesome to see how you bring everything that has been planned and discussed upfront together in this PR. Great work Val.
/// Parameters for the reply path to a [`HeldHtlcAvailable`] onion message. | ||
pub enum HeldHtlcReplyPath { | ||
/// The reply path to the [`HeldHtlcAvailable`] message should terminate at our node. | ||
ToUs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's too late now, but I think it would have been okay to just not implement the always-online sender to mostly-offline receiver case yet and focus purely on LSP clients.
0ca36b6
to
e1fa5eb
Compare
f04161e
to
708c96b
Compare
708c96b
to
0b75f40
Compare
0b75f40
to
20fdbec
Compare
fn hold_htlc_channels(&self) -> Result<Vec<ChannelDetails>, ()> { | ||
let should_send_async = { | ||
let cfg = self.config.read().unwrap(); | ||
cfg.hold_outbound_htlcs_at_next_hop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't just hold_outbound_htlcs_at_next_hop
enough as a condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That option is true
by default, so since users in general don't change defaults, my thinking was that we should override it if the user has proactively configured themselves to be an announced node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I think I'd prefer a single clean flag that defines the node behavior. It's not that we are talking about an end-user check box somewhere in the config menu. This is for integrators that know what they are doing?
But no blocker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should make the config knob an Option<bool>
? Then we can default it to None
, which bases on peers and announced-channels but users can override it? I don't feel super strongly but I agree that it does feel weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not in favor of dealing with uncertainty by adding flexibility. There is a cost to flexibility. Not just code, but also maintenance, dev questions, user questions, testing. Although of course in this particular case it isn't much.
What I think is the better option when faced with an unknown is to implement the absolute simplest and only extend it when there is a need.
@@ -2647,11 +2647,12 @@ pub fn get_route(send_node: &Node, route_params: &RouteParameters) -> Result<Rou | |||
let scorer = TestScorer::new(); | |||
let keys_manager = TestKeysInterface::new(&[0u8; 32], Network::Testnet); | |||
let random_seed_bytes = keys_manager.get_secure_random_bytes(); | |||
let first_hops = send_node.node.list_usable_channels(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be discoverable through #4045 (comment) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed with @TheBlueMatt offline, this is not something we should be concerned about apparently: "An ldk user can always create a lock inversion by calling ldk things in the same expression 🤷♂️ nothing we can really do to stop them"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes okay, but I meant just to auto-discover the issue in this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, yes that's right. In general tests moving forward will probably use ChannelManager::send_payment
and not ::send_to_route
, the latter API is mostly used for older tests before we had send_payment
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to see if lock ordering problems are indeed automatically detected in tests when running with --features backtrace
as @TheBlueMatt suggested in the comment above, but I didn't succeed. Tried with this patch:
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index 727617028..2c475d771 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -10866,6 +10866,7 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
},
}
} else {
+ let tmp = self.pending_intercepted_htlcs.lock().unwrap();
match self.forward_htlcs.lock().unwrap().entry(scid) {
hash_map::Entry::Occupied(mut entry) => {
entry.get_mut().push(HTLCForwardInfo::AddHTLC(pending_add));
@@ -10874,6 +10875,7 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
entry.insert(vec![HTLCForwardInfo::AddHTLC(pending_add)]);
},
}
+ tmp.is_empty();
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, yes that actually makes sense. But it isn't though as if that lock tree in comments is unnecessary, and also it isn't fully enforced by debug_sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the lock tree in comments isn't really necessary and it is human-maintained and can become outdated. I do find it useful to reference sometimes when I run into lock issues, personally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tree prescribes that some two locks can't be held at the same time (such as the forward_htlcs and intercepted_htlcs). I thought that would somehow be checked automatically, but it doesn't appear to be the case. So the tree in comments is useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah in some cases you could hold locks at the same time, you should just update the lock order docs if you do that since it creates a dependency, so that's a bit confusing. I'm not married to keeping it, I've just found it useful to reference when taking locks. Before we had it I remember having to scan the file to see what order locks are taken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This becomes a total side discussion, but it would be nice if that tree could be enforced in debug in some way or the other.
// Create this network topology: | ||
// LSP1 | ||
// / | \ | ||
// sender | recipient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty advanced scenario. It seems more likely that an multi-part payment will be sent through one LSP, that LSP holding both htlcs? Or maybe not likely at all with splicing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tis advanced and unlikely... We do support it though so I wanted to check that it works. We have been discussing liquidity ads lately so maybe multiple LSPs will become more common in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, a multi-part payment through a single LSP doesn't necessarily mean that that LSP client has more than one channel. It could still be one channel, but the payment sharded because there isn't enough capacity at a more distant point in the network.
I don't know if this is a scenario worth covering. Doesn't seem essential, but suggesting nonetheless.
20fdbec
to
6b2f95c
Compare
Rebased since #4045, will address feedback/CI issues when I next push. |
6b2f95c
to
a0344f0
Compare
fn hold_htlc_channels(&self) -> Result<Vec<ChannelDetails>, ()> { | ||
let should_send_async = { | ||
let cfg = self.config.read().unwrap(); | ||
cfg.hold_outbound_htlcs_at_next_hop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I think I'd prefer a single clean flag that defines the node behavior. It's not that we are talking about an end-user check box somewhere in the config menu. This is for integrators that know what they are doing?
But no blocker.
@@ -2647,11 +2647,12 @@ pub fn get_route(send_node: &Node, route_params: &RouteParameters) -> Result<Rou | |||
let scorer = TestScorer::new(); | |||
let keys_manager = TestKeysInterface::new(&[0u8; 32], Network::Testnet); | |||
let random_seed_bytes = keys_manager.get_secure_random_bytes(); | |||
let first_hops = send_node.node.list_usable_channels(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to see if lock ordering problems are indeed automatically detected in tests when running with --features backtrace
as @TheBlueMatt suggested in the comment above, but I didn't succeed. Tried with this patch:
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index 727617028..2c475d771 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -10866,6 +10866,7 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
},
}
} else {
+ let tmp = self.pending_intercepted_htlcs.lock().unwrap();
match self.forward_htlcs.lock().unwrap().entry(scid) {
hash_map::Entry::Occupied(mut entry) => {
entry.get_mut().push(HTLCForwardInfo::AddHTLC(pending_add));
@@ -10874,6 +10875,7 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
entry.insert(vec![HTLCForwardInfo::AddHTLC(pending_add)]);
},
}
+ tmp.is_empty();
}
}
}
unify_blockheight_across_nodes(&nodes); | ||
|
||
// We need to be able to predict the offer_nonce to hardcode blinded payment paths containing a | ||
// JIT channel scid for the recipient. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't immediately clear why this is needed. Is that for the expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I elaborated in the comment, is it clearer?
Edit: updated the comment again based on offline discussion
🔔 1st Reminder Hey @TheBlueMatt! This PR has been waiting for your review. |
a0344f0
to
277100c
Compare
Rebased due to conflicts conflicts after #4049 landed, will address feedback on the next push |
277100c
to
9c64bd7
Compare
Does this PR also handle a duplicate StaticInvoice onion message correctly, regardless of arrival order, as introduced with #4049? |
fn hold_htlc_channels(&self) -> Result<Vec<ChannelDetails>, ()> { | ||
let should_send_async = { | ||
let cfg = self.config.read().unwrap(); | ||
cfg.hold_outbound_htlcs_at_next_hop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should make the config knob an Option<bool>
? Then we can default it to None
, which bases on peers and announced-channels but users can override it? I don't feel super strongly but I agree that it does feel weird.
lightning/src/ln/channelmanager.rs
Outdated
} else { | ||
const ONE_DAY_BLOCKS: u16 = 144; | ||
for chan in hold_htlc_channels.iter_mut() { | ||
// Increase the CLTV delta for this channel so that they will hold the HTLC for up to one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems weird to modify the ChannelDetails
that we'll pass to the router (which may get the wrong idea) vs editing the resulting Route
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why the router would get the wrong idea? The router adds a random offset based on how much CLTV delta is remaining to not exceed the max_total_cltv_expiry_delta
, so I was thinking it would be good for it to have accurate numbers when it does that. I think we'd need some extra checks if we edit the route after the fact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The router might very reasonably refuse to pay over channels with an extremely high minimum-cltv-expiry-delta, for example. Indeed, there's a bit of a dance to do here but I don't think its wildly complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I could see that. Moving this item to the follow-up PR.
9c64bd7
to
5bc3eea
Compare
Updated with diff, some minor test tweaks and whitespace fixes. Holding off on altering the config option til we're convinced we have a better option, I'm not super sure yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few responses, at least for now
lightning/src/ln/channelmanager.rs
Outdated
} else { | ||
const ONE_DAY_BLOCKS: u16 = 144; | ||
for chan in hold_htlc_channels.iter_mut() { | ||
// Increase the CLTV delta for this channel so that they will hold the HTLC for up to one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The router might very reasonably refuse to pay over channels with an extremely high minimum-cltv-expiry-delta, for example. Indeed, there's a bit of a dance to do here but I don't think its wildly complicated.
5bc3eea
to
aba999b
Compare
8513341 purported to remove a lock dependency between ChannelManager::pending_intercepted_htlcs and ::forward_htlcs. However, in practice the dependency remained because the locks of the two maps were still held at the same time in some cases. Here we fix this.
Useful to filter for channel peers that support a specific feature, in this case the hold_htlc feature, in upcoming commits.
As part of supporting sending payments as an often-offline sender, the sender needs to be able to set a flag in their update_add_htlc message indicating that the HTLC should be held until receipt of a release_held_htlc onion message from the often-offline payment recipient. We don't yet ever set this flag, but lay the groundwork by including the field in Channel serialization. See-also BOLTs PR 989
As part of supporting sending payments as an often-offline sender, the sender needs to be able to set a flag in their update_add_htlc message indicating that the HTLC should be held until receipt of a release_held_htlc onion message from the often-offline payment recipient. We don't yet ever set this flag, but lay the groundwork by including a parameter for it in internal channel send_htlc APIs, including serializing the field when the parameter is set. See-also BOLTs PR 989
Makes an upcoming commit cleaner. In the util we allow manually passing in the first_hops that should be used when paying a static invoice, which is useful for only passing in next-hop channels that support the hold_htlcs feature.
As part of supporting sending payments as an often-offline sender, the sender needs to be able to set a flag in their update_add_htlc message indicating that the HTLC should be held until receipt of a release_held_htlc onion message from the often-offline payment recipient. We don't yet ever set this flag, but lay the groundwork by including the parameter in the pay_route method. See-also <lightning/bolts#989>
If the flag is set to true, then if we as an often-offline payer receive a static invoice to pay, we will attempt to hold the corresponding outbound HTLCs with our next-hop channel counterparty(s) that support the feature. This allows our node to go offline once the HTLCs are locked in even though the recipient may not yet be online to receive them.
Returns a list of channels where our counterparty supports InitFeatures::supports_htlc_hold, or an error if there are none or we detect that we are an announced node. Useful for sending async payments to StaticInvoices.
As part of supporting sending payments as an often-offline sender, the sender needs to be able to set a flag in their update_add_htlc message indicating that the HTLC should be held until receipt of a release_held_htlc onion message from the often-offline payment recipient. The prior commits laid groundwork to finally set the flag here in this commit. See-also BOLTs PR 989
As part of supporting sending payments as an often-offline sender, the sender needs to send held_htlc_available onion messages such that the reply path to the message terminates at their always-online channel counterparty that is holding the HTLC. That way when the recipient responds with release_held_htlc, the sender's counterparty will receive that message. Here we lay some groundwork for using a counterparty-created reply path when sending held_htlc_available as an async sender in the next commit.
As part of supporting sending payments as an often-offline sender, the sender needs to send held_htlc_available onion messages such that the reply path to the message terminates at their always-online channel counterparty that is holding the HTLC. That way when the recipient responds with release_held_htlc, the sender's counterparty will receive that message. After laying groundwork over some past commits, here we as an async sender send held_htlc_available messages using reply paths created by our always-online channel counterparty.
Now that we support the feature of sending payments as an often-offline sender to an often-offline recipient, including the sender's LSP-side, we can start conditionally advertising the feature bit to other nodes on the network. Also remove an outdated comment, in a previous version of the code we had more logic to check if we're an always-online node before advertising the feature bit but it's since been removed.
Previously we were taking the NetworkGraph::{channels,nodes} locks before ChannelManager::per_peer_state's peer_state locks in some tests, which violated a lock order requirement we have in ChannelManager to take netgraph locks *after* peer_state locks. See OffersMessageFlow::path_for_release_htlc which is called while a peer_state lock is held and takes netgraph locks while creating blinded paths.
Add testing for sending payments from an often-offline sender to an often-offline recipient.
aba999b
to
9d71121
Compare
Rebased due to conflicts that were also causing CI failures. Also added another test comment post-rebase diff --git a/lightning/src/ln/async_payments_tests.rs b/lightning/src/ln/async_payments_tests.rs
index 5a0fedb5f..307510bbe 100644
--- a/lightning/src/ln/async_payments_tests.rs
+++ b/lightning/src/ln/async_payments_tests.rs
@@ -3113,6 +3113,14 @@ fn intercepted_hold_htlc() {
create_announced_chan_between_nodes_with_value(&nodes, 1, 2, 1_000_000, 0);
unify_blockheight_across_nodes(&nodes);
+ // Typically, JIT channels are created by an LSP providing the recipient with special "intercept"
+ // scids out-of-band, to be put in the recipient's BOLT 11 invoice route hints. The intercept
+ // scids signal to the LSP to open a JIT channel.
+ //
+ // Below we hardcode blinded payment paths containing intercept scids, to be used in the
+ // recipient's eventual static invoice, because we don't yet support intercept scids in the normal
+ // static invoice server flow.
+
// We need to be able to predict the offer_nonce in order to hardcode acceptable blinded payment
// paths containing a JIT channel scid for the recipient below.
// Without the offer_nonce used below in the `AsyncBolt12OfferContext` matching the eventual offer |
Implements the sender-side of sending payments as an often-offline sender to an often-offline recipient.
Partially addresses #2298
Based on
#4044,#4045