-
Notifications
You must be signed in to change notification settings - Fork 419
Add LSPS5 DOS protections. #3993
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
👋 I see @jkczyz was un-assigned. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3993 +/- ##
==========================================
+ Coverage 88.84% 88.87% +0.03%
==========================================
Files 175 175
Lines 127760 127790 +30
Branches 127760 127790 +30
==========================================
+ Hits 113510 113579 +69
+ Misses 11679 11641 -38
+ Partials 2571 2570 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🔔 1st Reminder Hey @tnull! This PR has been waiting for your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this!
use lightning::sign::NodeSigner; | ||
|
||
/// A trait for implementing Denial-of-Service (DoS) protection mechanisms for LSP services. | ||
pub trait DosProtectionEnforcer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this is over-engineering. We can just have a single method on LSPS5ServiceHandler
that checks the necessary bools. I also don't think this warrants to be in a separate module right now.
peer_state | ||
.outbound_channels_by_intercept_scid | ||
.values() | ||
.any(|c| c.is_pending_channel_open()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we'll also want to allow LSPS5 for any activity beyond this point, no? I think rather than adding helpers for each individual state, maybe we finally should expose the list of OutboundJITChannelState
s at least pub(crate) and also implement Ord
for OutboundJITChannelState
? This would allow us to use comparison operators on the channel states.
Also, can we somehow enforce that for a pending payment, we'd always first process it and advance the LSPS2 state before evaluating whether we can notifiy the LSPS5 client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I thought here (and I didn't clarify, sorry) is that it should only check for the pending_channel_open
. after that state, the channel will be opened and the LSPS5 service has_active_channels
check will kick in, so there is no point on checking the other LSPS2 states. does that make sense?
lightning-liquidity/src/manager.rs
Outdated
if !self.peer_is_engaged(sender_node_id) { | ||
return Err(LightningError { | ||
err: format!( | ||
"Rejecting LSPS5 request from {:?} without existing engagement", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"existing engagement" is pretty unclear terminology, would be good to find more intuitive wording.
lightning-liquidity/src/manager.rs
Outdated
"Rejecting LSPS5 request from {:?} without existing engagement", | ||
sender_node_id | ||
), | ||
action: ErrorAction::IgnoreAndLog(Level::Info), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't log on on Info
here.
@@ -64,6 +64,8 @@ struct StoredWebhook { | |||
pub struct LSPS5ServiceConfig { | |||
/// Maximum number of webhooks allowed per client. | |||
pub max_webhooks_per_client: u32, | |||
/// Require an existing channel or active LSPS1/LSPS2 flow before accepting requests. | |||
pub enforce_dos_protections: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's at least start out with enabling this by default, i.e., don't make it configurable.
👋 The first review has been submitted! Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer. |
38c8436
to
ffe25e6
Compare
@tnull thanks for the review! I think this is a much better I pushed a fixup commit addressing all the comments ffe25e6 I still have my doubts regarding this:
I think that maybe that's not necessary and I posted a question related to that: #3993 (comment) thanks! |
ffe25e6
to
aab6332
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, much cleaner already! Some comments.
@@ -108,8 +108,8 @@ struct ForwardPaymentAction(ChannelId, FeePayment); | |||
struct ForwardHTLCsAction(ChannelId, Vec<InterceptedHTLC>); | |||
|
|||
/// The different states a requested JIT channel can be in. | |||
#[derive(Debug)] | |||
enum OutboundJITChannelState { | |||
#[derive(Clone, Debug, PartialEq, Eq)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be Clone
suddenly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -134,6 +134,30 @@ enum OutboundJITChannelState { | |||
PaymentForwarded { channel_id: ChannelId }, | |||
} | |||
|
|||
impl OutboundJITChannelState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this explicit implementation? Wouldn't derive(PartialOrd)
do the ~the same thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would need putting also the derive(Ord)
and derive(PartialOrd)
on PaymentQueue
and InterceptId
happy to do it but wanted to minimize changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like I can't define Ord but derive PartialOrd
from the linter:
error: you are implementing `Ord` explicitly but have derived `PartialOrd`
--> lightning-liquidity/src/lsps2/service.rs:164:1
|
164 | / impl Ord for OutboundJITChannelState {
165 | | fn cmp(&self, other: &Self) -> core::cmp::Ordering {
166 | | self.stage().cmp(&other.stage())
167 | | }
168 | | }
| |_^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry. Then I have to eat my words. I'd actually prefer the way you had vs. introducing the redundant ..Stage
object. Sorry, mind reverting to explicitly implementing it?
let outer_state_lock = self.per_peer_state.read().unwrap(); | ||
if let Some(inner_state_lock) = outer_state_lock.get(counterparty_node_id) { | ||
let peer_state = inner_state_lock.lock().unwrap(); | ||
peer_state.outbound_channels_by_intercept_scid.values().map(|c| c.state.clone()).max() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't buy we need to clone here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error[E0507]: cannot move out of `c.state` which is behind a shared reference
--> lightning-liquidity/src/lsps2/service.rs:599:68
|
599 | peer_state.outbound_channels_by_intercept_scid.values().map(|c| c.state).max()
| ^^^^^^^ move occurs because `c.state` has type `OutboundJITChannelState`, which does not implement the `Copy` trait
|
help: consider cloning the value if the performance cost is acceptable
|
599 | peer_state.outbound_channels_by_intercept_scid.values().map(|c| c.state.clone()).max()
| ++++++++
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me rephrase: we definitely can't clone the entire state, with all the HTLC data, etc each time we want to check whether a client reached a certain state. I still think handling all the state by reference should be doable, but it's probably easiest to mirror the fn has_active_requests(&self, counterparty_node_id: &PublicKey) -> bool
above.
(While eventually we want to give users insight into the held state, it doesn't need to happen in this PR, so for now it's probably easiest to revert to having OutboundJITChannelState
private, and just check the bool
returned by the helper mentioned above)
) -> bool { | ||
self.client_has_open_channel(client_id) | ||
|| lsps1_has_activity | ||
|| lsps2_max_state.map_or(false, |s| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, why not just use >= PendingChannelOpen
now that we can?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could not be done directly because I would need to create some dummy properties for the PendingChannelOpen
struct for it to be able to compare:
|s| s >= OutboundJITChannelState::PendingChannelOpen {
payment_queue: PaymentQueue::new(),
opening_fee_msat: 0
})
I prefer not to do this so went with something different:
- dropped the
OutboundJITChannelState::ord_index()
- instead, I created a new enum
OutboundJITStage
that matches one to one withOutboundJITChannelState
but has no properties. this newOutboundJITStage
hasDerive(ord, partialOrd)
so it just grabs the declaration order for its ordering
with this, now we can do |s| s.stage() >= OutboundJITStage::PendingChannelOpen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, see #3993 (comment)
I'd prefer not to introduce a redundant object here. Sorry for the noise.
8eaea29
to
9a59755
Compare
I squashed the last fixup commit and created a new fixup commit addressing the latest comments also responded to some of the comments that may need a follow up thanks @tnull ! |
just pushed a fixup commit addressing the latest comments a74c585
let me know what you think @tnull , thanks!! |
142842b
to
109de16
Compare
peer_state.outbound_channels_by_intercept_scid.values().any(|chan| { | ||
matches!( | ||
chan.state, | ||
OutboundJITChannelState::PendingChannelOpen { .. } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, how do we imagine this to work for the initial LSPS2 receive? If we only allow clients to even register for notifications once we already have a (pending) channel open, how would a first-time user's phone get notified to wake up and accept the channel when they receive the very first payment?
I think we'd need to extend this to also accept LSPS5 requests for PendingInitialPayment
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that makes sense. I will update this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, this should be good for a second reviewer.
In the meantime, it's gonna need a rebase.
✅ Added second reviewer: @jkczyz |
When handling an incoming LSPS5 request, the manager will check if the counterparty is 'engaged' in some way before responding. `Engaged` meaning = active channel | LSPS2 active operation | LSPS1 active operation. Logic: `If not engaged then reject request;` A single test is added only checking for the active channel condition, because it's not super easy to get LSPS1-2 on the correct state to check this (yet). Other tangential work is happening that will make this easier and more tests will come in the near future
- refactor highest_state_for_peer and has_active_requests for better readabiliy - refactor ordering logic OutboundJITChannelState
…hecking if lsps2 has any pendingChannelOpen or above
…tboundJITChannelState
8f2805a
to
b522194
Compare
done ✅ |
let outer_state_lock = self.per_peer_state.read().unwrap(); | ||
outer_state_lock.get(counterparty_node_id).map_or(false, |inner| { | ||
let peer_state = inner.lock().unwrap(); | ||
!(peer_state.pending_requests.is_empty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, how are we thinking about whether pending_requests
counts? I mean on the one hand not counting it is gonna cause issues cause users are gonna register a webhook quickly and see it fail (tho they should retry), on the other hand, a pending request that we haven't even responded to feels like it shouldn't count - it just means they've sent us a message and we haven't responded yet.
let lsps2_has_active_requests = self | ||
.lsps2_service_handler | ||
.as_ref() | ||
.map_or(false, |h| h.has_active_requests(sender_node_id)); | ||
#[cfg(lsps1_service)] | ||
let lsps1_has_active_requests = self | ||
.lsps1_service_handler | ||
.as_ref() | ||
.map_or(false, |h| h.has_active_requests(sender_node_id)); | ||
#[cfg(not(lsps1_service))] | ||
let lsps1_has_active_requests = false; | ||
|
||
if !lsps5_service_handler.can_accept_request( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move all this logic to only running when we get a message that indicates we're allocating state, rather than ignoring all messages from a peer?
When handling an incoming LSPS5 request, the manager will check
if the counterparty is 'engaged' in some way before responding.
Engaged
meaning = active channel | LSPS2 active operation | LSPS1 active operation.Logic:
If not engaged then reject request;
A single test is added only checking for the active channel condition,
because it's not super easy to get LSPS1-2 on the correct state to check this (yet).
Other tangential work is happening that will make this easier and more tests will come in the near future.
A few decisions that could be changed:- the DOS protections are optional (default true). maybe this should not be configurable?- I made thedos_protection_enforcer
generic enough so it would be possible to add more behavior in the future. also some logic could be moved here like theignored_peers
logic from the manager, which could make sense to move to the dos enforcerthoughts @tnull ?