
Remove background processor macro and parallelize persistence #3968


Merged
merged 7 commits into lightningdevkit:main from remove-bg-macro on Jul 31, 2025

Conversation

@joostjager joostjager (Contributor) commented Jul 28, 2025

The define_run_body! macro appears to have reached a complexity threshold where further extensions would add more complexity than benefit. At this point, it may be more practical to accept some duplication between the sync and async variants.

This PR expands the macro, cleans up the code and implements an initial step towards parallelization by running scorer, graph, sweeper and channelmanager persistence simultaneously.
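
At a high level, "simultaneously" means the four persistence futures are created up front and driven together instead of each being awaited in turn. A hedged sketch of the difference (the persist_* functions and the use of tokio::join! are illustrative stand-ins, not LDK APIs; the PR itself ends up using a small hand-rolled poller discussed in the review below):

use std::io;

// Hypothetical stand-ins for the scorer, network graph, sweeper and
// ChannelManager writes.
async fn persist_scorer() -> Result<(), io::Error> { Ok(()) }
async fn persist_graph() -> Result<(), io::Error> { Ok(()) }
async fn persist_sweeper() -> Result<(), io::Error> { Ok(()) }
async fn persist_manager() -> Result<(), io::Error> { Ok(()) }

async fn persist_all() -> Result<(), io::Error> {
	// Sequential: each write only starts once the previous one has finished.
	//   persist_scorer().await?;
	//   persist_graph().await?;
	//   ...
	// Concurrent: all four futures exist before any is polled, so the
	// underlying I/O can overlap.
	let (a, b, c, d) =
		tokio::join!(persist_scorer(), persist_graph(), persist_sweeper(), persist_manager());
	a.and(b).and(c).and(d)
}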

@ldk-reviews-bot commented Jul 28, 2025

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

None,
));

log_trace!(logger, "Calling ChannelManager's timer_tick_occurred on startup");
Contributor Author

If desired, I can split this commit into multiple smaller clean-up commits.

@joostjager joostjager force-pushed the remove-bg-macro branch 4 times, most recently from ddd5249 to 64c502c on July 29, 2025 10:57

codecov bot commented Jul 29, 2025

Codecov Report

❌ Patch coverage is 78.27004% with 103 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.89%. Comparing base (55d8666) to head (d9a6641).
⚠️ Report is 21 commits behind head on main.

Files with missing lines Patch % Lines
lightning-background-processor/src/lib.rs 78.27% 73 Missing and 30 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3968      +/-   ##
==========================================
- Coverage   88.93%   88.89%   -0.05%     
==========================================
  Files         174      174              
  Lines      123880   124232     +352     
  Branches   123880   124232     +352     
==========================================
+ Hits       110176   110433     +257     
- Misses      11251    11319      +68     
- Partials     2453     2480      +27     
Flag Coverage Δ
fuzzing 22.61% <ø> (-0.01%) ⬇️
tests 88.72% <78.27%> (-0.05%) ⬇️


@joostjager joostjager force-pushed the remove-bg-macro branch 3 times, most recently from 1364139 to b68a228 on July 29, 2025 13:37
@@ -674,7 +682,7 @@ where
PM::Target: APeerManager,
LM::Target: ALiquidityManager,
O::Target: 'static + OutputSpender,
D::Target: 'static + ChangeDestinationSource,
D::Target: 'static + ChangeDestinationSource + MaybeSync,
Contributor Author

Now all of a sudden needed for Rust 1.63.0.

Collaborator

Bleh, in order to use MultiResultFuturePoller we end up having to Box everything (because the futures need to be the same type, forcing dyn indirection), and then we have to expand our type bounds (because the compiler can't see through the Future passed to MultiResultFuturePoller to tell whether it's Send/Sync or not). I'm not convinced it's worth it; we can drop all of that with a simple poller, see https://git.bitcoin.ninja/?p=rust-lightning;a=shortlog;h=refs/heads/2025-07-3968-poller-demo
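
For readers following along, a minimal sketch of the "simple poller" idea (hypothetical code, not the contents of the linked branch): futures of different concrete types are polled in lockstep from a single poll_fn, so nothing has to be boxed and the concrete types stay visible to the compiler for Send/Sync inference. The join2 name and the use of core::future::poll_fn are assumptions of this sketch.

use core::future::{poll_fn, Future};
use core::pin::Pin;
use core::task::Poll;

// Drive two futures of *different* concrete types to completion concurrently,
// without boxing either of them. Both must be Unpin so they can be re-pinned
// with Pin::new on every poll.
async fn join2<A, B>(mut a: A, mut b: B) -> (A::Output, B::Output)
where
	A: Future + Unpin,
	B: Future + Unpin,
{
	let mut a_out = None;
	let mut b_out = None;
	poll_fn(move |cx| {
		// Only poll a future while it has not completed; polling a finished
		// future again would violate the Future contract.
		if a_out.is_none() {
			if let Poll::Ready(v) = Pin::new(&mut a).poll(cx) {
				a_out = Some(v);
			}
		}
		if b_out.is_none() {
			if let Poll::Ready(v) = Pin::new(&mut b).poll(cx) {
				b_out = Some(v);
			}
		}
		if a_out.is_some() && b_out.is_some() {
			Poll::Ready((a_out.take().unwrap(), b_out.take().unwrap()))
		} else {
			Poll::Pending
		}
	})
	.await
}

Because the futures keep their concrete types, auto traits such as Send propagate naturally, which is exactly what is lost once everything goes behind dyn Future.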

Contributor Author

I noticed too that the futures are always of different types and need to go through dyn. The code you show there is indeed better; I'll use that.

Only expansion, clean ups in the next commit to facilitate review.
@joostjager joostjager marked this pull request as ready for review July 29, 2025 14:25
@joostjager joostjager requested a review from TheBlueMatt July 29, 2025 14:25
(false, false) => FASTEST_TIMER,
};
// Channel manager timer tick.
match check_sleeper(&mut last_freshness_call) {
Collaborator

Since we're splitting between async and sync anyway, it would be nice to do the new-sleeper init in check_sleeper: if check_sleeper polls Ready, we aren't allowed to poll it again (per the Future API contract), so we need to be careful to always create a new sleeper. It would be easier to just do it in check_sleeper rather than being diligent about it in review here.

Contributor Author

I tried it with

fn check_sleeper<SleepFuture: core::future::Future<Output = bool> + core::marker::Unpin>(
	fut: &mut SleepFuture, new_sleeper: impl Fn() -> SleepFuture,
) -> Option<bool> {
	let mut waker = dummy_waker();
	let mut ctx = task::Context::from_waker(&mut waker);
	match core::pin::Pin::new(&mut *fut).poll(&mut ctx) {
		task::Poll::Ready(exit) => {
			*fut = new_sleeper();
			Some(exit)
		},
		task::Poll::Pending => None,
	}
}

But that ran into a complication with the network graph. There the new interval is based on whether a prune has already happened. Can probably be refactored to make it work, but perhaps it is also ok to leave this pre-existing weakness for now?

Collaborator

Still seems worth fixing. We can just pass NETWORK_PRUNE_TIMER for the network graph prune, and in the case where we need to prune but prunable_network_graph returns None, we can override it with FIRST_NETWORK_PRUNE_TIMER. That's mostly for the "RGS sync but RGS hasn't finished yet" case anyway.

Contributor Author

Will do this in a direct follow-up to minimize the rebase pain of this PR, which contains the macro expansion and code move.


@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@joostjager joostjager force-pushed the remove-bg-macro branch 4 times, most recently from 6c18934 to 80f68ce on July 30, 2025 08:39
@@ -15,27 +15,22 @@ use core::marker::Unpin;
use core::pin::Pin;
use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

pub(crate) enum ResultFuture<F: Future<Output = Result<(), E>>, E: Copy + Unpin> {
pub(crate) enum ResultFuture<F: Future<Output = Result<(), E>>, E: Unpin> {
Contributor Author

No longer using MultiResultFuturePoller, but left the commit in as a general clean up.


joostjager and others added 6 commits July 30, 2025 18:49
Macros can only be expanded recursively, so the log macros needed to be brought back. Also removed some unnecessary parentheses, curly braces, and unused arguments/code.
Prepare for parallelization.
log_trace!(logger, "Done persisting ChannelManager.");
}

// Note that we want to run a graph prune once not long after startup before
Collaborator

This comment is now in the wrong spot. It was on the assignment of the timer's duration, and now it's on the sleep itself.

Contributor Author

Hm yes, moved. For the async macro invocation, it was always at the wrong location, because async sets the timer further down. It just wasn't so easy to spot.

Contributor Author

And moved back after the check_sleeper extension...

Comment on lines +1438 to +1439
let fastest_timeout = batch_delay.min(Duration::from_millis(100));
sleeper.wait_timeout(fastest_timeout);
Collaborator

Somehow all the await_slow logic got dropped from the sync BP and now only appears in the async one?

Collaborator

Oh, TIL we only do that on the async BP...weird...

Contributor Author

Yes, that surfaced indeed... Not sure if it was intentional originally, or should be mirrored for sync?

@TheBlueMatt TheBlueMatt (Collaborator) left a comment

Discussed it offline and we can swap the way the check_sleeper method works in a followup to avoid having to rebase this. Gonna go ahead and land.


@TheBlueMatt
Collaborator

This is straightforward code translation. There's some risk in the reordering, but it's relatively low, so just gonna land with one review. I might do a follow-up to do an initial poll of the CM persist future before we do the graph prune/scorer time step, so that it has a chance to get going before we do some CPU-intensive tasks.
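
A rough illustration of that follow-up idea (hypothetical helper, not LDK API): poll the persistence future once so any work behind it can get started, remember the result if it happens to complete immediately (a future must not be polled again after returning Ready), and only then run the CPU-heavy graph prune and scorer step before awaiting the rest.

use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll, Waker};

/// Hypothetical sketch: poll `fut` once so the work behind it can start before
/// CPU-intensive tasks run. If it completes immediately the output is returned
/// and the caller must not poll it again.
fn kick_off<F: Future + Unpin>(fut: &mut F, waker: &Waker) -> Option<F::Output> {
	let mut cx = Context::from_waker(waker);
	match Pin::new(fut).poll(&mut cx) {
		Poll::Ready(out) => Some(out),
		Poll::Pending => None,
	}
}

In the background processor this would presumably reuse the existing dummy_waker, with the early result stored so the ChannelManager persist future is never polled after completion.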

@TheBlueMatt TheBlueMatt merged commit 664511b into lightningdevkit:main Jul 31, 2025
24 of 25 checks passed
@joostjager
Contributor Author

Follow up: #3978

@tnull tnull (Contributor) left a comment

Post-merge ACK. Also confirmed that this/current main works with LDK Node.

@@ -761,7 +761,6 @@ where
last_forwards_processing_call = sleeper(cur_batch_delay);
}
if should_break {
log_trace!(logger, "Terminating background processor.");
Contributor

nit: I think I liked the previous behavior (logging this when initiating the shutdown process) a bit better.

Contributor Author

I don't think it changed. We are still logging before we do the final persist round. The only difference is logging before or after the break.

@elnosh elnosh (Contributor) left a comment

Another post-merge ACK. I had started looking at it a bit before it was merged, but I'm very much in favor of this 👍 and of fewer macros where feasible.

Comment on lines 1305 to 1316
if last_freshness_call.elapsed() > FRESHNESS_TIMER {
log_trace!(logger, "Calling ChannelManager's timer_tick_occurred");
channel_manager.get_cm().timer_tick_occurred();
last_freshness_call = Instant::now();
}
if last_onion_message_handler_call.elapsed() > ONION_MESSAGE_HANDLER_TIMER {
if let Some(om) = &onion_messenger {
log_trace!(logger, "Calling OnionMessageHandler's timer_tick_occurred");
om.get_om().timer_tick_occurred();
}
last_onion_message_handler_call = Instant::now();
}
Contributor

Was this reordering in the sync one needed, or was it done to match the async one?

Contributor Author

Done to match async (#3968 (comment))

Comment on lines +1518 to +1521
log_error!(logger,
"Error: Failed to persist scorer, check your disk and permissions {}",
e,
);
Contributor

this seems to be a bit oddly formatted

Contributor Author

Hmm indeed. Rustfmt is unpredictable inside macros.

Contributor Author

Added fix to #3978

@joostjager joostjager mentioned this pull request Jul 24, 2025
@joostjager joostjager self-assigned this Jul 31, 2025