Skip to content

Conversation

@ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented Jul 3, 2025

Allows to dynamically resize a Parallel execution context to a new maximum parallelism.

Parallelism can grow: merely adds more schedulers, which may eventually start more threads.

Parallelism can shrink: immediately removes any overflow schedulers and tell them to shutdown. The actual shutdown is cooperative, a scheduler won't notice the shutdown until it's current fiber is rescheduled.

Follow up to #15936 and #15946.

Whenever the scheduler tries to reschedule or its main loop iterates, it
checks if it has been told to shutdown now, and if so, drains its
runnables queue back into the global queue, and terminates.
Relies on copy-on-write for the schedulers array: we dup and replace the
array instead of mutating it, to modify #steal to read the reference
once, then always refer to the same array (never mutated) so we can
safely steal without acquiring the mutex.
@ysbaddaden ysbaddaden force-pushed the feature/add-resize-to-execution-context-parallel branch from 71efcdb to 1a56f90 Compare August 4, 2025 17:33
@ysbaddaden ysbaddaden marked this pull request as ready for review August 4, 2025 17:34
@straight-shoota straight-shoota changed the title Add Fiber::ExecutionContext::Parallel#resize Add Fiber::ExecutionContext::Parallel#resize Aug 5, 2025
@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented Aug 5, 2025

I dropped the Shutdown::NO value and replaced with an invalid value until it's set to Shutdown::NOW. I kept it since @shutdown.now? reads well, but maybe we could drop it, @shutdown_now would read just as well. That depends on the cooperative shutdown.

~~EDIT: I believe the cooperative shutdown would use Shutdown::OnIdle for example. But it could also be a property on Parallel instead of Parallel::Scheduler 🤔 ~~

I dropped the enum and use a boolean that is enough for now. We'll see when we implement context shutdown if an enum would make sense.

I also fixed an issue (must reset @parked) and added an early abort when spinning (no need to spin). I kept the non-blocking evloop run, which doesn't hurt... it might be preferable to enqueue to the global queue directly, though 🤔

@Sija
Copy link
Contributor

Sija commented Aug 5, 2025

I'd use #shutdown? (getter) and #shutdown! to better signify their function. Right now it's IMO kinda unclear with @shutdown ivar and #shutdown method which is not a simple getter.

We must update the spinning threads counter before waking up a thread,
because the thread's scheduler will declare itself as spinning right
after unpark and expects the counter to already have been incremented.

We must also stop spinning before shutting down because the scheduler
might have been spinning.
@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented Aug 25, 2025

  • I fixed an issue with the @spinning counter where the counter would go out of sync.
  • I renamed the #shutdown method as #shutdown! since it could now be a bit ambiguous compared to the previous #shutdown(:now). I kept @shutdown since it's fully internal to the scheduler.

@straight-shoota
Copy link
Member

Could we get a test for ExecutionContext::Parallel? So far this is only testing Runnables.

@straight-shoota straight-shoota added this to the 1.18.0 milestone Sep 8, 2025
@straight-shoota straight-shoota merged commit 02bed9c into crystal-lang:master Sep 10, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from Review to Done in Multi-threading Sep 10, 2025
@ysbaddaden ysbaddaden deleted the feature/add-resize-to-execution-context-parallel branch September 10, 2025 08:26
straight-shoota added a commit that referenced this pull request Sep 11, 2025
…16135)

The `Concurrent` implementation mostly duplicates the `Parallel` one, and the parallel schedulers now take a few shortcuts when parallelism is set to 1 (skip stealing, quick evloop check) and `Concurrent` doesn't exhibit any noticeable performance improvement over `Parallel`.

Having a single implementation instead of two, almost identical, will ease maintenance.

I kept the type and didn't deprecate it yet. Its only value is that you can't mistakenly resize the context, and an exception is raised when you try to (related to #15956).

Co-authored-by: Johannes Müller <[email protected]>
straight-shoota added a commit that referenced this pull request Sep 15, 2025
The concurrent context doesn't exhibit any performance improvement over the parallel context (see #16135). So we may as well use a parallel context as default context, and allow resizing it for more parallelism (see #15956).

The default configuration is without parallelism (MT:1), thus the execution context remains concurrent only, which is backward compatible (nothing changes).

Resizing the context to enable parallelism will be at the discretion of the application: it can keep using the `CRYSTAL_WORKERS` environment variable, and/or fallback to the number of CPUs or use a hardcoded value, and/or provide a command line argument, or anything else.

Co-authored-by: Johannes Müller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants