-
Notifications
You must be signed in to change notification settings - Fork 256
Lambda in main thread and Enumerator support #356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 10 commits
36af69b
28b74f9
5d5eb63
9d9bd11
756685f
60ff3ac
4823fcc
3eda109
7e8a77c
20cb415
7e76725
5eea888
ec0083b
074eab7
70208e6
98a74e4
f3a9dd2
9b9f4bd
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -97,20 +97,27 @@ def wait | |||||||||
|
|
||||||||||
| class JobFactory | ||||||||||
| def initialize(source, mutex) | ||||||||||
| @lambda = (source.respond_to?(:call) && source) || queue_wrapper(source) | ||||||||||
| @source = source.to_a unless @lambda # turn Range and other Enumerable-s into an Array | ||||||||||
| @lambda = enum_wrapper(source) || (source.respond_to?(:call) && source) || queue_wrapper(source) | ||||||||||
| @source = source.to_a unless @lambda # turn non-Enumerable-s into an Array | ||||||||||
| @runloop_queue = Thread::Queue.new if @lambda | ||||||||||
| @mutex = mutex | ||||||||||
| @index = -1 | ||||||||||
| @stopped = false | ||||||||||
| end | ||||||||||
|
|
||||||||||
| def next | ||||||||||
| if producer? | ||||||||||
| queue_for_thread = Thread.current.thread_variable_get(:parallel_queue) | ||||||||||
| if @runloop_queue && queue_for_thread | ||||||||||
| return if @stopped | ||||||||||
| item = runloop_enq(queue_for_thread) | ||||||||||
| return if item == Stop | ||||||||||
| index = @index += 1 | ||||||||||
| elsif producer? | ||||||||||
| # - index and item stay in sync | ||||||||||
| # - do not call lambda after it has returned Stop | ||||||||||
| item, index = @mutex.synchronize do | ||||||||||
| return if @stopped | ||||||||||
| item = @lambda.call | ||||||||||
| item = call_lambda | ||||||||||
| @stopped = (item == Stop) | ||||||||||
| return if @stopped | ||||||||||
| [item, @index += 1] | ||||||||||
|
|
@@ -123,6 +130,26 @@ def next | |||||||||
| [item, index] | ||||||||||
| end | ||||||||||
|
|
||||||||||
| def runloop | ||||||||||
|
||||||||||
| def runloop | |
| def run_runloop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some method comments would help here too, what does it do exactly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Please feel free to change anything.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def runloop | |
| def consume_enumerator_queues |
does this work ?
# consume items for from enumerator queues until they stop producing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed the comment says "enumerator queues", but as far as I can tell, there's only one queue being consumed from in this method.
Does this enumerator queues mean JobFactory's @lambda, doesn't it?
Should this be singular instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this right ?
| loop do | |
| # every time a threads wants to start work, it adds a new queue, we pop the queue here until everything is done (stop) | |
| # then push a new item into the queue for the thread to read and work on | |
| loop do |
takahiro-blab marked this conversation as resolved.
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe give some examples of what types this is trying to detect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are asking about #enum_wrapper , aren't you?
This method aims converting Enumerator instance and objects including Enumerable to Method instance which calls #next , but objects which is accessible by [] method shouldn't be converted. It's because accessing by index is faster and it can avoid serializing problems, you know.
So, as first, checking [] method, if [] method is available, it returns false. Next, if #next method is available, returns Method instance.
For example:
enum_wrapper([1,2,3]) # -> false
enum_wrapper(1..5) # -> Method ( (1..5).method(:next) )
enum_wrapper(Prime.to_enum) # -> Method (See infinite_sequece.rb test case)
takahiro-blab marked this conversation as resolved.
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explain why 1 less
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's because the last thread must raise ThreadError by calling empty queue's #pop method.
For example, if there are 5 worker threads (when option[:count] is 5), Queue finished_monitor will have 4 values.
Each of five worker threads will execute finished_monitor.pop(true), then 4 of them can get value from finished_monitor.
But the last one thread don't get value, and a ThreadError exception will be raised.
So JobFactory#stopper will be called only once.
(I just realized that, perhaps, multiple calls of JobFactory#stopper may has no side effect... If so, these could be written more concisely without using Queue finished_monitor.)
In rescue section, the last one thread will call stopper.call, which stops JobFactory#runloop.
Queue#pop raises ThreadError when true is given as argument and the queue is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah got it, very complicated ... can you leave some short inline comment to explain a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added comment to this code with below question.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explain why it needs to be executed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is not case, JobFacotry#runloop will block in queue = @runloop_queue.pop and it cannot come back from #runloop.
To main thread's surely finishing options[:runloop]&.call (JobFactory#stopper), each of worker threads must call finished_monitor&.pop(true) even if it is killed. So it's in the ensure section.
(Please look at above question)
(JobFactory#stopper pushes Stop JobFactory's @runloop_queue, so it make #runloop finish.)
And, this logic is also necessary for terminating operations by Ctrl+C or workers' throwing Parallel::Kill.
(When Parallel::Kill or Break is thrown, worker threads will be killed by UserInterruptHandler.kill in #work_in_processes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx, can you leave a bit of this inline for future archeologists :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added comment with above question.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| # Reproduction case based on GitHub Issue #211 | ||
| # Original code provided by @cyclotron3k in the issue | ||
takahiro-blab marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| # using a enum that is infinite, so this will hang forever when trying to convert to an array | ||
|
|
||
| require 'prime' | ||
| require './spec/cases/helper' | ||
|
|
||
| private_key = 12344567899 | ||
|
|
||
| results = [] | ||
|
|
||
| [{ in_threads: 2 }, { in_threads: 0 }].each do |options| | ||
| primes = Prime.to_enum | ||
| Parallel.each(primes, options) do |prime| | ||
| if private_key % prime == 0 | ||
| results << prime.to_s | ||
| raise Parallel::Break | ||
| end | ||
| end | ||
| end | ||
|
|
||
| print results.join(',') | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| # frozen_string_literal: true | ||
| require './spec/cases/helper' | ||
|
|
||
| runner_thread = nil | ||
| all = [3, 2, 1] | ||
| my_proc = proc { | ||
| runner_thread ||= Thread.current | ||
| if Thread.current != runner_thread | ||
| raise "proc is called in different thread!" | ||
| end | ||
|
|
||
| all.pop || Parallel::Stop | ||
| } | ||
|
|
||
| class Callback | ||
| def self.call(x) | ||
| $stdout.sync = true | ||
| "ITEM-#{x}" | ||
| end | ||
| end | ||
| puts(Parallel.map(my_proc, in_threads: 2) { |(i, _id)| Callback.call i }) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| # frozen_string_literal: true | ||
| require './spec/cases/helper' | ||
|
|
||
| def generate_proc | ||
| count = 0 | ||
| proc { | ||
| raise StopIteration if 3 <= count | ||
| count += 1 | ||
| } | ||
| end | ||
|
|
||
| class Callback | ||
| def self.call(x) | ||
| $stdout.sync = true | ||
| "ITEM-#{x}" | ||
| end | ||
| end | ||
|
|
||
| [{ in_processes: 2 }, { in_threads: 2 }, { in_threads: 0 }].each do |options| | ||
| puts(Parallel.map(generate_proc, options) { |(i, _id)| Callback.call i }) | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it not like this ?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stoppedwill be set inJobFactory#runloop.This
JobFactory#nextmay be called from some threads at the same time.So your
@stopped = (item == Stop)needs exclusive controlMutex#synchronize.Taking the value out of the
@lambdaand Setting the result to@stoppedmust be handled in the critical section or be handled in one thread, I think.Otherwise, a certain thread may clear
@stoppedflag. This could be a bug.So previous implementation of Parallel uses
@mutex.synchronize.This PR's code handles
@lambdaand checkitem == Stopin#runloopby one thread, so@mutexis given, but it's not used.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah thx, yeah this is a tricky section :)
can you leave a bit of inline comment for the gotchas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added comment to this code.