
Conversation

@mikepurvis
Contributor

As discussed in #169.

It's a nuisance having to put the wrapper/helper class up at module level, but multiprocessing can only pickle top-level functions and objects, so I don't see an easy way around this. The class could potentially be avoided by defining separate helpers that always capture / never capture warnings, but that's a matter of taste.
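A minimal sketch of the constraint being described, using the `_PackageParser` name from the diff below (its internals here are guessed for illustration): `Pool.map()` pickles its callable by reference, and that only works for objects importable at module scope, so a closure or local class inside `find_packages` would fail.

```python
import multiprocessing

from catkin_pkg.package import parse_package

class _PackageParser(object):
    # Must live at module level: Pool.map() pickles this callable by
    # reference, which fails for closures and locally defined classes.
    def __init__(self, capture_warnings):
        self.capture_warnings = capture_warnings

    def __call__(self, path):
        warnings = [] if self.capture_warnings else None
        package = parse_package(path, warnings=warnings)
        return path, package, warnings
```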

@mikepurvis mikepurvis force-pushed the parallel-pkg-parse branch from 33c4151 to d2488fb Compare May 10, 2017 18:36
```python
for path in package_paths:
    packages[path] = parse_package(os.path.join(basepath, path), warnings=warnings)
return packages

parser = _PackageParser(capture_warnings=isinstance(warnings, list))
```
Member

Should this check be changed to `capture_warnings=warnings is not None` in order to allow "duck typing"?

Contributor Author

Yeah, probably— will change.

```python
parser = _PackageParser(capture_warnings=isinstance(warnings, list))
results = multiprocessing.Pool(4).map(parser, package_paths)
if parser.capture_warnings:
    map(warnings.extend, [package_warnings for _, _, package_warnings in results])
```
Member

Maybe use results[2] to avoid the underscores?

Contributor Author

Yeah, can do. I find the tuple unpacking a bit more readable than explicit indexing, but it could go either way.

Contributor Author

Okay, I've dropped the comprehensions in favour of zipping; I think that's maybe a bit clearer.
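For illustration only, the zipped version might look something like this (the names match the diff above; the exact code landed in the updated commit, which isn't shown here):

```python
# Transpose the list of (path, package, warnings) tuples from the pool
# instead of picking fields out with per-field comprehensions.
paths, parsed_packages, warning_lists = zip(*results)
packages = dict(zip(paths, parsed_packages))
if parser.capture_warnings:
    for package_warnings in warning_lists:
        warnings.extend(package_warnings)
```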

```python
    packages[path] = parse_package(os.path.join(basepath, path), warnings=warnings)
return packages

parser = _PackageParser(capture_warnings=isinstance(warnings, list))
results = multiprocessing.Pool(4).map(parser, package_paths)
```
Member

Should we use something more "dynamic" like `multiprocessing.cpu_count()` as the default (and fall back to a hard-coded number like 4)?

Contributor Author

Looks like you can just leave off the number and it will fall back on cpu_count as the default. Will change (I found empirically that 4 worked well for me).
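Indeed, `multiprocessing.Pool()` with no argument (i.e. `processes=None`) sizes the pool from the machine's CPU count; `parser` and `package_paths` below are the names from the diff above:

```python
import multiprocessing

# processes defaults to None, meaning one worker per CPU (cpu_count()).
pool = multiprocessing.Pool()
results = pool.map(parser, package_paths)
pool.close()
pool.join()
```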

@mikepurvis mikepurvis force-pushed the parallel-pkg-parse branch from d2488fb to a65b88f Compare May 10, 2017 18:55
@mikepurvis mikepurvis force-pushed the parallel-pkg-parse branch from 03d2852 to 9a0641a Compare May 10, 2017 19:20
@dirk-thomas
Member

This looks good.

It might be a bit out of scope for this patch, but it would be great if the API allowed the same improvement for this use case (https://github.com/ros/rosdistro/blob/master/test/test_build_caches.py#L40-L41), which passes the XML rather than the paths.

@dirk-thomas
Member

I tried to apply the same idea to the parse_packages_by_strings case, but since there I already have all the XML in memory and only the parsing happens in parallel, it doesn't improve performance (it's actually about 10% slower).

I tried calling the find_packages_allowing_duplicates API locally, pointing it at different folders. For smaller numbers of packages (e.g. 100) the patched version was slower; for larger ones (e.g. 1000) it was certainly faster. Should we switch between the old and new behavior based on the number of packages, or just optimize for the large scale? @ros-infrastructure/ros_team Any opinions?
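Purely as a hypothetical sketch of the conditional variant (the threshold and the `_find_packages_serial` / `_find_packages_parallel` helper names are made up here; only `find_package_paths` is a real catkin_pkg function):

```python
PARALLEL_THRESHOLD = 100  # made-up cutoff; would need benchmarking to pick

def find_packages(basepath, warnings=None):
    package_paths = find_package_paths(basepath)
    if len(package_paths) < PARALLEL_THRESHOLD:
        # hypothetical helper wrapping the pre-patch serial loop
        return _find_packages_serial(basepath, package_paths, warnings)
    # hypothetical helper wrapping the Pool-based implementation
    return _find_packages_parallel(basepath, package_paths, warnings)
```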

@mikepurvis
Contributor Author

There's certainly a cost to summoning the other processes. I just wasn't sure if the cost was enough to justify switching between implementations, but happy to go either way on it.

@mikepurvis
Contributor Author

I don't really have the time/energy for it at the moment, but it would be interesting to study whether the time in this kind of situation is predominantly spent in syscalls, XML parsing, or Python itself. If it's either of the first two, then perhaps there are cases where the GIL isn't being properly released? If that were the case and it could be fixed, you'd definitely expect a threaded implementation to be the fastest, particularly since its setup cost is far smaller.
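Testing that hypothesis would be nearly a drop-in change, since `multiprocessing.pool.ThreadPool` exposes the same `map()` interface as `Pool` (a sketch, not code from this PR; `parser` and `package_paths` as in the diff above):

```python
from multiprocessing.pool import ThreadPool

# Threads are cheap to start and share memory, but they only beat
# processes here if the C code doing the XML work releases the GIL.
pool = ThreadPool()
results = pool.map(parser, package_paths)
pool.close()
pool.join()
```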

@mikepurvis
Contributor Author

Maybe there's a cleaner implementation here where multiprocessing is only used to get them into memory in parallel, and then they're parsed serially in a parse_package_strings function. That neatly sidesteps the matter of handling warnings properly, since that portion of things goes back to being serial.
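That split might look roughly like this (a sketch under assumptions: the `_read_package_xml` helper and the `parse_package_strings` signature are invented here, and `parse_package_string` is assumed to take a `warnings` argument like `parse_package` does):

```python
import multiprocessing
import os

from catkin_pkg.package import PACKAGE_MANIFEST_FILENAME, parse_package_string

def _read_package_xml(path):
    # Worker: pure file IO, no parsing, nothing tricky to pickle back.
    with open(os.path.join(path, PACKAGE_MANIFEST_FILENAME)) as f:
        return f.read()

def parse_package_strings(basepath, package_paths, warnings=None):
    pool = multiprocessing.Pool()
    xmls = pool.map(_read_package_xml,
                    [os.path.join(basepath, p) for p in package_paths])
    pool.close()
    # Serial parse: the shared warnings list never crosses a process boundary.
    packages = {}
    for path, xml in zip(package_paths, xmls):  # map() preserves order
        packages[path] = parse_package_string(xml, warnings=warnings)
    return packages
```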

@dirk-thomas
Member

I did some more benchmarking with a path containing ~1000 packages:

  • find_package_paths takes about 1/3 s
  • get_package_xml for all packages takes only 10-15 ms (independent of using / not using multiprocessing)
  • parse_package_string takes about 3/4 s without multiprocessing and about 1/4 s with it

Therefore I have created an updated PR which uses multiprocessing only for the xml parsing part: see #171.

With this implementation the performance with multiprocessing is only worse for fewer than 10 packages, and the overhead in that case is only around 10 ms, so I didn't implement any conditional logic to skip multiprocessing for small package counts.
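In outline, that approach would be something like the following (a hedged sketch, not the actual #171 code; it again assumes `parse_package_string` accepts a `warnings` list, and the `parse_packages` wrapper name is invented):

```python
import multiprocessing

from catkin_pkg.package import parse_package_string

def _parse_package_xml(path_and_xml):
    # Worker: parse one already-loaded manifest, collecting warnings locally.
    path, xml = path_and_xml
    warnings = []
    return path, parse_package_string(xml, warnings=warnings), warnings

def parse_packages(xmls, warnings=None):
    # xmls: list of (path, manifest_string) pairs, read serially since the
    # benchmark above shows IO is only 10-15 ms for ~1000 packages.
    pool = multiprocessing.Pool()
    results = pool.map(_parse_package_xml, xmls)
    pool.close()
    packages = {}
    for path, package, package_warnings in results:
        packages[path] = package
        if warnings is not None:
            warnings.extend(package_warnings)
    return packages
```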

@mikepurvis
Contributor Author

Interesting! I admittedly didn't dig in, but it definitely looked to me like the loading rather than the parsing was the more expensive piece. Thanks for verifying that it isn't.

@mikepurvis mikepurvis closed this May 17, 2017
@dirk-thomas
Member

Maybe it also depends on what kind of system you are using? Does the new PR show the same improvements for your use case as this one did?
