-
Notifications
You must be signed in to change notification settings - Fork 966
Concurrently download components of a toolchain #4436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Concurrently download components of a toolchain #4436
Conversation
10d1514
to
7093333
Compare
@@ -14,11 +14,11 @@ pub enum Notification<'a> { | |||
RemovingDirectory(&'a str, &'a Path), | |||
DownloadingFile(&'a Url, &'a Path), | |||
/// Received the Content-Length of the to-be downloaded data. | |||
DownloadContentLengthReceived(u64), | |||
DownloadContentLengthReceived(u64, &'a str), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind explaining the meaning of ""
here and why they are required? It might be an interesting addition to the doc comments as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are downloading the components concurrently, we can’t know which download the Content-Length refers to. Therefore, in addition to receiving the length itself, we also need the component’s URL so we can match it to the corresponding progress bar.
Note: I've already added a doc comment. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before resolving this conversation, I will investigate the correct use of this notification and the addition of this extra parameter to ensure that, even though the installations are not concurrent, it did not have any negative effect on them (which, unfortunately, cannot be guaranteed by the test suite).
) | ||
.unwrap() | ||
.progress_chars("## "), | ||
); | ||
pb.set_message(component.to_string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: .to_string()
doesn't seem necessary here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misreading the situation here, but I think we actually need the .to_string()
here.
This is due to the fact that the set_message()
expects Into<Cow<'static, str>>
. and we would try to pass a &str
with a non-'static
lifetime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO to_owned()
more clearly expresses what you're trying to do (and might be slightly more efficient). In this case since create_progress_bar()
needs two copies, suggest taking the argument as String
and cloning it for use in set_message()
.
837bf58
to
4737f25
Compare
After my investigation stemming from this comment, I have concluded that the changes presented on this PR are not consistent with the previous behavior of displaying a progress bar during the installation of the component. Would this be necessary, especially considering that the next step will be to interleave the downloads with the installation? |
@FranciscoTGouveia No, I don't think it is necessary for that part of the behavior to stay unchanged, as you have already mentioned. Of course, if it is easy to add it back, then it might still be interesting to do so. |
|
||
things_to_install.push((component, format, downloaded_file)); | ||
let component_stream = | ||
tokio_stream::iter(components.into_iter()).map(|(component, format, url, hash)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have to consider the restriction of concurrent downloads similar to what we have done with threadpooled diskio.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This closure is complicated enough that I think it should be extracted into a separate function or struct
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have to consider the restriction of concurrent downloads similar to what we have done with threadpooled diskio.
I will prepare a PR in the near future to address this (possibly before this one gets merged).
Would introducing an environment variable (e.g., RUSTUP_CONCURRENT_DOWNLOADS
), similar to RUSTUP_IO_THREADS
, be a good approach?
NB: This could also be extended to the rustup check
command to maintain consistency.
This closure is complicated enough that I think it should be extracted into a separate function or
struct
.
I am somewhat reluctant to do this, as my upcoming work (installing a component as soon as its download completes) will also require changes to this function.
However, for the sake of keeping the changes coherent in this PR, I will ensure it is extracted. Thank you!
) | ||
.unwrap() | ||
.progress_chars("## "), | ||
); | ||
pb.set_message(component.to_string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO to_owned()
more clearly expresses what you're trying to do (and might be slightly more efficient). In this case since create_progress_bar()
needs two copies, suggest taking the argument as String
and cloning it for use in set_message()
.
@@ -263,10 +263,11 @@ impl Executor for Threaded<'_> { | |||
// pretend to have bytes to deliver. | |||
let mut prev_files = self.n_files.load(Ordering::Relaxed); | |||
if let Some(handler) = self.notify_handler { | |||
handler(Notification::DownloadFinished); | |||
handler(Notification::DownloadFinished("")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems strange. Why pass in an empty string?
handler(Notification::DownloadPushUnit(Unit::IO)); | ||
handler(Notification::DownloadContentLengthReceived( | ||
prev_files as u64, | ||
"", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. What does the empty string mean? Should the type be Option<_>
instead?
Ok::<_, Error>((component, format, downloaded_file, hash)) | ||
} | ||
}); | ||
if components_len > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the only use of components_len
, suggest storing a bool
instead.
|
||
things_to_install.push((component, format, downloaded_file)); | ||
let component_stream = | ||
tokio_stream::iter(components.into_iter()).map(|(component, format, url, hash)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This closure is complicated enough that I think it should be extracted into a separate function or struct
.
Some notifications needed to be updated to include the download URL, enabling the identification of the component being downloaded. This was necessary for accurate progress reporting of each component.
4737f25
to
4615937
Compare
As of now, each component of a toolchain is downloaded sequentially, leaving room for performance improvements.
This is particularly notorious when considering interleaving downloads with installation, as discussed in #731.
Following the work made in #4426, as the
DownloadTracker
is nowindicatif
-based, doing the progress reporting for concurrent downloads is much easier.As such, this PR comes to enable downloading different components concurrently (possibly closing #3018).
Performance-wise, benchmarks using
hyperfine
showed a small (1.1x) speedup -- which doesn't seem noticeable but lays the foundations for a future PR to interleave the download with the installation (of components).The benchmarks were ran 50 times (with 5 warmup runs) over a 50 Mbps connection, each downloading an entire toolchain.
Many thanks to @rami3l and his PR for refactoring part of the test suite for the purpose of this PR.