Skip to content

Conversation

@icidasset
Copy link
Contributor

@icidasset icidasset commented Jan 5, 2020

When fetching a url, store the resolved url for later use. This is done, for example, when a user provides a url that redirects. We store the resulting url (after following the redirect), so that it doesn't need to follow the redirect each time we make a request. Thus, this change improves performance when dealing with redirects.


@Borewit This is what I assumed would fix #117

Also related to this, where is tokenizer.fileInfo.url used? It appears this rangeRequestClient (ie. this library) doesn't use that.

Copy link
Owner

@Borewit Borewit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the url assignment

private static makeResponse(resp): IRangeRequestResponse {
const contentRange = HttpClient.parseContentRange(resp.headers);
return {
url: resp.url,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent

public getHeadInfo(): Promise<IHeadRequestInfo> {
return _fetch(this.url, {method: 'HEAD'}).then(resp => HttpClient.makeResponse(resp));
return _fetch(this.url, {method: 'HEAD'}).then(resp => {
this.url = resp.url;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assigning the URL from the response back to the request, looks a bit odd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by request? Isn't this the instance of the HttpClient? 🤔 The goal here is to change the httpClientInstance.url, or in other words the url used on line 58 as the first argument to _fetch.

Copy link
Owner

@Borewit Borewit Jan 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HttpClient maybe is a confusing name. The HttpClient is the request. It is bound to one, and only URL request, one file sessions, which may indeed be multiple HTTP-requests.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe create a second url attribute for fetching the data?

So fetch does already handle the redirect right? That is why you use the url?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So fetch does already handle the redirect right? That is why you use the url?

Yes.

@Borewit
Copy link
Owner

Borewit commented Jan 5, 2020

where is tokenizer.fileInfo.url used

This is also a kind of metadata-information, purely used as file identification. Files have a name or URL. The filename may be used to determine the file type.

@Borewit
Copy link
Owner

Borewit commented Jan 5, 2020

I think there is no problem following redirects @icidasset. Those redirects may be there for load balancing. Only when a redirect is marked as permanent, you should in principle replace the old URL with new URL.

But I see you point, you want to prevent it each range request. I guess the HEAD request is the best place to follow the redirects.

@icidasset
Copy link
Contributor Author

I think there is no problem following redirects @icidasset. Those redirects may be there for load balancing. Only when a redirect is marked as permanent, you should in principle replace the old URL with new URL.

That's a good point, I didn't think of that 👍
I guess keeping the redirects is better then, even though it's slower.
Will update the PR.

@icidasset
Copy link
Contributor Author

Ok, so this PR is basically useless except that it now stores the resolved url in the fileInfo 😅 Feel free to close if you think it's useless.

@Borewit
Copy link
Owner

Borewit commented Jan 5, 2020

No it makes sense. I did not directly see why you did it this way, but I think I understand. I would like to keep the originating request URL. If you cache the download URL in another attribute I am happy with it.

When fetching a url, store the resolved url for later use. This is done, for example, when a user provides a url that redirects. We store the resulting url (after following the redirect), so that it doesn't need to follow the redirect each time we make a request. Thus, this change improves performance when dealing with redirects.
@icidasset
Copy link
Contributor Author

Sounds good @Borewit , updated the code.

@Borewit
Copy link
Owner

Borewit commented Jan 5, 2020

The HEAD requests may be avoided (because some servers give the wrong response)
Probably best to assign it as well in:

  1. constructor
  2. getResponse

@icidasset
Copy link
Contributor Author

Yeah, I did assign it in getResponse, uses the initial url if it was not set yet.

@icidasset
Copy link
Contributor Author

Don't know what the IHttpClient interface should look like regarding the resolvedUrl property, I'm not used to Typescript 😅

getHeadInfo?(): Promise<IHeadInfo>;

getResponse(method: string, range?: [number, number]): Promise<IHttpResponse>;
export interface IHttpClient extends IRangeRequestClient {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing my code ;-)

@Borewit Borewit merged commit 1263752 into Borewit:master Jan 5, 2020
@Borewit
Copy link
Owner

Borewit commented Jan 5, 2020

Part of v0.5.1

@icidasset
Copy link
Contributor Author

Thank you @Borewit ! 🙏 🙌

@Borewit
Copy link
Owner

Borewit commented Jan 5, 2020

Thank you @icidasset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Resolve redirects early

2 participants