-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Adding Cluster Endpoints for search failures and refactoring document status counters #19115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Adding Cluster Endpoints for search failures and refactoring document status counters #19115
Conversation
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for ab74705: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for ad9e107: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for 9e3af03: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for 7e9374a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for a345b21: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for ae2c661: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for c9ad00b: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for 32a2fae: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
kaushalmahi12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are some intial comments, will go through rest of it in couple of days
| } else { | ||
| return "system_failure"; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets move the numeric values into well defined constants/Enums then directly use, StatusCode.from(getStatusFamilyCode)
| String successType = RestStatus.ACCEPTED.getErrorType(); | ||
| String userFailureType = RestStatus.BAD_REQUEST.getErrorType(); | ||
| String systemErrorType = RestStatus.INTERNAL_SERVER_ERROR.getErrorType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these the only RestStatus codes we need to track or all of them ? It would be better to encapsulate these into a collection to avoid tracking any new additions. That will help avoid any changes in this class.
It can easily be written more generically
for (RestStatus status: supportedStatusCodes) {
builder.field(status.getErrorType(), errorTypeCounts.getOrDefault(status.getErrorType(), (long) 0));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea to group the possible RestStatus values was mentioned in #18438 (comment). I think it simplifies what the user views. I will add the new enums in a different file.
| * @opensearch.api | ||
| */ | ||
| @PublicApi(since = "1.0.0") | ||
| public class SearchResponseStatusStats implements Writeable, ToXContentFragment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty much duplicate of the DocStatusStats class. Can we make a generic class which will help us remove this duplicity
| } | ||
| } | ||
|
|
||
| indicesService.addDocStatusStats(stats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are replacing the previous stats completely then why not just use the same method with updated logic.
| cancellationListener.onResponse(searchResponse); | ||
| indicesService.getSearchResponseStatusStats().inc(searchResponse.status()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not clear on whether if you should put these instructions in this order it will be correct and might be flaky since it might have already sent the response via the httpChannel which is responsible for this request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I am not 100% sure if the code will continue to be run after the original listener receives the SearchResponse. I think it is safer to switch the orders just in case.
| stats.inc(RestStatus.OK); | ||
|
|
||
| indicesService.addDocStatusStats(stats); | ||
| indicesService.addNewDocStatusStats(stats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these changes can be avoided when you change the existing method instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An issue is that I could not directly change the signature of the existing method because of backwards compatibility. addDocStatusStats would have to taken in the new class for DocStatusStats. I could try to rewrite the method but with a different signature which is what I should have probably done in the first place.
| this.isThrottled = isThrottled; | ||
| this.throttleTimeInMillis = throttleTimeInMillis; | ||
| this.docStatusStats = docStatusStats; | ||
| this.docStatusStats = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are deprecating this construct then do we really need the ctor changes ?
|
❌ Gradle check result for b544e35: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
…g623/OpenSearch into upgrade-cluster-endpoints
|
@kaushalmahi12 Thank you so much for looking into my pr and reviewing it. The idea is that I would like to track search failures in some kind of counter which will be stored in Additionally, #8716 implemented a change where document response statuses are counted. Based on some comments from #18438 (comment) I refactored the logic as described in #18438 (comment). I have addressed a couple of the comments but will continue to work through them. Thank you again for the review! |
|
❌ Gradle check result for 3455efb: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
@kaushalmahi12 I believe the requested changes have been addressed. I have made a new class |
|
❌ Gradle check result for a2aa335: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for 93f508f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for eb19367: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for 536c20b: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Anthony Leong <[email protected]>
|
@kaushalmahi12 I believe the comments you suggested have been addressed. Could you please confirm whether that is true or not since I did not want to resolve the comments without making sure they are truly resolved? Also, I noticed that this PR is scheduled for a 3.3.0 release |
|
This PR is stalled because it has been open for 30 days with no activity. |
Signed-off-by: Anthony Leong <[email protected]>
|
❌ Gradle check result for 917722c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
This change is a rebased and cleaned version of another pr #18601
Related Issues
Resolves #18377, resolves #18438
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.