-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Fix regression: preserve not(not X) in BooleanFlatteningRewriter #19273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
opensearch-project#19266 by fixing double-negation under MUST_NOT introduced by opensearch-project#19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR: not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1)) Use FILTER (not MUST) to preserve non-scoring semantics of must_not Leave other flattenings unchanged; only trigger on pure must_not nested bool Add unit test: testDoubleNegationConvertedToPositiveMustShould Signed-off-by: Atri Sharma <[email protected]>
|
@andrross Please review |
|
❌ Gradle check result for c9b4c36: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Atri Sharma <[email protected]>
|
❌ Gradle check result for 7e6d723: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 7e6d723: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 7e6d723: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 7e6d723: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
It's safe to use this form of rewriting for idempotent functions like must, should and filter. I'm not very certain about applying this operation for non idempotent functions like must_not and the way we are adding special handling from them. I would remove it for must_not unless it's really needed. |
Signed-off-by: Atri Sharma <[email protected]>
|
Agreed. Removed the same. |
|
❌ Gradle check result for e28f8c7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
@rishabhmaurya @atris What are you thoughts on the test coverage we have for the other rewritten queries? Obviously we had a gap with must_not. How likely is it that we have other regressions? There is a setting to enable query rewriting that currently defaults to enabled, so we have the option to default to disabled for the next release if there's additional work needed to validate all the other optimizations. |
|
FYI, it's probably fine to skip the changelog here since the "regression" was never released so a changelog entry would just add noise. |
|
❌ Gradle check result for e28f8c7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
…lattening invariants This change broadens the test suite for query rewriting to codify semantics, guard against regressions, and address maintainer feedback around non‑idempotent must_not handling. BooleanFlatteningRewriterTests: Add explicit “don’t flatten” cases when nested bool has non-default boost, has queryName, or nested should has minimum_should_match. Add idempotence check (second rewrite is a no-op in structure/serialization). MatchAllRemovalRewriterTests: Remove match_all from must when a non-scoring context (filter) is present. Remove deeply nested match_all under filter-in-filter. MustToFilterRewriterTests: Move boosted numeric term/range queries to filter; ensure text queries remain scoring. With null QueryShardContext, move ranges but do not move numeric term/terms. MustNotToShouldRewriterTests: Add idempotence check. Ensure no rewrite for multi-valued numeric fields (complements must_not disabled). TermsMergingRewriterTests: Should-clause merging above threshold. Per-field merging: only coalesce within the same field; mixed fields remain separate. QueryRewriterRegistryTests: Dynamic setting update for terms merging threshold (e.g., 16 → 32) changes behavior as expected. Registry idempotence: applying rewrite twice yields identical structure. Signed-off-by: Atri Sharma <[email protected]>
|
@andrross fair point. I have added additional tests for the rewrite infrastructure - please let me know if they seem complete. I believe these should be enough and we can continue with current settings but will follow the guidance we get to |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #19273 +/- ##
============================================
+ Coverage 72.83% 72.87% +0.03%
- Complexity 69654 69688 +34
============================================
Files 5659 5659
Lines 320176 320174 -2
Branches 46357 46354 -3
============================================
+ Hits 233207 233316 +109
+ Misses 68115 67978 -137
- Partials 18854 18880 +26 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…nsearch-project#19273) * Fix regression: preserve not(not X) in BooleanFlatteningRewriter Resolve opensearch-project#19266 by fixing double-negation under MUST_NOT introduced by opensearch-project#19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR: not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1)) Use FILTER (not MUST) to preserve non-scoring semantics of must_not Leave other flattenings unchanged; only trigger on pure must_not nested bool Add unit test: testDoubleNegationConvertedToPositiveMustShould Signed-off-by: Atri Sharma <[email protected]> * spotless output Signed-off-by: Atri Sharma <[email protected]> * Remove must_not rewrite Signed-off-by: Atri Sharma <[email protected]> * Query rewriting tests: expand coverage; enforce idempotence and non-flattening invariants This change broadens the test suite for query rewriting to codify semantics, guard against regressions, and address maintainer feedback around non‑idempotent must_not handling. BooleanFlatteningRewriterTests: Add explicit “don’t flatten” cases when nested bool has non-default boost, has queryName, or nested should has minimum_should_match. Add idempotence check (second rewrite is a no-op in structure/serialization). MatchAllRemovalRewriterTests: Remove match_all from must when a non-scoring context (filter) is present. Remove deeply nested match_all under filter-in-filter. MustToFilterRewriterTests: Move boosted numeric term/range queries to filter; ensure text queries remain scoring. With null QueryShardContext, move ranges but do not move numeric term/terms. MustNotToShouldRewriterTests: Add idempotence check. Ensure no rewrite for multi-valued numeric fields (complements must_not disabled). TermsMergingRewriterTests: Should-clause merging above threshold. Per-field merging: only coalesce within the same field; mixed fields remain separate. QueryRewriterRegistryTests: Dynamic setting update for terms merging threshold (e.g., 16 → 32) changes behavior as expected. Registry idempotence: applying rewrite twice yields identical structure. Signed-off-by: Atri Sharma <[email protected]> --------- Signed-off-by: Atri Sharma <[email protected]>
…nsearch-project#19273) * Fix regression: preserve not(not X) in BooleanFlatteningRewriter Resolve opensearch-project#19266 by fixing double-negation under MUST_NOT introduced by opensearch-project#19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR: not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1)) Use FILTER (not MUST) to preserve non-scoring semantics of must_not Leave other flattenings unchanged; only trigger on pure must_not nested bool Add unit test: testDoubleNegationConvertedToPositiveMustShould Signed-off-by: Atri Sharma <[email protected]> * spotless output Signed-off-by: Atri Sharma <[email protected]> * Remove must_not rewrite Signed-off-by: Atri Sharma <[email protected]> * Query rewriting tests: expand coverage; enforce idempotence and non-flattening invariants This change broadens the test suite for query rewriting to codify semantics, guard against regressions, and address maintainer feedback around non‑idempotent must_not handling. BooleanFlatteningRewriterTests: Add explicit “don’t flatten” cases when nested bool has non-default boost, has queryName, or nested should has minimum_should_match. Add idempotence check (second rewrite is a no-op in structure/serialization). MatchAllRemovalRewriterTests: Remove match_all from must when a non-scoring context (filter) is present. Remove deeply nested match_all under filter-in-filter. MustToFilterRewriterTests: Move boosted numeric term/range queries to filter; ensure text queries remain scoring. With null QueryShardContext, move ranges but do not move numeric term/terms. MustNotToShouldRewriterTests: Add idempotence check. Ensure no rewrite for multi-valued numeric fields (complements must_not disabled). TermsMergingRewriterTests: Should-clause merging above threshold. Per-field merging: only coalesce within the same field; mixed fields remain separate. QueryRewriterRegistryTests: Dynamic setting update for terms merging threshold (e.g., 16 → 32) changes behavior as expected. Registry idempotence: applying rewrite twice yields identical structure. Signed-off-by: Atri Sharma <[email protected]> --------- Signed-off-by: Atri Sharma <[email protected]> Signed-off-by: Ankit Jain <[email protected]>
…nsearch-project#19273) * Fix regression: preserve not(not X) in BooleanFlatteningRewriter Resolve opensearch-project#19266 by fixing double-negation under MUST_NOT introduced by opensearch-project#19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR: not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1)) Use FILTER (not MUST) to preserve non-scoring semantics of must_not Leave other flattenings unchanged; only trigger on pure must_not nested bool Add unit test: testDoubleNegationConvertedToPositiveMustShould Signed-off-by: Atri Sharma <[email protected]> * spotless output Signed-off-by: Atri Sharma <[email protected]> * Remove must_not rewrite Signed-off-by: Atri Sharma <[email protected]> * Query rewriting tests: expand coverage; enforce idempotence and non-flattening invariants This change broadens the test suite for query rewriting to codify semantics, guard against regressions, and address maintainer feedback around non‑idempotent must_not handling. BooleanFlatteningRewriterTests: Add explicit “don’t flatten” cases when nested bool has non-default boost, has queryName, or nested should has minimum_should_match. Add idempotence check (second rewrite is a no-op in structure/serialization). MatchAllRemovalRewriterTests: Remove match_all from must when a non-scoring context (filter) is present. Remove deeply nested match_all under filter-in-filter. MustToFilterRewriterTests: Move boosted numeric term/range queries to filter; ensure text queries remain scoring. With null QueryShardContext, move ranges but do not move numeric term/terms. MustNotToShouldRewriterTests: Add idempotence check. Ensure no rewrite for multi-valued numeric fields (complements must_not disabled). TermsMergingRewriterTests: Should-clause merging above threshold. Per-field merging: only coalesce within the same field; mixed fields remain separate. QueryRewriterRegistryTests: Dynamic setting update for terms merging threshold (e.g., 16 → 32) changes behavior as expected. Registry idempotence: applying rewrite twice yields identical structure. Signed-off-by: Atri Sharma <[email protected]> --------- Signed-off-by: Atri Sharma <[email protected]> Signed-off-by: Ankit Jain <[email protected]>
…nsearch-project#19273) * Fix regression: preserve not(not X) in BooleanFlatteningRewriter Resolve opensearch-project#19266 by fixing double-negation under MUST_NOT introduced by opensearch-project#19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR: not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1)) Use FILTER (not MUST) to preserve non-scoring semantics of must_not Leave other flattenings unchanged; only trigger on pure must_not nested bool Add unit test: testDoubleNegationConvertedToPositiveMustShould Signed-off-by: Atri Sharma <[email protected]> * spotless output Signed-off-by: Atri Sharma <[email protected]> * Remove must_not rewrite Signed-off-by: Atri Sharma <[email protected]> * Query rewriting tests: expand coverage; enforce idempotence and non-flattening invariants This change broadens the test suite for query rewriting to codify semantics, guard against regressions, and address maintainer feedback around non‑idempotent must_not handling. BooleanFlatteningRewriterTests: Add explicit “don’t flatten” cases when nested bool has non-default boost, has queryName, or nested should has minimum_should_match. Add idempotence check (second rewrite is a no-op in structure/serialization). MatchAllRemovalRewriterTests: Remove match_all from must when a non-scoring context (filter) is present. Remove deeply nested match_all under filter-in-filter. MustToFilterRewriterTests: Move boosted numeric term/range queries to filter; ensure text queries remain scoring. With null QueryShardContext, move ranges but do not move numeric term/terms. MustNotToShouldRewriterTests: Add idempotence check. Ensure no rewrite for multi-valued numeric fields (complements must_not disabled). TermsMergingRewriterTests: Should-clause merging above threshold. Per-field merging: only coalesce within the same field; mixed fields remain separate. QueryRewriterRegistryTests: Dynamic setting update for terms merging threshold (e.g., 16 → 32) changes behavior as expected. Registry idempotence: applying rewrite twice yields identical structure. Signed-off-by: Atri Sharma <[email protected]> --------- Signed-off-by: Atri Sharma <[email protected]>
…nsearch-project#19273) * Fix regression: preserve not(not X) in BooleanFlatteningRewriter Resolve opensearch-project#19266 by fixing double-negation under MUST_NOT introduced by opensearch-project#19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR: not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1)) Use FILTER (not MUST) to preserve non-scoring semantics of must_not Leave other flattenings unchanged; only trigger on pure must_not nested bool Add unit test: testDoubleNegationConvertedToPositiveMustShould Signed-off-by: Atri Sharma <[email protected]> * spotless output Signed-off-by: Atri Sharma <[email protected]> * Remove must_not rewrite Signed-off-by: Atri Sharma <[email protected]> * Query rewriting tests: expand coverage; enforce idempotence and non-flattening invariants This change broadens the test suite for query rewriting to codify semantics, guard against regressions, and address maintainer feedback around non‑idempotent must_not handling. BooleanFlatteningRewriterTests: Add explicit “don’t flatten” cases when nested bool has non-default boost, has queryName, or nested should has minimum_should_match. Add idempotence check (second rewrite is a no-op in structure/serialization). MatchAllRemovalRewriterTests: Remove match_all from must when a non-scoring context (filter) is present. Remove deeply nested match_all under filter-in-filter. MustToFilterRewriterTests: Move boosted numeric term/range queries to filter; ensure text queries remain scoring. With null QueryShardContext, move ranges but do not move numeric term/terms. MustNotToShouldRewriterTests: Add idempotence check. Ensure no rewrite for multi-valued numeric fields (complements must_not disabled). TermsMergingRewriterTests: Should-clause merging above threshold. Per-field merging: only coalesce within the same field; mixed fields remain separate. QueryRewriterRegistryTests: Dynamic setting update for terms merging threshold (e.g., 16 → 32) changes behavior as expected. Registry idempotence: applying rewrite twice yields identical structure. Signed-off-by: Atri Sharma <[email protected]> --------- Signed-off-by: Atri Sharma <[email protected]>
…nsearch-project#19273) * Fix regression: preserve not(not X) in BooleanFlatteningRewriter Resolve opensearch-project#19266 by fixing double-negation under MUST_NOT introduced by opensearch-project#19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR: not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1)) Use FILTER (not MUST) to preserve non-scoring semantics of must_not Leave other flattenings unchanged; only trigger on pure must_not nested bool Add unit test: testDoubleNegationConvertedToPositiveMustShould Signed-off-by: Atri Sharma <[email protected]> * spotless output Signed-off-by: Atri Sharma <[email protected]> * Remove must_not rewrite Signed-off-by: Atri Sharma <[email protected]> * Query rewriting tests: expand coverage; enforce idempotence and non-flattening invariants This change broadens the test suite for query rewriting to codify semantics, guard against regressions, and address maintainer feedback around non‑idempotent must_not handling. BooleanFlatteningRewriterTests: Add explicit “don’t flatten” cases when nested bool has non-default boost, has queryName, or nested should has minimum_should_match. Add idempotence check (second rewrite is a no-op in structure/serialization). MatchAllRemovalRewriterTests: Remove match_all from must when a non-scoring context (filter) is present. Remove deeply nested match_all under filter-in-filter. MustToFilterRewriterTests: Move boosted numeric term/range queries to filter; ensure text queries remain scoring. With null QueryShardContext, move ranges but do not move numeric term/terms. MustNotToShouldRewriterTests: Add idempotence check. Ensure no rewrite for multi-valued numeric fields (complements must_not disabled). TermsMergingRewriterTests: Should-clause merging above threshold. Per-field merging: only coalesce within the same field; mixed fields remain separate. QueryRewriterRegistryTests: Dynamic setting update for terms merging threshold (e.g., 16 → 32) changes behavior as expected. Registry idempotence: applying rewrite twice yields identical structure. Signed-off-by: Atri Sharma <[email protected]> --------- Signed-off-by: Atri Sharma <[email protected]>
Resolve #19266 by fixing double-negation under MUST_NOT introduced by #19060 When parent is must_not and nested bool has only must_not clauses, rewrite to a positive OR:
not(bool(must_not:[X1..Xn])) => filter(bool(should:[X1..Xn], minimum_should_match=1))
Use FILTER (not MUST) to preserve non-scoring semantics of must_not
Leave other flattenings unchanged; only trigger on pure must_not nested bool
Add unit test: testDoubleNegationConvertedToPositiveMustShould
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.