Add support for minimum_should_match to simple_query_string
#9864
I don't think this is the best place to apply minShouldMatch. The difference between this and, say, the match query is that it supports multiple fields (I realize the default is _all and this patch will work with that).
So in this case, the query will produce a boolean query across, say, the title, body, and author fields. The current application of minShouldMatch will be unintuitive there.
Ahh okay, I will do this in SimpleQueryParser instead
@rmuir I dug into how this was changing the queries, and I think the current behavior is correct (I pushed more tests).
For these 4 documents:
I have 3 queries.
First, an SQS query with
Which is correct.
Second, an SQS query with
This seems correct to me also, it can occur in either
Third, an SQS query with
Which seems correct to me also.
Next, I indexed 4 more documents to test the cross-field matching:
Then I issue 3 more queries.
First, a query for
This correctly matches documents 3, 4, 7, and 8.
Second, a query for
This matches 3, 4, 6, 7, & 8, which is what I would expect.
Finally, a query for
Matching 6, 7, and 8, which is what I would expect, because to me it is
Can you explain more about what you mean with the unintuitive multi-field
I think you are right (and great to have those new tests), in that it works for simple multi-field cases too. My brain must have mixed this up with other parsers. I think it still degrades to the behavior I am concerned about when the language doesn't use whitespace to separate words, or when the WHITESPACE operator is disabled (useful for query-time multi-word synonym support), or in other cases. But in general it works, and that is something to just fix about the parser one day. +1 for this simple approach... sorry for the noise
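To make the multi-field concern concrete, here is a toy sketch in Python of two ways a minimum-should-match count could be taken over a multi-field query. It is only one reading of the discussion above, not Lucene's or Elasticsearch's code: the helper names, the document model (field name mapped to a set of tokens), and the sample data are all assumptions made for illustration.

```python
# Toy model only: a document is a dict mapping field name -> set of tokens.
# These helpers are hypothetical and do not exist in Elasticsearch or Lucene.

def term_grouped_matches(doc, terms, fields, min_should_match):
    """One optional clause per term; a clause matches if the term is in ANY field."""
    hits = sum(
        1 for term in terms
        if any(term in doc.get(field, set()) for field in fields)
    )
    return hits >= min_should_match


def flat_matches(doc, terms, fields, min_should_match):
    """One optional clause per (field, term) pair, so one term can count twice."""
    hits = sum(
        1 for term in terms for field in fields
        if term in doc.get(field, set())
    )
    return hits >= min_should_match


# The document contains only "foo", but in two different fields.
doc = {"title": {"foo"}, "body": {"foo"}}
terms = ["foo", "bar"]
fields = ["title", "body"]

grouped = term_grouped_matches(doc, terms, fields, 2)  # counts distinct terms
flat = flat_matches(doc, terms, fields, 2)             # counts field/term pairs
print(grouped, flat)
```

In this toy, counting per term rejects the document when two terms are required, while counting per field/term pair accepts it because the same term is found twice. The second outcome is the kind of result that could look unintuitive to a user.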
c85912b to 2e9ea4a
This behaves similarly to the way that `minimum_should_match` works for the `match` query (in fact, it is implemented in exactly the same way). Fixes elastic#6449
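As a rough illustration of what a request using the new parameter might look like, the sketch below builds a `simple_query_string` query body as a plain Python dict. The query text, field names, and the value `2` are made-up example values; only the parameter name `minimum_should_match` comes from this pull request.

```python
import json

# Example values are hypothetical; only the parameter names are taken
# from the pull request and the simple_query_string query itself.
query_body = {
    "query": {
        "simple_query_string": {
            "query": "foo bar baz",
            "fields": ["title", "body"],
            "minimum_should_match": 2,
        }
    }
}

# Serialize to the JSON that would be sent as the search request body.
print(json.dumps(query_body, indent=2))
```

With the default OR operator, a value of `2` here would be expected to require at least two of the three terms to match, by analogy with the `match` query.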
As for the confusion over some of that stuff, I think it can apply in general with MSM, and will happen when things like stopwords differ across the fields too. Perhaps we should just document some kind of warning and keep it simple. I see all kinds of confusion about this, e.g. on the Solr lists, and I fixed a bug in one of its parsers where it didn't apply it for Chinese, which will certainly happen here. This is better than a lot of complexity, I think. I always thought MSM was hard to integrate into parsers.