Support 'usenull' option in PPL top and rare commands
#4696
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,6 +1,6 @@ | ||
| ============= | ||
| ==== | ||
| rare | ||
| ============= | ||
| ==== | ||
|
|
||
| .. rubric:: Table of contents | ||
|
|
||
|
|
@@ -10,13 +10,13 @@ rare | |
|
|
||
|
|
||
| Description | ||
| ============ | ||
| =========== | ||
| | Using ``rare`` command to find the least common tuple of values of all fields in the field list. | ||
|
|
||
| **Note**: A maximum of 10 results is returned for each distinct tuple of values of the group-by fields. | ||
|
|
||
| Syntax | ||
| ============ | ||
| ====== | ||
| rare <field-list> [by-clause] | ||
|
|
||
| rare [rare-options] <field-list> [by-clause] ``(available from 3.1.0+)`` | ||
|
|
@@ -26,10 +26,13 @@ rare [rare-options] <field-list> [by-clause] ``(available from 3.1.0+)`` | |
| * rare-options: optional. options for the rare command. Supported syntax is [countfield=<string>] [showcount=<bool>]. | ||
| * showcount=<bool>: optional. whether to create a field in output that represent a count of the tuple of values. Default value is ``true``. | ||
| * countfield=<string>: optional. the name of the field that contains count. Default value is ``'count'``. | ||
| * usenull=<bool>: optional (since 3.4.0). whether to output the null value. The default value of ``usenull`` is determined by ``plugins.ppl.syntax.legacy.preferred``: | ||
|
|
||
| * When ``plugins.ppl.syntax.legacy.preferred=true``, ``usenull`` defaults to ``true`` | ||
| * When ``plugins.ppl.syntax.legacy.preferred=false``, ``usenull`` defaults to ``false`` | ||
|
Comment on lines +31 to +32
Bullet points in a blockquote: is that expected?
Yes. This keeps the format the same as the current stats.rst. I tried to use the new format to align with "Update PPL Command Documentation", but there still seem to be many format problems in that PR. ref |
||
|
|
||
| Example 1: Find the least common values in a field | ||
| =========================================== | ||
| ================================================== | ||
|
|
||
| The example finds least common gender of all the accounts. | ||
|
|
||
|
|
@@ -46,7 +49,7 @@ PPL query:: | |
|
|
||
|
|
||
| Example 2: Find the least common values organized by gender | ||
| ==================================================== | ||
| =========================================================== | ||
|
|
||
| The example finds least common age of all the accounts group by gender. | ||
|
|
||
|
|
@@ -66,12 +69,10 @@ PPL query:: | |
| Example 3: Rare command with Calcite enabled | ||
| ============================================ | ||
|
|
||
| The example finds least common gender of all the accounts when ``plugins.calcite.enabled`` is true. | ||
|
|
||
| PPL query:: | ||
|
|
||
| PPL> source=accounts | rare gender; | ||
| fetched row | ||
| os> source=accounts | rare gender; | ||
| fetched rows / total rows = 2/2 | ||
| +--------+-------+ | ||
| | gender | count | | ||
| |--------+-------| | ||
|
|
@@ -83,19 +84,47 @@ PPL query:: | |
| Example 4: Specify the count field option | ||
| ========================================= | ||
|
|
||
| The example specifies the count field when ``plugins.calcite.enabled`` is true. | ||
|
|
||
| PPL query:: | ||
|
|
||
| PPL> source=accounts | rare countfield='cnt' gender; | ||
| fetched row | ||
| os> source=accounts | rare countfield='cnt' gender; | ||
| fetched rows / total rows = 2/2 | ||
| +--------+-----+ | ||
| | gender | cnt | | ||
| |--------+-----| | ||
| | F | 1 | | ||
| | M | 3 | | ||
| +--------+-----+ | ||
|
|
||
|
|
||
| Example 5: Specify the usenull field option | ||
| =========================================== | ||
|
|
||
| PPL query:: | ||
|
|
||
| os> source=accounts | rare usenull=false email; | ||
| fetched rows / total rows = 3/3 | ||
| +-----------------------+-------+ | ||
| | email | count | | ||
| |-----------------------+-------| | ||
| | [email protected] | 1 | | ||
| | [email protected] | 1 | | ||
| | [email protected] | 1 | | ||
| +-----------------------+-------+ | ||
|
|
||
| PPL query:: | ||
|
|
||
| os> source=accounts | rare usenull=true email; | ||
| fetched rows / total rows = 4/4 | ||
| +-----------------------+-------+ | ||
| | email | count | | ||
| |-----------------------+-------| | ||
| | null | 1 | | ||
| | [email protected] | 1 | | ||
| | [email protected] | 1 | | ||
| | [email protected] | 1 | | ||
| +-----------------------+-------+ | ||
|
|
||
|
|
||
| Limitations | ||
| =========== | ||
| The ``rare`` command is not rewritten to OpenSearch DSL, it is only executed on the coordination node. | ||
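To make the documented `usenull` behavior concrete, here is a minimal Python sketch of the semantics described above. It is not the plugin's implementation. The `rare` function, its `limit` parameter, and the sample rows are all hypothetical, and the sketch ignores the by-clause. Tie ordering in the real engine may also differ.

```python
from collections import Counter


def rare(rows, field, usenull=True, limit=10):
    """Return up to `limit` (value, count) pairs, least common first."""
    values = [row.get(field) for row in rows]
    if not usenull:
        # usenull=false: rows whose field is null are left out of the result.
        values = [v for v in values if v is not None]
    counts = Counter(values)
    # sorted() is stable, so values with equal counts keep first-seen order.
    return sorted(counts.items(), key=lambda item: item[1])[:limit]


# Hypothetical sample data, not the accounts index used in the documentation.
accounts = [
    {"email": "a@example.com"},
    {"email": None},
    {"email": "b@example.com"},
    {"email": "c@example.com"},
]

without_null = rare(accounts, "email", usenull=False)
with_null = rare(accounts, "email", usenull=True)
print(len(without_null), len(with_null))
```

With this sample data the null email forms its own group only when `usenull=True`, which mirrors the 3-row and 4-row results in Example 5.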
Why not apply `isNotNull` directly on `groupByList`, but on their underlying input ref? If some operation converts a `null` field to a non-null one, I think it should not be filtered out.
@yuancu
Think about query:
The `groupByList` contains the RexCall "AS($9, 'a')". So we finally build a `context.relBuilder.filter(isNotNull($9))`, where $9 is `nullif(status, "200")` instead of `status`.
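
The point of this reply can be illustrated with a small Python sketch. It does not use Calcite and does not reconstruct the query from the thread. It assumes, hypothetically, that the referenced column `$9` holds a projected value `nullif(status, "200")` aliased as `a`, and the `nullif` helper and sample rows are made up for illustration.

```python
def nullif(value, sentinel):
    """Return None when value equals sentinel, otherwise the value itself."""
    return None if value == sentinel else value


# Hypothetical input rows with a raw `status` field.
rows = [{"status": "200"}, {"status": "404"}, {"status": None}]

# Projection step: the group-by key `a` is the computed expression,
# standing in for the column that $9 refers to in the comment above.
projected = [{"a": nullif(row["status"], "200")} for row in rows]

# A not-null filter on the projected column sees the computed value...
kept_by_expression = [row for row in projected if row["a"] is not None]

# ...whereas a not-null filter on the raw field would keep the "200" row.
kept_by_raw_field = [row for row in rows if row["status"] is not None]

print(kept_by_expression)
print(len(kept_by_raw_field))
```

Under these assumptions the filter on the referenced column already reflects the result of the expression, so a row whose `status` is "200" is dropped along with the row whose `status` is null.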