DateTime field support #1665
quickwit-doc-mapper/Cargo.toml
Outdated
nit: keep the deps sorted
This intermediate representation exists because we want to build the parsers only once? Are there other reasons?
Yes
Would be nicer to display the formats attempted:
"Failed to parse datetime `foo` using the specified formats `unix_ts_secs, unix_ts_millis`."
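A minimal sketch of how such a message could be assembled, assuming the configured format names are available as string slices. The helper name `format_parse_error` is hypothetical and does not come from the PR.

```rust
/// Hypothetical helper (not from the PR): builds an error message that
/// lists every input format that was attempted.
fn format_parse_error(value: &str, attempted_formats: &[&str]) -> String {
    format!(
        "Failed to parse datetime `{}` using the specified formats `{}`.",
        value,
        // Join the format names so the user sees exactly what was tried.
        attempted_formats.join(", ")
    )
}
```

Listing the attempted formats lets the user tell a malformed value apart from a missing format in the configuration.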
docs/configuration/index-config.md
Outdated
I wonder whether we should have date and datetime. Thoughts?
We could definitely have both, with date based on datetime. For better support (query parsing, display), I think we need to start from tantivy. Would you suggest renaming the date type to datetime now?
Yes, let's rename everything to Datetime for now. We can add a proper Date type later with better tantivy support.
I don't think those function types are buying us anything. We could just iterate over the list of formats and get away with it. Less code, no dynamic dispatch. That would also remove the need for QuickwitDateOptionsDeser:
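The idea of iterating over the formats with a plain `match` can be sketched as follows. This is only an illustration under simplifying assumptions: the `InputFormat` enum, its two variants, and the millisecond return value are hypothetical stand-ins for the PR's real formats and for `OffsetDateTime`.

```rust
/// Hypothetical stand-in for the configured input formats.
#[derive(Debug, Clone, PartialEq)]
enum InputFormat {
    /// Whole seconds since the Unix epoch, e.g. "10".
    UnixSecs,
    /// Fractional seconds since the Unix epoch, e.g. "1.5".
    UnixSecsFractional,
}

/// Tries each format in declaration order and returns the first success
/// as milliseconds since the Unix epoch. A `match` selects the parser,
/// so no boxed closures or function pointers are needed.
fn parse_with_formats(value: &str, formats: &[InputFormat]) -> Result<i64, String> {
    for format in formats {
        let parsed = match format {
            InputFormat::UnixSecs => value
                .parse::<i64>()
                .ok()
                .and_then(|secs| secs.checked_mul(1_000)),
            InputFormat::UnixSecsFractional => value
                .parse::<f64>()
                .ok()
                // Reject "nan" and "inf", which `f64` parsing accepts.
                .filter(|secs| secs.is_finite())
                .map(|secs| (secs * 1_000.0).round() as i64),
        };
        if let Some(millis) = parsed {
            return Ok(millis);
        }
    }
    Err(format!("Failed to parse datetime `{}`.", value))
}
```

Because the enum itself carries everything needed to parse, it can be deserialized directly, without a separate intermediate type that holds prebuilt parser functions.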
docs/configuration/index-config.md
Outdated
| - `Strftime`: Parsing dates using the unix [strftime](https://man7.org/linux/man-pages/man3/strftime.3.html) format. | |
| - `strftime`: Parsing dates using the Unix [strftime](https://man7.org/linux/man-pages/man3/strftime.3.html) format. |
| pub struct QuickwitDateOptions { | |
| pub struct QuickwitDateTimeOptions { |
Looking much better! Thank you for the refactor.
We may want to implement `as_str` for `DateTimeFormat`?
and then implement Display for DateTimeFormat instead.
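A rough sketch of what `as_str` plus `Display` could look like, assuming an enum shaped like the one visible in the diff. The local `DateTimePrecision` enum stands in for tantivy's `DatePrecision`, and the exact variant set is an assumption.

```rust
use std::fmt;

/// Local stand-in for tantivy's `DatePrecision` (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DateTimePrecision {
    Seconds,
    Milliseconds,
    Microseconds,
}

/// Assumed shape of the format enum, based on the variants in the diff.
#[derive(Debug, Clone, PartialEq)]
enum DateTimeFormat {
    RFC2822,
    ISO8601,
    Strftime(String),
    Timestamp(DateTimePrecision),
}

impl DateTimeFormat {
    /// Returns the name used for this format in the configuration.
    fn as_str(&self) -> &str {
        match self {
            DateTimeFormat::RFC2822 => "rfc2822",
            DateTimeFormat::ISO8601 => "iso8601",
            DateTimeFormat::Strftime(format) => format.as_str(),
            DateTimeFormat::Timestamp(DateTimePrecision::Seconds) => "unix_ts_secs",
            DateTimeFormat::Timestamp(DateTimePrecision::Milliseconds) => "unix_ts_millis",
            DateTimeFormat::Timestamp(DateTimePrecision::Microseconds) => "unix_ts_micros",
        }
    }
}

impl fmt::Display for DateTimeFormat {
    // `Display` delegates to `as_str`, so the names live in one place.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.write_str(self.as_str())
    }
}
```

With `Display` in place, `to_string()` and `format!("{}", ...)` come for free, which is handy when listing formats in error messages.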
| use tantivy::{DatePrecision, DateTime}; | |
| use tantivy::{DatePrecision as DateTimePrecision, DateTime}; |
Remove the unwrap and return a proper error: "Failed to blabla".
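As a generic illustration of replacing an `unwrap` with a returned error, here is a small sketch. The function `parse_timestamp_secs` and its message are hypothetical; the code under review is not shown in this thread.

```rust
/// Hypothetical example: instead of `raw.parse::<i64>().unwrap()`,
/// convert the failure into a descriptive error for the caller.
fn parse_timestamp_secs(raw: &str) -> Result<i64, String> {
    raw.trim().parse::<i64>().map_err(|error| {
        format!("Failed to parse `{}` as a unix timestamp: {}.", raw, error)
    })
}
```

The caller can then propagate the error with `?` rather than panicking on malformed input.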
remove?
quickwit-indexing/Cargo.toml
Outdated
lol, I just reviewed a PR from Paul where he added the trailing comma. This makes me feel better about myself; I felt I was the only one with OCD on the team.
docs/configuration/index-config.md
Outdated
| - `unix_ts_secs`, `unix_ts_millis`, `unix_ts_micros`: Parsing dates from numbers (timestamp). Only one can be used in configuration. `unix_ts_secs` is added to the list by default if none is specified. |
| :::info |
| When accepting multiple formats, the corresponding parsers are tried in order they are declared. |
| When accepting multiple formats, the corresponding parsers are tried in order they are declared. | |
| When specifying multiple input formats, the corresponding parsers are tried in the order they are declared. |
| pub(crate) fn parse_number(&self, value: i64) -> Result<OffsetDateTime, String> { |
| for format in self.input_formats.iter() { |
| match format { |
if let DateTimeFormat::Timestamp(precision) = format {
    return unix_timestamp_parse(value, precision)
}
| } |
| /// Recognizes numbers as unix timestamp with a precision. |
| fn unix_timestamp_parse( |
| fn unix_timestamp_parse( | |
| fn parse_unix_timestamp( |
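Putting the `if let` suggestion and the verb-first rename together, `parse_number` might look roughly like the sketch below. The types are simplified stand-ins: the result is an `i64` in microseconds rather than `OffsetDateTime`, the function takes the format list as a parameter instead of `&self`, and the fallback error text is invented for illustration.

```rust
/// Local stand-in for the timestamp precision (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DateTimePrecision {
    Seconds,
    Milliseconds,
    Microseconds,
}

/// Reduced stand-in for the format enum (assumption).
#[derive(Debug, Clone, PartialEq)]
enum DateTimeFormat {
    ISO8601,
    Timestamp(DateTimePrecision),
}

/// Interprets a number as a unix timestamp with the given precision and
/// normalizes it to microseconds since the epoch.
fn parse_unix_timestamp(value: i64, precision: DateTimePrecision) -> Result<i64, String> {
    let factor = match precision {
        DateTimePrecision::Seconds => 1_000_000,
        DateTimePrecision::Milliseconds => 1_000,
        DateTimePrecision::Microseconds => 1,
    };
    value
        .checked_mul(factor)
        .ok_or_else(|| format!("Failed to parse timestamp `{}`: value out of range.", value))
}

/// Uses the first timestamp format in the list; other formats are skipped.
fn parse_number(value: i64, input_formats: &[DateTimeFormat]) -> Result<i64, String> {
    for format in input_formats {
        if let DateTimeFormat::Timestamp(precision) = format {
            return parse_unix_timestamp(value, *precision);
        }
    }
    Err(format!(
        "Failed to parse datetime `{}`: no timestamp format is configured.",
        value
    ))
}
```

Since only one timestamp format can be configured, returning on the first `Timestamp` variant is enough, and `if let` avoids a `match` with an empty catch-all arm.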
| DateTimeFormat::RFC2822 => "rfc2822", |
| DateTimeFormat::ISO8601 => "iso8601", |
| DateTimeFormat::Strftime(format) => format, |
| DateTimeFormat::Timestamp(precision) => match precision { |
DateTimeFormat::Timestamp(DateTimePrecision::Seconds) => "unix_ts_secs"
...
| } |
| impl Display for DateTimeFormat { |
| fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { |
The codebase aligns on `fn fmt(&self, f: &mut std::fmt::Formatter)`.
| } |
| /// Parses DateTime strings using the unix strftime formatting. |
| fn strftime_parse(value: &str, format: &str) -> Result<OffsetDateTime, String> { |
| fn strftime_parse(value: &str, format: &str) -> Result<OffsetDateTime, String> { | |
| fn parse_strftime(value: &str, format: &str) -> Result<OffsetDateTime, String> { |
The codebase aligns on putting the verb first.
| (self.get_type(), json_val.as_i64()) |
| { |
| let date_time_str = |
| timestamp_to_datetime_str(timestamp, &options.precision).unwrap(); |
This is probably an expect, right?
Closes Datetime field #1328