Skip to content

Conversation

@evanxg852000
Copy link
Collaborator

@evanxg852000 evanxg852000 commented Jun 24, 2022

  • Implements the DateTime field support
  • updated tantivy version to include DateTime changes
    Closes Datetime field #1328

@evanxg852000 evanxg852000 changed the title Fix 1328 [W.I.P] DateTime field support [W.I.P] Jul 22, 2022
@evanxg852000 evanxg852000 marked this pull request as ready for review July 25, 2022 13:07
@evanxg852000 evanxg852000 changed the title DateTime field support [W.I.P] DateTime field support Jul 25, 2022
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: keep the deps sorted

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This intermediate representation exists because we want to build the parsers only once? Are there other reasons?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nicer to display the formats attempted:
"Failed to parse datetime `foo` using the specified formats `unix_ts_secs, unix_ts_millis`."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should have date and datetime. Thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could definitely have both: date being based on datetime. For better support (query parsing, display). I think we need to start from tantivy. Would you suggest renaming the date type as datetime now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's rename everything Datetime for now. We can add a proper Date type later with better tantivy support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think those function types are buying us anything. We could just iterate over the list of formats and get away with it. Less code, no dynamic dispatch. That would also remove the need for QuickwitDateOptionsDeser:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `Strftime`: Parsing dates using the unix [strftime](https://man7.org/linux/man-pages/man3/strftime.3.html) format.
- `strftime`: Parsing dates using the Unix [strftime](https://man7.org/linux/man-pages/man3/strftime.3.html) format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub struct QuickwitDateOptions {
pub struct QuickwitDateTimeOptions {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking much better! Thank you for the refactor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to implement as_str DateTimeFormat?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then implement Display for DateTimeFormat instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
use tantivy::{DatePrecision, DateTime};
use tantivy::{DatePrecision as DateTimePrecision, DateTime};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the unwrap and return proper error: "Failed to blabla".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, just review a PR from Paul where he added the trailing comma. This makes me feel better about myself, I felt I was the only one OCD on the team.

- `unix_ts_secs`, `unix_ts_millis`, `unix_ts_micros`: Parsing dates from numbers (timestamp). Only one can be used in configuration. `unix_ts_secs` is added to the list by default if none is specified.

:::info
When accepting multiple formats, the corresponding parsers are tried in order they are declared.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When accepting multiple formats, the corresponding parsers are tried in order they are declared.
When specifying multiple input formats, the corresponding parsers are tried in the order they are declared.


pub(crate) fn parse_number(&self, value: i64) -> Result<OffsetDateTime, String> {
for format in self.input_formats.iter() {
match format {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if let DateTimeFormat::Timestamp(precision) = format {
    return unix_timestamp_parse(value, precision)
}

}

/// Recognizes numbers as unix timestamp with a precision.
fn unix_timestamp_parse(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn unix_timestamp_parse(
fn parse_unix_timestamp(

DateTimeFormat::RFC2822 => "rfc2822",
DateTimeFormat::ISO8601 => "iso8601",
DateTimeFormat::Strftime(format) => format,
DateTimeFormat::Timestamp(precision) => match precision {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DateTimeFormat::Timestamp(DateTimePrecision::Seconds) => "unix_ts_secs"
...

}

impl Display for DateTimeFormat {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The codebase aligns on

fn fmt(&self, f: &mut std::fmt::Formatter)

}

/// Parses DateTime strings using the unix strftime formatting.
fn strftime_parse(value: &str, format: &str) -> Result<OffsetDateTime, String> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn strftime_parse(value: &str, format: &str) -> Result<OffsetDateTime, String> {
fn parse_strftime(value: &str, format: &str) -> Result<OffsetDateTime, String> {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The codebase aligns on putting the verb first.

(self.get_type(), json_val.as_i64())
{
let date_time_str =
timestamp_to_datetime_str(timestamp, &options.precision).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably an expect, right?

@evanxg852000 evanxg852000 merged commit 37976c9 into main Jul 28, 2022
@evanxg852000 evanxg852000 deleted the fix-1328 branch July 28, 2022 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Datetime field

3 participants