Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expression rewrite filter pushdown for dates #12056

Merged

Conversation

Tmonster
Copy link
Contributor

Create a new expression rewriting rule so that comparisons between timestamps and dates can be pushed down.

TODO:
I don't quite understand the matching logic, currently the expressions can be matched using the unordered policy. This way I can catch both cases of ts::DATE = '2024-05-06'::DATE and '2024-05-06'::DATE = ts::DATE For some reason though, if the arguments are switched, an extra binding is pushed back the the bindings.

Should fix https://github.com/duckdblabs/duckdb-internal/issues/1964

@duckdb-draftbot duckdb-draftbot marked this pull request as draft May 14, 2024 19:51
@Tmonster Tmonster marked this pull request as ready for review May 14, 2024 19:51
@duckdb-draftbot duckdb-draftbot marked this pull request as draft May 15, 2024 05:37
@Tmonster Tmonster marked this pull request as ready for review May 15, 2024 05:41
@duckdb-draftbot duckdb-draftbot marked this pull request as draft May 15, 2024 09:22
Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! LGTM - some minor comments:


// add one day
auto date_t_copy = result.GetValue<duckdb::date_t>();
date_t_copy.days += 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if this overflows - can we check with DATE_MAX?

if (!is_constant) {
// means the matchers are flipped, so we need to flip our bindings
// for some reason an extra binding is added in this case.
cast_constant = bindings[4].get().Copy();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are four matches being created:

  • Comparison (top-level)
  • Cast
  • Bound column ref (child of cast)
  • Constant

I think the bindings are in the order in which they are found, so the order is either comparison/cast/bound colref/constant or comparison/constant/cast/bound colref, but I'm not 100% sure.

// other side is varchar to date?
auto right = make_uniq<CastExpressionMatcher>();
right->type = make_uniq<TypeMatcherId>(LogicalTypeId::DATE);
right->matcher = make_uniq<ConstantExpressionMatcher>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a FoldableConstantMatcher

@Tmonster Tmonster force-pushed the expression_rewrite_filter_pushdown_for_dates branch from e9a1e0b to 37cc021 Compare May 15, 2024 18:58
@Tmonster Tmonster marked this pull request as ready for review May 16, 2024 07:41
@Tmonster Tmonster requested a review from Mytherin May 16, 2024 10:54
@Tmonster
Copy link
Contributor Author

@Mytherin This is ready for another review. I added a TryAddOperation for dates to check if there is an overflow. If there is, the optimization isn't applied and we fall back to current functionality.

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! Looks good - one more comment:

@@ -154,6 +154,22 @@ bool TryAddOperator::Operation(uint64_t left, uint64_t right, uint64_t &result)
return OverflowCheckedAddition::Operation<uint64_t, uint64_t>(left, right, result);
}

template <>
bool TryAddOperator::Operation(date_t left, date_t right, date_t &result) {
if (!Value::IsFinite(left) || !Value::IsFinite(right)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make AddOperator::Operation(date_t, int32_t) call this?

@@ -154,6 +154,22 @@ bool TryAddOperator::Operation(uint64_t left, uint64_t right, uint64_t &result)
return OverflowCheckedAddition::Operation<uint64_t, uint64_t>(left, right, result);
}

template <>
bool TryAddOperator::Operation(date_t left, date_t right, date_t &result) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we turn right into an int32_t? Adding two dates together does not make much logical sense

@duckdb-draftbot duckdb-draftbot marked this pull request as draft May 17, 2024 07:00
@Tmonster Tmonster marked this pull request as ready for review May 17, 2024 11:26
Date::Convert(left, year, month, day);
int32_t year_diff = right.months / Interval::MONTHS_PER_YEAR;
year += year_diff;
month += right.months - year_diff * Interval::MONTHS_PER_YEAR;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is adding a lot of duplicate code - in general when we add the Try methods, we rewrite the regular Add methods to use the Try, i.e. Interval::Add becomes something like:

interval_t result;
if (!Interval::TryAdd(left, right, result)) {
   throw OutOfRangeException("Date out of range");
}
return result;

Could we either do that, or revert to extending the TryAddOperator for date_t/int32_t as before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we either do that, or revert to extending the TryAddOperator for date_t/int32_t as before?

Since we are close to the release, I decided not to rewrite the regular add method, fearing that I could add more bugs in the process. I went with extending the TryAddOperator for date_t/int32_t

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

@duckdb-draftbot duckdb-draftbot marked this pull request as draft May 21, 2024 13:22
@Tmonster Tmonster marked this pull request as ready for review May 21, 2024 13:35
@hannes hannes added this to the v0.10.3 milestone May 21, 2024
@Mytherin Mytherin merged commit 1a6fa55 into duckdb:main May 21, 2024
41 checks passed
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request May 21, 2024
Merge pull request duckdb/duckdb#12056 from Tmonster/expression_rewrite_filter_pushdown_for_dates
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants