Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Jan 31, 2024

Which issue does this PR close?

Part of #5472

Rationale for this change

This is something I noticed while working on #8849

The count distinct module was getting large so splitting it into smaller modules would makes things easier to navigate

What changes are included in this PR?

  1. Move code to native.rs module (to mirror strings.rs)
  2. rename native to NativeDistinctCountAccumulator to PrimitiveDistinctCountAccumulator to align with ArrowPrimitiveType
  3. add some more docs
  4. Remove an unecessary typedef

Are these changes tested?

By existing tests

Are there any user-facing changes?

No, this is internal code reorganization. None of this code is accessible outside of the crate

@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Jan 31, 2024
UInt16 => Box::new(NativeDistinctCountAccumulator::<UInt16Type>::new()),
UInt32 => Box::new(NativeDistinctCountAccumulator::<UInt32Type>::new()),
UInt64 => Box::new(NativeDistinctCountAccumulator::<UInt64Type>::new()),
Int8 => Box::new(PrimitiveDistinctCountAccumulator::<Int8Type>::new()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed

#[derive(Debug)]
struct DistinctCountAccumulator {
values: HashSet<DistinctScalarValues, RandomState>,
values: HashSet<ScalarValue, RandomState>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was typedefd to

type DistinctScalarValues = ScalarValue;

Which serves no purpose that I can tell other than obscuring what this code is doing

}

#[derive(Debug)]
struct NativeDistinctCountAccumulator<T>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to native.rs

@alamb alamb marked this pull request as ready for review January 31, 2024 17:48
@alamb alamb changed the title Split count_distinct.rs into separate modules @alamb Split count_distinct.rs into separate modules Jan 31, 2024
@alamb alamb requested a review from crepererum January 31, 2024 21:10
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here could be a module-level comment that quickly says what "native" is. "Native" I think revers to "primitive" scalars like ints and floats, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct. Good call

Added in 1edf4f0

@alamb
Copy link
Contributor Author

alamb commented Feb 1, 2024

Thanks again for the review @crepererum

@alamb alamb merged commit 8b50774 into apache:main Feb 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

physical-expr Changes to the physical-expr crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants