Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Oct 31, 2025

Which issue does this PR close?

Rationale for this change

I am reworking how the parquet decoder's state machine works in #8159

One of the unit tests, test_cache_projection_excludes_nested_columns uses non-public APIs that I am changing

Rather than rewrite them into other non public APIs I think it would be better if this test is in terms of public APIs

What changes are included in this PR?

  1. refactor test_cache_projection_excludes_nested_columns to use high level APIs

Are these changes tested?

They are run in CI

I also verified this test covers the intended functionality by commenting it out:

--- a/parquet/src/arrow/async_reader/mod.rs
+++ b/parquet/src/arrow/async_reader/mod.rs
@@ -724,7 +724,9 @@ where
             cache_projection.union(predicate.projection());
         }
         cache_projection.intersect(projection);
-        self.exclude_nested_columns_from_cache(&cache_projection)
+        // TEMP don't exclude nested columns
+        //self.exclude_nested_columns_from_cache(&cache_projection)
+        Some(cache_projection)
     }

     /// Exclude leaves belonging to roots that span multiple parquet leaves (i.e. nested columns)

And then running the test:

cargo test --all-features --test arrow_reader

And the test fails (as expected)

---- predicate_cache::test_cache_projection_excludes_nested_columns stdout ----

thread 'predicate_cache::test_cache_projection_excludes_nested_columns' panicked at parquet/tests/arrow_reader/predicate_cache.rs:244:9:
assertion `left == right` failed: Expected 0 records read from cache, but got 100
  left: 100
 right: 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    predicate_cache::test_cache_projection_excludes_nested_columns

test result: FAILED. 88 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.20s

Are there any user-facing changes?

No, this is only test changes

@github-actions github-actions bot added the parquet Changes to the parquet crate label Oct 31, 2025
@alamb alamb marked this pull request as ready for review October 31, 2025 12:40
@alamb
Copy link
Contributor Author

alamb commented Oct 31, 2025

@XiangpengHao can I trouble you for a review of this PR (as you wrote the original version of the test)?

Copy link
Contributor

@XiangpengHao XiangpengHao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! This is much more readable than before 💯

@alamb alamb merged commit 06f7f07 into apache:main Oct 31, 2025
32 checks passed
@alamb
Copy link
Contributor Author

alamb commented Oct 31, 2025

Tahnk you @XiangpengHao

@alamb alamb deleted the alamb/rewrite_tests branch October 31, 2025 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants