
Conversation

@hubert2dev

Context

  • Task: Improve PostgreSQL bucket compaction query efficiency (ticket “400: [Postgres Storage] Inefficient index usage when compacting”).
  • File touched: modules/module-postgres-storage/src/storage/PostgresCompactor.ts.
  • Branch: feature/pg-compaction-bounds.

Current Issue

  • The compaction loop queries bucket_data in reverse primary-key order (ORDER BY bucket_name DESC, op_id DESC) with this filter:

    bucket_name >= :bucketLower
    AND (
      (bucket_name = :bucketUpper AND op_id < :upperOpId)
      OR bucket_name < :bucketUpper COLLATE "C"
    )
    
  • Because of the OR, PostgreSQL cannot traverse the (group_id, bucket_name, op_id) PRIMARY KEY efficiently and instead runs an Index Scan Backward using unique_id while filtering millions of rows (an example plan reports ~4.3M rows scanned for a 10k LIMIT).

  • Result: high latency and IO for compaction jobs, especially when buckets contain large histories.

Change Overview

  1. Row-value comparison for pagination
    • Compute parameter objects (groupParam, bucketLowerParam, bucketUpperParam, upperOpParam, limitParam).
    • First batch (upperOpIdLimit === BIGINT_MAX): keep the binary-collation comparison bucket_name < bucketUpper COLLATE "C".
    • Subsequent batches: ROW(bucket_name COLLATE "C", op_id) < ROW(bucketUpper, upperOpIdLimit); this follows the composite index ordering exactly (a sketch of the rewritten query follows this list).
  2. Preserve behavior
    • bucketLower handling and MAX_CHAR sentinel unchanged.
    • Pagination still updates bucketUpper/upperOpIdLimit from the last row in each batch.
    • Decoding, MOVE/CLEAR logic, and batching limits untouched.
  3. Type-safe params
    • Explicit as const tagging on parameter objects prevents lint errors and clarifies Postgres types.
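
A minimal sketch of the rewritten query for subsequent batches, using the parameter names above (the column list and the group_id equality are assumptions; the exact text in PostgresCompactor.ts may differ):

    SELECT ... FROM bucket_data
    WHERE group_id = :group
      AND bucket_name >= :bucketLower
      AND ROW(bucket_name COLLATE "C", op_id) < ROW(:bucketUpper, :upperOpId)
    ORDER BY bucket_name DESC, op_id DESC
    LIMIT :limit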

Why This Works

  • PostgreSQL can perform key-ordered scans when the predicate forms a prefix of the index ordering.
  • Replacing the OR with a row comparison removes the planner barrier; the backward scan can stop as soon as the tuple falls below the (bucketUpper, upperOpIdLimit) cursor, so the number of scanned rows is roughly the requested LIMIT (see the equivalence note below).
  • Uses binary collation (COLLATE "C") consistently to match the original comparison semantics.
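
For illustration, standard row-wise comparison semantics make the new predicate logically equivalent to the old OR form; the difference is that the planner can turn the row form into a single index boundary condition instead of a per-row filter:

    ROW(bucket_name, op_id) < ROW(:bucketUpper, :upperOpId)
    -- is true exactly when
    --   bucket_name < :bucketUpper
    --   OR (bucket_name = :bucketUpper AND op_id < :upperOpId)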

Testing / Verification Plan

  1. Explain before/after
    • Capture EXPLAIN (ANALYZE, BUFFERS) for the old query to document the ~4.3M scanned rows.
    • Re-run after the change; expect heap tuples fetched and "Rows Removed by Filter" to drop to near zero (an example capture command follows this list).
  2. Result equivalence
    • Run compactor on a staging DB and confirm bucket batches (first batch, mid-bucket, cross-bucket) match historical output.
  3. Manual smoke
    • Trigger compaction via CLI with verbose logging, verify batches proceed bucket-by-bucket without errors.
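
For the capture step in item 1, a command along these lines should work (the column list, the group_id equality, and the parameter values are placeholders, not the exact compactor query):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT ... FROM bucket_data
    WHERE group_id = :group
      AND bucket_name >= :bucketLower
      AND ROW(bucket_name COLLATE "C", op_id) < ROW(:bucketUpper, :upperOpId)
    ORDER BY bucket_name DESC, op_id DESC
    LIMIT 10000;

Comparing "Rows Removed by Filter" and shared buffer counts before and after should make the improvement visible directly in the plan.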

@changeset-bot

changeset-bot bot commented Nov 18, 2025

⚠️ No Changeset found

Latest commit: 7b3c289

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@rkistner (Contributor) left a comment

  1. Can the upperOpIdLimit === BIGINT_MAX special case be removed? It seems like the second query should be able to cover that case as well. I do like the ROW (...) < ROW (...) comparison - that does appear better than the OR clause we had.
  2. There are still two comparisons, one using the default collation, and the other using COLLATE "C". It feels like at least one of those is wrong - these should be using the same collation - which would be the better one for index usage?
  3. The PR description mentions a test plan to confirm that (1) this fixes the performance issue, and (2) gives the same results. Have you done any of those tests?
