Conversation

@sumeerbhola
Collaborator

The snapshot is used to create an iterator, which is periodically closed and recreated at the interval set by the storage.snapshot.recreate_iter_duration cluster setting (default 20s).

This change is mostly plumbing, except for catchup_scan.go.

Fixes #133851

Epic: none

Release note (ops change): The cluster setting
storage.snapshot.recreate_iter_duration (default 20s) controls how frequently a long-lived engine iterator, backed by an engine snapshot, will be closed and recreated. Currently, it is only used for iterators used in rangefeed catchup scans.

@sumeerbhola sumeerbhola requested review from a team as code owners September 29, 2025 22:12
@sumeerbhola sumeerbhola requested review from jeffswenson, log-head, rytaft and xinhaoz and removed request for a team September 29, 2025 22:12
@cockroach-teamcity
Member

This change is Reviewable

Collaborator Author

@sumeerbhola sumeerbhola left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @jeffswenson, @log-head, @rytaft, @stevendanna, and @xinhaoz)


pkg/kv/kvserver/rangefeed/catchup_scan.go line 251 at r1 (raw file):

			// iterator, and using the cpu time to amortize that cost seems
			// reasonable.
			if (readmitted && i.iterRecreateDuration > 0) || util.RaceEnabled {

Always recreating the iterator, now gated behind util.RaceEnabled, ironed out some correctness issues via the existing tests.

There isn't a unit test to check that we are recreating at the appropriate interval, and it seems hard to test without injecting testing alternatives for the Pacer, time functions, etc.

@yuzefovich yuzefovich removed the request for review from rytaft September 29, 2025 22:50
Collaborator

@stevendanna stevendanna left a comment


@stevendanna reviewed 8 of 26 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @jeffswenson, @log-head, and @xinhaoz)


pkg/kv/kvserver/rangefeed/catchup_scan.go line 272 at r1 (raw file):

				// on a key > lastKey. Which is why we need to potentially replace
				// lastKey below and to consider the case theat the iterator is
				// exhausted.

Just to confirm, this is OK even with withDiff because of the sameKey check above, so if the NextIgnoringTime landed us on a key outside the time bound but which is needed to populate the diff, we won't end up in this block. Does that seem right to you?

@sumeerbhola sumeerbhola force-pushed the catchup_snap branch 3 times, most recently from 4fff1aa to 0dfc257 Compare September 29, 2025 23:12
Collaborator Author

@sumeerbhola sumeerbhola left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @jeffswenson, @log-head, @stevendanna, and @xinhaoz)


pkg/kv/kvserver/rangefeed/catchup_scan.go line 272 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Just to confirm, this is OK even with withDiff because of the sameKey check above, so if the NextIgnoringTime landed us on a key outside the time bound but which is needed to populate the diff, we won't end up in this block. Does that seem right to you?

Correct. I've added a comment.

@sumeerbhola sumeerbhola force-pushed the catchup_snap branch 2 times, most recently from 06fc1cb to f7c28ac Compare September 30, 2025 02:09
Collaborator

@arulajmani arulajmani left a comment


:lgtm:

@arulajmani reviewed 18 of 26 files at r1, 8 of 8 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jeffswenson, @log-head, @stevendanna, and @xinhaoz)


pkg/kv/kvserver/rangefeed/catchup_scan.go line 278 at r2 (raw file):

				// So the following seek (which does respect the time window) may land
				// on a key > lastKey. Which is why we need to potentially replace
				// lastKey below and to consider the case theat the iterator is

nit: "that"


pkg/kv/kvserver/rangefeed/buffered_registration.go line 59 at r2 (raw file):

		disconnected       bool

		// catchUpSnap is created by replica under raftMu lock when registration is

nit: while we're here, could you add some "a" and "the"'s in this comment? e.g. "is created by a replica", "a registration", "the output loop" etc.

Collaborator

@jbowens jbowens left a comment


Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jeffswenson, @log-head, @stevendanna, @sumeerbhola, and @xinhaoz)


-- commits line 6 at r2:
Are we worried about the snapshots continuing to pin sstables for the duration of the catchup scan? I'm wondering if our removal of old-style LSM snapshots was premature, or if catchup scans could possibly acquire a new LSM version.

@RaduBerinde
Member

pkg/storage/pebble.go line 181 at r2 (raw file):

// SnapshotRecreateIterDuration controls how often a storage iterator over a snapshot
// should be recreated.

[nit] Can you add to this comment explaining why we want to recreate the iterator and what is the trade-off for this setting?

Collaborator Author

@sumeerbhola sumeerbhola left a comment


TFTRs!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jeffswenson, @log-head, @stevendanna, and @xinhaoz)


-- commits line 6 at r2:

Are we worried about the snapshots continuing to pin sstables for the duration of the catchup scan?

Slightly.

I'm wondering if our removal of old-style LSM snapshots was premature

I think not. This is an example where we avoided using old-style snapshots because of the concern about their expense in compactions and write-amp, and so we stuck with an iterator. Replacing an iterator with a new-style snapshot has no such tradeoff, so I think it is strictly better.

or if catchup scans could possibly acquire a new LSM version.

I'd asked about this on that doc, wrt whether we hold a protected timestamp. @stevendanna


pkg/kv/kvserver/rangefeed/buffered_registration.go line 59 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: while we're here, could you add some "a" and "the"'s in this comment? e.g. "is created by a replica", "a registration", "the output loop" etc.

Done. Also in unbuffered_registration.go.


pkg/kv/kvserver/rangefeed/catchup_scan.go line 278 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: "that"

Done


pkg/storage/pebble.go line 181 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] Can you add to this comment explaining why we want to recreate the iterator and what is the trade-off for this setting?

Done

@sumeerbhola
Collaborator Author

bors r+

@sumeerbhola
Collaborator Author

bors retry

@craig
Contributor

craig bot commented Oct 1, 2025

@craig craig bot merged commit 5a4758c into cockroachdb:master Oct 1, 2025
22 of 23 checks passed


Successfully merging this pull request may close these issues.

kv/rangefeed: use snapshots instead of iterators for rangefeed catchup scans
