Skip to content

Conversation

@ever0de
Copy link
Contributor

@ever0de ever0de commented Nov 13, 2025

Previously, TrySeekUsingNext would perform up to 5 simple Next() operations
at level 0. If the target key wasn't found, all that work was abandoned and
the iterator fell back to a full seek from the top of the skiplist. This
wasted both the forward progress made and the information gained during those
Next operations.

The new implementation leverages the skiplist tower structure to make seeking
more efficient. Instead of advancing only at level 0, it attempts to climb to
higher levels when possible, allowing it to skip over many nodes efficiently.
When a node >= target is found at a higher level, it descends through
progressively lower levels to find the exact first node >= target at level 0.
This approach better utilizes the skiplist's hierarchical structure for
workloads with sequential access patterns.

The optimization shows a modest improvement in the SeekPrefixGE benchmark,
particularly as the skip distance between seeks increases. With use-next=true,
performance improves by up to 5.5% for larger skips (skip=16), while
shorter skips show smaller gains.

goos: darwin
goarch: arm64
pkg: github.com/cockroachdb/pebble/internal/arenaskl
cpu: Apple M2 Pro
                                       │ ./bench_master.txt │       ./bench_optimized.txt       │
                                       │       sec/op       │   sec/op     vs base              │
SeekPrefixGE/skip=1/use-next=false-12          180.8n ±  2%   186.8n ± 2%  +3.32% (p=0.015 n=6)
SeekPrefixGE/skip=1/use-next=true-12           88.26n ± 10%   89.05n ± 3%       ~ (p=0.699 n=6)
SeekPrefixGE/skip=2/use-next=false-12          183.0n ±  5%   184.9n ± 2%       ~ (p=1.000 n=6)
SeekPrefixGE/skip=2/use-next=true-12           94.73n ±  2%   93.31n ± 0%  -1.49% (p=0.002 n=6)
SeekPrefixGE/skip=4/use-next=false-12          175.9n ±  4%   178.1n ± 1%       ~ (p=0.065 n=6)
SeekPrefixGE/skip=4/use-next=true-12           114.5n ±  1%   111.0n ± 3%  -3.01% (p=0.006 n=6)
SeekPrefixGE/skip=8/use-next-false-12          177.2n ±  5%   177.6n ± 3%       ~ (p=0.615 n=6)
SeekPrefixGE/skip=8/use-next=true-12           219.1n ±  1%   216.4n ± 3%       ~ (p=0.485 n=6)
SeekPrefixGE/skip=16/use-next-false-12         180.2n ±  2%   179.2n ± 3%       ~ (p=0.288 n=6)
SeekPrefixGE/skip=16/use-next=true-12          208.3n ±  3%   196.8n ± 6%  -5.50% (p=0.009 n=6)
geomean                                        155.2n         154.4n       -0.53%

                                       │ ./bench_master.txt │       ./bench_optimized.txt        │
                                       │        B/op        │    B/op     vs base                │
SeekPrefixGE/skip=1/use-next=false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=1/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next-false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next-false-12           22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=true-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                          22.00        22.00       +0.00%
¹ all samples are equal

                                       │ ./bench_master.txt │       ./bench_optimized.txt        │
                                       │     allocs/op      │ allocs/op   vs base                │
SeekPrefixGE/skip=1/use-next=false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=1/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next-false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next-false-12           2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=true-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                          2.000        2.000       +0.00%
¹ all samples are equal

Fixes #5358

Previously, TrySeekUsingNext would perform up to 5 simple Next() operations
at level 0. If the target key wasn't found, all that work was abandoned and
the iterator fell back to a full seek from the top of the skiplist. This
wasted both the forward progress made and the information gained during those
Next operations.

The new implementation leverages the skiplist tower structure to make seeking
more efficient. Instead of advancing only at level 0, it attempts to climb to
higher levels when possible, allowing it to skip over many nodes efficiently.
When a node >= target is found at a higher level, it descends through
progressively lower levels to find the exact first node >= target at level 0.
This approach better utilizes the skiplist's hierarchical structure for
workloads with sequential access patterns.

The optimization shows a modest improvement in the SeekPrefixGE benchmark,
particularly as the skip distance between seeks increases. With `use-next=true`,
performance improves by up to 5.5% for larger skips (`skip=16`), while
shorter skips show smaller gains.

```
goos: darwin
goarch: arm64
pkg: github.com/cockroachdb/pebble/internal/arenaskl
cpu: Apple M2 Pro
                                       │ ./bench_master.txt │       ./bench_optimized.txt       │
                                       │       sec/op       │   sec/op     vs base              │
SeekPrefixGE/skip=1/use-next=false-12          180.8n ±  2%   186.8n ± 2%  +3.32% (p=0.015 n=6)
SeekPrefixGE/skip=1/use-next=true-12           88.26n ± 10%   89.05n ± 3%       ~ (p=0.699 n=6)
SeekPrefixGE/skip=2/use-next=false-12          183.0n ±  5%   184.9n ± 2%       ~ (p=1.000 n=6)
SeekPrefixGE/skip=2/use-next=true-12           94.73n ±  2%   93.31n ± 0%  -1.49% (p=0.002 n=6)
SeekPrefixGE/skip=4/use-next=false-12          175.9n ±  4%   178.1n ± 1%       ~ (p=0.065 n=6)
SeekPrefixGE/skip=4/use-next=true-12           114.5n ±  1%   111.0n ± 3%  -3.01% (p=0.006 n=6)
SeekPrefixGE/skip=8/use-next-false-12          177.2n ±  5%   177.6n ± 3%       ~ (p=0.615 n=6)
SeekPrefixGE/skip=8/use-next=true-12           219.1n ±  1%   216.4n ± 3%       ~ (p=0.485 n=6)
SeekPrefixGE/skip=16/use-next-false-12         180.2n ±  2%   179.2n ± 3%       ~ (p=0.288 n=6)
SeekPrefixGE/skip=16/use-next=true-12          208.3n ±  3%   196.8n ± 6%  -5.50% (p=0.009 n=6)
geomean                                        155.2n         154.4n       -0.53%

                                       │ ./bench_master.txt │       ./bench_optimized.txt        │
                                       │        B/op        │    B/op     vs base                │
SeekPrefixGE/skip=1/use-next=false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=1/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next-false-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=true-12             22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next-false-12           22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=true-12            22.00 ± 0%   22.00 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                          22.00        22.00       +0.00%
¹ all samples are equal

                                       │ ./bench_master.txt │       ./bench_optimized.txt        │
                                       │     allocs/op      │ allocs/op   vs base                │
SeekPrefixGE/skip=1/use-next=false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=1/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next-false-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=true-12             2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next-false-12           2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=true-12            2.000 ± 0%   2.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                          2.000        2.000       +0.00%
¹ all samples are equal
```

Fixes cockroachdb#5358
@ever0de ever0de requested a review from a team as a code owner November 13, 2025 07:21
@ever0de ever0de requested a review from jbowens November 13, 2025 07:21
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @jbowens)


internal/arenaskl/iterator.go line 318 at r1 (raw file):

		// Try to advance at the current level or higher levels.
		advanced := false
		for candidateLevel := level; candidateLevel <= maxLevel; candidateLevel++ {

I am a bit confused why we start with lower levels first, it feels like if we have to skip over a bunch of keys, those small steps will be more or less wasted.

Couldn't we binary search for the lowest level which points after key and then descend down? It feels like that would be close to optimal regardless of TrySeekUsingNext and doesn't need to have a maxSteps limit (though we could use TrySeekUsingNext to start the binary search with a lower or higher initial pivot).


internal/arenaskl/iterator.go line 365 at r1 (raw file):

// searching for the first node >= key. It starts at 'nd' which is known to be < key,
// and searches forward at progressively lower levels.
func (it *Iterator) searchDownToLevel0(nd *node, startLevel int, key []byte) *node {

Can this be consolidated with / used by seekForBaseSplice? They do almost the same thing AFAICT.

…Next

Addresses reviewer feedback from @RaduBerinde regarding inefficient level
traversal and code duplication.

This PR adds a height field to the skiplist node structure to enable safe
level-aware traversal from arbitrary positions, then implements binary search
to find the optimal starting level as suggested in the review.

Changes:

1. Added height field to node struct (4 bytes per node)
   - Nodes previously didn't store their tower height, making it impossible
     to safely access higher levels from arbitrary starting positions
   - Only the head node was guaranteed to have links at all levels
   - The height field enables checking  before access

2. Implemented seekForBaseSpliceFrom with binary search
   - When starting from an intermediate node (not head), binary search finds
     the highest safe level where current < target
   - This addresses the review comment that starting from lower levels is
     inefficient - we now start from the optimal level
   - Falls back to standard skiplist descent after finding the starting level

3. Simplified trySeekUsingNext to use seekForBaseSpliceFrom
   - Removed maxSteps limitation for cleaner, always-correct implementation
   - Consolidated with seekForBaseSplice as suggested in review
   - Now simply calls seekForBaseSpliceFrom(currentNode, currentHeight-1, key)

Implementation Journey:

Several approaches were attempted before adding the height field:
- Direct level climbing from arbitrary nodes: crashed due to accessing
  non-existent tower levels
- Checking next pointer validity: can't reliably detect invalid levels
- Starting from lower levels: inefficient as noted in review

The height field was the minimal addition needed to enable proper binary
search and level-aware traversal from any position.

Benchmark Results (master vs optimized):

Key improvement for longer seeks (target use case):
  skip=8/use-next=true:  219.1ns → 169.2ns  (-22.75%)
  skip=16/use-next=true: 208.3ns → 211.6ns  (+1.61%)

Trade-off for very short seeks (overhead from binary search):
  skip=1/use-next=true:  88.3ns → 119.3ns   (+35.17%)
  skip=2/use-next=true:  94.7ns → 132.0ns   (+39.35%)
  skip=4/use-next=true:  114.5ns → 149.7ns  (+30.74%)
@ever0de
Copy link
Contributor Author

ever0de commented Nov 13, 2025

@RaduBerinde Thank you for the feedback. Could you please confirm if my understanding is correct?

Initially, I was proceeding with the understanding that I should traverse up the tower to perform the Next operation, which is why I started from the lower levels.

Regarding your comment:

Couldn't we binary search for the lowest level which points after key and then descend down?

I interpreted this as performing a binary search across the node's levels. Please let me know if that was a misunderstanding.

I first attempted to implement this in commit 9b68e5c, but I found it was not possible to traverse up the tower from an arbitrary node (the iterator's current position) because I couldn't determine that node's height.
-> This same issue also blocked me from consolidating seekForBaseSplice to start from a specific node.

To solve this, I have now added a uint32 height field to the node, replacing what was previously padding. This change allows me to know the height of any arbitrary node, which makes it possible to proceed with your suggestions. This is the approach I've chosen.

https://github.com/ever0de/pebble/blob/b55c0ab4e38b7446fe970aecd16befae686d3c16/internal/arenaskl/iterator.go#L283-L289

Does this align with the implementation direction you had in mind?

Below are the benchmark results.

goos: darwin
goarch: arm64
pkg: github.com/cockroachdb/pebble/internal/arenaskl
cpu: Apple M2 Pro
                                                    │ bench_master_v2.txt │       bench_optimized_v2.txt        │
                                                    │       sec/op        │    sec/op     vs base               │
SeekPrefixGE/skip=1/use-next=false-12                        233.0n ±  2%   238.2n ±  8%   +2.21% (p=0.002 n=6)
SeekPrefixGE/skip=1/use-next=true-12                         104.6n ±  5%   118.7n ±  7%  +13.38% (p=0.002 n=6)
SeekPrefixGE/skip=2/use-next=false-12                        235.6n ±  5%   238.2n ±  4%        ~ (p=0.290 n=6)
SeekPrefixGE/skip=2/use-next=true-12                         115.6n ±  2%   132.1n ± 11%  +14.27% (p=0.002 n=6)
SeekPrefixGE/skip=4/use-next=false-12                        228.0n ±  3%   235.4n ±  1%   +3.25% (p=0.009 n=6)
SeekPrefixGE/skip=4/use-next=true-12                         139.9n ±  1%   149.2n ±  1%   +6.65% (p=0.002 n=6)
SeekPrefixGE/skip=8/use-next=false-12                        234.4n ±  5%   237.4n ±  5%        ~ (p=0.509 n=6)
SeekPrefixGE/skip=8/use-next=true-12                         289.2n ±  5%   171.3n ±  2%  -40.75% (p=0.002 n=6)
SeekPrefixGE/skip=16/use-next=false-12                       228.1n ±  2%   235.0n ±  2%   +3.00% (p=0.002 n=6)
SeekPrefixGE/skip=16/use-next=true-12                        271.3n ±  1%   215.5n ±  5%  -20.55% (p=0.002 n=6)
CockroachKeysSeekPrefixGE/skip=1/use-next=false-12           408.1n ±  1%   416.1n ±  1%   +1.95% (p=0.009 n=6)
CockroachKeysSeekPrefixGE/skip=1/use-next=true-12            50.96n ±  2%   66.71n ±  4%  +30.91% (p=0.002 n=6)
CockroachKeysSeekPrefixGE/skip=2/use-next=false-12           422.6n ±  2%   428.4n ±  1%   +1.38% (p=0.026 n=6)
CockroachKeysSeekPrefixGE/skip=2/use-next=true-12            74.05n ±  8%   92.66n ±  7%  +25.14% (p=0.002 n=6)
CockroachKeysSeekPrefixGE/skip=4/use-next=false-12           433.2n ±  5%   443.2n ± 13%   +2.32% (p=0.041 n=6)
CockroachKeysSeekPrefixGE/skip=4/use-next=true-12            107.3n ± 11%   130.8n ±  6%  +21.84% (p=0.002 n=6)
CockroachKeysSeekPrefixGE/skip=8/use-next=false-12           450.5n ±  3%   455.2n ± 12%        ~ (p=0.420 n=6)
CockroachKeysSeekPrefixGE/skip=8/use-next=true-12            532.8n ±  2%   156.0n ±  2%  -70.72% (p=0.002 n=6)
CockroachKeysSeekPrefixGE/skip=16/use-next=false-12          566.6n ± 12%   567.6n ±  6%        ~ (p=0.937 n=6)
CockroachKeysSeekPrefixGE/skip=16/use-next=true-12           685.7n ± 11%   289.6n ± 12%  -57.76% (p=0.002 n=6)
geomean                                                      234.4n         215.8n         -7.94%

                                                    │ bench_master_v2.txt │       bench_optimized_v2.txt       │
                                                    │        B/op         │    B/op     vs base                │
SeekPrefixGE/skip=1/use-next=false-12                        22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=1/use-next=true-12                         22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=false-12                        22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=true-12                         22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=false-12                        22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=true-12                         22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=false-12                        22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=true-12                         22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=false-12                       22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=true-12                        22.00 ± 0%     22.00 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=1/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=1/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=2/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=2/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=4/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=4/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=8/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=8/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=16/use-next=false-12          0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=16/use-next=true-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                                 ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                                    │ bench_master_v2.txt │       bench_optimized_v2.txt       │
                                                    │      allocs/op      │ allocs/op   vs base                │
SeekPrefixGE/skip=1/use-next=false-12                        2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=1/use-next=true-12                         2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=false-12                        2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=2/use-next=true-12                         2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=false-12                        2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=4/use-next=true-12                         2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=false-12                        2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=8/use-next=true-12                         2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=false-12                       2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
SeekPrefixGE/skip=16/use-next=true-12                        2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=1/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=1/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=2/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=2/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=4/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=4/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=8/use-next=false-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=8/use-next=true-12            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=16/use-next=false-12          0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CockroachKeysSeekPrefixGE/skip=16/use-next=true-12           0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                                 ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

@RaduBerinde
Copy link
Member

Oh, I see, thanks! I forgot that not all nodes have all the pointers - we might actually need to take some smaller steps just to get to a node that has more pointers (is "higher") before we can do a bigger jump. Let me think about this some more. @jbowens knows this code best, let's wait for him to chime in too.

@RaduBerinde
Copy link
Member

Going back to your first implementation (r1 in Reviewable), how does that work without knowing the height of a node? How do you know it's legal to look at the next pointer for a higher candidateLevel?

An alternative to storing the height in the node is remembering the entire stack during each Seek (i.e. the previous node on each level). Then we can try them starting at L1+ to find the correct level to start our descent from.

@ever0de
Copy link
Contributor Author

ever0de commented Nov 14, 2025

Hmm... initial idea was to traverse the tower of the node at the current iterator position. I assumed that if getNext(nd, level) returned tail, it meant there was no link at that level, and I could proceed from there. I now realize this was a flawed assumption and approach.

An alternative to storing the height in the node is remembering the entire stack during each Seek (i.e. the previous node on each level). Then we can try them starting at L1+ to find the correct level to start our descent from.

Regarding your suggestion to maintain a stack: I'm concerned that this might be difficult to implement correctly because arenaskl uses CAS (Compare-And-Swap) to manage its towers. Wouldn't it be challenging to ensure the stack remains valid amidst concurrent modifications? What are your thoughts on this?

In the read patterns I anticipate, it's possible for SeekGE to be called again on an iterator that has already been positioned (for instance, after being moved by Next, Prev, etc.). I'm worried that managing a stack in such scenarios could lead to a confusing or inconsistent iterator state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

arenaskl: consider alternative TrySeekingUsingNext optimizations

3 participants