Conversation

sagar0 commented Sep 23, 2017

Summary

For every merge operand encountered for a key in the read path, we now have the ability to decide whether to look further (to retrieve more merge operands for the key) or to stop and invoke the merge operator to return the value. To use this facility, the user overrides the ShouldMerge() method with a condition that returns true when the search should terminate.

This has a couple of advantages:

  1. It helps limit the number of merge operands that are looked at to compute a value as part of a user Get operation.
  2. It allows peeking at a merge operand's value to decide whether further merge operands need to be looked at.

Example: Limiting the number of merge operands that are looked at: Let's say you have 10 merge operands for a key spread over various levels. If you only want RocksDB to look at the latest two merge operands instead of all 10 to compute the value, it is now possible with this PR. You can set the condition in ShouldMerge() to return true when the size of the operand list is 2. See the example implementation in the unit test, and the minimal sketch below. Without this PR, a Get might look at all 10 merge operands in different levels before invoking the merge operator.
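
To make this concrete, here is a minimal sketch of such an operator. This is an illustration, not the unit test's implementation: the class name CappedUInt64AddOperator and the uint64-add semantics are assumptions, chosen to mirror the benchmark's uint64add operator.

#include <cstdint>
#include <cstring>
#include <vector>

#include "rocksdb/merge_operator.h"
#include "rocksdb/slice.h"

// Hypothetical uint64-add merge operator that tells the read path to stop
// scanning for more merge operands once two have been collected.
class CappedUInt64AddOperator : public rocksdb::MergeOperator {
 public:
  bool FullMergeV2(const MergeOperationInput& merge_in,
                   MergeOperationOutput* merge_out) const override {
    uint64_t sum = 0;
    if (merge_in.existing_value != nullptr &&
        merge_in.existing_value->size() == sizeof(uint64_t)) {
      std::memcpy(&sum, merge_in.existing_value->data(), sizeof(uint64_t));
    }
    for (const rocksdb::Slice& operand : merge_in.operand_list) {
      uint64_t delta = 0;
      if (operand.size() == sizeof(uint64_t)) {
        std::memcpy(&delta, operand.data(), sizeof(uint64_t));
      }
      sum += delta;
    }
    merge_out->new_value.assign(reinterpret_cast<const char*>(&sum),
                                sizeof(uint64_t));
    return true;
  }

  // Called once per operand encountered on the read path; returning true
  // stops the search, so only the two newest operands are ever merged.
  bool ShouldMerge(const std::vector<rocksdb::Slice>& operands) const override {
    return operands.size() >= 2;
  }

  const char* Name() const override { return "CappedUInt64AddOperator"; }
};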

Test Plan:

Added a new unit test.
Made sure that there is no perf regression by running benchmarks.

Benchmark results:

Command line to Load data:

TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="mergerandom" --merge_operator="uint64add" --num=10000000
...
mergerandom  :      12.861 micros/op 77757 ops/sec;    8.6 MB/s ( updates:10000000)

ReadRandomMergeRandom benchmark results:
Command line:

TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="readrandommergerandom" --merge_operator="uint64add" --num=10000000

Base -- Without this code change (on commit fc7476b):

readrandommergerandom :      38.586 micros/op 25916 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8)

With this code change:

readrandommergerandom :      38.653 micros/op 25870 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8)

@facebook-github-bot

@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

To support limiting the number of merge operands that are looked at
during a full-merge as part of a Get operation.
sagar0 force-pushed the limit-merge-operands branch from 463badc to 1bfc366 on September 25, 2017 18:09
@facebook-github-bot

@sagar0 has updated the pull request.

@facebook-github-bot

@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

@sagar0 has updated the pull request.

@facebook-github-bot

@sagar0 has updated the pull request.

@facebook-github-bot

@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

sagar0 requested review from ajkr, maysamyabandeh, and siying, and removed the request for maysamyabandeh on September 26, 2017 17:08

siying left a comment

Make sure the performance doesn't regress before committing it.

sagar0 commented Sep 28, 2017

Benchmark results:

Command line to Load data:

TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="mergerandom" --merge_operator="uint64add" --num=10000000
...
mergerandom  :      12.861 micros/op 77757 ops/sec;    8.6 MB/s ( updates:10000000)

ReadRandomMergeRandom benchmark results:
Command line:
TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="readrandommergerandom" --merge_operator="uint64add" --num=10000000

Without this PR (on commit fc7476b):

readrandommergerandom :      38.586 micros/op 25916 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8)

With the current PR:

readrandommergerandom :      38.653 micros/op 25870 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8)

@facebook-github-bot

@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

// Allows to control when to invoke a full merge during Get.
// This could be used to limit the number of merge operands that are looked at
// during a point lookup, thereby helping in limiting the number of levels to
// read from.
// Doesn't help with iterators.
virtual bool ShouldMerge(const std::vector<Slice>& operands) const {
  return false;  // default: keep looking for more operands
}

One question: when this gets called, have we already read all the operands from disk, which means we still need to pay the disk I/O?

sagar0 commented Sep 29, 2017

No. This ShouldMerge callback gets called for each merge operand as we go down the levels, and we don't read more operands once ShouldMerge returns true. So we don't pay for any more disk I/O than is absolutely necessary.

Example: Let's say:

// Stop looking for more merge operands once two have been collected.
bool ShouldMerge(const std::vector<Slice>& operands) const override {
  return operands.size() >= 2;
}

Let's say a key k1 has 8 merge operands spread over different levels.
On encountering a Get(k1), ShouldMerge gets called with one operand first; it returns false, and we continue trying to find more merge operands, if any.
It gets called again when the second operand is encountered, and this time it returns true, which leads to invoking the FullMerge operator, and a value is returned.
The remaining merge operands for k1 at higher levels are never read from disk.
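
For concreteness, a small driver that exercises this flow might look as follows. This is a hedged sketch, not code from the PR: it reuses the hypothetical CappedUInt64AddOperator sketched in the description above, and the database path is illustrative.

#include <cassert>
#include <cstdint>
#include <cstring>
#include <iostream>
#include <string>

#include "rocksdb/db.h"
#include "rocksdb/options.h"

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  // Hypothetical operator from the sketch above, capping lookups at 2 operands.
  options.merge_operator.reset(new CappedUInt64AddOperator());

  rocksdb::DB* db = nullptr;
  rocksdb::Status s =
      rocksdb::DB::Open(options, "/tmp/should_merge_demo", &db);
  assert(s.ok());

  // Write eight merge operands for k1; in a real workload flushes and
  // compactions would spread them over several levels.
  uint64_t one = 1;
  rocksdb::Slice delta(reinterpret_cast<const char*>(&one), sizeof(one));
  for (int i = 0; i < 8; ++i) {
    db->Merge(rocksdb::WriteOptions(), "k1", delta);
  }

  // With the cap in effect, the Get stops after collecting the two newest
  // operands and invokes FullMergeV2 on just those.
  std::string value;
  s = db->Get(rocksdb::ReadOptions(), "k1", &value);
  assert(s.ok());

  uint64_t result = 0;
  std::memcpy(&result, value.data(), sizeof(result));
  std::cout << "merged value: " << result << std::endl;  // expected: 2, not 8
  delete db;
  return 0;
}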

facebook-github-bot pushed a commit that referenced this pull request on Oct 2, 2017
Summary:
Now that RocksDB supports conditional merging during point lookups (introduced in #2923), the Cassandra value merge operator can be updated to pass in a limit. The limit needs to be passed in from the Cassandra code.
Closes #2947

Differential Revision: D5938454

Pulled By: sagar0

fbshipit-source-id: d64a72d53170d8cf202b53bd648475c3952f7d7f
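
The shape of such a change is roughly a merge operator whose operand cap is configurable at construction time. The following is a sketch under assumptions, not the actual Cassandra operator code; the names LimitedMergeOperator and operands_limit_ are illustrative.

#include <cstddef>
#include <vector>

#include "rocksdb/merge_operator.h"
#include "rocksdb/slice.h"

// Sketch: the operand cap is passed in at construction; 0 means "no limit",
// preserving the old exhaustive-lookup behavior.
class LimitedMergeOperator : public rocksdb::MergeOperator {
 public:
  explicit LimitedMergeOperator(size_t operands_limit)
      : operands_limit_(operands_limit) {}

  bool ShouldMerge(const std::vector<rocksdb::Slice>& operands) const override {
    return operands_limit_ > 0 && operands.size() >= operands_limit_;
  }

  // FullMergeV2 (the actual merge logic) would be implemented as usual.
  const char* Name() const override { return "LimitedMergeOperator"; }

 private:
  const size_t operands_limit_;
};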