@ajkr ajkr commented Oct 20, 2017

Instead of always using the samples directly as the dictionary, we now support passing them through zstd's dictionary generator when `CompressionOptions::zstd_max_train_bytes` is set to a nonzero value. If it is set to zero, the samples are used directly as the dictionary, same as before.
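
The two modes described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `MakeDictionary`, the `Trainer` callback type, and the parameter names are hypothetical, and the callback stands in for zstd's dictionary generator so that the sketch has no zstd dependency.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for zstd's dictionary generator: takes the
// concatenated samples, the length of each sample, and a size cap, and
// returns the generated dictionary.
using Trainer = std::function<std::string(
    const std::string& samples, const std::vector<size_t>& sample_lens,
    size_t max_dict_bytes)>;

// Hypothetical helper mirroring the behavior described above.
//   zstd_max_train_bytes == 0: the raw samples are the dictionary.
//   zstd_max_train_bytes  > 0: the samples go through the trainer.
std::string MakeDictionary(const std::string& samples,
                           const std::vector<size_t>& sample_lens,
                           size_t zstd_max_train_bytes,
                           size_t max_dict_bytes, const Trainer& trainer) {
  if (zstd_max_train_bytes == 0) {
    return samples;  // same as before: use the samples directly
  }
  return trainer(samples, sample_lens, max_dict_bytes);
}
```

With this shape, existing users who leave the option at zero see no change, while a nonzero value opts in to the generator.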

Note this is the first step of #2987, extracted into a separate PR per reviewer request.

Test Plan: make check -j64

@facebook-github-bot facebook-github-bot left a comment

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ajkr ajkr changed the title from "optionally pass samples through zstd compression dictionary generator" to "pass key/value samples through zstd compression dictionary generator" Oct 20, 2017

@yiwu-arbug yiwu-arbug left a comment

Looks good overall.

According to this answer the following should be safe:
https://stackoverflow.com/questions/39200665/directly-write-into-char-buffer-of-stdstring

std::string dict_data(max_dict_bytes, '\0');
size_t dict_len =
    ZDICT_trainFromBuffer(&dict_data[0], max_dict_bytes, &samples[0],
                          &sample_lens[0], sample_lens.size());
...
return dict_data;
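
To make the pattern concrete, here is a minimal sketch of writing into a `std::string`'s buffer through a C-style API and then shrinking the string to the number of bytes actually written. `FillBuffer` is a hypothetical stand-in for a function like `ZDICT_trainFromBuffer` (it just copies input bytes), and `BuildDict` is likewise an illustrative name. Real code would also need to check the trainer's return value for errors (for example with `ZDICT_isError`) before resizing.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style API: writes at most `capacity` bytes into `dst`
// and returns the number of bytes written.
size_t FillBuffer(void* dst, size_t capacity, const void* src,
                  size_t src_len) {
  const size_t n = std::min(capacity, src_len);
  std::memcpy(dst, src, n);
  return n;
}

std::string BuildDict(const std::string& samples, size_t max_dict_bytes) {
  if (max_dict_bytes == 0 || samples.empty()) {
    return std::string();  // nothing to write, skip the buffer entirely
  }
  // Pre-size the string so &dict_data[0] points at max_dict_bytes writable
  // bytes (std::string storage is contiguous since C++11).
  std::string dict_data(max_dict_bytes, '\0');
  const size_t dict_len = FillBuffer(&dict_data[0], max_dict_bytes,
                                     samples.data(), samples.size());
  // Drop the unused tail so size() reflects what was actually written.
  dict_data.resize(dict_len);
  return dict_data;
}
```

Returning the string by value avoids a separate heap buffer and an extra copy, which appears to be the point of the suggestion.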

Thanks, Yi.

@facebook-github-bot
@ajkr has updated the pull request.

@facebook-github-bot facebook-github-bot left a comment

@ajkr is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

benesch pushed a commit to benesch/rocksdb that referenced this pull request Nov 6, 2017
Summary:
Instead of using samples directly, we now support passing the samples through zstd's dictionary generator when `CompressionOptions::zstd_max_train_bytes` is set to nonzero. If set to zero, we will use the samples directly as the dictionary -- same as before.

Note this is the first step of facebook#2987, extracted into a separate PR per reviewer request.
Closes facebook#3057

Differential Revision: D6116891

Pulled By: ajkr

fbshipit-source-id: 70ab13cc4c734fa02e554180eed0618b75255497
facebook-github-bot pushed a commit that referenced this pull request Nov 7, 2017
Summary:
- moved existing compression options to `InitializeOptionsGeneral` since they cannot be set through options file
- added flag for `zstd_max_train_bytes` which was recently introduced by #3057
Closes #3128

Differential Revision: D6240460

Pulled By: ajkr

fbshipit-source-id: 27dbebd86a55de237ba6a45cc79cff9214e82ebc
facebook-github-bot pushed a commit that referenced this pull request Jan 5, 2018
Summary:
With the ZSTD dictionary generator support added in #3057
`PORTABLE=1 ROCKSDB_NO_FBCODE=1 make rocksdbjavastatic` fails as it can't find zdict.h. Specifically due to:
https://github.com/facebook/rocksdb/blob/e3a06f12d27fd50af7b6c5941973f529601f9a3e/util/compression.h#L39
In java static builds zstd code gets directly downloaded from https://github.com/facebook/zstd , and in there zdict.h is under dictBuilder directory. So, I modified libzstd.a target to use `make install` to collect all the header files into a single location and used that as the zstd's include path.
Closes #3260

Differential Revision: D6669850

Pulled By: sagar0

fbshipit-source-id: f8a7562a670e5aed4c4fb6034a921697590d7285
siying pushed a commit that referenced this pull request Jan 11, 2018