[SIGMOD'24] CaaS-LSM: Compaction-as-a-Service for LSM-based Key-Value Stores in Storage Disaggregated Infrastructure
Qiaolin Yu, Chang Guo, Jay Zhuang, Viraj Thakkar, Jianguo Wang, Zhichao Cao.
ACM Conference on Management of Data (SIGMOD 2024), Research Track Full Paper.
- Linux - Ubuntu
- Prepare for the dependencies of RocksDB: https://github.com/facebook/rocksdb/blob/main/INSTALL.md
- Install and configure an HDFS server
- Install gRPC
- Note that multiple CSAs should bind to one Control Plane.
- Please install the required dependencies before trying to compile.
- Note that currently we only support cmake. Do not use the makefile directly.
Search the repository for this code and delete it:
tmp_options.compaction_service = std::make_shared<MyTestCompactionService>(
    dbname, compaction_options, compaction_stats, remote_listeners,
    remote_table_properties_collector_factories);
- Configure the addresses of the Control Plane, CSA, and HDFS server in `include/rocksdb/options.h`
- Build and compile
- Run the Control Plane and CSA
git clone git@github.com:asu-idi/CaaS-LSM.git
cd CaaS-LSM
mkdir build
cd build
cmake ..
make -j100
./procp_server # run the Control Plane server
./csa_server # run the CSA server
- Configure the addresses of the CSA and HDFS server in `include/rocksdb/options.h`
- Build and compile
- Run the CSA
git clone git@github.com:asu-idi/CaaS-LSM.git
cd CaaS-LSM
git checkout disaggre-rocksdb
mkdir build
cd build
cmake ..
make -j100
./csa_server # Same binary name, but this CSA functions differently from the one in CaaS-LSM
- Clone the repo: https://github.com/bytedance/terarkdb
- Build and compile
- Check out the `terark-native` branch
sudo apt-get install libaio-dev
- Before building, enable `WITH_TOOLS` and `WITH_TERARK_ZIP`; both are necessary for remote compaction mode.
./build.sh
- Use `remote_compaction_worker_101`
- Copy the code in `db/compaction/remote_compaction` of CaaS-LSM, including `procp_server.cc`, `csa_server.cc`, `utils.h`, and `compaction_service.proto`
- Change `CompactionArgs` to `string`, since TerarkDB transmits encoded strings over the network.
- Start it the same way as CaaS-LSM.
Run db_bench:
./db_bench --benchmarks="fillrandom" --num=4000000 --statistics --threads=16 --max_background_compactions=8 --db=/xxx/xxx
The OPS of CaaS-LSM surpassed that of Disaggre-RocksDB by up to 61%, and TerarkDB-CaaS surpassed native TerarkDB by up to 42%.
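To check a comparison like this against your own runs, the relative OPS gain can be computed from two saved db_bench outputs. The following is a minimal sketch, not part of this repository. It assumes each log contains a benchmark summary line in which the throughput number is immediately followed by an `ops/sec` token (the usual db_bench report format); the function names and log file names are hypothetical.

```shell
# ops_per_sec LOG
# Print the number that precedes the first "ops/sec" token found in LOG.
ops_per_sec() {
    awk '{ for (i = 2; i <= NF; i++) if ($i ~ /^ops\/sec/) { print $(i - 1); exit } }' "$1"
}

# ops_gain_percent BASELINE_LOG CANDIDATE_LOG
# Print the relative OPS gain of the candidate over the baseline, in percent.
ops_gain_percent() {
    base=$(ops_per_sec "$1")
    cand=$(ops_per_sec "$2")
    awk -v b="$base" -v c="$cand" 'BEGIN { printf "%.1f\n", (c - b) / b * 100 }'
}
```

For example, after redirecting the db_bench output of each system to `disaggre.log` and `caas.log`, `ops_gain_percent disaggre.log caas.log` prints the gain of the second run over the first.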
- Use the branch `nebula`, follow the tips in https://docs.nebula-graph.io/3.2.0/4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/
- When compiling, replace `build/third-party/install/include/rocksdb/` and `build/third-party/install/lib/librocksdb.a` with the header files and libs produced by the `main` branch of this repo.
- If the `main` branch cannot compile, you can try the `rest_rpc` branch.
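The replacement step can be scripted. The sketch below is an illustration only: the function name `swap_rocksdb` is hypothetical, and it assumes that the CaaS-LSM headers live in `include/rocksdb` of the checkout and that the cmake build left the static library at `build/librocksdb.a`. Adjust both paths if your build layout differs.

```shell
# swap_rocksdb CAAS_DIR NEBULA_DIR
# Replace the RocksDB headers and static library bundled in a Nebula build
# tree with the ones produced by a CaaS-LSM checkout.
swap_rocksdb() {
    caas_dir=$1
    nebula_dir=$2
    dest="$nebula_dir/build/third-party/install"

    # Assumed locations of the CaaS-LSM build products (see lead-in).
    src_headers="$caas_dir/include/rocksdb"
    src_lib="$caas_dir/build/librocksdb.a"

    [ -d "$src_headers" ] && [ -f "$src_lib" ] || {
        echo "CaaS-LSM headers or librocksdb.a not found under $caas_dir" >&2
        return 1
    }
    [ -d "$dest/include" ] && [ -d "$dest/lib" ] || {
        echo "Nebula third-party install directory not found under $nebula_dir" >&2
        return 1
    }

    # Remove the bundled headers first so no stale header survives the copy.
    rm -rf "$dest/include/rocksdb"
    cp -R "$src_headers" "$dest/include/rocksdb"
    cp "$src_lib" "$dest/lib/librocksdb.a"
}
```

Called as `swap_rocksdb /path/to/CaaS-LSM /path/to/nebula`, it fails without touching the Nebula tree if either side is missing.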
- Clone `Kvrocks` at https://github.com/apache/incubator-kvrocks
- Before build:
  - Modify this part in "cmake/rocksdb.cmake" so that the default RocksDB is fetched from this repository instead:
    FetchContent_DeclareGitHubWithMirror(rocksdb facebook/rocksdb v7.8.3 MD5=f0cbf71b1f44ce8f50407415d38b9d44 )
- Build: ./x.py build
- Single mode:
  - build/kvrocks -c kvrocks.conf
- Cluster mode:
  - Based on `kvrocks controller` https://github.com/KvrocksLabs/kvrocks_controller.git with commit `df83752849ef41ce91037ca5c9cc6c670a480d56`
  - Dependencies: etcd https://etcd.io/docs/v3.5/install/
  - Build `kvrocks controller`: make
  - Start controller server: ./_build/kvrocks-controller-server -c ./config/config.yaml
  - A fast way to build cluster: python scripts/e2e_test.py
  - Check cluster status: ./_build/kvrocks-controller-cli -c ./config/kc_cli_config.yaml
  - Modify kvrocks.conf: port (e.g., 30001-30006), cluster-enabled (yes), dir /tmp/kvrocks (/tmp/kvrocks1-6)
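The per-node edits to kvrocks.conf can be generated rather than made by hand. This is a minimal sketch, assuming a six-node cluster on one host and a base kvrocks.conf in the usual `key value` format; the function name `gen_node_confs` and the output file naming are hypothetical, while the ports and directories follow the example values above.

```shell
# gen_node_confs BASE_CONF OUT_DIR
# Write OUT_DIR/kvrocks1.conf .. kvrocks6.conf, each a copy of BASE_CONF with
# its own port (30001-30006), cluster mode enabled, and its own data dir.
gen_node_confs() {
    base_conf=$1
    out_dir=$2
    mkdir -p "$out_dir"
    i=1
    while [ "$i" -le 6 ]; do
        conf="$out_dir/kvrocks$i.conf"
        # Drop the base file's own port / cluster-enabled / dir settings ...
        grep -v -E '^(port|cluster-enabled|dir)[[:space:]]' "$base_conf" > "$conf"
        # ... and append the per-node values.
        {
            echo "port $((30000 + i))"
            echo "cluster-enabled yes"
            echo "dir /tmp/kvrocks$i"
        } >> "$conf"
        i=$((i + 1))
    done
}
```

After `gen_node_confs kvrocks.conf ./cluster-conf`, each node can be started with `build/kvrocks -c ./cluster-conf/kvrocksN.conf`, as in single mode.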
Nebula-Random-Sche has a total OPS of 5,669 and an average latency of 526 ms, which are about 86% lower and 6× higher than those of Nebula-CaaS-LSM, respectively.
With better scheduling of compaction jobs in Kvrocks-CaaS, the overall OPS is about 20% better than that of Kvrocks-Local, and the average latency improves by 30%. In the cross-datacenter scenario, according to the log file, Kvrocks-Local experiences a pile-up of compaction jobs and a severe write slowdown once intensive compaction starts. In contrast, Kvrocks-CaaS runs smoothly and improves the overall OPS by 28% and P99 latency by 65%.





