Description
Search before asking
- I searched in the issues and found nothing similar.
Description
We found that processing of `DeleteReplicaResponseReceivedEvent` can block for a long time in some cases, especially when handling partitions of KV tables.
After some debugging, I found that the problem is in `ZooKeeperClient#deletePartitionAssignment`. This method recursively deletes all children of the partition assignment ZNode, which can take a long time. In my local tests, deleting a parent node with three levels of child nodes (1,024 first-level child nodes, each with 4 second-level child nodes, each of which has 5 third-level child nodes) takes more than 5 seconds.
```java
@Test
void testRecursiveDeletePerformance() throws Exception {
    String basePath = "/perfTest";
    int firstLevel = 1024;
    int secondLevel = 4;
    int thirdLevel = 5;
    CuratorFramework client = zookeeperClient.getCuratorClient();

    // Build a three-level tree under basePath.
    for (int i = 0; i < firstLevel; i++) {
        String level1 = basePath + "/n1_" + i;
        client.create().creatingParentsIfNeeded().forPath(level1);
        for (int j = 0; j < secondLevel; j++) {
            String level2 = level1 + "/n2_" + j;
            client.create().creatingParentsIfNeeded().forPath(level2);
            for (int k = 0; k < thirdLevel; k++) {
                String level3 = level2 + "/n3_" + k;
                client.create().creatingParentsIfNeeded().forPath(level3, "dummy".getBytes());
            }
        }
    }

    // Time only the recursive delete of the whole tree.
    long start = System.currentTimeMillis();
    client.delete()
            .deletingChildrenIfNeeded()
            .forPath(basePath);
    long end = System.currentTimeMillis();
    System.out.println("Delete finished. Cost: " + (end - start) + " ms");
}
```
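
For a sense of scale, the tree built by the test above can be counted with a few lines of standalone Java. This is only a back-of-the-envelope sketch: the `ZnodeCount` class and `countNodes` helper are hypothetical names used for illustration, and the 5000 ms figure is taken from the "more than 5 seconds" observation above as a lower bound, not a precise measurement.

```java
class ZnodeCount {
    /**
     * Counts the znodes in a uniform tree, including the root.
     * fanouts[i] is the number of children of every node at depth i.
     */
    static long countNodes(int... fanouts) {
        long total = 1;      // the root node itself
        long levelWidth = 1; // number of nodes at the current depth
        for (int fanout : fanouts) {
            levelWidth *= fanout;
            total += levelWidth;
        }
        return total;
    }

    public static void main(String[] args) {
        // Same shape as the test: 1024 x 4 x 5 children under /perfTest.
        long nodes = countNodes(1024, 4, 5);
        System.out.println("znodes to delete: " + nodes); // prints "znodes to delete: 25601"

        // Assumed lower bound from the issue text: the delete took more than 5 s.
        double observedMs = 5000.0;
        System.out.printf("average per znode: %.3f ms%n", observedMs / nodes);
    }
}
```

Since the recursive delete has to remove every znode individually, the total time grows with the number of nodes even when each individual operation is fast.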
For KV tables, there are many snapshot nodes, resulting in more child nodes compared to Log tables. Therefore, the deletion event takes longer, and this issue is more pronounced for KV tables.
Willingness to contribute
- I'm willing to submit a PR!