[server] TabletServer could not start because it attempted to recover tablet logs from residual data of already dropped tables. #1486

@platinumhamburg

Description

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

0.7.0 (latest release)

Please describe the bug 🐞

When a table is dropped on the CoordinatorServer, its metadata is removed from ZooKeeper, and the TabletServers then clean up their replicas asynchronously. If a TabletServer restarts before it has finished cleaning up its local bucket data, it tries to recover the logs of the dropped table from the residual data on local disk. Because the table's schema no longer exists in ZooKeeper, the TabletServer throws a fatal exception and cannot restart.

The error messages are as follows:

2025-08-06 16:31:31,185 ERROR com.alibaba.fluss.server.ServerBase [] - Could not start TabletServer.
com.alibaba.fluss.exception.FlussException: Failed to start the TabletServer.
    at com.alibaba.fluss.server.ServerBase.start(ServerBase.java:136) ~[fluss-server-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.server.ServerBase.startServer(ServerBase.java:93) [fluss-server-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.server.tablet.TabletServer.main(TabletServer.java:172) [fluss-server-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
Caused by: com.alibaba.fluss.exception.FlussRuntimeException: Failed to recovery log
    at com.alibaba.fluss.server.log.LogManager.loadLogs(LogManager.java:213) ~[fluss-server-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.server.log.LogManager.startup(LogManager.java:129) ~[fluss-server-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.server.tablet.TabletServer.startServices(TabletServer.java:200) ~[fluss-server-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.server.ServerBase.start(ServerBase.java:123) ~[fluss-server-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    ... 2 more

Solution

When a TabletServer tries to recover logs from local data whose table has no schema in ZooKeeper, it can conclude that the table has already been dropped. In that case it can skip recovery for that data, and the residual files can be safely removed.
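
A rough sketch of the idea, to make the proposal concrete. This is not the actual LogManager code: the class DroppedTableRecoverySketch, the Result holder, and the schemaLookup function (a stand-in for the ZooKeeper schema lookup) are all hypothetical names invented for illustration. It assumes each local tablet directory can be mapped to a table id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch; none of these names are real Fluss APIs. */
final class DroppedTableRecoverySketch {

    /** Outcome of one startup scan over the local tablet directories. */
    static final class Result {
        final List<String> recovered = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    /**
     * @param tabletDirToTableId local tablet directory name mapped to its table id (assumed)
     * @param schemaLookup stand-in for the ZooKeeper lookup; empty means "no schema found"
     */
    static Result loadLogs(
            Map<String, Long> tabletDirToTableId,
            Function<Long, Optional<String>> schemaLookup) {
        Result result = new Result();
        for (Map.Entry<String, Long> entry : tabletDirToTableId.entrySet()) {
            Optional<String> schema = schemaLookup.apply(entry.getValue());
            if (schema.isPresent()) {
                // Table still exists: the normal log recovery would run here (elided).
                result.recovered.add(entry.getKey());
            } else {
                // No schema: treat the table as dropped, skip recovery instead of
                // throwing, and remember the directory so it can be deleted later.
                result.skipped.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With two local directories where only the first table still has a schema, loadLogs puts the first directory in recovered and the second in skipped, and startup continues. One caveat for a real fix: a failed or timed-out ZooKeeper lookup should probably still abort startup rather than be treated as "table dropped", otherwise live data could be deleted by mistake.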

Are you willing to submit a PR?

  • I'm willing to submit a PR!
