Strimzi Backup

Note: This is not part of the Strimzi Cloud Native Computing Foundation (CNCF) project!

Strimzi Backup is a CLI tool for backing up and restoring your Strimzi-based Apache Kafka cluster.

How to use Strimzi Backup?

Installation

You can download a release binary from the GitHub releases and run it directly. Alternatively, you can use the provided container image to run it in a Kubernetes Pod or locally as a container.

Getting help

You can always get help by using the `--help` option 😉. You can also ask in the project's discussions.

Backing up your Apache Kafka cluster

You can back up your Kafka cluster using the `strimzi-backup backup kafka` command. This command gets the Kubernetes resources and stores them in a GZIP archive. The archive includes:

The Kafka CR
(Optional) The Secrets with the Cluster and Client Certification Authorities
All KafkaNodePool CRs belonging to this Kafka cluster
All KafkaTopic CRs belonging to this Kafka cluster
All KafkaUser CRs belonging to this Kafka cluster
(Optional) All Secrets belonging to the Kafka Users with their mTLS or SCRAM-SHA-512 credentials

The backup command uses the following options:

Option	Description	Default Value
`--kubeconfig`	Path to the kubeconfig file to use for Kubernetes API requests. If not specified, `strimzi-backup` will try to auto-detect the Kubernetes configuration.
`--namespace`	Namespace of the Kafka cluster to back up. If not specified, `strimzi-backup` will try to auto-detect and use the current namespace from your Kubernetes configuration.
`--name`	Name of the Kafka cluster to back up. (Required)
`--filename`	Name of the file with the backup. If not set, the file name will be auto-generated based on the current time.
`--skip-metadata-cleansing`	Skip cleanup of the Kubernetes metadata in the backed up resources. Metadata cleansing removes the fields that are not useful for restoring the cluster such as the generation, timestamps, managed fields, or last applied configurations. Skipping the metadata cleansing will make the resulting backup file larger. But in some cases - for example for auditing purposes - the metadata might be useful.	`false`
`--skip-ca-secrets`	Skip backup of the Cluster and Client Certification Authority Secrets	`false`
`--skip-user-secrets`	Skip backup of the Kafka User Secrets	`false`
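
As an illustration of how these options fit together, here is a minimal sketch of a wrapper function for a Kafka backup. The function name `backup_kafka`, the `STRIMZI_BACKUP` variable, and the timestamped file naming scheme are hypothetical and not part of the tool. The sketch also assumes the options accept their values in the `--option value` form.

```shell
# Path to the strimzi-backup binary (hypothetical variable);
# override it if the binary is not on your PATH.
STRIMZI_BACKUP="${STRIMZI_BACKUP:-strimzi-backup}"

# backup_kafka NAMESPACE CLUSTER_NAME [FILE]
# Backs up the Kafka cluster with the default settings, which include
# the CA Secrets and the user Secrets.
backup_kafka() {
  # If no file is given, fall back to <cluster>-<UTC timestamp>.gz
  # (an assumed naming scheme, not the tool's own auto-generated name).
  "$STRIMZI_BACKUP" backup kafka \
    --namespace "$1" \
    --name "$2" \
    --filename "${3:-$2-$(date -u +%Y%m%dT%H%M%SZ).gz}"
}
```

For example, `backup_kafka kafka my-cluster` would back up the cluster `my-cluster` from the `kafka` namespace. To leave the Secrets out of the archive, you could add `--skip-ca-secrets` or `--skip-user-secrets` to the command.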

Notes:

The server certificates used by the different nodes are not part of the backup. The Strimzi Cluster Operator will create new ones once the cluster is restored. Because clients trust the Kafka cluster based on its Cluster CA, restoring the Cluster CA is sufficient for the clients' existing trusted certificates to keep working.
strimzi-backup does not include any third-party Secrets (such as listener server certificates). You are responsible for backing them up and restoring them yourself.

Restoring your Apache Kafka cluster

You can restore your Kafka cluster using the `strimzi-backup restore kafka` command. This command loads the resources from the backup and recreates them. It first recreates the Kafka cluster in the paused state and then restores its Kafka Cluster ID. Only at the end, once all other resources are restored as well, does it unpause the cluster and wait for it to become ready. The restore includes all resources from the backup file:

The Kafka CR
The Secrets with the Cluster and Client Certification Authorities
All KafkaNodePool CRs belonging to this Kafka cluster
All KafkaTopic CRs belonging to this Kafka cluster
All KafkaUser CRs belonging to this Kafka cluster
All Secrets belonging to the Kafka Users with their mTLS or SCRAM-SHA-512 credentials

The restore command uses the following options:

Option	Description	Default Value
`--kubeconfig`	Path to the kubeconfig file to use for Kubernetes API requests. If not specified, `strimzi-backup` will try to auto-detect the Kubernetes configuration.
`--namespace`	Namespace in which the Kafka cluster should be restored. If not specified, `strimzi-backup` will try to auto-detect and use the current namespace from your Kubernetes configuration. This might differ from the original namespace when the backup was taken.
`--name`	Name of the restored Kafka cluster. This might differ from the original name when the backup was taken. `strimzi-backup` will rename the cluster accordingly. (Required)
`--filename`	Name of the file with the backup which should be restored. (Required)
`--timeout`	Timeout for how long to wait for the cluster to restore. In milliseconds.	`300000`
`--skip-ca-secrets`	Skip restoring of the Cluster and Client Certification Authority Secrets	`false`
`--skip-user-secrets`	Skip restoring of the Kafka User Secrets	`false`
`--skip-cluster-id`	Skip restoring of the Kafka Cluster ID	`false`
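
Because `--timeout` is given in milliseconds, it is easy to get wrong by a factor of 1000. The following sketch wraps the restore in a function that takes the timeout in seconds and converts it. The function name `restore_kafka` and the `STRIMZI_BACKUP` variable are hypothetical, and the `--option value` form is an assumption.

```shell
# Path to the strimzi-backup binary (hypothetical variable);
# override it if the binary is not on your PATH.
STRIMZI_BACKUP="${STRIMZI_BACKUP:-strimzi-backup}"

# restore_kafka NAMESPACE CLUSTER_NAME FILE [TIMEOUT_SECONDS]
# Restores the Kafka cluster from FILE under the given name and namespace.
restore_kafka() {
  # Convert seconds to the milliseconds that --timeout expects.
  # 300 seconds matches the documented default of 300000 ms.
  "$STRIMZI_BACKUP" restore kafka \
    --namespace "$1" \
    --name "$2" \
    --filename "$3" \
    --timeout "$(( ${4:-300} * 1000 ))"
}
```

For example, `restore_kafka kafka-dr my-cluster backup.gz 900` would wait up to 15 minutes for the restored cluster to become ready.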

Notes:

In most cases, Strimzi cannot fully restore the addresses of the external listeners. Things such as load balancers will be newly provisioned when the cluster is restored and are likely to differ from the original ones.
The addresses of the internal listeners will also differ in case you change the namespace or name of the Kafka cluster.
The restore process expects to do the restoration into a clean environment and will currently fail if any of the resources already exist. This might be addressed in the future with the dry-run and force modes (see #11 for more details).

Backing up your Apache Kafka Connect cluster

You can back up your Kafka Connect cluster using the `strimzi-backup backup connect` command. This command gets the Kubernetes resources and stores them in a GZIP archive. The archive includes:

The KafkaConnect CR
All KafkaConnector CRs belonging to this Connect cluster

The backup command uses the following options:

Option	Description	Default Value
`--kubeconfig`	Path to the kubeconfig file to use for Kubernetes API requests. If not specified, `strimzi-backup` will try to auto-detect the Kubernetes configuration.
`--namespace`	Namespace of the Kafka Connect cluster to back up. If not specified, `strimzi-backup` will try to auto-detect and use the current namespace from your Kubernetes configuration.
`--name`	Name of the Kafka Connect cluster to back up. (Required)
`--filename`	Name of the file with the backup. If not set, the file name will be auto-generated based on the current time.
`--skip-metadata-cleansing`	Skip cleanup of the Kubernetes metadata in the backed up resources. Metadata cleansing removes the fields that are not useful for restoring the cluster such as the generation, timestamps, managed fields, or last applied configurations. Skipping the metadata cleansing will make the resulting backup file larger. But in some cases - for example for auditing purposes - the metadata might be useful.	`false`
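
If you want to keep the full Kubernetes metadata, for example for auditing, a Connect backup could look like the sketch below. The function name `backup_connect_for_audit` and the `STRIMZI_BACKUP` variable are hypothetical, and the `--option value` form is an assumption.

```shell
# Path to the strimzi-backup binary (hypothetical variable);
# override it if the binary is not on your PATH.
STRIMZI_BACKUP="${STRIMZI_BACKUP:-strimzi-backup}"

# backup_connect_for_audit NAMESPACE CONNECT_NAME FILE
# Backs up the KafkaConnect CR and its KafkaConnector CRs and keeps
# the generation, timestamps, managed fields, and similar metadata.
backup_connect_for_audit() {
  "$STRIMZI_BACKUP" backup connect \
    --namespace "$1" \
    --name "$2" \
    --filename "$3" \
    --skip-metadata-cleansing
}
```

For example, `backup_connect_for_audit kafka my-connect my-connect-audit.gz` would produce a larger archive than the default backup, with the metadata left untouched.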

Notes:

The backup process does not back up any data the original Connect cluster had stored in its Kafka cluster (such as offset information).
strimzi-backup does not include any third-party Secrets (such as database server certificates or passwords). You are responsible for backing them up and restoring them yourself.

Restoring your Apache Kafka Connect cluster

You can restore your Kafka Connect cluster using the `strimzi-backup restore connect` command. This command loads the resources from the backup and recreates them. It recreates the Connect cluster together with any connector resources and then waits for it to become ready. The restore includes all resources from the backup file:

The KafkaConnect CR
All KafkaConnector CRs belonging to this Connect cluster

The restore command uses the following options:

Option	Description	Default Value
`--kubeconfig`	Path to the kubeconfig file to use for Kubernetes API requests. If not specified, `strimzi-backup` will try to auto-detect the Kubernetes configuration.
`--namespace`	Namespace in which the Kafka Connect cluster should be restored. If not specified, `strimzi-backup` will try to auto-detect and use the current namespace from your Kubernetes configuration. This might differ from the original namespace when the backup was taken.
`--name`	Name of the restored Kafka Connect cluster. This might differ from the original name when the backup was taken. `strimzi-backup` will rename the cluster accordingly. (Required)
`--filename`	Name of the file with the backup which should be restored. (Required)
`--timeout`	Timeout for how long to wait for the cluster to restore. In milliseconds.	`300000`
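
The `--kubeconfig` option makes it possible to restore into a different Kubernetes cluster than the one your current context points to. The sketch below restores a Connect cluster under a new name and namespace using an explicit kubeconfig file. The function name `restore_connect` and the `STRIMZI_BACKUP` variable are hypothetical, and the `--option value` form is an assumption.

```shell
# Path to the strimzi-backup binary (hypothetical variable);
# override it if the binary is not on your PATH.
STRIMZI_BACKUP="${STRIMZI_BACKUP:-strimzi-backup}"

# restore_connect KUBECONFIG NAMESPACE CONNECT_NAME FILE
# Restores the Connect cluster from FILE into the Kubernetes cluster
# described by KUBECONFIG, using the given namespace and name.
restore_connect() {
  "$STRIMZI_BACKUP" restore connect \
    --kubeconfig "$1" \
    --namespace "$2" \
    --name "$3" \
    --filename "$4"
}
```

For example, `restore_connect ~/.kube/target-cluster connect-new my-connect-2 my-connect.gz` would restore the backup as `my-connect-2` in the `connect-new` namespace of the target cluster.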

Notes:

The restore process does not restore any data the original Connect cluster had stored in its Kafka cluster (such as offset information).
The restore process expects to do the restoration into a clean environment and will currently fail if any of the resources already exist. This might be addressed in the future with the dry-run and force modes (see #11 for more details).

Exporting the resources from the backup

You can use the `strimzi-backup export` command to export the custom resources from the backup archive to separate YAML files. The export command uses the following options:

Option	Description	Default Value
`--filename`	Name of the file with the backup which should be exported. (Required)
`--target-directory`	The directory where the files should be exported. (Required)
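
A minimal sketch of an export could look as follows. The function name `export_backup` and the `STRIMZI_BACKUP` variable are hypothetical, and the `--option value` form is an assumption. The sketch creates the target directory up front, because it is not stated here whether the tool creates it for you.

```shell
# Path to the strimzi-backup binary (hypothetical variable);
# override it if the binary is not on your PATH.
STRIMZI_BACKUP="${STRIMZI_BACKUP:-strimzi-backup}"

# export_backup FILE TARGET_DIRECTORY
# Exports the resources from the backup FILE into TARGET_DIRECTORY.
export_backup() {
  # Make sure the target directory exists before exporting into it.
  mkdir -p "$2" &&
    "$STRIMZI_BACKUP" export \
      --filename "$1" \
      --target-directory "$2"
}
```

For example, `export_backup my-cluster.gz ./my-cluster-export` would leave the exported YAML files in `./my-cluster-export`, where you can inspect them before a restore.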

Future Plans

There are several features I plan to add in the future. The major ones are:

Support for data backup for Kafka clusters
Support for backing up into a ConfigMap or Secret to allow running the tool from a CronJob
Tests 🙄

Frequently Asked Questions

Does Strimzi Backup support ZooKeeper-based clusters?

No, Strimzi Backup supports only KRaft-based Apache Kafka clusters. There are currently no plans to support ZooKeeper-based clusters.

Any plans to support other Strimzi resources?

Currently, support is planned only for Apache Kafka and Apache Kafka Connect clusters, which consist of multiple custom resources and (in the case of Apache Kafka clusters) use persistent volumes to store data. Other resources, such as MirrorMaker 2 or the Bridge, are stateless and consist of a single custom resource, so you can back them up with `kubectl get ... -o yaml` and do not need any special tools.
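
As a sketch of the `kubectl` approach, the helper below prints a single custom resource as YAML so that you can redirect it to a file. The function name `dump_resource` and the `KUBECTL` variable are hypothetical. The resource kinds in the usage example assume the Strimzi `KafkaMirrorMaker2` and `KafkaBridge` custom resources.

```shell
# Path to the kubectl binary (hypothetical variable);
# override it if kubectl is not on your PATH.
KUBECTL="${KUBECTL:-kubectl}"

# dump_resource KIND NAME NAMESPACE
# Prints the custom resource as YAML on standard output.
dump_resource() {
  "$KUBECTL" get "$1" "$2" --namespace "$3" -o yaml
}
```

For example, `dump_resource kafkamirrormaker2 my-mm2 kafka > my-mm2.yaml` or `dump_resource kafkabridge my-bridge kafka > my-bridge.yaml`. Note that the output of `kubectl get -o yaml` still contains the status and server-generated metadata, which you may want to remove before applying the file elsewhere.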
