RabbitMQ Node sometimes fails with nxdomain on startup #958

## Describe the bug
I am not sure whether this is a bug or just a question.
Sometimes the RabbitMQ pods restart at random. When they restart, they produce the following log.

This example is from `rabbitmq-server-1`:
```
2022-02-03 16:22:01.153148+00:00 [warn] <0.130.0> cluster_formation.randomized_startup_delay_range.min and cluster_formation.randomized_startup_delay_range.max are deprecated

BOOT FAILED
===========
2022-02-03 16:22:01.563990+00:00 [erro] <0.130.0>
2022-02-03 16:22:01.563990+00:00 [erro] <0.130.0> BOOT FAILED
2022-02-03 16:22:01.563990+00:00 [erro] <0.130.0> ===========
2022-02-03 16:22:01.563990+00:00 [erro] <0.130.0> ERROR: epmd error for host rabbitmq-server-1.rabbitmq-nodes.mynamespace: nxdomain (non-existing domain)
2022-02-03 16:22:01.563990+00:00 [erro] <0.130.0>
ERROR: epmd error for host rabbitmq-server-1.rabbitmq-nodes.mynamespace: nxdomain (non-existing domain)

2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>     supervisor: {local,rabbit_prelaunch_sup}
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>     errorContext: start_error
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>     reason: {epmd_error,"rabbitmq-server-1.rabbitmq-nodes.mynamespace",
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>                         nxdomain}
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>     offender: [{pid,undefined},
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>                {id,prelaunch},
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>                {mfargs,{rabbit_prelaunch,run_prelaunch_first_phase,[]}},
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>                {restart_type,transient},
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>                {significant,false},
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>                {shutdown,5000},
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>                {child_type,worker}]
2022-02-03 16:22:02.564897+00:00 [erro] <0.130.0>
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>   crasher:
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>     initial call: application_master:init/4
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>     pid: <0.128.0>
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>     registered_name: []
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>     exception exit: {{shutdown,
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>                          {failed_to_start_child,prelaunch,
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>                              {epmd_error,
2022-02-03 16:22:02.565649+00:00 [erro] <0.128.0>                                  "rabbitmq-server-1.rabbitmq-nodes.mynamespace",
```

It looks to me as if the RabbitMQ pod starts before CoreDNS can resolve the pod's hostname. I also find the message `cluster_formation.randomized_startup_delay_range.min and cluster_formation.randomized_startup_delay_range.max are deprecated` confusing.
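One way to check the timing hypothesis would be to poll DNS for the node's hostname and count how many lookups fail before the name becomes resolvable. The following is only a minimal sketch: `wait_for_dns` is a hypothetical helper, not part of RabbitMQ or the Cluster Operator, and it assumes it is run from inside the pod's network namespace so that it uses the same resolver as the node.

```python
import socket
import time


def wait_for_dns(hostname, timeout=30.0, interval=1.0,
                 resolve=socket.getaddrinfo, sleep=time.sleep,
                 clock=time.monotonic):
    """Poll until `hostname` resolves or `timeout` seconds elapse.

    Returns the number of failed lookups before the first success.
    Raises TimeoutError if the name does not resolve in time.
    `resolve`, `sleep` and `clock` are injectable so the loop can be
    exercised without real DNS (an assumption made for testability).
    """
    deadline = clock() + timeout
    failures = 0
    while True:
        try:
            # Port is irrelevant here; only name resolution matters.
            resolve(hostname, None)
            return failures
        except socket.gaierror:
            # Covers "name not known" style errors such as NXDOMAIN.
            failures += 1
            if clock() >= deadline:
                raise TimeoutError(
                    f"{hostname} did not resolve within {timeout}s"
                )
            sleep(interval)
```

For example, calling `wait_for_dns("rabbitmq-server-1.rabbitmq-nodes.mynamespace")` shortly after the pod starts and getting a non-zero return value would suggest that the hostname was not yet resolvable when the node booted.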

## Version and environment information
- RabbitMQ image: 3.9.10-management
- RabbitMQ Cluster Operator: 1.11.1
- Kubernetes: 1.21.7

Questions:
- Why does `rabbitmq-server-1` try to resolve its own hostname via the cluster's CoreDNS?
- Could it be that RabbitMQ needs a delay to make sure that the hostname is resolvable?
- If `cluster_formation.randomized_startup_delay_range.min` is deprecated, what is the replacement option?
