[bitnami/rabbitmq] RabbitMQ: explain how to safely avoid a deployment deadlock #25931

michaelklishin · 2024-05-15T15:23:56Z

Explain the problem and a widely used solution
instead of recommending force booting nodes.

These practices are generally used by existing
RabbitMQ K8S Cluster Operators but this PR
intentionally does not recommend any of them.

References #25698, #16081, #25916.

Explain the problem and a widely used solution instead of recommending force booting nodes. These practices are generally used by existing RabbitMQ K8S Cluster Operators but this PR intentionally does not recommend any of them.

Signed-off-by: Michael Klishin <michael@clojurewerkz.org>
Signed-off-by: Michael Klishin <klishinm@vmware.com>
Signed-off-by: Michael Klishin <mikhail.klishin@broadcom.com>

Commit a suggested edit to bitnami/rabbitmq/README.md

Co-authored-by: Carlos Rodríguez Hernández <carlosrh@vmware.com>
Signed-off-by: Michael Klishin <michaelklishin@icloud.com>

michaelklishin · 2024-05-23T22:15:08Z

@rafariossaa @carrodher @javsalgar is there anything else I can do to help push this forward? Not having this aspect documented is a big deal for RabbitMQ users.

michaelklishin · 2024-06-01T00:14:39Z

Bump. This is still as relevant and important to document as before.

carrodher · 2024-06-03T22:31:18Z

Thanks @michaelklishin and sorry for the delay. Our team will review and provide feedback.
Your contribution is greatly appreciated!

rafariossaa · 2024-06-04T12:59:55Z

bitnami/rabbitmq/README.md


-This happens if the pod management policy of the statefulset is not `Parallel` and the last pod to be running wasn't the first pod of the statefulset. If that happens, update the pod management policy to recover a healthy state:
+The following combination of deployment settings avoids the problem:


Hi,
Is there a process a user could follow once they are already in this situation?
I mean, I am not sure whether a user who has already deployed without the parameters you indicate here could fix the issue just by upgrading the deployment with those parameters, or whether they would need to perform other steps.

michaelklishin · 2024-06-04 (edited)

@rafariossaa change to the recommended deployment parameters and change the readiness probe. The absolute minimum would be to use `rabbitmq-diagnostics ping` for the readiness probe, but both recommendations can be applied at the same time in a single deployment update.

The goal of this PR is to recommend the safe setup from the start.
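
For illustration, the two recommendations above could be combined in one Helm values override roughly as follows. This is a minimal sketch, not the chart's documented configuration: the key names `podManagementPolicy`, `readinessProbe.enabled`, and `customReadinessProbe` are assumptions about how the chart exposes these settings, and the probe timings are placeholders. Check the chart's `values.yaml` before using any of it.

```yaml
# Hypothetical values override; key names are assumptions, verify against the chart.

# Start all cluster nodes at the same time instead of one by one,
# so no node waits forever for a peer that has not been scheduled yet.
podManagementPolicy: Parallel

# Assumed switch that turns off the chart's built-in readiness probe.
readinessProbe:
  enabled: false

# Assumed hook for a user-defined probe: `rabbitmq-diagnostics ping`
# only checks that the node's runtime is up and reachable.
customReadinessProbe:
  exec:
    command:
      - rabbitmq-diagnostics
      - ping
  initialDelaySeconds: 10   # illustrative timings, not recommendations
  periodSeconds: 30
  timeoutSeconds: 20
```

One caveat for existing deployments: Kubernetes does not allow `podManagementPolicy` to be changed on an existing StatefulSet, so applying this to a release that is already deployed may require recreating the StatefulSet rather than a plain upgrade.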

github-actions bot added rabbitmq triage Triage is needed labels May 15, 2024

github-actions bot assigned javsalgar May 15, 2024

github-actions bot requested a review from javsalgar May 15, 2024 15:24

michaelklishin mentioned this pull request May 15, 2024

[bitnami/rabbitmq] clustering force boot should be true by default #16081

Closed

javsalgar changed the title ~~RabbitMQ: explain how to safely avoid a deployment deadlock~~ [bitnami/rabbitmq] RabbitMQ: explain how to safely avoid a deployment deadlock May 16, 2024

javsalgar requested review from carrodher and removed request for javsalgar May 16, 2024 07:42

javsalgar assigned carrodher and unassigned javsalgar May 16, 2024

carrodher added the in-progress label May 17, 2024

github-actions bot removed the triage Triage is needed label May 17, 2024

github-actions bot unassigned carrodher May 17, 2024

github-actions bot removed the request for review from carrodher May 17, 2024 06:36

github-actions bot assigned dgomezleon May 17, 2024

github-actions bot requested a review from dgomezleon May 17, 2024 06:36

carrodher removed the request for review from dgomezleon May 17, 2024 06:37

carrodher assigned rafariossaa May 17, 2024

carrodher requested a review from rafariossaa May 17, 2024 06:37

carrodher unassigned dgomezleon May 17, 2024

rafariossaa reviewed Jun 4, 2024
