Over the last couple of months I’ve been seeing this error in a Workspace ONE Access cluster (WSA/vIDM) installed in a VCF 4 environment:
“There was a problem with Messaging service. Error retrieving rabbitmq status.”
WSA System Diagnostics Dashboard (node 3)
WSA was initially installed as v3.3.4 on VCF 4.2. Over the last couple of weeks we upgraded the environment to VCF v4.3.1.0 and, along with that, WSA to v3.3.5. We were told this version should fix the problem, since v3.3.4 somehow shipped with a number of wrong ownership and permission settings for rabbitmq. And looking at the WSA v3.3.5 Release Notes, this is indeed listed as one of the fixed issues.

But upgrading WSA to v3.3.5 did not solve the problem for us. We still had a node complaining that it could not retrieve the rabbitmq status, so I started digging into the problem. Logging in to the cluster nodes over ssh and checking the cluster status gave me a good clue:
# WSA Node 1
root@xreg-wsa1n1 [ ~ ]# rabbitmqctl cluster_status
Cluster status of node rabbitmq@xreg-wsa1n1 ...
Basics
Cluster name: rabbitmq@xreg-wsa1n1.<domain.name>
Disk Nodes
rabbitmq@xreg-wsa1n1
Running Nodes
rabbitmq@xreg-wsa1n1
Versions
rabbitmq@xreg-wsa1n1: RabbitMQ 3.8.9 on Erlang 23.1
Maintenance status
Node: rabbitmq@xreg-wsa1n1, status: not under maintenance
Alarms
(none)
Network Partitions
(none)
Listeners
Node: rabbitmq@xreg-wsa1n1, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbitmq@xreg-wsa1n1, interface: [::], port: 5700, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbitmq@xreg-wsa1n1, interface: 127.0.0.1, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Feature flags
Flag: drop_unroutable_metric, state: enabled
Flag: empty_basic_get_metric, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: maintenance_mode_status, state: disabled
Flag: quorum_queue, state: enabled
Flag: virtual_host_metadata, state: enabled
# WSA Node 2
# Node 2 not shown here, it had same OK output as node 1
# WSA Node 3
root@xreg-wsa1n3 [ ~ ]# rabbitmqctl cluster_status
12:00:14.844 [error] Cookie file /root/.erlang.cookie must be accessible by owner only
12:00:15.626 [error] Cookie file /root/.erlang.cookie must be accessible by owner only
^C
As you can see, there was still a permission problem.
Checking the permissions on nodes 1, 2 and 3, I found that the cookie file on node 3 had mode “640” (-rw-r-----) while it should be “400” (-r--------):
# On all nodes it was a symlink:
root@xreg-wsa1n1 [ ~ ]# ls -l /root/.erlang.cookie
lrwxrwxrwx 1 root root 32 Oct 22 10:46 /root/.erlang.cookie -> /var/lib/rabbitmq/.erlang.cookie
# On node 1 (and 2):
root@xreg-wsa1n1 [ ~ ]# ls -l /var/lib/rabbitmq/.erlang.cookie
-r-------- 1 rabbitmq rabbitmq 20 Jan 22 2021 /var/lib/rabbitmq/.erlang.cookie
# On node 3:
root@xreg-wsa1n3 [ ~ ]# ls -l /var/lib/rabbitmq/.erlang.cookie
-rw-r----- 1 rabbitmq rabbitmq 20 Jan 22 2021 /var/lib/rabbitmq/.erlang.cookie
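The same check can be scripted so it does not depend on eyeballing ls -l output. The following is only a minimal sketch, not something shipped with WSA: the function names cookie_mode and cookie_is_owner_only are made up for illustration, and it assumes GNU stat is available on the appliance. It treats any group or other permission bit as a failure, which matches the wording of the Erlang error message.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of WSA or RabbitMQ).

# Print the octal mode of a file, following symlinks (assumes GNU stat).
cookie_mode() {
    stat -L -c '%a' "$1"
}

# Succeed only if group and other have no permission bits set.
cookie_is_owner_only() {
    cookie_mode_value=$(cookie_mode "$1") || return 2
    # The leading 0 makes the shell read the value as octal;
    # masking with 077 isolates the group/other bits.
    [ $(( 0$cookie_mode_value & 077 )) -eq 0 ]
}
```

Running cookie_is_owner_only /root/.erlang.cookie && echo OK on each node should print OK where the cookie is owner-only; with the 640 mode found on node 3 it would print nothing.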
So correcting the permission and restarting rabbitmq solved the problem. Here is how (as root):
root@xreg-wsa1n3 [ ~ ]# chmod 400 /var/lib/rabbitmq/.erlang.cookie
root@xreg-wsa1n3 [ ~ ]# rabbitmq-server -detached &
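If you would rather not run chmod blindly on every node, the permission fix can be wrapped so it only touches the file when the mode is actually wrong. This is a sketch under a few assumptions: fix_cookie_mode is a made-up name, GNU stat and chmod are assumed, and 400 is used as the target simply because that is what the healthy nodes had.

```shell
#!/bin/sh
# Hypothetical helper: set the cookie file to mode 400 only if needed.
# Both stat -L and chmod follow the /root/.erlang.cookie symlink to the real file.
fix_cookie_mode() {
    cookie=$1
    current=$(stat -L -c '%a' "$cookie") || return 2
    if [ "$current" = "400" ]; then
        echo "unchanged"
    else
        chmod 400 "$cookie" && echo "fixed (was $current)"
    fi
}
```

Called as fix_cookie_mode /var/lib/rabbitmq/.erlang.cookie, it reports whether anything was changed; rabbitmq still has to be restarted afterwards as shown above.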
And the System Diagnostics Dashboard now showed ‘Integrated Components’ as all green:

WSA Dashboard (dropdown) > System Diagnostics Dashboard
You can of course also run the status command from the CLI to check that all is good:
root@xreg-wsa1n3 [ ~ ]# rabbitmqctl cluster_status
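To avoid reading the whole status output by hand, the "Running Nodes" section can be pulled out with a small filter. This is a rough sketch that assumes the plain-text layout shown in the node 1 output earlier (a "Running Nodes" header followed by one node name per line); running_nodes is a made-up helper name, and a different RabbitMQ version may format the output differently.

```shell
#!/bin/sh
# Hypothetical helper: read cluster_status output on stdin and print
# only the node names listed under the "Running Nodes" header.
running_nodes() {
    awk '
        /^Running Nodes$/ { in_section = 1; next }   # start of the section
        in_section && /@/ { print; next }            # node names contain "@"
        in_section && NF  { in_section = 0 }         # next header ends the section
    '
}
```

Usage would be rabbitmqctl cluster_status | running_nodes, run on the node you just repaired, and checking that the node's own name appears in the result.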