ansible installation: apigee-service wait_for_ready - timeout

Hello Folks,

I'm installing a 5-node cluster via ansible (https://github.com/apigee/ansible-install) and I keep running into a timeout when it runs the task "/opt/apigee/apigee-service/bin/apigee-service", "edge-message-processor", "wait_for_ready". The error is: "message-processor failed to answer on 127.0.0.1 port 8082."

However, if I log in to the server directly, I can run these commands manually and they work fine. I even increased the timeout in ansible.cfg, but it didn't make a difference. Has anyone else run into this issue?

Thanks,

Henry


I followed these guidelines to make sure the network is set up properly: https://docs.apigee.com/private-cloud/v4.19.01/installation-requirements#networksetting

Essentially, hostname and hostname -i yield the expected results, and all ports are open among the servers. Also, SELinux is set to permissive.
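For anyone repeating the hostname check on several nodes, here is a minimal sketch of how it could be scripted. The function names has_loopback and check_hostname_ip are made up for illustration and are not part of Apigee or ansible; the only assumption is that `hostname -i` should not report a loopback address.

```shell
# Hypothetical helper: succeed if any address in the argument list is a
# loopback address (127.x.x.x or ::1).
has_loopback() {
  local addr
  for addr in "$@"; do
    case "$addr" in
      127.*|::1) return 0 ;;
    esac
  done
  return 1
}

# Hypothetical wrapper: report what `hostname -i` resolves to on this host.
check_hostname_ip() {
  local addrs
  addrs="$(hostname -i 2>/dev/null)" || { echo "hostname -i failed"; return 2; }
  # Intentionally unquoted so multiple addresses split into separate arguments.
  if has_loopback $addrs; then
    echo "WARN: hostname -i includes a loopback address: $addrs"
    return 1
  fi
  echo "OK: hostname -i -> $addrs"
}
```

Running check_hostname_ip on each node (directly, or through whatever ansible task you already use to run commands) should show whether any of them resolves its own hostname to a loopback address.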

hostname -i returns the IP of the server and not 127.0.0.1, so I'm not sure where this "127.0.0.1" is coming from. I keep getting the same error:

"management-server failed to answer on 127.0.0.1 port 8080"

The 127.0.0.1 comes from the action file for the wait_for_ready action:

/opt/apigee/edge-message-processor/lib/actions/wait_for_ready

The localhost address is hard-coded into the script. That means it is meant to be run locally on an MP node, which explains why it executes successfully when you log into the node.
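To see independently whether anything answers on the loopback address of a given node, a port probe along these lines may help. This is only a sketch and not the actual wait_for_ready script: port_open is a hypothetical helper, and it assumes bash (with /dev/tcp support) and timeout from coreutils are available on the node.

```shell
# Hypothetical helper (not part of apigee-service): report whether a TCP port
# accepts connections on a given address, using bash's /dev/tcp redirection.
port_open() {
  local host="$1" port="$2"
  # Run the connect attempt in a child bash, capped at 3 seconds by timeout(1).
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} closed"
    return 1
  fi
}
```

On a message processor node you would call it as `port_open 127.0.0.1 8082`; on any node that does not run a message processor the same call should report the port as closed, which matches a check that only makes sense locally.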

Because you're using ansible scripts, ansible will also run the commands on specific target servers. For example, when I execute the wait_for_ready command from my jumpbox (the node where I run ansible tasks) against my first MP node, n2, I get output like this:

$ ansible n2 -a "apigee-service edge-message-processor wait_for_ready"
n2 | CHANGED | rc=0 >>
Checking for message-processor on 127.0.0.1 port 8082   OK
Checking for message-processor uuid  1d0e22bd-e268-41c5-a679-af4ee42f4795
Checking if message-processor is up message-processor is up.

It could be that your ansible run targets the wrong node (i.e., n1, n4, or n5) to execute the edge-message-processor wait_for_ready command, and so it eventually fails.

There is not enough context in your question to explain what went wrong, i.e., which ansible script or command causes the problem.

Thanks for your response, @Developer Edge. Based on https://docs.apigee.com/private-cloud/v4.19.01/install-edge-components-node#specifyingthecomponentst... here's the topology I'm using for my 5-node cluster.

apigee_topology:
- dc-1 node0 ds,ms
- dc-1 node1 ds,rmp
- dc-1 node2 ds,rmp
- dc-1 node3 ps,qs
- dc-1 node4 ps,qs

I'm following the doc: https://docs.apigee.com/private-cloud/v4.19.01/installation-topologies

The message processors run on n1 and n2, but the management server runs on n0. It stands to reason that if the 'self-check' needs to run, it should run on the management server, as that's where port 8080 should be listening.

The commands:

/opt/apigee/apigee-service/bin/apigee-service edge-management-server restart

/opt/apigee/apigee-service/bin/apigee-service edge-management-server wait_for_ready

run successfully from the server directly, but not via ansible.

Hmmm, now you've switched from edge-message-processor to edge-management-server.

Still, the same logic applies. Ansible does nothing but run a command on a remote node. So if you can run this command locally, you should be able to run it remotely via ansible with the same success.

Can you show the output of the following command, please? Change n1 to whatever you called the node with ds and ms.

ansible n1 -a "apigee-service edge-management-server wait_for_ready " -vvv