How to sanity-check and fix the deployment status of Apigee Edge servers

Can anyone please advise me on how to check the deployment status on Apigee Edge for Private Cloud and fix it as needed?

When we check the deployment status with:

curl -u <admin user>:<admin passwd> http://localhost:8080/v1/organizations/orgname/deployments

the response sometimes includes an "error" entry like this:

{ 
      "error" : "Call timed out; either server is down or server is not reachable",
      "status" : "error",
      "type" : [ "router" ],
      "uUID" : "6185be60-ff6d-401f-ba5c-026e42695a1d" 
},... 

We found that this is caused by configuration entries for Routers or Message Processors that do not actually exist in the system; the entries were possibly left behind by problems during an earlier setup. We now want to fix this by deleting the unnecessary configuration.
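To see at a glance which UUIDs are reporting errors, the /deployments response can be filtered. The following is only a minimal sketch: the function name list_error_uuids is made up for illustration, and it assumes the response is pretty-printed with one key per line and with "status" appearing before "uUID" inside each server entry, as in the fragment quoted above.

```shell
#!/usr/bin/env bash
# list_error_uuids (hypothetical helper): read a /deployments response on
# stdin and print the uUID of every server entry whose status is "error".
# Assumes one key per line and "status" before "uUID" in each entry.
list_error_uuids() {
  awk '
    # Remember that the current entry reported an error.
    /"status"[ \t]*:[ \t]*"error"/ { in_error = 1 }
    /"uUID"[ \t]*:/ {
      if (in_error) {
        line = $0
        sub(/.*"uUID"[ \t]*:[ \t]*"/, "", line)  # drop everything up to the value
        sub(/".*/, "", line)                      # drop the closing quote and the rest
        print line
      }
      in_error = 0  # the uUID line is treated as the end of an entry
    }
  '
}
```

It could be used as `curl -s -u <admin user>:<admin passwd> http://localhost:8080/v1/organizations/orgname/deployments | list_error_uuids`. If jq is available, a JSON-aware filter would be more robust than this line-based one.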

According to the 'Apigee Edge Operations Guide', we can delete the nonexistent servers above by following the steps in the section 'Removing a Server (Management Server/Message Processor/Router)':

  1. (Message Processor only) Deregister Message Processor from the organization's environments
  2. Deregister server’s type
  3. Delete the server

Our questions are:
  1. Is there any command other than the /deployments API call above for checking what the deployment status of the servers should be?
  2. Once we have identified which servers are needed and which are not, are any procedures required beyond the 3 steps above, or do these Management Server API calls make all the necessary ZooKeeper/Cassandra configuration changes?

I would appreciate your help with this.

1 ACCEPTED SOLUTION

remeeshnair
Participant IV

We recently faced a similar issue with some orphaned UUIDs and used the steps below to clean them up.

  • You can check your server registrations with the script below.

/opt/apigee4/contrib/registration-overview.sh [-p <admin-password>] <region> <pod>

This lists all registered nodes with their UUIDs.

  • On the Management Server, you can check each UUID to see whether the corresponding component is present by running the command below (the Management Server port defaults to 8080)

curl -v -u <admin userid>:<password> http://localhost:<MS port>/v1/servers/<UUID>
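When there are many UUIDs to check, the same call can be looped. This is a rough sketch only: check_servers, MS_URL and ADMIN_CRED are names invented here, and the default credential is a placeholder. The thread does not say which HTTP status a missing component returns, so the sketch just prints the status code for each UUID and leaves the interpretation to you.

```shell
#!/usr/bin/env bash
# MS_URL and ADMIN_CRED are hypothetical variable names, not Apigee settings.
MS_URL="${MS_URL:-http://localhost:8080}"
ADMIN_CRED="${ADMIN_CRED:-admin@example.com:password}"  # placeholder credential

# check_servers (hypothetical helper): read one UUID per line on stdin and
# print "<UUID> <HTTP status code>" for the /v1/servers/<UUID> call.
check_servers() {
  while IFS= read -r uuid; do
    [ -n "$uuid" ] || continue  # skip blank lines
    # -o /dev/null discards the body; -w '%{http_code}' prints only the status.
    code="$(curl -s -o /dev/null -w '%{http_code}' \
      -u "$ADMIN_CRED" "$MS_URL/v1/servers/$uuid")" || true
    printf '%s %s\n' "$uuid" "$code"
  done
}
```

For example, `check_servers < uuids.txt`, where uuids.txt is a file you create with one UUID per line.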

  • Remove the server from the organization's environments

curl -v -X POST "http://localhost:<MS port>/v1/o/<ORG_NAME>/environments/<ENV_NAME>/servers" -d "uuid=<UUID>&region=<DC_NAME>&pod=<GATEWAY_NAME>&action=remove" -u <admin uid>:<password>

  • Remove the server’s type on the Management Server

curl -v -X POST "http://localhost:<MS port>/v1/servers" -d "type=<TYPE_NAME>&region=<DC_NAME>&pod=<GATEWAY_NAME>&uuid=<UUID>&action=remove" -u <admin uid>:<password>

  • Delete the server on the Management Server

curl -v -X DELETE "http://localhost:<MS port>/v1/servers/<UUID>" -u <admin uid>:<password>

  • Verify that the UUID has been removed by running the registration-overview.sh script again.
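The three removal calls above can be wrapped into one function so they always run in the same order. Treat this as a sketch under assumptions, not a tested procedure: remove_server, run_or_print, MS_URL, ADMIN_CRED and DRY_RUN are names made up for this example, the default credential is a placeholder, and the type string "message-processor" is assumed to be what your installation reports for Message Processors. With DRY_RUN=1 the commands are only printed, which is worth doing first.

```shell
#!/usr/bin/env bash
# MS_URL, ADMIN_CRED and DRY_RUN are hypothetical names, not Apigee settings.
MS_URL="${MS_URL:-http://localhost:8080}"
ADMIN_CRED="${ADMIN_CRED:-admin@example.com:password}"  # placeholder credential

# run_or_print: with DRY_RUN=1 only print the command, otherwise execute it.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# remove_server <uuid> <type> <region> <pod> [<org> <env>]
remove_server() {
  local uuid="$1" type="$2" region="$3" pod="$4" org="${5:-}" env="${6:-}"

  # Step 1 (Message Processors only): deregister from the environment.
  # Skipped when no org/env is given. The type string is an assumption.
  if [ "$type" = "message-processor" ] && [ -n "$org" ] && [ -n "$env" ]; then
    run_or_print curl -s -f -u "$ADMIN_CRED" -X POST \
      "$MS_URL/v1/o/$org/environments/$env/servers" \
      -d "uuid=$uuid&region=$region&pod=$pod&action=remove" || return 1
  fi

  # Step 2: deregister the server's type (-f makes curl fail on HTTP errors).
  run_or_print curl -s -f -u "$ADMIN_CRED" -X POST "$MS_URL/v1/servers" \
    -d "type=$type&region=$region&pod=$pod&uuid=$uuid&action=remove" || return 1

  # Step 3: delete the server.
  run_or_print curl -s -f -u "$ADMIN_CRED" -X DELETE "$MS_URL/v1/servers/$uuid"
}
```

For example, `DRY_RUN=1 remove_server <UUID> router <DC_NAME> <GATEWAY_NAME>` prints the two calls for a Router without sending anything. The function stops at the first failing call so that a server is not deleted while it is still registered.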

You can also remove old snapshot files from ZooKeeper; see the section "Removing Old Snapshot Files" in the OPDK Apigee Edge On-Premises Operations Guide.

It is also a good idea to remove the ZooKeeper tree paths associated with these UUIDs (using the rmr command). Hope this helps.

Regards,

Remeesh

Note: if the commands above show ® or ∾, replace ® with &reg and ∾ with &ac (no space), so that the parameters read &region and &action.

@Remeesh

Thank you so much for the detailed and useful information, which should help us as well. We will try the steps you explained and come back if we have any further questions.

@Remeesh

The customer wants to know whether we need to restart the Routers/Message Processors and Management Servers after removing the server information as described above. Could you please answer this question?

remeeshnair
Participant IV

Yes, I think it's safe to do a rolling restart.

Thank you for the confirmation. I also heard from GSC that older versions have an issue with ZK bindings and that we need to restart to avoid it.