Apigee Hybrid cluster failing the check-ready step during runtime installation

Hi,

I am setting up a hybrid cluster on AWS following the instructions provided in https://cloud.google.com/apigee/docs/hybrid/v1.8/install-hybrid-runtime

Most of the steps have gone through fine. However, after Step 8 of the runtime installation, the readiness check fails with the following error:

Error: ready check failed: apigeeorganization golden-bloom-11-76313e8 is not running

Here is the command executed:

${APIGEECTL_HOME}/apigeectl check-ready -f overrides/overrides.yaml

Can anyone provide some pointers for troubleshooting this?


Can you check which pod is failing and show that pod's logs, using the commands below?
```
kubectl get pods -n apigee
kubectl logs {replace-me-with-pod-name} -n apigee
```

Was this issue resolved? I have the same issue when upgrading from 1.7.4 to 1.8. It fails at this step: "Apply your overrides to upgrade the org-level components (MART, Watcher and Apigee Connect) and check completion" - https://cloud.google.com/apigee/docs/hybrid/v1.8/upgrade#outside-google-cloud

All pods are running. The only pod that failed is the one in the external-secrets namespace.

The information in this thread is not enough to point in any direction. Could you show some error logs from the controller in the apigee-system namespace, or error logs from the components that are failing in the apigee namespace?

@kidiyoor - Looking for help. I am getting this error at install-hybrid-runtime while setting up a hybrid cluster on Azure, when checking the ready status of the deployment:

Error: ready check failed: apigeeorganization ups-ux-ops-20b1e11 is not running

This happens even though all pods are up.

kubectl get apigeeorganization
NAME                         STATE                        AGE
ups-ux-op              creating                 118m

Could you run the commands below?
1. `kubectl get ad -n apigee`
2. `kubectl get pods -n apigee`
3. If any pods are failed, crashing, or pending: `kubectl describe pod -n apigee podname`
4. Get the logs of the failing pods: `kubectl logs -n apigee podname`
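
To pick out the pods worth describing, a small filter over the `kubectl get pods` output can help. This is only a sketch: the function name `non_running_pods` is made up for illustration, and it assumes the default tabular output where STATUS is the third column. It will not catch pods that are Running but not fully ready.

```shell
# Hypothetical helper: print pods whose STATUS is neither Running nor Completed.
# Reads the default tabular output of `kubectl get pods` on stdin.
non_running_pods() {
  # NR > 1 skips the header row; $1 is NAME, $3 is STATUS
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (requires cluster access):
#   kubectl get pods -n apigee | non_running_pods
```

Each name it prints can then be passed to the `kubectl describe pod` and `kubectl logs` commands from steps 3 and 4.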

1.PNG

One of the logger pods is in the Pending state; all other pods are running.

 

2.PNG

1. Could you describe the logger pod that is pending?
`kubectl describe pod podname -n apigee`

2. Since the logger is not part of the ApigeeOrg CR and all the other pods are running, I don't see why ApigeeOrg is still in the creating state. Here are a few things that can help:
a) Can you describe the ApigeeOrg CR?
`kubectl describe apigeeorg -n apigee ups-ux-op`
b) Try restarting the controller pod by deleting it; a new pod should come up.
c) Can you check the controller logs for any obvious errors?
`kubectl logs -n apigee-system replace-me-with-controller-pod-name`
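
The controller logs can be long, so it may help to first see which resources the errors refer to. Here is a rough sketch, assuming the controller writes one JSON object per line with `"level"` and `"Name"` fields; the function name `errors_by_resource` is hypothetical.

```shell
# Hypothetical helper: count error-level controller log lines per resource name.
# Reads JSON log lines on stdin and uses only grep/sed, so jq is not required.
errors_by_resource() {
  grep '"level":"error"' \
    | sed -n 's/.*"Name":"\([^"]*\)".*/\1/p' \
    | sort | uniq -c | sort -rn
}

# Usage (requires cluster access; the pod name is a placeholder):
#   kubectl logs -n apigee-system replace-me-with-controller-pod-name | errors_by_resource
```

If the controller pod has more than one container, `kubectl logs` may also need a `-c` flag with the container name.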

1. Logger pod: it is related to resource quota: 0/7 nodes are available: 1 Insufficient cpu. preemption:

2. I have described the apigeeorganization; it shows the status as creating.

3. Error logs in the Apigee controller:

"level":"error","ts":1678112406.074659,"caller":"common/error.go:91","msg":"error updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try againerror updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try again","controller":"ApigeeDeployment","kind":"ApigeeDeployment","apiVersion":"apigee.cloud.google.com/v1alpha3","Namespace":"apigee","Name":"apigee-redis-envoy-default","resourceVersion":"24066","uid":"b455e458-a826-4f68-b2a8-414407dbdc07","stacktrace":"edge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers/common.HandleError\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/common/error.go:91\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).updateADStatus\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:706\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).reconcileRSReplicaStatus\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:769\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).reconcileReplicaSet\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:115\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).Reconcile\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k
8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1678112406.0747695,"caller":"common/error.go:91","msg":"error retrieving replica status from ReplicaSet, error updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try againerror retrieving replica status from ReplicaSet, error updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try again","controller":"ApigeeDeployment","kind":"ApigeeDeployment","apiVersion":"apigee.cloud.google.com/v1alpha3","Namespace":"apigee","Name":"apigee-redis-envoy-default","resourceVersion":"24066","uid":"b455e458-a826-4f68-b2a8-414407dbdc07","stacktrace":"edge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers/common.HandleError\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/common/error.go:91\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).reconcileReplicaSet\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:117\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).Reconcile\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod
/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1678112406.1059897,"caller":"common/error.go:91","msg":"error creating apigee-redis object: failed to update resource apigee/apigee-redis-default: Operation cannot be fulfilled on certificates.cert-manager.io \"apigee-redis-default\": the object has been modified; please apply your changes to the latest version and try againerror creating apigee-redis object: failed to update resource apigee/apigee-redis-default: Operation cannot be fulfilled on certificates.cert-manager.io \"apigee-redis-default\": the object has been modified; please apply your changes to the latest version and try again","kind":"ApigeeRedis","apiVersion":"apigee.cloud.google.com/v1alpha1","Namespace":"apigee","Name":"default","resourceVersion":"24072","uid":"34932ed4-ffee-40b5-b4e6-30ec5ce2776a","stacktrace":"edge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers/common.HandleError\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/common/error.go:91\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).reconcileRedisComponent\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:256\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).reconcileRedis\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:134\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).Reconcile.func1\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:100\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).Reconcile\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:105\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pk
g/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1678112406.4012177,"caller":"common/error.go:91","msg":"error creating redis envoy apigeedeployment, failed to update resource apigee/apigee-redis-envoy-default: Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try againerror creating redis envoy apigeedeployment, failed to update resource apigee/apigee-redis-envoy-default: Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try again","kind":"ApigeeRedis","apiVersion":"apigee.cloud.google.com/v1alpha1","Namespace":"apigee","Name":"default","resourceVersion":"24072","uid":"34932ed4-ffee-40b5-b4e6-30ec5ce2776a","stacktrace":"edge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers/common.HandleError\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/common/error.go:91\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).reconcileRedisEnvoyComponent\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:359\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).reconcileRedis\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:137\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).Reconcile.func1\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:100\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeRedisReconciler).Reconcile\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeredis_controller.go:105\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Contr
oller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1678112408.4883757,"caller":"common/error.go:91","msg":"error updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-connect-agent-ups-ux-ops-20b1e11\": the object has been modified; please apply your changes to the latest version and try againerror updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-connect-agent-ups-ux-ops-20b1e11\": the object has been modified; please apply your changes to the latest version and try again","controller":"ApigeeDeployment","kind":"ApigeeDeployment","apiVersion":"apigee.cloud.google.com/v1alpha3","Namespace":"apigee","Name":"apigee-connect-agent-ups-ux-ops-20b1e11","resourceVersion":"24161","uid":"4071834b-db5f-4ce1-9941-f360885f00e1","stacktrace":"edge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers/common.HandleError\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/common/error.go:91\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).updateADStatus\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:706\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).reconcileRSReplicaStatus\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:769\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).reconcileReplicaSet\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:115\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).Reconcile\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Co
ntroller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1678112408.4885314,"caller":"common/error.go:91","msg":"error retrieving replica status from ReplicaSet, error updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-connect-agent-ups-ux-ops-20b1e11\": the object has been modified; please apply your changes to the latest version and try againerror retrieving replica status from ReplicaSet, error updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-connect-agent-ups-ux-ops-20b1e11\": the object has been modified; please apply your changes to the latest version and try again","controller":"ApigeeDeployment","kind":"ApigeeDeployment","apiVersion":"apigee.cloud.google.com/v1alpha3","Namespace":"apigee","Name":"apigee-connect-agent-ups-ux-ops-20b1e11","resourceVersion":"24161","uid":"4071834b-db5f-4ce1-9941-f360885f00e1","stacktrace":"edge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers/common.HandleError\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/common/error.go:91\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).reconcileReplicaSet\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:117\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeDeploymentReconciler).Reconcile\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeedeployment_controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Control
ler).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1678112408.50424,"caller":"common/error.go:91","msg":"Operation cannot be fulfilled on apigeeenvironments.apigee.cloud.google.com \"ups-ux-ops-test-d9e801c\": the object has been modified; please apply your changes to the latest version and try againOperation cannot be fulfilled on apigeeenvironments.apigee.cloud.google.com \"ups-ux-ops-test-d9e801c\": the object has been modified; please apply your changes to the latest version and try again","controller":"ApigeeEnvironment","kind":"ApigeeEnvironment","apiVersion":"apigee.cloud.google.com/v1alpha2","Namespace":"apigee","Name":"ups-ux-ops-test-d9e801c","resourceVersion":"24145","uid":"759a31b5-d294-41e5-9a83-f7ea69b7b7e0","stacktrace":"edge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers/common.HandleError\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/common/error.go:91\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeEnvironmentReconciler).reconcileEnv\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeenvironment_controller.go:214\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeEnvironmentReconciler).Reconcile.func1\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeenvironment_controller.go:127\nedge-internal.git.corp.google.com/k8s-controllers.git/provisioning/controllers.(*ApigeeEnvironmentReconciler).Reconcile\n\t/go/src/edge-internal/k8s-controllers/provisioning/controllers/apigeeenvironment_controller.go:132\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Contr
oller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/internal/controller/controller.go:234"}

3.PNG

1. Logger pod: it is related to resource quota: 0/7 nodes are available: 1 Insufficient cpu. preemption:

2a. Describing apigeeorg shows the state as creating.

2c. Controller error:

error updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try againerror updating ApigeeDeployment status, Operation cannot be fulfilled on apigeedeployments.apigee.cloud.google.com \"apigee-redis-envoy-default\": the object has been modified; please apply your changes to the latest version and try again","controller":"ApigeeDeployment","kind":"ApigeeDeployment","

Can you provide the full output of describe apigeeorg?

I am trying to attach the full output, but the forum is not allowing me to paste it.

I see. We want to look at the status to see which component within ApigeeOrg is still being created.
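
If the full output is too large to paste, trimming it to the status block may be enough. A minimal sketch, assuming the resource is dumped with `-o yaml` and has a top-level `status:` key, as Kubernetes objects generally do; the function name `status_section` is made up here.

```shell
# Hypothetical helper: keep only the top-level status block of a YAML dump.
# Prints from the line starting with "status:" through the end of the input.
status_section() {
  sed -n '/^status:/,$p'
}

# Usage (requires cluster access; the org name is a placeholder):
#   kubectl get apigeeorg -n apigee replace-me-with-org-name -o yaml | status_section
```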

Connect agent, MART, UDCA, and Watcher: the state of all of these components is succeeded. The final state is still showing creating.

4.PNG

@kidiyoor We now see MART in the creating state, but the MART pod is up and running.

amanpurwar_0-1678353547931.png

amanpurwar_1-1678353639563.png

 

MART has succeeded now, but the final state is still creating.

The issue is fixed now! I was stuck on bug 243167389. After updating the ingress gateway name in the configuration YAML file, it worked!

This solved my issue. Thanks a lot!
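
For anyone checking the same thing, here is one way to see what ingress gateway name the overrides file currently carries. It is a sketch under assumptions: it assumes the name lives under a top-level `ingressGateways:` key in the overrides file, and `show_ingress_gateways` is a made-up function name. The thread does not say what the corrected name was, so this only shows the current value.

```shell
# Hypothetical helper: show the ingressGateways stanza of an overrides file.
# $1: path to the overrides file. Prints the key line plus the 5 lines after it,
# with line numbers, so the configured name is easy to spot.
show_ingress_gateways() {
  grep -n -A 5 '^ingressGateways:' "$1"
}

# Usage:
#   show_ingress_gateways overrides/overrides.yaml
```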

Hi

Were you able to install Apigee hybrid on AWS? Were you using EKS or a custom cluster on EC2?