Ingress failure - NetworkEndpointGroup not getting generated

I have created an HTTPS 'gce-internal' Ingress that routes traffic coming into my Anthos cluster. The Ingress was working as expected until today, but it suddenly started throwing the following error: Error syncing to GCP: error running backend syncing routine: googleapi: Error 404: The resource 'projects/xxxxxx/zones/europe-west2-a/networkEndpointGroups/xxxxxx' was not found. The strangest part is that the error appears for an API that was already working fine under the same Ingress. I have tried different approaches to fix the issue, but no luck so far. The following is the manifest for the Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-name
  annotations:
    kubernetes.io/ingress.regional-static-ip-name: staticip
    kubernetes.io/ingress.class: "gce-internal"
    kubernetes.io/ingress.allow-http: "false"
    ingress.gcp.kubernetes.io/pre-shared-cert: certname
spec:
  rules:
  - host: hostname
    http:
      paths:
      - path: /pathofapi/*
        pathType: ImplementationSpecific
        backend:
          service:
            name: apiname-v1
            port:
              number: 8080


The following is the manifest for my Service:

apiVersion: v1
kind: Service
metadata:
  name: apiname-v1
  labels:
    app: apiname
    version: v1
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/backend-config: '{"default": "backend-configuration"}'
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: apiname
    version: v1

Hi @iamnitprakash,

Based on the error you're getting, there seems to be a version mismatch. I have no access to your project, so it would help if you added logs to your question. The logs would also show what changed in your configuration shortly before the error "Error syncing to GCP: error running backend syncing routine:" first appeared.


For now, I suggest deleting the existing Ingress and creating a new one.

I started seeing this error yesterday on a regular GKE cluster after rolling out a new container image for a StatefulSet. This cluster has been running in production for about 2 years, and this is the first time I have seen this error.

Here's what I've observed:

  • When I looked at the Ingress events, I saw the error message: Error syncing to GCP: error running backend syncing routine: googleapi: Error 404: The resource 'projects/xxxxx/zones/europe-west1-b/networkEndpointGroups/xxxxx' was not found, notFound.
  • I confirmed in GCP console that the NEG was never created.
  • I confirmed that a ServiceNetworkEndpointGroup resource was created but its lastSyncTime field was always null.
  • When I looked at the Service resource events, I saw this error message: error processing service "xxxxx/xxxxx": NEG syncer for xxxxx/xxxxxx/80-8000-GCE_VM_IP_PORT-L7 is shutting down.

I tried to delete the Ingress as suggested, but that didn't help. I even tried deleting the Service, the StatefulSet and the Ingress, waited until all the Load Balancer resources were not visible anymore on GCP console, recreated the resources and still got the exact same error. Strangely it's only this Service that's having this error; all other services work perfectly.

What's even stranger is that I was finally able to work around the issue by creating a Service with a different name - this new Service is exactly the same as the old Service except for the metadata.name field. If I try to create the Service with the old name, the problem happens again.
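
To make the rename workaround concrete, here is a minimal sketch. My own Service manifest is not posted here, so this reuses the Service from the original question as a stand-in, and the new name apiname-v1-renamed is a made-up example. Everything except metadata.name is unchanged.

```yaml
apiVersion: v1
kind: Service
metadata:
  # Hypothetical new name; the only field that differs from the old Service.
  name: apiname-v1-renamed
  labels:
    app: apiname
    version: v1
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/backend-config: '{"default": "backend-configuration"}'
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: apiname
    version: v1
```

Because an Ingress refers to its backends by Service name, the backend.service.name entry in the Ingress would also have to point at the new name. I can't say why the rename helps or whether it works in every case; it is only what worked on my cluster.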

I am also seeing issues with NEGs not being created for a service I have had for a long time.
