APIGEE Router behind ELB


Hello All,

I have a 10-node cluster; 4 of the nodes are routers sitting behind an ELB in AWS. I have 2 environments listening on 2 different ports, 9001 and 9002. All the routers are showing as unhealthy on the ELB. I am using the health check "HTTP:9001/hc".

Any idea what I am doing wrong? Here are my virtual host details:

{
  "hostAliases" : [ "vh1.example.com:9001" ],
  "interfaces" : [ ],
  "name" : "default",
  "port" : "9001"
}

{
  "hostAliases" : [ "vh2.example.com:9002" ],
  "interfaces" : [ ],
  "name" : "default",
  "port" : "9002"
}


Former Community Member

@Paul Mibus are you able to help? @Satyajit Roy Choudhury why does the hostAlias on your second virtual host reference port 9001 instead of vh2.example.com:9002?

@Prithpal Bhogill sorry, that was a typo.

{
  "hostAliases" : [ "vh1.example.com:9001" ],
  "interfaces" : [ ],
  "name" : "default",
  "port" : "9001"
}

{
  "hostAliases" : [ "vh2.example.com:9002" ],
  "interfaces" : [ ],
  "name" : "default",
  "port" : "9002"
}

Former Community Member

Sometimes whether the router appears available depends on the Message Processor downstream. Consider checking the router log file (e.g. /opt/apigee4/var/log/apigee/router/logs/system.log) to see if any "Mark Down" messages appear.

There is a troubleshooting section in the Operations Guide that you should have received. Another option is to open a support ticket with Apigee Support.
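
For anyone following along, the log check suggested above could be scripted roughly as follows. This is only a sketch: the default path is the example from the post and may differ on your install, and `count_mark_down` is a hypothetical helper name, not an Apigee tool.

```shell
#!/bin/sh
# Default log path is the example given in this thread; adjust for your install.
ROUTER_LOG="${ROUTER_LOG:-/opt/apigee4/var/log/apigee/router/logs/system.log}"

# Hypothetical helper: print how many lines of a router log contain "Mark Down".
count_mark_down() {
  # grep -c prints the number of matching lines. It exits non-zero when
  # there are no matches, so "|| true" keeps this usable under "set -e".
  grep -c "Mark Down" "${1:-$ROUTER_LOG}" || true
}

# Usage (run on a router node):
#   count_mark_down
#   count_mark_down /path/to/another/system.log
```

A non-zero count would suggest the router has marked a Message Processor down at some point, which is worth following up before looking at the ELB itself.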

@Prithpal Bhogill I don't see a "Mark Down" error, but I do see something like this:

router  nioEventLoopGroup-2-8 ERROR Proxy-session - RouterProxySession$ServerContext.onException() : Message Id: apigee-web.example.com_BUD6DWk8_RouterProxy-6-1080_1 Exception on Server channel null while message was in progress, cause: java.lang.NullPointerException

> Have you deployed a proxy that listens on /hc for health checks?

> The ELB expects a 200 success code; if that API returns anything else, the ELB shows the instance as unhealthy. [If you have not deployed anything for /hc, it would return 404.]

> If you do not have any synthetic transactions, you could also use http://router:8081/servers/self/reachable as the health check endpoint in your ELB.
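
Before changing anything on the ELB, it may help to see what each router actually returns for the two candidate endpoints mentioned above. The following is a minimal sketch, assuming curl is installed; the function names and the router address in the usage comments are hypothetical.

```shell
#!/bin/sh
# Build a health check URL from host, port and path (all caller-supplied).
hc_url() {
  printf 'http://%s:%s%s\n' "$1" "$2" "$3"
}

# Print only the HTTP status code for a URL (curl prints 000 if no
# response was received at all).
http_status() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Per the reply above, the ELB treats only a 200 as healthy.
is_healthy() {
  [ "$1" = "200" ]
}

# Usage (router1.example.com is a placeholder for one of your routers):
#   http_status "$(hc_url router1.example.com 9001 /hc)"
#   http_status "$(hc_url router1.example.com 8081 /servers/self/reachable)"
```

If the /hc call prints 404 or 503 while the 8081 call prints 200, the router process is probably up and the problem is more likely in the proxy, virtual host, or Message Processor setup.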

@Mukundha Madhavan yes, I have the proxy deployed. The same setup works in TEST with a 2-node setup, but not in PROD with the 10-node setup.


I think this might be the problem:

curl -u ${ADMIN_EMAIL}:${ADMIN_PASSWORD} http://${MGMTURL}:8080/v1/o/${ORG}/e/prod
{
  "createdAt" : XXXXXXXXXXX,
  "createdBy" : "email@example.com",
  "lastModifiedAt" : XXXXXXXXXXX,
  "lastModifiedBy" : "email@example.com",
  "name" : "prod",
  "properties" : {
    "property" : [ {
      "name" : "useSampling",
      "value" : "100"
    }, {
      "name" : "samplingThreshold",
      "value" : "100000"
    }, {
      "name" : "samplingTables",
      "value" : "10=ten;1=one;"
    }, {
      "name" : "samplingAlgo",
      "value" : "reservoir_sampler"
    }, {
      "name" : "samplingInterval",
      "value" : "300000"
    }, {
      "name" : "aggregationinterval",
      "value" : "300000"
    } ]
  }
}

But when I do this

curl -u ${ADMIN_EMAIL}:${ADMIN_PASSWORD} http://${MGMTURL}:8080/v1/o/${ORG}/e/prod/servers/
[ ]

If I am not wrong then I should get some UUIDs. Please advise...

Yes, this could be the problem: you do not have any MPs (Message Processors) for the env. Typically MPs are added to the env when you create it [the script asks you to choose them]. Anyway, no problem, you can add them now.

> Try v1/servers?pod=gateway&type=message-processor

It should list all your MPs [make sure the pod name is correct; the default is gateway].

> For each MP UUID, you can do a

POST http://${MGMTURL}:8080/v1/o/${ORG}/e/prod/servers

uuid=<uuid>

to add the MP to the env.

> Finally, check http://${MGMTURL}:8080/v1/o/${ORG}/e/prod/servers

to make sure it lists the MPs.

> Once done, try HTTP:9001/hc and make sure it is working; the ELB should then show a healthy status as well.
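
The steps above could be wrapped in a few shell functions along these lines. This is a sketch, not a verified procedure: it assumes ADMIN_EMAIL, ADMIN_PASSWORD, MGMTURL and ORG are set as in the curl commands earlier in the thread, and ENV_NAME plus all the function names are made up for illustration. The POST body follows the `uuid=<uuid>` form given above; depending on your Edge version the management API may expect additional form fields, so check the install docs for your release.

```shell
#!/bin/sh
# ENV_NAME is a hypothetical variable; it defaults to the "prod" env from the thread.
ENV_NAME="${ENV_NAME:-prod}"

# URL of the environment's server list on the management server.
env_servers_url() {
  printf 'http://%s:8080/v1/o/%s/e/%s/servers\n' "$MGMTURL" "$ORG" "$ENV_NAME"
}

# Step 1: list Message Processors in a pod (default pod name is "gateway").
list_mps() {
  curl -s -u "${ADMIN_EMAIL}:${ADMIN_PASSWORD}" \
    "http://${MGMTURL}:8080/v1/servers?pod=${1:-gateway}&type=message-processor"
}

# Step 2: add one MP, identified by UUID, to the environment.
# curl -d sends the body form-encoded.
add_mp_to_env() {
  curl -s -u "${ADMIN_EMAIL}:${ADMIN_PASSWORD}" -X POST \
    -d "uuid=$1" "$(env_servers_url)"
}

# Step 3: confirm the environment now lists the MP UUIDs.
list_env_servers() {
  curl -s -u "${ADMIN_EMAIL}:${ADMIN_PASSWORD}" "$(env_servers_url)"
}

# Usage:
#   list_mps                      # note the UUIDs in the output
#   add_mp_to_env <uuid>          # repeat for each MP UUID
#   list_env_servers              # should no longer print [ ]
```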

This appears to be a part of the issue. The other problem may be the :<port> suffix that you have on your host aliases. The host alias list should match the string that the client passes in the Host header, which doesn't include port information (that's implicit through the TCP connection). So in this case your host aliases should simply be vh1.example.com and vh2.example.com.
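
One way to check which alias form a router accepts is to send the health check request with an explicit Host header and compare status codes. A rough sketch follows; `strip_port` and `probe_alias` are hypothetical helpers, and the router address in the usage comments is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: drop a trailing ":port" from a host alias, if present.
strip_port() {
  printf '%s\n' "${1%:*}"
}

# Request /hc on a router with an explicit Host header and print the
# HTTP status code. Arguments: router address, port, Host header value.
probe_alias() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' \
    -H "Host: $3" "http://$1:$2/hc"
}

# Usage (10.0.0.5 stands in for one of your routers):
#   probe_alias 10.0.0.5 9001 "$(strip_port vh1.example.com:9001)"
#   probe_alias 10.0.0.5 9001 vh1.example.com:9001
```

If the first call prints 200 and the second does not, that would be consistent with the alias needing to match the Host header without the port suffix.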

@Paul Mibus here is a screenshot from the Apigee online docs. Removing the port numbers actually worked for me.

[Screenshot: 1447-2015-11-04-15-15-55.png]

Thanks for the heads up; we'll get that fixed!

I can fix the doc. But, from the doc:

"While the port number is optional, it is recommended that you specify it. Or, you can specify two <HostAlias> elements, one with the port number and one without."

So should the port number only be specified if you pass it in the Host header?

Stephen


Quick questions:

Is this Private Cloud?

What version?

When was it installed?

Was it just updated?