Deployment of API Proxy is failing with error "revision is deployed, flow may be impaired"?

Hi Apigeeks,

We are facing this issue during deployment and undeployment of API proxies.

I checked the MP and MS logs and found that the errors below are logged whenever a proxy is deployed, and the deployment only half-completes:

On the Message Processor:

2017-10-24 11:22:43,267  qtp1532139270-693 ERROR REST - ExceptionMapper.toResponse() : Error occurred : null
  org.apache.cxf.jaxrs.JAXRSInvoker.checkResultObject(JAXRSInvoker.java:336)
  org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:211)
  org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:241)
  org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:241)

On the Management Servers:

2017-10-24 15:30:10,148 org:enbd env:dev-inbound  qtp553759818-831 ERROR DISTRIBUTION - RemoteServicesDeploymentHandler.deployToServers() : RemoteServicesDeploymentHandler.deployToServers : Deployment exception for server with uuid 383b427d-748d-49d0-bfe1-519eeb9e49fc : cause = Call timed out
com.apigee.rpc.RPCException: Call timed out
  at com.apigee.rpc.impl.AbstractCallerImpl.handleTimeout(AbstractCallerImpl.java:64) ~[rpc-1.0.0.jar:na]
  at com.apigee.rpc.impl.RPCMachineImpl$OutgoingCall.handleTimeout(RPCMachineImpl.java:483) ~[rpc-1.0.0.jar:na]
  at com.apigee.rpc.impl.RPCMachineImpl$OutgoingCall.access$000(RPCMachineImpl.java:402) ~[rpc-1.0.0.jar:na]
  at com.apigee.rpc.impl.RPCMachineImpl$OutgoingCall$1.run(RPCMachineImpl.java:437) ~[rpc-1.0.0.jar:na]
  at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:532) ~[netty-all-4.0.0.CR1.jar:na]
  at io.netty.util.HashedWheelTimer$Worker.notifyExpiredTimeouts(HashedWheelTimer.java:430) ~[netty-all-4.0.0.CR1.jar:na]
  at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:371) ~[netty-all-4.0.0.CR1.jar:na]


2017-10-24 15:27:52,642 org:enbd  qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for beanSharedFlow-Logging
2017-10-24 15:27:52,645 org:enbd  qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for beanShared-Preflow
2017-10-24 15:27:52,662 org:enbd  qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for beanSharedFlow-UnknownResourceFlow
2017-10-24 15:27:52,667 org:enbd  qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for beanSharedFlow-AddSecurityHeaders
2017-10-24 15:27:52,671 org:enbd  qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for beanSharedFlow-DefaultFaultHandling
2017-10-24 15:27:52,674 org:enbd  qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for beanSharedFlow-Logging
2017-10-24 15:27:52,678 org:enbd  qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for beanShared-Preflow

Does anyone here have an idea why we get these kinds of errors?

It would be really helpful!

Thanks

1 ACCEPTED SOLUTION

Hi All,

After some troubleshooting with the support teams, we figured out that the problem was with the MessageLogging policy.

We were doing file logging with the <Message/> element's content set from a variable. Whenever this variable turns out to be null or empty, the MessageLogging policy hangs the thread.

As a result, all of the threads on the MP end up stuck in WAITING/TIMED_WAITING state. This is a bug already logged with Apigee, and they are working on it.
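If you suspect this state, a thread dump of the MP process will show it. A sketch, assuming you can locate the MP's PID (the `pgrep` pattern is a guess for a typical installation; adjust it to match your process name):

```shell
# Count MP threads parked in WAITING/TIMED_WAITING
# (TIMED_WAITING contains the substring WAITING, so one grep covers both).
jstack "$(pgrep -f message-processor)" | grep -c 'WAITING'
```

A count close to the MP's worker thread pool size is a strong hint that the pool is exhausted.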

To reproduce this issue, add a MessageLogging policy in the PostClientFlow with an empty or null <Message/> element. Make some 15 to 20 calls and check the MP logs. You will find "Buffer size 0" errors for FILE_LOGGER.
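For illustration, a minimal file-logging policy of the kind described (the variable name `flow.custom.var` is a made-up placeholder; when it resolves to null or empty, the <Message/> content becomes empty and can trigger the hang):

```xml
<!-- Sketch only: can reproduce the hang when flow.custom.var is null/empty -->
<MessageLogging name="ML-Repro">
  <File>
    <Message>{flow.custom.var}</Message>
    <FileName>repro.log</FileName>
  </File>
</MessageLogging>
```

Attach it in the PostClientFlow as described above and watch the MP logs.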

As a side effect, you'll find that API deployments do not work on an MP that is in this hung state. You will also see 504 gateway timeouts from the Router, as the MP is unable to serve requests.

The workaround we came up with is to hard-code some string in the <Message/> element, or add {messageid} as a prefix to the log variable. This ensures that a non-empty string is always written to the logs.
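The workaround looks like this (again, `flow.custom.var` is a placeholder for whatever variable you log):

```xml
<!-- Workaround sketch: the {messageid} prefix guarantees a non-empty message -->
<MessageLogging name="ML-Log">
  <File>
    <Message>{messageid} {flow.custom.var}</Message>
    <FileName>api.log</FileName>
  </File>
</MessageLogging>
```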

Hope this helps!


10 REPLIES

Not applicable

What do you mean by half deployment? Do you mean you can't see the revision number, or that you can see the revision number but it isn't deployed even after multiple deployments? Also, did you create a simple empty proxy and check whether it gets deployed? @Mohammed Zuber

Hi @Raunak Narooka,

Half deployment is when your proxy is deployed on only some of the MPs or Routers. Yes, I created a simple proxy and tried deploying it, and got the same result.

ylesyuk
Participant V

Hi @Mohammed Zuber

The two most common reasons for this problem are:

a) networking infrastructure problems, i.e. lack of connectivity between components (e.g. R/MP, CS/ZK);

b) an Edge deployment component problem, e.g. one of the MPs is down.

Depending on the situation, you may need to restart the MS.


If/when your deployment and infrastructure are eventually OK, it is sometimes necessary to force-undeploy the faulty proxies.
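Force-undeploy can be done through the Edge management API. A sketch with placeholder values ({org}, {env}, {api}, {rev}, MS_HOST and the credentials are all assumptions; substitute the values for your installation):

```shell
# Force-undeploy a stuck revision (note the force=true query parameter).
curl -u sysadmin@example.com -X DELETE \
  "http://MS_HOST:8080/v1/organizations/{org}/environments/{env}/apis/{api}/revisions/{rev}/deployments?force=true"
```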

Yes, connectivity between components might well be the problem... But I'm still curious to know what you mean by half deployment...

Thanks @ylesyuk,

The issue got resolved after a restart of the MS, MPs and Routers. Still, I wonder how this happened in the first place. Restarting the cluster cannot always be the solution.

Completely agree.

Restarting a cluster is not a proper method, especially in Production. Even restarting individual components should not be required without a strong reason.

Based on what you said, it looks like you did not see that any component was down?

The right way is to capture monitoring events and the history of the components, and investigate the point(s) where a deployment fails. This way we should be able to identify the root cause, e.g. a temporary loss of connection between components, or an unresponsive component.
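One concrete check during such an investigation is the deployments endpoint of the management API, which reports per-server deployment status. A sketch with placeholder values (MS_HOST, credentials, {org}, {api} and {rev} are assumptions):

```shell
# List which servers (MPs/Routers) report the revision as deployed.
curl -u sysadmin@example.com \
  "http://MS_HOST:8080/v1/organizations/{org}/apis/{api}/revisions/{rev}/deployments"
```

Any server whose status is not "deployed" is a candidate for the half-deployment.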

It is hard to speculate without being able to reproduce the problem.

rajeevyes
Participant II

@Raunak Narooka when the proxy is not deployed successfully on some MPs in the cluster, the deployment status circle shows half green. That's what he means by half deployment, I guess.

Oh, I didn't know about this. Thanks for the info...


Hi All, I am facing exactly the same issue when I use a JavaCallout policy. Has anyone resolved this?

It would be really helpful.