Hi Apigeeks,
We are facing an issue while deploying and undeploying API proxies.
I checked the Message Processor (MP) and Management Server (MS) logs and found that the errors below are logged whenever a proxy is deployed, and the result is only a half deployment:
On the Message Processor:
2017-10-24 11:22:43,267 qtp1532139270-693 ERROR REST - ExceptionMapper.toResponse() : Error occurred : null
    org.apache.cxf.jaxrs.JAXRSInvoker.checkResultObject(JAXRSInvoker.java:336)
    org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:211)
    org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:241)
    org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:241)
On the Management Server:
2017-10-24 15:30:10,148 org:enbd env:dev-inbound qtp553759818-831 ERROR DISTRIBUTION - RemoteServicesDeploymentHandler.deployToServers() : RemoteServicesDeploymentHandler.deployToServers : Deployment exception for server with uuid 383b427d-748d-49d0-bfe1-519eeb9e49fc : cause = Call timed out
com.apigee.rpc.RPCException: Call timed out
    at com.apigee.rpc.impl.AbstractCallerImpl.handleTimeout(AbstractCallerImpl.java:64) ~[rpc-1.0.0.jar:na]
    at com.apigee.rpc.impl.RPCMachineImpl$OutgoingCall.handleTimeout(RPCMachineImpl.java:483) ~[rpc-1.0.0.jar:na]
    at com.apigee.rpc.impl.RPCMachineImpl$OutgoingCall.access$000(RPCMachineImpl.java:402) ~[rpc-1.0.0.jar:na]
    at com.apigee.rpc.impl.RPCMachineImpl$OutgoingCall$1.run(RPCMachineImpl.java:437) ~[rpc-1.0.0.jar:na]
    at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:532) ~[netty-all-4.0.0.CR1.jar:na]
    at io.netty.util.HashedWheelTimer$Worker.notifyExpiredTimeouts(HashedWheelTimer.java:430) ~[netty-all-4.0.0.CR1.jar:na]
    at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:371) ~[netty-all-4.0.0.CR1.jar:na]
2017-10-24 15:27:52,642 org:enbd qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for bean SharedFlow-Logging
2017-10-24 15:27:52,645 org:enbd qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for bean Shared-Preflow
2017-10-24 15:27:52,662 org:enbd qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for bean SharedFlow-UnknownResourceFlow
2017-10-24 15:27:52,667 org:enbd qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for bean SharedFlow-AddSecurityHeaders
2017-10-24 15:27:52,671 org:enbd qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for bean SharedFlow-DefaultFaultHandling
2017-10-24 15:27:52,674 org:enbd qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for bean SharedFlow-Logging
2017-10-24 15:27:52,678 org:enbd qtp553759818-833 ERROR DISTRIBUTION - ApplicationDistributionServiceImpl.matchSubType() : Unable to read deployment bean from revision for bean Shared-Preflow
Does anyone here have an idea why we are getting these kinds of errors?
It would be really helpful!
Thanks
Hi All,
After some troubleshooting with the support team, we figured out that the problem was with the MessageLogging policy.
We were doing file logging with the <Message/> element containing only a variable. Whenever this variable turns out to be null or empty, the MessageLogging policy hangs the thread.
As a result, all of the threads on the MP ended up stuck in the WAITING/TIMED_WAITING state. This is a bug already logged with Apigee, and they are working on it.
To reproduce this issue, add a MessageLogging policy with an empty or null <Message/> element in the PostClientFlow. Make some 15 to 20 calls and check the MP logs; you will find "Buffer size 0" errors for FILE_LOGGER.
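For reference, a minimal policy along these lines reproduced the hang for us. This is only an illustrative sketch (the policy name, variable name, and file name are placeholders, not our exact configuration); the key point is that the <Message/> content is a single flow variable that can resolve to null or empty:

```xml
<!-- Hypothetical repro: file logging where Message can resolve to empty.
     Attach this policy as a Step in the proxy's PostClientFlow. -->
<MessageLogging name="ML-Repro">
  <File>
    <!-- {some.variable} stands in for any flow variable that may be
         null/empty when the policy executes -->
    <Message>{some.variable}</Message>
    <FileName>repro.log</FileName>
  </File>
</MessageLogging>
```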
As a side effect, you'll find that API deployments do not work on the MP that is in the hung state. You will also see 504 Gateway Timeout responses from the Router, because the MP cannot serve requests.
The workaround we came up with is to hardcode some string in the <Message/> element, or to add {messageid} as a prefix to the logged variable. This ensures that some string is always written to the log.
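Concretely, the workaround only changes the <Message/> element (the variable name here is illustrative):

```xml
<!-- Before: can hang the thread when log.content is null/empty -->
<Message>{log.content}</Message>

<!-- After: the {messageid} prefix guarantees a non-empty string is
     always written, even when log.content resolves to nothing -->
<Message>{messageid} {log.content}</Message>
```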
Hope this helps!
What do you mean by half deployment? Do you mean you can't see the revision number, or that you can see the revision number but it doesn't show as deployed even after multiple deployments? Also, did you create a simple empty proxy and check whether it gets deployed? @Mohammed Zuber
Hi @Raunak Narooka,
Half deployment is when your proxy is deployed on only some of the MPs or Routers. Yes, I created a simple proxy, tried deploying it, and got the same result.
The two most common reasons for this problem are:
a) a networking infrastructure problem, i.e., lack of connectivity between components (Router/MP, Cassandra/ZooKeeper);
b) an Edge deployment component problem, e.g., one of the MPs being down.
Depending on the situation, you may need to restart the MS.
If/when your deployment and infrastructure are eventually OK, it is sometimes necessary to force-undeploy the faulty proxies.
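For Edge for Private Cloud, a force undeploy can be issued against the management API roughly as follows. Treat this as a sketch: the host, credentials, org, env, proxy name, and revision are all placeholders, and you should verify the exact endpoint against the docs for your Edge version:

```sh
# Force-undeploy revision 1 of "myproxy" from the "dev" environment
# (sysadmin credentials; MS_HOST is the Management Server host)
curl -u sysadmin@example.com -X DELETE \
  "http://MS_HOST:8080/v1/organizations/myorg/environments/dev/apis/myproxy/revisions/1/deployments?force=true"
```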
Yes, connectivity between components could well be the problem... But I'm still curious to know what you mean by half deployment...
Thanks @ylesyuk,
The issue was resolved after a restart of the MS, MPs, and Routers. Still, I wonder how this happened in the first place. Restarting the cluster cannot always be the solution.
Completely agree.
Restarting a cluster is not a proper method, especially in production. Even restarting individual components should not be required without a strong reason.
Based on what you said, it sounds like you did not observe any component being down?
The right way is to capture monitoring events and the history of the components, and to investigate the point(s) at which a deployment fails. That way we would be able to identify the root cause, e.g., a temporary loss of connectivity between components or an unresponsive component.
It is hard to speculate without being able to reproduce the problem.
@Raunak Narooka, when a proxy is not deployed successfully on some MPs in the cluster, the deployment status circle shows as half green. That's what he means by half deployment, I guess.
Oh, I didn't know about this. Thanks for the info!
Hi all, I am facing exactly the same issue when I use a JavaCallout policy. Has anyone resolved this issue?
Any help would be appreciated.