Qpid Server JVM Deadlock

yazinthufael
Participant II

Hello,

We're seeing a deadlock on the Qpid Server (QS) JVM. The Postgres server and the Postgres DB are both healthy and receiving traffic.

Extract from QS logs:

Error while extracting database product name - falling back to empty error codes org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: Data source is closed

Extract from Qpid Broker logs:

2017-08-24 04:02:12 [Broker] error Execution exception: resource-limit-exceeded: Maximum depth exceeded on ax-q-axgroup-001-consumer-group-001: current=[count: 7113, size: 1073740553], max=[count: 1000000, size: 1073741824] (/home/fliu/rpmbuild/BUILD/qpid-0.28/cpp/src/qpid/broker/Queue.cpp:1575)
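For anyone reading the broker line above, it helps to work out which of the two limits was actually hit. The snippet below is only a sketch that parses the figures from that log line (the function name `queue_headroom` is made up for illustration) and shows that the queue is far below its message-count limit but within about a kilobyte of its 1 GiB size limit.

```python
import re

# Figures copied from the broker log line quoted in this post.
LOG_LINE = (
    "resource-limit-exceeded: Maximum depth exceeded on "
    "ax-q-axgroup-001-consumer-group-001: "
    "current=[count: 7113, size: 1073740553], "
    "max=[count: 1000000, size: 1073741824]"
)

PATTERN = re.compile(
    r"current=\[count: (\d+), size: (\d+)\], max=\[count: (\d+), size: (\d+)\]"
)


def queue_headroom(line):
    """Return how many messages and bytes remain before each queue limit."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not contain queue depth figures")
    cur_count, cur_size, max_count, max_size = map(int, match.groups())
    return {
        "count_left": max_count - cur_count,
        "bytes_left": max_size - cur_size,
        # Integer average size of the messages currently queued.
        "avg_message_bytes": cur_size // cur_count,
    }


headroom = queue_headroom(LOG_LINE)
print(headroom)
```

With roughly 1.2 KB of space left and queued messages averaging about 150 KB, the next enqueue cannot fit, so it is the size limit, not the count limit, that the broker is rejecting on.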

No errors found on Postgres DB and Postgres Server.

Because of this deadlock, the Qpid broker queue has filled up, since QS is unable to read messages from it.

The issue was resolved by restarting QS. However, any help with the root cause analysis would be appreciated.

Thanks

Thufael


Not applicable

"Could not get JDBC Connection" and "Maximum depth exceeded" are clear indications of a connectivity issue.

Make sure there is connectivity between the Qpid box and the Postgres box over port 5432.

Do not limit your search to firewall rules. Also check iptables, SELinux, TCP Wrappers, and any applicable security software or appliance.
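A quick way to test reachability from the Qpid box is a plain TCP connect to the Postgres port. This is a minimal sketch, not an Apigee tool; the function name `can_connect` is an assumption for illustration. It only proves that the port accepts connections, not that the database login works.

```python
import socket


def can_connect(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it on the Qpid box with your own Postgres hostname, for example `can_connect("your-postgres-host")`, where the hostname is a placeholder.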

http://docs.apigee.com/private-cloud/latest/installation-requirements#firewallsandvirtualhosts-edgep...

yazinthufael
Participant II

@Maudrit, this issue is seen on our prod env, which has been running for more than a year. The issue was also resolved with a QS restart. The connectivity was always there; we're actively monitoring it, and there were no alerts for it during this time. Firewalls, SELinux and TCP Wrappers are all disabled, and we don't have any other security software in place. With our gateway handling close to 10k TPS, I've seen this issue surface a few times in the past, especially when volumes are high. If you notice, the error is about JDBC connection availability (from the pool) and not merely a connect timeout between QS and the PG DB. I would assume that QS is choking at a threshold. Is this a known issue/bug with Apigee? We're running 16.09.
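To illustrate the distinction I mean between pool availability and network connectivity, here is a toy model. It is purely hypothetical (`ToyPool` and its error classes are invented for illustration and are not Apigee's or any JDBC pool's actual code). It shows that a "closed" or "exhausted" pool fails inside the client process without ever touching the network.

```python
import queue


class PoolClosedError(Exception):
    """Raised when a connection is requested from a pool that was closed."""


class PoolExhaustedError(Exception):
    """Raised when no pooled connection frees up within the wait time."""


class ToyPool:
    """Toy stand-in for a connection pool; handles are just strings."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for n in range(size):
            self._idle.put(f"conn-{n}")
        self._closed = False

    def close(self):
        self._closed = True

    def borrow(self, wait_seconds=0.1):
        if self._closed:
            # Fails immediately in-process; no network call is attempted.
            raise PoolClosedError("Data source is closed")
        try:
            return self._idle.get(timeout=wait_seconds)
        except queue.Empty:
            # Every connection is checked out; the database may be healthy.
            raise PoolExhaustedError("no idle connection available") from None

    def give_back(self, conn):
        self._idle.put(conn)
```

In this model neither failure says anything about whether port 5432 is reachable, which is why connectivity monitoring would stay green while the application still cannot get a connection.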