Edge-UI and Management Server going down

Hi All,

The management server and edge-ui, which sit on the same server, are going down (though not frequently). The error in edge-ui.log is:

p.c.s.n.PlayDefaultUpstreamHandler - Exception caught in Netty java.lang.IllegalArgumentException: empty text

Looks like edge-management server is not logging enough to capture any error.

Full edge-ui error is attached.

Thanks!

edge-ui.png


Hi @kkhanna,

This error seems to occur when a Netty-based server receives a request without an HTTP version in the request line. These requests could be your health checks, where the HTTP version may not be included. For example, some F5 load balancer HTTP monitors don't include it by default. Please refer to the links below.

https://github.com/netty/netty/commit/b419bd152a7ea4946bff824ce1c6eff62d2c56ce

https://support.f5.com/csp/article/K13397

Updating the healthchecks with HTTP/1.1 version can resolve this issue. Hope this helps
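
To illustrate the difference, here is a minimal Python sketch that checks whether a monitor's send string carries an HTTP version on its request line. The function name, the host `example.invalid` and both sample send strings are made up for illustration; your actual monitor configuration will look different.

```python
import re

# A request line is "METHOD target" optionally followed by " HTTP/x.y".
REQUEST_LINE = re.compile(
    r"^(?P<method>[A-Z]+) (?P<target>\S+)(?: (?P<version>HTTP/\d\.\d))?$"
)


def has_http_version(send_string: str) -> bool:
    """Return True if the first line of the send string names an HTTP version."""
    lines = send_string.splitlines()
    first_line = lines[0].strip() if lines else ""
    match = REQUEST_LINE.match(first_line)
    return bool(match and match.group("version"))


# Hypothetical monitor send strings (not taken from any real configuration).
bare = "GET /\r\n"
versioned = "GET / HTTP/1.1\r\nHost: example.invalid\r\nConnection: close\r\n\r\n"

print(has_http_version(bare))
print(has_http_version(versioned))
```

A send string shaped like `bare` is the kind that would likely trigger the "empty text" exception, while one shaped like `versioned` should not.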

Thank you Sailendhran, will look into it.
But that can't bring down the edge-ui, can it?

No. This error shouldn't bring down edge-ui or the management server. These errors just fill up the disk. The outage is probably due to something else. Have you checked application.log or system.log?

After some more digging, I found that the edge-ui and management server processes are being killed to free up memory. Since devportal also resides on the same node, it's asking for more memory.

message log

httpd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
httpd cpuset=/ mems_allowed=0

** Then it kills processes one by one **

devportal error log

[Thu Jan 18 01:52:55 2018] [error] [client 172.18.3.129] PHP Fatal error:  Uncaught exception 'PDOException' with message 'SQLSTATE[HY000] [1040] Too many connections' in /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/Store/SQL.php:54\nStack trace:\n#0 /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/Store/SQL.php(54): PDO->__construct('mysql:host=...', '...', '...')\n#1 /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/Store.php(49): SimpleSAML_Store_SQL->__construct()\n#2 /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/SessionHandler.php(121): SimpleSAML_Store::getInstance()\n#3 /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/SessionHandler.php(39): SimpleSAML_SessionHandler::createSessionHandler()\n#4 /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/Session.php(148): SimpleSAML_SessionHandler::getSessionHandler()\n#5 /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/Session.php(398): SimpleSAML_Session->__construct(true)\n#6 /apps/apigee/apigee-d in /apps/apigee/apigee-devportal/private/simplesamlphp/lib/SimpleSAML/Store/SQL.php on line 54


Any thoughts?

Thanks!

Good find. It looks like the OS is running out of memory and the OOM killer is killing those processes. From the devportal log, MySQL is returning the error "Too many connections". Tuning MySQL's per-connection memory allocation or moving devportal to a separate instance might help. I don't think having devportal, edge-ui and management server all on the same instance is supported in any of the installation topologies.
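
If it helps to see which processes trigger the OOM killer and which ones get killed, a rough Python sketch like the one below could summarize the kernel messages. The first sample line is the one posted above; the second is a hypothetical "Killed process" line in the format kernels commonly print, so the pid and process name in it are made up and the exact wording may differ on your kernel.

```python
import re
from collections import Counter

# "<name> invoked oom-killer" marks the process whose allocation triggered the OOM killer.
INVOKED = re.compile(r"(?P<name>\S+) invoked oom-killer")
# "Killed process <pid> (<name>)" marks the victim. Format is an assumption; it varies by kernel.
KILLED = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def summarize_oom(lines):
    """Count OOM-killer invokers and victims by process name."""
    invokers = Counter()
    victims = Counter()
    for line in lines:
        invoked = INVOKED.search(line)
        if invoked:
            invokers[invoked.group("name")] += 1
        killed = KILLED.search(line)
        if killed:
            victims[killed.group("name")] += 1
    return invokers, victims


SAMPLE = [
    # Taken from the message log posted in this thread.
    "httpd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0",
    # Hypothetical victim line, for illustration only.
    "Killed process 1234 (java) total-vm:0kB, anon-rss:0kB, file-rss:0kB",
]

invokers, victims = summarize_oom(SAMPLE)
print("invoked by:", dict(invokers))
print("killed:", dict(victims))
```

To run it against the real log, pass an open file instead of `SAMPLE`, for example `summarize_oom(open("/var/log/messages"))`, assuming that is where your distribution writes kernel messages.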

Hi,

Can you paste the number of file handles open at the time of the error?

Also the values from sysctl.conf and limits.conf.
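
One way to collect the file handle numbers, as a rough sketch: on Linux the kernel exposes the system-wide count in /proc/sys/fs/file-nr and the per-process descriptors under /proc/<pid>/fd. The helper names below are my own, and the `proc_root` parameter exists only so the functions can be pointed at a different directory.

```python
import os


def read_file_nr(proc_root="/proc"):
    """Return (allocated, unused, maximum) file handles from sys/fs/file-nr."""
    with open(os.path.join(proc_root, "sys", "fs", "file-nr")) as fh:
        allocated, unused, maximum = (int(field) for field in fh.read().split())
    return allocated, unused, maximum


def fd_counts(proc_root="/proc"):
    """Map each pid to its number of open file descriptors."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "sys" or "self"
        try:
            counts[int(entry)] = len(os.listdir(os.path.join(proc_root, entry, "fd")))
        except OSError:
            # The process exited, or we lack permission to read its fd directory.
            continue
    return counts


# Example usage on a Linux host (run as root to see every process):
#   print(read_file_nr())
#   worst = sorted(fd_counts().items(), key=lambda kv: kv[1], reverse=True)[:10]
#   print(worst)
```

Running this from cron every minute or so and keeping the output would show the counts around the time of the next failure.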

Regards

deverror.png

Currently:

# lsof | wc -l

7421


Can you guide me how to look for that in logs?

sysctl.conf

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
#enable core dumps for setuid processes - Added by Middleware to generate coredumps
fs.suid_dumpable = 2
#Core dump patterns - Added by Middleware to generate coredumps
kernel.core_pattern = core_%h_%e_%u_%g_%t_%p
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 4096
# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
#protect against buffer overflow attacks
kernel.exec-shield = 1
#Set the system flag to force randomized virtual memory region placement
kernel.randomize_va_space = 2
fs.suid_dumpable = 0 

limits.conf

#*               soft    core            0
*                hard   core          0
#*               hard    rss             10000

# End of file
* hard   core  0

Under overload, Apache is spawning many httpd processes, which leads to high CPU and memory usage. I have attached the devportal logs from that time. A huge number of requests came in for that resource.

I'm looking where those requests are coming from.
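
For anyone doing the same, a small Python sketch like this can tally the `[client ...]` field of the Apache error log to see which addresses send the most requests. The first two sample lines use the client address from the devportal log above with the message shortened; the third address (192.0.2.10) is made up. The pattern assumes IPv4 clients, with an optional `:port` suffix.

```python
import re
from collections import Counter

# Matches "[client 172.18.3.129]" and also "[client 172.18.3.129:51234]".
CLIENT = re.compile(r"\[client (?P<ip>\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?\]")


def count_clients(lines):
    """Count error-log lines per client IPv4 address."""
    counts = Counter()
    for line in lines:
        match = CLIENT.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


SAMPLE = [
    "[Thu Jan 18 01:52:55 2018] [error] [client 172.18.3.129] PHP Fatal error: ...",
    "[Thu Jan 18 01:52:55 2018] [error] [client 172.18.3.129] PHP Fatal error: ...",
    # Hypothetical line with a made-up client address.
    "[Thu Jan 18 01:52:56 2018] [error] [client 192.0.2.10] PHP Fatal error: ...",
]

for ip, hits in count_clients(SAMPLE).most_common():
    print(ip, hits)
```

Note that this only counts requests that produced an error; the access log would give the full picture.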

I appreciate all the help!

Thanks!