Connection timed out #50

Open
amilytan opened this issue May 8, 2020 · 6 comments

Comments

@amilytan

amilytan commented May 8, 2020

Connecting directly from our program fails with a timeout error: PHP Fatal error: Uncaught RedisException: Connection timed out
The cerberus error log is as follows:
2020-05-07 16:50:10,758 I 139821100205824 Start accepting - Acceptor(16@0x7f2aa6f830f8)
2020-05-07 16:50:10,758 W 139821100205824 Too many open files. Stop accepting from Acceptor(16@0x7f2aa6f830f8)
2020-05-07 16:50:10,758 W 139821100205824 version:0.8.0-2018-05-02
threads:8
cluster_ok:1
read_slave:0
clients_count:149,115,114,125,108,124,107,107
accepting:0,1,0,0,0,0,1,0
long_connections_count:0,0,0,0,0,0,0,0
used_cpu_sys:5016.74
used_cpu_user:5436.59
mem_buffer_alloc:5727980,5757925,6057566,5053367,5545437,6056406,6060881,5886104
completed_commands:31235142
total_process_elapse:16135
total_remote_cost:13903.9
last_command_elapse:0.000243825,0.0186795,0.0141551,0.000359905,0.0203328,0.00054563,0.000973783,0.000646515
last_remote_cost:0.000197395,0.0186091,0.0139204,0.000297576,0.0202605,0.000488308,0.000921514,0.000564282

At this point most of the accepting values are 0, and direct connections to the cerberus port stall. What could be the cause?

@zheplusplus
Contributor

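It looks like there are nearly 1000 connections. Could you check whether the system's maximum connection count (the open file limit) is still the default of 1024? Try raising the limit and then restarting the program.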
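The "nearly 1000" figure can be checked by summing the per-thread clients_count values from the status dump. The sketch below does that and compares the total against a limit of 1024. The helper names are hypothetical, and 1024 is only the default suggested in this comment, not a value read from the affected machine. Client sockets are also not the only descriptors a proxy holds (upstream connections, listening sockets, and log files count as well), so the real headroom would likely be smaller.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: sum the per-thread client counts from the status dump.
long total_clients(const std::vector<long>& per_thread) {
    return std::accumulate(per_thread.begin(), per_thread.end(), 0L);
}

// Hypothetical helper: descriptors left if `in_use` of `limit` are taken.
// A negative result means the limit would already be exceeded.
long remaining_descriptors(long limit, long in_use) {
    return limit - in_use;
}
```

With the values from the log (149, 115, 114, 125, 108, 124, 107, 107) the total is 949, which would leave 75 descriptors under a 1024 limit before counting anything other than client sockets.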

@amilytan
Author

amilytan commented May 8, 2020

ulimit -n is set to 65535
fs.file-max is set to 6509408
Does cerberus itself have a connection limit? I will try increasing the cerberus threads setting.

@amilytan
Author

amilytan commented May 12, 2020

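I have raised the system connection limit to 1024000, and lsof | wc -l reports about 20,000. The log still shows "Too many open files. Stop accepting from Acceptor(16@0x7f2aa6f830f8)", and direct connections to the cerberus service still stall.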
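Note that lsof without arguments lists open files for every process on the machine, so its line count does not say how many descriptors a single process holds. On Linux, the per-process number is the count of entries under /proc/<pid>/fd. A minimal sketch, assuming a Linux /proc filesystem and C++17; `count_open_fds` is a hypothetical helper, not part of cerberus:

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Count the entries in /proc/<pid>/fd (Linux-specific).
// Returns -1 if the directory cannot be read (no such process, no permission).
// For pid "self" the result includes the descriptor opened to read the directory.
long count_open_fds(const std::string& pid = "self") {
    namespace fs = std::filesystem;
    std::error_code ec;
    fs::directory_iterator it(fs::path("/proc") / pid / "fd", ec);
    if (ec) return -1;

    long count = 0;
    const fs::directory_iterator end;
    while (it != end) {
        ++count;
        it.increment(ec);
        if (ec) return -1;
    }
    return count;
}
```

Passing the cerberus process ID as a string, for example `count_open_fds("12345")` with a made-up PID, would give the figure to compare against that process's own limit.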

@zheplusplus
Contributor

The context for this log line is at

if (cfd == -1) {

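From the code, it looks like accept returned -1 and the operating system set errno to ENFILE or EMFILE...
My guess is that this process did not correctly inherit the operating system settings. Could you add some information to this log line to help debug it?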
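The two error codes point at different limits, which is why telling them apart helps here. The sketch below maps each code to the limit it refers to; `explain_accept_errno` is a hypothetical helper for illustration and does not exist in cerberus.

```cpp
#include <cerrno>
#include <string>

// Hypothetical helper: describe which limit an accept() failure refers to.
std::string explain_accept_errno(int err) {
    switch (err) {
        case EMFILE:
            // The calling process has hit its own descriptor limit (RLIMIT_NOFILE).
            return "EMFILE: per-process limit on open file descriptors reached";
        case ENFILE:
            // The kernel-wide open file table is full (fs.file-max on Linux).
            return "ENFILE: system-wide limit on open files reached";
        default:
            return "other error";
    }
}
```

With fs.file-max at 6509408 and roughly 20,000 entries reported by lsof, the system-wide case seems unlikely, which is consistent with the suspicion that the per-process limit is the one being hit.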

@amilytan
Author

amilytan commented Jun 8, 2020

I'm not sure how to add that. Could you give me some pointers? Thanks!

@zheplusplus
Contributor

You could try adding

#include <unistd.h>

and then changing that log line to

LOG(WARNING) << "Too many open files. Stop accepting from " << this->str() << " OPEN MAX=" << sysconf(_SC_OPEN_MAX);
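The reason this helps: sysconf(_SC_OPEN_MAX) reports the limit of the process that calls it, so logging it from inside cerberus shows what the running process actually inherited, which can differ from what ulimit -n prints in an interactive shell (for example when the service is started by an init system or supervisor). As a standalone sketch of the same check, assuming a POSIX system, the hypothetical helper below also includes the soft and hard RLIMIT_NOFILE values:

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_NOFILE
#include <unistd.h>        // sysconf, _SC_OPEN_MAX

#include <sstream>
#include <string>

// Hypothetical helper: format the calling process's open file limits.
std::string open_file_limits() {
    std::ostringstream out;
    out << "OPEN MAX=" << sysconf(_SC_OPEN_MAX);

    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) == 0) {
        // rlim_cur is the soft limit that accept() is checked against;
        // rlim_max is the ceiling the soft limit may be raised to.
        out << " soft=" << lim.rlim_cur << " hard=" << lim.rlim_max;
    }
    return out.str();
}
```

If the logged value turns out to be 1024 rather than 65535, the limit would need to be raised wherever the cerberus process is actually launched.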
