Improper docker client usage causes too many op

2020-05-06  一个大大大坑

dockerd causes too many open files

On a machine that had been up for about 5 days, docker was found to be unusable, and other operations on the host failed with too many open files.

docker version: 19.03
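
One way to confirm the symptom is to compare the number of file descriptors a process holds with its soft limit. The following is a minimal sketch, assuming a Linux host with /proc mounted; the helper names count_open_fds and parse_soft_nofile are made up for illustration and are not part of docker.

```python
import os


def count_open_fds(pid):
    """Count the file descriptors currently open in process `pid`.

    Assumes Linux: every open descriptor appears as an entry in /proc/<pid>/fd.
    Reading another user's process usually requires root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def parse_soft_nofile(limits_text):
    """Extract the soft 'Max open files' limit from the text of /proc/<pid>/limits.

    Returns an int, or None if the limit is 'unlimited' or the row is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def read_soft_nofile(pid):
    """Read and parse the soft open-file limit of process `pid`."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_soft_nofile(f.read())
```

For example, with the dockerd PID taken from the journal output (5082 on this machine), count_open_fds(5082) approaching read_soft_nofile(5082) would suggest that dockerd itself is the process leaking descriptors.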

Troubleshooting

# journalctl -xe -u docker.service -f
5月 06 10:57:23 node-2-76 dockerd[5082]: time="2020-05-06T10:57:23.395950238+08:00" level=error msg="1484f4ee9031a4e560c55ed4fdbc68689a668209f4b18b92350e3830969026ec cleanup: failed to delete container from containerd: no such container"
5月 06 10:57:24 node-2-76 dockerd[5082]: time="2020-05-06T10:57:24.387829239+08:00" level=warning msg="Unable to locate plugin: calico-ipam, retrying in 2s"
5月 06 10:57:26 node-2-76 dockerd[5082]: time="2020-05-06T10:57:26.388193568+08:00" level=warning msg="Unable to locate plugin: calico-ipam, retrying in 4s"
5月 06 10:57:30 node-2-76 dockerd[5082]: time="2020-05-06T10:57:30.388593845+08:00" level=warning msg="Unable to locate plugin: calico-ipam, retrying in 8s"

Then dump the stacks of all goroutines through dockerd's pprof endpoint:

# curl --unix-socket /var/run/docker.sock "http://./debug/pprof/goroutine?debug=2" > docker.trace

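The dump shows many goroutines in the semacquire state, meaning they are blocked waiting to acquire a lock; the longest of them has been waiting for 8355 minutes.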

goroutine 2195141 [semacquire, 5306 minutes]:
sync.runtime_SemacquireMutex(0xc0004aa704, 0xc00c3cd700)
    /usr/local/go/src/runtime/sema.go:71 +0x3f
sync.(*Mutex).Lock(0xc0004aa700)
    /usr/local/go/src/sync/mutex.go:134 +0x10b
github.com/docker/docker/daemon.(*Daemon).ContainerStart.func1(0x0, 0x0)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/start.go:29 +0x45
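
Reading a large dump by eye is tedious, so the headers can be tallied by state instead. The following is a rough sketch, assuming the dump uses the standard Go `debug=2` header format seen above (`goroutine N [state, M minutes]:`); the function name summarize is made up for illustration.

```python
import re
from collections import defaultdict

# Header line of one goroutine in a debug=2 dump, for example:
#   goroutine 2195141 [semacquire, 5306 minutes]:
# The wait time and the "locked to thread" suffix are both optional.
HEADER = re.compile(
    r"^goroutine (\d+) \[([^,\]]+)(?:, (\d+) minutes)?(?:, locked to thread)?\]:"
)


def summarize(trace_text):
    """Return {state: (goroutine_count, longest_wait_minutes)} for a dump."""
    counts = defaultdict(int)
    longest = defaultdict(int)
    for line in trace_text.splitlines():
        m = HEADER.match(line)
        if not m:
            continue  # stack frame or blank line, not a goroutine header
        state = m.group(2)
        minutes = int(m.group(3) or 0)  # no wait time is counted as 0
        counts[state] += 1
        longest[state] = max(longest[state], minutes)
    return {state: (counts[state], longest[state]) for state in counts}
```

Running summarize(open("docker.trace").read()) on the file saved above gives, per state, how many goroutines are in it and the longest wait, which makes a pile-up behind a single mutex easy to spot.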

Solution
