[File handles] Unreleased file handles in a Java program keep disk space from being freed after files are deleted

2022-12-01  Bogon

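When writing Java code, releasing file resources needs particular care. A file resource must normally be closed after use, and only then deleted.

If the file is deleted first without being closed, its file handle is not released. The data stays allocated on disk, so actual disk usage stays high and deleting the file frees no space.

In that situation the file's path is gone, but an output stream (`out`) still holds the file open. Until `out` is closed the file handle is not released, so the space actually available is smaller than the remaining files would suggest.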
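The effect can be reproduced without Java. The sketch below is only an illustration: the scratch file and the use of descriptor 3 are assumptions, with the shell itself standing in for a program that never closed its output stream.

```shell
#!/usr/bin/env bash
# Reproduce "deleted but still open" with a throwaway file (all names are illustrative).
demo_file=$(mktemp)                 # hypothetical scratch file
exec 3>"$demo_file"                 # open fd 3 for writing, like an unclosed output stream
printf 'some log data\n' >&3        # write through the open descriptor
rm -f "$demo_file"                  # delete the path while fd 3 is still open

# The path is gone, but the kernel keeps the inode alive for fd 3.
link_target=$(readlink "/proc/$$/fd/3")
echo "fd 3 -> $link_target"

held_bytes=$(stat -L -c %s "/proc/$$/fd/3")   # bytes still held by the open descriptor
echo "bytes still held: $held_bytes"

exec 3>&-                           # closing the descriptor is what releases the space
```

The link target ends in "(deleted)" while the data is still allocated. In Java the equivalent of the last line is calling close() on the stream, or using try-with-resources, before deleting the file.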

File handles can be inspected with the lsof command:

lsof  -s | grep  java
lsof  -s |grep deleted

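The system raised a low-disk-space alert. A service had kept writing error logs until the disk filled up, so the container was deleted and a new one was started.

Afterwards, df -h showed that the disk space had not been released.

du -sh did not account for that much usage.

Run lsof | grep deleted to see which file handles on the system have not been released.

Because all of the space belonged to containers, only the unreleased file handles of the container process were examined.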

lsof | grep deleted
lsof -p 3495 | grep deleted
lsof -p $(ps aux |grep dockerd |grep -v grep  |awk '{print$2}') | grep deleted

Many file handles turned out to be unreleased for container files that no longer exist.
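Where lsof is not installed, the same information can be read from /proc directly. The helper below is a minimal sketch, not part of the original procedure: the name list_deleted_fds is hypothetical, and inspecting other users' processes is assumed to require root.

```shell
#!/usr/bin/env bash
# list_deleted_fds PID: print "fd size_in_bytes target" for every descriptor of PID
# whose target has been unlinked.
list_deleted_fds() {
    local pid=$1 fd_path target size
    for fd_path in /proc/"$pid"/fd/*; do
        # Skip entries that cannot be read (no permission, or fd closed meanwhile).
        target=$(readlink "$fd_path" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)")
                size=$(stat -L -c %s "$fd_path" 2>/dev/null) || size="?"
                printf '%s %s %s\n' "${fd_path##*/}" "$size" "$target"
                ;;
        esac
    done
}

# Example: inspect the current shell (normally prints nothing).
list_deleted_fds "$$"
```

For the container case above, the PID argument would be the dockerd PID obtained from ps.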

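Once the problem has been found, there are two ways to resolve it.

1. Restart the process that holds the handles. When the process exits its handles are closed and the space is freed.
2. Find the specific file handle and truncate the file it refers to down to zero length.

The first method would mean frequent restarts, which does not suit this workload and is not acceptable in production, so the second method was used.

Use lsof | grep deleted to get the PID (process identifier) and the FD (file descriptor, which the application uses to identify the file).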
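Extracting the PID and FD by hand is error-prone when there are many lines. The filter below is a sketch under the assumption that lsof prints its default columns (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME). The function name and the sample line are hypothetical.

```shell
#!/usr/bin/env bash
# deleted_fd_paths: read lsof output on stdin, print one /proc/<pid>/fd/<fd> path
# per deleted file descriptor.
deleted_fd_paths() {
    awk '/\(deleted\)$/ {
             fd = $4
             sub(/[^0-9]+$/, "", fd)          # "124w" -> "124": drop the access-mode suffix
             if (fd ~ /^[0-9]+$/)             # skip non-numeric entries such as txt or mem
                 print "/proc/" $2 "/fd/" fd
         }' | sort -u
}

# Hypothetical sample line; in practice: lsof | deleted_fd_paths
deleted_fd_paths <<'EOF'
java 3840 root 124w REG 8,3 1024 99 /var/log/app.log (deleted)
EOF
```

Each printed path can then be passed to the truncation commands.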


Truncate the file's contents, then check disk usage again; the space has been recovered:

echo > /proc/${pid}/fd/${fd}
echo > /proc/3840/fd/124
or
truncate -s 0 /proc/${pid}/fd/${fd}
truncate -s 0 /proc/3840/fd/124
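The truncation can be tried safely on a scratch file first. This is a sketch only: the scratch file and descriptor 5 are assumptions, and the shell plays the role of the process holding the deleted file.

```shell
#!/usr/bin/env bash
# Truncate a deleted-but-open file through /proc, using this shell as the holder.
scratch=$(mktemp)                            # hypothetical scratch file
exec 5>>"$scratch"                           # append mode, as many loggers open their files
head -c 1048576 /dev/zero >&5                # write 1 MiB through fd 5
rm -f "$scratch"                             # path removed, space still held

size_before=$(stat -L -c %s "/proc/$$/fd/5")
: > "/proc/$$/fd/5"                          # truncate to 0 bytes ('echo >' leaves a 1-byte newline)
size_after=$(stat -L -c %s "/proc/$$/fd/5")
echo "before=$size_before after=$size_after"

exec 5>&-                                    # close the descriptor when done
```

One caveat: if the holding process did not open the file in append mode, its next write lands at the old offset. The file then becomes sparse, with a large apparent size even though the blocks have been freed.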

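Space not released after deleting files: is a restart the only fix?
DBAs frequently run into low disk space on servers during routine operations. It is tempting to act fast and simply delete rarely used logs and files. The space is then not released, which leaves a hidden problem for whoever comes next.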

##/usr/local is running out of space
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        20G  9.0G  9.6G  49% /
/dev/sda3        20G  8.2G   11G  79% /usr/local
/dev/sda4       401G  297G   84G  78% /data

##du reports only 8.2G, but 15G is actually in use
# du -sh /usr/local/
8.2G    /usr/local/

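Check for deleted files under /usr/local. A log file has been deleted, but several processes still hold file handles on it, so the space has not been released.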

# lsof /usr/local/|grep -i delete

mysqld_sa 21589 user_00    2w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
router_up 27504 user_00    1w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
router_up 27504 user_00    2w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 28895 user_00    1w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 28895 user_00    2w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 28897 user_00    1w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 28897 user_00    2w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
router_up 70763 user_00    1w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
router_up 70763 user_00    2w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 71398 user_00    1w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 71398 user_00    2w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 71403 user_00    1w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
dcagent_t 71403 user_00    2w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
java      72021 user_00    1w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
java      72021 user_00    2w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
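Every line above refers to the same inode (1053546), so the sizes must not be added up line by line. The filter below is a sketch, assuming lsof's default column layout, that counts each DEVICE:NODE pair once; the function name sum_deleted is hypothetical.

```shell
#!/usr/bin/env bash
# sum_deleted: read lsof output on stdin and total the space held by deleted files,
# counting each inode once.
# Assumed columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sum_deleted() {
    awk '/\(deleted\)$/ {
             key = $6 ":" $8                          # DEVICE:NODE identifies one file
             sz = $7 + 0                              # force numeric comparison
             if (sz > size[key]) size[key] = sz       # keep the largest size seen
         }
         END {
             for (k in size) total += size[k]
             printf "%.2f GiB held by deleted files\n", total / (1024 * 1024 * 1024)
         }'
}

# Three of the sample lines above, fed in as input.
sum_deleted <<'EOF'
mysqld_sa 21589 user_00    2w   REG    8,3 6858136961 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
router_up 70763 user_00    1w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
java      72021 user_00    2w   REG    8,3 6858137684 1053546 /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
EOF
```

On these lines the result is about 6.39 GiB, which roughly accounts for the gap between the 8.2G that du reports and the 15G actually in use. On a live system the input would come from lsof /usr/local/ | sum_deleted.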

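Option 1: restart the affected processes
This is the most common approach. Once the affected processes are restarted, the file handles they hold are released and the disk space is freed.

Option 2: truncate the file behind the unreleased file handle

##Pick any one of the affected processes and list its file handles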
# ls -l /proc/21589/fd
total 0
lr-x------ 1 user_00 users 64 Jun 22 15:05 0 -> /dev/null
l-wx------ 1 user_00 users 64 Jun 22 15:05 1 -> /data/log/dblogs/nohup.out
l-wx------ 1 user_00 users 64 Jun 22 15:05 2 -> /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)
lrwx------ 1 user_00 users 64 Jun 22 15:05 3 -> socket:[3094866717]

##File handle 2 points to the deleted file
l-wx------ 1 user_00 users 64 Jun 22 15:05 2 -> /usr/local/soft/shell_pkginstall_2021-07-07.log (deleted)

##Truncate the file held by file handle 2
#  >/proc/21589/fd/2

##Check the disk: the space has been released
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        20G  9.0G  9.6G  49% /
/dev/sda3        20G  8.2G   11G  79% /usr/local
/dev/sda4       401G  297G   84G  78% /data

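#Tune the open-file limit
##CentOS 6.x reads /etc/security/limits.conf first, then any configuration files found under /etc/security/limits.d/
##CentOS 7.x reads the files in that directory in turn, so settings in /etc/security/limits.d/ override those in /etc/security/limits.conf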

##Comment out the existing nofile lines
sed -i "/nofile/s/^/#/g" /etc/security/limits.conf

##Comment out the existing nproc lines
sed -i "/nproc/s/^/#/g" /etc/security/limits.conf

echo "* soft nofile 1048576" >>/etc/security/limits.conf
echo "* hard nofile 1048576" >>/etc/security/limits.conf
echo "root soft nofile 1048576" >>/etc/security/limits.conf
echo "root hard nofile 1048576" >>/etc/security/limits.conf
echo "* soft nproc 65535" >>/etc/security/limits.conf
echo "* hard nproc 65535" >>/etc/security/limits.conf
echo "root soft nproc unlimited" >>/etc/security/limits.conf
echo "root hard nproc unlimited" >>/etc/security/limits.conf

##Comment out the existing nproc lines
sed -i "/nproc/s/^/#/g" /etc/security/limits.d/90-nproc.conf

##Comment out the existing nofile lines
sed -i "/nofile/s/^/#/g" /etc/security/limits.d/90-nproc.conf

echo "* soft nofile 1048576" >>/etc/security/limits.d/90-nproc.conf
echo "* hard nofile 1048576" >>/etc/security/limits.d/90-nproc.conf
echo "root soft nofile 1048576" >>/etc/security/limits.d/90-nproc.conf
echo "root hard nofile 1048576" >>/etc/security/limits.d/90-nproc.conf
echo "* soft nproc 65535" >>/etc/security/limits.d/90-nproc.conf
echo "* hard nproc 65535" >>/etc/security/limits.d/90-nproc.conf
echo "root soft nproc unlimited" >>/etc/security/limits.d/90-nproc.conf
echo "root hard nproc unlimited" >>/etc/security/limits.d/90-nproc.conf
echo "* soft memlock unlimited" >>/etc/security/limits.d/90-nproc.conf
echo "* hard memlock unlimited" >>/etc/security/limits.d/90-nproc.conf
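Settings in limits.conf are applied by PAM when a new session starts, so processes that are already running keep their old limits. A quick way to check what a process actually has is sketched below; the helper name show_nofile is hypothetical.

```shell
#!/usr/bin/env bash
# show_nofile PID: print the soft and hard "Max open files" limits of a running process.
show_nofile() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/$1/limits"
}

show_nofile "$$"          # limits of the current shell
ulimit -Sn                # soft limit as reported by the shell builtin
```

Services started by systemd do not go through PAM login sessions; for those the limit usually comes from the unit's LimitNOFILE setting instead.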


