CDH Failure Summary

2022-01-18  彩色的炮灰

I. After a forced power-off, the CDH cluster reports errors about corrupt and missing blocks

1. The failure is shown in the screenshots below

[Screenshot: image.png] [Screenshot: image.png]

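2. Log in to the CDH master node and run the following command:
The root account does not have permission to run it, so switch to the hdfs account to run it.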

 sudo -u hdfs hdfs fsck /  > test.log

3. Inspect the test.log file
For each corrupt block it shows the path of the affected file, the block pool, and the block ID.

[root@cdh1 ~]# cat test.log 
FSCK started by hdfs (auth:SIMPLE) from /172.16.40.170 for path / at Tue Jan 18 16:03:34 CST 2022

/hbase/data/hbase/meta/1588230740/info/3f61550423ef4eef884ca6541b2a73c2: CORRUPT blockpool BP-2050597982-172.16.40.170-1642485592225 block blk_1073741884

/hbase/data/hbase/meta/1588230740/info/3f61550423ef4eef884ca6541b2a73c2: CORRUPT 1 blocks of total size 6809 B.
/hbase/data/hbase/meta/1588230740/info/834bd0386103452b9e943c975902416c: CORRUPT blockpool BP-2050597982-172.16.40.170-1642485592225 block blk_1073741883

/hbase/data/hbase/meta/1588230740/info/834bd0386103452b9e943c975902416c: CORRUPT 1 blocks of total size 6479 B.
Status: CORRUPT
 Number of data-nodes:  3
 Number of racks:       1
 Total dirs:            49
 Total symlinks:        0

Replicated Blocks:
 Total size:    235179578 B (Total open files size: 248 B)
 Total files:   20 (Files currently being written: 16)
 Total blocks (validated):  19 (avg. block size 12377872 B) (Total open file blocks (not validated): 15)
  ********************************
  UNDER MIN REPL'D BLOCKS:  2 (10.526316 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:    2
  CORRUPT BLOCKS:   2
  CORRUPT SIZE:     13288 B
  ********************************
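When the report is long, the affected paths can be pulled out of test.log instead of read by eye. The following is a minimal sketch, assuming the per-file lines always have the `<path>: CORRUPT ...` shape seen in the output above; the function name `list_corrupt_paths` is hypothetical and not part of HDFS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the unique HDFS paths that fsck flagged as CORRUPT.
# $1 is a file holding fsck output, e.g. test.log from step 2.
list_corrupt_paths() {
  # Per-file lines start with "/" and contain ": CORRUPT ".
  # Summary lines such as "Status: CORRUPT" or "CORRUPT FILES: 2" do not
  # start with "/", so they are filtered out.
  grep -E '^/.*: CORRUPT ' "$1" \
    | sed -E 's/: CORRUPT .*$//' \
    | sort -u
}

# Example usage (not executed here):
#   list_corrupt_paths test.log
```

fsck also has a `-list-corruptfileblocks` option that prints the corrupt blocks and the files they belong to, which may be simpler if re-running fsck is acceptable.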

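4. If the affected files are not important, delete them directly. Note that fsck -delete removes the entire corrupt file, not just the damaged block, and the command has to be run once for each corrupt path reported in test.log: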

sudo -u hdfs hdfs fsck -delete /hbase/data/hbase/meta/1588230740/info/834bd0386103452b9e943c975902416c
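With more than a handful of corrupt files, the same command can be applied in a loop. This is only a sketch: the function name `delete_corrupt_paths` and the `DRY_RUN` variable are assumptions introduced here for illustration. By default it prints the commands rather than running them, so the list can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "hdfs fsck -delete" for every path read from stdin
# (one path per line). DRY_RUN defaults to 1, which only prints the commands.
delete_corrupt_paths() {
  local path
  while IFS= read -r path; do
    [ -n "$path" ] || continue   # skip blank lines
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "sudo -u hdfs hdfs fsck -delete $path"
    else
      # </dev/null keeps the command from consuming the path list on stdin.
      sudo -u hdfs hdfs fsck -delete "$path" </dev/null
    fi
  done
}

# Example usage (not executed here), with paths.txt holding one path per line:
#   delete_corrupt_paths < paths.txt               # print the commands only
#   DRY_RUN=0 delete_corrupt_paths < paths.txt     # actually delete
```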

5. Refresh the page again and the cluster status returns to normal.
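Besides refreshing the page, the result can be confirmed from the command line by running fsck again and checking the status line. The sketch below assumes a healthy report uses the same `Status:` line format as the `Status: CORRUPT` line in the output above; the function name `fsck_is_healthy` and the file name `after.log` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit status 0) if the fsck report stored in
# file $1 contains a line starting with "Status: HEALTHY".
fsck_is_healthy() {
  grep -q '^Status: HEALTHY' "$1"
}

# Example usage (not executed here):
#   sudo -u hdfs hdfs fsck / > after.log
#   fsck_is_healthy after.log && echo "HDFS is healthy" || echo "still corrupt"
```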
Reference: https://blog.csdn.net/hcq_lxq/article/details/121628219
