Kibana fails to start: "all shards failed", cannot connect

2019-12-31 Depro

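Symptoms:

I started 3 nodes in a local cluster. Elasticsearch came up normally on all of them and search-head could connect to each one, but the log contained a warning: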

[2019-12-31T08:54:46,320][WARN ][o.e.c.r.a.DiskThresholdMonitor] [node1] high disk watermark [90%] exceeded on [wYsY5n5QRduREAAZvA5Biw][vipnode2][/node-2/data/nodes/0] free: 17.8gb[7.6%], shards will be relocated away from this node
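To make the numbers in that warning explicit, here is a minimal Python sketch that pulls the watermark and free-space figures out of the message. The regular expression and the function name `parse_watermark_warning` are my own illustration, not part of Elasticsearch; the sample text is copied from the warning above.

```python
import re

# Message portion of the DiskThresholdMonitor warning quoted above.
LOG_LINE = (
    "high disk watermark [90%] exceeded on "
    "[wYsY5n5QRduREAAZvA5Biw][vipnode2][/node-2/data/nodes/0] "
    "free: 17.8gb[7.6%], shards will be relocated away from this node"
)

# Hypothetical pattern written for this one message format.
PATTERN = re.compile(
    r"high disk watermark \[(?P<watermark>[\d.]+)%\] exceeded on "
    r"\[(?P<node_id>[^\]]+)\]\[(?P<node_name>[^\]]+)\]\[(?P<path>[^\]]+)\] "
    r"free: (?P<free_size>[\d.]+\w+)\[(?P<free_pct>[\d.]+)%\]"
)


def parse_watermark_warning(line):
    """Return the node name and disk percentages, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    free_pct = float(match.group("free_pct"))
    return {
        "node_name": match.group("node_name"),
        "watermark_pct": float(match.group("watermark")),
        "free_pct": free_pct,
        "used_pct": round(100.0 - free_pct, 1),
    }


info = parse_watermark_warning(LOG_LINE)
print(info)
```

With 7.6% free, the disk is 92.4% used, which is above the 90% high watermark reported in the message.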

Then I started Kibana. Startup printed a pile of red log output and the console would not open. The key error log:

elasticsearch - SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]

{ statusCode: 503,
  payload:
   { statusCode: 503,
     error: 'Service Unavailable',
     message: 'Request Timeout after 30000ms' },
  headers: {} },
  reformat: [Function],
  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/esUnavailable' }

log [00:44:10.647] [info][plugins-system] Stopping all plugins.
log [00:44:10.648] [info][plugins][translations] Stopping plugin

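Solution:

I used https://www.jianshu.com/p/443cf6ce87d5 and https://www.elastic.co/guide/en/elasticsearch/reference/5.5/cluster-allocation-explain.html to troubleshoot the problem,

and finally identified the key parameter: cluster.routing.allocation.disk.threshold_enabled

(Elasticsearch can decide, based on disk usage, whether to keep allocating shards to a node. This setting is enabled by default.)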
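The cluster allocation explain API referenced above can be queried with a few lines of Python. This is only a sketch: the address `http://localhost:9200`, the helper names, and the `.kibana` index in the usage note are assumptions for illustration, not values taken from the setup described here.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: default local Elasticsearch address


def build_explain_request(base_url, index=None, shard=0, primary=True):
    """Build a request for /_cluster/allocation/explain.

    Without an index, Elasticsearch explains the first unassigned shard it finds.
    """
    url = base_url.rstrip("/") + "/_cluster/allocation/explain"
    if index is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def explain_allocation(base_url=ES_URL, index=None, timeout=10):
    """Send the request and return the decoded JSON explanation."""
    request = build_explain_request(base_url, index=index)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `explain_allocation()` against a running cluster returns the allocation decision for an unassigned shard; `explain_allocation(index=".kibana")` would ask about the primary of shard 0 of that index, assuming it exists.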

I was testing on a single local machine and my disk had very little free space left, so I edited elasticsearch.yml and set cluster.routing.allocation.disk.threshold_enabled: false.
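As an alternative to editing elasticsearch.yml (not what was done here), the same setting is dynamic and can be changed through the cluster settings API. The sketch below only builds the request; the address and the helper name `build_disable_threshold_request` are assumptions for illustration.

```python
import json
import urllib.request

SETTING = "cluster.routing.allocation.disk.threshold_enabled"


def build_disable_threshold_request(base_url="http://localhost:9200"):
    """Build a PUT /_cluster/settings request that turns the disk threshold check off.

    "transient" means the change is lost after a full cluster restart.
    """
    body = json.dumps({"transient": {SETTING: False}})
    return urllib.request.Request(
        base_url.rstrip("/") + "/_cluster/settings",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_disable_threshold_request()
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would apply it to a running cluster. As with the elasticsearch.yml change, this is something for a local test machine that is short on disk space.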

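Then I deleted the files under data and logs and restarted Elasticsearch and Kibana. Everything worked, and the cluster status went from red to green.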
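The red-to-green change can be checked from a script through the cluster health API. This is a minimal sketch; the address and the function names are assumptions, and `primaries_assigned` is my own shorthand for "no primary shard is unassigned".

```python
import json
import urllib.request


def fetch_cluster_health(base_url="http://localhost:9200", timeout=10):
    """Return the JSON document from GET /_cluster/health."""
    url = base_url.rstrip("/") + "/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def primaries_assigned(health):
    """True when the status is yellow or green.

    red    -> at least one primary shard is unassigned
    yellow -> all primaries assigned, some replicas are not
    green  -> all primaries and replicas assigned
    """
    return health.get("status") in ("yellow", "green")
```

For example, `primaries_assigned(fetch_cluster_health())` is `False` while the cluster is red, which is the state in which the "all shards failed" error above was reported.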

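Summary:

1. WARN logs at startup matter too. Paying attention to every detail helps you locate the problem quickly.

2. The key parameters in this issue (see the official documentation for their exact meaning): cluster.routing.allocation.disk.threshold_enabled, cluster.routing.allocation.disk.watermark.low, cluster.routing.allocation.disk.watermark.high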
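As a rough mental model of how these three parameters interact, here is a simplified Python sketch. It assumes the default values of 85% for the low watermark and 90% for the high watermark (the latter matches the [90%] in the warning above) and ignores the finer rules Elasticsearch applies; the function and its return strings are illustrative only.

```python
LOW_WATERMARK_PCT = 85.0   # default for cluster.routing.allocation.disk.watermark.low
HIGH_WATERMARK_PCT = 90.0  # default for cluster.routing.allocation.disk.watermark.high


def disk_allocation_state(used_pct, threshold_enabled=True,
                          low=LOW_WATERMARK_PCT, high=HIGH_WATERMARK_PCT):
    """Simplified view of what the disk threshold settings do to one node."""
    if not threshold_enabled:
        # threshold_enabled: false -> disk usage is not considered at all
        return "disk usage ignored"
    if used_pct > high:
        return "shards are relocated away from this node"
    if used_pct > low:
        return "no new shards are allocated to this node"
    return "shards can be allocated normally"


# 7.6% free in the warning above means 92.4% used.
print(disk_allocation_state(92.4))
print(disk_allocation_state(92.4, threshold_enabled=False))
```

The first call lands in the "relocated away" branch, matching the warning; the second shows why disabling the check let the local cluster allocate shards again.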
