ES failure
2019-10-18
primary shard lost
unassigned_info
"can_allocate" : "no_valid_shard_copy", "allocate_explanation" : "cannot allocate because all found copies of the shard are either stale or corrupt"
Several kinds of errors:
"node_id" : "dBd4onKFSLSvrxgFCIP6GQ", "node_name" : "elasticsearch-data-86d6d959c5-ddlfd", "transport_address" : "172.16.38.77:9300", "node_decision" : "no", "store" : { "in_sync" : false, "allocation_id" : "SA157qPdRViXVK2ie2QgKg", "store_exception" : { "type" : "file_not_found_exception", "reason" : "no segments* file found in SimpleFSDirectory@/data/db/nodes/0/indices/sRSOtM-URGGeOs49IACD8w/1/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@42971bc2: files: [write.lock]" } } }
"node_id" : "WJSf3f08Riuy-4kajyLb6A", "node_name" : "elasticsearch-data-86d6d959c5-8jb7x", "transport_address" : "172.16.126.15:9300", "node_decision" : "no", "store" : { "in_sync" : false, "allocation_id" : "A8L3_M7SQG-ZSy7zbBNWFg" } }
"node_id" : "9S-fKqTkQg-06muuMl20Uw", "node_name" : "elasticsearch-data-86d6d959c5-j89bq", "transport_address" : "172.16.23.40:9300", "node_decision" : "no", "deciders" : [ { "decider" : "disk_threshold", "decision" : "NO", "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [13.930268425592923%]" } ] }, { "node_id" : "E2BvZJ4jQu2anQzaQrgCLA", "node_name" : "elasticsearch-data-86d6d959c5-f2957", "transport_address" : "172.16.63.6:9300", "node_decision" : "no", "deciders" : [ { "decider" : "disk_threshold", "decision" : "NO", "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [14.635894063472007%]" } ] }, { "node_id" : "F_iPC-LbQ9uH9NquGrnYmw", "node_name" : "elasticsearch-data-86d6d959c5-2dlpb", "transport_address" : "172.16.95.7:9300", "node_decision" : "no", "deciders" : [ { "decider" : "disk_threshold", "decision" : "NO", "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [14.663439808514775%]" } ] }, { "node_id" : "lj1_omL1RYOSo5xws7ibQg", "node_name" : "elasticsearch-data-86d6d959c5-khh96", "transport_address" : "172.16.53.13:9300", "node_decision" : "no", "deciders" : [ { "decider" : "disk_threshold", "decision" : "NO", "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [14.5565950622084%]" } ] }
ES allocation policy
https://doc.yonyoucloud.com/doc/mastering-elasticsearch/chapter-4/43_README.html
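The per-node rejections listed under the errors above fall into three groups: a store exception, a copy that is not in sync, and a "NO" from an allocation decider. A minimal sketch for tallying them follows. It assumes the fragments sit in a `node_allocation_decisions` list, which is how I understand the cluster allocation explain response to be laid out; the function name and the `stale_copy` label are my own, not Elasticsearch terms. The sample is abbreviated from the errors above.

```python
from collections import Counter

# Abbreviated from the per-node errors in these notes (assumed layout).
SAMPLE = {
    "node_allocation_decisions": [
        {"node_name": "elasticsearch-data-86d6d959c5-ddlfd", "node_decision": "no",
         "store": {"in_sync": False,
                   "store_exception": {"type": "file_not_found_exception"}}},
        {"node_name": "elasticsearch-data-86d6d959c5-8jb7x", "node_decision": "no",
         "store": {"in_sync": False}},
        {"node_name": "elasticsearch-data-86d6d959c5-j89bq", "node_decision": "no",
         "deciders": [{"decider": "disk_threshold", "decision": "NO"}]},
        {"node_name": "elasticsearch-data-86d6d959c5-f2957", "node_decision": "no",
         "deciders": [{"decider": "disk_threshold", "decision": "NO"}]},
    ]
}


def summarize_node_decisions(explain):
    """Count, per reason, how many nodes refused the shard."""
    reasons = Counter()
    for node in explain.get("node_allocation_decisions", []):
        store = node.get("store", {})
        deciders = node.get("deciders", [])
        if "store_exception" in store:
            # The on-disk copy could not be opened (e.g. no segments file).
            reasons[store["store_exception"].get("type", "store_exception")] += 1
        elif deciders:
            # One count per decider that answered NO.
            for decider in deciders:
                if decider.get("decision") == "NO":
                    reasons[decider.get("decider", "unknown")] += 1
        elif store.get("in_sync") is False:
            # A copy exists but is not in the in-sync set (my label).
            reasons["stale_copy"] += 1
        else:
            reasons["other"] += 1
    return reasons


if __name__ == "__main__":
    for reason, count in summarize_node_decisions(SAMPLE).items():
        print(reason, count)
```

Grouping by reason makes it quicker to see whether the problem is mostly disk pressure or mostly damaged copies.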
Data volume too large
shards disk.indices disk.used disk.avail disk.total disk.percent host           ip             node
4060   442.9gb      443.4gb   56.5gb     500gb      88           172.16.100.115 172.16.100.115 elasticsearch-data-f8449cccf-2qblf
3719   439gb        442.2gb   57.7gb     500gb      88           172.16.3.50    172.16.3.50    elasticsearch-data-f8449cccf-bxpmh
490    103.8gb      459.2gb   40.7gb     500gb      91           172.16.98.138  172.16.98.138  elasticsearch-data-f8449cccf-mr2bm
3742   439.8gb      446.6gb   53.3gb     500gb      89           172.16.51.88   172.16.51.88   elasticsearch-data-f8449cccf-b74rw
3631   463.9gb      464.4gb   35.5gb     500gb      92           172.16.98.137  172.16.98.137  elasticsearch-data-f8449cccf-gz62r
11294  UNASSIGNED
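To pick out the nodes above a given disk usage from the table above, something like the following sketch may help. It assumes whitespace-separated text with the header row shown above (it looks like `_cat/allocation?v` output); the function names are made up for illustration.

```python
# Rows copied from the allocation table in these notes.
TABLE = """\
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
4060 442.9gb 443.4gb 56.5gb 500gb 88 172.16.100.115 172.16.100.115 elasticsearch-data-f8449cccf-2qblf
3719 439gb 442.2gb 57.7gb 500gb 88 172.16.3.50 172.16.3.50 elasticsearch-data-f8449cccf-bxpmh
490 103.8gb 459.2gb 40.7gb 500gb 91 172.16.98.138 172.16.98.138 elasticsearch-data-f8449cccf-mr2bm
3742 439.8gb 446.6gb 53.3gb 500gb 89 172.16.51.88 172.16.51.88 elasticsearch-data-f8449cccf-b74rw
3631 463.9gb 464.4gb 35.5gb 500gb 92 172.16.98.137 172.16.98.137 elasticsearch-data-f8449cccf-gz62r
11294 UNASSIGNED
"""


def parse_cat_allocation(text):
    """Return one dict per node row, keyed by the header names."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    nodes = []
    for fields in rows:
        # The "UNASSIGNED" summary row has no disk columns, so skip it.
        if len(fields) != len(header):
            continue
        nodes.append(dict(zip(header, fields)))
    return nodes


def nodes_over(nodes, threshold_percent):
    """Names of nodes whose disk.percent is strictly above the threshold."""
    return [n["node"] for n in nodes if int(n["disk.percent"]) > threshold_percent]


if __name__ == "__main__":
    parsed = parse_cat_allocation(TABLE)
    print("over 85%:", nodes_over(parsed, 85))
    print("over 90%:", nodes_over(parsed, 90))
```

With the rows above, every data node is over 85% and two of them are over 90%, which matches the disk_threshold rejections in the error output.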
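Problems start once disk usage climbs past roughly 80%; with default settings the low watermark is 85%, and above it no more shards are allocated to the node.
Quoting the ES documentation: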
cluster.routing.allocation.disk.threshold_enabled Defaults to true. Set to false to disable the disk allocation decider.
cluster.routing.allocation.disk.watermark.low Controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices or, specifically, any shards that have never previously been allocated.
cluster.routing.allocation.disk.watermark.high Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.
cluster.routing.allocation.disk.watermark.flood_stage Controls the flood stage watermark. It defaults to 95%, meaning that Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node that has at least one disk exceeding the flood stage. This is a last resort to prevent nodes from running out of disk space. The index block must be released manually once there is enough disk space available to allow indexing operations to continue.
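The three default watermarks quoted above can be turned into a small check. This is only a sketch: the stage names and both functions are hypothetical, and the comparison is strict ("more than"), following the wording of the quoted text.

```python
# Default watermarks from the quoted settings, as percent of disk used.
LOW, HIGH, FLOOD = 85.0, 90.0, 95.0


def disk_stage(used_percent, low=LOW, high=HIGH, flood=FLOOD):
    """Classify a node's disk usage against the three watermarks."""
    if used_percent > flood:
        return "above_flood_stage"  # indices on the node get a read-only block
    if used_percent > high:
        return "above_high"         # ES tries to relocate shards away
    if used_percent > low:
        return "above_low"          # no new shards are allocated to the node
    return "ok"


def used_from_free(free_percent):
    """Convert the 'actual free' percentage from the decider message to used."""
    return 100.0 - free_percent


if __name__ == "__main__":
    # 'actual free: [13.930268425592923%]' from the disk_threshold explanation.
    print(disk_stage(used_from_free(13.930268425592923)))  # above_low
    # A node at 92% from the allocation table.
    print(disk_stage(92))  # above_high
```

If the cluster overrides the defaults, pass the configured values as `low`, `high`, and `flood` instead.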
Other issues
Memory segmentation fault (core dump)
segmentation fault
Java off-heap
es memory limit 5 5 10