Ceph

2019-08-12, by 杰森斯坦sen

Calculating the number of PGs

The official formula is:

Total PGs = (Total_number_of_OSD * 100) / max_replication_count

The result should then be rounded to a nearby power of 2.
Example:
200 OSDs, 3 replicas, 10 pools

(200 * 100)
----------- = 6667. Nearest power of 2: 8192
     3

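Each pool then gets 8192 / 10 ≈ 819 PGs. Because a pool's pg_num should itself be a power of 2, round this to 1024 (or to 512 to stay below the total) when creating the pool:

ceph osd pool create pool_name 1024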
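The arithmetic above can be sketched as a small shell helper. This is only an illustration: the function names `nearest_pow2` and `total_pgs` are made up for this example, and rounding to the nearest power of 2 (ties rounding up) is an assumption taken from the "Nearest power of 2" wording in the example, not a rule built into Ceph.

```shell
# nearest_pow2 N: print the power of 2 closest to N (ties round up).
nearest_pow2() {
    local n=$1 p=1
    # Grow p to the largest power of 2 that does not exceed n.
    while [ $((p * 2)) -le "$n" ]; do
        p=$((p * 2))
    done
    # Pick whichever neighbour (p or 2p) is closer to n.
    if [ $((n - p)) -lt $((2 * p - n)) ]; then
        echo "$p"
    else
        echo $((2 * p))
    fi
}

# total_pgs OSDS REPLICAS: (OSDS * 100) / REPLICAS, rounded to a power of 2.
# Integer division is used, so 20000 / 3 is truncated to 6666.
total_pgs() {
    nearest_pow2 $(( $1 * 100 / $2 ))
}

total_pgs 200 3   # the example above: prints 8192
```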

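Some commonly used values:

With fewer than 5 OSDs, set pg_num to 128
With 5 to 10 OSDs, set pg_num to 512
With 10 to 50 OSDs, set pg_num to 4096
With more than 50 OSDs, you need to understand the trade-offs and calculate pg_num yourself
The pgcalc tool can help when calculating pg_num yourself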

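Per-pool capacity planning matters too: for example, the object storage pool default.rgw.buckets.data should be given a larger share of the PGs.
A more accurate formula is therefore: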

( Target PGs per OSD ) x ( OSD # ) x ( %Data )
-----------
( Size )
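As a rough sketch of how this per-pool formula might be evaluated, the helper below takes the four inputs and rounds to the nearest power of 2. The name `pool_pgs` and the sample numbers are hypothetical, and the rounding rule is a simplification: the pgcalc tool applies its own rounding, so its results can differ.

```shell
# pool_pgs TARGET_PER_OSD OSD_COUNT PERCENT_DATA SIZE
# Evaluates (target PGs per OSD * OSD count * %Data) / size for one pool,
# then rounds to the nearest power of 2 (ties round up).
pool_pgs() {
    awk -v t="$1" -v o="$2" -v pct="$3" -v s="$4" 'BEGIN {
        raw = t * o * (pct / 100) / s
        p = 1
        # Largest power of 2 that does not exceed the raw value.
        while (p * 2 <= raw) p *= 2
        print ((raw - p < 2 * p - raw) ? p : 2 * p)
    }'
}

# Hypothetical inputs: 100 PGs per OSD, 200 OSDs, a pool expected to hold
# 50% of the data, 3 replicas. Raw value is about 3333.
pool_pgs 100 200 50 3   # prints 4096
```

A pool expected to hold only 5% of the data under the same assumptions comes out at 256, which shows how %Data shifts PGs toward the pools that store the most.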

[Figure: values computed by the pgcalc tool for the RADOS Gateway use case]

Common commands

Checking status

ceph osd dump
ceph osd dump | grep 'replicated size'
ceph pg stat
ceph pg dump pgs | grep '^1\.' | awk '{print $1,$2,$15}'  # PGs of pool ID 1 (PG IDs look like <pool.id>.<hex>)
ceph pg map 1.7f  # 1.7f is a PG ID
ceph pg 1.7f query  # 1.7f is a PG ID
ceph pg dump_stuck inactive  # or: unclean, stale
ceph df
ceph osd df
rados df
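When filtering `ceph pg dump pgs` output by pool, the match has to be anchored on the dot in the PG ID, because a bare prefix match on 1 would also catch pools 10 to 19. A minimal sketch of a reusable filter follows; `pgs_in_pool` is a hypothetical helper name, and the sample lines fed to it are made up for illustration rather than real cluster output.

```shell
# pgs_in_pool POOL_ID: read PG lines on stdin and keep only those whose
# first field is a PG ID of the form POOL_ID.<something>.
pgs_in_pool() {
    awk -v pool="$1" '$1 ~ ("^" pool "\\.")'
}

# Intended use on a live cluster:
#   ceph pg dump pgs | pgs_in_pool 1
# Demonstration with made-up input lines:
printf '1.7f active+clean\n10.3 active+clean\n1.0 stale\n' | pgs_in_pool 1
```

Only the lines for 1.7f and 1.0 pass the filter; the PG from pool 10 is dropped.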

Starting/stopping the cluster

systemctl stop ceph.target
systemctl start ceph.target
systemctl list-units --type=service|grep ceph

Repairing PGs
Removing a down OSD from the cluster

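# First reweight the OSD to 0 so that its data replicas are redistributed to the other OSDs
ceph osd reweight 2 0.0

# Once the cluster is back to HEALTH_OK, run the following commands to remove the OSD from the cluster
service ceph stop osd.2  # on systemd hosts: systemctl stop ceph-osd@2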
ceph osd out 2
ceph osd crush remove osd.2
ceph auth del osd.2
ceph osd rm osd.2
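The removal steps above could be wrapped in a script along the following lines. This is a sketch under stated assumptions, not a vetted procedure: the helper names `run`, `wait_for_health_ok` and `remove_osd`, the `DRY_RUN` variable and the 30-second polling interval are all invented for this example. With `DRY_RUN=1` the commands are only printed, which makes it possible to review them before touching a cluster.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# wait_for_health_ok: poll `ceph health` until it reports HEALTH_OK.
# Skipped entirely in dry-run mode.
wait_for_health_ok() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        return 0
    fi
    until ceph health | grep -q '^HEALTH_OK'; do
        sleep 30   # polling interval is an arbitrary choice
    done
}

# remove_osd ID: drain osd.ID, wait for recovery, then remove it,
# in the same order as the commands listed above.
remove_osd() {
    local id=$1
    run ceph osd reweight "$id" 0.0
    wait_for_health_ok
    run service ceph stop "osd.$id"
    run ceph osd out "$id"
    run ceph osd crush remove "osd.$id"
    run ceph auth del "osd.$id"
    run ceph osd rm "osd.$id"
}

# Example: preview the commands for osd.2 without running them.
#   DRY_RUN=1 remove_osd 2
```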

Unfound objects

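# <1> Try to bring the failed OSD back up; if the cluster returns to normal afterwards, stop here
# <2> Try rolling the PG's unfound objects back to their previous version; if the cluster returns to normal, stop here
ceph pg $pgid mark_unfound_lost revert
# <3> If that still fails, the only option left is to delete the objects; note that this loses data
ceph pg $pgid mark_unfound_lost delete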

Stale PG

# Find all stale PGs
ceph pg dump |grep stale
ceph health detail |grep stale

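Making one OSD the primary and another the replica:

# Suppose there are only osd.0 and osd.1, and osd.0 should be the replica while osd.1 is the primary.
# Setting osd.0's primary affinity to 0 means osd.0 can only act as a replica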
ceph tell mon.\* injectargs '--mon_osd_allow_primary_affinity=true'
ceph osd primary-affinity osd.0 0

