openstack

Troubleshooting: kolla environment, cinder-backup backup and restore yields

2017-07-14  笨手笨脚越

Find where the volume is attached

1. Look up the VM's host and instance_id:

[root@node1 kolla]# nova show 875176b8-5c30-4114-b1f7-33a457c2c933
[Screenshot: nova show output]

2. On node1, use virsh to see where the disk is attached:

(nova-compute)[root@node1 /]# virsh dumpxml instance-00000012 
[Screenshot: virsh dumpxml output]

This gives the attachment path:
/dev/disk/by-path/ip-172.24.3.180:3260-iscsi-iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24-lun-1
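Reading the path out of the XML by eye is fine for one VM, but the lookup could also be scripted. The following is only a minimal sketch: the XML is a trimmed, hypothetical stand-in for `virsh dumpxml` output (the real output survives here only as a screenshot), the file-backed disk path is made up, and `disk_sources` is a helper name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for `virsh dumpxml instance-00000012` output.
SAMPLE_XML = """
<domain type='kvm'>
  <name>instance-00000012</name>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/nova/instances/example/disk'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/disk/by-path/ip-172.24.3.180:3260-iscsi-iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24-lun-1'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disk_sources(domain_xml):
    """Map each guest target device (e.g. 'vdb') to its host-side source."""
    root = ET.fromstring(domain_xml)
    result = {}
    for disk in root.findall("./devices/disk"):
        source = disk.find("source")
        target = disk.find("target")
        if source is None or target is None:
            continue
        # Block devices use the 'dev' attribute, file-backed disks use 'file'.
        result[target.get("dev")] = source.get("dev") or source.get("file")
    return result


if __name__ == "__main__":
    for guest_dev, host_path in disk_sources(SAMPLE_XML).items():
        print(guest_dev, "->", host_path)
```

On a real host the XML string would come from the output of `virsh dumpxml` instead of the sample constant.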

Write to the disk

1. Use xxd (available once vim is installed) to inspect the disk contents as a hex dump of the binary data:

[root@localhost ]# xxd --help
Usage:
xxd [options] [infile [outfile]]
or
xxd -r [-s [-]offset] [-c cols] [-ps] [infile [outfile]]
Options:
-a toggle autoskip: A single '*' replaces nul-lines. Default off.
-b binary digit dump (incompatible with -p,-i,-r). Default hex.
-c cols format <cols> octets per line. Default 16 (-i: 12, -ps: 30).
-E show characters in EBCDIC. Default ASCII.
-g number of octets per group in normal output. Default 2. (Bytes per group; configurable.)
-h print this summary.
-i output in C include file style. (Output as a C include file, i.e. as an array.)
-l len stop after <len> octets. (Stop converting after len bytes.)
-ps output in postscript plain hexdump style.
-r reverse operation: convert (or patch) hexdump into binary.
-r -s off revert with <off> added to file positions found in hexdump.
-s [+][-]seek start at <seek> bytes abs. (or +: rel.) infile offset.
-u use upper case hex letters. (Uppercase hex digits.)
-v show version: "xxd V1.10 27oct98 by Juergen Weigert".

2. Use dd to write 2 KB of random data (2 blocks of 1024 bytes) to the start of the device:

[ubuntu@localhost by-path]$ sudo dd if=/dev/urandom of=../../sde bs=1024 count=2
2+0 records in
2+0 records out
2048 bytes (2.0 kB) copied, 0.00195619 s, 1.0 MB/s

Check:

(cinder-backup)[root@node1 by-path]#  xxd -g 1 -i -u -l 100 ../../sde 
unsigned char ______sdcn[] = {
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00, 0X00,
  0X00, 0X00, 0X00, 0X00
};

The data has been written:

[ubuntu@localhost by-path]$ sudo xxd -g 1 -i -u -l 100 ../../sde   
unsigned char ______sde[] = {
  0XE5, 0XB2, 0X21, 0X81, 0XD0, 0X55, 0X22, 0XBA, 0X36, 0X2A, 0XA5, 0XF1,
  0X12, 0X00, 0XB1, 0X81, 0X62, 0X60, 0X94, 0XE6, 0X05, 0XAC, 0X28, 0XF9,
  0XC4, 0X1B, 0XB7, 0X6F, 0X6E, 0X1D, 0XF7, 0XA8, 0XE3, 0X5B, 0X0C, 0XD2,
  0X6B, 0X50, 0X47, 0X60, 0XC8, 0X35, 0X37, 0X9A, 0X52, 0XC9, 0X4C, 0XE4,
  0X46, 0X9B, 0X72, 0X51, 0XC1, 0XE7, 0X03, 0XED, 0X5C, 0XA7, 0X4D, 0X49,
  0X90, 0XAC, 0X3D, 0X8E, 0XE9, 0X9E, 0XCD, 0X0A, 0X63, 0X77, 0X39, 0X52,
  0X11, 0XBD, 0XE9, 0XC7, 0XC3, 0X74, 0X04, 0X37, 0X27, 0X8B, 0X85, 0XD8,
  0X77, 0X6B, 0X6F, 0XE1, 0X96, 0XBC, 0X2D, 0XD6, 0XC2, 0X19, 0XFB, 0XC9,
  0XDA, 0X55, 0X0E, 0XF3
};
unsigned int ______sde_len = 100;
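The same "is it all zeros?" check that xxd gives visually can be done programmatically, which is handy when comparing several devices. This is a small sketch, not part of the original procedure: `leading_bytes_all_zero` is a name invented here, and the demo runs on temporary files rather than a real device such as `/dev/sde` (reading a device would need root).

```python
import os
import tempfile


def leading_bytes_all_zero(path, length=100):
    """Return True if the first `length` bytes read from `path` are all 0x00."""
    with open(path, "rb") as f:
        data = f.read(length)
    # bytearray yields integers on both Python 2 and Python 3.
    return len(data) > 0 and not any(bytearray(data))


if __name__ == "__main__":
    # Demo on temporary files standing in for a blank and a written device.
    with tempfile.NamedTemporaryFile(delete=False) as blank:
        blank.write(b"\x00" * 2048)
    with tempfile.NamedTemporaryFile(delete=False) as written:
        written.write(os.urandom(2048))
    print("blank file all zero:", leading_bytes_all_zero(blank.name))
    print("written file all zero:", leading_bytes_all_zero(written.name))
    os.remove(blank.name)
    os.remove(written.name)
```

The default of 100 bytes mirrors the `-l 100` used with xxd above.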

Capture what the program reads from the disk

In cinder\backup\chunkeddriver.py,
add a method:

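    def wangyue_write_file(self, usage, content, backup_id):
        # Debug helper: append the captured bytes to a per-backup dump file.
        # Relies on the `os` module being imported in chunkeddriver.py.
        base_path = '/var/lib/cinder/wangyue/'
        if not os.path.exists(base_path):
            os.makedirs(base_path)
        file_name = os.path.join(base_path, usage + backup_id + '.txt')
        # Binary append mode, since the content is raw disk data.
        with open(file_name, 'ab') as f:
            f.write(content)
            f.write(b'\n')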

Modify the method cinder.backup.chunkeddriver.ChunkedBackupDriver#backup:

        # ... (earlier part of the method omitted) ...
        wangyue_time = 1  # flag: dump only the first chunk
        while True:
            backup = objects.Backup.get_by_id(self.context, backup.id)
            if backup.status in (fields.BackupStatus.DELETING,
                                 fields.BackupStatus.DELETED):
                is_backup_canceled = True
                # To avoid the chunk left when deletion complete, need to
                # clean up the object of chunk again.
                self.delete(backup)
                LOG.debug('Cancel the backup process of %s.', backup.id)
                break
            data_offset = volume_file.tell()
            data = volume_file.read(self.chunk_size_bytes)

            # Dump the first 10000 bytes of the first chunk only.
            if wangyue_time == 1:
                self.wangyue_write_file('backup_', data[0:10000], backup.id)
                wangyue_time = 0

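The first 10000 bytes read from the disk are saved to the file backup_${backup_id}.txt under /var/lib/cinder/wangyue/:

[Screenshot: contents of the dump file]

Everything is 0x00: the disk reads as empty!
We need to verify whether the iSCSI attachment is at fault.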

Make the program delay the iSCSI disconnect

From the code in cinder\backup\manager.py#_run_backup, creating a backup goes as follows:

  1. Open an iSCSI connection to the source volume
  2. Back up the volume
  3. Close the connection
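The three steps can be pictured as an attach / work / detach pattern. The sketch below is a simplified illustration and not cinder's actual code: `attach`, `backup` and `detach` are hypothetical callables, and wrapping the work in `try`/`finally` is an assumption about how such a flow is typically structured, so that the connection is closed even when the backup fails.

```python
def run_backup(attach, backup, detach):
    """Attach the source volume, back it up, then always detach.

    `attach`, `backup` and `detach` are hypothetical stand-ins for the
    real connection and backup calls.
    """
    device = attach()
    try:
        backup(device)
        # A debugging pause placed here would keep the connection open
        # long enough to inspect the attached device by hand.
    finally:
        detach(device)


if __name__ == "__main__":
    events = []

    def failing_backup(device):
        events.append("backup failed on " + device)
        raise RuntimeError("simulated backup error")

    try:
        run_backup(
            attach=lambda: events.append("attach") or "/dev/example",
            backup=failing_backup,
            detach=lambda dev: events.append("detach " + dev),
        )
    except RuntimeError:
        pass
    # The detach step is recorded even though the backup raised.
    print(events)
```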

After the backup step, we add a delay before the disconnect:

    def _run_backup(self, context, backup, volume):
        backup_service = self.service.get_backup_driver(context)

        properties = utils.brick_get_connector_properties()
        try:
            backup_device = self.volume_rpcapi.get_backup_device(context,
                                                                 backup,
                                                                 volume)
            attach_info = self._attach_device(context,
                                              backup_device.device_obj,
                                              properties,
                                              backup_device.is_snapshot)
            try:
                device_path = attach_info['device']['path']
                if isinstance(device_path, six.string_types):
                    if backup_device.secure_enabled:
                        with open(device_path) as device_file:
                            backup_service.backup(backup, device_file)
                    else:
                        with utils.temporary_chown(device_path):
                            with open(device_path) as device_file:
                                backup_service.backup(backup, device_file)
                # device_path is already file-like so no need to open it
                else:
                    backup_service.backup(backup, device_path)

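                # Debug only: keep the iSCSI session open for 6 minutes
                # before the connection is closed.
                LOG.debug('==========wait 6 mins, begin===========')
                import time
                time.sleep(360)
                LOG.debug('==========wait 6 mins, end===========')
                # ... (rest of the method omitted) ...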

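This way, during the pause, we can verify whether the iSCSI attachment succeeded:

1. In the NetApp management console, check whether the LUN is connected.

[Screenshot: NetApp management console]

2. cinder-backup.log shows that the source volume is temporarily attached at /dev/disk/by-path/ip-172.24.3.180:3260-iscsi-iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24-lun-0. Inspecting it with xxd shows nothing but 0X00, 0X00, 0X00, 0X00, so it reads as an empty volume rather than the real contents of the source volume.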

node1 2017-07-13 16:16:05.050 6 DEBUG os_brick.initiator.connectors.iscsi [req-54e3c2dd-46f6-4e3d-894f-211af2fde587 1a236a63e6864cf5a5b26d4b816f719b 406cd353135e44f0ade98f53d92d5d8b - default default] Found iSCSI node [u'/dev/disk/by-path/ip-172.24.3.180:3260-iscsi-iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24-lun-0'] (after 1 rescans) connect_volume /var/lib/kolla/venv/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py:452
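Picking the device path out of such a long log line is easy to get wrong by eye, for instance when comparing its LUN number with the one found earlier via virsh (lun-1 there, lun-0 here). The following sketch pulls the pieces apart with a regular expression. The pattern is an assumption based only on the by-path names seen in this post, and `parse_iscsi_device` is a name invented for the example.

```python
import re

# The log line quoted above, shortened to the part that matters here.
LOG_LINE = (
    "DEBUG os_brick.initiator.connectors.iscsi [...] Found iSCSI node "
    "[u'/dev/disk/by-path/ip-172.24.3.180:3260-iscsi-"
    "iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24-lun-0'] "
    "(after 1 rescans) connect_volume"
)

# Matches by-path names of the form ip-<portal>-iscsi-<target iqn>-lun-<n>.
DEVICE_RE = re.compile(
    r"/dev/disk/by-path/ip-(?P<portal>[^'\s]+?)-iscsi-"
    r"(?P<iqn>[^'\s]+)-lun-(?P<lun>\d+)"
)


def parse_iscsi_device(line):
    """Return (device_path, portal, target_iqn, lun) or None if not found."""
    match = DEVICE_RE.search(line)
    if match is None:
        return None
    return (match.group(0), match.group("portal"),
            match.group("iqn"), int(match.group("lun")))


if __name__ == "__main__":
    path, portal, iqn, lun = parse_iscsi_device(LOG_LINE)
    print("device:", path)
    print("portal:", portal)
    print("target:", iqn)
    print("lun:", lun)
```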

3. So let's try attaching the volume manually on an arbitrary host (as root):

(1) Discover the target

[root@node1 ~]# iscsiadm -m discovery -t st -p 172.24.3.180
172.24.3.180:3260,1057 iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24
172.24.3.181:3260,1058 iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24

(2) Get this host's IQN

[root@node1 ~]# cat /etc/iscsi/initiatorname.iscsi 
InitiatorName=iqn.1994-05.com.redhat:cad0246c576

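(3) In the NetApp management UI, add the IQN to the LUN's initiators: select it from the drop-down, click "Add Initiator", then save and exit.

[Screenshot: NetApp initiator settings]

(4) Log in to iSCSI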

[root@node1 ~]# iscsiadm -m node -p 172.24.3.180 -l
Logging in to [iface: default, target: iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24, portal: 172.24.3.180,3260] (multiple)
Login to [iface: default, target: iqn.1992-08.com.netapp:sn.2d72abb030d511e7875800a098ac0ce9:vs.24, portal: 172.24.3.180,3260] successful.

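(5) Check: the contents are not empty

[Screenshot: disk contents]

That's odd: why do the volume attached by the program and the volume we attached manually have different contents?

Inspect the container configuration with docker inspect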

docker inspect cinder_volume
docker inspect cinder_backup

Compare the two:

[Screenshot: docker inspect comparison]

It turns out that cinder_backup is missing iscsi_info. Could it be that without it iSCSI cannot work properly, so the attached iSCSI volume reads as empty?
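Comparing two `docker inspect` dumps by eye is tedious, so the difference in mounts could be computed instead. This is a sketch under assumptions: the two JSON strings are hypothetical, heavily trimmed stand-ins for real `docker inspect` output (only the `iscsi_info` name comes from this post; the other volume and the destinations are illustrative), and the helper names are invented here.

```python
import json

# Hypothetical, heavily trimmed `docker inspect` output for the two containers.
CINDER_VOLUME_JSON = """
[{"Name": "/cinder_volume",
  "Mounts": [
    {"Type": "volume", "Name": "cinder", "Destination": "/var/lib/cinder"},
    {"Type": "volume", "Name": "iscsi_info", "Destination": "/etc/iscsi"}
  ]}]
"""
CINDER_BACKUP_JSON = """
[{"Name": "/cinder_backup",
  "Mounts": [
    {"Type": "volume", "Name": "cinder", "Destination": "/var/lib/cinder"}
  ]}]
"""


def mount_names(inspect_json):
    """Collect the named volumes from one container's `docker inspect` JSON."""
    container = json.loads(inspect_json)[0]
    return {m["Name"] for m in container.get("Mounts", []) if m.get("Name")}


def missing_mounts(reference_json, other_json):
    """Named volumes mounted in the reference container but not in the other."""
    return mount_names(reference_json) - mount_names(other_json)


if __name__ == "__main__":
    print(sorted(missing_mounts(CINDER_VOLUME_JSON, CINDER_BACKUP_JSON)))
```

With real data, the two strings would be the saved output of the two `docker inspect` commands shown above.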

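Check the kolla-ansible configuration file /usr/share/kolla-ansible/ansible/roles/cinder/defaults/main.yml:

[Screenshot: current configuration]

Change it to:

[Screenshot: modified configuration]

Redeploy with kolla-ansible reconfigure -i ~/multinode -t cinder

After that, backup and restore tested fine!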
