
Troubleshooting unstable ovn-metadata access

2021-01-27  cloudFans

2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server [-] Unexpected error.: AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'
2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server Traceback (most recent call last):
2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/agent/ovn/metadata/server.py", line 66, in __call__
2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server instance_id, project_id = self._get_instance_and_project_id(req)
2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/agent/ovn/metadata/server.py", line 83, in _get_instance_and_project_id
2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server ports = self.sb_idl.get_network_port_bindings_by_ip(network_id,
2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'
2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server

The traceback points to the following method:

def _get_instance_and_project_id(self, req):
    remote_address = req.headers.get('X-Forwarded-For')
    network_id = req.headers.get('X-OVN-Network-ID')

    # Fails here: the sb_idl attribute is unexpectedly missing.
    ports = self.sb_idl.get_network_port_bindings_by_ip(network_id,
                                                        remote_address)
    num_ports = len(ports)
    if num_ports == 1:
        external_ids = ports[0].external_ids
        return (external_ids[ovn_const.OVN_DEVID_EXT_ID_KEY],
                external_ids[ovn_const.OVN_PROJID_EXT_ID_KEY])
    elif num_ports == 0:
        LOG.error("No port found in network %s with IP address %s",
                  network_id, remote_address)
    elif num_ports > 1:
        port_uuids = ', '.join([str(port.uuid) for port in ports])
        LOG.error("More than one port found in network %s with IP address "
                  "%s. Please run the neutron-ovn-db-sync-util script as "
                  "there seems to be inconsistent data between Neutron "
                  "and OVN databases. OVN Port uuids: %s", network_id,
                  remote_address, port_uuids)

    return None, None

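The sb_idl attribute has suddenly gone missing, which is surprising. One possible explanation is that, when the OVN neutron metadata workers are initialized, some of them end up with an sb_idl that was never set up correctly. That would explain why requests fail only intermittently.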
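One way such a failure could arise is sketched below. This is a hypothetical, simplified model and not the neutron code: `Handler`, `post_fork`, `FakeSbIdl` and `try_handle` are made-up names. It shows an attribute that is only assigned in a per-worker initialization hook, so a worker whose hook never ran raises `AttributeError` on every request while an initialized worker answers normally.

```python
class Handler:
    """Hypothetical stand-in for a per-worker request handler."""

    def post_fork(self, connect_sb):
        # Assumed per-worker init hook: sb_idl exists only after this runs.
        self.sb_idl = connect_sb()

    def handle(self, network_id, remote_address):
        # Raises AttributeError if post_fork() never ran in this worker.
        return self.sb_idl.lookup(network_id, remote_address)


class FakeSbIdl:
    """Hypothetical southbound DB connection returning a canned result."""

    def lookup(self, network_id, remote_address):
        return [(network_id, remote_address)]


def try_handle(handler):
    """Return 200 on success, 500 when the attribute is missing."""
    try:
        handler.handle("net-1", "10.0.0.5")
        return 200
    except AttributeError:
        return 500


healthy = Handler()
healthy.post_fork(FakeSbIdl)   # this worker finished initialization
broken = Handler()             # this worker never ran its init hook

statuses = [try_handle(healthy), try_handle(broken)]
print(statuses)
```

If requests are spread across several workers and only some are in the broken state, the client would see exactly this kind of intermittent failure.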

[root@compute011 ~]# grep worker /etc/kolla/neutron-ovn-metadata-agent/neutron.conf
api_workers = 4
metadata_workers = 4
rpc_workers = 2
rpc_state_report_workers = 2

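Since this is a private cloud and the metadata service should be accessed only rarely, the worker count can simply be set to 1 if the problem shows up.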
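As a sketch of that change, the snippet below parses a neutron.conf-style fragment and sets `metadata_workers` to 1. It assumes the option lives in the `[DEFAULT]` section (the grep output above does not show section headers), and it works on an in-memory string rather than on the real file. The helper name `set_metadata_workers` is made up for illustration.

```python
import configparser
import io

# Sample content modeled on the grep output; the [DEFAULT] header is an assumption.
SAMPLE = """\
[DEFAULT]
api_workers = 4
metadata_workers = 4
rpc_workers = 2
rpc_state_report_workers = 2
"""


def set_metadata_workers(conf_text, count):
    """Return conf_text with metadata_workers set to count."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    parser["DEFAULT"]["metadata_workers"] = str(count)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated = set_metadata_workers(SAMPLE, 1)
print(updated)
```

The agent would presumably need a restart before a changed worker count takes effect.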

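What is certain is that restarting the agent resolves the problem. I recommend enabling debug logging and continuing to monitor it.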
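To follow the problem over time, one option is to count how often the error appears in the agent log. The sketch below does this for lines in the format of the log shown earlier; the helper name `count_sb_idl_errors` is made up, and the sample lines are copies of two lines from that log. Only the "Unexpected error" line is counted, so each failed request is counted once rather than once per traceback line.

```python
import re
from collections import Counter

# Matches the "Unexpected error" line of the sb_idl failure and captures the minute.
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ .* ERROR .*"
    r"Unexpected error\..*has no attribute 'sb_idl'"
)


def count_sb_idl_errors(lines):
    """Count sb_idl 'Unexpected error' entries per minute."""
    per_minute = Counter()
    for line in lines:
        match = PATTERN.match(line)
        if match:
            per_minute[match.group(1)] += 1
    return per_minute


sample = [
    "2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server [-] "
    "Unexpected error.: AttributeError: 'MetadataProxyHandler' object has no "
    "attribute 'sb_idl'",
    "2021-01-27 10:54:22.443 22 ERROR neutron.agent.ovn.metadata.server "
    "AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'",
]
counts = count_sb_idl_errors(sample)
print(dict(counts))
```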

If no intermittent failures appear right after a restart, the agent should remain stable over the long term.

Official fix:

https://bugs.launchpad.net/charm-ovn-chassis/+bug/1920037

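After updating the code, two consecutive 500 responses still show up occasionally, but the vast majority of requests succeed. cloud-init also has its own fault tolerance and retries up to 120 times, so for now the issue can be considered adequately resolved.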
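Why occasional 500s are tolerable can be illustrated with a small retry loop. This is a simplified sketch and not cloud-init's actual code: `fetch_with_retries` and `make_flaky_fetch` are hypothetical names, and the limit of 120 attempts is taken from the observation above.

```python
def fetch_with_retries(fetch, max_attempts=120):
    """Call fetch() until it returns HTTP 200 or the attempts run out.

    fetch is any callable returning (status_code, body).
    Returns (body, attempts_used); raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status == 200:
            return body, attempt
        # A real client would sleep between attempts; omitted to keep this short.
    raise RuntimeError("metadata not available after %d attempts" % max_attempts)


def make_flaky_fetch(failures):
    """Hypothetical fetcher that returns 500 `failures` times, then 200."""
    state = {"left": failures}

    def fetch():
        if state["left"] > 0:
            state["left"] -= 1
            return 500, ""
        return 200, "instance-id"

    return fetch


body, attempts = fetch_with_retries(make_flaky_fetch(2))
print(body, attempts)
```

With two consecutive failures, the third attempt succeeds, far below the retry limit.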
