The nova / neutron deadlock problem

2020-09-24  cloudFans

Deadlock
Two examples

nova-compute calls nova-conductor over RPC to update a port

SAVEPOINT sa_savepoint_1 does not exist
RELEASE SAVEPOINT sa_savepoint_1

nova-compute log

2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] Traceback (most recent call last):
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] File "/usr/lib/python2.7/dist-packages/nova/network/base_api.py", line 55, in update_instance_cache_with_nw_info
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] ic.save(update_cells=update_cells)
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] ctxt, self, fn.__name__, args, kwargs)
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 247, in object_action
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] objmethod=objmethod, args=args, kwargs=kwargs)
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 179, in call
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] retry=self.retry)
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 133, in _send
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] retry=retry)
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 584, in send
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] call_monitor_timeout, retry=retry)
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 575, in _send
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] raise result
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] RemoteError: Remote error: DBError (pymysql.err.InternalError) (1305, u'SAVEPOINT sa_savepoint_1 does not exist') [SQL: u'ROLLBACK TO SAVEPOINT sa_savepoint_1'] (Background on this error at: http://sqlalche.me/e/2j85)
2020-09-09 16:03:50.556 6 ERROR nova.network.base_api [instance: 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3] DBDeadlock: (pymysql.err.InternalError) (1213, u'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'RELEASE SAVEPOINT sa_savepoint_1'] (Background on this error at: http://sqlalche.me/e/2j85)

nova-conductor log

grep '2020-09-09 16:03' -r nova-conductor.log

Check whether anything unexpected happened on the control node during this time window:

2020-09-09 16:03:49.651 45 DEBUG nova.objects.instance [req-5390ac8f-af20-4b63-bab1-02d7010cabf5 7b6a8a4f5d4e4815949607b865b38ba8 d33c6c311f9a4389aecc3ba52a0e5db1 - default default] Lazy-loading 'tags' on Instance uuid 5b1c5539-7c56-47ae-ab79-0d6ffdc3e0f3 obj_load_attr /usr/lib/python2.7/dist-packages/nova/objects/instance.py:1115
2020-09-09 16:03:50.093 57 INFO nova.db.sqlalchemy.api [req-8850611a-0093-414e-8091-1eeae7da5814 7b6a8a4f5d4e4815949607b865b38ba8 d33c6c311f9a4389aecc3ba52a0e5db1 - default default] Gpucloud reset assign state of video (id=75) to unused when instance destroy.
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters [req-9a4fc93b-d4c0-4513-9016-ec6294ebda56 21b4ac2e88584d53b277f152dfe349ed 509cb535d37c4b1d8faa2941d06b9d17 - default default] DBAPIError exception wrapped from (pymysql.err.InternalError) (1305, u'SAVEPOINT sa_savepoint_1 does not exist') [SQL: u'ROLLBACK TO SAVEPOINT sa_savepoint_1'] (Background on this error at: http://sqlalche.me/e/2j85): InternalError: (1305, u'SAVEPOINT sa_savepoint_1 does not exist')
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters context)
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 508, in do_execute
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters)
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 165, in execute
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query)
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 321, in _query
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q)
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 860, in query
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1061, in _read_query_result
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters result.read()
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1349, in read
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet()
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1018, in _read_packet
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error()
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 384, in check_error
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data)
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval)
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1305, u'SAVEPOINT sa_savepoint_1 does not exist')
2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters
2020-09-09 16:03:50.929 63 DEBUG nova.objects.instance [req-b4dbdf23-a0bc-43f4-b621-4029c6441613 7b6a8a4f5d4e4815949607b865b38ba8 d33c6c311f9a4389aecc3ba52a0e5db1 - default default] Lazy-loading 'tags' on Instance uuid 7a368bf3-444a-4f61-a75c-cbe4e9e03697 obj_load_attr /usr/lib/python2.7/dist-packages/nova/objects/instance.py:1115
2020-09-09 16:03:53.843 39 DEBUG nova.objects.instance [req-1b64bf7d-40d8-4445-af5c-b67dc86245e3 7b6a8a4f5d4e4815949607b865b38ba8 d33c6c311f9a4389aecc3ba52a0e5db1 - default default] Lazy-loading 'tags' on Instance uuid 7fedfb93-99a9-4b65-8d71-4c297e604da4 obj_load_attr /usr/lib/python2.7/dist-packages/nova/objects/instance.py:1115
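
When correlating the nova-compute and nova-conductor logs, it can help to pull the MySQL error codes out of the ERROR lines for a given minute. The following is a minimal sketch; the function name `mysql_errors` and the regular expression are assumptions made for illustration, and the sample lines are taken (abridged) from the log above.

```python
import re

# Matches the (errno, u'message') tuple that pymysql prints in the log line.
MYSQL_ERR = re.compile(r"\((\d{4}), u?'([^']*)'\)")


def mysql_errors(lines, minute):
    """Yield (errno, message) for ERROR lines whose timestamp starts with `minute`."""
    for line in lines:
        if not line.startswith(minute) or " ERROR " not in line:
            continue
        for errno, message in MYSQL_ERR.findall(line):
            yield int(errno), message


sample = [
    "2020-09-09 16:03:50.539 62 ERROR oslo_db.sqlalchemy.exc_filters "
    "InternalError: (1305, u'SAVEPOINT sa_savepoint_1 does not exist')",
    "2020-09-09 16:03:50.929 63 DEBUG nova.objects.instance Lazy-loading 'tags'",
]
found = list(mysql_errors(sample, "2020-09-09 16:03"))
print(found)
```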

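In MySQL, SAVEPOINT is part of transaction control. A savepoint lets you roll back a chosen part of a transaction, which makes transaction handling more flexible and fine-grained.

SAVEPOINT identifier
Sets a savepoint. If a savepoint with the same name already exists, the new one replaces the old one.

RELEASE SAVEPOINT identifier
Releases the savepoint.

ROLLBACK [WORK] TO [SAVEPOINT] identifier
Rolls back to the specified savepoint.

Reference: https://www.cnblogs.com/justfortaste/p/5054368.html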
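
The three statements can be tried without a MySQL server. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in: SQLite accepts the same `SAVEPOINT` / `ROLLBACK TO` / `RELEASE` syntax, but its semantics are not identical to InnoDB's (for example, same-name savepoints nest instead of replacing each other), so treat it as an illustration rather than a reproduction of the nova error.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# implicitly, so the statements below are exactly what the engine sees.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT a")
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO SAVEPOINT a")  # undoes only the second insert
conn.execute("RELEASE SAVEPOINT a")      # savepoint a no longer exists

try:
    # Rolling back to a released savepoint fails, much like error 1305.
    conn.execute("ROLLBACK TO SAVEPOINT a")
    rollback_error = None
except sqlite3.OperationalError as exc:
    rollback_error = str(exc)

conn.execute("COMMIT")
rows = [row[0] for row in conn.execute("SELECT v FROM t ORDER BY v")]
print(rows, rollback_error)
```

Only the row inserted before the savepoint survives, and the second `ROLLBACK TO` is rejected because the savepoint was already released.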

Triggering "SAVEPOINT a does not exist"

delimiter //
drop procedure if exists p1//
create procedure p1()
begin
release savepoint a;
end//
delimiter ;

begin;
savepoint a;
call p1();
rollback to savepoint a;
ERROR 1305 (42000): SAVEPOINT a does not exist
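Judging by this result, the behavior does not match the official documentation, which states that invoking a stored function or activating a trigger creates a new savepoint level.

Looking at the actual code, a stored function or trigger does not start an independent transaction; it shares the caller's transaction. All savepoints live in the same transaction's linked list, so a savepoint inside a stored function or trigger has the same scope as the caller's.

In other words, the official implementation of savepoints is incomplete.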
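
The shared-list behavior can be pictured with a toy model. This is not MySQL code: the `Transaction` class, `SavepointNotFound` and the Python function `p1` are hypothetical names that mirror the SQL example above, and `release` is simplified to drop only the named savepoint.

```python
class SavepointNotFound(Exception):
    """Stands in for MySQL error 1305 in this toy model."""


class Transaction:
    """Toy transaction that keeps every savepoint in one shared list."""

    def __init__(self):
        self.savepoints = []

    def savepoint(self, name):
        # Re-declaring a name drops the old entry before adding the new one.
        if name in self.savepoints:
            self.savepoints.remove(name)
        self.savepoints.append(name)

    def release(self, name):
        if name not in self.savepoints:
            raise SavepointNotFound(name)
        # Simplification: drop only the named savepoint.
        self.savepoints.remove(name)

    def rollback_to(self, name):
        if name not in self.savepoints:
            raise SavepointNotFound(name)
        # Savepoints set after the named one are discarded; it stays.
        del self.savepoints[self.savepoints.index(name) + 1:]


def p1(trx):
    # The routine operates on the caller's transaction, not a fresh one.
    trx.release("a")


trx = Transaction()
trx.savepoint("a")
p1(trx)
try:
    trx.rollback_to("a")
    outcome = "rolled back"
except SavepointNotFound:
    outcome = "SAVEPOINT a does not exist"
print(outcome)
```

Because `p1` receives the same `Transaction` object, its `release` removes the caller's savepoint, and the caller's later rollback fails.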

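Anonymous SAVEPOINT
In fact, an InnoDB transaction records an anonymous savepoint before each statement executes. If the current statement fails, InnoDB does not roll back the whole transaction; it uses this anonymous savepoint to roll back only the failed statement.

struct trx_t {
    /* ... other fields omitted ... */
    trx_savept_t last_sql_stat_start; // anonymous savepoint, taken at the start of each SQL statement
};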
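
Statement-level rollback of this kind can be observed from the client side. The sketch below again uses Python's built-in `sqlite3` as a stand-in for InnoDB, which is an assumption of convenience: SQLite's default conflict handling also backs out only the failing statement and keeps the transaction open, but it does not use the `trx_t` structure shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v INTEGER PRIMARY KEY)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
try:
    # The second row collides with the existing key, so the whole
    # statement fails, including the row (2) it had already written.
    conn.execute("INSERT INTO t VALUES (2), (1)")
    failed = False
except sqlite3.IntegrityError:
    failed = True

# The transaction is still open and the earlier insert is intact.
conn.execute("COMMIT")
rows = [row[0] for row in conn.execute("SELECT v FROM t ORDER BY v")]
print(failed, rows)
```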

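This is a low-probability event: so far it has occurred only once in 24 hours across 25 compute nodes. The logic can be fixed simply by retrying.
After all, similar retry logic already exists in neutron patches and in sqlalchemy.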
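
A retry of this kind might look like the sketch below. It is a standard-library illustration only, not the nova or neutron patch: `DeadlockError`, `retry_on_deadlock` and `save_info_cache` are hypothetical names. The MySQL message itself says "try restarting transaction", and InnoDB rolls back the entire transaction of a deadlock victim, so the retry has to re-run the whole transactional function rather than a single statement. In a real OpenStack service, oslo.db's `wrap_db_retry` decorator with `retry_on_deadlock=True` is the likely tool for this.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for oslo.db's DBDeadlock in this sketch."""


def retry_on_deadlock(max_retries=3, interval=0.0):
    """Re-run the wrapped function when it raises DeadlockError."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return fn(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_retries:
                        raise  # out of retries, surface the error
                    time.sleep(interval * (2 ** attempt))  # simple backoff
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on_deadlock(max_retries=3)
def save_info_cache():
    # Hypothetical operation: deadlocks on the first call, then succeeds.
    calls["count"] += 1
    if calls["count"] == 1:
        raise DeadlockError("Deadlock found when trying to get lock")
    return "saved"


result = save_info_cache()
print(result, calls["count"])
```

The wrapped function must contain the full unit of work, from opening the transaction to committing it, so that each attempt starts from a clean state.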
