MySQL

Innodb undo之 undo结构简析

2019-02-21  本文已影响0人  重庆八怪

水平有限,如果有误请指出


参考:

做一个简单的记录,自己备用

一、大体结构

rollback segments(128)

undo segments(1024)

undo log (header insert/modify 分开的) <-> undo page

undo record
undo record

作为undo segments的第一个undo page可以存放多个事物的undo log,因为如果这个块的undo 记录没有填满3/4则会进入 rollback segment的cache list,那么下次可以继续使用,但是如果第一个块不足以装下事物的undo 记录,那么很显然需要分配新的undo page,这种情况下一个undo page就只能包含一个事物的undo记录了。
事物每次需要分配rollback segments然后分配undo segments然后初始化好undo log header,insert和update/delete需要分配不同的undo segments,一个undo segments往往对应了一个undo log,undo log可以包含多个undo record(因为从debug来看undo log header的初始化只做了一次),对于操作的每行都会留下一个undo record作为mvcc构建历史版本的基础。
undo生成的基本单位是undo record,每行记录都会包含一个undo record,而rollback ptr指向的是undo record的偏移量,对于每行的记录都会去判断其可见性,如果需要构建前版本则通过本指针进行构建包含:

每一个undo log包含一个trx_undo_t结构体
每一个rollback segments包含一个trx_rseg_t结构体

二、物理结构

/** Transaction undo log page header offsets */
/* @{ */
#define TRX_UNDO_PAGE_TYPE  0   /*!< TRX_UNDO_INSERT or
                    TRX_UNDO_UPDATE */
#define TRX_UNDO_PAGE_START 2   /*!< Byte offset where the undo log
                    records for the LATEST transaction
                    start on this page (remember that
                    in an update undo log, the first page
                    can contain several undo logs) */
#define TRX_UNDO_PAGE_FREE  4   /*!< On each page of the undo log this
                    field contains the byte offset of the
                    first free byte on the page */
#define TRX_UNDO_PAGE_NODE  6   /*!< The file list node in the chain
                    of undo log pages */

#define TRX_UNDO_STATE      0   /*!< TRX_UNDO_ACTIVE, ... */

#ifndef UNIV_INNOCHECKSUM

#define TRX_UNDO_LAST_LOG   2   /*!< Offset of the last undo log header
                    on the segment header page, 0 if
                    none */
#define TRX_UNDO_FSEG_HEADER    4   /*!< Header for the file segment which
                    the undo log segment occupies */
#define TRX_UNDO_PAGE_LIST  (4 + FSEG_HEADER_SIZE)
                    /*!< Base node for the list of pages in
                    the undo log segment; defined only on
                    the undo log segment's first page */

undo log header 包含

#define TRX_UNDO_TRX_ID     0   /*!< Transaction id */
#define TRX_UNDO_TRX_NO     8   /*!< Transaction number of the
                    transaction; defined only if the log
                    is in a history list */
#define TRX_UNDO_DEL_MARKS  16  /*!< Defined only in an update undo
                    log: TRUE if the transaction may have
                    done delete markings of records, and
                    thus purge is necessary */
#define TRX_UNDO_LOG_START  18  /*!< Offset of the first undo log record
                    of this log on the header page; purge
                    may remove undo log record from the
                    log start, and therefore this is not
                    necessarily the same as this log
                    header end offset */
#define TRX_UNDO_XID_EXISTS 20  /*!< TRUE if undo log header includes
                    X/Open XA transaction identification
                    XID */
#define TRX_UNDO_DICT_TRANS 21  /*!< TRUE if the transaction is a table
                    create, index create, or drop
                    transaction: in recovery
                    the transaction cannot be rolled back
                    in the usual way: a 'rollback' rather
                    means dropping the created or dropped
                    table, if it still exists */
#define TRX_UNDO_TABLE_ID   22  /*!< Id of the table if the preceding
                    field is TRUE */
#define TRX_UNDO_NEXT_LOG   30  /*!< Offset of the next undo log header
                    on this page, 0 if none */
#define TRX_UNDO_PREV_LOG   32  /*!< Offset of the previous undo log
                    header on this page, 0 if none */
#define TRX_UNDO_HISTORY_NODE   34  /*!< If the log is put to the history
                    list, the file list node is here */

/* Note: the writing of the undo log old header is coded by a log record
MLOG_UNDO_HDR_CREATE or MLOG_UNDO_HDR_REUSE. The appending of an XID to the
header is logged separately. In this sense, the XID is not really a member
of the undo log header. TODO: do not append the XID to the log header if XA
is not needed by the user. The XID wastes about 150 bytes of space in every
undo log. In the history list we may have millions of undo logs, which means
quite a large overhead. */

/** X/Open XA Transaction Identification (XID) */
/* @{ */
/** xid_t::formatID */
#define TRX_UNDO_XA_FORMAT  (TRX_UNDO_LOG_OLD_HDR_SIZE)
/** xid_t::gtrid_length */
#define TRX_UNDO_XA_TRID_LEN    (TRX_UNDO_XA_FORMAT + 4)
/** xid_t::bqual_length */
#define TRX_UNDO_XA_BQUAL_LEN   (TRX_UNDO_XA_TRID_LEN + 4)
/** Distributed transaction identifier data */
#define TRX_UNDO_XA_XID     (TRX_UNDO_XA_BQUAL_LEN + 4)

三、分配步骤和写入

#0  get_next_redo_rseg (max_undo_logs=128, n_tablespaces=4) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:1138
#1  0x0000000001c0bce8 in trx_assign_rseg_low (max_undo_logs=128, n_tablespaces=4, rseg_type=TRX_RSEG_TYPE_REDO)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:1314
#2  0x0000000001c1097d in trx_set_rw_mode (trx=0x7fffd7804080) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:3352
#3  0x0000000001a64013 in lock_table (flags=0, table=0x7ffeac012ae0, mode=LOCK_IX, thr=0x7ffe7c92ef48)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/lock/lock0lock.cc:4139
#4  0x0000000001b7950e in row_search_mvcc (buf=0x7ffe7c92e350 "\377", mode=PAGE_CUR_GE, prebuilt=0x7ffe7c92e7d0, match_mode=1, direction=0)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0sel.cc:5100
#5  0x00000000019d5443 in ha_innobase::index_read (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key_ptr=0x7ffe7cd57590 "\004", key_len=4, 
    find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:9536
#6  0x0000000000f9345a in handler::index_read_map (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key=0x7ffe7cd57590 "\004", keypart_map=1, 
    find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.h:2942
#7  0x0000000000f83e44 in handler::ha_index_read_map (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key=0x7ffe7cd57590 "\004", keypart_map=1, 
    find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.cc:3248

第二步 对于主键每行更改操作都会调用trx_undo_report_row_operation 他会分配undo segments 并且会负责写入undo record

#0  trx_undo_report_row_operation (flags=0, op_type=2, thr=0x7ffe7c932828, index=0x7ffea4016590, clust_entry=0x7ffe7c932cc0, update=0x0, cmpl_info=0, 
    rec=0x7fffb580d369 "", offsets=0x7fffec0f3e00, roll_ptr=0x7fffec0f3688) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0rec.cc:1866
#1  0x0000000001c5795b in btr_cur_del_mark_set_clust_rec (flags=0, block=0x7fffb4ccaae0, rec=0x7fffb580d369 "", index=0x7ffea4016590, offsets=0x7fffec0f3e00, 
    thr=0x7ffe7c932828, entry=0x7ffe7c932cc0, mtr=0x7fffec0f38f0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/btr/btr0cur.cc:4894
#2  0x0000000001b9f218 in row_upd_del_mark_clust_rec (flags=0, node=0x7ffe7c932550, index=0x7ffea4016590, offsets=0x7fffec0f3e00, thr=0x7ffe7c932828, referenced=0, 
    mtr=0x7fffec0f38f0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:2778
#3  0x0000000001b9f765 in row_upd_clust_step (node=0x7ffe7c932550, thr=0x7ffe7c932828)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:2923
#4  0x0000000001b9fc74 in row_upd (node=0x7ffe7c932550, thr=0x7ffe7c932828) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3042
#5  0x0000000001ba0155 in row_upd_step (thr=0x7ffe7c932828) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3188
#6  0x0000000001b3d3a0 in row_update_for_mysql_using_upd_graph (mysql_rec=0x7ffe7c9318d0 "\375\001", prebuilt=0x7ffe7c931d50)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0mysql.cc:3040
#7  0x0000000001b3d6a1 in row_update_for_mysql (mysql_rec=0x7ffe7c9318d0 "\375\001", prebuilt=0x7ffe7c931d50)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0mysql.cc:3131
#8  0x00000000019d47c3 in ha_innobase::delete_row (this=0x7ffe7c931390, record=0x7ffe7c9318d0 "\375\001")
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:9141

大概流程

switch (op_type)
{
    case TRX_UNDO_INSERT_OP:
        undo = undo_ptr->insert_undo; //如果是 insert 则使用insert_undo 类型为trx_undo_t 指针

        if (undo == NULL) { //如果已经分配了就不用分配了

            err = trx_undo_assign_undo( //分配undo segment 同时初始化 undo log header
                trx, undo_ptr, TRX_UNDO_INSERT);
            undo = undo_ptr->insert_undo;
  ...
        }
        break;
    default:
        ut_ad(op_type == TRX_UNDO_MODIFY_OP); //断言

        undo = undo_ptr->update_undo;

        if (undo == NULL) {
            err = trx_undo_assign_undo(
                trx, undo_ptr, TRX_UNDO_UPDATE); //分配undo segment 同时初始化 undo log header
            undo = undo_ptr->update_undo;
    ...
    }

  ...
  case TRX_UNDO_INSERT_OP://注意是每行都会操作
            offset = trx_undo_page_report_insert( //写入insert undo log record
                undo_page, trx, index, clust_entry, &mtr);
            break;
        default:
            ut_ad(op_type == TRX_UNDO_MODIFY_OP); //写入delete update undo log record
            offset = trx_undo_page_report_modify(
                undo_page, trx, index, rec, offsets, update,
                cmpl_info, clust_entry, &mtr);
        }
    ...
    
    *roll_ptr = trx_undo_build_roll_ptr( //构建rollback ptr 主键中每行都有这个 用于MVCC构建回滚版本
                op_type == TRX_UNDO_INSERT_OP,
                undo_ptr->rseg->id, page_no, offset);

四、分解undo log record

我将undo log record的写入到了错误日志,下面进行简单的分解。
表结构如下:

mysql> show create table t1;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `id1` int(11) NOT NULL,
  `id2` int(11) DEFAULT NULL,
  PRIMARY KEY (`id1`),
  KEY `id2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> insert into t1 values(28,28);
Query OK, 1 row affected (0.00 sec)

输出如下:

trx_undo_assign_undo:assign undo space:
RSEG SLOT:34,RSEG SPACE ID:2 PAGE NO:3
UNDO SLOT:0,UNDO SPACE ID:2 UNDO LOG HEADER PAGE NO:27,UNDO LOG HEADER OFFSET:86,UNDO LOG LAST PAGE:27
trx_undo_page_report_insert:undo log record
TABLE_NAME:test/t1 TRX_ID:12591,UODO RECORD LEN:10
 len 10; hex 011e0b0032048000001c; 

011e0b0032048000001c就是undo record的实际记录解析如下:

011c page内部本undo record结束的位置
0b 类型为 #define  TRX_UNDO_INSERT_REC 11(0X0b)
00 undo no,提交才会有
32 table_id 可以查询 INNODB_SYS_TABLES 对照
04 字段长度4个字节
8000001c 我插入的记录主键 28(0X1c)
mysql> update t1 set id2=1000 where id1=14;
Query OK, 1 row affected (5 min 40.91 sec)
Rows matched: 1  Changed: 1  Warnings: 0

输出如下:

trx_undo_assign_undo:assign undo space:
RSEG SLOT:41,RSEG SPACE ID:1 PAGE NO:5
UNDO SLOT:1,UNDO SPACE ID:1 UNDO LOG HEADER PAGE NO:37,UNDO LOG HEADER OFFSET:1389,UNDO LOG LAST PAGE:37
trx_undo_page_report_modify:undo log record
TABLE_NAME:test/t1 TRX_ID:12604,UODO RECORD LEN:47
 len 47; hex 06560c0032000000003136e0260000002c052e048000000e010304800003e7000e00048000000e0304800003e70627; 

06560c0032000000003136e0260000002c052e048000000e010304800003e7000e00048000000e030480
就是undo record的记录
大体解析如下:

0656 :page内部本undo record结束的位置
0c:类型为 #define  TRX_UNDO_UPD_EXIST_REC  12(0X0c)
00: undo no,提交才会有
32: table_id 可以查询 INNODB_SYS_TABLES 对照
00:
0000003136e0:事物ID
260000002c052e:undo回滚指针
04:主键长度
8000000e:主键值
01
03:位置
04:被修改值的长度
800003e7:值为999(0x3e7)
000e:接下来字符的长度,记录原始值?
00:位置
04:长度
8000000e:主键值
03:位置
04:长度
800003e7:值为999(0x3e7)
0627:page内部本undo record开始的位置,0X0656-0X0627就是长度

作者微信:gp_22389860

上一篇下一篇

猜你喜欢

热点阅读