MVCC in InnoDB

2018-08-22, by 西元前__YP

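MySQL has read locks and write locks. Before MVCC was introduced, read locks allowed shared reads, but a row that already held a write lock could not be read by any other transaction. Most applications today read far more than they write, so to further improve the performance of concurrent reads, MVCC (Multi-Version Concurrency Control) was introduced.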

Starting with MySQL 5.5.5, InnoDB is the default storage engine. Its default isolation level is REPEATABLE READ, and it uses row-level locking.

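InnoDB uses B+ trees as the storage structure for its indexes. The index on the primary key is the clustered index (Cluster Index), and its leaf nodes hold the row data itself. A table has only one primary key, so it has only one clustered index. If the table defines no primary key, InnoDB picks the first UNIQUE index whose columns are all NOT NULL as the clustered index; if there is none, it generates a hidden row id column and clusters on that.
Every index other than the clustered index is a secondary index (Secondary Index). The leaf nodes of a secondary index store the indexed columns together with the primary key value of the row, which is then used to find the row in the clustered index.
Because indexes are organized as B+ trees, the data pages at the leaf level form what is effectively a doubly linked list, and inserts can cause pages to split (for details see http://hedengcheng.com/?p=525).
Every record, in the clustered index and in secondary indexes alike, carries a DELETED BIT that marks whether the record has been deleted.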

InnoDB rows are laid out roughly as follows:

[Figure: 201707060610327.jpg]

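In InnoDB every row has two hidden columns, DATA_TRX_ID and DATA_ROLL_PTR (plus a hidden primary key column, ROWID, when the table defines no primary key):

DATA_TRX_ID: the id of the transaction that most recently modified the row
DATA_ROLL_PTR: a pointer to the row's undo log record in the rollback segment (undo segment). All older versions of the row are organized in undo as a linked list, and this pointer refers to that history list for the row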

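Transaction list
From the moment it starts until it commits, every transaction in MySQL is kept in a transaction list called trx_sys. It is a basic linked list:

[Figure: 2017070606195027.jpg]
The list holds only transactions that have not yet committed. Once a transaction commits, it is removed from the list.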
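The bookkeeping described above can be sketched in a few lines. This is a toy model and not InnoDB's trx_sys: the names `ActiveTrxList`, `start` and `commit` are hypothetical, and a sorted `std::set` stands in for the linked list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

using trx_id_t = std::uint64_t;

// Toy stand-in for the list of uncommitted transactions (hypothetical).
class ActiveTrxList {
public:
    // Start a transaction: hand out the next id and record it as active.
    trx_id_t start() {
        trx_id_t id = next_id_++;
        active_.insert(id);
        return id;
    }

    // Commit: the transaction is removed from the list.
    void commit(trx_id_t id) {
        assert(active_.count(id) == 1);  // only active transactions can commit
        active_.erase(id);
    }

    bool is_active(trx_id_t id) const { return active_.count(id) != 0; }
    std::size_t size() const { return active_.size(); }

private:
    trx_id_t next_id_ = 1;        // ids grow monotonically
    std::set<trx_id_t> active_;   // ids of transactions not yet committed
};
```

After two calls to `start()` and one `commit()` on the first id, only the second id remains in the list.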

ReadView
ReadView is one of the more important parts of the MVCC implementation in the source code. Here is the class as it appears in the source:

// Friend declaration
class MVCC;
/** Read view lists the trx ids of those transactions for which a consistent
read should not see the modifications to the database. */
...
class ReadView {
    ...
    private:
        // Prevent copying
        ids_t(const ids_t&);
        ids_t& operator=(const ids_t&);
    private:
        /** Memory for the array */
        value_type* m_ptr;
        /** Number of active elements in the array */
        ulint       m_size;
        /** Size of m_ptr in elements */
        ulint       m_reserved;
        friend class ReadView;
    };
public:
    ReadView();
    ~ReadView();
    /** Check whether transaction id is valid.
    @param[in]  id      transaction id to check
    @param[in]  name        table name */
    static void check_trx_id_sanity(
        trx_id_t        id,
        const table_name_t& name);
// Decide whether a modification is visible
    /** Check whether the changes by id are visible.
    @param[in]  id  transaction id to check against the view
    @param[in]  name    table name
    @return whether the view sees the modifications of id. */
    bool changes_visible(
        trx_id_t        id,
        const table_name_t& name) const
        MY_ATTRIBUTE((warn_unused_result))
    {
        ut_ad(id > 0);
        if (id < m_up_limit_id || id == m_creator_trx_id) {
            return(true);
        }
        check_trx_id_sanity(id, name);
        if (id >= m_low_limit_id) {
            return(false);
        } else if (m_ids.empty()) {
            return(true);
        }
        const ids_t::value_type*    p = m_ids.data();
        return(!std::binary_search(p, p + m_ids.size(), id));
    }
    
private:
    // Disable copying
    ReadView(const ReadView&);
    ReadView& operator=(const ReadView&);
private:
    // Upper limit: the next transaction id to be assigned when the view was created
    /** The read should not see any transaction with trx id >= this
    value. In other words, this is the "high water mark". */
    trx_id_t    m_low_limit_id;
    // Smallest id among the active transactions
    /** The read should see all trx ids which are strictly
    smaller (<) than this value.  In other words, this is the
    "low water mark". */
    trx_id_t    m_up_limit_id;
    /** trx id of creating transaction, set to TRX_ID_MAX for free
    views. */
    trx_id_t    m_creator_trx_id;
    /** Set of RW transactions that was active when this snapshot
    was taken */
    ids_t       m_ids;
    /** The view does not need to see the undo logs for transactions
    whose transaction number is strictly smaller (<) than this value:
    they can be removed in purge if not needed by other views */
    trx_id_t    m_low_limit_no;
    /** AC-NL-RO transaction view that has been "closed". */
    bool        m_closed;
    typedef UT_LIST_NODE_T(ReadView) node_t;
    /** List of read views in trx_sys */
    byte        pad1[64 - sizeof(node_t)];
    node_t      m_view_list;
};

Within a transaction, visibility checks rely mainly on the following data members:

private:
    // Upper limit: the next transaction id to be assigned when the view was created
    /** The read should not see any transaction with trx id >= this
    value. In other words, this is the "high water mark". */
    trx_id_t    m_low_limit_id;
    // Smallest id among the active transactions
    /** The read should see all trx ids which are strictly
    smaller (<) than this value.  In other words, this is the
    "low water mark". */
    trx_id_t    m_up_limit_id;
    /** trx id of creating transaction, set to TRX_ID_MAX for free
    views. */
    trx_id_t    m_creator_trx_id;
    /** Set of RW transactions that was active when this snapshot
    was taken */
    ids_t       m_ids;

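m_low_limit_id: the high water mark. When the view is created it is set to the next transaction id the system will assign, which is larger than every id in the transaction list at that moment, so any change with trx id >= this value is invisible
m_up_limit_id: the low water mark. It is the smallest transaction id in the transaction list when the view is created, that is, the oldest transaction that has not yet committed. Any change with trx id < this value is visible
m_creator_trx_id: the trx_id of the transaction that created this view
m_ids: the ids of all read-write transactions that were in the transaction list when this read snapshot was taken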
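To make the four fields concrete, here is a minimal sketch of how a snapshot could be filled in from the transaction list. It is an illustration under stated assumptions, not the InnoDB code: `SimpleReadView` and `make_view` are hypothetical names, the caller is assumed to pass the ids active right now plus the next id the system would assign, and the creator's own id is assumed to be left out of the stored list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using trx_id_t = std::uint64_t;

// Simplified snapshot; fields mirror the ReadView members listed above.
struct SimpleReadView {
    trx_id_t low_limit_id;      // high water mark: ids >= this are invisible
    trx_id_t up_limit_id;       // low water mark: ids < this are visible
    trx_id_t creator_trx_id;    // transaction that owns the view
    std::vector<trx_id_t> ids;  // sorted ids that were active at snapshot time
};

// Hypothetical helper: build a view from the currently active ids and the
// next id to be assigned (assumed larger than every active id).
SimpleReadView make_view(std::vector<trx_id_t> active, trx_id_t next_trx_id,
                         trx_id_t creator) {
    // Assumption: the creator's own changes are handled by creator_trx_id,
    // so its id is not kept in the list.
    active.erase(std::remove(active.begin(), active.end(), creator),
                 active.end());
    std::sort(active.begin(), active.end());
    assert(active.empty() || active.back() < next_trx_id);

    SimpleReadView v;
    v.low_limit_id = next_trx_id;
    // With no other active transaction the two limits coincide.
    v.up_limit_id = active.empty() ? next_trx_id : active.front();
    v.creator_trx_id = creator;
    v.ids = std::move(active);
    return v;
}
```

For example, with active ids {5, 8, 10}, next id 12 and creator 10, the view gets up_limit_id 5, low_limit_id 12 and ids {5, 8}.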

As shown in the figure:

[Figure: 2017070606200122.jpg]

These fields decide whether a transaction's changes are visible. First, look at how the code handles it:

// Decide whether a modification is visible
    /** Check whether the changes by id are visible.
    @param[in]  id  transaction id to check against the view
    @param[in]  name    table name
    @return whether the view sees the modifications of id. */
    bool changes_visible(
        trx_id_t        id,
        const table_name_t& name) const
        MY_ATTRIBUTE((warn_unused_result))
    {
        ut_ad(id > 0);
        if (id < m_up_limit_id || id == m_creator_trx_id) {
            return(true);
        }
        check_trx_id_sanity(id, name);
        if (id >= m_low_limit_id) {
            return(false);
        } else if (m_ids.empty()) {
            return(true);
        }
        const ids_t::value_type*    p = m_ids.data();
        return(!std::binary_search(p, p + m_ids.size(), id));
    }

I don't know what many of these helper functions do, so let's just look at the if/else branches. First:

        if (id < m_up_limit_id || id == m_creator_trx_id) {
            return(true);
        }

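If the id is smaller than the id of the oldest transaction in the transaction list, that transaction must have committed before the snapshot was taken, so its changes are visible.
Alternatively, if the change was made by the current transaction itself (id == m_creator_trx_id), we can also see it.
Next: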

        if (id >= m_low_limit_id) {
            return(false);
        } else if (m_ids.empty()) {
            return(true);
        }

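InnoDB defaults to the RR (REPEATABLE READ) level. At this level, every change made by a transaction that was in the transaction list when the snapshot was taken, or that started afterwards, stays invisible to the current transaction for as long as it runs. Ids at or above m_low_limit_id are therefore rejected immediately.

If the transaction list was empty when the snapshot was taken (m_ids.empty()), then m_up_limit_id == m_low_limit_id and every id below m_low_limit_id is visible. Otherwise the id lies between the two limits and the final binary search over m_ids decides: an id found in m_ids was still active at snapshot time and is invisible, while an id that is not found had already committed and is visible.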
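A worked example may help with the branches. The sketch below copies the logic of changes_visible into a stripped-down struct; `SimpleReadView` and `example_view` are hypothetical names, and the sanity check on the table name is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

struct SimpleReadView {
    trx_id_t low_limit_id;      // ids >= this are never visible
    trx_id_t up_limit_id;       // ids < this are always visible
    trx_id_t creator_trx_id;    // own changes are always visible
    std::vector<trx_id_t> ids;  // active ids at snapshot time, sorted ascending

    bool changes_visible(trx_id_t id) const {
        assert(id > 0);
        if (id < up_limit_id || id == creator_trx_id) return true;
        if (id >= low_limit_id) return false;
        if (ids.empty()) return true;
        // Between the limits: visible only if it was not active at snapshot.
        return !std::binary_search(ids.begin(), ids.end(), id);
    }
};

// Example snapshot: transactions 5 and 8 were active, 12 is the next id to
// be assigned, and the view belongs to transaction 10.
inline SimpleReadView example_view() {
    return SimpleReadView{12, 5, 10, {5, 8}};
}
```

With this snapshot, ids 3, 7 and 10 are visible (older than every active transaction, committed before the snapshot, and the creator itself), while 5, 8, 12 and 20 are not (still active at snapshot time, or started after it).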
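Here, "visible" covers two cases:
Record visible and deleted bit = 0: the record is a visible, valid record.
Record visible and deleted bit = 1: the record is a visible deleted record. It was already deleted before the current transaction started.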

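When a lookup goes through the primary key (clustered index) and finds that the current version is not visible, it can follow DATA_ROLL_PTR back to the previous version in undo and check whether that version is visible.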
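The walk along the version chain can be sketched as below. This is a simplified model, not InnoDB's record format: `RowVersion` and `read_consistent` are hypothetical names, the `older` pointer plays the role of DATA_ROLL_PTR, and the visibility rule is passed in as a callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

using trx_id_t = std::uint64_t;

// One version of a row (hypothetical layout).
struct RowVersion {
    trx_id_t trx_id;          // DATA_TRX_ID: transaction that wrote this version
    bool deleted;             // deleted bit
    std::string value;        // user data
    const RowVersion* older;  // DATA_ROLL_PTR: previous version, or nullptr
};

// Return the newest version the reader may see. Returns nullptr when no
// version is visible or when the visible version is a deleted record.
const RowVersion* read_consistent(
    const RowVersion* newest,
    const std::function<bool(trx_id_t)>& visible) {
    for (const RowVersion* v = newest; v != nullptr; v = v->older) {
        assert(v->trx_id > 0);
        if (visible(v->trx_id)) {
            return v->deleted ? nullptr : v;
        }
    }
    return nullptr;
}
```

Given a chain written by transactions 15, 8 and 3 (newest first) and a reader that only sees ids below 5, the function skips the first two versions and returns the one written by transaction 3.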

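Lookups through a non-primary-key (secondary) index follow a slightly different flow:
First, check MAX_TRX_ID, the id of the latest transaction to update the secondary index page. If MAX_TRX_ID < m_up_limit_id, every record on the page is visible and the page can be served by a covering index scan: discard all records with deleted bit = 1 and return the records with deleted bit = 0.
If MAX_TRX_ID < m_up_limit_id does not hold, the page cannot be served by a covering index scan, and visibility has to be checked in the clustered index for each entry. In that case several secondary index entries on the page may match the search condition and yet point to the same record in the clustered index. How does InnoDB avoid returning the same clustered index record more than once? The code is: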

if (clust_rec
    && (old_vers
        || rec_get_deleted_flag(rec, dict_table_is_comp(sec_index->table)))
    && !row_sel_sec_rec_is_for_clust_rec(rec, sec_index, clust_rec, clust_index))

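Every clustered index record that satisfies this if condition is discarded. The condition reads as follows:

1. The lookup had to go back to the clustered index and obtained a record
2. The clustered index record is a rolled-back version, or the secondary index record is a deleted version
3. The key values of the clustered index record and the secondary index entry are not equal
Note that a record is discarded only when conditions 1, 2 and 3 all hold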
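The three conditions can be restated as a small predicate. This is only an illustration of the boolean logic: `SecLookup` and `should_discard` are hypothetical names, and plain strings stand in for the index key comparison that row_sel_sec_rec_is_for_clust_rec performs.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified inputs for one secondary index entry.
struct SecLookup {
    bool has_clust_rec;         // 1. a clustered index record was fetched
    bool clust_is_old_version;  // 2a. that record is a rolled-back version
    bool sec_delete_marked;     // 2b. the secondary entry has deleted bit = 1
    std::string sec_key;        // key value stored in the secondary entry
    std::string clust_key;      // same key column read from the clustered record
};

// True when the entry must be skipped: all three conditions hold at once.
bool should_discard(const SecLookup& l) {
    return l.has_clust_rec
        && (l.clust_is_old_version || l.sec_delete_marked)
        && l.sec_key != l.clust_key;
}
```

An entry whose key still matches the clustered record is kept even if the clustered record is an old version, because condition 3 fails.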

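With visibility covered, let's go a little deeper and look at how MVCC operates over the whole process (everything below is based on 何登成's blog):
When an update changes the primary key, the row before and after the update exists as two records in the clustered index. The difference is that the old record has its deleted bit set to 1, and DATA_ROLL_PTR points to the previous version in the undo segment.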

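For the clustered index, if the update does not change the primary key, no new record is created. The existing record is updated in place, the old version goes into the undo tablespace, and rollback goes through the undo pointer on the record. DATA_TRX_ID is updated at the same time.
For a secondary index, if the update does not change its key value, the secondary index record stays unchanged.
For a secondary index, an update that changes either the primary key or the secondary index key value always produces a new version (a new record) in the secondary index.
When the clustered index sets the deleted bit on a record, it also updates the DATA_TRX_ID column, and the old DATA_TRX_ID goes into the undo tablespace. When a secondary index sets the deleted bit, nothing is written to undo.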
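The clustered index behaviour above (in-place update and delete marking, each saving the old state in undo) can be modelled roughly as follows. The types and the functions `push_undo`, `update_in_place` and `delete_mark` are hypothetical, and a chain of heap objects stands in for the undo tablespace.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

using trx_id_t = std::uint64_t;

// Old image of a record kept in undo (hypothetical layout).
struct UndoRecord {
    trx_id_t trx_id;
    std::string value;
    std::unique_ptr<UndoRecord> older;
};

struct ClusteredRecord {
    trx_id_t trx_id;                       // DATA_TRX_ID
    bool deleted;                          // deleted bit
    std::string value;
    std::unique_ptr<UndoRecord> roll_ptr;  // DATA_ROLL_PTR
};

// Save the current trx id and value at the head of the undo chain.
void push_undo(ClusteredRecord& rec) {
    std::unique_ptr<UndoRecord> undo(new UndoRecord());
    undo->trx_id = rec.trx_id;
    undo->value = rec.value;
    undo->older = std::move(rec.roll_ptr);
    rec.roll_ptr = std::move(undo);
}

// Update that leaves the primary key alone: modify the record in place.
void update_in_place(ClusteredRecord& rec, trx_id_t writer,
                     std::string new_value) {
    assert(!rec.deleted);
    push_undo(rec);
    rec.trx_id = writer;
    rec.value = std::move(new_value);
}

// Delete: set the deleted bit and stamp the record with the deleter's id.
void delete_mark(ClusteredRecord& rec, trx_id_t writer) {
    push_undo(rec);
    rec.trx_id = writer;
    rec.deleted = true;
}
```

After an update by transaction 9 and a delete by transaction 11 on a record first written by transaction 5, the record carries trx id 11 with the deleted bit set, and its undo chain holds the versions from 9 and 5 in that order.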

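Purge
When a user deletes data, InnoDB does not remove it right away. It only marks the record, and a background thread, the Purge thread, later performs the real deletion in batches. Expired undo logs also need to be reclaimed. Expired here means the undo is no longer needed to build earlier versions and is no longer needed to roll back a transaction.
For the Purge flow, see http://mysql.taobao.org/monthly/2018/03/01/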
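As a rough illustration of "expired", the sketch below applies the rule from the m_low_limit_no comment in ReadView: undo from transactions whose number is strictly smaller than every open view's limit is no longer needed. The names `UndoLog`, `purge_limit` and `purge` are hypothetical, the fallback limit used when no view is open is an assumption, and the real purge flow involves much more than this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// Undo left behind by a committed transaction (hypothetical layout).
struct UndoLog {
    trx_id_t trx_no;  // transaction number of the committed transaction
};

// Smallest m_low_limit_no over all open views. When no view is open,
// fall back to a caller-supplied limit (assumption of this sketch).
trx_id_t purge_limit(const std::vector<trx_id_t>& view_low_limit_nos,
                     trx_id_t no_view_limit) {
    if (view_low_limit_nos.empty()) return no_view_limit;
    return *std::min_element(view_low_limit_nos.begin(),
                             view_low_limit_nos.end());
}

// Drop every undo log no view can still need; return how many were dropped.
std::size_t purge(std::vector<UndoLog>& history, trx_id_t limit) {
    auto first_removed = std::remove_if(
        history.begin(), history.end(),
        [limit](const UndoLog& u) { return u.trx_no < limit; });
    std::size_t removed =
        static_cast<std::size_t>(history.end() - first_removed);
    history.erase(first_removed, history.end());
    return removed;
}
```

With undo logs numbered 3, 7, 9 and 12 and open views whose limits are 9 and 15, the limit is 9, so the logs numbered 3 and 7 are dropped and 9 and 12 are kept.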

References:
http://hedengcheng.com/?p=148
https://liuzhengyang.github.io/2017/04/18/innodb-mvcc/
http://www.ywnds.com/?p=10418
