CAS原理

2021-01-17 本文已影响0人钟离惜

一、什么是CAS原子操作

大家应该还记得操作系统里面关于“原子操作”的概念，一个操作是原子的(atomic)，假设这个操作所处的层(layer)的更高层不能发现其内部实现与结构。原子操作能够是一个步骤，也能够是多个操作步骤。可是其顺序是不能够被打乱，或者分割掉仅仅运行部分。有了这个原子操作这个保证我们就能够实现无锁了。

CAS原子操作在维基百科中的代码描写叙述例如以下:

int compare_and_swap(int* reg, int oldval, int newval)
{
    ATOMIC();
    int old_reg_val = *reg;
    if (old_reg_val == oldval)
        *reg = newval;
    END_ATOMIC();
    return old_reg_val;
}

也就是检查内存*reg里的值是不是oldval，假设是的话。则对其赋值newval。上面的代码总是返回old_reg_value，调用者假设须要知道是否更新成功还须要做进一步推断，为了方便，它能够变种为直接返回是否更新成功，例如以下：

 bool compare_and_swap (int *accum, int *dest, int newval)
 {
   if ( *accum == *dest ) {
       *dest = newval;
       return true;
   }
   return false;
 }

二、CAS 在各个平台下的实现

2.1 Linux GCC 支持的 CAS

bool __sync_bool_compare_and_swap (type *ptr, type oldval type newval, ...)

2.2 Windows支持的CAS

InterlockedCompareExchange ( __inout LONG volatile *Target,
                                 __in LONG Exchange,
                                 __in LONG Comperand);

2.3C++ 11支持的CAS

template< class T >
bool atomic_compare_exchange_weak( std::atomic<T>* obj,
                                    T* expected, T desired );

三、CAS原子操作实现无锁的性能分析

3.1 测试方法叙述

测试平台是虚拟机下Centos7.4
方法a使用C11的互斥锁mutex，方法b使用Linux的__sync_bool_compare_and_swap，方法c使用C11的compare_exchange_weak
采用控制变量法，每种方法起100个线程控制各自的变量（初始0），保证线程安全，在各自的线程函数中循环10000次加法，然后重复上述操作100次，也就是最后的结果应该都是100*10000*100=1亿。
对比a、b、c三种方法的运行时间。

3.2 测试代码如下(计算时间类在后面给出)：

#include <cstdio>
#include <mutex>          // std::mutex
#include <atomic>         // std::atomic
#include <thread>         // std::thread
using namespace std;

#include "timer.h"

#define TIMES (100)
#define THREDS (100)
#define LOOPNUM (10000)

mutex mm;
size_t a = 0;
void funa()
{
    for (int i = 0; i < LOOPNUM; i++)
    {
        mm.lock();
        a++;
        mm.unlock();
    }
}
size_t b = 0;
void funb()
{
    for (int i = 0; i < LOOPNUM; i++)
    {
        size_t oldb = b;
        while (!__sync_bool_compare_and_swap(&b, oldb, oldb + 1))
        {
            oldb = b;
        }
    }
}
std::atomic<size_t> c(0);
void func()
{
    for (int i = 0; i < LOOPNUM; i++)
    {
        size_t oldc = c;
        while (!c.compare_exchange_weak(oldc, oldc + 1))
        {
            oldc = c;
        }
    }
}

int main(int argc, const char *argv[])
{
    CUseTime totalTime;
    {
        CUseTime useTime;
        for (int j = 0; j < TIMES; ++j)
        {
            thread* ttb[THREDS];
            for (int i = 0; i < THREDS; i++)
            {
                ttb[i] = new thread(funa);
            }
            for (int i = 0; i < THREDS; i++)
            {
                ttb[i]->join();
                delete ttb[i];
                ttb[i] = NULL;
            }
        }
        printf("use time %ld ms: C11 \"mutex\" a=%d\n", useTime.GetUseTime(), (int)a);
    }
    {
        CUseTime useTime;
        for (int j = 0; j < TIMES; ++j)
        {
            thread* ttb[THREDS];
            for (int i = 0; i < THREDS; i++)
            {
                ttb[i] = new thread(funb);
            }
            for (int i = 0; i < THREDS; i++)
            {
                ttb[i]->join();
                delete ttb[i];
                ttb[i] = NULL;
            }
        }
        printf("use time %ld ms: linux \"__sync_bool_compare_and_swap\" b=%d\n", useTime.GetUseTime(), (int)b);
    }
    {
        CUseTime useTime;
        for (int j = 0; j < TIMES; ++j)
        {
            thread* ttb[THREDS];
            for (int i = 0; i < THREDS; i++)
            {
                ttb[i] = new thread(func);
            }
            for (int i = 0; i < THREDS; i++)
            {
                ttb[i]->join();
                delete ttb[i];
                ttb[i] = NULL;
            }
        }
        printf("use time %ld ms: C11 \"compare_exchange_weak\" c=%d\n", useTime.GetUseTime(), (int)c);
    }
    printf("use time %ld ms total\n", totalTime.GetUseTime());
    
    return 0;
}

3.3 测试结果

为了方便对比对文字做了后处理

[zlm@localhost Debug]$ ./test.out 
use time 23360 ms: a=100000000 C++11 "mutex"
use time  8850 ms: b=100000000 linux "__sync_bool_compare_and_swap"
use time 24190 ms: c=100000000 C++11 "compare_exchange_weak"
use time 56400 ms: total
[zlm@localhost Debug]$ ./test.out 
use time 24950 ms: a=100000000 C++11 "mutex"
use time  8900 ms: b=100000000 linux "__sync_bool_compare_and_swap"
use time 24400 ms: c=100000000 C++11 "compare_exchange_weak"
use time 58250 ms: total

从耗时上可以看出，Linux的__sync_bool_compare_and_swap应该是效果最好的，在保证了线程安全的前提下拥有最少的耗时，几乎是另外两种方式的三分之一。

3.4 消耗时间类头文件

文件名：timer.h
代码如下：

#ifndef _TIMER_
#define _TIMER_

#include <ctime>

class CUseTime
{
public:
    CUseTime() : m_start_time(clock()) {}
    virtual ~CUseTime() {}
    long GetUseTime() { return (clock() - m_start_time) / (CLOCKS_PER_SEC / 1000); };
private:
    long m_start_time;
};

#endif // !_TIMER_

参考文章
CAS原子操作实现无锁及性能分析