C++ Is Fast

2021-12-27  疾风2018

Memory-access-intensive test

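Summing 1 billion integers took 700 ms with OpenMP enabled and was faster without it, at 150 ms. This is because memory access dominates the running time, and computing on multiple cores does not make memory access any more efficient.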

One more data point: Java took 300 ms to sum 100 million integers, which by contrast highlights how efficient C++ is.
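To put the figures above on a common scale, here is a minimal sketch that converts an element count and an elapsed time into throughput. The function names are hypothetical, and the 2-byte element size used in the note after the block is an assumption based on the short array in the C++ listing.

```cpp
#include <cassert>

// Hypothetical helper: elements processed per second, given an element
// count and the elapsed time in milliseconds.
double elements_per_second(double count, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    return count / (elapsed_ms / 1000.0);
}

// Hypothetical helper: effective read bandwidth in GB/s (1 GB = 1e9 bytes).
// bytes_per_element is an assumption supplied by the caller.
double gigabytes_per_second(double count, double bytes_per_element, double elapsed_ms)
{
    return elements_per_second(count, elapsed_ms) * bytes_per_element / 1e9;
}
```

With the numbers quoted here, elements_per_second(1e9, 150.0) is about 6.67e9 for C++ without OpenMP, and elements_per_second(1e8, 300.0) is about 3.33e8 for Java, a ratio of roughly 20. Assuming 2 bytes per element, gigabytes_per_second(1e9, 2.0, 150.0) is about 13.3 GB/s.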

Code

The code is pasted below.

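// cpp2.cpp : This file contains the "main" function. Program execution begins and ends there.
//

#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>

int main()
{
    using namespace std::chrono;

    const long count = 1000000000;

    // 1 billion 16-bit values, about 2 GB of memory.
    short* data = new short[count];
    for (long i = 0; i < count; i++)
    {
        data[i] = (short)std::rand();
    }

    // steady_clock is monotonic, which makes it the safer choice for measuring intervals.
    auto now = steady_clock::now();
    // long is only 32 bits with MSVC; the sum of a billion values needs 64 bits.
    long long sum = 0;

# ifdef _OPENMP
    std::printf("Compiled by an OpenMP-compliant implementation.\n");
# endif

    // reduction(+:sum) gives each thread a private partial sum and combines
    // them at the end. Without it, the threads race on the shared variable,
    // which gives a wrong sum and can itself slow the loop down.
#pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < count; i++)
    {
        sum += data[i];
    }

    auto now2 = steady_clock::now();

    std::cout << "Sum:" << sum << std::endl << "time: " << duration_cast<milliseconds>(now2 - now).count();

    // Memory obtained with new[] must be released with delete[].
    delete[] data;
}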

Compute-intensive test

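Generating random values for 1 billion integers took 9 seconds with OpenMP enabled. Without OpenMP it took 4 times as long, 36 seconds (the program ran on a 4-core CPU). Only compute-intensive programs can take advantage of multiple cores.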
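The C++ standard does not guarantee that std::rand is safe to call from several threads, so a parallel fill that relies on it is not portable. As a rough sketch of one alternative, the block below gives each thread its own std::mt19937 engine. This swaps in a different generator from the one measured in this post; the function name fill_random, the seeding scheme, and the value range 0 to 32767 are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: fills `out` with pseudo-random values in [0, 32767].
// Each thread owns its engine, so no generator state is shared.
void fill_random(std::vector<short>& out, unsigned base_seed)
{
    const long n = static_cast<long>(out.size());

#pragma omp parallel
    {
#ifdef _OPENMP
        const unsigned tid = static_cast<unsigned>(omp_get_thread_num());
#else
        const unsigned tid = 0;  // built without OpenMP: a single thread
#endif
        // Assumed seeding scheme: offset the base seed by the thread id.
        std::mt19937 engine(base_seed + tid);
        std::uniform_int_distribution<int> dist(0, 32767);

        // The iterations are divided among the threads of the enclosing region.
#pragma omp for
        for (long i = 0; i < n; i++)
        {
            out[i] = static_cast<short>(dist(engine));
        }
    }
}
```

A call such as fill_random(values, 42u) on a std::vector<short> fills it in place. When the file is built without OpenMP, the pragmas are ignored and the loop runs on one thread.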

Code

// cpp2.cpp : This file contains the "main" function. Program execution begins and ends there.
//

#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>

int main()
{
    using namespace std::chrono;

    const long count = 1000000000;

    // 1 billion 16-bit values, about 2 GB of memory.
    short* data = new short[count];

    // Untimed first pass. As a side effect, it touches every page of the
    // array before the clock starts.
    // The standard does not guarantee that std::rand is thread-safe, so
    // calling it from several threads is not portable.
#pragma omp parallel for
    for (long i = 0; i < count; i++)
    {
        data[i] = (short)std::rand();
    }

    // steady_clock is monotonic, which makes it the safer choice for measuring intervals.
    auto now = steady_clock::now();

# ifdef _OPENMP
    std::printf("Compiled by an OpenMP-compliant implementation.\n");
# endif

    // Timed pass: generate the random values again.
#pragma omp parallel for
    for (long i = 0; i < count; i++)
    {
        data[i] = (short)std::rand();
    }

    auto now2 = steady_clock::now();

    std::cout << "time: " << duration_cast<milliseconds>(now2 - now).count();

    delete[] data;
}

File mapping test

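First write 100 million int values to a file, then load the file's data through file mapping and compute the sum over it. With OpenMP enabled and every optimization option turned all the way up, this takes 200 ms. By that comparison, it is only about on par with Java.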
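The first step, writing int values to a binary file, could look roughly like the sketch below, which uses only the standard library. The helper names write_ints and sum_ints, the chunk size, and the choice of writing the values 0, 1, 2, ... are assumptions made for illustration. The second helper reads the file back with ordinary stream I/O rather than file mapping, purely as a way to check the result.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes the values 0, 1, ..., count-1 as raw ints,
// buffering them in chunks to avoid one write call per value.
bool write_ints(const std::string& path, std::size_t count)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
    {
        return false;
    }

    const std::size_t chunk = 65536;  // assumed buffer size, in elements
    std::vector<int> buffer;
    buffer.reserve(chunk);

    for (std::size_t i = 0; i < count; )
    {
        buffer.clear();
        for (; i < count && buffer.size() < chunk; i++)
        {
            buffer.push_back(static_cast<int>(i));
        }
        out.write(reinterpret_cast<const char*>(buffer.data()),
                  static_cast<std::streamsize>(buffer.size() * sizeof(int)));
    }

    out.close();
    return !out.fail();
}

// Hypothetical helper: reads the file back one int at a time and sums it.
long long sum_ints(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    long long sum = 0;
    int value = 0;
    while (in.read(reinterpret_cast<char*>(&value), sizeof(value)))
    {
        sum += value;
    }
    return sum;
}
```

For a small file, write_ints("ints.bin", 1000) followed by sum_ints("ints.bin") returns 499500, the sum of 0 through 999. The file name here is a placeholder.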

Code

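#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>

#include <chrono>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>

int main()
{
    using namespace std::chrono;
    using namespace boost::interprocess;

    constexpr std::size_t data_size = 100000000;
    const std::string file_path = "C:\\Users\\DELL\\cx1.bin";
    constexpr std::size_t file_size = data_size * sizeof(int);

    // Create a file of file_size bytes by seeking to the last byte and
    // writing it. Only that final byte is written explicitly.
    std::filebuf fbuf;
    fbuf.open(file_path, std::ios_base::in | std::ios_base::out | std::ios_base::trunc | std::ios_base::binary);
    fbuf.pubseekoff(static_cast<std::streamoff>(file_size - 1), std::ios_base::beg);
    fbuf.sputc(static_cast<char>(0x88));
    fbuf.close();

    // The timed section covers both the mapping and the summation.
    auto now = steady_clock::now();
    file_mapping m_file(file_path.c_str(), read_write);
    // Map the whole file with read-write permissions in this process
    mapped_region region(m_file, read_write, 0, file_size);
    const int* data = static_cast<const int*>(region.get_address());
    // A signed loop bound avoids a signed/unsigned comparison in the loop.
    const long count = static_cast<long>(region.get_size() / sizeof(int));

    long long sum = 0;

# ifdef _OPENMP
    std::printf("Compiled by an OpenMP-compliant implementation.\n");
# endif

    // reduction(+:sum) gives each thread a private partial sum and combines
    // them at the end. Without it, the threads race on the shared variable.
#pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < count; i++)
    {
        sum += data[i];
    }

    auto now2 = steady_clock::now();

    std::cout << "Sum:" << sum << std::endl << "time: " << duration_cast<milliseconds>(now2 - now).count();

    return 0;
}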