
GPU Support for XGBoost and LightGBM

2020-03-08  padluo

GBDT is the weapon of choice in tabular-data mining competitions. Its core idea is to iteratively train weak learners (decision trees) to build up a strong model; the resulting model fits well and, with proper regularization, is relatively resistant to overfitting. XGBoost and LightGBM are two frameworks that implement the GBDT algorithm. To speed up model training, this post records the process of building XGBoost and LightGBM with GPU support.
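To make the "iteratively train weak learners" idea concrete, here is a toy sketch of gradient boosting for squared-error regression using one-feature decision stumps. This is an illustration of the principle only, not how XGBoost or LightGBM actually grow trees:

```python
import numpy as np

def fit_stump(X, r):
    """Best single-threshold split of 1-D feature X against residuals r (squared error)."""
    best = None
    for t in np.unique(X)[:-1]:  # the largest value would leave the right side empty
        left, right = r[X <= t], r[X > t]
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    return best[1:]

def gbdt_fit(X, y, n_rounds=50, lr=0.1):
    """Each round fits a stump to the current residuals and adds a shrunken copy of it."""
    pred = np.full(len(y), y.mean())
    stumps = []
    for _ in range(n_rounds):
        t, vl, vr = fit_stump(X, y - pred)      # weak learner on residuals
        pred += lr * np.where(X <= t, vl, vr)   # shrunken additive update
        stumps.append((t, vl, vr))
    return (y.mean(), lr, stumps)

def gbdt_predict(model, X):
    base, lr, stumps = model
    pred = np.full(len(X), base)
    for t, vl, vr in stumps:
        pred += lr * np.where(X <= t, vl, vr)
    return pred

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
model = gbdt_fit(X, y)
print(gbdt_predict(model, X))  # approaches y as rounds accumulate
```

Each round shrinks the remaining residual, which is why more boosting rounds with a small learning rate usually beat fewer rounds with a large one.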

The build environment for this post is CentOS 7.2.

Installation Guide for XGBoost GPU support

Building XGBoost from source: building and installing XGBoost involves the following two steps.

Building the Shared Library

Build the shared library on CentOS. By default, distributed GPU training is disabled and only a single GPU is used. To enable distributed GPU training, set the option USE_NCCL=ON when building with CMake. Distributed GPU training depends on NCCL2, which can be obtained from https://developer.nvidia.com/nccl (the download requires logging in and completing a survey). The pre-installation requirements for the GPU build are as follows.

gcc/g++ 5.0+

The default gcc/g++ on CentOS 7.2 is 4.8.5, which does not meet the build requirement, so gcc/g++ must be upgraded. First, some preparation:

tar xjvf gcc-7.4.0.tar.bz2 && cd gcc-7.4.0
# prerequisite tarball versions referenced by ./contrib/download_prerequisites
gmp='gmp-6.1.0.tar.bz2'
mpfr='mpfr-3.1.4.tar.bz2'
mpc='mpc-1.0.3.tar.gz'
isl='isl-0.16.1.tar.bz2'

Then run:

./contrib/download_prerequisites
mkdir gcc-7.4.0-build && cd gcc-7.4.0-build
../configure --prefix=/usr/local/gcc-7.4.0 --enable-bootstrap --enable-build-with-cxx --enable-cloog-backend=isl --disable-libjava-multilib --enable-checking=release --enable-gold --enable-ld --enable-libada --enable-libssp --enable-lto --enable-objc-gc --enable-vtable-verify --enable-languages=c,c++,objc,obj-c++,fortran --disable-multilib
make -j8
make install
export PATH=/usr/local/gcc-7.4.0/bin/:$PATH
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64/:$LD_LIBRARY_PATH
export C_INCLUDE_PATH=/usr/local/include/:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/usr/local/include/:$CPLUS_INCLUDE_PATH

Verify the gcc version:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc-7.4.0/libexec/gcc/x86_64-redhat-linux/7.4.0/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr/local/gcc-7.4.0 --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 7.4.0 (GCC)

However, g++ and c++ still point to the old 4.8.5; update them via symlinks:

sudo cp /usr/bin/g++ /usr/bin/g++-4.8.5.bak
sudo rm /usr/bin/g++
sudo ln -s  /usr/local/gcc-7.4.0/bin/g++ /usr/bin/g++

sudo cp /usr/bin/c++ /usr/bin/c++-4.8.5.bak
sudo rm /usr/bin/c++
sudo ln -s  g++ /usr/bin/c++
c++ -v
g++ -v

CMake
Build CMake from source (see https://cmake.org/install/):

tar -zxvf cmake-3.16.2.tar.gz
cd cmake-3.16.2/
./bootstrap --prefix=/usr/local
make
make install
which cmake
cmake -version

NCCL2
See https://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html

When compiling applications, specify the directory path to where you installed NCCL, for example /usr/local/nccl-<version>/

tar -zxvf nccl-2.5.6-2.tar.gz
mv nccl-2.5.6-2 /usr/local/
ll /usr/local/nccl-2.5.6-2

GPU support for XGBoost

By default, distributed GPU training is disabled and only a single GPU will be used. To enable distributed GPU training, set the option USE_NCCL=ON. Distributed GPU training depends on NCCL2, available at https://developer.nvidia.com/nccl. Since NCCL2 is only available for Linux machines, distributed GPU training is available only for Linux.

First, get the source from Git:

git clone --recursive https://github.com/dmlc/xgboost

To compile with GPU training support, set the option USE_CUDA=ON; for distributed GPU training, also set USE_NCCL=ON and point NCCL_ROOT at the NCCL2 installation:

cd xgboost
mkdir build
cd build
cmake .. -DUSE_CUDA=ON -DUSE_NCCL=ON -DNCCL_ROOT=/usr/local/nccl-2.5.6-2
make -j4

XGBoost Python Package Installation

The Python package lives in xgboost/python-package:

cd python-package
sudo python setup.py install

Run the benchmark:

cd xgboost/tests/benchmark
python benchmark_tree.py

XGBoost GPU-accelerated Usage

Check the compute capability of your GPU card:
https://en.wikipedia.org/wiki/CUDA#GPUs_supported

Use the sklearn-style interface with sklearn-style parameters:

import xgboost as xgb
clf = xgb.XGBClassifier(tree_method='gpu_hist', gpu_id=0)

Installation Guide for LightGBM GPU support

Building LightGBM from source

Requirements

glibc and CMake already meet the requirements. The base repo of CentOS 7.2 ships boost 1.53; the EPEL repo has 1.69, or boost can be built from source.

libboost installation

# Installing from EPEL fails with a dependency conflict, so build from source instead:
#   sudo yum install boost169 boost169-devel
#   Error: Package: python36-libs-3.6.8-1.el7.x86_64 (centos72-epel)
#              Requires: libcrypto.so.10(OPENSSL_1.0.2)(64bit)
# rpm -qa | grep boost
tar -zxvf boost_1_69_0.tar.gz
cd boost_1_69_0/
./bootstrap.sh
./b2
./b2 install

OpenCL

Check the OpenCL headers and library:

ls /usr/local/cuda/lib64/libOpenCL.so /usr/local/cuda/include/

Build LightGBM GPU Version

Build the LightGBM GPU version:

git clone --recursive https://github.com/microsoft/LightGBM
cd LightGBM
mkdir build
cd build
cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..
make -j4

LightGBM Python-package Installation

cd LightGBM/python-package
python setup.py install

Main references:

https://xgboost.readthedocs.io/en/latest/
https://lightgbm.readthedocs.io/en/latest/index.html

