
Summary of TensorFlow GPU Versions

2018-04-04  SpikeKing


The GPU build of TensorFlow depends heavily on the system's CUDA environment (GPU hardware, driver, and CUDA libraries).


Check the CUDA version:

(venv)$ nvcc  --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44
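
The same check can be scripted. The following is a minimal sketch, assuming the `nvcc --version` output keeps the `release X.Y` format shown above; `parse_nvcc_release` and `installed_cuda_release` are hypothetical helper names, not part of any CUDA tool.

```python
import re
import subprocess


def parse_nvcc_release(text):
    """Return the CUDA release (e.g. '8.0') found in `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run nvcc and parse its output; this requires nvcc to be on PATH."""
    output = subprocess.check_output(["nvcc", "--version"])
    return parse_nvcc_release(output.decode("utf-8"))


# Parse the last line of the transcript shown above.
sample = "Cuda compilation tools, release 8.0, V8.0.44"
print(parse_nvcc_release(sample))
```

On a machine with the CUDA toolkit installed, `installed_cuda_release()` would run the real command instead of parsing a pasted line.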

The following summarizes the TF GPU versions that work with CUDA 8.0:

Error reported by TF 1.5 and later, which require CUDA 9.0:

ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
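
The library name in this message tells you which CUDA release the installed TF package expects. A small sketch of pulling that out of the message, assuming the `libNAME.so.VERSION` naming seen here; `missing_cuda_library` is a hypothetical helper.

```python
import re


def missing_cuda_library(message):
    """Extract (library, version) from a 'libNAME.so.X.Y: cannot open ...' message."""
    match = re.search(r"(lib\w+)\.so\.(\d+(?:\.\d+)*)", message)
    if match is None:
        return None
    return match.group(1), match.group(2)


error = ("libcublas.so.9.0: cannot open shared object file: "
         "No such file or directory")
print(missing_cuda_library(error))
```

Here the version part is 9.0, so the package was built against CUDA 9.0 and cannot load on a machine that only has CUDA 8.0.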

Values of the CUDA environment variables used with TF 1.4:

(venv) $ echo $CUDA_DEVICE_ORDER
PCI_BUS_ID
(venv) $ echo $CUDA_VISIBLE_DEVICES
0,1,2,3

Commands to export the CUDA environment variables:

export CUDA_DEVICE_ORDER="PCI_BUS_ID"
export CUDA_VISIBLE_DEVICES="0,1,2,3"
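
If you prefer to set these inside a script rather than in the shell, a minimal sketch follows. It assumes the assignments run before TensorFlow is imported, since the variables are read when CUDA initializes.

```python
import os

# Set these before `import tensorflow`; changing them afterwards has no effect
# on a process that has already initialized CUDA.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

# import tensorflow as tf  # import only after the variables are set

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(visible)
```

Restricting the list, for example to "0", makes only that GPU visible to the process.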

Naming convention of the cuDNN packages for Linux:

cudnn-[CUDA version]-[operating system]-[64-bit]-[cuDNN library version]
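
To illustrate the naming scheme, here is a sketch that splits a package file name into its fields. The regular expression and the `parse_cudnn_package` helper are assumptions based only on the `.tgz` names used in this article.

```python
import re

# Matches names such as cudnn-8.0-linux-x64-v6.0.tgz
CUDNN_NAME = re.compile(
    r"^cudnn-(?P<cuda>[\d.]+)-(?P<os>\w+)-(?P<arch>\w+)-v(?P<cudnn>[\d.]+)\.tgz$"
)


def parse_cudnn_package(filename):
    """Return a dict with the cuda, os, arch and cudnn fields, or None if no match."""
    match = CUDNN_NAME.match(filename)
    return match.groupdict() if match else None


print(parse_cudnn_package("cudnn-8.0-linux-x64-v6.0.tgz"))
```

The `cuda` field is the one to compare against the output of `nvcc --version`.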

If you cannot log in to the NVIDIA website, wait patiently and try again, or download the package from CSDN.

Download:

wget http://developer.download.nvidia.com/compute/redist/cudnn/v6.0/cudnn-8.0-linux-x64-v6.0.tgz
wget http://developer.download.nvidia.com/compute/redist/cudnn/v7.0.5/cudnn-8.0-linux-x64-v7.tgz

Set the LD_LIBRARY_PATH variable:

echo $LD_LIBRARY_PATH
/usr/local/cuda-9.0/lib64:/usr/local/cuda-8.0/lib:/usr/local/cuda/lib:/usr/local/cuda-8.0/lib64/:/usr/local/cuda/lib64/:

export LD_LIBRARY_PATH="/usr/local/cuda-9.0/lib64:/usr/local/cuda-8.0/lib:/usr/local/cuda/lib:/usr/local/cuda-8.0/lib64/:/usr/local/cuda/lib64/:"
export CUDA_DEVICE_ORDER="PCI_BUS_ID"
export CUDA_VISIBLE_DEVICES="0,1,2,3"
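
A mistyped entry in LD_LIBRARY_PATH is easy to overlook in a long value. The sketch below splits the value on colons and reports entries that are not directories on the current machine; both helper names are hypothetical.

```python
import os


def split_library_path(value):
    """Split an LD_LIBRARY_PATH value into its non-empty entries."""
    return [entry for entry in value.split(":") if entry]


def missing_entries(value):
    """Return the entries that are not existing directories on this machine."""
    return [entry for entry in split_library_path(value) if not os.path.isdir(entry)]


example = "/usr/local/cuda-8.0/lib64/:/usr/local/cuda/lib64/:"
print(split_library_path(example))
```

Calling `missing_entries(os.environ.get("LD_LIBRARY_PATH", ""))` on the server lists the entries worth double-checking.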

[Figure: CUDA version list]

The cuDNN archive extracts to a cuda directory that contains two folders, include and lib64.

Run the following commands. Writing under /usr requires sudo privileges:

tar -xzvf cudnn-8.0-linux-x64-v5.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

Python script for testing the GPUs:

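from __future__ import print_function

from tensorflow.python.client import device_lib


def get_available_gpus():
    """
    Shell command to inspect the GPUs: nvidia-smi
    Shell command to see which process is using a GPU: ps aux | grep PID
    :return: number of GPUs visible to TensorFlow
    """
    local_device_protos = device_lib.list_local_devices()
    gpu_names = [x.name for x in local_device_protos if x.device_type == 'GPU']
    # print() with a single argument works on both Python 2 and Python 3.
    print("all: %s" % [x.name for x in local_device_protos])
    print("gpu: %s" % gpu_names)
    return len(gpu_names)


get_available_gpus()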

The script prints the list of available GPU devices.

So if your CUDA version is 8.0, the newest prebuilt TF GPU release you can use is 1.4. Do not hope for anything newer!
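
That rule can be written down as a tiny lookup. This sketch covers only the two cases discussed in this article (up to 1.4 on CUDA 8.0, 1.5 and later on CUDA 9.0); it is not a complete compatibility table, and `required_cuda` is a hypothetical helper.

```python
def required_cuda(tf_version):
    """CUDA release expected by the prebuilt TF GPU package, per this article only."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    # Tuple comparison: (1, 4) and below map to CUDA 8.0, anything newer to 9.0.
    return "8.0" if (major, minor) <= (1, 4) else "9.0"


print(required_cuda("1.4.1"))
print(required_cuda("1.5.0"))
```

For releases outside this range, check the official TensorFlow build configuration tables instead of relying on this helper.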

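CUDA and cuDNN provide the GPU environment used for neural network training and are tied to the server's hardware and driver setup. Different CUDA versions support different versions of the machine learning libraries, so determine the server's CUDA version first and then install the matching libraries.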

Query the CUDA version, e.g. 8.0.44:

wcl1@BJYS-AMAXGPU-34-1:~$ cat /usr/local/cuda/version.txt
CUDA Version 8.0.44

nvcc  --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44

Query the cuDNN version, e.g. 5.1.5:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      1
#define CUDNN_PATCHLEVEL 5
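
The three defines can be joined into a single version string. A minimal sketch, assuming the header uses the `#define CUDNN_MAJOR` style shown above; `parse_cudnn_version` is a hypothetical helper.

```python
import re


def parse_cudnn_version(header_text):
    """Build 'MAJOR.MINOR.PATCH' from the cuDNN defines, or None if one is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+%s\s+(\d+)" % name, header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


header = """#define CUDNN_MAJOR      5
#define CUDNN_MINOR      1
#define CUDNN_PATCHLEVEL 5
"""
print(parse_cudnn_version(header))
```

On the server, the text could be read from /usr/local/cuda/include/cudnn.h instead of the pasted sample.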

Query the GPU information, e.g. 4 GPUs:

nvidia-smi
Wed Mar 28 12:32:01 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.26                 Driver Version: 387.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 23%   22C    P8    16W / 250W |    289MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 23%   23C    P8     8W / 250W |     10MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 23%   18C    P8     8W / 250W |     10MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:83:00.0 Off |                  N/A |
| 23%   19C    P8     8W / 250W |     10MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     20009      C   python                                       279MiB |
+-----------------------------------------------------------------------------+
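
The memory column of this table can be read programmatically. The sketch below scans pasted `nvidia-smi` text for the `used MiB / total MiB` pattern, assuming the table layout shown above; `gpu_memory_usage` is a hypothetical helper, and the sample rows are copied from the output in this article.

```python
import re


def gpu_memory_usage(smi_output):
    """Return one (used_mib, total_mib) tuple per GPU row found in the table."""
    pairs = re.findall(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_output)
    return [(int(used), int(total)) for used, total in pairs]


sample = """
| 23%   22C    P8    16W / 250W |    289MiB / 11172MiB |      0%      Default |
| 23%   23C    P8     8W / 250W |     10MiB / 11172MiB |      0%      Default |
"""
usage = gpu_memory_usage(sample)
print(len(usage), usage)
```

The length of the returned list equals the number of GPU rows, since the process table at the bottom has no `used / total` pair.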