AutoDock - Compilation and Testing

2024-04-11  红薯爱帅

1. Preparation



export GPU_INCLUDE_PATH=/usr/local/cuda-11.2/include
export GPU_LIBRARY_PATH=/usr/local/cuda-11.2/lib64
export PATH="/usr/local/cuda-11.2/bin:$PATH"
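Before building, it can help to confirm that the exported variables point at a real CUDA installation. The sketch below is not part of AutoDock-GPU: `check_cuda_paths` is a hypothetical helper, and treating the presence of `cuda.h` and a `libcudart.so*` file as the sign of a usable toolkit is an assumption made here for illustration.

```shell
# Hypothetical helper (not part of AutoDock-GPU): returns 0 when the include
# directory holds cuda.h and the library directory holds a libcudart.so* file.
check_cuda_paths() {
    inc_dir=$1
    lib_dir=$2
    [ -f "$inc_dir/cuda.h" ] || { echo "missing cuda.h in '$inc_dir'"; return 1; }
    # ls fails when the glob matches nothing.
    ls "$lib_dir"/libcudart.so* >/dev/null 2>&1 || { echo "missing libcudart in '$lib_dir'"; return 1; }
    echo "CUDA paths look usable"
}

# Check the variables exported above; an unset variable is reported as missing.
check_cuda_paths "${GPU_INCLUDE_PATH:-}" "${GPU_LIBRARY_PATH:-}" || true
```

If the check reports a missing file, the paths probably need to be adjusted to the CUDA version actually installed on the machine.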



For example, the RTX 3090 has compute capability 8.6, so the corresponding TARGETS value is 86.


TARGETS = 52 60 61 70 86
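The mapping from compute capability to a TARGETS entry is just the capability with the dot removed. A minimal sketch, where `cc_to_target` is a hypothetical helper name introduced only for this example:

```shell
# Convert a compute capability such as "8.6" into the two-digit form used in TARGETS.
cc_to_target() {
    printf '%s\n' "$1" | tr -d '.'
}

cc_to_target 8.6   # RTX 3090, prints 86
```

On machines with a reasonably recent NVIDIA driver, `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` should report the capability of each installed GPU; older drivers may not support that query field, in which case NVIDIA's published compute capability tables are the fallback.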



$ nvcc --help
--gpu-code <code>,...                           (-code)                         
        Specify the name of the NVIDIA GPU to assemble and optimize PTX for.
        nvcc embeds a compiled code image in the resulting executable for each specified
        <code> architecture, which is a true binary load image for each 'real' architecture
        (such as sm_50), and PTX code for the 'virtual' architecture (such as compute_50).
        During runtime, such embedded PTX code is dynamically compiled by the CUDA
        runtime system if no binary load image is found for the 'current' GPU.
        Architectures specified for options '--gpu-architecture' and '--gpu-code'
        may be 'virtual' as well as 'real', but the <code> architectures must be
        compatible with the <arch> architecture.  When the '--gpu-code' option is
        used, the value for the '--gpu-architecture' option must be a 'virtual' PTX
        architecture.
        For instance, '--gpu-architecture=compute_60' is not compatible with '--gpu-code=sm_52',
        because the earlier compilation stages will assume the availability of 'compute_60'
        features that are not present on 'sm_52'.
        Note: the values compute_30, compute_32, compute_35, compute_37, compute_50,
        sm_30, sm_32, sm_35, sm_37 and sm_50 are deprecated and may be removed in
        a future release.
        Allowed values for this option:  'compute_35','compute_37','compute_50',
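To relate the TARGETS values to the nvcc options quoted above: each target number N corresponds to a 'real' architecture `sm_N` and a 'virtual' architecture `compute_N`. The sketch below prints one `-gencode` pair per target. How the AutoDock-GPU Makefile itself expands TARGETS is not shown in this post, so `targets_to_gencode` is a hypothetical illustration rather than the project's actual build logic.

```shell
# Hypothetical helper: print one nvcc -gencode option per TARGETS entry.
targets_to_gencode() {
    flags=""
    for t in "$@"; do
        flags="$flags -gencode arch=compute_$t,code=sm_$t"
    done
    # Drop the leading space before printing.
    printf '%s\n' "${flags# }"
}

targets_to_gencode 52 60 61 70 86
```

Each pair keeps the virtual and real architecture at the same level, which satisfies the compatibility rule in the help text (the `code` architecture must not be older than the `arch` it was compiled from).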

2. Compilation

The first step is to set the environment variables GPU_INCLUDE_PATH and GPU_LIBRARY_PATH,
as described here:

Parameters   Description                                                Values
<TYPE>       Accelerator chosen                                         CPU, GPU, CUDA, OCLGPU
<NWI>        Work-group/thread block size, number of work-items (wi)    1, 2, 4, 8, 16, 32, 64, 128, 256

When DEVICE=GPU is chosen, the Makefile automatically tests whether it can compile CUDA successfully. To override this, use DEVICE=CUDA or DEVICE=OCLGPU. The CPU target is only supported using OpenCL. Furthermore, an OpenMP-enabled overlapped pipeline (for setup and processing) can be compiled with OVERLAP=ON.
Hints: The best work-group size depends on the GPU and workload. Try NUMWI=128 or NUMWI=64 for modern cards with the example workloads. On macOS, use NUMWI=1 for CPUs.
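Putting the parameters together, a build for a modern GPU might be invoked as sketched below. `build_cmd` is a hypothetical wrapper introduced for this example: it only validates NUMWI against the values listed above and prints the `make` invocation instead of running it.

```shell
# Hypothetical wrapper: validate NUMWI, then print the make command line.
build_cmd() {
    device=$1
    numwi=$2
    case "$numwi" in
        1|2|4|8|16|32|64|128|256) ;;
        *) echo "unsupported NUMWI: $numwi" >&2; return 1 ;;
    esac
    echo "make DEVICE=$device NUMWI=$numwi"
}

build_cmd GPU 128   # prints: make DEVICE=GPU NUMWI=128
```

Run the printed command from the AutoDock-GPU source directory once the environment variables from the preparation step are set.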

After successful compilation, the host binary autodock_<type>_<N>wi is placed under bin.

Binary-name portion   Description                    Values
<type>                Accelerator chosen             cpu, gpu
<N>                   Work-group/thread block size   1, 2, 4, 8, 16, 32, 64, 128, 256
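The naming rule can be expressed as a one-line helper. This is only an illustration of the `autodock_<type>_<N>wi` pattern described above; `binary_path` is a hypothetical name.

```shell
# Build the expected binary path from <type> (cpu or gpu) and <N> (work-group size).
binary_path() {
    echo "bin/autodock_${1}_${2}wi"
}

binary_path gpu 256   # prints: bin/autodock_gpu_256wi
```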

3. Testing

$ ./autodock_gpu_256wi --lfile /home/shuzhang/ai/code/moldock/autodock/output/tmpnnuuab_g.pdbqt --ffile /data/autodock/grid/0cb544cb1474ff6d917fe409598886cb/protein.maps.fld --devnum 2 --ngen 1 --nrun 2 --stopstd 1.999
AutoDock-GPU version: v1.5.3-73-gf5cf6ffdd0c5b3f113d5cc424fabee51df04da7e

Running 1 docking calculation

Cuda device:                              NVIDIA GeForce RTX 3090 (#2 / 6)
Available memory on device:               21182 MB (total: 24268 MB)

CUDA Setup time 0.248527s
(Thread 52 is setting up Job #1)

Running Job #1
    Using heuristics: (capped) number of evaluations set to 6122449
    Warning: The set number of evals is 48.98% of the uncapped heuristics estimate of 12500000 evals.
             This means this docking may not be able to converge. Increasing --heurmax may improve
             convergence but will also increase runtime.
             AutoStop will not stop before 10.50% (643004) of the set number of evaluations.
    Local-search chosen method is: ADADELTA (ad)

Rest of Setup time 0.006511s

Executing docking runs, stopping automatically after either reaching 2.00 kcal/mol standard deviation of
the best molecules of the last 4 * 5 generations, 1 generations, or 6122449 evaluations:

Generations |  Evaluations |     Threshold    |  Average energy of best 10%  | Samples | Best Inter + Intra
          0 |          150 | 1145.95 kcal/mol |  312.45 +/-  222.27 kcal/mol |       4 |   80.37 kcal/mol
          1 |         9167 | 1145.95 kcal/mol |   91.60 +/-  157.57 kcal/mol |      56 |   -7.85 kcal/mol

                                   Finished evaluation after reaching
                          9167 evaluations. Best inter + intra    -7.85 kcal/mol.

Docking time 0.002292s

Shutdown time 0.002002s

Job #1 took 0.011 sec after waiting 0.347 sec for setup

(Thread 52 is processing Job #1)
Run time of entire job set (1 file): 0.360 sec
Processing time: 0.002 sec

All jobs ran without errors.
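When many dockings are run, the final energy can be pulled out of the console output with a small filter. The sketch assumes the output was redirected to a file and that the summary line keeps the "Best inter + intra ... kcal/mol." wording seen above; `best_energy` is a hypothetical helper, not an AutoDock-GPU tool.

```shell
# Print the final "Best inter + intra" energy (in kcal/mol) from a saved log.
# The match is case-sensitive, so the table header "Best Inter + Intra" is skipped.
best_energy() {
    awk 'index($0, "Best inter + intra") { print $(NF - 1) }' "$1"
}

# Demo on the two summary lines of the run above.
log=$(mktemp)
cat > "$log" <<'EOF'
                                   Finished evaluation after reaching
                          9167 evaluations. Best inter + intra    -7.85 kcal/mol.
EOF
best_energy "$log"   # prints: -7.85
rm -f "$log"
```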

4. References


