Installing and Using an MPI Environment on a Supercomputer

2018-12-12  不想当社畜

The MPI environment provided on a supercomputer is usually quite old, and it may fail to be invoked or be unusable altogether. So you typically need to configure your own MPI environment under your user directory.

The general prerequisite for configuring MPI:

Download the MPI source code.

Building and running the MPI environment on the university's supercomputer
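
First download and unpack the MPICH source and create a separate build directory (a minimal sketch; the download URL and the mpich-3.3 version are assumptions inferred from the mpich3.3 install path used below):

wget http://www.mpich.org/static/downloads/3.3/mpich-3.3.tar.gz   # fetch the MPICH 3.3 source (URL assumed)
tar -xzf mpich-3.3.tar.gz
cd mpich-3.3
mkdir build && cd build   # build out of tree, which is why configure is invoked as ../configure below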

../configure --prefix=/public/home/zhankang/miaozhaohui/software/install/mpich3.3 # specify the install directory
make
make install

If all of the above steps finish without errors, you will find the build results under the path given to --prefix:

zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3]ls
bin  include  lib  share
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3]cd bin/
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]ls
hydra_nameserver  hydra_persist  hydra_pmi_proxy  mpic++  mpicc  mpichversion  mpicxx  mpiexec  mpiexec.hydra  mpif77  mpif90  mpifort  mpirun  mpivars  parkill

Seeing the usual compiler wrappers mpicc and mpic++ means the installation succeeded.
Next, add the bin directory to the PATH environment variable.

# Open ~/.bashrc with vim and append the following line at the end:
# export PATH=/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin:$PATH
# (the directory is the bin directory of the MPI you built yourself), then run the commands below;
# if which mpiexec prints a path inside that bin directory, the setup worked.
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]vim ~/.bashrc 
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]source ~/.bashrc
zhankang@login2:[/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin]which mpiexec
/public/home/zhankang/miaozhaohui/software/install/mpich3.3/bin/mpiexec
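
As a further check that the self-built MPICH is the one being picked up, the tools installed next to mpiexec can report the version and the compiler command that mpicc wraps (a quick sanity check; the exact output depends on the system compilers):

mpichversion    # prints the version of the MPICH build that is now on PATH
mpicc -show     # prints the underlying compiler invocation used by the mpicc wrapper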

Testing a parallel program

#include <mpi.h>
#include <stdio.h>
// Check that the parallel environment works
int main(int argc, char* argv[])
{
    int rank;
    int size;
    int namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    // Initialize the MPI environment
    MPI_Init(&argc, &argv);
    // Get the rank of the current process
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    // Get the total number of processes running this program
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    // Get the name of the host the current process runs on
    MPI_Get_processor_name(processor_name, &namelen);

    printf("hello world from process %i of size %i   -- name %s .\n",rank,size,processor_name);
    // Shut down the MPI environment
    MPI_Finalize();
    return 0;
}

Compile and run:

zhankang@login2:[/public/home/zhankang/miaozhaohui/code]mpicc main.c 
zhankang@login2:[/public/home/zhankang/miaozhaohui/code]mpiexec -n 3 ./a.out 
hello world from process 0 of size 3   -- name login2 .
hello world from process 1 of size 3   -- name login2 .
hello world from process 2 of size 3   -- name login2 .

Success.

On a supercomputer cluster, many users submit many jobs at the same time. To keep the system in an optimal running state, the PBS job management system manages and schedules all compute jobs according to the resources available on the cluster's compute nodes.

Compute resources on the supercomputer are fairly tight, so the free queue (free) is used for these runs.
The corresponding PBS script:

#!/bin/bash
#PBS -N test
#PBS -l nodes=2:ppn=2
#PBS -j oe
#PBS -q free
#PBS -l walltime=0:05:0

cd $PBS_O_WORKDIR
JOBID=`echo $PBS_JOBID | awk -F. '{print $1}'`
echo This job id is $JOBID | tee job_info.log
echo Working directory is $PBS_O_WORKDIR | tee -a job_info.log
echo Start time is `date` | tee -a job_info.log
echo This job runs on the following nodes: | tee -a job_info.log
echo `cat $PBS_NODEFILE | sort | uniq` | tee -a job_info.log
NPROCS=`cat $PBS_NODEFILE | wc -l`
NNODES=`cat $PBS_NODEFILE | sort | uniq | wc -l`   # number of distinct nodes allocated to the job
PPROCS=$(($NPROCS/$NNODES))
echo This job has allocated $NNODES nodes, $NPROCS processors.| tee -a job_info.log

# write an mpiexec hostfile: one <host>:<processes per node> entry per allocated node (an 'i' is appended to each hostname here)
uniq $PBS_NODEFILE | sort | sed s/$/i:$PPROCS/ > $PBS_O_WORKDIR/hostfile

#source your profile
MPIRUN="mpiexec -np $NPROCS -f $PBS_O_WORKDIR/hostfile -env I_MPI_DEVICE=rdma"
JOBCMD="./a.out"
{ time $MPIRUN $JOBCMD; } >$PBS_O_WORKDIR/output_$JOBID.log 2>&1

echo End time is `date`| tee -a job_info.log
rm -f  $PBS_O_WORKDIR/hostfile
pkill -P $$
exit 0
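
Assuming the script above is saved as test.pbs (the file name is just for illustration), it can be submitted to the free queue and monitored with the standard PBS commands:

qsub test.pbs     # submit the job; PBS prints the job id
qstat -u $USER    # check the state of your jobs (Q = queued, R = running)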

The script requests two nodes with 2 cores each, 4 compute cores in total. The result:

hello world from process 1 of size 4   -- name c1137 .
hello world from process 3 of size 4   -- name c1138 .
hello world from process 2 of size 4   -- name c1138 .
hello world from process 0 of size 4   -- name c1137 .

This shows the environment is configured successfully!

Build problems on the faculty's supercomputer

The following error appeared during configure:

checking size of bool... 0
configure: error: unable to determine matching C type for C++ bool

No matter how I tweaked things, the same error kept appearing.

Workaround

When building PETSc, if the system has no MPI environment, PETSc can download and install MPICH on its own. So I used PETSc's own configure command to install MPI and compared it with the problems encountered in the manual build.

The command used:

./configure --download-mpich=/public/home2/kangzang/zhmiao/software/mpich-3.3.tar.gz --download-fblaslapack

This build succeeded: the compiled MPI can be found under the arch-linux2-c-debug directory, and the test program above again ran in parallel successfully:

hello world from process 0 of size 3   -- name clusadm .
hello world from process 1 of size 3   -- name clusadm .
hello world from process 2 of size 3   -- name clusadm .
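
To use this MPICH build on its own, its bin directory under the PETSc build tree can be added to PATH in the same way as before (a sketch; /path/to/petsc is a placeholder, only the arch-linux2-c-debug directory name comes from the build above):

export PATH=/path/to/petsc/arch-linux2-c-debug/bin:$PATH   # /path/to/petsc stands in for the PETSc source directory
which mpicc   # should now resolve to the copy under arch-linux2-c-debug/bin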