Distributing Serial and Parallel Tasks on Single-Node and Cluster Servers

2021-04-08  徐诗芬

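On a single-node server, the command file test.sh can be executed line by line in parallel batches. test.sh contains 5,000 commands: every 250 consecutive lines are chained to run serially, and the resulting 20 chunks are submitted in parallel, which is roughly equivalent to 20 threads (strictly speaking, 20 background processes). The script, parallel.sh, is as follows: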

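#!/bin/bash
START=1
PER_TASK=250
for (( count=1; count<=20; count++ )); do
    END=$(( PER_TASK * count ))
    for (( i=START; i<=END; i++ )); do
        # Read line i of test.sh and execute it (eval keeps pipes and redirections working)
        eval "$(sed -n "${i}p" test.sh)"
    done &
    # The 250 lines of each chunk run serially; the trailing & submits the whole chunk once
    START=$(( START + PER_TASK ))
done
wait
echo "all wakeup"
# wait blocks until every background job started with & above has finished
# There are 20 & in total, i.e. 20 parallel submissions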

nohup bash parallel.sh &

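Note that this reports "done" very quickly. That only means the sed commands have completed; the commands they launched are still running in the background and have not finished. Be careful not to submit the script repeatedly!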
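One possible way to guard against an accidental second submission is to record the PID of the launched script and refuse to start again while that process is still alive. The following is only a sketch under assumptions, not part of the original workflow: the function names is_running and submit_once and the file name parallel.pid are hypothetical, and a PID check of this kind can be fooled if the system reuses the PID for another process.

```bash
#!/bin/bash
# Hypothetical helper: succeed if the PID stored in the given file
# belongs to a process that is still alive.
is_running() {
    local pidfile=$1 pid
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Hypothetical launcher: start the command only if no earlier run is alive.
# Output goes to <pidfile>.log (an assumed naming choice).
submit_once() {
    local pidfile=$1
    shift
    if is_running "$pidfile"; then
        echo "already running (PID $(cat "$pidfile")), not resubmitting"
        return 1
    fi
    nohup "$@" > "${pidfile}.log" 2>&1 &
    echo $! > "$pidfile"
    echo "started PID $!"
}

# Usage (parallel.pid is an assumed file name):
# submit_once parallel.pid bash parallel.sh
```

Calling submit_once a second time while the first run is alive prints a message and returns a non-zero status instead of starting another copy.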

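A trailing & sends the preceding command to the background, which is what makes the submissions run in parallel.
Reference: https://youwu.today/blog/parallel-in-shell/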
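To see how & and wait interact without touching a real 5,000-line file, a miniature version of the same pattern may help. This is a sketch under assumptions: the function name run_in_chunks, the file names demo_cmds.sh and demo_out.txt, and the sizes (6 commands, 2 per chunk, 3 chunks) are all made up for illustration.

```bash
#!/bin/bash
# Hypothetical miniature of parallel.sh: lines inside a chunk run serially,
# the chunks themselves run in parallel as background subshells.
run_in_chunks() {
    local cmdfile=$1 per_task=$2 chunks=$3
    local count=1 start end
    while [ "$count" -le "$chunks" ]; do
        start=$(( (count - 1) * per_task + 1 ))
        end=$(( count * per_task ))
        (
            i=$start
            while [ "$i" -le "$end" ]; do
                eval "$(sed -n "${i}p" "$cmdfile")"
                i=$(( i + 1 ))
            done
        ) &            # one background submission per chunk
        count=$(( count + 1 ))
    done
    wait               # block until every background chunk has finished
}

# Demo with assumed file names: 6 commands, 2 per chunk, 3 chunks.
workdir=$(mktemp -d)
for n in 1 2 3 4 5 6; do
    echo "echo line $n >> '$workdir/demo_out.txt'"
done > "$workdir/demo_cmds.sh"
run_in_chunks "$workdir/demo_cmds.sh" 2 3
sort "$workdir/demo_out.txt"
```

Because wait sits at the end of the function, the sort only runs after all three chunks have written their lines; the order inside demo_out.txt before sorting depends on how the chunks happened to be scheduled.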

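On a cluster server, for example one that uses Slurm as its job scheduling and resource management system, a job can be run across multiple nodes with multiple threads. The script is as follows: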

References:
https://crc.ku.edu/hpc/slurm/how-to/arrays#example
https://help.rc.ufl.edu/doc/SLURM_Job_Arrays

#!/bin/bash
#SBATCH -J codeml_array
#SBATCH -p a05208har3
#SBATCH --array=1-5
#SBATCH --ntasks=1   
#SBATCH --cpus-per-task=4
#SBATCH --output=Array.%A_%a.log

pwd; hostname; date

#Set the number of runs that each SLURM task should do
PER_TASK=100

# Calculate the starting and ending values for this task based
# on the SLURM task and the number of runs per task.
START_NUM=$(( ($SLURM_ARRAY_TASK_ID - 1) * $PER_TASK + 1 ))
END_NUM=$(( $SLURM_ARRAY_TASK_ID * $PER_TASK ))

# Print the task and run range
echo This is task $SLURM_ARRAY_TASK_ID, which will do runs $START_NUM to $END_NUM

# Run the loop of runs for this task.
for (( run=START_NUM; run<=END_NUM; run++ )); do
  echo "This is SLURM task $SLURM_ARRAY_TASK_ID, run number $run"
  # Do your stuff here: read line $run of codeml.sh and execute it
  eval "$(sed -n "${run}p" codeml.sh)"
done
date
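
The index arithmetic can be checked on a login node before submitting anything, by setting SLURM_ARRAY_TASK_ID by hand. A minimal sketch, assuming a hypothetical helper named task_range; it only reproduces the START_NUM and END_NUM formulas from the script above and does not call Slurm.

```bash
#!/bin/bash
# Hypothetical helper: print "START END" for one array task.
task_range() {
    local task_id=$1 per_task=$2
    echo "$(( (task_id - 1) * per_task + 1 )) $(( task_id * per_task ))"
}

# Simulate --array=1-5 with PER_TASK=100 without calling Slurm.
for SLURM_ARRAY_TASK_ID in 1 2 3 4 5; do
    echo "task $SLURM_ARRAY_TASK_ID: runs $(task_range "$SLURM_ARRAY_TASK_ID" 100)"
done
```

With these settings the five array tasks together cover lines 1 to 500 of codeml.sh. If the command file has a different number of lines, the --array range or PER_TASK would need to be adjusted so that the product matches.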

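When the cluster limits the number of jobs that can be submitted, serial and parallel execution can also be combined. The following script was modified by "万星":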

#!/bin/bash
#SBATCH -J array
#SBATCH -p xhacnormala
#SBATCH --array=1-2
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=5
#SBATCH --output=Array.%A_%a.log

date
# --ntasks is used to increase the number of threads; CPUs = array * ntasks * cpus = 2 * 4 * 5 = 40
PER_TASK=5
N=2
START_NUM=$(( (SLURM_ARRAY_TASK_ID - 1) * PER_TASK * N + 1 ))
START_LINE=$START_NUM

END_LINE=$(( SLURM_ARRAY_TASK_ID * PER_TASK * N ))

END_NUM=$(( END_LINE - N + 1 ))
# Each pass launches every N-th line in parallel and then waits; N passes cover the whole range
while (( END_NUM <= END_LINE )); do
  echo "This is SLURM task $SLURM_ARRAY_TASK_ID, run number $START_NUM-$END_NUM by step $N"
  for (( RUN_START=START_LINE; RUN_START<=END_LINE; RUN_START+=N )); do
    # Print the command on line RUN_START of qq5.sh, then run it in the background
    CMD=$(sed -n "${RUN_START}p" qq5.sh)
    echo "$CMD"
    eval "$CMD" &
  done
  wait
  START_LINE=$(( START_LINE + 1 ))
  END_NUM=$(( END_NUM + 1 ))
  START_NUM=$(( START_NUM + 1 ))
done
date
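
The interleaving in this script is easy to misread, so a dry run that prints which lines of qq5.sh each array task would launch in each round, without executing anything, may be useful. The function name print_schedule is hypothetical; the arithmetic follows the script above (PER_TASK commands in parallel per round, N rounds per array task).

```bash
#!/bin/bash
# Hypothetical dry run: list the line numbers launched together in each round.
print_schedule() {
    local task_id=$1 per_task=$2 n=$3
    local first=$(( (task_id - 1) * per_task * n + 1 ))
    local last=$(( task_id * per_task * n ))
    local round=0 line lines
    while [ "$round" -lt "$n" ]; do
        lines=""
        line=$(( first + round ))
        # Every n-th line starting at first+round belongs to this round
        while [ "$line" -le "$last" ]; do
            lines="$lines $line"
            line=$(( line + n ))
        done
        echo "task $task_id round $(( round + 1 )):$lines"
        round=$(( round + 1 ))
    done
}

print_schedule 1 5 2
print_schedule 2 5 2
# task 1 round 1: 1 3 5 7 9
# task 1 round 2: 2 4 6 8 10
# task 2 round 1: 11 13 15 17 19
# task 2 round 2: 12 14 16 18 20
```

With PER_TASK=5, N=2 and --array=1-2, the two array tasks together process the first 20 lines of qq5.sh, five at a time.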