01. Slurm: Cluster Management and Job Scheduling System
Introduction
https://slurm.schedmd.com/overview.html
Overview
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
As a cluster workload manager, Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.
Architecture
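Slurm uses a primary/worker architecture: one primary controller daemon (slurmctld) manages jobs, and a slurmd daemon on each compute node executes the compute tasks. The primary can optionally have a backup.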
Tutorial
https://slurm.schedmd.com/tutorials.html
Go straight to this document: https://www.open-mpi.org/video/slurm/Slurm_EMC_Dec2012.pdf
Concepts:
SLURM Entities
- Jobs: Resource allocation requests
- Job steps: Set of (typically parallel) tasks
- Partitions: Job queues with limits and access controls
- Nodes
  - NUMA boards
    - Sockets
      - Cores
        - Hyperthreads
      - Memory
      - Generic Resources (e.g. GPUs)
- Users submit jobs to a partition (queue)
- Jobs are allocated resources
- Jobs spawn steps, which are allocated resources from within the job's allocation
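The relationships between these entities can be sketched as a small data model. This is only an illustration for these notes: the class and field names (`Node`, `Partition`, `Job`, `JobStep`, `launch_step`) and the node and partition names are made up, not Slurm's internal structures or API, and memory and generic resources are recorded per node for simplicity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A compute node described by its socket/core/hyperthread counts."""
    name: str
    sockets: int
    cores_per_socket: int
    threads_per_core: int
    memory_mb: int
    gres: List[str] = field(default_factory=list)  # generic resources, e.g. ["gpu:2"]

    @property
    def cpus(self) -> int:
        # Logical CPUs = sockets x cores per socket x hyperthreads per core.
        return self.sockets * self.cores_per_socket * self.threads_per_core


@dataclass
class Partition:
    """A job queue with limits; only a time limit is modeled here."""
    name: str
    nodes: List[Node]
    max_time_minutes: int


@dataclass
class JobStep:
    """A set of (typically parallel) tasks on a subset of the job's nodes."""
    step_id: int
    nodes: List[Node]
    ntasks: int


@dataclass
class Job:
    """A resource allocation request that has been granted some nodes."""
    job_id: int
    partition: Partition
    allocated: List[Node]
    steps: List[JobStep] = field(default_factory=list)

    def launch_step(self, node_names: List[str], ntasks: int) -> JobStep:
        # A step may only use nodes that belong to the job's own allocation.
        by_name = {n.name: n for n in self.allocated}
        missing = [n for n in node_names if n not in by_name]
        if missing:
            raise ValueError(f"nodes outside the job allocation: {missing}")
        step = JobStep(len(self.steps), [by_name[n] for n in node_names], ntasks)
        self.steps.append(step)
        return step


# Hypothetical four-node cluster with one partition.
nodes = [
    Node(f"n{i}", sockets=2, cores_per_socket=8, threads_per_core=2, memory_mb=65536)
    for i in range(4)
]
debug = Partition("debug", nodes, max_time_minutes=30)
job = Job(job_id=1, partition=debug, allocated=nodes[:2])
step = job.launch_step(["n0", "n1"], ntasks=4)
```

Asking for a step on a node outside the allocation (for example `job.launch_step(["n3"], ntasks=1)`) raises `ValueError`, which mirrors the rule that steps draw resources only from within the job's allocation.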
Job States
[Screenshot: job states diagram, captured 2019-12-14]
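As a quick reference, the compact state codes printed by `squeue` can be mapped to their long names. The table is a partial list written from memory of the `squeue` documentation, not transcribed from the screenshot. The helper `is_finished` and the choice of which states count as "finished" are assumptions of these notes (a preempted job, for instance, may be requeued depending on configuration).

```python
# Compact squeue state codes -> long job state names (partial list).
JOB_STATE_CODES = {
    "PD": "PENDING",
    "CF": "CONFIGURING",
    "R": "RUNNING",
    "S": "SUSPENDED",
    "CG": "COMPLETING",
    "CD": "COMPLETED",
    "CA": "CANCELLED",
    "F": "FAILED",
    "TO": "TIMEOUT",
    "NF": "NODE_FAIL",
    "PR": "PREEMPTED",
}

# Assumption for this sketch: states treated as the end of a job's life.
FINISHED_STATES = {
    "COMPLETED", "CANCELLED", "FAILED", "TIMEOUT", "NODE_FAIL", "PREEMPTED",
}


def is_finished(code: str) -> bool:
    """Return True if the compact state code denotes a finished job."""
    return JOB_STATE_CODES[code] in FINISHED_STATES
```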
Linux Job Launch Sequence
[Screenshot: Linux job launch sequence diagram, captured 2019-12-14]
Operations
Ways to run jobs
- srun: Create a job allocation (if needed) and launch a job step (typically an MPI job)
- salloc: Create job allocation and start a shell to use it (interactive mode)
- sbatch: Submit script for later execution (batch mode)
- sattach: Connect stdin/out/err for an existing job or job step
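The three launch modes can be sketched by building the script text and command lines without running anything. The function names, the partition name `debug`, and the program `./hello_mpi` are hypothetical; the options used (`--job-name`, `--partition`, `--nodes`, `--ntasks`, `--time`, `--output`) are common ones, but a real cluster will likely need site-specific settings.

```python
import shlex
from typing import List


def render_batch_script(job_name: str, partition: str, nodes: int,
                        ntasks: int, time_limit: str, command: List[str]) -> str:
    """Return the text of a batch script to be submitted with `sbatch <file>`."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=slurm-%j.out",  # %j expands to the job ID
        "",
        # Inside the batch allocation, srun launches the job step.
        "srun " + shlex.join(command),
    ]
    return "\n".join(lines) + "\n"


def srun_argv(ntasks: int, command: List[str]) -> List[str]:
    """Direct launch: create an allocation if needed and run one job step."""
    return ["srun", f"--ntasks={ntasks}", *command]


def salloc_argv(nodes: int, time_limit: str) -> List[str]:
    """Interactive mode: obtain an allocation and start a shell in it."""
    return ["salloc", f"--nodes={nodes}", f"--time={time_limit}"]


script = render_batch_script("hello", "debug", 2, 4, "00:10:00", ["./hello_mpi"])
print(script)
print(shlex.join(srun_argv(4, ["hostname"])))
print(shlex.join(salloc_argv(2, "00:30:00")))
```

Saving `script` to a file and running `sbatch` on it would be the batch-mode path; the two argument lists correspond to the direct and interactive paths.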
Other commands
- sinfo
- squeue
- smap
- sbcast
- scancel
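Of these commands, `squeue` is the one most often scripted against. A minimal sketch of parsing its output follows, assuming it was produced with `--noheader` and a `|`-separated `--format` string. The sample text, job names, and user names are invented for illustration, and the parser would break on job names that themselves contain `|`.

```python
from typing import Dict, List

# Format string for `squeue --noheader --format=...`:
# %i job ID, %P partition, %j job name, %u user, %t compact state code.
SQUEUE_FORMAT = "%i|%P|%j|%u|%t"
FIELDS = ["job_id", "partition", "name", "user", "state"]


def parse_squeue(output: str) -> List[Dict[str, str]]:
    """Parse '|'-separated squeue output into one dict per job."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split("|")
        if len(parts) != len(FIELDS):
            raise ValueError(f"unexpected squeue line: {line!r}")
        rows.append(dict(zip(FIELDS, parts)))
    return rows


# Made-up sample output, not captured from a real cluster.
SAMPLE = """\
1001|debug|hello|alice|R
1002|debug|sim_a|bob|PD
1003|batch|sim_b|alice|PD
"""

jobs = parse_squeue(SAMPLE)
pending = [j["job_id"] for j in jobs if j["state"] == "PD"]
print(pending)
```

On a real cluster the input could come from `subprocess.run(["squeue", "--noheader", f"--format={SQUEUE_FORMAT}"], capture_output=True, text=True, check=True).stdout`.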
MPI support
- Many different MPI implementations are supported:
  - MPICH1, MPICH2, MVAPICH, OpenMPI, etc.
- Many use srun to launch the tasks directly
- Some use “mpirun” or another tool within an existing SLURM allocation (they reference SLURM environment variables to determine what resources are allocated to the job)
- Details are online:
http://www.schedmd.com/slurmdocs/mpi_guide.html
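The environment variables mentioned above can be read from inside a job to see what was allocated. The sketch below assumes the commonly set variables `SLURM_JOB_ID`, `SLURM_JOB_NODELIST`, `SLURM_JOB_NUM_NODES`, `SLURM_NTASKS`, and `SLURM_PROCID`. The function name `describe_allocation`, the fallback defaults, and the sample values are choices made for this note.

```python
import os
from typing import Mapping, Optional


def describe_allocation(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Summarize the Slurm allocation visible in `env`, or None outside Slurm."""
    if "SLURM_JOB_ID" not in env:
        return None  # not running inside a Slurm job
    return {
        "job_id": env["SLURM_JOB_ID"],
        "nodelist": env.get("SLURM_JOB_NODELIST", ""),
        "num_nodes": int(env.get("SLURM_JOB_NUM_NODES", "1")),
        "ntasks": int(env.get("SLURM_NTASKS", "1")),
        "rank": int(env.get("SLURM_PROCID", "0")),  # per-task ID set by srun
    }


# Invented environment standing in for what a task might see.
fake_env = {
    "SLURM_JOB_ID": "1001",
    "SLURM_JOB_NODELIST": "n[0-1]",
    "SLURM_JOB_NUM_NODES": "2",
    "SLURM_NTASKS": "4",
    "SLURM_PROCID": "3",
}
info = describe_allocation(fake_env)
print(info)
```

Passing the environment as a parameter keeps the function testable outside a cluster; calling it with no argument reads the real process environment.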
Release cadence worth borrowing
Continuous integration, with usable features released on a regular schedule
- New minor release about every 9 months
- 2.4.x June 2012
- 2.5.x December 2012
- Micro releases with bug fixes about once each month
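Build and installation
Slurm ships with its own test suite, which can be used for regression testing once installation is complete.
2019.12.14: Finished reading the tutorial.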