
Yarn Capacity Scheduler

2020-02-09  spraysss

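YARN uses a scheduler to allocate resources to applications. The scheduling policy is pluggable: Hadoop ships with three schedulers (FIFO, CapacityScheduler, and FairScheduler), and users can also implement their own scheduling algorithm and select it in the configuration file.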
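The pluggable design can be illustrated with a minimal sketch. The property name yarn.resourcemanager.scheduler.class is the one YARN reads; everything else here (DemoScheduler, DemoFifoScheduler, DemoCapacityScheduler, PluggableSchedulerDemo, and the plain Map standing in for the configuration) is hypothetical and only shows the idea of loading an implementation by class name. It is not the ResourceManager's actual loading code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a scheduler interface; not the real YARN ResourceScheduler.
interface DemoScheduler {
  String name();
}

class DemoFifoScheduler implements DemoScheduler {
  public String name() { return "fifo"; }
}

class DemoCapacityScheduler implements DemoScheduler {
  public String name() { return "capacity"; }
}

class PluggableSchedulerDemo {
  // Property that selects the scheduler implementation in yarn-site.xml.
  static final String SCHEDULER_CLASS_KEY = "yarn.resourcemanager.scheduler.class";

  // Instantiate whichever implementation the configuration names, via reflection.
  // The fallback to DemoCapacityScheduler is a choice made for this sketch.
  static DemoScheduler load(Map<String, String> conf) {
    String className =
        conf.getOrDefault(SCHEDULER_CLASS_KEY, DemoCapacityScheduler.class.getName());
    try {
      Class<?> clazz = Class.forName(className);
      return (DemoScheduler) clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot load scheduler " + className, e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(SCHEDULER_CLASS_KEY, DemoFifoScheduler.class.getName());
    System.out.println(load(conf).name());
  }
}
```

A custom scheduler would be plugged in the same way: put its fully qualified class name under that property.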

YARN scheduler source location

All of the YARN scheduler implementations live in the package org.apache.hadoop.yarn.server.resourcemanager.scheduler.


Scheduler framework structure

SchedulerApplicationAttempt

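This is the scheduler's view of an application attempt: every application attempt running in the RM corresponds to one instance of this class.

The FIFO and Capacity schedulers use its subclass FiCaSchedulerApp as their view of an app attempt.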


SchedulerNode

SchedulerNode is the scheduler's view of a node. Its Javadoc reads: "Represents a YARN Cluster Node from the viewpoint of the scheduler."


AbstractYarnScheduler



Capacity Scheduler

The CapacityScheduler is designed to run Hadoop applications as a shared, multi-tenant cluster in an operator-friendly manner while maximizing the throughput and the utilization of the cluster.

In other words, the CapacityScheduler is meant to run jobs on a shared, multi-tenant cluster while maximizing throughput and cluster utilization.

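Synchronous vs. asynchronous scheduling

The scheduleAsynchronously field controls whether scheduling is synchronous or asynchronous. The corresponding configuration property is

yarn.scheduler.capacity.schedule-asynchronously.enable

and its default value is false.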

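In synchronous mode, scheduling is triggered when a NodeManager sends a heartbeat (NODE_UPDATE): the scheduler allocates according to the resources that node reports, by calling

allocateContainersToNode(getNode(node.getNodeID()));

In asynchronous mode, the AsyncScheduleThread inside CapacityScheduler periodically calls cs.allocateContainersToNode(node); to allocate resources. Its period, asyncScheduleInterval, is controlled by

yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms

and defaults to 5 (milliseconds).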
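The difference between the two modes can be sketched roughly as follows. This is a simplified model, not the CapacityScheduler source: SchedulingModeDemo, onNodeUpdate, asyncPass, and the String node IDs are hypothetical names, and allocateContainersToNode only counts calls instead of allocating anything.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

class SchedulingModeDemo {
  final boolean scheduleAsynchronously;   // mirrors the ...schedule-asynchronously.enable flag
  final long asyncScheduleIntervalMs;     // mirrors ...scheduling-interval-ms
  final List<String> nodes = new CopyOnWriteArrayList<>();
  final AtomicInteger allocations = new AtomicInteger();

  SchedulingModeDemo(boolean scheduleAsynchronously, long asyncScheduleIntervalMs) {
    this.scheduleAsynchronously = scheduleAsynchronously;
    this.asyncScheduleIntervalMs = asyncScheduleIntervalMs;
  }

  // Stand-in for the real allocation routine: here it only counts invocations.
  void allocateContainersToNode(String nodeId) {
    allocations.incrementAndGet();
  }

  // Synchronous path: a heartbeat (NODE_UPDATE) triggers allocation on that node.
  void onNodeUpdate(String nodeId) {
    if (!scheduleAsynchronously) {
      allocateContainersToNode(nodeId);
    }
  }

  // One pass of the asynchronous path: try every known node.
  void asyncPass() {
    for (String nodeId : nodes) {
      allocateContainersToNode(nodeId);
    }
  }

  // Asynchronous path: a background thread repeats asyncPass() at a fixed interval.
  Thread startAsyncThread() {
    Thread t = new Thread(() -> {
      try {
        while (!Thread.currentThread().isInterrupted()) {
          asyncPass();
          Thread.sleep(asyncScheduleIntervalMs);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "async-schedule-demo");
    t.setDaemon(true);
    t.start();
    return t;
  }

  public static void main(String[] args) throws InterruptedException {
    SchedulingModeDemo async = new SchedulingModeDemo(true, 5);
    async.nodes.add("node-1");
    Thread t = async.startAsyncThread();
    Thread.sleep(50);
    t.interrupt();
    System.out.println("allocation attempts: " + async.allocations.get());
  }
}
```

In this model, asynchronous mode decouples allocation from heartbeat arrival, so the allocation rate depends on the interval rather than on how often nodes report in.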

capacity and maximum-capacity
yarn.scheduler.capacity.<queue-path>.capacity
// provides elasticity: the upper limit a queue may grow to beyond its capacity
yarn.scheduler.capacity.<queue-path>.maximum-capacity
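As a rough worked example of the two settings, the sketch below treats both values as percentages of the whole cluster, which holds for a queue directly under root (for nested queues the percentages are relative to the parent). QueueCapacityDemo and its methods are hypothetical helpers, not YARN classes.

```java
class QueueCapacityDemo {
  // Memory the queue is guaranteed, from its "capacity" percentage.
  static long guaranteedMb(long clusterMb, double capacityPercent) {
    return (long) (clusterMb * capacityPercent / 100.0);
  }

  // Memory the queue may grow to when the cluster has idle resources,
  // from its "maximum-capacity" percentage.
  static long elasticLimitMb(long clusterMb, double maximumCapacityPercent) {
    return (long) (clusterMb * maximumCapacityPercent / 100.0);
  }

  // A request fits if the queue stays within its elastic limit after allocation.
  static boolean fits(long usedMb, long requestMb, long clusterMb, double maximumCapacityPercent) {
    return usedMb + requestMb <= elasticLimitMb(clusterMb, maximumCapacityPercent);
  }

  public static void main(String[] args) {
    long clusterMb = 102400; // 100 GB cluster, an assumed figure
    System.out.println(guaranteedMb(clusterMb, 40));        // 40960
    System.out.println(elasticLimitMb(clusterMb, 60));      // 61440
    System.out.println(fits(51200, 8192, clusterMb, 60));   // true: 59392 <= 61440
    System.out.println(fits(57344, 8192, clusterMb, 60));   // false: 65536 > 61440
  }
}
```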
MinimumResourceCapability and MaximumResourceCapability

The lower and upper limits on the resources a single container request may ask for, with their default values:

yarn.scheduler.minimum-allocation-mb        1024
yarn.scheduler.minimum-allocation-vcores    1
yarn.scheduler.maximum-allocation-mb        8192
yarn.scheduler.maximum-allocation-vcores    4

Queue-level configuration

yarn.scheduler.capacity.<queue-path>.maximum-allocation-mb
yarn.scheduler.capacity.<queue-path>.maximum-allocation-vcores
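How these limits might act on a memory request is sketched below. The assumptions are mine: requests are rounded up to a multiple of the minimum allocation, requests above the maximum are rejected, and a queue-level maximum can only lower the cluster-wide one. AllocationLimitsDemo and its methods are hypothetical, not the RM's validation code.

```java
class AllocationLimitsDemo {
  // Queue-level maximum, when set, is assumed to be no larger than the cluster maximum.
  static int effectiveMaxMb(int clusterMaxMb, Integer queueMaxMb) {
    return queueMaxMb != null ? Math.min(queueMaxMb, clusterMaxMb) : clusterMaxMb;
  }

  // Round the request up to a multiple of minMb; reject anything above maxMb.
  static int normalizeMemory(int requestedMb, int minMb, int maxMb) {
    if (requestedMb > maxMb) {
      throw new IllegalArgumentException(
          "Requested " + requestedMb + " MB exceeds the maximum of " + maxMb + " MB");
    }
    int atLeastMin = Math.max(requestedMb, minMb);
    int roundedUp = ((atLeastMin + minMb - 1) / minMb) * minMb;
    return Math.min(roundedUp, maxMb);
  }

  public static void main(String[] args) {
    int maxMb = effectiveMaxMb(8192, null);
    System.out.println(normalizeMemory(500, 1024, maxMb));   // 1024
    System.out.println(normalizeMemory(1500, 1024, maxMb));  // 2048
    System.out.println(normalizeMemory(8192, 1024, maxMb));  // 8192
  }
}
```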
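ResourceCalculator

YARN provides two ResourceCalculator implementations: DefaultResourceCalculator, which considers memory only, and DominantResourceCalculator, which considers multiple resource types such as memory and CPU.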
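The two calculators YARN ships are DefaultResourceCalculator and DominantResourceCalculator. The sketch below only illustrates the difference in how two resource amounts might be compared: by memory alone, or by the dominant share across memory and vcores. ResourceCalculatorDemo and its methods are hypothetical simplifications and do not reproduce the real classes' APIs.

```java
class ResourceCalculatorDemo {
  // Memory-only comparison, in the spirit of DefaultResourceCalculator.
  static int compareByMemory(long memMb1, long memMb2) {
    return Long.compare(memMb1, memMb2);
  }

  // Dominant share: the larger of the memory fraction and the vcore fraction of the cluster.
  static double dominantShare(long memMb, int vcores, long clusterMemMb, int clusterVcores) {
    double memShare = (double) memMb / clusterMemMb;
    double cpuShare = (double) vcores / clusterVcores;
    return Math.max(memShare, cpuShare);
  }

  // Dominant-share comparison, in the spirit of DominantResourceCalculator.
  static int compareByDominantShare(long mem1, int cpu1, long mem2, int cpu2,
                                    long clusterMemMb, int clusterVcores) {
    return Double.compare(
        dominantShare(mem1, cpu1, clusterMemMb, clusterVcores),
        dominantShare(mem2, cpu2, clusterMemMb, clusterVcores));
  }

  public static void main(String[] args) {
    long clusterMemMb = 102400; // assumed cluster size
    int clusterVcores = 100;
    // A: 2048 MB, 10 vcores (CPU-heavy); B: 4096 MB, 2 vcores (memory-heavy).
    System.out.println(compareByMemory(2048, 4096));                 // negative: A < B by memory
    System.out.println(compareByDominantShare(2048, 10, 4096, 2,
        clusterMemMb, clusterVcores));                               // positive: A > B by dominant share
  }
}
```

The same pair of requests orders differently under the two rules, which is why the choice of calculator matters on clusters where CPU is the scarce resource.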

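queueComparator and applicationComparator

Applications can only be submitted to a LeafQueue.
Queues are selected by UsedCapacity: the queue with the lowest used capacity comes first, with the queue path as a tie-breaker.
Applications are selected first-in, first-out, ordered by ApplicationId.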

  static final Comparator<CSQueue> queueComparator = new Comparator<CSQueue>() {
    @Override
    public int compare(CSQueue q1, CSQueue q2) {
      if (q1.getUsedCapacity() < q2.getUsedCapacity()) {
        return -1;
      } else if (q1.getUsedCapacity() > q2.getUsedCapacity()) {
        return 1;
      }

      return q1.getQueuePath().compareTo(q2.getQueuePath());
    }
  };

  static final Comparator<FiCaSchedulerApp> applicationComparator = 
    new Comparator<FiCaSchedulerApp>() {
    @Override
    public int compare(FiCaSchedulerApp a1, FiCaSchedulerApp a2) {
      return a1.getApplicationId().compareTo(a2.getApplicationId());
    }
  };
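To see the ordering the queue comparator above produces, here is a small self-contained sketch. DemoQueue and QueueOrderingDemo are hypothetical stand-ins for CSQueue and its callers; the comparator applies the same rule (used capacity first, then queue path).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for CSQueue, holding only the two fields the comparator reads.
class DemoQueue {
  final String path;
  final float usedCapacity;

  DemoQueue(String path, float usedCapacity) {
    this.path = path;
    this.usedCapacity = usedCapacity;
  }
}

class QueueOrderingDemo {
  // Least-used queue first; ties broken by queue path, as in queueComparator.
  static final Comparator<DemoQueue> QUEUE_COMPARATOR =
      Comparator.comparingDouble((DemoQueue q) -> q.usedCapacity)
          .thenComparing((DemoQueue q) -> q.path);

  // Return the queue paths in the order the scheduler would consider them.
  static List<String> order(List<DemoQueue> queues) {
    List<DemoQueue> sorted = new ArrayList<>(queues);
    sorted.sort(QUEUE_COMPARATOR);
    List<String> paths = new ArrayList<>();
    for (DemoQueue q : sorted) {
      paths.add(q.path);
    }
    return paths;
  }

  public static void main(String[] args) {
    List<DemoQueue> queues = Arrays.asList(
        new DemoQueue("root.etl", 0.5f),
        new DemoQueue("root.adhoc", 0.1f),
        new DemoQueue("root.batch", 0.5f));
    System.out.println(order(queues)); // [root.adhoc, root.batch, root.etl]
  }
}
```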
Priority
yarn.cluster.max-application-priority
yarn.scheduler.capacity.root.<leaf-queue-path>.default-application-priority
Resource reservation
Resource preemption
