YARN node labels

2023-06-07  后知不觉1

1. Enable node labels

yarn-site.xml
yarn.node-labels.enabled: true
yarn.node-labels.fs-store.root-dir: hdfs://namaspace/yarn/node-labels
yarn.resourcemanager.scheduler.class: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

Restart the ResourceManager.

2. Add a label to the cluster

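  su yarn
  yarn rmadmin -addToClusterNodeLabels "stream_compute"
  # The same label with exclusivity written out. exclusive defaults to true, so nodes carrying the label only serve requests that ask for it.
  yarn rmadmin -addToClusterNodeLabels "stream_compute(exclusive=true)"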
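
The argument format is easy to get wrong by hand (stray spaces, missing parentheses). A minimal sketch, assuming a hypothetical helper named label_spec (not part of YARN) that formats one entry; the echo prints the command instead of running it:

```sh
# label_spec NAME [EXCLUSIVE]: format one -addToClusterNodeLabels entry.
# Hypothetical helper; EXCLUSIVE defaults to true, matching YARN's default.
label_spec() {
  printf '%s(exclusive=%s)\n' "$1" "${2:-true}"
}

# Dry run: remove "echo" to actually create the label.
# Pass "false" as the second argument for a non-exclusive label.
echo yarn rmadmin -addToClusterNodeLabels "\"$(label_spec stream_compute)\""
```

With exclusive=false, idle resources on the labeled nodes can be lent to requests that target the default partition, which would weaken the isolation this article is after.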

3. Remove a label from the cluster

 yarn rmadmin -removeFromClusterNodeLabels stream_compute 

4. Assign nodes to a label

  yarn rmadmin -replaceLabelsOnNode "hadoop03=stream_compute"
  # "hadoop03" applies to every NodeManager on that host. To target a specific one, use hostname:port, where port is the one shown under Node Address in the YARN UI.
  # Several hosts at once:
  yarn rmadmin -replaceLabelsOnNode "hadoop05=stream_compute hadoop06=stream_compute hadoop07=stream_compute"
  # The same command also removes labels from nodes: list the hosts without any label.
  yarn rmadmin -replaceLabelsOnNode "hadoop05 hadoop06 hadoop07"
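
For more than a handful of hosts, the mapping string can be generated rather than typed. A small sketch, assuming a hypothetical helper named build_label_map (not part of YARN); the host names are the ones used above, and the echo prints the command instead of running it:

```sh
# build_label_map LABEL HOST...: print "host=label host=label ..." for
# use as the -replaceLabelsOnNode argument. Hypothetical helper.
build_label_map() {
  label=$1
  shift
  map=""
  for host in "$@"; do
    # Prefix a separating space only when map is already non-empty.
    map="${map:+$map }${host}=${label}"
  done
  printf '%s\n' "$map"
}

# Dry run: remove "echo" to apply the mapping.
echo yarn rmadmin -replaceLabelsOnNode "\"$(build_label_map stream_compute hadoop05 hadoop06 hadoop07)\""
```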

5. Bind queues to a label

Note: a label must already exist in the cluster before it can be referenced in the queue configuration.

yarn.scheduler.capacity.root.queues=default,realtime,realtime1
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.capacity=70
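yarn.scheduler.capacity.root.default.accessible-node-labels= # the value must be a single space
yarn.scheduler.capacity.root.default.default-node-label-expression= # jobs in the default queue get an empty label by default (the value must be a single space), so root.default can only run on nodes that have no label. If the cluster has no unlabeled nodes, jobs submitted to root.default hang.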
yarn.scheduler.capacity.root.accessible-node-labels.stream_compute.capacity=100
yarn.scheduler.capacity.root.realtime.state=RUNNING
yarn.scheduler.capacity.root.realtime.capacity=20
yarn.scheduler.capacity.root.realtime.accessible-node-labels=stream_compute # lets the realtime queue run on nodes labeled stream_compute; more than one label may be listed
yarn.scheduler.capacity.root.realtime.accessible-node-labels.stream_compute.capacity=70 # the realtime queue's share of the stream_compute label's resources is 70%
yarn.scheduler.capacity.root.realtime.default-node-label-expression=stream_compute # jobs in the realtime queue that do not name a label use stream_compute resources by default; this must be a single label, not a list
yarn.scheduler.capacity.root.realtime1.state=RUNNING
yarn.scheduler.capacity.root.realtime1.capacity=10
yarn.scheduler.capacity.root.realtime1.accessible-node-labels=stream_compute
yarn.scheduler.capacity.root.realtime1.accessible-node-labels.stream_compute.capacity=30
yarn.scheduler.capacity.root.realtime1.default-node-label-expression=stream_compute
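
Each label is its own partition, so the capacities of root's child queues have to add up to 100 twice: once for the unlabeled partition (70 + 20 + 10) and once for stream_compute (70 + 30). A rough sanity check, assuming the settings above are saved as key=value lines in a file; check_capacities and the file name in the usage comment are hypothetical, not part of YARN:

```sh
# check_capacities FILE: sum the capacities of root's direct child queues,
# once for the unlabeled partition and once per label. Prints each sum and
# returns non-zero if any of them is not 100. Hypothetical helper.
check_capacities() {
  awk -F= '
    { sub(/[ \t]*#.*/, "", $2) }   # drop trailing inline comments
    # unlabeled partition: yarn.scheduler.capacity.root.<queue>.capacity
    $1 ~ /^yarn\.scheduler\.capacity\.root\.[^.]+\.capacity$/ {
      sum["<default>"] += $2
    }
    # labeled partition: ...root.<queue>.accessible-node-labels.<label>.capacity
    $1 ~ /^yarn\.scheduler\.capacity\.root\.[^.]+\.accessible-node-labels\.[^.]+\.capacity$/ {
      n = split($1, part, ".")
      sum[part[n - 1]] += $2       # part[n - 1] is the label name
    }
    END {
      bad = 0
      for (p in sum) {
        printf "%s: %s\n", p, sum[p]
        if (sum[p] != 100) bad = 1
      }
      exit bad
    }
  ' "$1"
}

# Usage (file name is an assumption):
#   check_capacities capacity-scheduler.properties
```

The check only looks one level below root, which is all this configuration uses; nested queues would need the same sum per parent.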

6. Refresh the queues

yarn rmadmin -refreshQueues
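
After the refresh it is worth confirming that the labels and queues look as intended. A minimal sketch, assuming the yarn client is on PATH; show_label_state is a hypothetical wrapper around the yarn cluster and yarn queue subcommands, and the queue names are the ones configured above:

```sh
# show_label_state QUEUE...: list the cluster's node labels, then print the
# status of each queue given as an argument. Hypothetical helper; it skips
# quietly when the yarn CLI is not available.
show_label_state() {
  if ! command -v yarn >/dev/null 2>&1; then
    echo "yarn CLI not found; skipping" >&2
    return 0
  fi
  yarn cluster --list-node-labels
  for queue in "$@"; do
    yarn queue -status "$queue"
  done
}

show_label_state realtime realtime1
```

The queue status output typically includes the accessible node labels and the default label expression, which should match the values set in step 5.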

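With this in place, the realtime queues have the labeled nodes entirely to themselves, which fully isolates offline (batch) jobs from real-time jobs.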
