NodeManager crashes repeatedly because new threads cannot be created
2018-08-21
invincine
While a Hadoop cluster was running a MapReduce job, the nodemanager on one of the nodes kept crashing. The log recorded the following error:
2018-08-21 14:31:05,210 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:521)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.containerIsAlive(DefaultContainerExecutor.java:430)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.signalContainer(DefaultContainerExecutor.java:401)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:419)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:745)
2018-08-21 14:31:05,214 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.
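The error is fairly self-explanatory: the JVM could not create a new native thread, either because the OS thread limit was reached or because no memory was left for the new thread.
Since this node was a newly added server, my first guess was the system limit on the number of threads a user may create.
Running ulimit -u confirmed it: the limit was still the default 1024.
I raised the value: ulimit -u 102400 (this only affects the current shell session and the processes started from it).
Then I started nodemanager.
But after a while nodemanager crashed again with the same error: unable to create new native thread.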
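One way to check whether the raised limit actually reached the running process, and how close the user is to that limit, is to read /proc directly. The following is a minimal sketch that assumes a Linux host; the function names soft_nproc_limit and count_threads_for_uid are made up for illustration and are not part of Hadoop.

```shell
#!/bin/sh
# Soft "Max processes" limit that a running process actually has.
# Reads /proc/<pid>/limits, so it reflects that process, not the current shell.
soft_nproc_limit() {
    awk '/^Max processes/ { print $3 }' "/proc/$1/limits"
}

# Total number of threads owned by a numeric UID, summed from /proc/<pid>/status.
# Processes that exit while being read are silently skipped.
count_threads_for_uid() {
    cat /proc/[0-9]*/status 2>/dev/null |
        awk -v uid="$1" '
            /^Uid:/     { mine = ($2 == uid) }
            /^Threads:/ { if (mine) total += $2 }
            END         { print total + 0 }'
}

# Example: inspect the current shell; replace $$ with the NodeManager PID.
echo "soft nproc limit: $(soft_nproc_limit $$)"
echo "threads for uid $(id -u): $(count_threads_for_uid "$(id -u)")"
```

To inspect the NodeManager itself, pass its PID (for example the one printed by jps) instead of $$, and pass the UID of the account that runs it. If the thread count is far below the limit, the thread limit is probably not the cause.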
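Eventually I found an article that solved the problem:
Article: https://blog.csdn.net/hw446/article/details/47908571
MapReduce had been allocated too much memory, so no memory was left over for the JVM to create new threads.
The fix is to change the relevant parameters in the mapred-site.xml configuration file.
Before the change: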
mapred-site.xml
<property>
<name>mapreduce.map.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmn1200m -Xms3600m -Xmx3600m -XX:MaxPermSize=100m -XX:PermSize=100m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>8192</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmn2000m -Xms7200m -Xmx7200m -XX:MaxPermSize=200m -XX:PermSize=200m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
</property>
After the change:
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmn1200m -Xms1600m -Xmx1600m -XX:MaxPermSize=100m -XX:PermSize=100m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmn2000m -Xms3072m -Xmx3072m -XX:MaxPermSize=200m -XX:PermSize=200m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
</property>
Finally, restart nodemanager.
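To make the effect of the change easier to see, the sketch below computes how much of each container is left once the heap (-Xmx) and PermGen (-XX:MaxPermSize) are subtracted. Thread stacks and other native allocations live outside the Java heap, so they have to fit in this remainder. The helper name headroom_mb is hypothetical, and the "map after" line assumes mapreduce.map.memory.mb stayed at 4096, since the post does not list it in the modified configuration.

```shell
#!/bin/sh
# Memory left in a YARN container after the JVM heap and PermGen are reserved.
# Arguments: container size (memory.mb), -Xmx, -XX:MaxPermSize, all in MB.
headroom_mb() {
    echo $(( $1 - $2 - $3 ))
}

echo "map before:    $(headroom_mb 4096 3600 100) MB"
echo "reduce before: $(headroom_mb 8192 7200 200) MB"
# Assumption: mapreduce.map.memory.mb is unchanged at 4096 after the edit.
echo "map after:     $(headroom_mb 4096 1600 100) MB"
echo "reduce after:  $(headroom_mb 4096 3072 200) MB"
```

With these numbers the map container goes from 396 MB to 2396 MB of non-heap headroom, and the reduce container from 792 MB to 824 MB while its total size is halved from 8192 MB to 4096 MB.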