Frequent NodeManager crashes caused by inability to create new threads

2018-08-21  invincine

While the Hadoop cluster was running a MapReduce job, the NodeManager on one node kept crashing. This is the error recorded in its log:

2018-08-21 14:31:05,210 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:521)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.containerIsAlive(DefaultContainerExecutor.java:430)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.signalContainer(DefaultContainerExecutor.java:401)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:419)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:745)
2018-08-21 14:31:05,214 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.

The error is self-explanatory: the JVM had no memory left to create a new native thread.
Since this node was a newly added server, my first suspicion was the OS limit on how many processes/threads a user may create.
Checking with `ulimit -u` confirmed it was still at the default of 1024.
I raised it: `ulimit -u 102400`
and started the NodeManager.
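The check-and-raise above can be sketched as a few shell commands (a minimal sketch; the `yarn` user in the limits.conf comment is an assumption about which account runs the NodeManager):

```shell
# Show the current soft limit on processes/threads for this user
current=$(ulimit -u)
echo "current nproc soft limit: $current"

# Raise it for this shell session only; the change is lost on logout
# and may fail for non-root users if it exceeds the hard limit
ulimit -u 102400 2>/dev/null || echo "could not raise limit in this session"

# To make it permanent, add lines like these to /etc/security/limits.conf
# (assuming the NodeManager runs as the "yarn" user):
#   yarn  soft  nproc  102400
#   yarn  hard  nproc  102400
```

Note that `ulimit` only affects the current shell and its children, which is one reason a manual fix can appear to work and then quietly revert once the daemon is restarted from a fresh login.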
But a while later the NodeManager crashed again with the same error: unable to create a new thread.
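At this point it is worth confirming which ceiling is actually being hit. A diagnostic sketch using standard Linux tools (which user runs the NodeManager is an assumption; substitute yours):

```shell
# Total number of threads currently alive on the node
ps -eLf | tail -n +2 | wc -l

# Kernel-wide ceiling on the number of threads
cat /proc/sys/kernel/threads-max

# Threads belonging to a single user (e.g. the one running the NodeManager)
ps -L -u "$(whoami)" | tail -n +2 | wc -l

# Free memory: every new thread needs room for its stack
# (the default stack size, shown by `ulimit -s`, is often 8 MB on Linux)
free -m
```

If the thread counts are nowhere near the limits but free memory is close to zero, the bottleneck is memory for thread stacks, not the thread-count limit — which turned out to be the case here.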

In the end I found an article that solved the problem:
Article link: https://blog.csdn.net/hw446/article/details/47908571
The MapReduce tasks had been allocated too much memory, leaving the JVM no free memory to allocate thread stacks.
The fix is to lower the relevant memory parameters in the mapred-site.xml configuration file.
Before the change:

mapred-site.xml
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>4096</value>
    </property>

    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmn1200m -Xms3600m  -Xmx3600m -XX:MaxPermSize=100m -XX:PermSize=100m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
    </property>

    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>8192</value>
    </property>

    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmn2000m -Xms7200m  -Xmx7200m -XX:MaxPermSize=200m -XX:PermSize=200m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
    </property>

After the change:

    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmn1200m -Xms1600m  -Xmx1600m -XX:MaxPermSize=100m -XX:PermSize=100m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
    </property>

    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>4096</value>
    </property>

    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmn2000m -Xms3072m  -Xmx3072m -XX:MaxPermSize=200m -XX:PermSize=200m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
    </property>
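One sanity check worth automating: the `-Xmx` heap in each `*.java.opts` should leave headroom below the matching `*.memory.mb` container size, since the container also has to hold thread stacks, PermGen, and native buffers. A sketch using the reduce-side values above (the ~80% rule of thumb is a common sizing guideline, not something the original article states):

```shell
# Values from the "after" configuration above (reduce side)
reduce_container_mb=4096   # mapreduce.reduce.memory.mb
reduce_xmx_mb=3072         # -Xmx in mapreduce.reduce.java.opts

# Headroom left inside the container for thread stacks, PermGen,
# and native allocations outside the Java heap
headroom=$((reduce_container_mb - reduce_xmx_mb))
echo "reduce headroom: ${headroom} MB"

# Rule-of-thumb check: keep -Xmx at or below ~80% of memory.mb
max_heap=$((reduce_container_mb * 80 / 100))
if [ "$reduce_xmx_mb" -le "$max_heap" ]; then
    echo "OK: heap leaves headroom for non-heap memory"
else
    echo "WARNING: heap may starve thread creation"
fi
```

The original settings left only about 500 MB and 1000 MB of headroom for the map and reduce containers respectively (roughly 88% of each container was heap), which matches the symptom: the heap fit, but there was no memory left for new thread stacks.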

Finally, restart the NodeManager.
