Analysis of a NameNode Full GC Causing Cluster Exit, and a JournalNode Restart Failure
2021-09-02
邵红晓
Metadata flow diagram (figure not reproduced here)

Problem 1
During the AsyncLogger asynchronous flush to the JournalNodes, an IOException occurred. The timeout check behind this IOException is based on reading the system clock; a Full GC happened right before the clock was read, so the wait appeared to exceed the timeout, terminate was called, and the NameNode exited directly.
- Call-chain analysis
JournalSet.mapJournalsAndReportErrors() -> flushAndSync()
-> QuorumOutputStream.flushAndSync() (the QJM implementation)
-> loggers.waitForWriteQuorum(qcall, writeTimeoutMs, "sendEdits");
-> q.waitFor(
       loggers.size(), // either all respond (5)
       majority,       // or we get a majority of successes (3)
       majority,       // or we get a majority of failures (3)
       timeoutMs, operationName); // timeoutMs = 20 s
-> timeoutMs = 20 s: the JournalNode requests must return a result within 20 s, otherwise it is a timeout
QuorumCall.waitFor():

```java
public synchronized void waitFor(
    int minResponses, int minSuccesses, int maxExceptions,
    int millis, String operationName)
    throws InterruptedException, TimeoutException {
  // millis = 20 s: the method must return within 20 s, meaning either
  // enough JournalNodes acknowledged the write or enough failed
  long st = Time.monotonicNow();
  long nextLogTime = st + (long)(millis * WAIT_PROGRESS_INFO_THRESHOLD);
  // compute the absolute timeout deadline
  long et = st + millis;
  while (true) { // retry until one of the exit conditions holds
    checkAssertionErrors();
    // Idea: whether a JournalNode write succeeds or fails, it must respond.
    // Case 1: successes + failures >= 5 -- all 5 JournalNodes responded,
    // one way or the other (countResponses() counts responses received)
    if (minResponses > 0 && countResponses() >= minResponses) return;
    // Case 2: at least 3 JournalNodes wrote successfully -- return immediately
    if (minSuccesses > 0 && countSuccesses() >= minSuccesses) return;
    // Case 3: enough JournalNodes failed that a quorum of successes
    // is impossible -- also return immediately
    if (maxExceptions >= 0 && countExceptions() > maxExceptions) return;
    long now = Time.monotonicNow();
    if (now > nextLogTime) {
      long waited = now - st;
      // log progress
      String msg = String.format(
          "Waited %s ms (timeout=%s ms) for a response for %s",
          waited, millis, operationName);
      if (!successes.isEmpty()) {
        msg += ". Succeeded so far: [" + Joiner.on(",").join(successes.keySet()) + "]";
      }
      if (!exceptions.isEmpty()) {
        msg += ". Exceptions so far: [" + getExceptionMapString() + "]";
      }
      if (successes.isEmpty() && exceptions.isEmpty()) {
        msg += ". No responses yet.";
      }
      // warn once the wait passes the warning threshold, otherwise info
      if (waited > millis * WAIT_PROGRESS_WARN_THRESHOLD) {
        QuorumJournalManager.LOG.warn(msg);
      } else {
        QuorumJournalManager.LOG.info(msg);
      }
      // schedule the next progress log, e.g. 16:00:07 + 1 s = 16:00:08
      nextLogTime = now + WAIT_PROGRESS_INTERVAL_MILLIS;
    }
    long rem = et - now;
    // past the deadline -- this is the key line
    if (rem <= 0) {
      throw new TimeoutException();
    }
    rem = Math.min(rem, nextLogTime - now);
    rem = Math.max(rem, 1);
    wait(rem);
  }
}
```
JournalSet.mapJournalsAndReportErrors():

```java
for (JournalAndStream jas : journals) {
  try {
    // e.g. FileJournalManager
    closure.apply(jas);
  } catch (Throwable t) {
    if (jas.isRequired()) {
      final String msg = "Error: " + status + " failed for required journal ("
          + jas + ")";
      LOG.fatal(msg, t);
      // If we fail on *any* of the required journals, then we must not
      // continue on any of the other journals. Abort them to ensure that
      // retry behavior doesn't allow them to keep going in any way.
      abortAllJournals();
      // the current policy is to shutdown the NN on errors to shared edits
      // dir. There are many code paths to shared edits failures - syncs,
      // roll of edits etc. All of them go through this common function
      // where the isRequired() check is made. Applying exit policy here
      // to catch all code paths.
      terminate(1, msg);
    } else {
      LOG.error("Error: " + status + " failed for (journal " + jas + ")", t);
      badJAS.add(jas);
    }
  }
}
```
Summary
- From the source-code analysis
1. Focus on the line `if (rem <= 0) { throw new TimeoutException(); }`, together with `long now = Time.monotonicNow();` and `long rem = et - now;` inside the loop. If a Full GC occurs before `now` is assigned in the `while (true)` loop and the accumulated wait exceeds 20 s, a TimeoutException is thrown here. It propagates up to JournalSet.mapJournalsAndReportErrors, which catches it and calls terminate, so the NameNode exits directly.
- Solutions
1. Patch the source so the timeout check measures actual code execution time, excluding GC pauses.
2. (Recommended) Tune the JVM instead: increase the heap, switch to the G1 collector, and set an expected maximum stop-the-world pause (in ms) via the startup flags:
-Xmx4096M -Xms4096M -XX:+UseG1GC -XX:MaxGCPauseMillis=100
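To make the failure mode concrete, here is a minimal self-contained sketch (not Hadoop code; the class and method names are hypothetical) of the deadline check in QuorumCall.waitFor. The deadline is computed once from the first clock read, so any stop-the-world pause between subsequent clock reads is charged against the 20 s budget even though no JournalNode was actually slow:

```java
import java.util.concurrent.TimeoutException;

public class GcTimeoutDemo {
    // Simplified model of waitFor's deadline logic: the deadline is fixed
    // up front, and each loop iteration re-reads the clock. A long pause
    // (e.g. a Full GC) between reads eats the timeout budget.
    static void checkDeadline(long startMs, long timeoutMs, long nowMs)
            throws TimeoutException {
        long et = startMs + timeoutMs;   // absolute deadline, like et = st + millis
        long rem = et - nowMs;           // remaining budget, like rem = et - now
        if (rem <= 0) {
            throw new TimeoutException("waited " + (nowMs - startMs) + " ms");
        }
    }

    public static void main(String[] args) {
        long start = 0;
        long timeout = 20_000;           // 20 s, like writeTimeoutMs
        try {
            // a normal loop pass 5 s in: plenty of budget left
            checkDeadline(start, timeout, start + 5_000);
            System.out.println("5s elapsed: still waiting");
            // simulate a 25 s stop-the-world pause before the next clock read
            checkDeadline(start, timeout, start + 5_000 + 25_000);
        } catch (TimeoutException e) {
            System.out.println("TimeoutException: " + e.getMessage());
        }
    }
}
```

With a real `Time.monotonicNow()` the same thing happens: the JournalNodes may all be healthy, yet the NameNode times out purely because its own GC pause consumed the window.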
Problem 2
JournalNode 30 was restarted; after a 6-minute interval, node 31; after another 6 minutes, node 32. The NameNode then logged:
2022-01-12 13:21:29,779 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal 10.32.40.30:8485 failed to write txns 2887442538-2887442538. Will try to write to this JN again after the next log roll.
java.io.EOFException: End of File Exception between local host is: "shyt-hadoop-4031.xxx.com.cn/10.32.40.31"; destination host is: "shyt-hadoop-4030.xxx.com.cn":8485; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
at org.apache.hadoop.ipc.Client.call(Client.java:1498)
at org.apache.hadoop.ipc.Client.call(Client.java:1398)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy11.journal(Unknown Source)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167)
at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:385)
at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:378)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1119)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1014)
2022-01-12 13:21:29,796 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [10.32.40.32:8485, 10.32.40.30:8485, 10.32.40.31:8485], stream=QuorumOutputStream starting at txid 2887441291))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 1 successful responses:
10.32.40.31:8485: null [success]
2 exceptions thrown:
10.32.40.30:8485: End of File Exception between local host is: "shyt-hadoop-4031.xxx.com.cn/10.32.40.31"; destination host is: "shyt-hadoop-4030.xxx.com.cn":8485; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
10.32.40.32:8485: /export/hadoop/hdfs/journal/xxxdata/current/last-promised-epoch.tmp (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
at org.apache.hadoop.hdfs.util.AtomicFileOutputStream.<init>(AtomicFileOutputStream.java:58)
at org.apache.hadoop.hdfs.util.PersistentLongFile.writeFile(PersistentLongFile.java:78)
at org.apache.hadoop.hdfs.util.PersistentLongFile.set(PersistentLongFile.java:64)
at org.apache.hadoop.hdfs.qjournal.server.Journal.updateLastPromisedEpoch(Journal.java:326)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:434)
at org.apache.hadoop.hdfs.qjournal.server.Journal.heartbeat(Journal.java:417)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.heartbeat(JournalNodeRpcServer.java:159)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.heartbeat(QJournalProtocolServerSideTranslatorPB.java:172)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25423)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:707)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:641)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2721)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2586)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:736)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:409)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
2022-01-12 13:21:29,796 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 2887441291
2022-01-12 13:21:29,814 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
Analysis
Of the 3 JournalNodes, only one responded successfully:
- node 30 was being restarted, so failing to connect to it is expected;
- node 31 connected successfully;
- node 32 failed with an exception: its last-promised-epoch.tmp file could not be created because the journal storage directory was missing.
When the NameNode pushes edit-log entries to the JournalNodes, a strict majority must succeed, i.e. 2 of the 3 nodes must return success; otherwise the NameNode calls terminate and exits.
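The majority rule can be sketched with a small hypothetical helper (not a Hadoop class, though Hadoop computes the write quorum the same way, as n / 2 + 1):

```java
public class WriteQuorum {
    // A write quorum is a strict majority of the JournalNode set.
    static int majority(int journalNodes) {
        return journalNodes / 2 + 1;
    }

    // A flush can still be acknowledged only while enough non-failed
    // JournalNodes remain to form a majority; once failures make that
    // impossible, the NameNode gives up and terminates.
    static boolean flushCanStillSucceed(int journalNodes, int failures) {
        return journalNodes - failures >= majority(journalNodes);
    }

    public static void main(String[] args) {
        System.out.println(majority(3));                // 2
        System.out.println(flushCanStillSucceed(3, 1)); // true: 2 nodes can still ack
        System.out.println(flushCanStillSucceed(3, 2)); // false: the situation in the log
    }
}
```

In the log above, nodes 30 and 32 both failed, so only 1 of 3 could succeed, the 2/3 quorum was unreachable, and the NameNode exited with status 1.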
Solution
Re-initialize the JournalNode storage directory on node 32:
1. sudo su - hdfs && mkdir -p /export/hadoop/hdfs/journal/xxxdata/current/
2. Copy /export/hadoop/hdfs/journal/xxxdata/current/VERSION from node 31 to node 32.
3. Restart the JournalNode service on node 32. Problem solved.