Thoughts prompted by an ExoPlayer ANR
Recently our backend has received a large number of ExoPlayer ANR reports. A typical stack trace looks like this:
Subject: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 2. Wait queue head age: 6515.7ms.)
nullAndroid time :[2020-01-12 18:13:30.52] [276172.906]
CPU usage from 54829ms to 61ms ago (2020-01-12 18:12:30.330 to 2020-01-12 18:13:25.099) with 99% awake:
99% 610/media.codec: 99% user + 0% kernel / faults: 110 minor
9.1% 948/system_server: 6.2% user + 2.9% kernel / faults: 12737 minor 62 major
3.2% 30834/com.android.wifisettings: 2.7% user + 0.5% kernel / faults: 18373 minor 19 major
2.7% 441/surfaceflinger: 1.3% user + 1.3% kernel / faults: 1645 minor 3 major
1.8% 96/kswapd0: 0% user + 1.8% kernel
1.8% 262/exe_cq: 0% user + 1.8% kernel
1.3% 424/android.hardware.graphics.composer@2.1-service: 0.6% user + 0.7% kernel / faults: 1154 minor 1 major
0.6% 612/camerahalserver: 0.4% user + 0.2% kernel / faults: 8438 minor 162 major
19% TOTAL: 16% user + 2.5% kernel + 0.7% iowait + 0% softirq
"main" prio=5 tid=1 Waiting
| group="main" sCount=1 dsCount=0 flags=1 obj=0x72fd7f68 self=0xe35da000
| sysTid=31132 nice=-4 cgrp=default sched=0/0 handle=0xe82f04a4
| state=S schedstat=( 1730336018 88469930 2679 ) utm=152 stm=20 core=6 HZ=100
| stack=0xff16a000-0xff16c000 stackSize=8MB
| held mutexes=
at java.lang.Object.wait(Native method)
- waiting on <0x02c720e2> (a com.google.android.exoplayer2.PlayerMessage)
at com.google.android.exoplayer2.PlayerMessage.k(PlayerMessage.java:282)
- locked <0x02c720e2> (a com.google.android.exoplayer2.PlayerMessage)
at com.google.android.exoplayer2.SimpleExoPlayer.setVideoSurfaceInternal(SimpleExoPlayer.java:986)
at com.google.android.exoplayer2.SimpleExoPlayer.setVideoSurfaceInternal(SimpleExoPlayer.java:1001)
at com.google.android.exoplayer2.SimpleExoPlayer.access$1400(SimpleExoPlayer.java:59)
at com.google.android.exoplayer2.SimpleExoPlayer$a.onSurfaceTextureDestroyed(SimpleExoPlayer.java:1191)
at android.view.TextureView.releaseSurfaceTexture(TextureView.java:249)
at android.view.TextureView.onDetachedFromWindowInternal(TextureView.java:222)
at android.view.View.dispatchDetachedFromWindow(View.java:17690)
at android.view.ViewGroup.dispatchDetachedFromWindow(ViewGroup.java:3781)
at android.view.ViewGroup.dispatchDetachedFromWindow(ViewGroup.java:3781)
at android.view.ViewGroup.dispatchDetachedFromWindow(ViewGroup.java:3781)
... repeated 0 times
at android.os.Handler.handleCallback(Handler.java:790)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:192)
at android.app.ActivityThread.main(ActivityThread.java:6887)
at java.lang.reflect.Method.invoke(Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:549)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:875)
The log shows the main thread stuck inside SimpleExoPlayer in ExoPlayer.
1. The main thread is waiting, so first look at where it is stuck
The relevant method in PlayerMessage.java:
public synchronized boolean blockUntilDelivered() throws InterruptedException {
Assertions.checkState(isSent);
Assertions.checkState(handler.getLooper().getThread() != Thread.currentThread());
while (!isProcessed) {
wait();
}
return isDelivered;
}
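There is a bare wait() here, and the line number matches the stack trace, so this wait() is what keeps the main thread waiting. An unbounded wait on the main thread is a risky design, so let's look for the matching notify. It is in the same class, in the following method: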
public synchronized void markAsProcessed(boolean isDelivered) {
this.isDelivered |= isDelivered;
isProcessed = true;
notifyAll();
}
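In the normal case markAsProcessed is called and the waiting thread wakes up. In the ANR case this method was evidently not called in time, so the main thread stayed blocked.
2. Trace the call flow
The setSurface call is ultimately executed on the playback thread inside ExoPlayerImplInternal: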
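The handshake between blockUntilDelivered and markAsProcessed can be reproduced outside Android. The following is a minimal sketch, not ExoPlayer code: HandshakeMessage and HandshakeDemo are hypothetical names, and a sleeping worker thread stands in for the playback thread. It shows that the caller stays blocked for exactly as long as the worker takes to reach markAsProcessed.

```java
import java.util.concurrent.TimeUnit;

/** Simplified stand-in for PlayerMessage: only the wait/notify handshake is modeled. */
final class HandshakeMessage {
  private boolean isDelivered;
  private boolean isProcessed;

  /** Blocks the calling thread until another thread calls markAsProcessed. */
  synchronized boolean blockUntilDelivered() throws InterruptedException {
    while (!isProcessed) {
      wait();
    }
    return isDelivered;
  }

  synchronized void markAsProcessed(boolean delivered) {
    isDelivered |= delivered;
    isProcessed = true;
    notifyAll();
  }
}

final class HandshakeDemo {
  /** Returns how long (in ms) the caller was blocked while the worker needed workMs. */
  static long blockedMillis(long workMs) {
    HandshakeMessage message = new HandshakeMessage();
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(workMs); // stands in for slow work on the playback thread
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        message.markAsProcessed(true);
      }
    });
    long startNs = System.nanoTime();
    worker.start();
    try {
      message.blockUntilDelivered(); // the "main thread" side of the handshake
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
  }

  public static void main(String[] args) {
    System.out.println("caller blocked for about " + blockedMillis(300) + " ms");
  }
}
```

If the worker needed several seconds instead of a few hundred milliseconds, the caller would be blocked for the same several seconds, which on the Android main thread is enough to trigger an ANR.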
private void sendMessageInternal(PlayerMessage message) throws ExoPlaybackException {
if (message.getPositionMs() == C.TIME_UNSET) {
// If no delivery time is specified, trigger immediate message delivery.
sendMessageToTarget(message);
} else if (mediaSource == null || pendingPrepareCount > 0) {
// Still waiting for initial timeline to resolve position.
pendingMessages.add(new PendingMessageInfo(message));
} else {
PendingMessageInfo pendingMessageInfo = new PendingMessageInfo(message);
if (resolvePendingMessagePosition(pendingMessageInfo)) {
pendingMessages.add(pendingMessageInfo);
// Ensure new message is inserted according to playback order.
Collections.sort(pendingMessages);
} else {
message.markAsProcessed(/* isDelivered= */ false);
}
}
}
private void sendMessageToTarget(PlayerMessage message) throws ExoPlaybackException {
if (message.getHandler().getLooper() == handler.getLooper()) {
deliverMessage(message);
if (playbackInfo.playbackState == Player.STATE_READY
|| playbackInfo.playbackState == Player.STATE_BUFFERING) {
// The message may have caused something to change that now requires us to do work.
handler.sendEmptyMessage(MSG_DO_SOME_WORK);
}
} else {
handler.obtainMessage(MSG_SEND_MESSAGE_TO_TARGET_THREAD, message).sendToTarget();
}
}
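The message in question is created with player.createMessage(renderer) and no explicit handler is set on it, so its handler runs on the playback thread's looper and message.getHandler().getLooper() == handler.getLooper() holds for this call path. Execution therefore goes straight into deliverMessage: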
private void deliverMessage(PlayerMessage message) throws ExoPlaybackException {
if (message.isCanceled()) {
return;
}
try {
message.getTarget().handleMessage(message.getType(), message.getPayload());
} finally {
message.markAsProcessed(/* isDelivered= */ true);
}
}
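The try/finally in deliverMessage matters for the analysis: the waiter is released whether the target returns normally or throws, so the only way to keep it waiting is for the call not to have finished (or not to have started). A minimal sketch of that guarantee follows; DeliverDemo, Msg and Target are hypothetical names that only mimic the shape of the ExoPlayer code.

```java
final class DeliverDemo {
  /** Mimics PlayerMessage.Target: the receiver of a message. */
  interface Target {
    void handleMessage(int type, Object payload) throws Exception;
  }

  /** Mimics the processed/delivered bookkeeping of PlayerMessage. */
  static final class Msg {
    private boolean delivered;
    private boolean processed;

    synchronized void markAsProcessed(boolean isDelivered) {
      delivered |= isDelivered;
      processed = true;
      notifyAll(); // wakes any thread blocked in a matching wait()
    }

    synchronized boolean isProcessed() {
      return processed;
    }
  }

  /** Same shape as deliverMessage: the finally block always marks the message. */
  static void deliver(Msg message, Target target) throws Exception {
    try {
      target.handleMessage(/* type= */ 1, /* payload= */ null);
    } finally {
      message.markAsProcessed(/* isDelivered= */ true);
    }
  }

  /** Returns true if the message was marked processed even though the target threw. */
  static boolean processedDespiteFailure() {
    Msg message = new Msg();
    try {
      deliver(message, (type, payload) -> {
        throw new IllegalStateException("simulated renderer failure");
      });
    } catch (Exception expected) {
      // The failure itself is not the point; the bookkeeping in finally is.
    }
    return message.isProcessed();
  }

  public static void main(String[] args) {
    System.out.println("processed despite failure: " + processedDespiteFailure());
  }
}
```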
The key line is:
message.getTarget().handleMessage(message.getType(), message.getPayload());
What does message.getTarget() refer to?
Go back to SimpleExoPlayer->setVideoSurfaceInternal(...):
if (renderer.getTrackType() == C.TRACK_TYPE_VIDEO) {
messages.add(
player.createMessage(renderer).setType(C.MSG_SET_SURFACE).setPayload(surface).send());
}
The target is a renderer whose track type is video, so with the default renderers it is MediaCodecVideoRenderer.
3. MediaCodecVideoRenderer-->handleMessage
public void handleMessage(int messageType, @Nullable Object message) throws ExoPlaybackException {
if (messageType == C.MSG_SET_SURFACE) {
setSurface((Surface) message);
} else if (messageType == C.MSG_SET_SCALING_MODE) {
scalingMode = (Integer) message;
MediaCodec codec = getCodec();
if (codec != null) {
codec.setVideoScalingMode(scalingMode);
}
} else if (messageType == C.MSG_SET_VIDEO_FRAME_METADATA_LISTENER) {
frameMetadataListener = (VideoFrameMetadataListener) message;
} else {
super.handleMessage(messageType, message);
}
}
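Now look at what the playback thread was doing when the ANR occurred. The thread is created with the name shown below, followed by its stack from the ANR report: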
internalPlaybackThread =
new HandlerThread("ExoPlayerImplInternal:Handler", Process.THREAD_PRIORITY_AUDIO);
"ExoPlayerImplInternal:Handler" prio=5 tid=85 Native
| group="main" sCount=1 dsCount=0 flags=1 obj=0x13603048 self=0xbab14c00
| sysTid=31381 nice=-16 cgrp=default sched=0/0 handle=0xb5b7f970
| state=S schedstat=( 70064540 8235311 408 ) utm=5 stm=1 core=4 HZ=100
| stack=0xb5a7d000-0xb5a7f000 stackSize=1038KB
| held mutexes=
kernel: __switch_to+0x8c/0xb0
kernel: futex_wait_queue_me+0xc0/0x140
kernel: futex_wait+0xf4/0x228
kernel: do_futex+0x4f8/0xb00
kernel: compat_SyS_futex+0x94/0x158
kernel: cpu_switch_to+0x26c/0x2b8
native: #00 pc 00018cf4 /system/lib/libc.so (syscall+28)
native: #01 pc 00047293 /system/lib/libc.so (__pthread_cond_timedwait(pthread_cond_internal_t*, pthread_mutex_t*, bool, timespec const*)+102)
native: #02 pc 00010a2d /system/lib/libstagefright_foundation.so (android::ALooper::awaitResponse(android::sp<android::AReplyToken> const&, android::sp<android::AMessage>*)+92)
native: #03 pc 00012185 /system/lib/libstagefright_foundation.so (android::AMessage::postAndAwaitResponse(android::sp<android::AMessage>*)+136)
native: #04 pc 000e2d41 /system/lib/libstagefright.so (android::MediaCodec::PostAndAwaitResponse(android::sp<android::AMessage> const&, android::sp<android::AMessage>*)+20)
native: #05 pc 000e21f7 /system/lib/libstagefright.so (android::MediaCodec::init(android::AString const&, bool, bool)+1210)
native: #06 pc 000e24d7 /system/lib/libstagefright.so (android::MediaCodec::CreateByComponentName(android::sp<android::ALooper> const&, android::AString const&, int*, int, unsigned int)+482)
native: #07 pc 0001e8ad /system/lib/libmedia_jni.so (android::JMediaCodec::JMediaCodec(_JNIEnv*, _jobject*, char const*, bool, bool)+244)
native: #08 pc 000222cd /system/lib/libmedia_jni.so (???)
native: #09 pc 005e2f8b /system/framework/arm/boot-framework.oat (Java_android_media_MediaCodec_native_1setup__Ljava_lang_String_2ZZ+106)
at android.media.MediaCodec.native_setup(Native method)
at android.media.MediaCodec.<init>(MediaCodec.java:1799)
at android.media.MediaCodec.createByCodecName(MediaCodec.java:1780)
at com.google.android.exoplayer2.mediacodec.MediaCodecRenderer.maybeInitCodec(MediaCodecRenderer.java:415)
at com.google.android.exoplayer2.mediacodec.MediaCodecRenderer.onInputFormatChanged(MediaCodecRenderer.java:920)
at com.google.android.exoplayer2.video.e.onInputFormatChanged(MediaCodecVideoRenderer.java:508)
at com.google.android.exoplayer2.mediacodec.MediaCodecRenderer.render(MediaCodecRenderer.java:557)
at com.google.android.exoplayer2.d.h(ExoPlayerImplInternal.java:519)
at com.google.android.exoplayer2.d.handleMessage(ExoPlayerImplInternal.java:298)
at android.os.Handler.dispatchMessage(Handler.java:102)
at android.os.Looper.loop(Looper.java:192)
at android.os.HandlerThread.run(HandlerThread.java:65)
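The main thread is blocked while the playback thread is still inside MediaCodecRenderer: the trace shows it in MediaCodec.createByCodecName, waiting in native code for a response (MediaCodec::init -> PostAndAwaitResponse), and the CPU summary lists media.codec at 99%. The trace alone does not prove how long that call had been running, but every code path traced above ends in markAsProcessed(...) unless something ahead of it is slow, so a time-consuming operation on the playback thread is the most plausible cause.
4. The playback thread is doing slow work. Can it be optimized?
ExoPlayer drives MediaCodec in synchronous mode. The advantage is that playback state is easy to keep consistent; the drawback is that any slow codec call blocks the playback thread, and whatever is waiting on that thread, such as the main thread here, can end up in an ANR.
MediaCodec resources are quite limited. If too many MediaCodec instances are held at the same time, codec resources become scarce and almost any MediaCodec call can become slow.
We cannot change the architecture for now, so we look at the flow instead.
5. How to solve this problem?
The ANR stack comes from TextureView's onSurfaceTextureDestroyed callback, so the view is being torn down and the player is no longer actively rendering to it. That makes it reasonable to add a check: if the playback thread takes longer than a given time, call markAsProcessed(...) right away and release the main thread.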
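The situation is a form of head-of-line blocking: the playback thread handles its messages one at a time, so a message that only sets a surface cannot run until the slow work in front of it has finished. The following sketch models this with a single-threaded executor. It is an illustration under assumptions, not ExoPlayer code: HeadOfLineDemo is a hypothetical name, the first task stands in for the slow codec initialization, and the second for the set-surface message.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class HeadOfLineDemo {
  /**
   * Queues a slow task followed by a trivial one on a single thread and reports
   * whether the trivial one finished within waitMs.
   */
  static boolean secondTaskFinishedWithin(long slowMs, long waitMs) {
    ExecutorService playbackThread = Executors.newSingleThreadExecutor();
    CountDownLatch surfaceMessageHandled = new CountDownLatch(1);
    try {
      playbackThread.execute(() -> sleepQuietly(slowMs));       // slow codec init
      playbackThread.execute(surfaceMessageHandled::countDown); // set-surface message
      // The caller plays the role of the main thread waiting for the message.
      return surfaceMessageHandled.await(waitMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      playbackThread.shutdownNow(); // interrupts the sleeper so the demo exits promptly
    }
  }

  private static void sleepQuietly(long ms) {
    try {
      Thread.sleep(ms);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("fast head of queue: " + secondTaskFinishedWithin(50, 2000));
    System.out.println("slow head of queue: " + secondTaskFinishedWithin(500, 100));
  }
}
```

The trivial task itself costs almost nothing; whether the waiter gets its answer in time depends entirely on what is queued ahead of it.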
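private void setVideoSurfaceInternal(Surface surface, boolean ownsSurface, boolean needReleaseCodec) {
//......
if (this.surface != null && this.surface != surface) {
// We're replacing a surface. Block to ensure that it's not accessed after the method returns.
try {
for (PlayerMessage message : messages) {
// EXOPLAYER_OPTIMIZATION_START
// Arm the watchdog BEFORE blocking. Posting it after blockUntilDelivered() returns
// would be too late, because by then the main thread has already been stuck in wait().
Message msg = mPlayerMessageHandler.obtainMessage(MSG_PLAYER_MESSAGE_PROCESSED, message);
mPlayerMessageHandler.sendMessageDelayed(msg, DELAY_INTERVAL);
message.blockUntilDelivered();
// Processed in time (or released by the watchdog): drop the pending check.
mPlayerMessageHandler.removeMessages(MSG_PLAYER_MESSAGE_PROCESSED, message);
// EXOPLAYER_OPTIMIZATION_END
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
// If we created the previous surface, we are responsible for releasing it.
if (this.ownsSurface) {
this.surface.release();
}
}
this.surface = surface;
this.ownsSurface = ownsSurface;
}
// EXOPLAYER_OPTIMIZATION_START
private static final int MSG_PLAYER_MESSAGE_PROCESSED = 0x1;
private static final int DELAY_INTERVAL = 5000;
// The watchdog needs its own thread: a Handler bound to the main Looper can never run
// while the main thread is blocked in wait(). The thread name is an arbitrary choice,
// and the thread should be stopped with quit() when the player is released.
private final HandlerThread mPlayerMessageThread = new HandlerThread("PlayerMessageWatchdog");
private final Handler mPlayerMessageHandler;
{
mPlayerMessageThread.start();
mPlayerMessageHandler = new Handler(mPlayerMessageThread.getLooper()) {
@Override
public void handleMessage(Message msg) {
if (msg.what == MSG_PLAYER_MESSAGE_PROCESSED) {
PlayerMessage message = (PlayerMessage) msg.obj;
// markAsProcessed is synchronized and only sets flags, so calling it on a
// message that has already been processed is harmless.
message.markAsProcessed(false);
}
}
};
}
// EXOPLAYER_OPTIMIZATION_END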
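The idea is to check after 5 s whether the current PlayerMessage has been processed; if it has not, stop making the main thread wait.
This is only one possible approach. Note that it must not be used while the player is actively in use, because it leaves the player in an inconsistent state.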
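The same idea can also be expressed as a bounded wait inside the message itself, so that the caller gives up after a timeout instead of being released by a separate handler. The following is a minimal sketch under assumptions: TimedMessage and TimedWaitDemo are hypothetical names, the class only mimics PlayerMessage, and the timeout values are arbitrary.

```java
import java.util.concurrent.TimeUnit;

/** Simplified stand-in for PlayerMessage with a bounded wait (hypothetical, not ExoPlayer code). */
final class TimedMessage {
  private boolean isDelivered;
  private boolean isProcessed;

  /** Returns true if delivered in time, false if the timeout expired first. */
  synchronized boolean blockUntilDelivered(long timeoutMs) throws InterruptedException {
    long deadlineNs = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!isProcessed) {
      long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadlineNs - System.nanoTime());
      if (remainingMs <= 0) {
        return false; // gave up: the caller must treat the message as not delivered
      }
      wait(remainingMs); // loop again to guard against spurious wakeups
    }
    return isDelivered;
  }

  synchronized void markAsProcessed(boolean delivered) {
    isDelivered |= delivered;
    isProcessed = true;
    notifyAll();
  }
}

final class TimedWaitDemo {
  /** Returns true if a worker needing workMs delivered the message within timeoutMs. */
  static boolean deliveredWithin(long timeoutMs, long workMs) {
    TimedMessage message = new TimedMessage();
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(workMs); // stands in for slow codec work on the playback thread
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      message.markAsProcessed(true);
    });
    worker.setDaemon(true); // do not keep the JVM alive for a late worker
    worker.start();
    try {
      return message.blockUntilDelivered(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("fast worker delivered: " + deliveredWithin(2000, 50));
    System.out.println("slow worker delivered: " + deliveredWithin(100, 1000));
  }
}
```

The caveat from the text applies here as well: after a timeout the worker may still be touching the old surface, so this is only reasonable on teardown paths.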