Reflections on an ExoPlayer ANR

2020-01-15  码上就说

Recently our backend has been receiving a lot of ExoPlayer ANR reports. The stack looks like this:

Subject: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 2.  Wait queue head age: 6515.7ms.)

nullAndroid time :[2020-01-12 18:13:30.52] [276172.906]
CPU usage from 54829ms to 61ms ago (2020-01-12 18:12:30.330 to 2020-01-12 18:13:25.099) with 99% awake:
  99% 610/media.codec: 99% user + 0% kernel / faults: 110 minor
  9.1% 948/system_server: 6.2% user + 2.9% kernel / faults: 12737 minor 62 major
  3.2% 30834/com.android.wifisettings: 2.7% user + 0.5% kernel / faults: 18373 minor 19 major
  2.7% 441/surfaceflinger: 1.3% user + 1.3% kernel / faults: 1645 minor 3 major
  1.8% 96/kswapd0: 0% user + 1.8% kernel
  1.8% 262/exe_cq: 0% user + 1.8% kernel
  1.3% 424/android.hardware.graphics.composer@2.1-service: 0.6% user + 0.7% kernel / faults: 1154 minor 1 major
  0.6% 612/camerahalserver: 0.4% user + 0.2% kernel / faults: 8438 minor 162 major

19% TOTAL: 16% user + 2.5% kernel + 0.7% iowait + 0% softirq


"main" prio=5 tid=1 Waiting
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x72fd7f68 self=0xe35da000
  | sysTid=31132 nice=-4 cgrp=default sched=0/0 handle=0xe82f04a4
  | state=S schedstat=( 1730336018 88469930 2679 ) utm=152 stm=20 core=6 HZ=100
  | stack=0xff16a000-0xff16c000 stackSize=8MB
  | held mutexes=
  at java.lang.Object.wait(Native method)
  - waiting on <0x02c720e2> (a com.google.android.exoplayer2.PlayerMessage)
  at com.google.android.exoplayer2.PlayerMessage.k(PlayerMessage.java:282)
  - locked <0x02c720e2> (a com.google.android.exoplayer2.PlayerMessage)
  at com.google.android.exoplayer2.SimpleExoPlayer.setVideoSurfaceInternal(SimpleExoPlayer.java:986)
  at com.google.android.exoplayer2.SimpleExoPlayer.setVideoSurfaceInternal(SimpleExoPlayer.java:1001)
  at com.google.android.exoplayer2.SimpleExoPlayer.access$1400(SimpleExoPlayer.java:59)
  at com.google.android.exoplayer2.SimpleExoPlayer$a.onSurfaceTextureDestroyed(SimpleExoPlayer.java:1191)
  at android.view.TextureView.releaseSurfaceTexture(TextureView.java:249)
  at android.view.TextureView.onDetachedFromWindowInternal(TextureView.java:222)
  at android.view.View.dispatchDetachedFromWindow(View.java:17690)
  at android.view.ViewGroup.dispatchDetachedFromWindow(ViewGroup.java:3781)
  at android.view.ViewGroup.dispatchDetachedFromWindow(ViewGroup.java:3781)
  at android.view.ViewGroup.dispatchDetachedFromWindow(ViewGroup.java:3781)
  ... repeated 0 times
  at android.os.Handler.handleCallback(Handler.java:790)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:192)
  at android.app.ActivityThread.main(ActivityThread.java:6887)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:549)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:875)

According to the log, the main thread is stuck inside ExoPlayer's SimpleExoPlayer.

1. The main thread is waiting, so first look at where it is stuck

The method in PlayerMessage.java:

  public synchronized boolean blockUntilDelivered() throws InterruptedException {
    Assertions.checkState(isSent);
    Assertions.checkState(handler.getLooper().getThread() != Thread.currentThread());
    while (!isProcessed) {
      wait();
    }
    return isDelivered;
  }

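There is a bare wait() here, and the line number matches the stack trace, so this wait() is what keeps the main thread blocked. An unbounded wait on the main thread is not something you would normally design in, so let's look for the matching notify. It is in the same class: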

  public synchronized void markAsProcessed(boolean isDelivered) {
    this.isDelivered |= isDelivered;
    isProcessed = true;
    notifyAll();
  }

So in the normal case markAsProcessed is called. When the ANR happens, this method evidently was not called in time, and the main thread stays blocked.
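To make the handshake concrete, here is a minimal plain-Java sketch of the same wait/notifyAll pattern, with one change: the wait is bounded by a deadline. `DemoMessage` and `TimedHandshakeDemo` are hypothetical stand-ins written for this illustration; they are not ExoPlayer classes, and the timeout parameter is an assumption rather than the code shown above.

```java
import java.util.concurrent.TimeUnit;

class TimedHandshakeDemo {

  /** Hypothetical stand-in for PlayerMessage, reduced to its wait/notify handshake. */
  static final class DemoMessage {
    private boolean isProcessed;
    private boolean isDelivered;

    /** Like blockUntilDelivered(), but returns false once timeoutMs has elapsed. */
    synchronized boolean blockUntilDelivered(long timeoutMs) throws InterruptedException {
      long deadlineNs = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
      while (!isProcessed) {
        long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadlineNs - System.nanoTime());
        if (remainingMs <= 0) {
          return false; // Timed out: stop waiting instead of blocking forever.
        }
        wait(remainingMs); // Loop again afterwards to handle spurious wakeups.
      }
      return isDelivered;
    }

    synchronized void markAsProcessed(boolean delivered) {
      isDelivered |= delivered;
      isProcessed = true;
      notifyAll();
    }
  }

  /** The worker marks the message after workMs; the caller waits at most timeoutMs. */
  static boolean run(long workMs, long timeoutMs) {
    DemoMessage message = new DemoMessage();
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(workMs); // Simulated work on the playback thread.
      } catch (InterruptedException e) {
        return;
      }
      message.markAsProcessed(true);
    }, "FakePlaybackThread");
    worker.setDaemon(true);
    worker.start();
    try {
      return message.blockUntilDelivered(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(run(50, 2000));  // Worker finishes well inside the timeout.
    System.out.println(run(2000, 100)); // Worker is too slow, so the caller gives up.
  }
}
```

With a bounded wait the caller has to decide what a timeout means. In this sketch it is simply reported as "not delivered".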

2. Trace the call flow

[Figure: setSurface call flow]

The message ultimately runs on the playback thread inside ExoPlayerImplInternal:

  private void sendMessageInternal(PlayerMessage message) throws ExoPlaybackException {
    if (message.getPositionMs() == C.TIME_UNSET) {
      // If no delivery time is specified, trigger immediate message delivery.
      sendMessageToTarget(message);
    } else if (mediaSource == null || pendingPrepareCount > 0) {
      // Still waiting for initial timeline to resolve position.
      pendingMessages.add(new PendingMessageInfo(message));
    } else {
      PendingMessageInfo pendingMessageInfo = new PendingMessageInfo(message);
      if (resolvePendingMessagePosition(pendingMessageInfo)) {
        pendingMessages.add(pendingMessageInfo);
        // Ensure new message is inserted according to playback order.
        Collections.sort(pendingMessages);
      } else {
        message.markAsProcessed(/* isDelivered= */ false);
      }
    }
  }

  private void sendMessageToTarget(PlayerMessage message) throws ExoPlaybackException {
    if (message.getHandler().getLooper() == handler.getLooper()) {
      deliverMessage(message);
      if (playbackInfo.playbackState == Player.STATE_READY
          || playbackInfo.playbackState == Player.STATE_BUFFERING) {
        // The message may have caused something to change that now requires us to do work.
        handler.sendEmptyMessage(MSG_DO_SOME_WORK);
      }
    } else {
      handler.obtainMessage(MSG_SEND_MESSAGE_TO_TARGET_THREAD, message).sendToTarget();
    }
  }

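For this message, message.getHandler().getLooper() == handler.getLooper() holds, because the call site never sets a custom handler and the message therefore stays on the playback thread's looper. So execution always reaches deliverMessage: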

  private void deliverMessage(PlayerMessage message) throws ExoPlaybackException {
    if (message.isCanceled()) {
      return;
    }
    try {
      message.getTarget().handleMessage(message.getType(), message.getPayload());
    } finally {
      message.markAsProcessed(/* isDelivered= */ true);
    }
  }

The key line is:
message.getTarget().handleMessage(message.getType(), message.getPayload());
What does message.getTarget() refer to here?
Go back to SimpleExoPlayer -> setVideoSurfaceInternal(...):

      if (renderer.getTrackType() == C.TRACK_TYPE_VIDEO) {
        messages.add(
            player.createMessage(renderer).setType(C.MSG_SET_SURFACE).setPayload(surface).send());
      }

The target is a renderer whose track type is video, so in a default setup it is MediaCodecVideoRenderer.

3. MediaCodecVideoRenderer --> handleMessage

  public void handleMessage(int messageType, @Nullable Object message) throws ExoPlaybackException {
    if (messageType == C.MSG_SET_SURFACE) {
      setSurface((Surface) message);
    } else if (messageType == C.MSG_SET_SCALING_MODE) {
      scalingMode = (Integer) message;
      MediaCodec codec = getCodec();
      if (codec != null) {
        codec.setVideoScalingMode(scalingMode);
      }
    } else if (messageType == C.MSG_SET_VIDEO_FRAME_METADATA_LISTENER) {
      frameMetadataListener = (VideoFrameMetadataListener) message;
    } else {
      super.handleMessage(messageType, message);
    }
  }

At this point it is worth checking what the playback thread was doing when the ANR occurred. The thread is named as follows:

    internalPlaybackThread =
        new HandlerThread("ExoPlayerImplInternal:Handler", Process.THREAD_PRIORITY_AUDIO);

Its stack at the time of the ANR:

"ExoPlayerImplInternal:Handler" prio=5 tid=85 Native
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x13603048 self=0xbab14c00
  | sysTid=31381 nice=-16 cgrp=default sched=0/0 handle=0xb5b7f970
  | state=S schedstat=( 70064540 8235311 408 ) utm=5 stm=1 core=4 HZ=100
  | stack=0xb5a7d000-0xb5a7f000 stackSize=1038KB
  | held mutexes=
  kernel: __switch_to+0x8c/0xb0
  kernel: futex_wait_queue_me+0xc0/0x140
  kernel: futex_wait+0xf4/0x228
  kernel: do_futex+0x4f8/0xb00
  kernel: compat_SyS_futex+0x94/0x158
  kernel: cpu_switch_to+0x26c/0x2b8
  native: #00 pc 00018cf4  /system/lib/libc.so (syscall+28)
  native: #01 pc 00047293  /system/lib/libc.so (__pthread_cond_timedwait(pthread_cond_internal_t*, pthread_mutex_t*, bool, timespec const*)+102)
  native: #02 pc 00010a2d  /system/lib/libstagefright_foundation.so (android::ALooper::awaitResponse(android::sp<android::AReplyToken> const&, android::sp<android::AMessage>*)+92)
  native: #03 pc 00012185  /system/lib/libstagefright_foundation.so (android::AMessage::postAndAwaitResponse(android::sp<android::AMessage>*)+136)
  native: #04 pc 000e2d41  /system/lib/libstagefright.so (android::MediaCodec::PostAndAwaitResponse(android::sp<android::AMessage> const&, android::sp<android::AMessage>*)+20)
  native: #05 pc 000e21f7  /system/lib/libstagefright.so (android::MediaCodec::init(android::AString const&, bool, bool)+1210)
  native: #06 pc 000e24d7  /system/lib/libstagefright.so (android::MediaCodec::CreateByComponentName(android::sp<android::ALooper> const&, android::AString const&, int*, int, unsigned int)+482)
  native: #07 pc 0001e8ad  /system/lib/libmedia_jni.so (android::JMediaCodec::JMediaCodec(_JNIEnv*, _jobject*, char const*, bool, bool)+244)
  native: #08 pc 000222cd  /system/lib/libmedia_jni.so (???)
  native: #09 pc 005e2f8b  /system/framework/arm/boot-framework.oat (Java_android_media_MediaCodec_native_1setup__Ljava_lang_String_2ZZ+106)
  at android.media.MediaCodec.native_setup(Native method)
  at android.media.MediaCodec.<init>(MediaCodec.java:1799)
  at android.media.MediaCodec.createByCodecName(MediaCodec.java:1780)
  at com.google.android.exoplayer2.mediacodec.MediaCodecRenderer.maybeInitCodec(MediaCodecRenderer.java:415)
  at com.google.android.exoplayer2.mediacodec.MediaCodecRenderer.onInputFormatChanged(MediaCodecRenderer.java:920)
  at com.google.android.exoplayer2.video.e.onInputFormatChanged(MediaCodecVideoRenderer.java:508)
  at com.google.android.exoplayer2.mediacodec.MediaCodecRenderer.render(MediaCodecRenderer.java:557)
  at com.google.android.exoplayer2.d.h(ExoPlayerImplInternal.java:519)
  at com.google.android.exoplayer2.d.handleMessage(ExoPlayerImplInternal.java:298)
  at android.os.Handler.dispatchMessage(Handler.java:102)
  at android.os.Looper.loop(Looper.java:192)
  at android.os.HandlerThread.run(HandlerThread.java:65)

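While the main thread is blocked, the playback thread is still stuck inside MediaCodecRenderer: it is creating a codec, and MediaCodec.createByCodecName is waiting for a native response. The dump alone does not prove how long it has been there, although the CPU section of the ANR report does show media.codec at 99%. Looking at the code as a whole, every path ends in markAsProcessed(...), so the only way it is not reached in time is that something slow is running ahead of it on this thread.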
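The situation can be modeled in plain Java. In the sketch below a single-thread executor stands in for the playback HandlerThread, a sleeping task stands in for the slow codec creation, and an empty task stands in for the MSG_SET_SURFACE message. All names are hypothetical and nothing here is ExoPlayer code. It only shows that a message queued behind a slow task on a single thread cannot be handled until that task returns.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class QueueBlockingDemo {

  /**
   * Queues a slow task followed by a quick one on the same single thread, then waits up to
   * waitMs for the quick one. Returns true if the quick task completed in time.
   */
  static boolean surfaceMessageHandledWithin(long slowTaskMs, long waitMs) {
    ExecutorService playbackThread = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "FakePlaybackThread");
      t.setDaemon(true);
      return t;
    });
    try {
      // Stand-in for the codec creation that occupies the playback thread.
      playbackThread.execute(() -> sleepQuietly(slowTaskMs));
      // Stand-in for the set-surface message: trivial work, but queued behind the slow task.
      Future<?> setSurface = playbackThread.submit(() -> { });
      try {
        setSurface.get(waitMs, TimeUnit.MILLISECONDS);
        return true;
      } catch (TimeoutException e) {
        return false; // Still queued behind the slow task.
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      } catch (ExecutionException e) {
        return false;
      }
    } finally {
      playbackThread.shutdownNow();
    }
  }

  private static void sleepQuietly(long ms) {
    try {
      Thread.sleep(ms);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println(surfaceMessageHandledWithin(20, 2000));  // Short task ahead: handled in time.
    System.out.println(surfaceMessageHandledWithin(2000, 100)); // Slow task ahead: caller times out.
  }
}
```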

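4. Is the playback thread doing slow work? Can it be optimized?

The playback thread is doing slow work. Can that be optimized? ExoPlayer drives MediaCodec in synchronous mode. The advantage is that playback state is easy to keep consistent. The drawback is that every slow codec call stalls the playback thread, so any main-thread call that blocks on that thread will sooner or later end in an ANR.
MediaCodec resources are quite limited. If too many MediaCodec instances are held at the same time, those resources run short and almost any MediaCodec call can become slow.
We cannot change the architecture for now, so we look at the flow instead.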
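One app-level way to avoid holding too many decoders at once is to gate player creation behind a small budget. The sketch below is only an illustration: `CodecBudget` is a hypothetical helper, not part of ExoPlayer or Android, and the limit of 2 is arbitrary, since the real number of decoder instances a device supports varies.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical app-level cap on concurrently held decoders; not an ExoPlayer API. */
class CodecBudget {
  private final Semaphore permits;

  CodecBudget(int maxInstances) {
    permits = new Semaphore(maxInstances);
  }

  /** Returns true if the caller may create another player or decoder right now. */
  boolean tryAcquire() {
    return permits.tryAcquire();
  }

  /** Call exactly once per successful tryAcquire(), after the player has been released. */
  void release() {
    permits.release();
  }

  int available() {
    return permits.availablePermits();
  }
}

class CodecBudgetDemo {
  public static void main(String[] args) {
    CodecBudget budget = new CodecBudget(2); // Arbitrary limit for the demo.
    System.out.println(budget.tryAcquire()); // First player may be created.
    System.out.println(budget.tryAcquire()); // Second player may be created.
    System.out.println(budget.tryAcquire()); // Budget exhausted: reuse or release a player first.
    budget.release();                        // One player was released.
    System.out.println(budget.tryAcquire()); // A slot is free again.
  }
}
```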

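5. How do we fix this?

The ANR stack comes from TextureView's onSurfaceTextureDestroyed callback, so playback is certainly not active at that moment. That lets us add a check: if the playback thread takes longer than a set threshold, call markAsProcessed(...) right away.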

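  private void setVideoSurfaceInternal(Surface surface, boolean ownsSurface, boolean needReleaseCodec) {
     //......
    if (this.surface != null && this.surface != surface) {
      // We're replacing a surface. Block to ensure that it's not accessed after the method returns.
      try {
        for (PlayerMessage message : messages) {
          // EXOPLAYER_OPTIMIZATION_START
          // Arm the watchdog BEFORE blocking. Once blockUntilDelivered() is entered, this thread
          // cannot post anything until the message has been processed.
          Message msg = mPlayerMessageHandler.obtainMessage(MSG_PLAYER_MESSAGE_PROCESSED, message);
          mPlayerMessageHandler.sendMessageDelayed(msg, DELAY_INTERVAL);
          message.blockUntilDelivered();
          // The message was processed (or force-released), so its watchdog is no longer needed.
          mPlayerMessageHandler.removeMessages(MSG_PLAYER_MESSAGE_PROCESSED, message);
          // EXOPLAYER_OPTIMIZATION_END
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      // If we created the previous surface, we are responsible for releasing it.
      if (this.ownsSurface) {
        this.surface.release();
      }
    }
    this.surface = surface;
    this.ownsSurface = ownsSurface;
  }

  // EXOPLAYER_OPTIMIZATION_START
  private static final int MSG_PLAYER_MESSAGE_PROCESSED = 0x1;
  private static final int DELAY_INTERVAL = 5000; // Milliseconds.

  // Remember to quit this handler's looper when the player is released.
  private final Handler mPlayerMessageHandler = createPlayerMessageHandler();

  // The watchdog needs its own thread. A Handler bound to the main looper could never fire
  // here, because the main thread is the one blocked in wait().
  private static Handler createPlayerMessageHandler() {
    HandlerThread thread = new HandlerThread("PlayerMessageWatchdog");
    thread.start();
    return new Handler(thread.getLooper()) {
      @Override
      public void handleMessage(Message msg) {
        if (msg.what == MSG_PLAYER_MESSAGE_PROCESSED) {
          PlayerMessage message = (PlayerMessage) msg.obj;
          // Harmless if the message was already processed: isDelivered is OR-ed with false
          // and isProcessed simply stays true.
          message.markAsProcessed(false);
        }
      }
    };
  }
  // EXOPLAYER_OPTIMIZATION_END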

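The idea is to check after 5 s whether the current PlayerMessage has been processed and, if it has not, to stop making the main thread wait. Since the input-dispatch ANR timeout is itself about 5 s by default, a shorter delay is probably needed in practice.

This is only meant as one possible approach. Be careful: it must not be used while the player is actively in use, because it will leave the player in an inconsistent state.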
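The watchdog idea can be tried outside Android with a plain-Java model. In this sketch a scheduled executor plays the watchdog and a sleeping thread plays the playback thread. `WatchdogDemo` and `DemoMessage` are hypothetical names for this illustration, not ExoPlayer classes. Note that the watchdog is armed before the caller blocks and runs on a thread other than the blocked one; otherwise it could never fire.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class WatchdogDemo {

  /** Hypothetical stand-in for PlayerMessage with an unbounded wait. */
  static final class DemoMessage {
    private boolean isProcessed;
    private boolean isDelivered;

    synchronized boolean blockUntilDelivered() throws InterruptedException {
      while (!isProcessed) {
        wait();
      }
      return isDelivered;
    }

    synchronized void markAsProcessed(boolean delivered) {
      isDelivered |= delivered;
      isProcessed = true;
      notifyAll();
    }
  }

  /** Returns true if the worker delivered the message, false if the watchdog released the waiter. */
  static boolean waitWithWatchdog(long workMs, long watchdogMs) {
    DemoMessage message = new DemoMessage();
    ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "FakeWatchdogThread");
      t.setDaemon(true);
      return t;
    });
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(workMs); // Simulated slow work on the playback thread.
      } catch (InterruptedException e) {
        return;
      }
      message.markAsProcessed(true);
    }, "FakePlaybackThread");
    worker.setDaemon(true);
    worker.start();

    // Arm the watchdog before blocking; it runs on its own thread.
    ScheduledFuture<?> timeout =
        watchdog.schedule(() -> message.markAsProcessed(false), watchdogMs, TimeUnit.MILLISECONDS);
    try {
      return message.blockUntilDelivered();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      timeout.cancel(false);
      watchdog.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(waitWithWatchdog(50, 2000));  // Worker finishes first: delivered.
    System.out.println(waitWithWatchdog(2000, 100)); // Watchdog fires first: waiter released.
  }
}
```

In the second call the worker still runs to completion later and marks an already abandoned message. The real player has the same hazard, which is why the approach is only safe when playback is not active.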
