How LeakCanary Works: Principles and Implementation

2019-03-20 Huangwt

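I recently integrated the Matrix performance-monitoring framework and looked at its Resource-Plugin. It monitors Activity leaks in a simple way but does not analyze Fragment or View leaks. It does shrink the generated heap dump by stripping the parts that are not needed. Because its design is fairly simple, quite a lot of functionality is missing: for example, you have to analyze the hprof file by hand to see where a leak occurs, and it lacks LeakCanary's excellent shortest-path computation. While analyzing hprof files by hand (using square:haha), I found that computing the relationships between objects in the heap was extremely slow, because references have to be traversed one by one from the GC roots until every object has been processed. LeakCanary, on the other hand, finds the shortest path very quickly, so let's look at how its code does it.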

  /**
   * Creates a {@link RefWatcher} instance and makes it available through {@link
   * LeakCanary#installedRefWatcher()}.
   *
   * Also starts watching activity references if {@link #watchActivities(boolean)} was set to true.
   *
   * @throws UnsupportedOperationException if called more than once per Android process.
   */
  public @NonNull RefWatcher buildAndInstall() {
    if (LeakCanaryInternals.installedRefWatcher != null) {
      throw new UnsupportedOperationException("buildAndInstall() should only be called once.");
    }
    RefWatcher refWatcher = build();
    if (refWatcher != DISABLED) {
      LeakCanaryInternals.setEnabledAsync(context, DisplayLeakActivity.class, true);
      if (watchActivities) {
        ActivityRefWatcher.install(context, refWatcher);
      }
      if (watchFragments) {
        FragmentRefWatcher.Helper.install(context, refWatcher);
      }
    }
    LeakCanaryInternals.installedRefWatcher = refWatcher;
    return refWatcher;
  }

During initialization, the RefWatcher is first bound to the concrete ActivityRefWatcher and FragmentRefWatcher.

Let's look at ActivityRefWatcher first:

  public static void install(@NonNull Context context, @NonNull RefWatcher refWatcher) {
    Application application = (Application) context.getApplicationContext();
    ActivityRefWatcher activityRefWatcher = new ActivityRefWatcher(application, refWatcher);

    application.registerActivityLifecycleCallbacks(activityRefWatcher.lifecycleCallbacks);
  }

  private final Application.ActivityLifecycleCallbacks lifecycleCallbacks =
      new ActivityLifecycleCallbacksAdapter() {
        @Override public void onActivityDestroyed(Activity activity) {
          refWatcher.watch(activity);
        }
      };

It detects Activity leaks by hooking into the Activity lifecycle.
Fragments work similarly, relying on FragmentManager.FragmentLifecycleCallbacks:

  private final FragmentManager.FragmentLifecycleCallbacks fragmentLifecycleCallbacks =
      new FragmentManager.FragmentLifecycleCallbacks() {

        @Override public void onFragmentViewDestroyed(FragmentManager fm, Fragment fragment) {
          View view = fragment.getView();
          if (view != null) {
            refWatcher.watch(view);
          }
        }

        @Override
        public void onFragmentDestroyed(FragmentManager fm, Fragment fragment) {
          refWatcher.watch(fragment);
        }
      };

Both watchers call refWatcher.watch to start tracking the object.
The code of refWatcher.watch is as follows:

  /**
   * Watches the provided references and checks if it can be GCed. This method is non blocking,
   * the check is done on the {@link WatchExecutor} this {@link RefWatcher} has been constructed
   * with.
   *
   * @param referenceName An logical identifier for the watched object.
   */
  public void watch(Object watchedReference, String referenceName) {
    if (this == DISABLED) {
      return;
    }
    checkNotNull(watchedReference, "watchedReference");
    checkNotNull(referenceName, "referenceName");
    final long watchStartNanoTime = System.nanoTime();
    String key = UUID.randomUUID().toString();
    retainedKeys.add(key);
    final KeyedWeakReference reference =
        new KeyedWeakReference(watchedReference, key, referenceName, queue);

    ensureGoneAsync(watchStartNanoTime, reference);
  }

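Only the first parameter, watchedReference, matters here. Just three kinds of objects are passed in: Activity, Fragment, and View.
The method creates a KeyedWeakReference, which is a weak reference that additionally carries a key (used to identify the reference) and a name.
After that, the reference is handed to an asynchronous check that keeps running until it can tell whether the watched object has been reclaimed.
(The weak reference does not keep the watched object alive, so if nothing else holds a strong reference to it, it is reclaimed by a normal GC.)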
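
To make the idea concrete, here is a minimal sketch of a weak reference that carries a key, built only on `java.lang.ref`. `KeyedRef` and `WatchSketch` are hypothetical names made up for this illustration; they are not LeakCanary's classes, and the real `watch` does more (null checks, timing, scheduling the check).

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical stand-in for KeyedWeakReference: a weak reference plus a key and a name.
final class KeyedRef extends WeakReference<Object> {
  final String key;
  final String name;

  KeyedRef(Object referent, String key, String name, ReferenceQueue<Object> queue) {
    super(referent, queue);
    this.key = key;
    this.name = name;
  }
}

// Hypothetical watcher: remembers the key of every object it has been asked to watch.
final class WatchSketch {
  final Set<String> retainedKeys = new CopyOnWriteArraySet<>();
  final ReferenceQueue<Object> queue = new ReferenceQueue<>();

  KeyedRef watch(Object watched, String name) {
    String key = UUID.randomUUID().toString();
    retainedKeys.add(key);
    // Only a weak reference is kept, so the watcher itself never keeps `watched` alive.
    return new KeyedRef(watched, key, name, queue);
  }

  public static void main(String[] args) {
    WatchSketch sketch = new WatchSketch();
    Object activity = new Object(); // stands in for a destroyed Activity
    KeyedRef ref = sketch.watch(activity, "MainActivity");
    System.out.println("retained: " + sketch.retainedKeys.contains(ref.key));
  }
}
```

The key is what later lets the watcher ask "is this particular object still around?" without holding on to the object itself.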

  private void ensureGoneAsync(final long watchStartNanoTime, final KeyedWeakReference reference) {
    watchExecutor.execute(new Retryable() {
      @Override public Retryable.Result run() {
        return ensureGone(reference, watchStartNanoTime);
      }
    });
  }

This code first submits a Retryable to a WatchExecutor. In LeakCanary the WatchExecutor implementation is AndroidWatchExecutor, whose execute method is:

  @Override public void execute(@NonNull Retryable retryable) {
    if (Looper.getMainLooper().getThread() == Thread.currentThread()) {
      waitForIdle(retryable, 0);
    } else {
      postWaitForIdle(retryable, 0);
    }
  }

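The task has to be scheduled from the main thread; waitForIdle then waits until the main thread's message queue is idle: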

  private void waitForIdle(final Retryable retryable, final int failedAttempts) {
    // This needs to be called from the main thread.
    Looper.myQueue().addIdleHandler(new MessageQueue.IdleHandler() {
      @Override public boolean queueIdle() {
        postToBackgroundWithDelay(retryable, failedAttempts);
        return false;
      }
    });
  }

Finally the work is handed to backgroundHandler:

  private void postToBackgroundWithDelay(final Retryable retryable, final int failedAttempts) {
    long exponentialBackoffFactor = (long) Math.min(Math.pow(2, failedAttempts), maxBackoffFactor);
    long delayMillis = initialDelayMillis * exponentialBackoffFactor;
    backgroundHandler.postDelayed(new Runnable() {
      @Override public void run() {
        Retryable.Result result = retryable.run();
        if (result == RETRY) {
          postWaitForIdle(retryable, failedAttempts + 1);
        }
      }
    }, delayMillis);
  }
}
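
The delay arithmetic above can be tried in isolation. The sketch below copies only the two lines that compute the delay; `BackoffSketch` is a hypothetical name, and the 5 s initial delay and the cap of 8 are values assumed for the illustration, not necessarily what LeakCanary is configured with.

```java
// Hypothetical helper mirroring the delay arithmetic in postToBackgroundWithDelay.
final class BackoffSketch {
  static long delayMillis(int failedAttempts, long initialDelayMillis, long maxBackoffFactor) {
    // The factor doubles with every failed attempt until it reaches the cap.
    long factor = (long) Math.min(Math.pow(2, failedAttempts), maxBackoffFactor);
    return initialDelayMillis * factor;
  }

  public static void main(String[] args) {
    // Assumed values for illustration: 5 s initial delay, factor capped at 8.
    for (int attempt = 0; attempt <= 5; attempt++) {
      System.out.println(attempt + " -> " + delayMillis(attempt, 5000, 8) + " ms");
    }
  }
}
```

Each RETRY therefore waits twice as long as the previous attempt, up to the cap, which keeps a check that cannot complete (for example while a debugger is attached) from spinning.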

Now let's look at what the Retryable actually runs:


  @SuppressWarnings("ReferenceEquality") // Explicitly checking for named null.
  Retryable.Result ensureGone(final KeyedWeakReference reference, final long watchStartNanoTime) {
    long gcStartNanoTime = System.nanoTime();
    long watchDurationMs = NANOSECONDS.toMillis(gcStartNanoTime - watchStartNanoTime);

    removeWeaklyReachableReferences();

    if (debuggerControl.isDebuggerAttached()) {
      // The debugger can create false leaks.
      return RETRY;
    }
    if (gone(reference)) {
      return DONE;
    }
    gcTrigger.runGc();
    removeWeaklyReachableReferences();
    if (!gone(reference)) {
      long startDumpHeap = System.nanoTime();
      long gcDurationMs = NANOSECONDS.toMillis(startDumpHeap - gcStartNanoTime);

      File heapDumpFile = heapDumper.dumpHeap();
      if (heapDumpFile == RETRY_LATER) {
        // Could not dump the heap.
        return RETRY;
      }
      long heapDumpDurationMs = NANOSECONDS.toMillis(System.nanoTime() - startDumpHeap);

      HeapDump heapDump = heapDumpBuilder.heapDumpFile(heapDumpFile).referenceKey(reference.key)
          .referenceName(reference.name)
          .watchDurationMs(watchDurationMs)
          .gcDurationMs(gcDurationMs)
          .heapDumpDurationMs(heapDumpDurationMs)
          .build();

      heapdumpListener.analyze(heapDump);
    }
    return DONE;
  }

The first thing it runs is:

  private void removeWeaklyReachableReferences() {
    // WeakReferences are enqueued as soon as the object to which they point to becomes weakly
    // reachable. This is before finalization or garbage collection has actually happened.
    KeyedWeakReference ref;
    while ((ref = (KeyedWeakReference) queue.poll()) != null) {
      retainedKeys.remove(ref.key);
    }
  }

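What does this method do?
(My understanding here may be slightly off.)

Before reclaiming garbage, the garbage collector first marks the objects that can be collected, and then runs their finalize methods in turn.
The purpose of this function is to remove from retainedKeys the keys of the weak references whose referents have been marked this way, that is, weak references pointing to objects that can be collected.
Because every reference is constructed with the ReferenceQueue held by the refWatcher, every KeyedWeakReference we create is guaranteed to show up in that queue once its referent becomes weakly reachable.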
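
The interplay between retainedKeys and the ReferenceQueue can be sketched as follows. All names here (`QueueDrainSketch`, its nested `KeyedRef`) are hypothetical, and the usage example calls `Reference.enqueue()` by hand to stand in for the collector, because a real GC cannot be forced deterministically.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

final class QueueDrainSketch {
  // Hypothetical keyed weak reference, standing in for KeyedWeakReference.
  static final class KeyedRef extends WeakReference<Object> {
    final String key;

    KeyedRef(Object referent, String key, ReferenceQueue<Object> queue) {
      super(referent, queue);
      this.key = key;
    }
  }

  final Set<String> retainedKeys = new HashSet<>();
  final ReferenceQueue<Object> queue = new ReferenceQueue<>();

  KeyedRef watch(Object watched, String key) {
    retainedKeys.add(key);
    return new KeyedRef(watched, key, queue);
  }

  // Drop the key of every reference that has already been enqueued.
  void removeWeaklyReachableReferences() {
    Reference<?> ref;
    while ((ref = queue.poll()) != null) {
      retainedKeys.remove(((KeyedRef) ref).key);
    }
  }

  // A reference is "gone" once its key is no longer retained.
  boolean gone(KeyedRef ref) {
    return !retainedKeys.contains(ref.key);
  }

  public static void main(String[] args) {
    QueueDrainSketch sketch = new QueueDrainSketch();
    Object leaked = new Object(); // still strongly reachable from this frame
    KeyedRef collected = sketch.watch(new Object(), "collected");
    KeyedRef retained = sketch.watch(leaked, "retained");

    collected.enqueue(); // simulate the collector enqueueing a cleared reference
    sketch.removeWeaklyReachableReferences();

    System.out.println("collected gone: " + sketch.gone(collected));
    System.out.println("retained gone: " + sketch.gone(retained));
    System.out.println("leaked still reachable: " + (retained.get() == leaked));
  }
}
```

Whatever key is still in the set after the queue has been drained belongs to an object that has not been collected yet.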

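Also, when a debugger is attached, LeakCanary returns RETRY, meaning it tries again later instead of continuing. The code comment states the reason clearly: the debugger can create false leaks.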

  private boolean gone(KeyedWeakReference reference) {
    return !retainedKeys.contains(reference.key);
  }

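Next it checks whether the key of our KeyedWeakReference is still in retainedKeys. Since we have just removed the weak references whose referents could be collected, a missing key means no leak occurred, and DONE is returned to signal that this execute is finished. As a side note, returning RETRY increments the failed-attempt count and schedules the body to run again.
After that, gcTrigger runs runGc(). The default implementation of GcTrigger lives in the interface itself: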

public interface GcTrigger {
  GcTrigger DEFAULT = new GcTrigger() {
    @Override public void runGc() {
      // Code taken from AOSP FinalizationTest:
      // https://android.googlesource.com/platform/libcore/+/master/support/src/test/java/libcore/
      // java/lang/ref/FinalizationTester.java
      // System.gc() does not garbage collect every time. Runtime.gc() is
      // more likely to perform a gc.
      Runtime.getRuntime().gc();
      enqueueReferences();
      System.runFinalization();
    }

    private void enqueueReferences() {
      // Hack. We don't have a programmatic way to wait for the reference queue daemon to move
      // references to the appropriate queues.
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        throw new AssertionError();
      }
    }
  };

  void runGc();
}

The code is simple: runGc first triggers a GC, then sleeps for 100 ms so that the collectable weak references can be moved to their queues, and then runs the pending finalize methods.

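After this we call removeWeaklyReachableReferences() again and re-check. If the key is still present, the watched object has leaked; a heap dump is then taken and tagged with the key and name of the leaked reference.
The listener is then notified to run the analysis, and DONE is returned.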

The listener implementation is ServiceHeapDumpListener:

  @Override public void analyze(@NonNull HeapDump heapDump) {
    checkNotNull(heapDump, "heapDump");
    HeapAnalyzerService.runAnalysis(context, heapDump, listenerServiceClass);
  }

It hands the analysis over to HeapAnalyzerService:

    Intent intent = new Intent(context, HeapAnalyzerService.class);
    intent.putExtra(LISTENER_CLASS_EXTRA, listenerServiceClass.getName());
    intent.putExtra(HEAPDUMP_EXTRA, heapDump);
    ContextCompat.startForegroundService(context, intent);

HeapAnalyzerService in turn forwards the intent, so the work is handled by a background service.

  @Override protected void onHandleIntentInForeground(@Nullable Intent intent) {
    if (intent == null) {
      CanaryLog.d("HeapAnalyzerService received a null intent, ignoring.");
      return;
    }
    String listenerClassName = intent.getStringExtra(LISTENER_CLASS_EXTRA);
    HeapDump heapDump = (HeapDump) intent.getSerializableExtra(HEAPDUMP_EXTRA);

    HeapAnalyzer heapAnalyzer =
        new HeapAnalyzer(heapDump.excludedRefs, this, heapDump.reachabilityInspectorClasses);

    AnalysisResult result = heapAnalyzer.checkForLeak(heapDump.heapDumpFile, heapDump.referenceKey,
        heapDump.computeRetainedHeapSize);
    AbstractAnalysisResultService.sendResultToListener(this, listenerClassName, heapDump, result);
  }

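The code that receives the Intent is shown above. It first creates a HeapAnalyzer. The first argument is the set of references to exclude, which anyone who has used LeakCanary will recognize. The last argument can be ignored, since for now it is an empty list. Next comes the main event, checkForLeak.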

  /**
   * Searches the heap dump for a {@link KeyedWeakReference} instance with the corresponding key,
   * and then computes the shortest strong reference path from that instance to the GC roots.
   */
  public @NonNull AnalysisResult checkForLeak(@NonNull File heapDumpFile,
      @NonNull String referenceKey,
      boolean computeRetainedSize) {
    long analysisStartNanoTime = System.nanoTime();

    if (!heapDumpFile.exists()) {
      Exception exception = new IllegalArgumentException("File does not exist: " + heapDumpFile);
      return failure(exception, since(analysisStartNanoTime));
    }

    try {
      listener.onProgressUpdate(READING_HEAP_DUMP_FILE);
      HprofBuffer buffer = new MemoryMappedFileBuffer(heapDumpFile);
      HprofParser parser = new HprofParser(buffer);
      listener.onProgressUpdate(PARSING_HEAP_DUMP);
      Snapshot snapshot = parser.parse();
      listener.onProgressUpdate(DEDUPLICATING_GC_ROOTS);
      deduplicateGcRoots(snapshot);
      listener.onProgressUpdate(FINDING_LEAKING_REF);
      Instance leakingRef = findLeakingReference(referenceKey, snapshot);

      // False alarm, weak reference was cleared in between key check and heap dump.
      if (leakingRef == null) {
        return noLeak(since(analysisStartNanoTime));
      }
      return findLeakTrace(analysisStartNanoTime, snapshot, leakingRef, computeRetainedSize);
    } catch (Throwable e) {
      return failure(e, since(analysisStartNanoTime));
    }
  }

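The listener progress updates can be ignored. For the fairly standard steps of analyzing a heap dump file, have a look at the HAHA library, which comes from the same author as LeakCanary.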

      HprofBuffer buffer = new MemoryMappedFileBuffer(heapDumpFile);
      HprofParser parser = new HprofParser(buffer);
      Snapshot snapshot = parser.parse();

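This opens the heap dump file and produces a snapshot. Once it has been parsed, snapshot.findClass("XXX") looks up a class by name. The result is not a java.lang.Class object but the library's own representation of the class, and through it you can reach all of the classes that derive from it.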

      deduplicateGcRoots(snapshot);

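Next, the GC roots are deduplicated. Reachability is determined by tracing from the GC roots, so the number of GC roots directly affects how long the heap snapshot analysis takes.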
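
As a rough illustration of what deduplicating roots means, the sketch below keeps one root per (type, id) pair. `DedupRootsSketch` and `Root` are hypothetical types invented for this example, and the key format is an assumption; this is not the HAHA or LeakCanary API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DedupRootsSketch {
  // Hypothetical GC root: a root type name plus the id of the object it points to.
  static final class Root {
    final String type;
    final long id;

    Root(String type, long id) {
      this.type = type;
      this.id = id;
    }

    // Two roots with the same type and id are treated as duplicates (assumed key format).
    String key() {
      return String.format("%s@0x%08x", type, id);
    }
  }

  // Keep the first root for each key and drop later repeats, preserving order.
  static List<Root> deduplicate(List<Root> roots) {
    Set<String> seen = new HashSet<>();
    List<Root> unique = new ArrayList<>();
    for (Root root : roots) {
      if (seen.add(root.key())) {
        unique.add(root);
      }
    }
    return unique;
  }

  public static void main(String[] args) {
    List<Root> roots = new ArrayList<>();
    roots.add(new Root("JAVA_LOCAL", 1L));
    roots.add(new Root("JAVA_LOCAL", 1L));
    roots.add(new Root("SYSTEM_CLASS", 1L));
    System.out.println(deduplicate(roots).size() + " unique roots");
  }
}
```

Fewer roots means fewer starting points for the search that follows, which is the reason this step pays off.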

 Instance leakingRef = findLeakingReference(referenceKey, snapshot);

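This method retrieves every KeyedWeakReference instance in the heap and checks whether its key field matches the key we passed in. If it does, it returns the referent instance (an Activity, Fragment, View, and so on).
If the result is null, the object was reclaimed while the heap was being dumped. That does not count as a leak, so a no-leak result is reported. If the instance is found, the method returns: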

      return findLeakTrace(analysisStartNanoTime, snapshot, leakingRef, computeRetainedSize);

  private AnalysisResult findLeakTrace(long analysisStartNanoTime, Snapshot snapshot,
      Instance leakingRef, boolean computeRetainedSize) {

    listener.onProgressUpdate(FINDING_SHORTEST_PATH);
    ShortestPathFinder pathFinder = new ShortestPathFinder(excludedRefs);
    ShortestPathFinder.Result result = pathFinder.findPath(snapshot, leakingRef);

    // False alarm, no strong reference path to GC Roots.
    if (result.leakingNode == null) {
      return noLeak(since(analysisStartNanoTime));
    }

    listener.onProgressUpdate(BUILDING_LEAK_TRACE);
    LeakTrace leakTrace = buildLeakTrace(result.leakingNode);

    String className = leakingRef.getClassObj().getClassName();

    long retainedSize;
    if (computeRetainedSize) {

      listener.onProgressUpdate(COMPUTING_DOMINATORS);
      // Side effect: computes retained size.
      snapshot.computeDominators();

      Instance leakingInstance = result.leakingNode.instance;

      retainedSize = leakingInstance.getTotalRetainedSize();

      // TODO: check O sources and see what happened to android.graphics.Bitmap.mBuffer
      if (SDK_INT <= N_MR1) {
        listener.onProgressUpdate(COMPUTING_BITMAP_SIZE);
        retainedSize += computeIgnoredBitmapRetainedSize(snapshot, leakingInstance);
      }
    } else {
      retainedSize = AnalysisResult.RETAINED_HEAP_SKIPPED;
    }

    return leakDetected(result.excludingKnownLeaks, className, leakTrace, retainedSize,
        since(analysisStartNanoTime));
  }

It first creates a ShortestPathFinder and then passes leakingRef and snapshot to its findPath:

  Result findPath(Snapshot snapshot, Instance leakingRef) {
    clearState();
    canIgnoreStrings = !isString(leakingRef);

    enqueueGcRoots(snapshot);

    boolean excludingKnownLeaks = false;
    LeakNode leakingNode = null;
    while (!toVisitQueue.isEmpty() || !toVisitIfNoPathQueue.isEmpty()) {
      LeakNode node;
      if (!toVisitQueue.isEmpty()) {
        node = toVisitQueue.poll();
      } else {
        node = toVisitIfNoPathQueue.poll();
        if (node.exclusion == null) {
          throw new IllegalStateException("Expected node to have an exclusion " + node);
        }
        excludingKnownLeaks = true;
      }

      // Termination
      if (node.instance == leakingRef) {
        leakingNode = node;
        break;
      }

      if (checkSeen(node)) {
        continue;
      }

      if (node.instance instanceof RootObj) {
        visitRootObj(node);
      } else if (node.instance instanceof ClassObj) {
        visitClassObj(node);
      } else if (node.instance instanceof ClassInstance) {
        visitClassInstance(node);
      } else if (node.instance instanceof ArrayInstance) {
        visitArrayInstance(node);
      } else {
        throw new IllegalStateException("Unexpected type for " + node.instance);
      }
    }
    return new Result(leakingNode, excludingKnownLeaks);
  }

Before findPath does anything else, the state has to be cleared. State here means the caches built up during analysis.

      switch (rootObj.getRootType()) {
        case JAVA_LOCAL:
          Instance thread = HahaSpy.allocatingThread(rootObj);
          String threadName = threadName(thread);
          Exclusion params = excludedRefs.threadNames.get(threadName);
          if (params == null || !params.alwaysExclude) {
            enqueue(params, null, rootObj, null);
          }
          break;

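First, the whole list of GC roots in the snapshot is traversed. For a JAVA_LOCAL root, the Exclusion registered for the thread that owns it is looked up. alwaysExclude means the thread is never expected to be reclaimed, that is, it is effectively static by design, such as the main thread.
Unless the root is always excluded, the exclusion and the rootObj are enqueued. Apart from that, enqueue only adds an object in the ordinary cases that pass the checks below: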

    if (child == null) {
      return;
    }
    if (isPrimitiveOrWrapperArray(child) || isPrimitiveWrapper(child)) {
      return;
    }
    // Whether we want to visit now or later, we should skip if this is already to visit.
    if (toVisitSet.contains(child)) {
      return;
    }
    boolean visitNow = exclusion == null;
    if (!visitNow && toVisitIfNoPathSet.contains(child)) {
      return;
    }
    if (canIgnoreStrings && isString(child)) {
      return;
    }
    if (visitedSet.contains(child)) {
      return;
    }
    LeakNode childNode = new LeakNode(exclusion, child, parent, leakReference);
    if (visitNow) {
      toVisitSet.add(child);
      toVisitQueue.add(childNode);
    } else {
      toVisitIfNoPathSet.add(child);
      toVisitIfNoPathQueue.add(childNode);
    }

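enqueue first checks whether the object is null, or whether it is a primitive wrapper or an array of primitives or wrappers. If so, it returns.
It then checks whether the object is already in the set of objects waiting to be visited, and returns if it is, to avoid duplicate work. It also returns if strings can be ignored and the object is a String, or if the object has already been visited.
Otherwise it creates a childNode and adds it to the to-visit queue. After the first round, all GC roots are in this queue.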

Then toVisitQueue is processed in a loop, comparing each node with our leakingRef until a match is found:

      if (node.instance instanceof RootObj) {
        visitRootObj(node);
      } else if (node.instance instanceof ClassObj) {
        visitClassObj(node);
      } else if (node.instance instanceof ClassInstance) {
        visitClassInstance(node);
      } else if (node.instance instanceof ArrayInstance) {
        visitArrayInstance(node);
      } else {
        throw new IllegalStateException("Unexpected type for " + node.instance);
      }

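Each of these visit methods in turn calls enqueue for the objects it references, putting the objects still to be checked into our queues. The search ends as soon as the target is found.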

Finally a leak trace for the instance is built and the user is notified.
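
Putting the pieces together, the search is a breadth-first traversal with a second, lower-priority queue for references that match an exclusion. The sketch below shows that idea on a tiny graph of named nodes. Everything in it (`ShortestPathSketch`, `Edge`, `Node`, string names in place of heap instances) is a hypothetical simplification, not LeakCanary's ShortestPathFinder.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ShortestPathSketch {
  // Hypothetical edge: a reference to `target` that may match a known-leak exclusion.
  static final class Edge {
    final String target;
    final boolean excluded;
    Edge(String target, boolean excluded) { this.target = target; this.excluded = excluded; }
  }

  // Hypothetical search node: remembers its parent so the path can be rebuilt.
  static final class Node {
    final String name;
    final Node parent;
    Node(String name, Node parent) { this.name = name; this.parent = parent; }
  }

  private final Map<String, List<Edge>> graph = new HashMap<>();
  private final Deque<Node> toVisit = new ArrayDeque<>();
  private final Deque<Node> toVisitIfNoPath = new ArrayDeque<>();
  private final Set<String> toVisitSet = new HashSet<>();
  private final Set<String> toVisitIfNoPathSet = new HashSet<>();
  private final Set<String> visited = new HashSet<>();

  void addEdge(String from, String to, boolean excluded) {
    graph.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, excluded));
  }

  private void enqueue(String child, Node parent, boolean excluded) {
    if (toVisitSet.contains(child) || visited.contains(child)) return;
    if (excluded && toVisitIfNoPathSet.contains(child)) return;
    Node node = new Node(child, parent);
    if (!excluded) {
      toVisitSet.add(child);
      toVisit.add(node);
    } else {
      // Excluded references are only followed once the normal queue is empty.
      toVisitIfNoPathSet.add(child);
      toVisitIfNoPath.add(node);
    }
  }

  // Returns the path root -> ... -> target, or an empty list if target is unreachable.
  List<String> findPath(List<String> roots, String target) {
    toVisit.clear(); toVisitIfNoPath.clear();
    toVisitSet.clear(); toVisitIfNoPathSet.clear(); visited.clear();
    for (String root : roots) enqueue(root, null, false);
    while (!toVisit.isEmpty() || !toVisitIfNoPath.isEmpty()) {
      Node node = !toVisit.isEmpty() ? toVisit.poll() : toVisitIfNoPath.poll();
      if (node.name.equals(target)) return buildPath(node);
      if (!visited.add(node.name)) continue;
      for (Edge edge : graph.getOrDefault(node.name, Collections.<Edge>emptyList())) {
        enqueue(edge.target, node, edge.excluded);
      }
    }
    return Collections.emptyList();
  }

  // Walk the parent links back to the root, then reverse to get root-first order.
  private static List<String> buildPath(Node leaf) {
    List<String> path = new ArrayList<>();
    for (Node n = leaf; n != null; n = n.parent) path.add(n.name);
    Collections.reverse(path);
    return path;
  }
}
```

Because the traversal is breadth-first and stops at the first match, the first path found is a shortest one, and no dominator or full-graph computation is needed beforehand. That would explain why this is so much faster than linking up every object in the heap first.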
