第三方开源库源码解析程序员

BlockCanary源码解析

2017-09-09  本文已影响35人  大大大大大先生

BlockCanary原理

如何计算主线程的方法执行耗时

计算方法耗时最简单粗暴的就是在方法之前前记录下开始时间,方法执行完后用当前时间剪去方法开始执行的时间就完事了,但是主线程那么多方法总不能每一个方法都这个干吧?那肯定崩!有没有一个统一的地方来实现这个功能?当然有了,不然这篇博客到这里就戛然而止了......

public static void loop() {
        final Looper me = myLooper();
        if (me == null) {
            throw new RuntimeException("No Looper; Looper.prepare() wasn't called on this thread.");
        }
        final MessageQueue queue = me.mQueue;

        // Make sure the identity of this thread is that of the local process,
        // and keep track of what that identity token actually is.
        Binder.clearCallingIdentity();
        final long ident = Binder.clearCallingIdentity();

        for (;;) {
            Message msg = queue.next(); // might block
            if (msg == null) {
                // No message indicates that the message queue is quitting.
                return;
            }

            // This must be in a local variable, in case a UI event sets the logger
            final Printer logging = me.mLogging;
            if (logging != null) {
                logging.println(">>>>> Dispatching to " + msg.target + " " +
                        msg.callback + ": " + msg.what);
            }

            final long traceTag = me.mTraceTag;
            if (traceTag != 0 && Trace.isTagEnabled(traceTag)) {
                Trace.traceBegin(traceTag, msg.target.getTraceName(msg));
            }
            try {
                msg.target.dispatchMessage(msg);
            } finally {
                if (traceTag != 0) {
                    Trace.traceEnd(traceTag);
                }
            }

            if (logging != null) {
                logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
            }

            // Make sure that during the course of dispatching the
            // identity of the thread wasn't corrupted.
            final long newIdent = Binder.clearCallingIdentity();
            if (ident != newIdent) {
                Log.wtf(TAG, "Thread identity changed from 0x"
                        + Long.toHexString(ident) + " to 0x"
                        + Long.toHexString(newIdent) + " while dispatching to "
                        + msg.target.getClass().getName() + " "
                        + msg.callback + " what=" + msg.what);
            }

            msg.recycleUnchecked();
        }
    }

如上代码中的loop()方法是Looper中的,我们的目的是监测主线程的卡顿问题,因为UI更新界面都是在主线程中进行的,所以在主线程中做耗时操作可能会造成界面卡顿,而主线程的Looper早已经在APP启动的时候Android framework里面创建了main looper,那么一个线程对应一个Looper,Looper当中有一个MessageQueue,专门用来接收Handler发送过来的msg,并且在looper()方法中循环去从MessageQueue中去取msg,然后执行,而且是顺序执行的,那么前面一个msg还没处理完,loop()就会等待它处理完了才会再去执行下一个msg,如果前面一个msg处理很慢,那就会造成卡顿了,在msg.target.dispatchMessage(msg)前有:

if (logging != null) {
                logging.println(">>>>> Dispatching to " + msg.target + " " +
                        msg.callback + ": " + msg.what);
            }

而在dispatchMessage执行完了之后,又有:

if (logging != null) {
                logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
            }

所以,我们只需要计算打印这两天log的时间差,就能得到dispatchMessage的耗时,android提供了Looper.getMainLooper().setMessageLogging(Printer printer)来设置这个logging对象,所以只要自定义一个Printer,然后重写println(String x)方法即可实现耗时统计了,所以原理真的很简单,原理固然简单,但是还是要学会发现这个小技巧,对于BlockCanary而言,初始化:

public class MyApplication extends Application {

    @Override
    public void onCreate() {
        super.onCreate();
        BlockCanary.install(this, new BlockCanaryContext()).start();

    }
}

查看start()方法里面的代码:

/**
     * Start monitoring.
     */
    public void start() {
        if (!mMonitorStarted) {
            mMonitorStarted = true;
            Looper.getMainLooper().setMessageLogging(mBlockCanaryCore.monitor);
        }
    }

而monitor是一个继承Printer的LooperMonitor类new出来的对象,重写print(String x)方法:

@Override
    public void println(String x) {
        if (mStopWhenDebugging && Debug.isDebuggerConnected()) {
            return;
        }
        if (!mPrintingStarted) {
            mStartTimestamp = System.currentTimeMillis();
            mStartThreadTimestamp = SystemClock.currentThreadTimeMillis();
            mPrintingStarted = true;
            startDump();
        } else {
            final long endTime = System.currentTimeMillis();
            mPrintingStarted = false;
            if (isBlock(endTime)) {
                notifyBlockEvent(endTime);
            }
            stopDump();
        }
    }

在dispatchMessage执行之前打印log的时候执行print,mPrintingStarted为false,所以就记录当前的时间,以及当前线程时间mPrintingStarted设置为true,而dispatchMessage执行完后第二次打印log执行print方法,mPrintingStarted是true的,这时候dispatchMessage已经执行结束,然后就能计算耗时,搜集方法堆栈信息,cpu信息等等

方法堆栈信息的搜集

private void startDump() {
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.start();
        }

        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.start();
        }
    }

在LooperMonitor的print方法中会执行这个方法,同时采集方法堆栈信息和cpu信息,对于堆栈信息stackSampler.start():

abstract class AbstractSampler {

    private static final int DEFAULT_SAMPLE_INTERVAL = 300;

    protected AtomicBoolean mShouldSample = new AtomicBoolean(false);
    protected long mSampleInterval;

    private Runnable mRunnable = new Runnable() {
        @Override
        public void run() {
            doSample();

            if (mShouldSample.get()) {
                HandlerThreadFactory.getTimerThreadHandler()
                        .postDelayed(mRunnable, mSampleInterval);
            }
        }
    };

    public AbstractSampler(long sampleInterval) {
        if (0 == sampleInterval) {
            sampleInterval = DEFAULT_SAMPLE_INTERVAL;
        }
        mSampleInterval = sampleInterval;
    }

    public void start() {
        if (mShouldSample.get()) {
            return;
        }
        mShouldSample.set(true);

        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
        HandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable,
                BlockCanaryInternals.getInstance().getSampleDelay());
    }

    public void stop() {
        if (!mShouldSample.get()) {
            return;
        }
        mShouldSample.set(false);
        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
    }

    abstract void doSample();
}

调用start方法之后就执行:

private Runnable mRunnable = new Runnable() {
        @Override
        public void run() {
            doSample();

            if (mShouldSample.get()) {
                HandlerThreadFactory.getTimerThreadHandler()
                        .postDelayed(mRunnable, mSampleInterval);
            }
        }
    };

并且这里控制stackSampler.start()只能执行一次,在run方法里面我们可以发现每次间隔mSampleInterval就会去重新跑一次doSample(),这里会执行StackSampler的doSample():

@Override
    protected void doSample() {
        StringBuilder stringBuilder = new StringBuilder();

        for (StackTraceElement stackTraceElement : mCurrentThread.getStackTrace()) {
            stringBuilder
                    .append(stackTraceElement.toString())
                    .append(BlockInfo.SEPARATOR);
        }

        synchronized (sStackMap) {
            if (sStackMap.size() == mMaxEntryCount && mMaxEntryCount > 0) {
                sStackMap.remove(sStackMap.keySet().iterator().next());
            }
            sStackMap.put(System.currentTimeMillis(), stringBuilder.toString());
        }
    }

mCurrentThread就是主线程对象,0.8 * mSampleInterval(卡顿时长阀值)后的去获取线程的堆栈信息并保存到sStackMap中,这里的意思是,我们认为方法执行超过mSampleInterval就表示卡顿,当方法执行时间已经到了mSampleInterval的0.8倍的时候还没执行完,那么这时候就开始采集方法执行堆栈信息了,如果方法在0.9 * mSampleInterval的时候执行完成,那么不会警告卡顿,但是如果方法执行耗时超过mSampleInterval,那就把0.8 * mSampleInterval这个时间点的堆栈信息认为是造成耗时原因的堆栈信息,而且,这里只要方法还没执行完,就会间隔mSampleInterval去再次获取函数执行堆栈信息并保存,这里之所以遥在0.8 * mSampleInterval的时候就去获取堆栈信息时为了获取到准确的堆栈信息,因为既然函数耗时已经达到0.8 * mSampleInterval了,并且函数还没执行结束,那么很大概率上会导致卡顿了,所以提前获取函数执行堆栈保证获取到造成卡顿的函数调用堆栈的正确性,后面又不断间隔mSampleInterval去获取函数执行堆栈式要获取到更多完整的堆栈信息,当方法执行完成后就会停止获取函数执行堆栈了,所有的函数执行堆栈信息最多存100条,也就是最多有100个函数调用堆栈,以当前的时间戳作为key,当监测到卡顿的时候,要把之前保存在sStackMap的函数堆栈信息展示通知出来,通过时间戳就能取到:

private void notifyBlockEvent(final long endTime) {
        final long startTime = mStartTimestamp;
        final long startThreadTime = mStartThreadTimestamp;
        final long endThreadTime = SystemClock.currentThreadTimeMillis();
        HandlerThreadFactory.getWriteLogThreadHandler().post(new Runnable() {
            @Override
            public void run() {
                mBlockListener.onBlockEvent(startTime, endTime, startThreadTime, endThreadTime);
            }
        });
    }

然后再看mBlockListener.onBlockEvent(startTime, endTime, startThreadTime, endThreadTime),因为初始化的时候在BlockCanaryInternals构造函数里面已经setMonitor了,并且实现了onBlockEvent:

public BlockCanaryInternals() {

        stackSampler = new StackSampler(
                Looper.getMainLooper().getThread(),
                sContext.provideDumpInterval());

        cpuSampler = new CpuSampler(sContext.provideDumpInterval());

        setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {

            @Override
            public void onBlockEvent(long realTimeStart, long realTimeEnd,
                                     long threadTimeStart, long threadTimeEnd) {
                // Get recent thread-stack entries and cpu usage
                ArrayList<String> threadStackEntries = stackSampler
                        .getThreadStackEntries(realTimeStart, realTimeEnd);
                if (!threadStackEntries.isEmpty()) {
                    BlockInfo blockInfo = BlockInfo.newInstance()
                            .setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
                            .setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd))
                            .setRecentCpuRate(cpuSampler.getCpuRateInfo())
                            .setThreadStackEntries(threadStackEntries)
                            .flushString();
                    LogWriter.save(blockInfo.toString());

                    if (mInterceptorChain.size() != 0) {
                        for (BlockInterceptor interceptor : mInterceptorChain) {
                            interceptor.onBlock(getContext().provideContext(), blockInfo);
                        }
                    }
                }
            }
        }, getContext().provideBlockThreshold(), getContext().stopWhenDebugging()));

        LogWriter.cleanObsolete();
    }

再看ArrayList<String> threadStackEntries = stackSampler
.getThreadStackEntries(realTimeStart, realTimeEnd) 获取函数堆栈信息:

public ArrayList<String> getThreadStackEntries(long startTime, long endTime) {
        ArrayList<String> result = new ArrayList<>();
        synchronized (sStackMap) {
            for (Long entryTime : sStackMap.keySet()) {
                if (startTime < entryTime && entryTime < endTime) {
                    result.add(BlockInfo.TIME_FORMATTER.format(entryTime)
                            + BlockInfo.SEPARATOR
                            + BlockInfo.SEPARATOR
                            + sStackMap.get(entryTime));
                }
            }
        }
        return result;
    }

这里面就是通过把开始时间,结束时间和刚才保存起来的堆栈信息的key,也就是保存堆栈信息的时间做对比,在开始时间和结束时间这个范围内的堆栈信息才是有用的,如果一个函数执行了3秒,那么这里会把这三秒内的所有函数执行堆栈信息都取出来,然后再封装成BlockInfo通知到外面,同时可存到文件中,到这里造成卡顿的函数执行堆栈已经采集完成

CPU信息采集

可见%iowait是一个非常模糊的指标,如果看到 %iowait 升高,还需检查I/O量有没有明显增加,avserv/avwait/avque等指标有没有明显增大,应用有没有感觉变慢,如果都没有,就没什么好担心的。

BlockCanary几个核心

flow.png
上一篇下一篇

猜你喜欢

热点阅读