《庖丁解牛Android源码 - OkHttp源码》(一) 拆解

2018-03-02 本文已影响100人非墨Zero

1.简述

OkHttp 是 Square 组织出品的一套支持 http1.x/http 2/WebSocket 的网络框架。由于其易用性被广大的公司所采用。 OkHttp 跟以往的网络不同还在于:
OkHttp 抛弃掉以往基于 Android 平台或者 Java 平台上原生的 HTTP 协议支持的Api ,而自己实现了一套 HTTP 协议。并且这套协议不仅支持 1.x 的 HTTP 协议,还支持 2.0的 HTTP协议

（本文将先拆解 OkHttp 中的各种组件，然后整合到一起分析 OkHttp 框架）

今天我们要分析的类是 DiskLruCache，Disk 代表这个 cache 针对的是外置存储器，可能是磁盘或者 sdcard 。Lru 代表这个 cache 所使用的淘汰算法。其实，将 DiskLruCache 归类于 OkHttp 并不准确，这个类原本属于 android 4.1 的系统类，位于 libcore.io 包下。抽取出来以后就可以应用于任何版本的 Android 系统。实际上，我们依旧能在 OkHttp 的 DiskLruCache 代码中找到 libcore.io 的影子：

// OkHttp3
public final class DiskLruCache implements Closeable, Flushable { 
    ...
    static final String MAGIC = "libcore.io.DiskLruCache";//文件魔数依旧保留libcore包名
   ....
/*
     * This cache uses a journal file named "journal". A typical journal file
     * looks like this:
     *     libcore.io.DiskLruCache
      ....
*/
}

如上面源码所述，DiskLruCache 会将一些元数据文件信息记录到自己的一个数据文件中，而文件魔数 MAGIC 依然保留着 libcore 的包名，甚至连注释，也直接 copy 的当时 libcore 时候的注释。当然，OkHttp 跟 libcore 里的 DiskLruCache 也有差别，主要体现在 IO 处理上。OkHttp 是基于 Okio 框架开发的，因此 OkHttp 在处理 IO 的时候使用了更为方便的 Okio 接口。

(如果你对 Okio 并不熟悉，可以参考我的 Okio 系列文章:Okio源码解析 )

DiskLruCache 的构造需要调用它的静态工厂方法 create :

public static DiskLruCache create(FileSystem fileSystem, File directory, int appVersion,
      int valueCount, long maxSize) {
    if (maxSize <= 0) {
      throw new IllegalArgumentException("maxSize <= 0");
    }
    if (valueCount <= 0) {
      throw new IllegalArgumentException("valueCount <= 0");
    }

    // Use a single background thread to evict entries.
    Executor executor = new ThreadPoolExecutor(0, 1, 60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>(), Util.threadFactory("OkHttp DiskLruCache", true));//构建单线程线程池

    return new DiskLruCache(fileSystem, directory, appVersion, valueCount, maxSize, executor);
  }

静态工厂方法 create 所需要的形参，基本就是 DiskLruCache 构造器的形参。所需要的参数分别是:

参数对应表

一般情况下，我们不需要指定这么多的参数，OkHttp 给我提供了一个很好的门面 okhttp3.Cache 。Cache 类将持有一个 DiskLruCache 对象，最后的实际操作将交给 DiskLruCache 对象去执行，比较类似 ContextWarpper 和 ContextImpl 的关系。

 public Cache(File directory, long maxSize) {//构造器
    this(directory, maxSize, FileSystem.SYSTEM);
  }

  Cache(File directory, long maxSize, FileSystem fileSystem) {
    this.cache = DiskLruCache.create(fileSystem, directory, VERSION, ENTRY_COUNT, maxSize);//构建一个DiskLruCache 存到成员变量 cache中去
  }

通过 okhttp3.Cache ，调用者只需要通过传递一个目录和最大存储值即可构建一个 DiskLruCache 。OkHttp 之所以要在 DiskLruCache 这个类之外再包装一个 Cache 主要是因为 DiskLruCache 并不关注具体的业务种类。而 okhttp3.Cache 主要功能，是将 DiskLruCache 包装成为可以方便处理 OkHttp 的网络相关业务的类。

2.Demo

我们先简单使用一下 DiskLruCache :

private static void test(DiskLruCache cache ,String tag)throws Exception {
        Editor editor = cache.edit(tag);//开启名字为 tag 的事务
        File file = new File("/Users/david/temp2.txt");//temp2.txt 占用1303个字节
        Buffer  buffer = new Buffer();
        Source source = Okio.source(file);
        source.read(buffer, file.length());
        source.close();
        editor.newSink(0).write(buffer,
                buffer.size());
        editor.commit();
    }
    
// test code
DiskLruCache cache = DiskLruCache.create(FileSystem.SYSTEM, 
                    new File("/Users/david/cache"), 1, 1, 3000);
            cache.initialize();
            test(cache,"hello1");
            test(cache,"hello2");
            test(cache,"hello3");
            test(cache,"hello4");

代码执行之后，将会在我们的 cache 目录下生成下列文件：

cache文件夹目录

journal 文件类似一个日志文件，用来保存你对该文件夹的操作记录
hello*.0 文件需要分成两部分 "." 好前部分 "hello*" 就是我们传入的 key。后面的阿拉伯数字代表我们所对应的数据段索引。

3. journal 文件和初始化

journal 是一个日志文件，它实际上是一个文本文件，它的文件结构如下图：

journal 文件结构

上面的例子中我们将得到文件内容：

libcore.io.DiskLruCache //MAGIC
1 //DiskLruCache 版本号
1 //APP_VERSION
1 //VALUE_COUNT
  //BLANK
DIRTY hello1 //RECORDS
CLEAN hello1 1303
DIRTY hello2
CLEAN hello2 1303
DIRTY hello3
CLEAN hello3 1303
DIRTY hello4
REMOVE hello1
CLEAN hello4 1303

当外部需要访问 DiskLruCache 中的数据时候， DiskLruCache 将调用 initialize() 函数，这个函数将读取 journal 文件进行初始化操作。比如你在使用 DiskLruCache.get 获取缓存的时候：

public synchronized Snapshot get(String key) throws IOException {
    initialize();
    ...具体get操作
}

initialize() 函数将会在所有数据访问操作之前执行，类似 AOP 。DiskLruCache 在记录的管理上，保持了比较高的安全策略。为了保证数据的准确性，需要维护多个 journal 文件，避免管理出错

public synchronized void initialize() throws IOException {
    。。。
    if (initialized) {
      return; // Already initialized.
    }

    // If a bkp file exists, use it instead.
    if (fileSystem.exists(journalFileBackup)) {
        ...
      }
    }//如果原始文件消失，可以采用备份文件

    // Prefer to pick up where we left off.
    if (fileSystem.exists(journalFile)) {
      try {
        readJournal();//读取journal文件
        processJournal();//处理从journal文件读取出来的数据,删除不必要的数据
        initialized = true;
        return;
      } catch (IOException journalIsCorrupt) {
        ...
        delete();//如果文件解析出现异常，删除掉journal文件
        closed = false;
      }
    }
    rebuildJournal();//重新构建一个journal文件
    initialized = true;
  }

Journal 文件的读取依赖于文件的数据结构，日志记录数据将通过调用 readJournalLine 方法实现：

private void readJournal() throws IOException {
    BufferedSource source = Okio.buffer(fileSystem.source(journalFile));
    try {
      String magic = source.readUtf8LineStrict();//MAGIC
      String version = source.readUtf8LineStrict();//VERSION
      String appVersionString = source.readUtf8LineStrict();//APPVERSION
      String valueCountString = source.readUtf8LineStrict();//VALUE_COUNT
      String blank = source.readUtf8LineStrict();//BLANK
      ...
      int lineCount = 0;
      while (true) {
        try {
          readJournalLine(source.readUtf8LineStrict());//解析记录数据
          lineCount++;
        } catch (EOFException endOfJournal) {
          break;
        }
      }
      ...
  }

每一个记录都将分成几部分：指令 + key + [文件大小+]，每一个数据元都使用空格分隔。而如果指令可以携带文件大小参数的话，那么这个文件大小参数可以是多个，个数根据参数 valueCount 指定。比如上面的例子中

记录结构

Journal 有四个指令：

  private static final String CLEAN = "CLEAN"; //需要携带文件大小
  private static final String DIRTY = "DIRTY";
  private static final String REMOVE = "REMOVE";
  private static final String READ = "READ";

其中，只有 "CLEAN" 指令需要携带文件大小。这四个指令的含义分别是：

指令对应表

private void readJournalLine(String line) throws IOException {
    int firstSpace = line.indexOf(' ');
    ...
    int keyBegin = firstSpace + 1;
    int secondSpace = line.indexOf(' ', keyBegin);//判断 CLEAN 指令
    final String key;
    if (secondSpace == -1) {
      key = line.substring(keyBegin);
      //code step1
      if (firstSpace == REMOVE.length() && line.startsWith(REMOVE)) {
        lruEntries.remove(key);
        return;
      }
    } else {
      key = line.substring(keyBegin, secondSpace);
    }

    Entry entry = lruEntries.get(key);
    if (entry == null) {
      entry = new Entry(key);
      lruEntries.put(key, entry);//构建 lruEntries
    }

    if (secondSpace != -1 && firstSpace == CLEAN.length() && line.startsWith(CLEAN)) {
        //code step2
      String[] parts = line.substring(secondSpace + 1).split(" ");
      entry.readable = true;
      entry.currentEditor = null;
      entry.setLengths(parts);
    } else if (secondSpace == -1 && firstSpace == DIRTY.length() && line.startsWith(DIRTY)) {
      entry.currentEditor = new Editor(entry);
    } else if (secondSpace == -1 && firstSpace == READ.length() && line.startsWith(READ)) {
        //nothing
    } else {
      throw new IOException("unexpected journal line: " + line);
    }
  }

反复调用 readJournalLine 函数的目的，是为了构建 lruEntries 对象，而在将字符串指令转换成为具体对象的时候，根据指令的特性，DiskLruCache 运用了一些简单的算法：

CLEAN: 当第二个空格存在的时候，代表着后面携带文件大小参数，(对应代码 step2 位置)，得到文件大小参数字符串 line.substring(secondSpace + 1) 通过调用 entry.setLengths 方法设置到 entry 文件内存记录中去。
DIRTY: 判断指令字符和长度
READ: 判断指令字符和长度
REMOVE: 判断指令长度和长度

initialize 函数通过调用 readJournal 完成配置之后，将会调用 processJournal 。这个函数一方面是用于计算 key 对应的数据的总大小，一方面是对一些脏数据处理，最后状态 DIRTY 的数据是不安全的。可能是你在准备写入的时候，程序中断，导致这个事务并没被执行。为了保证数据的完整性和安全性，DiskLruCache 会将这个 key 对应的相关数据删除。

private void processJournal() throws IOException {
    fileSystem.delete(journalFileTmp);
    for (Iterator<Entry> i = lruEntries.values().iterator(); i.hasNext(); ) {
      Entry entry = i.next();
      if (entry.currentEditor == null) {
        //计算文件大小
        for (int t = 0; t < valueCount; t++) {
          size += entry.lengths[t];
        }
      } else {
        ...
        //脏数据进行处理
        for (int t = 0; t < valueCount; t++) {
          fileSystem.delete(entry.cleanFiles[t]);
          fileSystem.delete(entry.dirtyFiles[t]);
        }
        i.remove();
      }
    }
  }

4. 添加记录

通过我们上面的 demo 和后面的代码分析。我们知道，对 DiskLruCache 对象的处理是基于事务对象 Editor 。这种做法像极了我们的 SharePerference 对象。DiskLruCache 的 Editor 事务是通过调用 DiskLruCache.edit 函数获得:

synchronized Editor edit(String key, long expectedSequenceNumber) throws IOException {
    initialize();//初始化
    ...
    validateKey(key);//检测key名字是否合法
    Entry entry = lruEntries.get(key);
    ...
    if (entry != null && entry.currentEditor != null) {//保证每次只有一个东西在操作文件
      return null; // Another edit is in progress.
    }
    if (mostRecentTrimFailed || mostRecentRebuildFailed) {
      executor.execute(cleanupRunnable);//执行淘汰机制
      return null;
    }
    journalWriter.writeUtf8(DIRTY).writeByte(' ').writeUtf8(key).writeByte('\n');//保存DIRTY 记录
    journalWriter.flush();
    ...
    if (entry == null) {
      entry = new Entry(key);
      lruEntries.put(key, entry);
    }
    Editor editor = new Editor(entry);
    entry.currentEditor = editor;//将该事务分配在这条记录之上
    return editor;
  }

lruEntries 是 LinkedHashMap类型，Linked 前缀的 Map 类型表示，当采用迭代器方式获取 Map 中的数据时候，将以 LinkedList 的有序序列返回，利用这个中有序性，就可以实现 LRU 的简单算法。当你对 Edit 事务都处理完了以后，就需要调用 Edit.commit() 函数提交最后的修改，实际上就是在 Journal 文件的最后将你操作记录的文件状态设置为 CLEAN

public void commit() throws IOException {
      synchronized (DiskLruCache.this) {
        if (done) {
          throw new IllegalStateException();
        }
        if (entry.currentEditor == this) {
          completeEdit(this, true);//第二个参数代表这个事务是否执行正常
        }
        done = true;
      }
    }
    
synchronized void completeEdit(Editor editor, boolean success) throws IOException {
    ...
    if (success && !entry.readable) {
      for (int i = 0; i < valueCount; i++) {
        if (!editor.written[i]) {//step1 保证每一个数据都被写入
          editor.abort();
          throw new IllegalStateException("Newly created entry didn't create value for index " + i);
        }
        if (!fileSystem.exists(entry.dirtyFiles[i])) {
          editor.abort();//dirty文件不存在，那么需要将该事务弃置
          return;
        }
      }
    }
    for (int i = 0; i < valueCount; i++) {
      File dirty = entry.dirtyFiles[i];
      if (success) {
        if (fileSystem.exists(dirty)) {
          File clean = entry.cleanFiles[i];
          fileSystem.rename(dirty, clean);//dirtyFile->cleanFile
          ...
        }
      }
    }

    redundantOpCount++;
    entry.currentEditor = null;
    if (entry.readable | success) {
      entry.readable = true;
      journalWriter.writeUtf8(CLEAN).writeByte(' ');
      journalWriter.writeUtf8(entry.key);
      entry.writeLengths(journalWriter);
      journalWriter.writeByte('\n');
      if (success) {
        entry.sequenceNumber = nextSequenceNumber++;
      }
    } else {
      lruEntries.remove(entry.key);
      journalWriter.writeUtf8(REMOVE).writeByte(' ');
      journalWriter.writeUtf8(entry.key);
      journalWriter.writeByte('\n');
    }
    journalWriter.flush();

    if (size > maxSize || journalRebuildRequired()) {
      executor.execute(cleanupRunnable);
    }
  }

这里，size 变量代表的是目前的目录下所有文件的大小。在 Edit.commit 里调用了内部函数 completeEdit(Editor editor, boolean success) 。代码 step1 用于检查是否所有的数据都被覆盖。如果你要修改一个数据，需要将这条记录的所有文件都要修改。比如我们的 OkHttp 框架,每一次 OkHttp 修改记录的时候都会对所有的索引文件进行修改：

//code com.squareup.okhttp.Cache
public final class Cache {
    private static final int ENTRY_METADATA = 0;
    private static final int ENTRY_BODY = 1;
    private static final int ENTRY_COUNT = 2;
}

每一个 OkHttp 记录包含两个数据段，对应的索引分别是:

ENTRY_METADATA： 元数据段，用于储存请求信息和时间信息
ENTRY_BODY： 数据体，用于储存实际的数据

当通过 Cache.put(Response response) 往缓冲池里存入数据的时候，会调用到我们上面所阐述的 DiskLruCache 存放记录的流程：

private CacheRequest put(Response response) throws IOException {
    ...
    Entry entry = new Entry(response);
    DiskLruCache.Editor editor = null;
    try {
      editor = cache.edit(urlToKey(response.request()));
      if (editor == null) {
        return null;
      }
      entry.writeTo(editor);
      //这个函数用于写入索引为ENTRY_METADATA的数据
      return new CacheRequestImpl(editor); 
      //CacheRequestImpl 这个对象包装了写入ENTRY_BODY的操作和commit 操作
    } catch (IOException e) {
      abortQuietly(editor);
      return null;
    }
  }

当调用Cache.put(Response response) 函数的时候，OkHttp 会生成一个 Entry 对象，这个Entry 用于表示 OkHttp 的每一个请求的记录，而 DiskLruCache 中的 Entry 是用来表示key数据的记录。OkHttp 会通过调用 entry.writeTo(editor) 的方式将 ENTRY_METADATA 数据写入 editor 事务中去，并且构建了一个 CacheRequest 对象给上层，用于后面 ENTRY_BODY 数据的输入。

//code Cache.Entry
public void writeTo(DiskLruCache.Editor editor) throws IOException {
      BufferedSink sink = Okio.buffer(editor.newSink(ENTRY_METADATA));
      ....
      sink.close();
      //注意，此处并没有对 editor 做commit，是因为 ENTRY_BODY 还没写
 }

而CacheRequest 对象实际上是做了一层简单的装饰，保证当你数据书写完毕以后 editor 对象被 commit :

//code CacheRequestImpl
public CacheRequestImpl(final DiskLruCache.Editor editor) throws IOException {
      this.editor = editor;
      this.cacheOut = editor.newSink(ENTRY_BODY);
      this.body = new ForwardingSink(cacheOut) {
        @Override public void close() throws IOException {
          synchronized (Cache.this) {
            if (done) {
              return;
            }
            done = true;
            writeSuccessCount++;
          }
          super.close();
          editor.commit();
          //保证close函数调用以后事务 editor 被commit
        }
      };
    }

4. 淘汰和排序记录

上面我们说明了如何通过 DiskLruCache 对象添加一条记录，以及在 OkHttp 里是如何添加记录的，但是我们还遗留了一个问题，就是在 DiskLruCache中是如何实现 LRU 的？我们似乎并没有在代码中找到对应的代码。上面我们说到，文件记录的存储是放在一个 Map 对象 lruEntries 中去，而这个 lruEntries 是一个 LinkedHashMap 类型。在 DiskLruCache 生成 lruEntries 对象的时候，调用的是LinkedHashMap 三参构造器：

public final class DiskLruCache {
    ...
    final LinkedHashMap<String, Entry> lruEntries = new LinkedHashMap<>(0, 0.75f, true);
    ...
}

我们说的Lru算法实现就依赖于 LinkedHashMap 构造器的最后一个参数。参数注释翻译过来就是是否支持访问后迭代排序。

/**
     * The iteration ordering method for this linked hash map: <tt>true</tt>
     * for access-order, <tt>false</tt> for insertion-order.
     *
     * @serial
     */
    private final boolean accessOrder;
    public LinkedHashMap(int initialCapacity,
                         float loadFactor,
                         boolean accessOrder) {
        super(initialCapacity, loadFactor);
        this.accessOrder = accessOrder;
    }

我们来看下这个参数是如何影响最后的迭代序列的，我们从访问函数 get 入手：

//code LinkedHashMap
 public V get(Object key) {
        LinkedHashMapEntry<K,V> e = (LinkedHashMapEntry<K,V>)getEntry(key);
        if (e == null)
            return null;
        e.recordAccess(this);
        return e.value;
    }
//code LinkedHashMapEntry
void recordAccess(HashMap<K,V> m) {
            LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
            if (lm.accessOrder) {
                lm.modCount++;
                remove();//将自己从迭代队列中删除
                addBefore(lm.header);// 将自己放入迭代队列队头
            }
        }

每一次访问操作都会触发 LinkedHashMapEntry.recordAccess 接口。 LinkedHashMapEntry 是迭代器线性列表中的一个节点，在调用 recordAccess 的时候会判断 lm.accessOrder 属性是否为 true 。然后将通过调用 remove 和 addBefore(lm.header) 。 lm.header LinkedHashMap 的散列队列的队列头部。通过这两步操作，这个LinkedHashMapEntry 对象就将自己变成队列头部,从而实现了 LRU 算法。

DiskLruCache 认为数据被访问依赖于两个函数：

Editor edit(String): 用于开启编辑事务
Snapshot get(String key) : 用于获取记录快照

当需要重建 journal 文件或者触发清理操作的时候，会往线程池中抛出一个 cleanupRunnable 消息：

private final Runnable cleanupRunnable = new Runnable() {
    public void run() {
      synchronized (DiskLruCache.this) {
        ...
          trimToSize();
        ...
      }
    }
  };
  
void trimToSize() throws IOException {
    while (size > maxSize) {
      Entry toEvict = lruEntries.values().iterator().next();
      removeEntry(toEvict);
    }
    mostRecentTrimFailed = false;
  }

cleanupRunnable 命令中会调用 trimToSize 函数，用于删除和计算当前 cache 文件夹中所包含的文件大小总和。清理操作由 removeEntry(Entry) 完成：

boolean removeEntry(Entry entry) throws IOException {
    ...
    for (int i = 0; i < valueCount; i++) {
      fileSystem.delete(entry.cleanFiles[i]);
      size -= entry.lengths[i];
      entry.lengths[i] = 0;
    }
    ...
    return true;
  }

所谓清理，也就是删除掉掉这条记录下所有文件。

5. 总结

DiskLruCache 这个类，或者这种模式有很好的通用性，目前也非只有 OkHttp 一个框架在用。作者希望读者们将这篇作为一篇工具文档，而不是作为一篇知识储备文档，结合这篇工具文档去结合代码去看或者去实际操作，这样能更加深刻的理解DiskLruCache 这类工具。