Getting to the Bottom of It (1): ThreadLocal

2018-06-28, by 叫我宫城大人

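I. What is ThreadLocal

As the name suggests, a ThreadLocal is a thread-local variable: every thread that accesses it gets its own independent copy of the value, so threads cannot affect one another through it.

II. Hands-on example

Create a test program that starts two threads. Each thread sets and prints the current username; then observe the output: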

public class Test {

    private static ThreadLocal<String> usernameLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        usernameLocal.set("main");
        new Thread(() -> {
            System.out.println("[a]" + usernameLocal.get());
            usernameLocal.set("a");
            System.out.println("[a]" + usernameLocal.get());
        }).start();

        new Thread(() -> {
            System.out.println("[b]" + usernameLocal.get());
            usernameLocal.set("b");
            System.out.println("[b]" + usernameLocal.get());
        }).start();
        System.out.println("[main]" + usernameLocal.get());
    }
}

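Because the threads interleave, the order of the output lines varies from run to run, but the values are always the same: the username (usernameLocal) that each thread works with is unaffected by the other threads. One possible output: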

[main]main
[a]null
[b]null
[a]a
[b]b
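Since the interleaving above makes the output order unpredictable, a deterministic variant can be easier to reason about. The following is a minimal sketch, not part of the original test program: IsolationDemo and observeInWorker are hypothetical names, and each worker is joined before the next one starts so that the result is always the same.

```java
public class IsolationDemo {

    private static final ThreadLocal<String> USERNAME = new ThreadLocal<>();

    /** Runs one worker thread; returns what it read before and after its own set(). */
    public static String[] observeInWorker(String name) {
        String[] seen = new String[2];
        Thread worker = new Thread(() -> {
            seen[0] = USERNAME.get();   // nothing has been set in this thread yet
            USERNAME.set(name);
            seen[1] = USERNAME.get();   // this thread's own copy
        });
        worker.start();
        try {
            worker.join();              // wait, so the results can be read safely
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        USERNAME.set("main");
        String[] a = observeInWorker("a");
        String[] b = observeInWorker("b");
        System.out.println("[a]" + a[0] + " -> " + a[1]);   // [a]null -> a
        System.out.println("[b]" + b[0] + " -> " + b[1]);   // [b]null -> b
        System.out.println("[main]" + USERNAME.get());      // [main]main
    }
}
```

Neither worker sees the "main" value, and the main thread's value survives both workers.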

III. Source code walkthrough

So how does ThreadLocal isolate a variable per thread? Let's look at the ThreadLocal source to find out.

1. set

/**
 * Sets the current thread's copy of this thread-local variable
 * to the specified value.  Most subclasses will have no need to
 * override this method, relying solely on the {@link #initialValue}
 * method to set the values of thread-locals.
 *
 * @param value the value to be stored in the current thread's copy of
 *        this thread-local.
 */
public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

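The Javadoc says it clearly: set stores the specified value as the current thread's copy of this thread-local variable.

The class worth noting here is ThreadLocalMap, a static nested class of ThreadLocal. Here is a manually simplified version: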

static class ThreadLocalMap {

    /**
     * The entries in this hash map extend WeakReference, using
     * its main ref field as the key (which is always a
     * ThreadLocal object).  Note that null keys (i.e. entry.get()
     * == null) mean that the key is no longer referenced, so the
     * entry can be expunged from table.  Such entries are referred to
     * as "stale entries" in the code that follows.
     */
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

  /**
   * The table, resized as necessary.
   * table.length MUST always be a power of two.
   */
  private Entry[] table;

  // Note: some code omitted

  /**
   * Construct a new map initially containing (firstKey, firstValue).
   * ThreadLocalMaps are constructed lazily, so we only create
   * one when we have at least one entry to put in it.
   */
  ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
      table = new Entry[INITIAL_CAPACITY];
      int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
      table[i] = new Entry(firstKey, firstValue);
      size = 1;
      setThreshold(INITIAL_CAPACITY);
  }

  /**
   * Get the entry associated with key.  This method
   * itself handles only the fast path: a direct hit of existing
   * key. It otherwise relays to getEntryAfterMiss.  This is
   * designed to maximize performance for direct hits, in part
   * by making this method readily inlinable.
   *
   * @param  key the thread local object
   * @return the entry associated with key, or null if no such
   */
  private Entry getEntry(ThreadLocal<?> key) {
      int i = key.threadLocalHashCode & (table.length - 1);
      Entry e = table[i];
      if (e != null && e.get() == key)
          return e;
      else
          return getEntryAfterMiss(key, i, e);
  }

  /**
   * Set the value associated with key.
   *
   * @param key the thread local object
   * @param value the value to be set
   */
  private void set(ThreadLocal<?> key, Object value) {

      // We don't use a fast path as with get() because it is at
      // least as common to use set() to create new entries as
      // it is to replace existing ones, in which case, a fast
      // path would fail more often than not.

      Entry[] tab = table;
      int len = tab.length;
      int i = key.threadLocalHashCode & (len-1);

      for (Entry e = tab[i];
           e != null;
           e = tab[i = nextIndex(i, len)]) {
          ThreadLocal<?> k = e.get();

          if (k == key) {
              e.value = value;
              return;
          }

          if (k == null) {
              replaceStaleEntry(key, value, i);
              return;
          }
      }

      tab[i] = new Entry(key, value);
      int sz = ++size;
      if (!cleanSomeSlots(i, sz) && sz >= threshold)
          rehash();
  }

  /**
   * Remove the entry for key.
   */
  private void remove(ThreadLocal<?> key) {
      Entry[] tab = table;
      int len = tab.length;
      int i = key.threadLocalHashCode & (len-1);
      for (Entry e = tab[i];
           e != null;
           e = tab[i = nextIndex(i, len)]) {
          if (e.get() == key) {
              e.clear();
              expungeStaleEntry(i);
              return;
          }
      }
  }
}

The structure closely resembles a map. From the getEntry/set methods we can tell that the key is a ThreadLocal and the value is the object we actually want to store.
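The index computation threadLocalHashCode & (len - 1), which appears in the constructor, getEntry and set, works as a cheap modulo only because the table length is a power of two. The sketch below illustrates this. It assumes, based on the JDK's ThreadLocal source rather than on the excerpt above, that each new ThreadLocal takes its hash code from a counter that starts at 0 and advances by 0x61c88647, and that INITIAL_CAPACITY is 16. SlotIndexDemo and slotIndices are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class SlotIndexDemo {

    // Assumed values, taken from the JDK's ThreadLocal source (not shown in the excerpt)
    static final int HASH_INCREMENT = 0x61c88647;
    static final int INITIAL_CAPACITY = 16;

    /** Slot indices of the first `count` hash codes in a table with `capacity` slots. */
    public static Set<Integer> slotIndices(int count, int capacity) {
        Set<Integer> indices = new LinkedHashSet<>();
        int hash = 0;
        for (int n = 0; n < count; n++) {
            indices.add(hash & (capacity - 1)); // same masking as in ThreadLocalMap
            hash += HASH_INCREMENT;             // int overflow simply wraps around
        }
        return indices;
    }

    public static void main(String[] args) {
        Set<Integer> indices = slotIndices(INITIAL_CAPACITY, INITIAL_CAPACITY);
        System.out.println(indices.size() + " distinct slots: " + indices);
    }
}
```

Because the increment is odd, 16 consecutive hash codes map to 16 different slots of a 16-slot table, so ThreadLocals created one after another do not collide in a fresh table. Linear probing via nextIndex handles the remaining cases.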

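Pay particular attention to the fact that Entry extends WeakReference, with the ThreadLocal key as the weak referent. Once no strong reference to the ThreadLocal remains, the GC can reclaim the key, and entry.get() then returns null (a "stale entry"). The value, however, is held through an ordinary strong reference in the Entry and stays reachable until the stale entry is expunged.

Now that we understand the structure of ThreadLocalMap, let's return to ThreadLocal's set method, which relies on a key method, getMap: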

/**
 * Get the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param  t the current thread
 * @return the map
 */
ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

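It turns out that getMap simply returns the thread's threadLocals field, which is of type ThreadLocalMap. That largely answers the question above: every thread carries its own map. Here is the field as declared in the Thread class (it is package-private rather than private, which is why ThreadLocal, also in java.lang, can access it directly):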

/* ThreadLocal values pertaining to this thread. This map is maintained
 * by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

Its initial value is null, and as we saw in set, a null map leads to createMap:

/**
 * Create the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param t the current thread
 * @param firstValue value for the initial entry of the map
 */
void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

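With the current ThreadLocal instance as the key and the passed-in value as the value, createMap initializes thread t's threadLocals field (a ThreadLocalMap). If you are interested, scroll back up and look at that constructor.

Once thread t's threadLocals has been initialized, later set calls are simpler: map.set(this, value) works much like a map's put, storing the value under the current ThreadLocal instance just as the constructor did for the first entry.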
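The design described above (the thread owns the map, the ThreadLocal is only the key) can be sketched with a toy model. This is an illustration under stated assumptions, not how the JDK implements it: ToyThread and ToyLocal are hypothetical classes, a plain HashMap stands in for ThreadLocalMap, and the toy only works on threads that are ToyThread instances.

```java
import java.util.HashMap;
import java.util.Map;

public class ToyThreadLocalDemo {

    /** Hypothetical stand-in for Thread: it carries its own map, like Thread.threadLocals. */
    static class ToyThread extends Thread {
        Map<ToyLocal<?>, Object> locals; // created lazily, starts out null
        ToyThread(Runnable task) { super(task); }
    }

    /** Hypothetical stand-in for ThreadLocal: it stores nothing itself, it is only the key. */
    static class ToyLocal<T> {
        void set(T value) {
            ToyThread t = (ToyThread) Thread.currentThread();
            if (t.locals == null) {
                t.locals = new HashMap<>();   // plays the role of createMap
            }
            t.locals.put(this, value);        // plays the role of map.set(this, value)
        }

        @SuppressWarnings("unchecked")
        T get() {
            ToyThread t = (ToyThread) Thread.currentThread();
            return t.locals == null ? null : (T) t.locals.get(this);
        }
    }

    /** Runs two ToyThreads that share one ToyLocal; returns what each one read back. */
    public static String[] runTwoThreads() {
        ToyLocal<String> username = new ToyLocal<>();
        String[] seen = new String[2];
        ToyThread a = new ToyThread(() -> { username.set("a"); seen[0] = username.get(); });
        ToyThread b = new ToyThread(() -> { username.set("b"); seen[1] = username.get(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = runTwoThreads();
        System.out.println(seen[0] + ", " + seen[1]); // prints "a, b"
    }
}
```

Both threads use the same ToyLocal object, yet each reads back only its own value, because the lookup always starts from the current thread's map. No locking is needed, since a thread only ever touches its own map.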

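To summarize the relationship: each Thread owns one ThreadLocalMap (threadLocals), and each Entry in that map pairs a ThreadLocal key with the value stored for that thread.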

2. get

/**
 * Returns the value in the current thread's copy of this
 * thread-local variable.  If the variable has no value for the
 * current thread, it is first initialized to the value returned
 * by an invocation of the {@link #initialValue} method.
 *
 * @return the current thread's value of this thread-local
 */
public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
}

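Once set is understood, get is much easier. It first fetches the current thread's threadLocals. Only if that map is not null, and the entry looked up with the current ThreadLocal as the key is also not null, does it return the value stored in the entry. Otherwise it returns the result of setInitialValue():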

/**
 * Variant of set() to establish initialValue. Used instead
 * of set() in case user has overridden the set() method.
 *
 * @return the initial value
 */
private T setInitialValue() {
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    return value;
}

First, look at initialValue, the method called on the first line:

/**
 * Returns the current thread's "initial value" for this
 * thread-local variable.  This method will be invoked the first
 * time a thread accesses the variable with the {@link #get}
 * method, unless the thread previously invoked the {@link #set}
 * method, in which case the {@code initialValue} method will not
 * be invoked for the thread.  Normally, this method is invoked at
 * most once per thread, but it may be invoked again in case of
 * subsequent invocations of {@link #remove} followed by {@link #get}.
 *
 * <p>This implementation simply returns {@code null}; if the
 * programmer desires thread-local variables to have an initial
 * value other than {@code null}, {@code ThreadLocal} must be
 * subclassed, and this method overridden.  Typically, an
 * anonymous inner class will be used.
 *
 * @return the initial value for this thread-local
 */
protected T initialValue() {
    return null;
}

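It simply returns null. Combined with the source above, this means that by default get stores an entry in threadLocals with the current ThreadLocal instance as the key and null as the value, and then returns null. The Javadoc explains the way out: "if the programmer desires thread-local variables to have an initial value other than null, ThreadLocal must be subclassed". In other words, we can override this method to change the default.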
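As a small illustration of changing the default, the sketch below overrides initialValue in an anonymous subclass, as the Javadoc suggests, and also shows the ThreadLocal.withInitial factory available since Java 8. The class name, field names and the values "guest" and 0 are made up for the example.

```java
public class InitialValueDemo {

    // Option 1: subclass and override initialValue(), as the Javadoc describes
    public static final ThreadLocal<String> BY_OVERRIDE = new ThreadLocal<String>() {
        @Override
        protected String initialValue() {
            return "guest";
        }
    };

    // Option 2 (Java 8+): the withInitial factory takes a Supplier
    public static final ThreadLocal<Integer> BY_SUPPLIER = ThreadLocal.withInitial(() -> 0);

    public static void main(String[] args) {
        System.out.println(BY_OVERRIDE.get());   // prints "guest"
        System.out.println(BY_SUPPLIER.get());   // prints 0

        BY_OVERRIDE.set("alice");
        BY_OVERRIDE.remove();                    // drops this thread's entry
        System.out.println(BY_OVERRIDE.get());   // prints "guest": initialValue() runs again
    }
}
```

The last line matches the Javadoc's remark that initialValue may be invoked again when remove is followed by get.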

3. remove

/**
 * Removes the current thread's value for this thread-local
 * variable.  If this thread-local variable is subsequently
 * {@linkplain #get read} by the current thread, its value will be
 * reinitialized by invoking its {@link #initialValue} method,
 * unless its value is {@linkplain #set set} by the current thread
 * in the interim.  This may result in multiple invocations of the
 * {@code initialValue} method in the current thread.
 *
 * @since 1.5
 */
public void remove() {
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null)
        m.remove(this);
}

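This method removes the Entry for this ThreadLocal instance from the current thread's threadLocals. So why is it needed?

Memory leaks are a perennial topic in Java. When an object should have been released but is still reachable from a GC root through a chain of strong references, the GC will not reclaim its memory, and the result is a memory leak.

Because ThreadLocalMap.Entry extends WeakReference, here is a short digression on how the GC treats weak references: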

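// Fragment: place inside a main method declared "throws InterruptedException";
// requires: import java.lang.ref.WeakReference;
String str = new String("hello");   // strong reference to the String
WeakReference<String> entry  = new WeakReference<>(str);
System.out.println("before gc, " + entry.get());
System.gc();
Thread.sleep(1000);
System.out.println("after gc, " + entry.get());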

The result is:

before gc, hello
after gc, hello

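The new String("hello") object was not collected. That is because the weak reference entry is not the only reference to it: the strong reference str points to it as well, so under GC-root reachability analysis the object is still strongly reachable and cannot be collected.

So let's modify the test code: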

// Same fragment as before, but no strong reference to the String is kept
WeakReference<String> entry  = new WeakReference<>(new String("hello"));
System.out.println("before gc, " + entry.get());
System.gc();
Thread.sleep(1000);
System.out.println("after gc, " + entry.get());

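The result is (System.gc() is only a hint to the JVM, so this output is typical rather than guaranteed):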

before gc, hello
after gc, null

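This confirms the analysis above. Back to ThreadLocal: only the key of an Entry is weakly referenced. The value is held by a strong reference and stays reachable through the chain Thread -> threadLocals -> Entry -> value. If remove is not called after using a ThreadLocal and the thread does not terminate, the value is not released. This holds even after the ThreadLocal key itself has been collected, until a later operation on the same map happens to expunge the stale entry (for example replaceStaleEntry or cleanSomeSlots in set above). That is the memory leak.

The most typical scenario is using ThreadLocal in a thread pool. Pooled threads are reused and stay alive, so each thread's threadLocals lives on as well, and the objects referenced by its entries are not released. In addition, a value left behind by one task is still visible to the next task that runs on the same thread.

So whenever you use a ThreadLocal, it is important to call remove once you are done with it, ideally in a finally block.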
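The thread-pool scenario can be reproduced with a single pooled thread. The following is a minimal sketch: PoolCleanupDemo, CURRENT_USER, valueSeenByNextTask and the value "alice" are hypothetical names chosen for the example. It runs two tasks back to back on the same worker thread and reports what the second task reads.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolCleanupDemo {

    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread; returns what the second task reads. */
    public static String valueSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newFixedThreadPool(1); // one reused worker thread
        Runnable firstTask = () -> {
            CURRENT_USER.set("alice");
            try {
                // ... the real work of the task would go here ...
            } finally {
                if (cleanUp) {
                    CURRENT_USER.remove();   // the recommended pattern
                }
            }
        };
        Callable<String> secondTask = CURRENT_USER::get;
        try {
            pool.submit(firstTask).get();            // wait for the first task to finish
            return pool.submit(secondTask).get();    // runs on the same worker thread
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove: " + valueSeenByNextTask(false)); // alice
        System.out.println("with remove: " + valueSeenByNextTask(true));     // null
    }
}
```

Without remove, the second task inherits "alice" from the first one, and the value stays referenced for as long as the worker thread lives. With remove in a finally block, the next task starts clean even if the work in between throws.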

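IV. Closing remarks

This is the first post in the "Getting to the Bottom of It" series, which describes, step by step, how I study and dig into source code. I hope it serves as useful study notes and a good summary, both for those of you reading this and for myself.

I am something of a purist, so the Java source pasted at each step is kept exactly as it is, official comments included, which should help readers think it through. Only when absolutely necessary do I insert a few explanatory comments of my own. This should also make it easier to revisit these principles later, look at them from a different angle, and understand them more thoroughly.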