Reference in Java (Part 2): JDK 1.8 Refer

2020-07-11 冬天里的懒喵


1. The Reference structure in Java 1.8

In JDK 1.8, Reference lives in the java.lang.ref package.

image.png

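The main classes are Reference, SoftReference, WeakReference, PhantomReference, FinalReference, and Finalizer. The core one is the abstract class Reference, which all the other reference classes extend. SoftReference, WeakReference, and PhantomReference correspond to Java's soft, weak, and phantom references. A strong reference is the default reference relationship and is expressed with a plain assignment, so it has no dedicated class. FinalReference exists mainly to support the Finalizer mechanism. Finalization has many known problems; since JDK 9, Object.finalize() is deprecated and a Cleaner mechanism built on PhantomReference is offered as the alternative. This article covers JDK 1.8 only and does not go into JDK 9.
Another key class is ReferenceQueue.
The relationships among the classes in the java.lang.ref package are shown in the figure below: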

image.png

They can also be viewed with the Diagram feature in IDEA:

image.png

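The reference classes above are summarized in the following table:

Class	Reference type	Description
SoftReference	Soft reference	The garbage collector clears the reference when heap memory runs low
WeakReference	Weak reference	Cleared at the next garbage collection once the referent is only weakly reachable
PhantomReference	Phantom reference	Does not affect the referent's lifetime; used only to be notified that the object has been reclaimed
FinalReference	-	Internal class that Java uses to implement finalization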
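As a quick illustration of the public classes in the table, the sketch below creates one reference of each kind to the same object and checks what get() reports while the object is still strongly reachable. The class name ReferenceKindsDemo and the method probe are made up for this example; only the java.lang.ref classes are real.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the JDK.
class ReferenceKindsDemo {

    // Reports what get() returns for each reference kind while the referent
    // is still strongly reachable through the local variable `data`.
    static boolean[] probe() {
        Object data = new Object();                        // strong reference
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        SoftReference<Object> soft = new SoftReference<>(data);
        WeakReference<Object> weak = new WeakReference<>(data);
        PhantomReference<Object> phantom = new PhantomReference<>(data, queue);

        boolean softSeesIt = soft.get() == data;           // true
        boolean weakSeesIt = weak.get() == data;           // true
        boolean phantomSeesIt = phantom.get() != null;     // false: get() always returns null
        return new boolean[] { softSeesIt, weakSeesIt, phantomSeesIt };
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        // prints "soft=true weak=true phantom=false"
        System.out.println("soft=" + r[0] + " weak=" + r[1] + " phantom=" + r[2]);
    }
}
```

The example deliberately avoids calling System.gc(), because when the collector actually clears a soft or weak reference is not something a program can rely on.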

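2. References and reachability

To understand Reference, we need a closer look at how GC works.
The previous article showed how the reference types defined by the JVM are used.
GC decides whether to reclaim an object by searching downward from the GC Root nodes and computing reachability. There are five levels of reachability; four of them correspond to the four reference types.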

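Reachability	Reference type	Description
Strongly reachable	Strong reference	An object is strongly reachable if a thread can reach it through strong references.
Softly reachable	Soft reference	An object is softly reachable if it is not strongly reachable but can be reached through a soft reference.
Weakly reachable	Weak reference	An object is weakly reachable if it is neither strongly nor softly reachable but can be reached through a weak reference.
Phantom reachable	Phantom reference	An object is phantom reachable if it is neither strongly, softly, nor weakly reachable, it has been finalized, and a phantom reference refers to it.
Unreachable	-	An object is unreachable, and therefore eligible for reclamation, if it cannot be reached in any of the ways above.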

That is the concept of reachability. The following example illustrates it further:

image.png

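In the example above, each of the objects A to D has exactly one reference: A a strong reference, B a soft reference, C a weak reference, and D a phantom reference. Their reachability is therefore: A strongly reachable, B softly reachable, C weakly reachable, D phantom reachable. E has no reference chain from a GC Root, so it is unreachable.
Now consider a more complex example: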

image.png

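A still has only one strong reference, so A is strongly reachable
B has two references, one strong and one soft, but it can be reached through the strong one, so B is strongly reachable
C can be reached only through a weak reference, so it is weakly reachable
D has a weak reference and a phantom reference, so it is weakly reachable
E is strongly referenced by F, but no GC Root can reach it, so it is still unreachable.

These are the five levels of reachability in the JVM. The JVM relies on these Reference subclasses to handle objects differently once they stop being strongly reachable.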

3. Reference source code

3.1 Core source code

First, the class comment of Reference:

/**
 * Abstract base class for reference objects.  This class defines the
 * operations common to all reference objects.  Because reference objects are
 * implemented in close cooperation with the garbage collector, this class may
 * not be subclassed directly.
 *
 * @author   Mark Reinhold
 * @since    1.2
 */

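The comment says that this abstract class is the base class of all reference objects and defines the operations common to them, and that because it works in close cooperation with the garbage collector it may not be subclassed directly. In other words, the JVM treats this class and its subclasses specially: the GC code is hard-coded to recognize the concrete classes SoftReference, WeakReference, and PhantomReference, and it handles their referent field specially, which is what produces the behavior of the different reference types. Without that, Reference would be no different from an ordinary class.
Reference provides two core functions:

Implementing the specific reference types
Letting users be notified after an object has been reclaimed
The first function should be clear by now. As for the second, how does the GC send a notification after reclaiming an object? For something as performance-sensitive as GC, a conventional callback model is not an option: if a callback blocked or ran slowly during a Full GC, the whole JVM would stall. Reference therefore takes a different approach and adds the Reference whose referent was reclaimed to a queue, from which users can fetch it whenever they need to. This also explains why soft and weak references offer two constructors, with or without a ReferenceQueue, while a phantom reference, whose only purpose is notification, must be used with a ReferenceQueue.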
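The queue-based notification can be tried out without waiting for a real collection. The following sketch assumes that calling Reference.enqueue() by hand is an acceptable stand-in for what the collector and the handler thread do after the referent is reclaimed; QueueNotificationDemo and its members are hypothetical names.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the JDK.
class QueueNotificationDemo {

    // Kept in a static field so the referent stays strongly reachable
    // and the collector never enqueues the reference behind our back.
    private static final Object PAYLOAD = new Object();

    static boolean[] run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(PAYLOAD, queue);

        boolean emptyBefore = queue.poll() == null;  // nothing has been enqueued yet
        boolean firstEnqueue = ref.enqueue();        // manual stand-in for the GC notification
        Reference<?> polled = queue.poll();          // the consumer pulls the notification
        boolean secondEnqueue = ref.enqueue();       // a reference is enqueued at most once

        return new boolean[] { emptyBefore, firstEnqueue, polled == ref, secondEnqueue };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        // prints "true true true false"
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
    }
}
```

The consumer decides when to call poll, so nothing the consumer does can slow the collector down, which is the point of using a queue instead of a callback.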
Now look at the Reference source:

public abstract class Reference<T> {
    private T referent;         /* Treated specially by GC */
    // After the referent has been reclaimed, this Reference is added to this queue
    volatile ReferenceQueue<? super T> queue;
    
    
    /* -- Constructors -- */
    // The caller only wants the special reference semantics and does not care about GC notifications, so no ReferenceQueue is needed
    Reference(T referent) {
        this(referent, null);
    }
    // A queue is passed in; once the GC has cleared this reference, it is added to that queue
    Reference(T referent, ReferenceQueue<? super T> queue) {
        this.referent = referent;
        this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
    }

}

3.2 Reference states

Reference defines the states that a Reference instance can be in:

 /* A Reference instance is in one of four possible internal states:
     *
     *     Active: Subject to special treatment by the garbage collector.  Some
     *     time after the collector detects that the reachability of the
     *     referent has changed to the appropriate state, it changes the
     *     instance's state to either Pending or Inactive, depending upon
     *     whether or not the instance was registered with a queue when it was
     *     created.  In the former case it also adds the instance to the
     *     pending-Reference list.  Newly-created instances are Active.
     *
     *     Pending: An element of the pending-Reference list, waiting to be
     *     enqueued by the Reference-handler thread.  Unregistered instances
     *     are never in this state.
     *
     *     Enqueued: An element of the queue with which the instance was
     *     registered when it was created.  When an instance is removed from
     *     its ReferenceQueue, it is made Inactive.  Unregistered instances are
     *     never in this state.
     *
     *     Inactive: Nothing more to do.  Once an instance becomes Inactive its
     *     state will never change again.
     *
     * The state is encoded in the queue and next fields as follows:
     *
     *     Active: queue = ReferenceQueue with which instance is registered, or
     *     ReferenceQueue.NULL if it was not registered with a queue; next =
     *     null.
     *
     *     Pending: queue = ReferenceQueue with which instance is registered;
     *     next = this
     *
     *     Enqueued: queue = ReferenceQueue.ENQUEUED; next = Following instance
     *     in queue, or this if at end of list.
     *
     *     Inactive: queue = ReferenceQueue.NULL; next = this.
     *
     * With this scheme the collector need only examine the next field in order
     * to determine whether a Reference instance requires special treatment: If
     * the next field is null then the instance is active; if it is non-null,
     * then the collector should treat the instance normally.
     *
     * To ensure that a concurrent collector can discover active Reference
     * objects without interfering with application threads that may apply
     * the enqueue() method to those objects, collectors should link
     * discovered objects through the discovered field. The discovered
     * field is also used for linking Reference objects in the pending list.
     */

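That is a long English comment. When studying the Java source, understanding these comments is often more important than the code itself: the code only shows how something is implemented, while the reason it is implemented that way is often explained in the comments.
The comment divides the state of a Reference into four:

State	Description
Active	A newly created instance is Active. Some time after the collector detects that the referent's reachability has changed, the instance moves to Pending or Inactive: if it was registered with a ReferenceQueue when created, it becomes Pending and is added to the pending-Reference list; otherwise it becomes Inactive
Pending	The state of an instance on the pending-Reference list, waiting to be added to its ReferenceQueue. An instance that was not registered with a ReferenceQueue is never in this state
Enqueued	The state of an instance that is in its ReferenceQueue. When it is removed from the ReferenceQueue it becomes Inactive
Inactive	The final state. Once an instance is Inactive its state never changes again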

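For these four states, the queue and next fields of a Reference hold the following values:

State	queue	next
Active	ReferenceQueue or ReferenceQueue.NULL	null
Pending	ReferenceQueue	this
Enqueued	ReferenceQueue.ENQUEUED	Next instance in the queue, or this if at the end
Inactive	ReferenceQueue.NULL	this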
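The encoding in the table can be read back as a small decision procedure. The sketch below is a toy model, not JDK code: ToyReferenceState and its string sentinels are invented for illustration, and the transitions are applied by hand in the order the comment describes.

```java
// Toy model of how the state is encoded in the queue and next fields.
// Every name in this class is hypothetical.
class ToyReferenceState {
    static final Object REGISTERED = "some registered queue";
    static final Object NULL_QUEUE = "ReferenceQueue.NULL";
    static final Object ENQUEUED = "ReferenceQueue.ENQUEUED";

    Object queue;
    ToyReferenceState next;

    ToyReferenceState(Object queue) {
        this.queue = queue;
    }

    // Decodes the state exactly as the table lays it out.
    String state() {
        if (next == null) return "Active";       // the only check the collector needs
        if (queue == ENQUEUED) return "Enqueued";
        if (queue == NULL_QUEUE) return "Inactive";
        return "Pending";                        // registered queue, next == this
    }

    // Walks one instance through its life and records each state.
    static String lifecycle(boolean registered) {
        ToyReferenceState r = new ToyReferenceState(registered ? REGISTERED : NULL_QUEUE);
        StringBuilder sb = new StringBuilder(r.state());
        r.next = r;                              // collector reacts to the reachability change
        sb.append(" -> ").append(r.state());
        if (!registered) return sb.toString();
        r.queue = ENQUEUED;                      // handler thread moves it into the queue
        sb.append(" -> ").append(r.state());
        r.queue = NULL_QUEUE;                    // user code polls it out of the queue
        sb.append(" -> ").append(r.state());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lifecycle(true));     // Active -> Pending -> Enqueued -> Inactive
        System.out.println(lifecycle(false));    // Active -> Inactive
    }
}
```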

The state diagram is as follows:

image.png

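The comment above mentions both a pending-Reference list and a ReferenceQueue. What is the list for? One might expect the JVM to add a Reference directly to its ReferenceQueue after GC, but that is not what happens. GC has to stay efficient, and the data in a ReferenceQueue does not need to be delivered that promptly, so the collector only adds the Reference to the pending-Reference list. That is a lightweight and very fast operation. The static field pending in Reference is the head of this list, and the discovered field is the pointer to the next node.
Here is the relevant Reference source: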

    /* List of References waiting to be enqueued.  The collector adds
     * References to this list, while the Reference-handler thread removes
     * them.  This list is protected by the above lock object. The
     * list uses the discovered field to link its elements.
     */
    private static Reference<Object> pending = null;
    
        /* When active:   next element in a discovered reference list maintained by GC (or this if last)
     *     pending:   next element in the pending list (or null if last)
     *   otherwise:   NULL
     */
    transient private Reference<T> discovered;  /* used by VM */

The collector adds Active references to the pending list.

3.3 ReferenceHandler

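As described above, GC only adds a reference to the pending-Reference list. When is it added to the ReferenceQueue? That is done by a separate thread, ReferenceHandler, which is a nested class of Reference. For thread safety there is also a global lock: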

    /* Object used to synchronize with the garbage collector.  The collector
     * must acquire this lock at the beginning of each collection cycle.  It is
     * therefore critical that any code holding this lock complete as quickly
     * as possible, allocate no new objects, and avoid calling user code.
     */
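     // The GC must acquire this lock during a collection so that it stays in sync with the ReferenceHandler thread.
     // Because the GC also needs this lock, code that holds it must finish quickly, allocate no new objects, and call no user code, so that it does not hold up the GC.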
    static private class Lock { }
    private static Lock lock = new Lock();
    /* High-priority thread to enqueue pending References
     */
    private static class ReferenceHandler extends Thread {

        private static void ensureClassInitialized(Class<?> clazz) {
            try {
                Class.forName(clazz.getName(), true, clazz.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
            }
        }

        static {
            // pre-load and initialize InterruptedException and Cleaner classes
            // so that we don't get into trouble later in the run loop if there's
            // memory shortage while loading/initializing them lazily.
            ensureClassInitialized(InterruptedException.class);
            ensureClassInitialized(Cleaner.class);
        }

        ReferenceHandler(ThreadGroup g, String name) {
            super(g, name);
        }

        public void run() {
            while (true) {
                tryHandlePending(true);
            }
        }
    }

The core logic of the thread is in tryHandlePending:

/**
     * Try handle pending {@link Reference} if there is one.<p>
     * Return {@code true} as a hint that there might be another
     * {@link Reference} pending or {@code false} when there are no more pending
     * {@link Reference}s at the moment and the program can do some other
     * useful work instead of looping.
     *
     * @param waitForNotify if {@code true} and there was no pending
     *                      {@link Reference}, wait until notified from VM
     *                      or interrupted; if {@code false}, return immediately
     *                      when there is no pending {@link Reference}.
     * @return {@code true} if there was a {@link Reference} pending and it
     *         was processed, or we waited for notification and either got it
     *         or thread was interrupted before being notified;
     *         {@code false} otherwise.
     */
    static boolean tryHandlePending(boolean waitForNotify) {
        Reference<Object> r;
        Cleaner c;
        try {
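            // Take the lock so that we never touch the list at the same time as the garbage collector
            synchronized (lock) {
                // Check whether the pending-Reference list has any entries
                if (pending != null) {
                    // There is a pending Reference: take it from the head of the list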
                    r = pending;
                    // 'instanceof' might throw OutOfMemoryError sometimes
                    // so do this before un-linking 'r' from the 'pending' chain...
                    c = r instanceof Cleaner ? (Cleaner) r : null;
                    // unlink 'r' from 'pending' chain
                    pending = r.discovered;
                    r.discovered = null;
                } else {
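                    // No pending Reference: wait() on the lock.
                    // Waiting can itself raise an OutOfMemoryError: an InterruptedException
                    // may occur, that exception object has to be allocated, and the
                    // allocation fails if memory is exhausted. The OutOfMemoryError is
                    // therefore caught below so that the ReferenceHandler thread does not
                    // exit silently.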
                    // The waiting on the lock may cause an OutOfMemoryError
                    // because it may try to allocate exception objects.
                    if (waitForNotify) {
                        lock.wait();
                    }
                    // retry if waited
                    return waitForNotify;
                }
            }
        } catch (OutOfMemoryError x) {
            // Give other threads CPU time so they hopefully drop some live references
            // and GC reclaims some space.
            // Also prevent CPU intensive spinning in case 'r instanceof Cleaner' above
            // persistently throws OOME for some time...
            Thread.yield();
            // retry
            return true;
        } catch (InterruptedException x) {
            // retry
            return true;
        }
        // If r is a Cleaner, call its clean method
        // Fast path for cleaners
        if (c != null) {
            c.clean();
            return true;
        }

        ReferenceQueue<? super Object> q = r.queue;
        // Enqueue if the reference was registered with a queue (its queue is not ReferenceQueue.NULL)
        if (q != ReferenceQueue.NULL) q.enqueue(r);
        return true;
    }

The ReferenceHandler thread is started in the static initializer block of Reference:

  static {
        ThreadGroup tg = Thread.currentThread().getThreadGroup();
        for (ThreadGroup tgn = tg;
             tgn != null;
             tg = tgn, tgn = tg.getParent());
        Thread handler = new ReferenceHandler(tg, "Reference Handler");
        /* If there were a special system-only priority greater than
         * MAX_PRIORITY, it would be used here
         */
        handler.setPriority(Thread.MAX_PRIORITY);
        handler.setDaemon(true);
        handler.start();

        // provide access in SharedSecrets
        SharedSecrets.setJavaLangRefAccess(new JavaLangRefAccess() {
            @Override
            public boolean tryHandlePendingReference() {
                return tryHandlePending(false);
            }
        });
    }

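As the code shows, ReferenceHandler runs at Thread.MAX_PRIORITY, the highest priority. Its main job is to move References from the pending-Reference list to their ReferenceQueue. Note that, to avoid interfering with GC, the code that runs while holding the lock allocates no new objects and calls no user code.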
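The hand-off between the collector and the handler can be modelled in a few lines. This is a single-threaded toy, assuming a plain ArrayDeque in place of ReferenceQueue and a null queue in place of ReferenceQueue.NULL; ToyPendingList, Node, collectorAdds and handleOne are all hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the pending-Reference list. Not JDK code.
class ToyPendingList {
    static final class Node {
        final String name;
        final Deque<Node> queue;   // null plays the role of ReferenceQueue.NULL
        Node discovered;           // link used by the pending list

        Node(String name, Deque<Node> queue) {
            this.name = name;
            this.queue = queue;
        }
    }

    private static final Object lock = new Object();
    private static Node pending;   // head of the pending list

    // Role of the collector: a cheap push onto the head of the pending list.
    static void collectorAdds(Node n) {
        synchronized (lock) {
            n.discovered = pending;
            pending = n;
            lock.notifyAll();      // would wake a waiting handler thread
        }
    }

    // Role of ReferenceHandler: unlink one node under the lock,
    // then enqueue it after the lock has been released.
    static boolean handleOne() {
        Node r;
        synchronized (lock) {
            if (pending == null) return false;
            r = pending;
            pending = r.discovered;
            r.discovered = null;
        }
        if (r.queue != null) r.queue.addLast(r);
        return true;
    }

    static String demo() {
        Deque<Node> q = new ArrayDeque<>();
        collectorAdds(new Node("a", q));
        collectorAdds(new Node("b", null));   // unregistered: the handler just drops it
        collectorAdds(new Node("c", q));
        while (handleOne()) { }               // drain the pending list
        StringBuilder sb = new StringBuilder();
        for (Node n : q) sb.append(n.name);
        return sb.toString();                 // "ca": c was pushed last, so it is handled first
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping the enqueue step outside the synchronized block mirrors the real code, where the lock is shared with the collector and must be held as briefly as possible.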

4. ReferenceQueue

Next, the ReferenceQueue source.

/**
 * Reference queues, to which registered reference objects are appended by the
 * garbage collector after the appropriate reachability changes are detected.
 *
 * @author   Mark Reinhold
 * @since    1.2
 */

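When a reference is registered with a queue, it is appended to that queue after the collector detects the appropriate change in reachability. The queue itself is also a linked list.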

    // Head node of the linked list of references
    private volatile Reference<? extends T> head = null;
    // Queue length: incremented on enqueue, decremented on dequeue
    private long queueLength = 0;

To work correctly with multiple threads, it also has a lock:

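    // Static nested class used as the lock type
    static private class Lock { };
    /* Mutex that synchronizes enqueue (called by ReferenceHandler) with remove and poll (called by user threads) */
    private Lock lock = new Lock();

    // Sentinel meaning the reference is not registered with a queue
    static ReferenceQueue<Object> NULL = new Null<>();
    // Sentinel meaning the reference has already been enqueued
    static ReferenceQueue<Object> ENQUEUED = new Null<>();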

The key method is enqueue:

 boolean enqueue(Reference<? extends T> r) { /* Called only by Reference class */
        // Acquire the lock
        synchronized (lock) {
            // Decide whether the reference still needs to be enqueued
            // Check that since getting the lock this reference hasn't already been
            // enqueued (and even then removed)
            ReferenceQueue<?> queue = r.queue;
            // If the reference's queue is ReferenceQueue.NULL or ReferenceQueue.ENQUEUED, enqueueing fails and returns false
            if ((queue == NULL) || (queue == ENQUEUED)) {
                return false;
            }
            assert queue == this;
            // Mark the reference as enqueued by pointing its queue field at the shared ENQUEUED sentinel
            r.queue = ENQUEUED;
            // If the list is empty, r links to itself; otherwise the current head becomes r's successor
            r.next = (head == null) ? r : head;
            // r becomes the new head: every newly enqueued reference is inserted ahead of the existing ones
            head = r;
            // Increment the queue length
            queueLength++;
            // FinalReference is special-cased: the VM keeps a count of them
            if (r instanceof FinalReference) {
                sun.misc.VM.addFinalRefCount(1);
            }
            lock.notifyAll();
            return true;
        }
    }

The poll and reallyPoll methods:

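    // The actual poll operation; must be called with the lock held
    private Reference<? extends T> reallyPoll() {       /* Must hold lock */
        Reference<? extends T> r = head;
        if (r != null) {
            @SuppressWarnings("unchecked")
            Reference<? extends T> rn = r.next;
            // Advance head to the next node; a node whose next is itself is the last one, so the list becomes empty
            head = (rn == r) ? null : rn;
            r.queue = NULL;
            // Self-link the removed node instead of nulling next, so that it stays Inactive
            // (this matters for FinalReference) and cannot be dequeued twice
            r.next = r;
            // Decrement the queue length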
            queueLength--;
            if (r instanceof FinalReference) {
                sun.misc.VM.addFinalRefCount(-1);
            }
            return r;
        }
        return null;
    }

    // Public poll operation: takes the lock and calls reallyPoll
    public Reference<? extends T> poll() {
        if (head == null)
            return null;
        synchronized (lock) {
            return reallyPoll();
        }
    }

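The remove method, which removes the next reference from the queue:

// Removes the next reference from the queue; it relies on reallyPoll plus the blocking provided by Object.wait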
    public Reference<? extends T> remove(long timeout)
        throws IllegalArgumentException, InterruptedException
    {
        if (timeout < 0) {
            throw new IllegalArgumentException("Negative timeout value");
        }
        synchronized (lock) {
            Reference<? extends T> r = reallyPoll();
            if (r != null) return r;
            long start = (timeout == 0) ? 0 : System.nanoTime();
            for (;;) {
                lock.wait(timeout);
                r = reallyPoll();
                if (r != null) return r;
                if (timeout != 0) {
                    long end = System.nanoTime();
                    timeout -= (end - start) / 1000_000;
                    if (timeout <= 0) return null;
                    start = end;
                }
            }
        }
    }
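The timeout behaviour of remove can be observed with the public API. In this sketch the reference is enqueued by hand with Reference.enqueue(), which is an assumption made to keep the example deterministic; RemoveTimeoutDemo and run are hypothetical names.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the JDK.
class RemoveTimeoutDemo {

    // A static field keeps the referent strongly reachable for the whole run.
    private static final Object PAYLOAD = new Object();

    static boolean[] run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(PAYLOAD, queue);
        try {
            // Empty queue: blocks for about 50 ms, then gives up and returns null.
            Reference<?> timedOut = queue.remove(50);

            ref.enqueue();                           // manual stand-in for the handler thread

            // An element is available, so this returns without waiting.
            Reference<?> got = queue.remove(1000);
            return new boolean[] { timedOut == null, got == ref };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting on the queue", e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("timedOut=" + r[0] + " received=" + r[1]);   // timedOut=true received=true
    }
}
```

A timeout of 0 means wait indefinitely, which is why the source only tracks elapsed time when timeout is non-zero.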

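As the code shows, ReferenceQueue stores only the head node of the list. The nodes themselves are the Reference instances, linked through their next field, and ReferenceQueue provides the enqueue, poll, and remove operations on that list.
The complete relationship between Reference and ReferenceQueue is shown below: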

image.png
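The linked-list layout described above, with the queue holding only the head and the last node pointing to itself, can be reproduced in a toy class. ToyRefQueue and Node are hypothetical and leave out locking, the queue sentinels, and the length counter.

```java
// Toy version of the intrusive list inside ReferenceQueue. Not JDK code.
class ToyRefQueue {
    static final class Node {
        final String name;
        Node next;                 // the link lives in the element, not in the queue

        Node(String name) {
            this.name = name;
        }
    }

    private Node head;             // the queue itself stores only this

    void enqueue(Node r) {
        r.next = (head == null) ? r : head;   // the last node points to itself
        head = r;                             // new elements go in at the head
    }

    Node poll() {
        Node r = head;
        if (r == null) return null;
        Node rn = r.next;
        head = (rn == r) ? null : rn;         // a self-link marks the end of the list
        r.next = r;                           // the removed node is self-linked as well
        return r;
    }

    static String demo() {
        ToyRefQueue q = new ToyRefQueue();
        q.enqueue(new Node("a"));
        q.enqueue(new Node("b"));
        q.enqueue(new Node("c"));
        StringBuilder sb = new StringBuilder();
        for (Node n = q.poll(); n != null; n = q.poll()) {
            sb.append(n.name);
        }
        return sb.toString();                 // "cba": head insertion gives last-in, first-out order
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because insertion and removal both happen at the head, the structure behaves like a stack despite its name; code that consumes a real ReferenceQueue should not depend on any particular order.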

5. Source of the other Reference classes

5.1 SoftReference

SoftReference is simple: it extends Reference and only adds timestamps.

    /**
     * Timestamp clock, updated by the garbage collector
     */
    static private long clock;

    /**
     * Timestamp updated by each invocation of the get method.  The VM may use
     * this field when selecting soft references to be cleared, but it is not
     * required to do so.
     */
    private long timestamp;

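SoftReference has a class-wide variable clock (the static field clock of java.lang.ref.SoftReference) that holds the time of the most recent GC in milliseconds; it is reset every time a GC occurs. Each SoftReference instance also has a timestamp field, which is set to the value of clock (the time of the most recent GC) whenever the referent is successfully obtained through get. The timestamp of an instance therefore records the most recent GC time as of its last access.
The get method: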

    /**
     * Returns this reference object's referent.  If this reference object has
     * been cleared, either by the program or by the garbage collector, then
     * this method returns <code>null</code>.
     *
     * @return   The object to which this reference refers, or
     *           <code>null</code> if this reference object has been cleared
     */
    public T get() {
        T o = super.get();
        if (o != null && this.timestamp != clock)
            this.timestamp = clock;
        return o;
    }

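Each call to get only updates this timestamp; the clock itself is updated by the GC. Comparing the two tells the collector how long the SoftReference has gone without being accessed, and it then decides whether to clear it.
When a GC occurs, two factors determine whether the object referred to by a SoftReference is reclaimed:
1. How old the timestamp of the SoftReference instance is;
2. How much free memory is available.
The details of the reclamation process are beyond the scope of this article.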
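To make the two factors concrete, here is a simplified model of the least-recently-used policy as HotSpot is generally understood to apply it: a soft reference is cleared when the time since its last access exceeds the free heap in megabytes multiplied by the value of -XX:SoftRefLRUPolicyMSPerMB. The class ToySoftRefPolicy is hypothetical, the default of 1000 ms per MB is an assumption about HotSpot, and other JVMs are free to use a different rule.

```java
// Simplified model of an LRU clearing policy for soft references.
// This is a sketch of the idea, not the JVM's actual code.
class ToySoftRefPolicy {

    // Assumed default of -XX:SoftRefLRUPolicyMSPerMB in HotSpot.
    static final long MS_PER_MB = 1000;

    // clock:      time of the most recent GC, in milliseconds
    // timestamp:  value of clock recorded by the last successful get()
    // freeHeapMb: free heap space in megabytes
    static boolean shouldClear(long clock, long timestamp, long freeHeapMb) {
        long idleMs = clock - timestamp;            // how long the referent went unused
        long allowedMs = freeHeapMb * MS_PER_MB;    // more free memory, longer grace period
        return idleMs > allowedMs;
    }

    public static void main(String[] args) {
        // Idle for 6 s with 5 MB free: 6000 > 5000, so the reference is cleared.
        System.out.println(shouldClear(10_000, 4_000, 5));    // true
        // Idle for 6 s with 10 MB free: 6000 <= 10000, so it is kept.
        System.out.println(shouldClear(10_000, 4_000, 10));   // false
    }
}
```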

5.2 WeakReference

WeakReference only declares constructors; every other method is inherited from Reference.

    /**
     * Creates a new weak reference that refers to the given object.  The new
     * reference is not registered with any queue.
     *
     * @param referent object the new weak reference will refer to
     */
    public WeakReference(T referent) {
        super(referent);
    }

    /**
     * Creates a new weak reference that refers to the given object and is
     * registered with the given queue.
     *
     * @param referent object the new weak reference will refer to
     * @param q the queue with which the reference is to be registered,
     *          or <tt>null</tt> if registration is not required
     */
    public WeakReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }

5.3 PhantomReference

PhantomReference has a single constructor, which takes a ReferenceQueue, so it must be used together with a ReferenceQueue.

    /**
     * Creates a new phantom reference that refers to the given object and
     * is registered with the given queue.
     *
     * <p> It is possible to create a phantom reference with a <tt>null</tt>
     * queue, but such a reference is completely useless: Its <tt>get</tt>
     * method will always return null and, since it does not have a queue, it
     * will never be enqueued.
     *
     * @param referent the object the new phantom reference will refer to
     * @param q the queue with which the reference is to be registered,
     *          or <tt>null</tt> if registration is not required
     */
    public PhantomReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }

At the code level, PhantomReference differs from WeakReference only in having no queue-less constructor and in overriding get to always return null.
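Since get always returns null, code that reacts to a phantom reference cannot ask it for the object. One common way around this, sketched below, is to subclass PhantomReference and store whatever the cleanup needs in the subclass. ResourceRef, its resourceName field and the value "tmp-file-42" are invented for the example, and the reference is enqueued by hand in place of a real collection.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical demo class, not part of the JDK.
class PhantomCleanupDemo {

    // Carries the cleanup information itself, because get() yields null.
    static final class ResourceRef extends PhantomReference<Object> {
        final String resourceName;

        ResourceRef(Object referent, ReferenceQueue<Object> q, String resourceName) {
            super(referent, q);
            this.resourceName = resourceName;
        }
    }

    // A static field keeps the referent alive so only the manual enqueue fires.
    private static final Object OWNER = new Object();

    static String run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        ResourceRef ref = new ResourceRef(OWNER, queue, "tmp-file-42");

        ref.enqueue();                          // manual stand-in for the collector's notification
        Reference<?> polled = queue.poll();
        if (polled instanceof ResourceRef) {
            // Cleanup works from the data stored in the reference, never from the referent.
            return "release " + ((ResourceRef) polled).resourceName;
        }
        return "nothing to release";
    }

    public static void main(String[] args) {
        System.out.println(run());              // release tmp-file-42
    }
}
```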

Finalizer and FinalReference will be covered separately in a later article.
References for this article:
JDK源码阅读-Reference
阿里面试：说说强引用、软引用、弱引用、虚引用吧
