Why is elementData in ArrayList marked transient?

2018-12-05 汪和呆喵

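While reading the ArrayList source, I noticed that elementData, the array that stores the elements, is declared transient. That keyword means the field is skipped by default serialization.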

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     */
    // Android-note: Also accessed from java.util.Collections
    transient Object[] elementData; // non-private to simplify nested class access

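Doesn't that mean the contents of ArrayList's element array are lost entirely after serialization?
Digging into the code shows that they are not. ArrayList provides two methods for serialization and deserialization,
writeObject and readObject: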

    /**
     * Save the state of the <tt>ArrayList</tt> instance to a stream (that
     * is, serialize it).
     *
     * @serialData The length of the array backing the <tt>ArrayList</tt>
     *             instance is emitted (int), followed by all of its elements
     *             (each an <tt>Object</tt>) in the proper order.
     */
    private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException{
        // Write out element count, and any hidden stuff
        int expectedModCount = modCount;
        s.defaultWriteObject();

        // Write out size as capacity for behavioural compatibility with clone()
        s.writeInt(size);

        // Write out all elements in the proper order.
        for (int i=0; i<size; i++) {
            s.writeObject(elementData[i]);
        }

        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
    }

    /**
     * Reconstitute the <tt>ArrayList</tt> instance from a stream (that is,
     * deserialize it).
     */
    private void readObject(java.io.ObjectInputStream s)
        throws java.io.IOException, ClassNotFoundException {
        elementData = EMPTY_ELEMENTDATA;

        // Read in size, and any hidden stuff
        s.defaultReadObject();

        // Read in capacity
        s.readInt(); // ignored

        if (size > 0) {
            // be like clone(), allocate array based upon size not capacity
            ensureCapacityInternal(size);

            Object[] a = elementData;
            // Read in all elements in the proper order.
            for (int i=0; i<size; i++) {
                a[i] = s.readObject();
            }
        }
    }

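During serialization, ArrayList's writeObject is called and writes size and each element directly to the ObjectOutputStream;
during deserialization, readObject is called, reads size and the elements back from the ObjectInputStream, and restores them into elementData.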
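To see that nothing is lost despite the transient modifier, here is a minimal round-trip sketch. The class name ArrayListRoundTrip and its helper methods are made up for illustration, and an in-memory byte array stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class ArrayListRoundTrip {

    // Serializes the list into an in-memory byte array.
    static byte[] serialize(ArrayList<String> list) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(list); // ends up in ArrayList's private writeObject
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a list from the bytes produced by serialize().
    @SuppressWarnings("unchecked")
    static ArrayList<String> deserialize(byte[] data) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ArrayList<String>) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("a", "b", "c"));
        ArrayList<String> copy = deserialize(serialize(original));
        // The copy is a distinct object that holds equal elements.
        System.out.println(copy.equals(original) && copy != original);
    }
}
```

The copy compares equal to the original because the elements travel through the stream individually, even though the elementData field itself is never written by default serialization.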

Why not serialize elementData directly instead of going through the methods above?

The reason is that elementData is a buffer array with a default capacity of 10. When an add operation finds the array full,
ArrayList grows it, normally to 1.5 times the old capacity.

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
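As a small worked example of the arithmetic in grow(), the sketch below applies oldCapacity + (oldCapacity >> 1) repeatedly, starting from 10. The class and method names are invented for illustration, and the minCapacity and MAX_ARRAY_SIZE checks from the excerpt are deliberately left out.

```java
// Hypothetical demo class that isolates the growth formula from grow().
class GrowthSequence {

    // Same arithmetic as the excerpt: old capacity plus half of it (integer shift).
    static int nextCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    public static void main(String[] args) {
        int capacity = 10;
        StringBuilder steps = new StringBuilder(String.valueOf(capacity));
        for (int i = 0; i < 5; i++) {
            capacity = nextCapacity(capacity);
            steps.append(" -> ").append(capacity);
        }
        // prints "10 -> 15 -> 22 -> 33 -> 49 -> 73"
        System.out.println(steps);
    }
}
```

Because the shift rounds down, the factor is slightly below 1.5 for odd capacities (15 grows to 22, not 22.5).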

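So elementData keeps some spare capacity and only grows when it runs out, which means some slots may hold no element at all. Serializing through writeObject and readObject guarantees that only the elements actually stored are written, not the whole array, which saves both space and time.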
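The saving can be observed with a rough measurement. The sketch below assumes a JDK whose ArrayList.writeObject matches the excerpt above (it writes size, not the array length); CapacityVsSize and its helpers are hypothetical names, and the raw Object[] merely stands in for an elementData that would have been serialized as-is.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;

// Hypothetical demo class, not part of the JDK.
class CapacityVsSize {

    // Number of bytes the standard object stream produces for the given object.
    static int serializedLength(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray().length;
    }

    // A list holding three elements, with the backing array sized by the caller.
    static ArrayList<Integer> listWithCapacity(int capacity) {
        ArrayList<Integer> list = new ArrayList<>(capacity);
        list.add(1);
        list.add(2);
        list.add(3);
        return list;
    }

    // A plain array with the same three elements and many unused (null) slots.
    static Object[] rawArray(int capacity) {
        Object[] array = new Object[capacity];
        array[0] = 1;
        array[1] = 2;
        array[2] = 3;
        return array;
    }

    public static void main(String[] args) {
        System.out.println("tight list: " + serializedLength(listWithCapacity(3)));
        System.out.println("roomy list: " + serializedLength(listWithCapacity(10_000)));
        System.out.println("raw array:  " + serializedLength(rawArray(10_000)));
    }
}
```

Under that assumption the two lists serialize to the same number of bytes regardless of capacity, while the raw array pays for every empty slot.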

How are writeObject and readObject called during serialization?

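Something looks odd here. writeObject and readObject are called from outside the class, yet both are private. They are not defined in java.lang.Object, nor are they declared in Serializable. So how does ObjectOutputStream use them?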

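Serialization goes through ObjectOutputStream's writeObject(), which converts the object into a byte stream and writes it out. If the object passed in declares its own writeObject(), ObjectOutputStream's writeObject() invokes that method reflectively to perform the serialization. Deserialization uses ObjectInputStream's readObject(), which works the same way.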

    ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(file));
    oos.writeObject(list);
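The same mechanism is available to any Serializable class. As a sketch, the hypothetical HookedBuffer below copies the ArrayList pattern on a small scale: a transient buffer with spare capacity, plus private writeObject and readObject hooks that the object streams find and invoke on their own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical class written for this example; it mirrors ArrayList's approach.
class HookedBuffer implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient int[] slots = new int[16]; // spare capacity, skipped by default serialization
    private int size;                            // written by defaultWriteObject()

    void add(int value) {
        if (size == slots.length) {
            slots = Arrays.copyOf(slots, size * 2);
        }
        slots[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return slots[index];
    }

    int size() {
        return size;
    }

    // Never called by our own code: ObjectOutputStream invokes it reflectively.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();          // non-transient fields (size)
        for (int i = 0; i < size; i++) {
            out.writeInt(slots[i]);        // only the used slots
        }
    }

    // Never called by our own code: ObjectInputStream invokes it reflectively.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();            // restores size
        slots = new int[Math.max(size, 16)]; // field initializers do not run on deserialization
        for (int i = 0; i < size; i++) {
            slots[i] = in.readInt();
        }
    }

    // Serializes and deserializes a buffer in memory.
    static HookedBuffer roundTrip(HookedBuffer buffer) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(buffer);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HookedBuffer) ois.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HookedBuffer buffer = new HookedBuffer();
        buffer.add(7);
        buffer.add(8);
        HookedBuffer copy = roundTrip(buffer);
        System.out.println(copy.size() + " elements, first = " + copy.get(0));
    }
}
```

If the two private hooks were removed, the copy would come back with size 2 but a null slots array, which is exactly the data loss the transient modifier seems to imply at first glance.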

Taking ObjectInputStream as an example, here is a rough outline of the call flow; interested readers can follow along in the source.

First, deserialization calls readObject() -> Object obj = readObject0(false) -> readObject0 -> return checkResolve(readOrdinaryObject(unshared)) -> readOrdinaryObject -> readSerialData(obj, desc);

readSerialData then calls slotDesc.invokeReadObject(obj, this).
That is ObjectStreamClass's invokeReadObject(Object obj, ObjectInputStream in),
which in turn executes readObjectMethod.invoke(obj, new Object[]{ in });

This is clearly a reflective method call, so which method is readObjectMethod?

    readObjectMethod = getPrivateMethod(cl, "readObject",
        new Class<?>[] { ObjectInputStream.class }, Void.TYPE);
    writeObjectMethod = getPrivateMethod(cl, "writeObject",
        new Class<?>[] { ObjectOutputStream.class }, Void.TYPE);

As you can see, writeObjectMethod is looked up in the same place.
The getPrivateMethod method looks like this:

    /**
     * Returns non-static private method with given signature defined by given
     * class, or null if none found.  Access checks are disabled on the
     * returned method (if any).
     */
    private static Method getPrivateMethod(Class<?> cl, String name,
                                           Class<?>[] argTypes,
                                           Class<?> returnType)
    {
        try {
            Method meth = cl.getDeclaredMethod(name, argTypes);
            meth.setAccessible(true);
            int mods = meth.getModifiers();
            return ((meth.getReturnType() == returnType) &&
                    ((mods & Modifier.STATIC) == 0) &&
                    ((mods & Modifier.PRIVATE) != 0)) ? meth : null;
        } catch (NoSuchMethodException ex) {
            return null;
        }
    }

At this point the picture is reasonably clear: ObjectInputStream uses reflection to call the private readObject method
and thereby performs deserialization.
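The lookup can be reproduced outside the JDK with public reflection APIs. The sketch below is a simplified imitation of getPrivateMethod, not the JDK's own code: it only locates the hook and checks its modifiers, without calling setAccessible or invoking it. PrivateHookLookup, WithHook, and WithoutHook are hypothetical names.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical demo class, not part of the JDK.
class PrivateHookLookup {

    // Declares the private hook that the object streams look for.
    static class WithHook implements Serializable {
        private static final long serialVersionUID = 1L;

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
        }
    }

    // Relies on default serialization only.
    static class WithoutHook implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Returns the private, non-static, void readObject(ObjectInputStream)
    // declared by the class itself, or null if there is none.
    static Method findReadObject(Class<?> cl) {
        try {
            Method method = cl.getDeclaredMethod("readObject", ObjectInputStream.class);
            int mods = method.getModifiers();
            boolean matches = method.getReturnType() == void.class
                    && Modifier.isPrivate(mods)
                    && !Modifier.isStatic(mods);
            return matches ? method : null;
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("WithHook:    " + (findReadObject(WithHook.class) != null));
        System.out.println("WithoutHook: " + (findReadObject(WithoutHook.class) != null));
        // Assumes the running JDK's ArrayList declares the hook shown in the article.
        System.out.println("ArrayList:   " + (findReadObject(java.util.ArrayList.class) != null));
    }
}
```

getDeclaredMethod sees private members of the class itself, which is why the hooks do not need to be public or declared in any interface.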

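In fact, many other collections in the Java collections framework take the same approach and mark their backing storage transient, for example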
CopyOnWriteArrayList
private transient volatile Object[] elements;
HashMap
transient Node<K,V>[] table;
HashSet
private transient HashMap<E,Object> map;

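In each case the reason is the same: serialize only the elements actually stored rather than the whole backing storage, which saves space and time.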

Reference:

How readObject, writeObject, and readResolve are called during Java object stream serialization
