集合包系列一 —— ArrayList

2017-03-04 本文已影响139人 FlySheep_ly

本系列文章所描述的所有类或接口都是基于 JDK 1.7的源码，在其它 JDK 的实现方式中可能会有所不同。

一、 ArrayList

1. 创建 ArrayList

在 JDK 1.7 中创建 ArrayList 提供了三个构造函数，分别如下：

    /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        super();   // 调用的是 AbstractList 空的构造函数
        this.elementData = EMPTY_ELEMENTDATA;   // EMPTY_ELEMENTDATA 的初始值为 Object[] EMPTY_ELEMENTDATA = {}
    }

    /**
     * Constructs an empty list with the specified initial capacity.
     *
     * @param  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        super();
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        this.elementData = new Object[initialCapacity];
    }

    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * @param c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        size = elementData.length;
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }

据此可以看出，ArrayList 采用的是数组的方式来存放对象，在没有指定初始化容量的时候，就分配了一个空的对象数组，没有任何对象元素，也就是容量为0，没有分配空间。

2. 插入对象：add(E)

add 方法简单来看就是将数组中某元素的值赋值为传入的对象，但在 add 时有个很明显的问题是：如果此时数组满了，该怎么办？

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity); // DEFAULT_CAPACITY = 10
        }

        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

当调用 ArrayList 的 add 方法时，首先基于 ArrayList 中已有的元素数量加1，产生一个名为 minCapacity 的变量，然后比较此值和 Object 数组的大小，如果此值大于 Object 数组大小值，产生一个新的数组的容量值，此值的计数方法为当前数组大小值*1.5，注意：网上很多人说是新的容量值是原来的1.5倍然后加1，其实这是错误的。，如果得出的容量值仍然小于 minCapacity ，那么就以 minCapacity 作为新的容量值，在得出这个容量值后，调用 Arrays.copyOf 来生成新的数组对象。如果想调整容量的增长策略，可以调用 ensureCapacity 方法。

    public void ensureCapacity(int minCapacity) {
        int minExpand = (elementData != EMPTY_ELEMENTDATA)
            // any size if real element table
            ? 0
            // larger than default for empty table. It's already supposed to be
            // at default size.
            : DEFAULT_CAPACITY;

        if (minCapacity > minExpand) {
            ensureExplicitCapacity(minCapacity);
        }
    }

Arrays.copyOf 的实现方法简单来说，首先是创建一个新的数组对象，该数组对象的类型和之前 ArrayList 中元素的类型是一致的，如果是 Object 类型，则直接通过 new Object[newLength] 的方式来创建；如果不是 Object 类型，则通过 Array.newInstance 调用 native 方法来创建相应类型的数组，在创建完新的数组对象后，调用 System.arraycopy 通过 native 方法将之前数组中的对象复制到新的数组中。ArrayList相当于在没指定initialCapacity时就是会使用延迟分配对象数组空间，当第一次插入元素时才分配10个对象空间。

public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

当我们已经确定了要插入的对象的数目（并不是在创建ArrayList之前就知道有多少对象要插入的情况），就应该首先调用ensureCapacity来一次性扩容到可以容得下要插入的对象个数，这样就避免的多次进行数组拷贝的情况，提高了效率，算是优化吧，当然，我们在创建ArrayList之前就知道有多少对象要插入就使用有参构造。
　　在 Collection 中增加对象时，ArrayList 还提供了 add(int, E) 这样的方法，允许将元素直接插入指定的 int 位置上，这个方法的实现首先要确保插入的位置是目前的 Array 数组中存在的，之后还要确保数组的容量是够用的。在完成了这些动作后，和 add(E) 不同的地方就出现了，它要将当前的数组对象进行一次复制，即将目前 index 及其后的数据都往后挪动一位，然后才能将指定的 index 位置的赋值为传入的对象，可见这种方式要多付出一次复制数组的代价。

    public void add(int index, E element) {
        rangeCheckForAdd(index);

        ensureCapacityInternal(size + 1);  // Increments modCount!!
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }

    private void rangeCheckForAdd(int index) {
        if (index > size || index < 0)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

除了 add(int, E) 这种方法可将对象插入指定的位置外，ArrayList 还提供了 set(int, E) 这样的方法来替换指定位置上的对象。

    public E set(int index, E element) {
        rangeCheck(index);

        E oldValue = elementData(index);
        elementData[index] = element;
        return oldValue;
    }

    private void rangeCheck(int index) {
        if (index >= size)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

ArrayList 还对外提供了 addAll(Collection<? extends E> c) 和 addAll(int index, Collection<? extends E> c) 这样的方法，其实现方式和 add(E)、add(int, E) 基本类似。

3. 删除对象：remove(E)

remove 对于集合的性能而言也非常重要，当执行此方法时，ArrayList 首先判断对象是否为 null，如为 null，则遍历数组中已有值的元素，并比较其是否为 null，如为 null，则调用 fastRemove 来删除相应位置的对象。fastRemove 方法的实现方式为将 index 后的对象往前复制一位，并将数组中的最后一个元素的值设置为 null，即释放了对此对象的引用；如对象非 null，唯一的不同在于通过 E 的 equals 来比较元素的值是否相同，如相同则认为是需要删除对象的位置，然后同样是调用 fastRemove 来完成对象的删除。由此可见调用 remove 方法一次只会移除一个元素。

    public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }

    private void fastRemove(int index) {
        modCount++;
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
    }

ArrayList 中还提供了 remove(int) 这样的方法来删除指定位置的对象，remove(int) 的实现比 remove(E) 多了一个数组范围检测，但少了对象位置的查找，因此性能会更好。

    public E remove(int index) {
        rangeCheck(index);

        modCount++;
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work

        return oldValue;
    }

    private void rangeCheck(int index) {
        if (index >= size)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

4. 获取单个对象：get(int)

get 传入的参数为数组元素的位置，因此 ArrayList 仅须先做数组范围的检测，然后即可直接返回数组中位于此位置的对象。

    public E get(int index) {
        rangeCheck(index);

        return elementData(index);
    }

    private void rangeCheck(int index) {
        if (index >= size)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

5. 遍历对象：iterator()

当每次调用 iterator 方法时，都会创建一个新的内部类 Itr 的实例。当调用此实例的 hasNext 方法时，比较当前指向的数组的位置是否和数组中已有的元素大小相等，如相等则返回 false，否则返回 true。
　　当调用实例的 next 方法时，首先比较在创建此 Iterator 时获取的 modCount 与目前的 modCount，如果这两个 modCount 不相等，则说明在获取 next 元素时，发生了对于集合大小产生影响（新增、删除）的动作。当发生这种情况时，则抛出 ConcurrentModificationException。如果 modCount 相等，则比较当前的游标，如果当前的游标大于数组的实际大小，则抛出 NoSuchElementException。如果当前游标大于数组的长度则抛出 ConcurrentModificationException。
　　注意：返回的 iterator 是 fail-fast 的。fail-fast 也就是“快速失败”，它是Java 集合的一种错误检测机制。当多个线程对集合进行结构上的改变的操作时，有可能会产生fail-fast机制。记住是有可能，而不是一定。例如：假设存在两个线程（线程1、线程2），线程1通过Iterator在遍历集合A中的元素，在某个时候线程2修改了集合A的结构（是结构上面的修改，而不是简单的修改集合元素的内容），那么这个时候程序就会抛出 ConcurrentModificationException 异常，从而产生fail-fast机制。

    public Iterator<E> iterator() {
        return new Itr();
    }

    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

fail-fast解决办法

方案一：在遍历过程中所有涉及到改变modCount值得地方全部加上synchronized或者直接使用Collections.synchronizedList，这样就可以解决。但是不推荐，因为增删造成的同步锁可能会阻塞遍历操作。
　　方案二：使用CopyOnWriteArrayList来替换ArrayList。推荐使用该方案。

6. 判断对象是否存在：contains(E)

为了判断 E 在 ArrayList 中是否存在，做法为遍历整个 ArrayList 中已有的元素，如 E 为 null，则直接判断已有元素是否为 null，如为 null，则返回 true；如 E 不为 null，则通过判断 E.equals 和元素是否相等，如相等则返回 true。
　　indexOf 和 lastIndexOf 是 ArrayList 中用于获取对象所在位置的方法，其中 indexOf 为从前往后寻找，而 lastIndexOf 为从后向前寻找。

    public boolean contains(Object o) {
        return indexOf(o) >= 0;
    }

    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }

7. 注意要点

对于 ArrayList 而言，最须注意的有以下几点：

ArrayList 基于数组方式实现，无容量的限制；
ArrayList 在执行插入元素时可能要扩容，在删除元素时并不会减小数组的容量(如希望相应的缩小数组容量，可以调用 ArrayList 的 trimToSize() )，在查找元素时要遍历数组，对于非 null 的元素采取 equals 的方式寻找；
ArrayList 是非线程安全的。