ArrayList、LinkedList和Vector

2019-11-17 本文已影响0人最美的风景

Collection的类继承图

从图中可以看出：Vector、ArrayList、LinkedList这三者都实现了List 接口.所有使用方式也很相似,主要区别在于实现方式的不同,所以对不同的操作具有不同的效率。
• ArrayList 就是动态数组，是Array的复杂版本，动态的增加和减少元素.当更多的元素加入到ArrayList中时,其大小将会动态地增长。它的元素可以通过get/set方法直接访问，因为ArrayList本质上是一个数组。
• Vector 和ArrayList类似, 区别在于Vector是同步类(synchronized).因此,开销就比ArrayList要大。
• LinkedList 是一个双链表,在添加和删除元素时具有比ArrayList更好的性能.但在get与set方面弱于ArrayList.当然,这些对比都是指数据量很大或者操作很频繁的情况下的对比。它还实现了 Queue 接口,该接口比List提供了更多的方法,包括 offer(),peek(),poll()等。
注意:默认情况下ArrayList和Vector的初始容量都是10,所以如果可以预估数据量的话,分配一个较大的初始值属于最佳实践,这样可以减少调整大小的开销。

一、ArrayList

ArrayList是List接口可调整数组大小的实现。实现所有可选列表操作，并允许放入包括空值在内的所有元素，ArrayList是非线程安全的。每个ArrayList都有一个容量（capacity,区别于size），表示底层数组的实际大小，容器内存储元素的个数不能多于当前容量。
java.util.ArrayList类的继承关系如下：


public class ArrayList<E> extends AbstractList<E> implements List<E>, RandomAccess, Cloneable, java.io.Serializable

其中需要注意的是RandomAccess接口，这是一个标记接口，没有定义任何具体的内容，该接口的意义是随机存取数据。在数据量很大的情况下，采用迭代器遍历实现了该接口的集合，速度比较慢。
ArrayList默认第一次插入元素时创建数组的大小为10；当向容器中添加元素时，如果容量不足，容器会自动增加50%的容量；最大容量为2147483639，如下：

/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

至于为什么是Integer.MAX_VALUE - 8上面注释写的很清楚。

ArrayList 相当于动态数组，其中最重要的两个属性分别是: elementData 数组，以及 size 大小。在调用 add() 方法的时候：

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

首先进行扩容校验。
将插入的值放到尾部，并将 size + 1 。

如果是调用 add(index,e) 在指定位置添加的话：

public void add(int index, E element) {
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    //复制，向后移动
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

也是首先扩容校验。
接着对数据进行复制，目的是把 index 位置空出来放本次插入的数据，并将后面的数据向后移动一个位置。

其实扩容最终调用的代码:

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

也是一个数组复制的过程。

由此可见 ArrayList 的主要消耗是数组扩容以及在指定位置添加数据，在日常使用时最好是指定大小，尽量减少扩容。更要减少在指定位置插入数据的操作。

数组复制

ArrayList的实现中大量地调用了Arrays.copyof()和System.arraycopy()方法。在此介绍一下这两个方法。
System.arraycopy()方法是一个native方法，调用了系统的C/C++代码，在openJDK中可以看到其源码。该方法最终调用了C语言的memmove()函数，因此它可以保证同一个数组内元素的正确复制和移动，比一般的复制方法的实现效率要高很多，很适合用来批量处理数组。Java强烈推荐在复制大量数组元素时使用该方法，以取得更高的效率。
Arrays.copyOf()方法实际上是在其内部创建了一个类型为newType、长度为newLength的新数组，调用System.arraycopy()方法，将原来数组中的元素复制到新的数组中。

序列化

由于 ArrayList 是基于动态数组实现的，所以并不是所有的空间都被使用。因此使用了 transient 修饰，可以防止被自动序列化。

transient Object[] elementData;

因此 ArrayList 自定义了序列化与反序列化：

private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();

    // Write out size as capacity for behavioural compatibility with clone()
    s.writeInt(size);

    // Write out all elements in the proper order.
    //只序列化了被使用的数据
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }

    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}

private void readObject(java.io.ObjectInputStream s)
    throws java.io.IOException, ClassNotFoundException {
    elementData = EMPTY_ELEMENTDATA;

    // Read in size, and any hidden stuff
    s.defaultReadObject();

    // Read in capacity
    s.readInt(); // ignored

    if (size > 0) {
        // be like clone(), allocate array based upon size not capacity
        ensureCapacityInternal(size);

        Object[] a = elementData;
        // Read in all elements in the proper order.
        for (int i=0; i<size; i++) {
            a[i] = s.readObject();
        }
    }
}

当对象中自定义了 writeObject 和 readObject 方法时，JVM 会调用这两个自定义方法来实现序列化与反序列化。

从实现中可以看出 ArrayList 只序列化了被使用的数据。

二、LinkedList

LinkedList与ArrayList一样也实现了List接口，LinkedList使用双向链表实现，允许存储元素重复，链表与ArrayList的数组实现相比，在进行插入和删除操作时效率更高，但查找操作效率更低，因此在实际使用中应根据自己的程序计算需求来从二者中取舍。
与ArrayList一样，LinkedList也是非线程安全的。

底层实现

java.util.LinkedList的继承关系如下：


publicclassLinkedList<E>extendsAbstractSequentialList<E>implementsList<E>,Deque<E>,Cloneable,java.io.Serializable

LinkedList继承自抽象类AbstractSequenceList，其实AbstractSequenceList已经实现了List接口，LinkedList除了实现了List接口外，还实现了Deque接口，是可以在两端插入和移动数据的线性数据结构，我们熟知的栈和队列皆可以通过实现Deque接口来实现。因此在LinkedList的实现中，除了提供了列表相关的方法如add()、remove()等，还提供了栈和队列的pop()、peek()、poll()、offer()等相关方法。
LinkedList要想找到index对应位置的元素，必须要遍历整个列表，在源码实现中已经使用了二分查找的方法来进行优化，但查找元素的开销依然很大，并且与查找的位置有关。相比较ArrayList的常数级时间的消耗而言，差距很大。

三、Vector

Vector与ArrayList的实现基本相同，它们底层都是基于Object数组实现的，两者最大的区别在于ArrayList是非线程安全的，而Vector是线程安全的。
Vector与ArrayList还有一处细节上的不同，那就是Vector进行添加操作时，如果列表容量不够需要扩容，每次增加的大小是原来的100%，而前面已经讲过，ArrayList一次只增加原有容量的50%。
Vector类内部的大部分方法与ArrayList均相同，区别仅在于加上了synchronized关键字，保证了同一时刻只有一个线程能够写Vector，避免多线程同时写而引起的不一致性，但实现同步需要很高的花费，因此，访问Vector比访问ArrayList要慢。

以下是add() 方法：

public synchronized boolean add(E e) {
    modCount++;
    ensureCapacityHelper(elementCount + 1);
    elementData[elementCount++] = e;
    return true;
}

以及指定位置插入数据add(index,e)方法:

public void add(int index, E element) {
    insertElementAt(element, index);
}
public synchronized void insertElementAt(E obj, int index) {
    modCount++;
    if (index > elementCount) {
        throw new ArrayIndexOutOfBoundsException(index
                                                 + " > " + elementCount);
    }
    ensureCapacityHelper(elementCount + 1);
    System.arraycopy(elementData, index, elementData, index + 1, elementCount - index);
    elementData[index] = obj;
    elementCount++;
}

LinkedList和ArrayList的区别

LinkedList和ArrayList的差别主要来自于ArrayList和LinkedList数据结构的不同：

因为ArrayList是基于索引(index)的数据结构，它使用索引在数组中搜索和读取数据是很快的。ArrayList获取数据的时间复杂度是O(1),但是要删除数据却是开销很大的，因为这需要重排数组中的所有数据。
相对于ArrayList，LinkedList插入是更快的。因为LinkedList不像ArrayList一样，不需要改变数组的大小，也不需要在数组装满的时候要将所有的数据重新装入一个新的数组，这是ArrayList最坏的一种情况，时间复杂度是O(n)，而LinkedList中插入或删除的时间复杂度仅为O(1)。ArrayList在插入数据时还需要更新索引（除了插入数组的尾部）。
对于新增和删除操作add和remove，LinedList比较占优势，因为ArrayList要移动数据。
LinkedList需要更多的内存，因为ArrayList的每个索引的位置是实际的数据，而LinkedList中的每个节点中存储的是实际的数据和前后节点的位置。
ArrayList 遍历的时候选择for循环，LinkedList遍历的时候选择Iterator迭代器。

Vector 和ArrayList 的区别

1、当更多的元素被添加的时候，Vector和ArrayList需要更多的空间。Vector每次扩容会增加一倍的空间，而ArrayList增加50%。
2、Vector的方法加了synchronized, 而ArrayList则没有。Vector属于线程安全级别的，但是大多数情况下不使用Vector，因为线程安全需要更大的系统开销。