A Complete Analysis of ArrayMap

2018-10-26, by 码上就说

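ArrayMap is a general-purpose key-value mapping data structure designed to be more memory efficient than a traditional HashMap, from which it differs considerably. It keeps its mappings in array data structures:
two arrays (an integer array holding the hash of each item, and an Object array holding the key/value pairs).
This avoids creating an extra object for every item put into the map, and it also controls the growth of these arrays aggressively.
Growing the arrays only requires copying the items in them, not rebuilding a hash map.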

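ArrayMap has the following characteristics:

ArrayMap is an Android-specific API aimed at mobile devices, so its main goal is memory efficiency.

ArrayMap is generally slower than a traditional HashMap, so it is not suited to large data sets: finding an entry takes a binary search, and adding or removing an entry also has to shift the contents of the arrays.

ArrayMap shrinks its arrays when items are removed.

ArrayMap is not thread-safe.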

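Outline

1.1 ArrayMap class structure

1.2 ArrayMap constructors

1.3 Adding data

1.4 Allocating array space

1.5 Releasing arrays

1.6 Removing a specific key

1.7 Removing the element at a specific index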

1.1 ArrayMap class structure

Map is the interface defined as java.util.Map. It declares the generic data-access methods such as put, remove and putAll, all of which ArrayMap has to implement.

public final class ArrayMap<K, V> implements Map<K, V> {
}

1.2 ArrayMap constructors

    public ArrayMap() {
        this(0, false);
    }
    public ArrayMap(int capacity) {
        this(capacity, false);
    }

    /** {@hide} */
    public ArrayMap(int capacity, boolean identityHashCode) {
        mIdentityHashCode = identityHashCode;
        if (capacity < 0) {
            mHashes = EMPTY_IMMUTABLE_INTS;
            mArray = EmptyArray.OBJECT;
        } else if (capacity == 0) {
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
        } else {
            allocArrays(capacity);
        }
        mSize = 0;
    }
    public ArrayMap(ArrayMap<K, V> map) {
        this();
        if (map != null) {
            putAll(map);
        }
    }

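The constructor ArrayMap(int capacity, boolean identityHashCode) is marked {@hide}, which means it is not part of the public SDK and is not normally called by app developers. It takes two parameters. capacity is the initial capacity of the ArrayMap. identityHashCode selects how the hash of a key is computed: with System.identityHashCode(key) when true, or with key.hashCode() when false. The two are not equivalent: System.identityHashCode ignores any hashCode() override, so two distinct keys that are equal according to equals() will generally get different identity hashes. System.identityHashCode also accepts null (it returns 0), but that makes no difference here, because put assigns hash 0 directly when key == null. identityHashCode is normally false.
The constructor starts by handling capacity < 0 and capacity == 0, assigning shared constant arrays in both cases. With capacity < 0 the map uses EMPTY_IMMUTABLE_INTS, which allocArrays later rejects with "ArrayMap is immutable". When capacity > 0 it calls allocArrays(capacity) [see 1.4 Allocating array space], discussed below. Finally mSize = 0; mSize is the number of entries currently stored in the ArrayMap.
The remaining constructor, public ArrayMap(ArrayMap<K, V> map), puts all entries of an existing ArrayMap into the new one. The effect resembles an assignment, with one difference: the new map gets its own arrays, and only the key and value references are copied.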
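To make the difference between the two hashing modes concrete, here is a minimal plain-Java sketch. The class name HashModeDemo and the helper hashOf are hypothetical; hashOf only mirrors the expression that put uses, it is not ArrayMap code.

```java
final class HashModeDemo {
    // Hypothetical helper mirroring the hash selection in ArrayMap.put:
    // a null key hashes to 0, otherwise the flag picks the hashing mode.
    static int hashOf(Object key, boolean identityHashCode) {
        if (key == null) {
            return 0;
        }
        return identityHashCode ? System.identityHashCode(key) : key.hashCode();
    }

    public static void main(String[] args) {
        String a = new String("key");
        String b = new String("key");
        // equals()-based hashing: equal strings always produce the same hash.
        System.out.println(hashOf(a, false) == hashOf(b, false));
        // Identity hashing: distinct objects usually, but not necessarily, differ.
        System.out.println(hashOf(a, true) == hashOf(b, true));
    }
}
```

With identity hashing, a lookup with b would typically miss an entry stored under a, even though a.equals(b) holds.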

1.3 Adding data

Once the ArrayMap has been constructed, data is added to it by calling the put method.

public V put(K key, V value) {
        final int osize = mSize;
        final int hash;
        int index;
        if (key == null) {
            hash = 0;
            index = indexOfNull();
        } else {
            hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
            index = indexOf(key, hash);
        }
        if (index >= 0) {
            index = (index<<1) + 1;
            final V old = (V)mArray[index];
            mArray[index] = value;
            return old;
        }

        index = ~index;
        if (osize >= mHashes.length) {
            final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1))
                    : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

            if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            allocArrays(n);

            if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                throw new ConcurrentModificationException();
            }

            if (mHashes.length > 0) {
                if (DEBUG) Log.d(TAG, "put: copy 0-" + osize + " to 0");
                System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
                System.arraycopy(oarray, 0, mArray, 0, oarray.length);
            }

            freeArrays(ohashes, oarray, osize);
        }

        if (index < osize) {
            if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (osize-index)
                    + " to " + (index+1));
            System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
            System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
        }

        if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
            if (osize != mSize || index >= mHashes.length) {
                throw new ConcurrentModificationException();
            }
        }
        mHashes[index] = hash;
        mArray[index<<1] = key;
        mArray[(index<<1)+1] = value;
        mSize++;
        return null;
    }

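This method is critical, so I will walk through it in detail.

The correspondence inside ArrayMap is shown in the figure below; once you have seen it, the structure becomes clear.
ArrayMap has two important fields, mHashes and mArray. mHashes stores the hash code of each key and mArray stores the key-value pairs; keep this firmly in mind. Entry i has its hash in mHashes[i], its key in mArray[2*i] and its value in mArray[2*i+1].

[Figure: ArrayMap structure diagram (ArrayMap原理图.jpg)]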
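The layout can also be written down in a few lines of plain Java. This is only an illustrative sketch: the class PairLayoutDemo and its sample data are made up, and the field names hashes and array stand in for mHashes and mArray.

```java
final class PairLayoutDemo {
    final int[] hashes;   // hashes[i] is the hash of entry i, in ascending order
    final Object[] array; // array[2*i] is the key and array[2*i + 1] the value of entry i
    final int size;       // number of entries actually in use

    PairLayoutDemo(int[] hashes, Object[] array, int size) {
        this.hashes = hashes;
        this.array = array;
        this.size = size;
    }

    Object keyAt(int index) {
        return array[index << 1];
    }

    Object valueAt(int index) {
        return array[(index << 1) + 1];
    }

    // Three entries stored in arrays with room for four.
    static PairLayoutDemo sample() {
        int[] hashes = { "a".hashCode(), "b".hashCode(), "c".hashCode(), 0 };
        Object[] array = { "a", 1, "b", 2, "c", 3, null, null };
        return new PairLayoutDemo(hashes, array, 3);
    }

    public static void main(String[] args) {
        PairLayoutDemo map = sample();
        for (int i = 0; i < map.size; i++) {
            System.out.println(map.hashes[i] + " -> " + map.keyAt(i) + "=" + map.valueAt(i));
        }
    }
}
```

No per-entry node object is allocated here, which is where the memory saving over HashMap comes from.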

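Adding an element matters because all of ArrayMap's growth happens here, so we need to understand the flow.
A simple flow chart is used to show how an element is added.
The key part is the content in the red box of that chart, which is explained in detail here: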

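osize is the number of elements stored in the ArrayMap.

On every put, the method checks whether the arrays need to be reallocated (osize >= mHashes.length). If so, the new capacity is computed by the statement below: BASE_SIZE (4) when osize is less than 4, BASE_SIZE*2 (8) when osize is at least 4 but less than 8, and osize + (osize >> 1), that is 1.5 times the current size, when osize is 8 or more.

final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1))
        : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);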
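A quick way to check the growth rule is to evaluate the same expression for a few sizes. The sketch below copies the expression from put into a hypothetical helper grownCapacity in a made-up class GrowthDemo.

```java
final class GrowthDemo {
    static final int BASE_SIZE = 4;

    // Same expression as in ArrayMap.put: 4, then 8, then 1.5x the current size.
    static int grownCapacity(int osize) {
        return osize >= (BASE_SIZE * 2) ? (osize + (osize >> 1))
                : (osize >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    public static void main(String[] args) {
        int[] sizes = { 0, 4, 8, 12, 18, 27 };
        for (int osize : sizes) {
            System.out.println(osize + " -> " + grownCapacity(osize));
        }
        // prints 0 -> 4, 4 -> 8, 8 -> 12, 12 -> 18, 18 -> 27, 27 -> 40
    }
}
```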

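At this point the temporary variables ohashes and oarray hold the old mHashes and mArray, because allocArrays(n) is about to replace them. That method is explained in [1.4 Allocating array space].

Once the new arrays are allocated, the old contents of mHashes and mArray are copied into them.

The old arrays are then released with freeArrays(ohashes, oarray, osize); see [1.5 Releasing arrays] for details.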

if (index < osize) {
    if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (osize-index)
            + " to " + (index+1));
    System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
    System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
}

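The insertion index is already known at this point, so the entries from index onward in both mHashes and mArray are shifted one slot to the right to make room for the entry about to be inserted.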
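The shift can be tried out in isolation. The following sketch assumes the arrays already have a free slot at the end; ShiftInsertDemo and insertAt are hypothetical names, and the two arraycopy calls follow the ones in put.

```java
final class ShiftInsertDemo {
    // Inserts one entry at 'index', shifting later entries one slot to the right.
    // Assumes hashes.length > size, i.e. there is room for one more entry.
    static void insertAt(int[] hashes, Object[] array, int size,
                         int index, int hash, Object key, Object value) {
        if (index < size) {
            // One int per entry in the hash array ...
            System.arraycopy(hashes, index, hashes, index + 1, size - index);
            // ... and two slots (key, value) per entry in the object array.
            System.arraycopy(array, index << 1, array, (index + 1) << 1, (size - index) << 1);
        }
        hashes[index] = hash;
        array[index << 1] = key;
        array[(index << 1) + 1] = value;
    }

    public static void main(String[] args) {
        int[] hashes = { 10, 30, 0, 0 };
        Object[] array = { "a", 1, "c", 3, null, null, null, null };
        insertAt(hashes, array, 2, 1, 20, "b", 2);
        System.out.println(java.util.Arrays.toString(hashes));
        System.out.println(java.util.Arrays.toString(array));
    }
}
```

System.arraycopy handles overlapping source and destination ranges within the same array, which is why the shift can be done in place.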

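The mHashes array is kept sorted. This is easy to see: binary search only makes sense on a sorted array. When the key is not found, the search returns the bitwise complement of the position where it should be inserted, which is why put recovers the insertion point with index = ~index.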
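The encoding of "not found" as a negative number is easy to verify with a small hand-written binary search. This is a sketch that is assumed to behave like the search helper ArrayMap relies on, not a copy of it; SortedHashSearchDemo and search are hypothetical names, and handling of several keys with the same hash is left out.

```java
final class SortedHashSearchDemo {
    // Returns the index of 'hash' within the first 'size' elements of the sorted
    // array, or the bitwise complement (~) of the insertion point if it is absent.
    static int search(int[] hashes, int size, int hash) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int midVal = hashes[mid];
            if (midVal < hash) {
                lo = mid + 1;
            } else if (midVal > hash) {
                hi = mid - 1;
            } else {
                return mid; // found
            }
        }
        return ~lo; // not found: always negative, and ~result is the insertion point
    }

    public static void main(String[] args) {
        int[] hashes = { 10, 20, 30, 0 }; // three used slots, one spare
        System.out.println(search(hashes, 3, 20));  // prints 1
        int miss = search(hashes, 3, 25);
        System.out.println(miss);                   // prints -3
        System.out.println(~miss);                  // prints 2
    }
}
```

Because ~lo is negative for every lo >= 0, a single int return value can carry both "found at index" and "insert here".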

Below is the flow chart for adding an element to ArrayMap, for a quick overview.

[Figure: ArrayMap put execution flow (ArrayMap-put执行流程.jpg)]

1.4 Allocating array space

This is the allocArrays(capacity) method already used in the constructor.

private void allocArrays(final int size) {
        if (mHashes == EMPTY_IMMUTABLE_INTS) {
            throw new UnsupportedOperationException("ArrayMap is immutable");
        }
        if (size == (BASE_SIZE*2)) {
            synchronized (ArrayMap.class) {
                if (mTwiceBaseCache != null) {
                    final Object[] array = mTwiceBaseCache;
                    mArray = array;
                    mTwiceBaseCache = (Object[])array[0];
                    mHashes = (int[])array[1];
                    array[0] = array[1] = null;
                    mTwiceBaseCacheSize--;
                    if (DEBUG) Log.d(TAG, "Retrieving 2x cache " + mHashes
                            + " now have " + mTwiceBaseCacheSize + " entries");
                    return;
                }
            }
        } else if (size == BASE_SIZE) {
            synchronized (ArrayMap.class) {
                if (mBaseCache != null) {
                    final Object[] array = mBaseCache;
                    mArray = array;
                    mBaseCache = (Object[])array[0];
                    mHashes = (int[])array[1];
                    array[0] = array[1] = null;
                    mBaseCacheSize--;
                    if (DEBUG) Log.d(TAG, "Retrieving 1x cache " + mHashes
                            + " now have " + mBaseCacheSize + " entries");
                    return;
                }
            }
        }

        mHashes = new int[size];
        mArray = new Object[size<<1];
    }

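The variables used in this method are:
BASE_SIZE: private static final int BASE_SIZE = 4; a constant fixed at 4.
mHashes: the hash array, which is kept sorted.
mArray: the array holding the keys and values, in the same order as the items in the hash array.
The four variables below implement the cache. When freeArrays runs, the entries of the released array are set to null, but the arrays themselves (of length 4 or 8) are kept in these variables so that the next allocArrays can reuse them quickly.
This is why 4 is a good initial capacity for an ArrayMap: arrays of that size (and of size 8) can be served from the cache instead of being freshly allocated.

mBaseCacheSize
mTwiceBaseCacheSize
mBaseCache
mTwiceBaseCache

When the requested size is neither 4 nor 8, or the matching cache is empty, the arrays are allocated directly:
mHashes = new int[size];
mArray = new Object[size<<1];
The length of mArray is twice the length of mHashes.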

1.5 Releasing arrays

private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
        if (hashes.length == (BASE_SIZE*2)) {
            synchronized (ArrayMap.class) {
                if (mTwiceBaseCacheSize < CACHE_SIZE) {
                    array[0] = mTwiceBaseCache;
                    array[1] = hashes;
                    for (int i=(size<<1)-1; i>=2; i--) {
                        array[i] = null;
                    }
                    mTwiceBaseCache = array;
                    mTwiceBaseCacheSize++;
                    if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
                            + " now have " + mTwiceBaseCacheSize + " entries");
                }
            }
        } else if (hashes.length == BASE_SIZE) {
            synchronized (ArrayMap.class) {
                if (mBaseCacheSize < CACHE_SIZE) {
                    array[0] = mBaseCache;
                    array[1] = hashes;
                    for (int i=(size<<1)-1; i>=2; i--) {
                        array[i] = null;
                    }
                    mBaseCache = array;
                    mBaseCacheSize++;
                    if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
                            + " now have " + mBaseCacheSize + " entries");
                }
            }
        }
    }

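In freeArrays, mTwiceBaseCache (and likewise mBaseCache) points to the most recently released array. Slot 0 of that array links to the previously cached array and slot 1 holds its hashes array, so the cache forms a linked list built out of the arrays themselves. Arrays of size 4 and 8 are kept this way so that allocArrays can reuse them directly, which improves efficiency. Each cache holds at most CACHE_SIZE arrays.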
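The caching trick of chaining released arrays through their own first two slots can be sketched as a tiny stand-alone pool. Everything here is illustrative: ArrayPoolDemo, Pair, release and acquire are hypothetical names, only the size-4 cache is modelled, and the CACHE_SIZE value of 10 is an assumption, since the excerpt above does not show its definition.

```java
final class ArrayPoolDemo {
    static final int BASE_SIZE = 4;
    static final int CACHE_SIZE = 10; // assumed limit, not shown in the excerpt

    static Object[] cache; // head of the chain of pooled Object[] arrays
    static int cacheSize;

    // Simple holder for the two arrays that belong together.
    static final class Pair {
        final int[] hashes;
        final Object[] array;

        Pair(int[] hashes, Object[] array) {
            this.hashes = hashes;
            this.array = array;
        }
    }

    // Pools the arrays if they have the cached size and the pool is not full.
    static synchronized boolean release(int[] hashes, Object[] array, int size) {
        if (hashes.length != BASE_SIZE || cacheSize >= CACHE_SIZE) {
            return false;
        }
        array[0] = cache;  // slot 0 links to the previous head of the chain
        array[1] = hashes; // slot 1 keeps the matching hash array
        for (int i = (size << 1) - 1; i >= 2; i--) {
            array[i] = null; // drop key/value references so they can be collected
        }
        cache = array;
        cacheSize++;
        return true;
    }

    // Hands out pooled arrays when available, otherwise allocates new ones.
    static synchronized Pair acquire() {
        if (cache != null) {
            Object[] array = cache;
            cache = (Object[]) array[0];
            int[] hashes = (int[]) array[1];
            array[0] = array[1] = null;
            cacheSize--;
            return new Pair(hashes, array);
        }
        return new Pair(new int[BASE_SIZE], new Object[BASE_SIZE << 1]);
    }
}
```

The pool needs no extra node objects: a released array pays for its own bookkeeping with two slots that would otherwise sit empty.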

1.6 Removing a specific key

public V remove(Object key) {
        final int index = indexOfKey(key);
        if (index >= 0) {
            return removeAt(index);
        }
        return null;
    }

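This ends up calling removeAt [1.7 Removing the element at a specific index].
When index >= 0, the key is definitely in the ArrayMap, so all that remains is to delete the corresponding entries from mHashes and mArray.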

1.7 Removing the element at a specific index

public V removeAt(int index) {
        final Object old = mArray[(index << 1) + 1];
        final int osize = mSize;
        final int nsize;
        if (osize <= 1) {
            // Now empty.
            if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
            freeArrays(ohashes, oarray, osize);
            nsize = 0;
        } else {
            nsize = osize - 1;
            if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
                // Shrunk enough to reduce size of arrays.  We don't allow it to
                // shrink smaller than (BASE_SIZE*2) to avoid flapping between
                // that and BASE_SIZE.
                final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);

                if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

                final int[] ohashes = mHashes;
                final Object[] oarray = mArray;
                allocArrays(n);

                if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                    throw new ConcurrentModificationException();
                }

                if (index > 0) {
                    if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
                    System.arraycopy(ohashes, 0, mHashes, 0, index);
                    System.arraycopy(oarray, 0, mArray, 0, index << 1);
                }
                if (index < nsize) {
                    if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(ohashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
            } else {
                if (index < nsize) {
                    if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(mHashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
                mArray[nsize << 1] = null;
                mArray[(nsize << 1) + 1] = null;
            }
        }
        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }
        mSize = nsize;
        return (V)old;
    }

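First the current number of elements is saved: osize = mSize;

If there is only one element, it is removed, mHashes and mArray are reset to the shared empty arrays, and the old arrays are released.

If there is more than one element:

if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3)
Recall the statement used when growing the arrays in put:
final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1))
        : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);
Growth is by a factor of 1.5, so a map whose capacity is above 8 while it holds fewer than a third of that many elements has clearly shrunk a lot, and the arrays are reduced. The new capacity n is osize + (osize >> 1) when osize is greater than 8, and 8 otherwise; the arrays never shrink below BASE_SIZE*2, to avoid flapping between 8 and 4.
allocArrays(n) is then called. ArrayMap is not thread-safe, so after the allocation the code compares mSize with the saved osize; if they differ, another thread must have modified the map in the meantime, and ConcurrentModificationException is thrown.
After that comes the actual removal: the entries before and after index are copied into the new arrays, leaving out the one at index.

else, in other words mHashes.length <= (BASE_SIZE*2) || mSize >= mHashes.length/3
This is the normal case: the entries after index are shifted one slot to the left and the last slot is cleared.

mSize = nsize; finally mSize is updated to the new element count.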
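The shrink decision in removeAt can be isolated into a small function and checked by hand. In this sketch ShrinkDemo and shrunkCapacity are hypothetical names, the return value -1 is an invented convention for "keep the current arrays", and the single-element case (osize <= 1) handled separately by removeAt is left out.

```java
final class ShrinkDemo {
    static final int BASE_SIZE = 4;

    // Returns the capacity to shrink to when one entry is removed from a map with
    // 'capacity' slots and 'osize' entries, or -1 if the arrays are kept as they are.
    static int shrunkCapacity(int capacity, int osize) {
        if (capacity > (BASE_SIZE * 2) && osize < capacity / 3) {
            // Same expression as in removeAt: never go below BASE_SIZE*2.
            return osize > (BASE_SIZE * 2) ? (osize + (osize >> 1)) : (BASE_SIZE * 2);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(shrunkCapacity(40, 12)); // prints 18
        System.out.println(shrunkCapacity(27, 8));  // prints 8
        System.out.println(shrunkCapacity(27, 9));  // prints -1
        System.out.println(shrunkCapacity(8, 2));   // prints -1
    }
}
```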

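This article leaves one open question for the next one. ArrayMap is not thread-safe, and when several threads operate on an ArrayMap it may throw ConcurrentModificationException, but the AOSP code does not cover every multithreaded case. The next article will analyse the problems ArrayMap has under multithreading. Stay tuned.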
