初识 ArrayMap

2020-10-31 本文已影响0人阿策神奇

小菜在之前学习 SharedPreferences 源码时注意到，其数据存储主要用到了 ArrayMap，小菜在日常中对于 key-value 方式主要是 HashMap 居多，今天简单研究一下 ArrayMap；

ArrayMap

ArrayMap 是一种相较于 HashMap 具有更高内存效率的 key-value 对存储结构；ArrayMap 内部包括两个数组结构，分别是专门用来存储 HashCode 的 mHashes 和用来存储 key-value 的 Object 数组类型的 mArray；
ArrayMap 是非线程安全的；

源码分析

构造函数

public ArrayMap() {
    this(0, false);
}

public ArrayMap(int capacity) {
    this(capacity, false);
}

public ArrayMap(int capacity, boolean identityHashCode) {
    mIdentityHashCode = identityHashCode;
    if (capacity < 0) {
        mHashes = EMPTY_IMMUTABLE_INTS;
        mArray = EmptyArray.OBJECT;
    } else if (capacity == 0) {
        mHashes = EmptyArray.INT;
        mArray = EmptyArray.OBJECT;
    } else {
        allocArrays(capacity);
    }
    mSize = 0;
}

ArrayMap 提供了三种构造方法，分别为无参的默认构造函数，添加默认容积的构造函数和一个隐藏的构造函数；若指定容积大小则直接分配相应大小的内存；若是默认构造函数，默认为 0，会在添加数据时动态扩容；

元素查询

int indexOf(Object key, int hash) {
    final int N = mSize;
    // ====== TAG 01 ======
    int index = binarySearchHashes(mHashes, N, hash);
    if (index < 0) {
        return index;
    }
    if (key.equals(mArray[index<<1])) {
        return index;
    }
    // ====== TAG 02 ======
    for (end = index + 1; end < N && mHashes[end] == hash; end++) {
        if (key.equals(mArray[end << 1])) return end;
    }
    for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
        if (key.equals(mArray[i << 1])) return i;
    }
    return ~end;
}

ArrayMap 查找元素主要是通过 binarySearchHashes 二分查找方式来查找 index；若 HashCode 和 Key 均匹配则为要查找的 index；若只有 HashCode 相同但对象不同(即 HashCode 冲突)，则从当前对应的 index 向后和向前分别遍历查找；注意：采用二分查找，则说明 mHashes 数组是有序的；

元素添加

ArrayMap 添加元素的方式主要有 put 和 append 方式；

1. put()

public V put(K key, V value) {
    final int osize = mSize;
    final int hash;
    int index;
    if (key == null) {
        hash = 0;
        index = indexOfNull();
    } else {
        hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
        // ====== TAG 01 ====== 
        index = indexOf(key, hash);
    }
    if (index >= 0) {
        // ====== TAG 02 ====== 
        index = (index<<1) + 1;
        final V old = (V)mArray[index];
        mArray[index] = value;
        return old;
    }
    // ====== TAG 03 ====== 
    index = ~index;
    if (osize >= mHashes.length) {
        final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1)) : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);
        final int[] ohashes = mHashes;
        final Object[] oarray = mArray;
        // ====== TAG 04 ======
        allocArrays(n);
        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }
        // ====== TAG 05 ======
        if (mHashes.length > 0) {
            System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
            System.arraycopy(oarray, 0, mArray, 0, oarray.length);
        }
        freeArrays(ohashes, oarray, osize);
    }
    // ====== TAG 06 ======
    if (index < osize) {
        System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
        System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
    }

    mHashes[index] = hash;
    mArray[index<<1] = key;
    mArray[(index<<1)+1] = value;
    mSize++;
    return null;
}

小菜在源码处打了几个需要注意的 TAG；

TAG 01: 主要用于二分查找的方式，在 mHashes 数组中查找 HashCode 值相等的 Key；
TAG 02: 在 index > 0 即有对应 HashCode 相等的 Key 之后，更新其 Value 值；通过 index << 1 左移的方式相当于 index * 2 只是效率更高效，可多在实际项目中应用；
TAG 03: 在 index < 0 即没有对应 HashCode 相等的 Key，此时需要插入新数据；
TAG 04: 当需要扩容时，可采用 allocArrays() 方式分配更大的内存空间；且 ArrayMap 是非线程安全的，不可并行；
TAG 05: 通过 System.arraycopy 将老的数组数据拷贝到新的数组中，再通过 freeArrays() 释放老的数组内存；
TAG 06: 当需要插入的元素不在末尾时，拷贝完数据之后需要将 index 后移一位；

2. append()

public void append(K key, V value) {
    int index = mSize;
    final int hash = key == null ? 0 : (mIdentityHashCode ? System.identityHashCode(key) : key.hashCode());
    if (index >= mHashes.length) {
        throw new IllegalStateException("Array is full");
    }
    if (index > 0 && mHashes[index-1] > hash) {
        RuntimeException e = new RuntimeException("here");
        e.fillInStackTrace();
        // ====== TAG ======
        put(key, value);
        return;
    }
    mSize = index+1;
    mHashes[index] = hash;
    index <<= 1;
    mArray[index] = key;
    mArray[index+1] = value;
}

简单查看 append() 源码，与 put() 相比少了扩容和内存回收等步骤；其两者使用场景不同，append() 无需验证即可将元素追加到数组末尾的特殊快速路径，要求是数组必须足够大；当数据需要插入到数组中间时调用 put() 方式；

元素删除

public V remove(Object key) {
    final int index = indexOfKey(key);
    if (index >= 0) {
        return removeAt(index);
    }
    return null;
}

ArrayMap 可以通过 remove() 来删除固定元素；首先通过 indexOfKey 二分查找是否有对应要删除的元素，如果有通过 removeAt() 进行删除；

public V removeAt(int index) {
    final Object old = mArray[(index << 1) + 1];
    final int osize = mSize;
    final int nsize;
    // ====== TAG 01 ======
    if (osize <= 1) {
        final int[] ohashes = mHashes;
        final Object[] oarray = mArray;
        mHashes = EmptyArray.INT;
        mArray = EmptyArray.OBJECT;
        freeArrays(ohashes, oarray, osize);
        nsize = 0;
    } else {
        nsize = osize - 1;
        // ====== TAG 02 ======
        if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
            final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);
            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            allocArrays(n);

            if (index > 0) {
                System.arraycopy(ohashes, 0, mHashes, 0, index);
                System.arraycopy(oarray, 0, mArray, 0, index << 1);
            }
            if (index < nsize) {
                System.arraycopy(ohashes, index + 1, mHashes, index, nsize - index);
                System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1, (nsize - index) << 1);
            }
        } else {
            // ====== TAG 03 ======
            if (index < nsize) {
                System.arraycopy(mHashes, index + 1, mHashes, index, nsize - index);
                System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1, (nsize - index) << 1);
            }
            mArray[nsize << 1] = null;
            mArray[(nsize << 1) + 1] = null;
        }
    }
    mSize = nsize;
    return (V)old;
}

TAG 01: 当数组只有一个要删除的元素时，直接将 mHashes 和 mArray 置空并通过 freeArrays 释放内存即可；
TAG 02: 当数组内存大小大于 8 并且元素数量少于 1/3 空间大小时，通过 allocArrays 进行减少内存分配，将老数据拷贝到新的数组中，并清空老数据数组空间；这是 HashMap 不曾实现的；
TAG 03: 当删除其中一个元素时，需要将该元素之后的所有元素向前移动一位；

数组扩容

HashMap 直接以容积的 2 倍进行扩容，而 ArrayMap 数组扩容是分阶段扩容的；与基础数组长度有关；

final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1)) : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

当 osize >= 8 时，数组扩容为原来的 1.5 倍；其中 osize >> 1 相当于 osize / 2；
当 4 <= osize < 8 时，此时扩展为 8；
当 osize < 4 时，此时扩展为 4；

内存机制

// ArrayMap 扩容的最小容积(用于调整为相对节省空间方式)
private static final int BASE_SIZE = 4;
// 最大缓存数组数
private static final int CACHE_SIZE = 10;
// 用来缓存 BASE_SIZE = 4 的数组内容
static Object[] mBaseCache;
// mBaseCache 缓存的数量
static int mBaseCacheSize;
// 用来存储 BASE_SIZE * 2 = 8 的数组内容
static Object[] mTwiceBaseCache;
// mTwiceBaseCache 缓存的数量
static int mTwiceBaseCacheSize;

ArrayMap 为了避免频繁的创建和销毁，提供了 mBaseCache 和 mTwiceBaseCache 两个数组缓冲池，同时提供了 allocArrays 和 freeArrays 内存分配和释放的方法，两者相互对应，都通过缓冲池分配和释放内存；

private void allocArrays(final int size) {
    if (size == (BASE_SIZE*2)) {
        synchronized (ArrayMap.class) {
            if (mTwiceBaseCache != null) {
                final Object[] array = mTwiceBaseCache;
                mArray = array;
                mTwiceBaseCache = (Object[])array[0];
                mHashes = (int[])array[1];
                array[0] = array[1] = null;
                mTwiceBaseCacheSize--;
                return;
            }
        }
    } else if (size == BASE_SIZE) {
        ...
    }
    mHashes = new int[size];
    mArray = new Object[size<<1];
}

当需要分配内存大小为 BASE_SIZE 或 BASE_SIZE * 2 时，会先查看对应的缓存池中取出 mArray 和 mHashes；其方式是先将缓存池指向上一条缓存地址，将缓存池的第 array[1] 个元素赋值为 mHashes ，再把 mArray 的第 array[0] 和第 array[1] 个位置的数据置为 null；

private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
    if (hashes.length == (BASE_SIZE*2)) {
        synchronized (ArrayMap.class) {
            if (mTwiceBaseCacheSize < CACHE_SIZE) {
                array[0] = mTwiceBaseCache;
                array[1] = hashes;
                for (int i=(size<<1)-1; i>=2; i--) {
                    array[i] = null;
                }
                mTwiceBaseCache = array;
                mTwiceBaseCacheSize++;
            }
        }
    } else if (hashes.length == BASE_SIZE) {
        ...
    }
}

当内存需要释放时，释放大小为 BASE_SIZE 或 BASE_SIZE * 2 时，会将数组加入到缓冲池中；其方she式是先将原缓冲池和哈希数组分别指向 array 前两位，之后的元素全部置空，最后将缓存池的头部指向最新的数组位置；

注意事项

ArrayMap 不适合大量的数据结构，Google 建议在 1000 以内的数据量比较好；ArrayMap 内部采用了二分查找方式查询，时间复杂度 O(logN)，每次添加和删除元素都需要移动其后面的元素，速度相对较慢；而 HashMap 查找和删除时间复杂度为 O(1)；

ArrayMap 相对于 HashMap 增加了内存缓存策略，避免频繁创建对象而分配内存与 GC 操作，同时限制了缓冲池的上限(10 个)；与此同时，ArrayMap 还提供了灵活的扩容和缩容机制，这两种机制比 HashMap 更灵活且节省时间；

ArrayMap 还解决了 HashMap 稀疏数组的问题，相对占用内存更少；

案例尝试

ArrayMap map01 = new ArrayMap();
ArrayMap map02 = new ArrayMap(4);
// 元素添加
for (int i = 0;i < 6;i ++) {
    map01.put("index_"+(i+1), "map01.value_"+(i +1));
    map02.put("index_"+(i+1), "map02.value_"+(i +1));
}
for(int i = 0 ;i < map01.size();i ++){
    Log.e("ArrayMap 01 -->", map01.get("index_"+(i+1)).toString()+"==HashCode=="+map01.get("index_"+(i+1)).hashCode());
}
for(int i = 0 ;i < map02.size();i ++){
    Log.e("ArrayMap 02 -->", map02.get("index_"+(i+1)).toString()+"==HashCode=="+map02.get("index_"+(i+1)).hashCode());
}
// 元素删除
map01.remove("index_3");
// 元素修改
map01.put("index_4", "map01.value_4 changed!");
map02.remove("index_1");
map02.remove("index_2");
for(int i = 0 ;i < map01.size();i ++){
    if (map01.get("index_"+(i+1)) != null) {
        Log.e("ArrayMap 01 -->", map01.get("index_" + (i + 1)).toString() + "==HashCode==" + map01.get("index_" + (i + 1)).hashCode());
    }
}
for(int i = 0 ;i < map02.size();i ++){
    if (map02.get("index_"+(i+1)) != null) {
        Log.e("ArrayMap 02 -->", map02.get("index_" + (i + 1)).toString() + "==HashCode==" + map02.get("index_" + (i + 1)).hashCode());
    }
}

小菜对 ArrayMap 的源码了解还不够深入，与其他存储方式的横向对比也不够全面；如有错误请多多指导！

来源：阿策小和尚