HashMap jdk 1.8源码分析

2019-04-17 本文已影响0人一只猪啊啊

首先放上参考的博客
https://blog.csdn.net/v123411739/article/details/78996181
jdk1.8之前的hashMap 是基于数组加链表的形式的，jdk1.8 oracleJdk优化了jdk的源码采用数组加链表或者数组加红黑树的形式在链表上挂的数据超过一定长度后就会转为红黑树。

我先搬上面博客的一点内容：
几个点：
先了解以下几个点，有利于更好的理解HashMap的源码和阅读本文。

头节点指的是table表上索引位置的节点，也就是链表的头节点。
根结点（root节点）指的是红黑树最上面的那个节点，也就是没有父节点的节点。
红黑树的根结点不一定是索引位置的头结点。
转为红黑树节点后，链表的结构还存在，通过next属性维持，红黑树节点在进行操作时都会维护链表的结构，并不是转为红黑树节点，链表结构就不存在了。
在红黑树上，叶子节点也可能有next节点，因为红黑树的结构跟链表的结构是互不影响的，不会因为是叶子节点就说该节点已经没有next节点。
源码中一些变量定义：如果定义了一个节点p，则pl为p的左节点，pr为p的右节点，pp为p的父节点，ph为p的hash值，pk为p的key值，kc为key的类等等。源码中很喜欢在if/for等语句中进行赋值并判断，请注意。
链表中移除一个节点只需如下图操作，其他操作同理。

红黑树在维护链表结构时，移除一个节点只需如下图操作（红黑树中增加了一个prev属性），其他操作同理。注：此处只是红黑树维护链表结构的操作，红黑树还需要单独进行红黑树的移除或者其他操作。

源码中进行红黑树的查找时，会反复用到以下两条规则：1）如果目标节点的hash值小于p节点的hash值，则向p节点的左边遍历；否则向p节点的右边遍历。2）如果目标节点的key值小于p节点的key值，则向p节点的左边遍历；否则向p节点的右边遍历。这两条规则是利用了红黑树的特性（左节点<根结点<右节点）。
源码中进行红黑树的查找时，会用dir（direction）来表示向左还是向右查找，dir存储的值是目标节点的hash/key与p节点的hash/key的比较结果。

HashMap的基本数据结构


/**

 * The default initial capacity - MUST be a power of two.

 */

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 默认容量16



/**

 * The maximum capacity, used if a higher value is implicitly specified

 * by either of the constructors with arguments.

 * MUST be a power of two <= 1<<30.

 */

static final int MAXIMUM_CAPACITY = 1 << 30; // 最大容量



/**

 * The load factor used when none specified in constructor.

 */

static final float DEFAULT_LOAD_FACTOR = 0.75f; // 默认负载因子0.75



/**

 * The bin count threshold for using a tree rather than list for a

 * bin. Bins are converted to trees when adding an element to a

 * bin with at least this many nodes. The value must be greater

 * than 2 and should be at least 8 to mesh with assumptions in

 * tree removal about conversion back to plain bins upon

 * shrinkage.

 */

static final int TREEIFY_THRESHOLD = 8; // 链表节点转换红黑树节点的阈值, 9个节点转



/**

 * The bin count threshold for untreeifying a (split) bin during a

 * resize operation. Should be less than TREEIFY_THRESHOLD, and at

 * most 6 to mesh with shrinkage detection under removal.

 */

static final int UNTREEIFY_THRESHOLD = 6; // 红黑树节点转换链表节点的阈值, 6个节点转



/**

 * The smallest table capacity for which bins may be treeified.

 * (Otherwise the table is resized if too many nodes in a bin.)

 * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts

 * between resizing and treeification thresholds.

 */

static final int MIN_TREEIFY_CAPACITY = 64; // 转红黑树时, table的最小长度



/* ---------------- Fields -------------- */



    /**

     * The table, initialized on first use, and resized as

     * necessary. When allocated, length is always a power of two.

     * (We also tolerate length zero in some operations to allow

     * bootstrapping mechanics that are currently not needed.)

     */

    transient Node<K,V>[] table; // node数组 也就是上面的 数组加链表的数组



    /**

     * Holds cached entrySet(). Note that AbstractMap fields are used

     * for keySet() and values().

     */

    transient Set<Map.Entry<K,V>> entrySet; // 内部key的set集合



    /**

     * The number of key-value mappings contained in this map.

     */

    transient int size; //map大小



    /**

     * The number of times this HashMap has been structurally modified

     * Structural modifications are those that change the number of mappings in

     * the HashMap or otherwise modify its internal structure (e.g.,

     * rehash). This field is used to make iterators on Collection-views of

     * the HashMap fail-fast. (See ConcurrentModificationException).

     */

    transient int modCount; //操作此时



    /**

     * The next size value at which to resize (capacity * load factor).

     * 要调整大小的下一个大小值（容量*加载因子）。

     * @serial

     */

    // (The javadoc description is true upon serialization.

    // Additionally, if the table array has not been allocated, this

    // field holds the initial array capacity, or zero signifying

    // DEFAULT_INITIAL_CAPACITY.)

    int threshold; 



    /**

     * The load factor for the hash table.

     * 负载因子

     * @serial

     */

    final float loadFactor;



    /* ---------------- Public operations -------------- */



/**

 * Basic hash bin node, used for most entries. (See below for

 * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)

 */

static class Node<K,V> implements Map.Entry<K,V> { // 基本hash节点, 继承自Entry

    final int hash;

    final K key;

    V value;

    Node<K,V> next;



    Node(int hash, K key, V value, Node<K,V> next) {

        this.hash = hash;

        this.key = key;

        this.value = value;

        this.next = next;

    }



    public final K getKey() { return key; }

    public final V getValue() { return value; }

    public final String toString() { return key + "=" + value; }



    public final int hashCode() {

        return Objects.hashCode(key) ^ Objects.hashCode(value);

    }



    public final V setValue(V newValue) {

        V oldValue = value;

        value = newValue;

        return oldValue;

    }



    public final boolean equals(Object o) {

        if (o == this)

            return true;

        if (o instanceof Map.Entry) {

            Map.Entry<?,?> e = (Map.Entry<?,?>)o;

            if (Objects.equals(key, e.getKey()) &&

                Objects.equals(value, e.getValue()))

                return true;

        }

        return false;

    }

}



/**

 * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn

 * extends Node) so can be used as extension of either regular or

 * linked node.

 */

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {// 红黑树节点

    TreeNode<K,V> parent; // red-black tree links

    TreeNode<K,V> left;

    TreeNode<K,V> right;

    TreeNode<K,V> prev; // needed to unlink next upon deletion

    boolean red;

    TreeNode(int hash, K key, V val, Node<K,V> next) {

        super(hash, key, val, next);

    }

}

定位哈希桶数组索引位置
不管增加、删除、查找键值对，定位到哈希桶数组的位置都是很关键的第一步。前面说过HashMap的数据结构是“数组+链表+红黑树”的结合，所以我们当然希望这个HashMap里面的元素位置尽量分布均匀些，尽量使得每个位置上的元素数量只有一个，那么当我们用hash算法求得这个位置的时候，马上就可以知道对应位置的元素就是我们要的，不用遍历链表/红黑树，大大优化了查询的效率。HashMap定位数组索引位置，直接决定了hash方法的离散性能。下面是定位哈希桶数组的源码：


// 代码1

static final int hash(Object key) { // 计算key的hash值

    int h;

    // 1.先拿到key的hashCode值; 2.将hashCode的高16位参与运算

    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);

}

// 代码2

int n = tab.length;

// 将(tab.length - 1) 与 hash值进行&运算

int index = (n - 1) & hash;

整个过程本质上就是三步：
拿到key的hashCode值
将hashCode的高位参与运算，重新计算hash值
将计算出来的hash值与(table.length - 1)进行&运算
代码1 是hashMap的核心方法，代码2 和代码3是 get 和 put 时定位当前node 的位置也就是上面定义的 Node[] table 的索引值。
这位老哥写的太好我都不知道能补充啥。

方法解读：

对于任意给定的对象，只要它的hashCode()返回值相同，那么计算得到的hash值总是相同的。我们首先想到的就是把hash值对table长度取模运算，这样一来，元素的分布相对来说是比较均匀的。

但是模运算消耗还是比较大的，我们知道计算机比较快的运算为位运算，因此JDK团队对取模运算进行了优化，使用上面代码2的位与运算来代替模运算。这个方法非常巧妙，它通过 “(table.length -1) & h” 来得到该对象的索引位置，这个优化是基于以下公式：x mod 2^n = x & (2^n - 1)。我们知道HashMap底层数组的长度总是2的n次方，并且取模运算为“h mod table.length”，对应上面的公式，可以得到该运算等同于“h & (table.length - 1)”。这是HashMap在速度上的优化，因为&比%具有更高的效率。

在JDK1.8的实现中，还优化了高位运算的算法，将hashCode的高16位与hashCode进行异或运算，主要是为了在table的length较小的时候，让高位也参与运算，并且不会有太大的开销。

下图是一个简单的例子，table长度为16：

膜拜之情油然而生！
下面就来看下get方法


public V get(Object key) {

    Node<K,V> e;

    return (e = getNode(hash(key), key)) == null ? null : e.value;

}



final Node<K,V> getNode(int hash, Object key) {

    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;

    // table不为空 && table长度大于0 && table索引位置(根据hash值计算出)不为空

    if ((tab = table) != null && (n = tab.length) > 0 &&

        (first = tab[(n - 1) & hash]) != null) { // 这里的 tab[(n - 1) & hash] 就是取到 当前key在 Node数组中所在索引的位置

        if (first.hash == hash && // always check first node

            ((k = first.key) == key || (key != null && key.equals(k)))) 

            return first; // first的key等于传入的key则返回first对象

        if ((e = first.next) != null) { // 向下遍历

            if (first instanceof TreeNode) // 判断是否为TreeNode

             // 如果是红黑树节点，则调用红黑树的查找目标节点方法getTreeNode

                return ((TreeNode<K,V>)first).getTreeNode(hash, key);

            // 走到这代表节点为链表节点

            do { // 向下遍历链表, 直至找到节点的key和传入的key相等时,返回该节点

                if (e.hash == hash &&

                    ((k = e.key) == key || (key != null && key.equals(k))))

                    return e;

            } while ((e = e.next) != null);

        }

    }

    return null; // 找不到符合的返回空

}

1.先对table进行校验，校验是否为空，length是否大于0
2.使用table.length - 1和hash值进行位与运算，得出在table上的索引位置，将该索引位置的节点赋值给first节点，校验该索引位置是否为空
3.检查first节点的hash值和key是否和入参的一样，如果一样则first即为目标节点，直接返回first节点
4.如果first的next节点不为空则继续遍历
5.如果first节点为TreeNode，则调用getTreeNode方法（见下文代码块1）查找目标节点
6.如果first节点不为TreeNode，则调用普通的遍历链表方法查找目标节点
7.如果查找不到目标节点则返回空
第一次看的时候一脸懵逼，但是等过了几个月看过了很多开源框架后，现在再来看觉得也就那样，所有还是要多看多学习，才能有进步。

这里对于查链表来说很简单，不需要解释，但是对于该索引上的Node 是一颗红黑树的情况来说就比较复杂了。


final TreeNode<K,V> getTreeNode(int h, Object k) {

 // 使用根结点调用find方法

    return ((parent != null) ? root() : this).find(h, k, null); 

}



 final TreeNode<K,V> root() {

            for (TreeNode<K,V> r = this, p;;) {

                // 如果一个节点的父为空那么他就是 根节点  上面的for是一个类似死循环的东西，只有当r为null时退出 而r又被每次循环赋值为r.parent

                if ((p = r.parent) == null)

                    return r;

                r = p;

            }

        }

然后就是 find方法了




/**

 * 从调用此方法的结点开始查找, 通过hash值和key找到对应的节点

 * 此处是红黑树的遍历, 红黑树是特殊的自平衡二叉查找树

 * 平衡二叉查找树的特点：左节点<根节点<右节点

 */

final TreeNode<K,V> find(int h, Object k, Class<?> kc) {    

    TreeNode<K,V> p = this; // this为调用此方法的节点

    do {

        int ph, dir; K pk;

        TreeNode<K,V> pl = p.left, pr = p.right, q;

        if ((ph = p.hash) > h) // 传入的hash值小于p节点的hash值, 则往p节点的左边遍历

            p = pl; // p赋值为p节点的左节点

        else if (ph < h) // 传入的hash值大于p节点的hash值, 则往p节点的右边遍历

            p = pr; // p赋值为p节点的右节点

        // 传入的hash值和key值等于p节点的hash值和key值,则p节点为目标节点,返回p节点

        else if ((pk = p.key) == k || (k != null && k.equals(pk))) 

            return p;

        else if (pl == null) // p节点的左节点为空则将向右遍历

            p = pr; 

        else if (pr == null) // p节点的右节点为空则向左遍历

            p = pl;

        else if ((kc != null ||

           // 如果传入的key(k)所属的类实现了Comparable接口,则将传入的key跟p节点的key比较

                  (kc = comparableClassFor(k)) != null) && // 此行不为空代表k实现了Comparable

                 (dir = compareComparables(kc, k, pk)) != 0)//k<pk则dir<0, k>pk则dir>0

            p = (dir < 0) ? pl : pr; // k < pk则向左遍历(p赋值为p的左节点), 否则向右遍历

        // 代码走到此处, 代表key所属类没有实现Comparable, 直接指定向p的右边遍历

        else if ((q = pr.find(h, k, kc)) != null)   

            return q;

        else// 代码走到此处代表上一个向右遍历（pr.find(h, k, kc)）为空, 因此直接向左遍历

            p = pl; 

    } while (p != null);

    return null;

}

接下来看 put方法


public V put(K key, V value) {

    return putVal(hash(key), key, value, false, true);

}

 

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,

               boolean evict) {

    Node<K,V>[] tab; Node<K,V> p; int n, i;

    // table是否为空或者length等于0, 如果是则调用resize方法进行初始化

    if ((tab = table) == null || (n = tab.length) == 0)

        n = (tab = resize()).length;    

    // 通过hash值计算索引位置, 如果table表该索引位置节点为空则新增一个

    if ((p = tab[i = (n - 1) & hash]) == null)// 将索引位置的头节点赋值给p

        tab[i] = newNode(hash, key, value, null);

    else { // table表该索引位置不为空

        Node<K,V> e; K k;

        if (p.hash == hash && // 判断p节点的hash值和key值是否跟传入的hash值和key值相等

            ((k = p.key) == key || (key != null && key.equals(k)))) 

            e = p; // 如果相等, 则p节点即为要查找的目标节点，赋值给e

        // 判断p节点是否为TreeNode, 如果是则调用红黑树的putTreeVal方法查找目标节点

        else if (p instanceof TreeNode) 

            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);

        else {  // 走到这代表p节点为普通链表节点

            for (int binCount = 0; ; ++binCount) { // 遍历此链表, binCount用于统计节点数

                if ((e = p.next) == null) { // p.next为空代表不存在目标节点则新增一个节点插入链表尾部

                    p.next = newNode(hash, key, value, null);

                    // 计算节点是否超过8个, 减一是因为循环是从p节点的下一个节点开始的

                    if (binCount >= TREEIFY_THRESHOLD - 1)

                        treeifyBin(tab, hash);// 如果超过8个，调用treeifyBin方法将该链表转换为红黑树

                    break;

                }

                if (e.hash == hash && // e节点的hash值和key值都与传入的相等, 则e即为目标节点,跳出循环

                    ((k = e.key) == key || (key != null && key.equals(k)))) 

                    break;

                p = e; // 将p指向下一个节点

            }

        }

        // e不为空则代表根据传入的hash值和key值查找到了节点,将该节点的value覆盖,返回oldValue

        if (e != null) { 

            V oldValue = e.value;

            if (!onlyIfAbsent || oldValue == null)

                e.value = value;

            afterNodeAccess(e); // 用于LinkedHashMap

            return oldValue;

        }

    }

    ++modCount;

    if (++size > threshold) // 插入节点后超过阈值则进行扩容

        resize();

    afterNodeInsertion(evict); // 用于LinkedHashMap

    return null;

}

首先来看resize方法动态扩容


final Node<K,V>[] resize() {

        // 新建一个 oldTab 指向 table

        Node<K,V>[] oldTab = table;

        // 获取旧的 数组大小和扩容阈值

        int oldCap = (oldTab == null) ? 0 : oldTab.length;

        int oldThr = threshold;

        // 新建新的 数组大小和阈值

        int newCap, newThr = 0;

        // 如果旧的 cap大于0

        if (oldCap > 0) {

            // 旧的大小已经超过 hashMap的最大长度

            if (oldCap >= MAXIMUM_CAPACITY) {

                // 阈值视为int最大值

                threshold = Integer.MAX_VALUE;

                return oldTab;

            }

            // 如果原来的oldCap * 2 小于最大值，且原来的 大小大于等于 16 那么新的 阈值 等于旧阈值 * 2

            // 这里如果 oldCap * 2 大于最大值那么 新阈值就不被赋值

            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&

                    oldCap >= DEFAULT_INITIAL_CAPACITY)

                newThr = oldThr << 1; // double threshold

        }

        // 如果 oldCap == 0 也就是没有初始化

        // newCap = 原先的阈值

        else if (oldThr > 0) // initial capacity was placed in threshold

            newCap = oldThr;

        // oldCap 和 oldThr 都为 0 那么就设置新的cap 为 default = 16 newThr = 16 * 0.75

        else { // zero initial threshold signifies using defaults

            newCap = DEFAULT_INITIAL_CAPACITY;

            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);

        }

        // 如果newThr == 0 也就是 上面 oldCap * 2 超过 最大长度限制

        if (newThr == 0) {

            //定义一个float

            float ft = (float)newCap * loadFactor;

            // 如果新的cap或者新的阈值都小于 int最大长度 那么就能转为int 否则 直接设置为最大值

            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?

                    (int)ft : Integer.MAX_VALUE);

        }

        // 设置类的新阈值

        threshold = newThr;

        // 接下来就是生成新的table 并且把数据 重新塞到新table中

        @SuppressWarnings({"rawtypes","unchecked"})

        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];

        table = newTab;

        // 如果是第一次初始化 这里的oldTab是null 直接返回新table即可

        if (oldTab != null) {

            // 进到这里说明就table 有信息就循环信息 循环的是数组的头结点

            for (int j = 0; j < oldCap; ++j) {

                Node<K,V> e;

                // 这里e就是头结点 并且头结点不为null

                if ((e = oldTab[j]) != null) {

                    // 原来的数据没有用了 置为null等待gc回收

                    oldTab[j] = null;

                    // 如果e没有next 说明该索引位置只有一个节点 直接 算hash桶 塞到新的table的头结点

                    if (e.next == null)

                        newTab[e.hash & (newCap - 1)] = e;

                    // 如果节点是红黑树

                    else if (e instanceof TreeNode)

                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);

                    // 到这里也就是链表不止一个值 处理该节点下的链表数据

                    else { // preserve order

                        Node<K,V> loHead = null, loTail = null;

                        Node<K,V> hiHead = null, hiTail = null;

                        Node<K,V> next;

                        do {

                            next = e.next;

                            /**

                             * e.hash & oldCap 判断用来计算hash值的高位是否为1。

                             * 这个1决定着节点在新的数组中的位置，这是因为数组扩容即 oldCap << 1 右移了一位

                             * 举个栗子，假设原数组容量为 4 （只要是2的幂就行），即： 100

                             * hash值为 5,(101) 的节点在原数组的下标为：

                             * 5 & (4-1)

                             * = 101

                             * & 011

                             * = 001;

                             * 在 数组容量为4的情况下 索引位置为 1

                             * 数组扩容后，容量为8 (1000), 节点在新数组下标为 ：

                             * 5&(8-1)

                             * = 101

                             * & 111

                             * = 101;

                             * 在 数组容量为8的情况下 索引位置为 5。

                             * 我们再来看一个例子:

                             * hash值为 1,(101) 的节点在原数组的下标为：

                             * 1 & (4-1)

                             * = 001

                             * & 011

                             * = 001;

                             * 在 数组容量为4的情况下 索引位置为 1

                             * 数组扩容后，容量为8 (1000), 节点在新数组下标为 ：

                             * 1&(8-1)

                             * = 001

                             * & 111

                             * = 001;

                             * 在 数组容量为8的情况下 索引位置为 1。

                             * 所以在扩容后 原链表的值 由于高位不同 就要分成两组 ，这时我们直接拿 2的幂 的容量大小 和 hash &

                             * 就能得到它在扩容2倍后是在原索引位置还是在 原索引位置+ 扩容数位置

                             * 新的下标正好等于 旧的下标 001 加上 旧容量 4(100) ,即 101.

                             * 上面是hash & oldCapacity = 1 的情况

                             * 当 hash & oldCapacity = 0 时自然容易算出新的下标和旧的下标相等。

                             */

                            // 由以上结论可知 e.hash & oldCap为1时 他就要位移到 原索引位置 + 扩容数的位置

                            // 为 0 时还在原位

                            if ((e.hash & oldCap) == 0) {

                                // 这里操作的是lo 也就是还在原索引位置的Node

                                // 如果 loTail为null 说明 位置为空 也就是第一次进来时

                                // 这时 loHead = e = loTail

                                // 第二次进来时 走else loHead.next = loTail.next = e

                                // 然后 loTail = e

                                // 第三次进来时 走else loHead.next.next = loTail.next = e

                                if (loTail == null)

                                    loHead = e;

                                // 否则 当前指向位置的next为e

                                else

                                    loTail.next = e;

                                // 当前指针下移一位

                                loTail = e;

                            }

                            // 这里同理 只是 操作的是 hi相关

                            else {

                                if (hiTail == null)

                                    hiHead = e;

                                else

                                    hiTail.next = e;

                                hiTail = e;

                            }

                        } while ((e = next) != null);

                        // 没有next 以后 newTab[j] = head

                        if (loTail != null) {

                            loTail.next = null;

                            newTab[j] = loHead;

                        }

                        // 高位 为1 的 到 新扩容的地方去

                        if (hiTail != null) {

                            hiTail.next = null;

                            newTab[j + oldCap] = hiHead;

                        }

                    }

                }

            }

        }

        return newTab;

    }

此方法初看前面比较简单但是到后面赋值给新 table的时候就会非常绕主要就是控制新生产的table 维持新的链表结构

重点理解 2的幂次长度的 hash & oldCap 和 hash & (oldCap -1 ) 的关系、灵活应用了扩容两倍后高位为1的hash值会跑到新的链表里也就是[j+oldCap]的索引处。

看完了resize()方法，我们来看链表转树的方法 treeifyBin


final void treeifyBin(Node<K,V>[] tab, int hash) {

        int n, index; Node<K,V> e;

        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)

            resize();

        else if ((e = tab[index = (n - 1) & hash]) != null) {

            TreeNode<K,V> hd = null, tl = null;

            // 这里其实只是构建了TreeNode 的节点 以及 构建了一个双向的链表 还没有转换成树

            do {

                // 这里其实就是简单的 new 了一个 TreeNode

                TreeNode<K,V> p = replacementTreeNode(e, null);

                // 第一次tl是null 就给 hd挂上 p

                if (tl == null)

                    hd = p;

                // 不是第一次 循环

                else {

                    // p的上一个是pl

                    p.prev = tl;

                    // tl的下一个是p

                    tl.next = p;

                }

                // 然后tl = p 游标下移 准备下一次循环

                tl = p;

            } while ((e = e.next) != null);

            // 如果有值 真正的转为 红黑树才开始

            if ((tab[index] = hd) != null)

                hd.treeify(tab);

        }

    }

这里其实就是给原先的Node节点绑上 prev属性和 next属性，原先node属性只有next属性,这个方法的逻辑很简单。最主要的还是看下面的treeify方法

这个才是真正转树的地方


final void treeify(Node<K,V>[] tab) {

            TreeNode<K,V> root = null;

            // 这里的 this 就是hd

            for (TreeNode<K,V> x = this, next; x != null; x = next) {

                // 开始遍历 链表

                next = (TreeNode<K,V>)x.next;

                // 清除 left 和 right （这里可能是因为 tree退化会链表时没有清除left 和 right）

                x.left = x.right = null;

                // 如果没有根节点 就 设置当前为 根节点

                if (root == null) {

                    x.parent = null;

                    x.red = false;

                    root = x;

                }

                else {

                    K k = x.key;

                    int h = x.hash;

                    Class<?> kc = null;

                    // 从 root 开始 找该节点 应该在树的 位置

                    for (TreeNode<K,V> p = root;;) {

                        int dir, ph;

                        K pk = p.key;

                        // 如果这次循环的 hash大于 x的hash dir - 1 说明要向左找子

                        if ((ph = p.hash) > h)

                            dir = -1;

                        //反之到右边找

                        else if (ph < h)

                            dir = 1;

                        // hash 值相等 比较key 值

                        else if ((kc == null && // 如果k没有实现Comparable接口 或者 x节点的key和p节点的key相等

                                (kc = comparableClassFor(k)) == null) ||

                                (dir = compareComparables(kc, k, pk)) == 0)

                            // 使用定义的一套规则来比较x节点和p节点的大小，用来决定向左还是向右查找

                            dir = tieBreakOrder(k, pk);

                        //知道了dir 继续向下

                        // 保留当前位置

                        TreeNode<K,V> xp = p;

                        // 当前p 根据dir 的规则 变成了 p的左 或者 右 并且这是左或者右为null 时 说明找到了 x的正确位置

                        if ((p = (dir <= 0) ? p.left : p.right) == null) {

                            // x的父节点就是 xp （这里p已经变成了 p的左或右 因为 p.parent 可能为 null 这里给x.parent 赋值）

                            x.parent = xp;

                            // 根据dir 给 xp 帮左子树还是右字数

                            if (dir <= 0)

                                xp.left = x;

                            else

                                xp.right = x;

                            // 重新构建平衡树 也就是把x 放到红黑树的正确位置

                            // 进行红黑树的插入平衡(通过左旋、右旋和改变节点颜色来保证当前树符合红黑树的要求)

                            root = balanceInsertion(root, x);

                            break;

                        }

                    }

                }

            }

            // 如果root节点不在table索引位置的头结点, 则将其调整为头结点

            moveRootToFront(tab, root);

        }

上面的步骤中给每一个TreeNode找到自己合适的位置，也就是balanceInsertion()方法之前的代码，定位当前TreeNode 属于左子树还是右子树。而balanceInsertion才是真正保证树是红黑树。

接下来balanceInsertion方法

然后是moveRootToFront方法，将root节点移至头结点


static <K,V> void moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) {

            int n;

            if (root != null && tab != null && (n = tab.length) > 0) {

                // hash 桶

                int index = (n - 1) & root.hash;

                //得到树的第一个节点 后面 tab[index] 会被赋值 这里做保留

                TreeNode<K,V> first = (TreeNode<K,V>)tab[index];

                // 如果 root 不等于 first

                if (root != first) {

                    Node<K,V> rn;

                    // 索引值头结点设置为root

                    tab[index] = root;

                    // 维护 链表结构

                    // 拿到上一个节点

                    TreeNode<K,V> rp = root.prev;

                    // 这里相当于是 rn ->rp rp -> rn 把 root给解放出来 并且挂在first前面

                    // 由于原来first 前就是null 所有不影响，把root移除也不影响原先的链表结构

                    // rn的上一个是rp

                    if ((rn = root.next) != null)

                        ((TreeNode<K,V>)rn).prev = rp;

                    // rp的下一个是rn

                    if (rp != null)

                        rp.next = rn;

                    // first的上一个是 root

                    if (first != null)

                        first.prev = root;

                    // root的下一个是 first

                    root.next = first;

                    // root的前一个是null

                    root.prev = null;

                    //到这就重新把first加入链表

                }

                assert checkInvariants(root);

            }

        }

这个过程也是很简单就是如果原先头结点不是root 就把原来的first 转为 root的下一个然后把root 放到头结点。

接下来我们来看 putAll方法


public void putAll(Map<? extends K, ? extends V> m) {

        putMapEntries(m, true);

    }



final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {

        int s = m.size();

        if (s > 0) {

            if (table == null) { // pre-size

                // 重新计算需要的数组长度

                float ft = ((float)s / loadFactor) + 1.0F;

                int t = ((ft < (float)MAXIMUM_CAPACITY) ?

                        (int)ft : MAXIMUM_CAPACITY);

                // 如果 大于了阈值 重新计算 阈值

                if (t > threshold)

                    threshold = tableSizeFor(t);

            }

            // 如果长度大于阈值 直接 扩容

            else if (s > threshold)

                resize();

            // 其实 就是putVal

            for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {

                K key = e.getKey();

                V value = e.getValue();

                putVal(hash(key), key, value, false, evict);

            }

        }

    }

主要的方法就是 putMapEntries 其实他最终调用的其实就是 putVal方法。然后就是一些计算容量扩容的东西了。

接下来我们来看containsKey方法


// 直接getNode

public boolean containsKey(Object key) {

        return getNode(hash(key), key) != null;

    }



 final Node<K,V> getNode(int hash, Object key) {

        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;

        if ((tab = table) != null && (n = tab.length) > 0 &&

                // hash桶 拿到头结点

                (first = tab[(n - 1) & hash]) != null) {

            // 如果头结点就是 目标节点 直接返回 这里其实利用了 无论是Node还是TreeNode 第一个节点 都是可以直接判断的

            if (first.hash == hash && // always check first node

                    ((k = first.key) == key || (key != null && key.equals(k))))

                return first;

            // 如果头结点不是目标 就next

            if ((e = first.next) != null) {

                // 树

                if (first instanceof TreeNode)

                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);

                // 链表

                do {

                    // 循环 直到找到了为止

                    if (e.hash == hash &&

                            ((k = e.key) == key || (key != null && key.equals(k))))

                        return e;

                } while ((e = e.next) != null);

            }

        }

        return null;

    }



// 这里其实调用的TreeNode的getTreeNode的  这个方法之前已经说过了

final TreeNode<K,V> getTreeNode(int h, Object k) {

            return ((parent != null) ? root() : this).find(h, k, null);

        }

接着讲remove方法


 public V remove(Object key) {

        Node<K,V> e;

        return (e = removeNode(hash(key), key, null, false, true)) == null ?

                null : e.value;

    }



final Node<K,V> removeNode(int hash, Object key, Object value,

                               boolean matchValue, boolean movable) {

        Node<K,V>[] tab; Node<K,V> p; int n, index;

        if ((tab = table) != null && (n = tab.length) > 0 &&

                // 又是hash桶 hash桶真是无处不在啊

                (p = tab[index = (n - 1) & hash]) != null) {

            Node<K,V> node = null, e; K k; V v;

            //其实这里就是getNode 的逻辑 找到需要删除的node

            if (p.hash == hash &&

                    ((k = p.key) == key || (key != null && key.equals(k))))

                node = p;

            else if ((e = p.next) != null) {

                if (p instanceof TreeNode)

                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);

                else {

                    do {

                        if (e.hash == hash &&

                                ((k = e.key) == key ||

                                        (key != null && key.equals(k)))) {

                            node = e;

                            break;

                        }

                        p = e;

                    } while ((e = e.next) != null);

                }

            }

            // 如果找到了node

            if (node != null && (!matchValue || (v = node.value) == value ||

                    (value != null && value.equals(v)))) {

                // 如果是tree

                if (node instanceof TreeNode)

                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);

                // 如果是头结点 将头结点置为 下一个节点

                else if (node == p)

                    tab[index] = node.next;

                // 不是头结点 就 把p的next 指向node的next 这里其实p就是node的下一个节点 因为上面循环找node 的同时 p的指向一直在变

                else

                    p.next = node.next;

                ++modCount;

                --size;

                afterNodeRemoval(node);

                return node;

            }

        }

        return null;

    }

就像上面说的，有从链表转为树，就有从树退化为链表removeTreeNode方法是一个极其复杂的过程我还没看明白，这里想说一点之前我猜测要将left 和 right 置为null的原因就是因为退化为tree的时候我们的 left 和 right 并未被清空。


final Node<K,V> untreeify(HashMap<K,V> map) {

            Node<K,V> hd = null, tl = null;

            for (Node<K,V> q = this; q != null; q = q.next) {

                Node<K,V> p = map.replacementNode(q, null);

                if (tl == null)

                    hd = p;

                else

                    tl.next = p;

                tl = p;

            }

            return hd;

        }

看上去不是很难理解，接下来看一下clear 和 containsValue方法


 public void clear() {

        Node<K,V>[] tab;

        // 操作次数+1

        modCount++;

        if ((tab = table) != null && size > 0) {

            size = 0;

            // 循环 给tab置null 并没有直接 tab = new tab[];

            for (int i = 0; i < tab.length; ++i)

                tab[i] = null;

        }

    }



 public boolean containsValue(Object value) {

        Node<K,V>[] tab; V v;

        if ((tab = table) != null && size > 0) {

            // 两重循环 外层 遍历tab数组 内层遍历两边 这里并没有用到红黑树的特性

            // 所以如果 链表很长 会有性能的损耗，说实话我之前并不知道hashMap还有这个方法

            for (int i = 0; i < tab.length; ++i) {

                for (Node<K,V> e = tab[i]; e != null; e = e.next) {

                    if ((v = e.value) == value ||

                            (value != null && value.equals(v)))

                        return true;

                }

            }

        }

        return false;

    }

到这里我们基本的方法就看的差不多了剩下的红黑树的构建与维护以及1.8新增的几个函数式方法我还没看明白，后续补充吧。

HashMap jdk 1.8源码分析

猜你喜欢

热点阅读