HashMap 大全

2020-08-31 本文已影响0人 NazgulSun

to be man know hashmap most

hashmap里面的数学知识《链表到红黑树转化的阈值为啥是 8》

我看到了泊松分布

* Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:

1）查找性能和存储之间的一个折中，有理有据，细节拉满。

2）为啥是红黑树？而不是平衡树
红黑树，在包含插入，删除，查询操作的时候，性能要好于avl。
avl 是严格的二叉查找，对于删除，修改会有很多的 rebalance炒作，当然查询更快。
rb-tree 是一个综合的性能折中。
红黑树的模型，需要另外的篇幅：通过好几个约束条件来保证，黑色节点的平衡，
从而不完美的平衡，有比较不错的查询速度。
- 黑色根节点
- 红色节点子节点都是黑色
- 从任一节点到其每个叶子的所有路径都包含相同数目的黑色节点

尽可能想要分散的hash函数

这个要从后面的寻址说起，
hash &（length-1）
length 的长度通常不会很长，比如16位的话，那么hash 值，如果超过16位呢？那么高位的数值就会丢失
就有可能造成，相对不均匀。例如：最极端的情况，就是一大堆 hash，就高位不一样，然后低位全部一样，
这个寻址后，就会造成分散不均匀。
hashmap在设计的时候，有很多这样的优化考虑。
所以，给定一个 key，他并没有只用 key.hashCode.
还做了一定的操作

static final int hash(Object key) {
     int h;
     return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

原理就是把hashcode与本身右移动16位之后，做一个与或，把高位16位的一些数值与低位16位搅合一下。

为啥size 都是2的 N次方。

1）利用到一个数学公司，当 len = 2^n次方的时候
hash% len = hash &(len-1).
要知道，位运算的性能比取莫要快很多。
2）在hashmap resize 的时候，保持这个特型的话，
可以让现有的元素非常快速的计算出 resize 之后的坐标，无需重新计算hash，也无需处理冲突。
一切都可以平移。
power-of-two masking，这个原理可以好好看看。

多线程相关（1）【fail fast】

   abstract class HashIterator {
        Node<K,V> next;        // next entry to return
        Node<K,V> current;     // current entry
        int expectedModCount;  // for fast-fail
        int index;             // current slot

        HashIterator() {
            expectedModCount = modCount;
           *// 构造iterator 的时候，就设置了一个 snapshot下的 modCount *
            Node<K,V>[] t = table;
            current = next = null;
            index = 0;
            if (t != null && size > 0) { // advance to first entry
                do {} while (index < t.length && (next = t[index++]) == null);
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Node<K,V> nextNode() {
            Node<K,V>[] t;
            Node<K,V> e = next;
            if (modCount != expectedModCount)
                **调用 next 的时候都会检查 modCount是否已经修改，否则抛出异常**
                throw new ConcurrentModificationException();
            if (e == null)
                throw new NoSuchElementException();
            if ((next = (current = e).next) == null && (t = table) != null) {
                do {} while (index < t.length && (next = t[index++]) == null);
            }
            return e;
        }

        public final void remove() {
            Node<K,V> p = current;
            if (p == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            current = null;
            K key = p.key;
            removeNode(hash(key), key, null, false, false);
           **如果使用自己的remove方法，会更新这个值，就不会抛出 mdfe**
            expectedModCount = modCount;
        }
    }

对于hashmap的foreach，以及key，value 等list 的foreach，都实现了fail-fast机制。

多线程相关（2） hashmap 为啥不是线性安全的？

https://www.cnblogs.com/developer_chan/p/10450908.html
HashMap是线程不安全的，其主要体现：
1.在jdk1.7中，在多线程环境下，扩容时会造成环形链或数据丢失。
扩容的时候采用链表头插法
2.在jdk1.8中，在多线程环境下，会发生数据覆盖的情况。

concurrentMap

1） 1.7 和1.8 的区别，各自的优劣
2）为什么不用segment
3） cas的场景，synchronized 和 reenterlock 的却别和优劣
4） CounterCell @sun.misc.Contended 高级特型的作用

HashMap 大全

to be man know hashmap most

hashmap里面的数学知识《链表到红黑树转化的阈值为啥是 8》

尽可能想要分散的hash函数

为啥size 都是2的 N次方。

多线程相关（1）【fail fast】

多线程相关（2） hashmap 为啥不是线性安全的？

concurrentMap

猜你喜欢

热点阅读

HashMap 大全

to be man know hashmap most

hashmap里面的数学知识《链表到红黑树转化的阈值为啥是 8》

尽可能想要分散的hash函数

为啥size 都是2的 N次方。

多线程相关 （1）【fail fast】

多线程相关（2） hashmap 为啥不是线性安全的？

concurrentMap

猜你喜欢

热点阅读

多线程相关（1）【fail fast】