Android基础知识-HashMap的Hash是什么？

2020-07-17 本文已影响0人费城的二鹏

首先，Hash 是哈希算法，可以将一个复杂的值计算成一个数字，通常情况下，不同值的哈希值是不同的。如果出现不同值的哈希值相同，就是出现了哈希冲突。

本文关注的重点是，HashMap 的 hash 方法源码：

/**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.
     * 计算 key.hashCode() 并且将高位与地位进行异或运算。
     * Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide.
     * 由于 map 的表格使用使用 2 的 n 次幂作为掩码，仅在掩码位之上发生变化的哈希集合频繁出现哈希碰撞。
     * (因为 map 使用数组存储数据，数组的长度总是 2 的 n 次幂，通常情况下数组的长度比较小，只在比较高位变化的 hash 与
     * 数组长度取模得到的结果总是相同的，所以需要将高位与低位异或运算，这样可以将高位的影响传递给低位，减少 hash 冲突)
     * (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)
     * （已知的示例中有一组 Float 键值，它们在小表中持有连续的整数。）
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
     * 更多的小于1的8位小数
     * So we apply a transform that spreads the impact of higher bits
     * downward.
     * 因此，我们应用了将高位向下影响的变换。
     * There is a tradeoff between speed, utility, and quality of bit-spreading.
     * 在速度，实用性和位扩展质量之间进行权衡
     * Because many common sets of hashes are already reasonably distributed
     * (so don't benefit from spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     * 因为很多常见的哈希集已经合理的分布（所以不能从传播中受益），而且我们使用了树结构处理箱子中大量的冲突
     * 我们仅使用移位后的位运算这种最廉价的方式减少系统损失，以合并最高位的影响，
     * 否则由于表范围的限制，这些位将不永远不会在索引计算中使用。
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

可以看到方法非常简单，只是使用 key.hashCode() 获取哈希值，然后与自己的高 16 位进行异或运算并返回。

此处插播一个没啥用的知识点，当 key 是 0 的时候，哈希值为 0，也就是说 HashMap 可以存储 key 为 null 的值，并且放在数组首位，与其他值的 key 使用起来区别不大。恭喜你，学到了一个没有用的知识点，可以面试的时候给面试官显摆一下。别问我为啥关注这个，因为面试官问过我这个[弱鸡]。

然后，可以看下 key.hashCode() 的源码。

public native int hashCode();

没错只有一行，它是个原生方法，具体实现我也不了解，这不是本文的重点。因为 hash 方法需要被频繁使用，put、get 等方法都会使用，所以它必须执行起来非常迅速，所以，这是使用原生方法的一个原因。其次，hash 方法的另一个需求就是尽量减少哈希冲突的情况，那个与自己高 16 位异或计算就是低成本的避免哈希冲突的一个思路，在我的塑料英语翻译中能看见详细的描述。

by 费城的二鹏 2020.07.16 长春

Android基础知识-HashMap的Hash是什么？

猜你喜欢

热点阅读