How HashMap Generates Hash Values

2017-06-30 snail_knight

The original article explains HashMap as of JDK 1.6:
http://www.jianshu.com/p/8b372f3a195d/
That article does not explain in much detail how the hash value is generated.

    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.  Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)  So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

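The listing above is taken from HashMap in JDK 1.8 and shows how the hash is generated. Before analyzing it, note that Object.hashCode() is a native method; this article does not go into the native source. Link to the source:
The steps are:
First, get the hashCode of the key that was passed in; call it var1.
Then unsigned-right-shift var1 by 16 bits. An int is four bytes, 32 bits in total, so the shift moves the high 16 bits into the low 16 positions; call the result var2.
Finally, XOR var1 with var2 (var1 ^ var2) and return the result as the final hash value.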
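The difference between the unsigned shift (>>>) and the signed shift (>>) only shows up for negative hash codes. The following is a minimal sketch to illustrate it; the class name UnsignedShiftDemo and the sample value 0xFFFF0001 are made up for this example and do not come from the JDK source.

```java
class UnsignedShiftDemo {
    // Moves the high 16 bits of h into the low 16 positions, filling with zeros.
    static int high16(int h) {
        return h >>> 16;
    }

    // Same transform as the body of HashMap.hash(Object), applied to an int.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int h = 0xFFFF0001; // hypothetical negative hash code

        // Unsigned shift: the vacated high bits are filled with zeros.
        System.out.println(Integer.toHexString(high16(h)));  // ffff
        // Signed shift: the sign bit is copied into the vacated high bits.
        System.out.println(Integer.toHexString(h >> 16));    // ffffffff
        // XOR keeps the high 16 bits and mixes them into the low 16 bits.
        System.out.println(Integer.toHexString(spread(h)));  // fffffffe
    }
}
```

Because >>> fills with zeros, the XOR leaves the high 16 bits of the hash unchanged and only alters the low 16 bits.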

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.ToString;

import java.util.HashMap;

/**
 * Created by yanghuanqing@wdai.com on 30/06/2017.
 */
public class HashMapTest {
    public static void main(String[] args) {


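        // Built-in hash code of the string "yang"
        int i = "yang".hashCode();
        System.out.println(i);

        // The unsigned right shift moves the high 16 bits into the low 16 positions
        System.out.println(i + ">>>16 =" + (i >>> 16));
        // XOR the hash code with its shifted copy, as HashMap.hash() does
        int off = i ^ (i >>> 16);
        System.out.println(off);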
    }
}

// Without lombok, write the setters, getters, and constructor by hand
@Data
@AllArgsConstructor
@ToString
class  Student{

    private String name;
    private int age;
}

3701441 // built-in hash code of "yang"
3701441>>>16 =56 // value after the shift
3701497 // value after the XOR
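To see why the XOR step is worth doing, consider hash codes that differ only in their high bits. The sketch below is an illustration under assumptions: the class name HashSpreadDemo, the helper names spread and indexFor, and the sample values 0x10000 and 0x20000 are all made up for this example. Only the two expressions h ^ (h >>> 16) and (n - 1) & hash are taken from the JDK 1.8 source.

```java
class HashSpreadDemo {
    // Same transform as the body of HashMap.hash(Object), applied to an int.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;                    // default table size
        int a = 0x10000, b = 0x20000;  // hypothetical hash codes, equal in the low 16 bits

        // Without spreading, both land in bucket 0 and collide.
        System.out.println(indexFor(a, n) + " " + indexFor(b, n));                  // 0 0
        // With spreading, the high bits influence the index.
        System.out.println(indexFor(spread(a), n) + " " + indexFor(spread(b), n));  // 1 2
        // The value from the example above.
        System.out.println(spread("yang".hashCode()));                              // 3701497
    }
}
```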

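In addition, here is the algorithm String uses to generate its hash code. From the Javadoc:
Returns a hash code for this string. The hash code for a String object is computed as
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation.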

String.class

 public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
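The loop above evaluates the polynomial from the Javadoc with Horner's rule. As a quick check, the sketch below recomputes the hash of "yang" both ways; the class name StringHashDemo and the method name polyHash are made up for this example.

```java
class StringHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] in int arithmetic, one character at a time.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'y' = 121, 'a' = 97, 'n' = 110, 'g' = 103; 31^3 = 29791, 31^2 = 961
        int expanded = 121 * 29791 + 97 * 961 + 110 * 31 + 103;

        System.out.println(expanded);            // 3701441
        System.out.println(polyHash("yang"));    // 3701441
        System.out.println("yang".hashCode());   // 3701441
    }
}
```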

Supplement: the HashMap put operation

[Figure: image.png]

// The first parameter is hash(key): the key's hashCode() with its high 16 bits XORed into the low 16 bits
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            // On the first insertion, resize() returns a table with the default capacity of 16;
            // resize() is also the key function used when the table grows
            n = (tab = resize()).length;
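        // This step finds the bucket, i.e. the slot in the array shown in the figure above.
        // The index is masked by the capacity: with a capacity of 16, (n - 1) is 15, so only the
        // low 4 bits of the hash are kept and the index always falls in the range 0..15.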
        if ((p = tab[i = (n - 1) & hash]) == null)
            // If the bucket is empty, insert a new node
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            // Same hash, and either the same key reference or a non-null key that equals() the existing key
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
              
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
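                        // Once the chain grows beyond 8 nodes (TREEIFY_THRESHOLD), treeifyBin converts the
                        // bin to a red-black tree, or only resizes the table if it has fewer than 64 buckets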
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
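            // If an equal key already exists, its old value is replaced
            // (unless onlyIfAbsent is true and the old value is non-null)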
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
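        // modCount counts structural modifications to the map (not its size);
        // iterators use it to fail fast with ConcurrentModificationException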
        ++modCount;
        // size is the number of key-value mappings in the map, not the length of the array
        if (++size > threshold)
            // Once size exceeds threshold (capacity * load factor), move to a table twice as large
            resize();
        // Callback hook: a no-op in HashMap; LinkedHashMap overrides it and may remove its eldest entry
        afterNodeInsertion(evict);
        return null;
    }
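Two points from putVal can be checked from the outside: the index calculation (n - 1) & hash, and the return value when the key already exists. The sketch below is a minimal illustration; the class name PutDemo and the helper names spread and indexFor are made up for this example, while put, putIfAbsent, get, and size are the public java.util.Map methods.

```java
import java.util.HashMap;
import java.util.Map;

class PutDemo {
    // Same transform as the body of HashMap.hash(Object), applied to an int.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int hash = spread("yang".hashCode());            // 3701497
        // With n = 16 the mask keeps the low 4 bits, which equals hash mod 16.
        System.out.println(indexFor(hash, 16));          // 9
        System.out.println(Math.floorMod(hash, 16));     // 9

        Map<String, Integer> map = new HashMap<>();
        System.out.println(map.put("yang", 1));          // null: no previous mapping
        System.out.println(map.put("yang", 2));          // 1: old value returned and replaced
        System.out.println(map.putIfAbsent("yang", 3));  // 2: existing value kept
        System.out.println(map.get("yang"));             // 2
        System.out.println(map.size());                  // 1
    }
}
```

The mask is only equivalent to a modulo because the table length is always a power of two.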

Below is the lookup (get) process of the map

final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // The bucket has been located; if the first node did not match, walk the rest of the chain in order
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
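The chain walk in getNode matters when two different keys share a hash. The strings "Aa" and "BB" are a well-known pair with the same hashCode (65*31 + 97 and 66*31 + 66 are both 2112), so they always land in the same bucket. The sketch below is an illustration only; the class name GetDemo and the method name buildMap are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class GetDemo {
    // Builds a map holding two keys whose hash codes are identical.
    static Map<String, String> buildMap() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() + " " + "BB".hashCode()); // 2112 2112

        Map<String, String> map = buildMap();
        // Equal hashes, so equals() on the key decides which node is returned.
        System.out.println(map.get("Aa"));  // first
        System.out.println(map.get("BB"));  // second
        System.out.println(map.get("Ab"));  // null: key not present
        System.out.println(map.size());     // 2
    }
}
```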
