Java集合类源码探究

Arrays.equals与ArraysSupport.vect

2019-03-18  本文已影响0人  Ghamster

博客发表于:Ghamster Blog
转载请注明出处

概述

Arrays类的equals方法实现

jdk8中,数组判等的逻辑是:遍历数组中的元素,依次对每个元素判等
int[]为例,源码如下:

    public static boolean equals(int[] a, int[] a2) {
        if (a==a2)
            return true;
        if (a==null || a2==null)
            return false;

        int length = a.length;
        if (a2.length != length)
            return false;

        for (int i=0; i<length; i++)
            if (a[i] != a2[i])
                return false;

        return true;
    }

jdk11中,修改了所有用于基本数据类型的数组的equals方法,使用mismatch方法实现
int[]为例,源码如下:

    public static boolean equals(int[] a, int[] a2) {
        if (a==a2)
            return true;
        if (a==null || a2==null)
            return false;

        int length = a.length;
        if (a2.length != length)
            return false;

        return ArraysSupport.mismatch(a, a2, length) < 0;
    }

mismatch和vectorizedMismatch

为了探究jdk11equals的魔法,必须看看mismatch方法到底做了什么。mismatch方法也拥有各种基本数据类型数组的重载版本,以int[]版本为例

mismatch方法

mismatch方法返回给定的两个数组中,首个不相同元素的下标,当完全匹配(0到length-1)时,返回-1,源码如下:

    public static int mismatch(int[] a,
                               int[] b,
                               int length) {
        int i = 0;
        if (length > 1) {
            if (a[0] != b[0])
                return 0;
            i = vectorizedMismatch(
                    a, Unsafe.ARRAY_INT_BASE_OFFSET,
                    b, Unsafe.ARRAY_INT_BASE_OFFSET,
                    length, LOG2_ARRAY_INT_INDEX_SCALE);
            if (i >= 0)
                return i;
            i = length - ~i;
        }
        for (; i < length; i++) {
            if (a[i] != b[i])
                return i;
        }
        return -1;
    }

方法主要可以分为两个部分:

  1. 调用vectorizedMismatch方法,对两个数组对象匹配,若方法返回正值,则表示找到了未匹配的元素,返回值为该元素在数组中的索引;若方法返回负值,则表示未找到不相等元素,但需要注意的是,vectorizedMismatch方法可能没有处理完数组中的所有元素,对返回值取反,得到还需处理的数组元素个数
  2. vectorizedMismatch返回值取反,对剩余元素逐个验证

vectorizedMismatch

该方法标注了 @HotSpotIntrinsicCandidate 注解,即jvm可以有自己的优化实现方式,所以...懂吧...?

该方法的核心思想是以long数据格式对比对象(不一定是数组)中的数据,即每次循环处理数据长度为8个字节,而不考虑对象是long[]还是int[]short[]等。方法的源码如下:

@HotSpotIntrinsicCandidate
    public static int vectorizedMismatch(Object a, long aOffset,
                                         Object b, long bOffset,
                                         int length,
                                         int log2ArrayIndexScale) {
        // assert a.getClass().isArray();
        // assert b.getClass().isArray();
        // assert 0 <= length <= sizeOf(a)
        // assert 0 <= length <= sizeOf(b)
        // assert 0 <= log2ArrayIndexScale <= 3

        int log2ValuesPerWidth = LOG2_ARRAY_LONG_INDEX_SCALE - log2ArrayIndexScale;
        int wi = 0;
        for (; wi < length >> log2ValuesPerWidth; wi++) {
            long bi = ((long) wi) << LOG2_ARRAY_LONG_INDEX_SCALE;
            long av = U.getLongUnaligned(a, aOffset + bi);
            long bv = U.getLongUnaligned(b, bOffset + bi);
            if (av != bv) {
                long x = av ^ bv;
                int o = BIG_ENDIAN
                        ? Long.numberOfLeadingZeros(x) >> (LOG2_BYTE_BIT_SIZE + log2ArrayIndexScale)
                        : Long.numberOfTrailingZeros(x) >> (LOG2_BYTE_BIT_SIZE + log2ArrayIndexScale);
                return (wi << log2ValuesPerWidth) + o;
            }
        }

        // Calculate the tail of remaining elements to check
        int tail = length - (wi << log2ValuesPerWidth);

        if (log2ArrayIndexScale < LOG2_ARRAY_INT_INDEX_SCALE) {
            int wordTail = 1 << (LOG2_ARRAY_INT_INDEX_SCALE - log2ArrayIndexScale);
            // Handle 4 bytes or 2 chars in the tail using int width
            if (tail >= wordTail) {
                long bi = ((long) wi) << LOG2_ARRAY_LONG_INDEX_SCALE;
                int av = U.getIntUnaligned(a, aOffset + bi);
                int bv = U.getIntUnaligned(b, bOffset + bi);
                if (av != bv) {
                    int x = av ^ bv;
                    int o = BIG_ENDIAN
                            ? Integer.numberOfLeadingZeros(x) >> (LOG2_BYTE_BIT_SIZE + log2ArrayIndexScale)
                            : Integer.numberOfTrailingZeros(x) >> (LOG2_BYTE_BIT_SIZE + log2ArrayIndexScale);
                    return (wi << log2ValuesPerWidth) + o;
                }
                tail -= wordTail;
            }
            return ~tail;
        }
        else {
            return ~tail;
        }
    }

首先简单看一下方法的入参:

方法操作逻辑如下(逐行;接上节,以int[]为例):

性能测试

下面通过简单代码测试一下相对于jdk8jdk11是否有性能的提升,测试代码如下:

package main.test;

import java.util.Arrays;
import java.util.Random;

public class TestLegacyArrays {

    private static Random r = new Random();

    private static class LegacyArrays {
        private LegacyArrays() {}

        public static boolean equals(int[] a, int[] a2) {
            if (a==a2)
                return true;
            if (a==null || a2==null)
                return false;

            int length = a.length;
            if (a2.length != length)
                return false;

            for (int i=0; i<length; i++)
                if (a[i] != a2[i])
                    return false;

            return true;
        }
    }

    private static void test(int size, int fill){
        int[] a = new int[size];
        Arrays.fill(a, fill);
        int[] b = new int[size];
        Arrays.fill(b, fill);

        boolean isEqual;
        long now, finish;
        System.out.println("=> TestLegacyArrays: array size "+size);

        now = System.nanoTime();
        isEqual = LegacyArrays.equals(a,b);
        finish = System.nanoTime();
        System.out.println("  LegacyArrays: "+isEqual+" time: "+(finish-now));

        now = System.nanoTime();
        isEqual = Arrays.equals(a,b);
        finish = System.nanoTime();
        System.out.println("  Arrays: "+isEqual+" time: "+(finish-now));
        System.out.println();
    }

    public static void main(String[] args) {
        
        test(10, r.nextInt());
        test(100, r.nextInt());
        test(1000, r.nextInt());
        test(10000, r.nextInt());
        test(100000, r.nextInt());

    }
}

代码输出如下:

=> TestLegacyArrays: array size 10
  LegacyArrays: true time: 373700
  Arrays: true time: 193700

=> TestLegacyArrays: array size 100
  LegacyArrays: true time: 2000
  Arrays: true time: 10200

=> TestLegacyArrays: array size 1000
  LegacyArrays: true time: 15400
  Arrays: true time: 186500

=> TestLegacyArrays: array size 10000
  LegacyArrays: true time: 148900
  Arrays: true time: 286800

=> TestLegacyArrays: array size 100000
  LegacyArrays: true time: 1233700
  Arrays: true time: 2424000


Process finished with exit code 0

emmmmmmmmmmmmmmm...................
所以这次的更新,或许是出于流式计算或者多线程或者其他因素的考量,又或者我的测试方法有问题?。。。

上一篇 下一篇

猜你喜欢

热点阅读