

2020-08-10





public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

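The core of the String class is its final char array field, value. Because a final field can be assigned only once, most operations do not modify this array. Instead they use System.arraycopy to copy the characters into a new char array and return a new String built on it.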
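The effect of this design can be observed from the outside. The following is a minimal sketch (the class name ImmutabilityDemo and the helper derive are made up for illustration) showing that calls which appear to modify a string leave the receiver untouched and return new objects:

```java
public class ImmutabilityDemo {
    // Returns strings derived from the input; the input itself is never modified.
    static String[] derive(String original) {
        return new String[] {
            original.concat(" world"),
            original.replace('l', 'L'),
            original.substring(1)
        };
    }

    public static void main(String[] args) {
        String original = "hello";
        String[] derived = derive(original);
        System.out.println(original);   // hello (unchanged)
        System.out.println(derived[0]); // hello world
        System.out.println(derived[1]); // heLLo
        System.out.println(derived[2]); // ello
    }
}
```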



    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }


    /**
     * Allocates a new {@code String} so that it represents the sequence of
     * characters currently contained in the character array argument. The
     * contents of the character array are copied; subsequent modification of
     * the character array does not affect the newly created string.
     *
     * @param  value
     *         The initial value of the string
     */
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }

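Note, however, that these two constructors differ fundamentally. String(String original) only creates a new String object; its value field is a reference to the same char array as the original. String(char value[]), by contrast, calls Arrays.copyOf (which relies on System.arraycopy) to copy a brand-new char array on the heap. We can verify the difference with reflection: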

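    // Requires: import java.lang.reflect.Field;
    // Works on JDK 8 and earlier, where the value field is a char[].
    public static void main(String[] args) {
        String a = "12345";
        String b = "12345";
        String c = new String("12345");
        String d = new String(b.toCharArray());

        try {
            Field charField = String.class.getDeclaredField("value");
            // value is private, so it must be made accessible first
            charField.setAccessible(true);
            char[] objects = (char[]) charField.get(a);
            // Overwrite the first character of a's backing array in place
            objects[0] = '0';
        } catch (NoSuchFieldException e) {
            e.printStackTrace();
        } catch (IllegalAccessException e) {
            e.printStackTrace();
        }
        System.out.println("a is : {"+a+"}");
        System.out.println("b is : {"+b+"}");
        System.out.println("c is : {"+c+"}");
        System.out.println("d is : {"+d+"}");
    }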


a is : {02345}
b is : {02345}
c is : {02345}
d is : {12345}

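This happens because Java maintains a string constant pool (StringTable), which will be covered later. The point here is that a and b both refer to the same object in the pool. Looking at the constructor used for c, we can see that its value reference still points to the char array originally held by a, whereas d was built on a new char array created through arraycopy.
In String, apart from the public String(String original) constructor, the other public constructors all create a new char array through arraycopy.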
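The pooling of a and b can also be checked without reflection. The sketch below (class and method names are illustrative) compares object identity with ==. It only shows which String objects are the same; it cannot show whether two distinct String objects share a backing array, which is what the reflection experiment above demonstrates.

```java
public class StringIdentityDemo {
    // Returns { a == b, a == c, a == d, contentsEqual }.
    static boolean[] compare() {
        String a = "12345";
        String b = "12345";
        String c = new String("12345");
        String d = new String(b.toCharArray());
        return new boolean[] {
            a == b,                      // both literals resolve to one pooled object
            a == c,                      // new String(...) always creates a distinct object
            a == d,                      // likewise a distinct object
            a.equals(c) && a.equals(d)   // the character contents are equal
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // false
        System.out.println(r[2]); // false
        System.out.println(r[3]); // true
    }
}
```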



    /**
     * Encodes this {@code String} into a sequence of bytes using the
     * platform's default charset, storing the result into a new byte array.
     *
     * <p> The behavior of this method when this string cannot be encoded in
     * the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetEncoder} class should be used when more control
     * over the encoding process is required.
     *
     * @return  The resultant byte array
     *
     * @since      JDK1.1
     */
    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }


    // Trim the given byte array to the given length
    private static byte[] safeTrim(byte[] ba, int len, Charset cs, boolean isTrusted) {
        if (len == ba.length && (isTrusted || System.getSecurityManager() == null))
            return ba;
        else
            return Arrays.copyOf(ba, len);
    }


    public byte[] getBytes(String charsetName)
    public byte[] getBytes(Charset charset)
    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin)
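Because getBytes() depends on the platform default charset, passing the charset explicitly gives reproducible results. The following is a small sketch of the overloads listed above; the class and helper names are made up for illustration, and the sample text "h\u00e9llo" is an arbitrary choice.

```java
import java.nio.charset.StandardCharsets;

public class GetBytesDemo {
    // Number of bytes needed to encode s in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to encode s in ISO-8859-1.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Copies the characters s[begin, end) into a fresh array via getChars.
    static char[] slice(String s, int begin, int end) {
        char[] dst = new char[end - begin];
        s.getChars(begin, end, dst, 0);
        return dst;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // "héllo", written with an escape to avoid source-encoding issues
        System.out.println(utf8Length(s));   // 6: the accented letter takes two bytes in UTF-8
        System.out.println(latin1Length(s)); // 5: one byte per character
        System.out.println(new String(slice(s, 2, 5))); // llo
    }
}
```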




    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * <a href="{@docRoot}/../platform/serialization/spec/output.html">
     * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

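String declares a private static final array of ObjectStreamField, but oddly the array has length 0. According to the documentation, this field is a convention of the serialization mechanism for classes that implement java.io.Serializable.
Normally a class that implements java.io.Serializable can be serialized. During serialization, serialVersionUID acts as a constraint on deserialization: if the values do not match, deserialization fails. Every field that is not marked static or transient is serialized. As we saw earlier when studying the Date class, Date's fastTime field is marked transient and is therefore not serialized. So what is the purpose of private static final ObjectStreamField[] serialPersistentFields?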

    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];


class List implements Serializable {
    // Incorrect: the field must be private, static, and final to be honored by serialization
    public ObjectStreamField[] serialPersistentFields = { new ObjectStreamField("myField", List.class) };
}

Code Correctness: Incorrect serialPersistentFields Modifier
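To make the role of serialPersistentFields concrete, here is a minimal sketch using a hypothetical Account class (not part of the JDK). When the array is declared private static final, it lists exactly the fields that default serialization writes, overriding the usual "every non-static, non-transient field" rule. String's array is empty because, as its Javadoc above states, String instances are special-cased in the stream protocol.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class SerialFieldsDemo {
    // Hypothetical class, for illustration only.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;

        // Only "name" is listed, so "balance" is skipped even though
        // it is neither static nor transient.
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("name", String.class)
        };

        String name;
        int balance;

        Account(String name, int balance) {
            this.name = name;
            this.balance = balance;
        }
    }

    // Serializes the object to a byte array and reads it back.
    static Account roundTrip(Account in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("alice", 100));
        System.out.println(copy.name);    // alice
        System.out.println(copy.balance); // 0, because balance was never written
    }
}
```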



3.1 The CaseInsensitiveComparator nested class


    /**
     * A Comparator that orders {@code String} objects as by
     * {@code compareToIgnoreCase}. This comparator is serializable.
     * <p>
     * Note that this Comparator does <em>not</em> take locale into account,
     * and will result in an unsatisfactory ordering for certain locales.
     * The java.text package provides <em>Collators</em> to allow
     * locale-sensitive ordering.
     *
     * @see     java.text.Collator#compare(String, String)
     * @since   1.2
     */
    public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                         = new CaseInsensitiveComparator();
    private static class CaseInsensitiveComparator
            implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1 = s1.length();
            int n2 = s2.length();
            int min = Math.min(n1, n2);
            for (int i = 0; i < min; i++) {
                char c1 = s1.charAt(i);
                char c2 = s2.charAt(i);
                if (c1 != c2) {
                    c1 = Character.toUpperCase(c1);
                    c2 = Character.toUpperCase(c2);
                    if (c1 != c2) {
                        c1 = Character.toLowerCase(c1);
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            // No overflow because of numeric promotion
                            return c1 - c2;
                        }
                    }
                }
            }
            return n1 - n2;
        }

        /** Replaces the de-serialized object. */
        private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
    }
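As a usage sketch for the comparator above (the class and helper names are illustrative), sorting the same words with and without String.CASE_INSENSITIVE_ORDER shows how the ordering changes:

```java
import java.util.Arrays;

public class CaseInsensitiveSortDemo {
    // Sorts a copy of the input by natural (case-sensitive) order.
    static String[] sortedNaturally(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sorts a copy of the input with String.CASE_INSENSITIVE_ORDER.
    static String[] sortedIgnoringCase(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "apple", "Cherry"};
        // Uppercase 'C' (67) sorts before lowercase 'a' (97) in natural order.
        System.out.println(Arrays.toString(sortedNaturally(words)));    // [Cherry, apple, banana]
        System.out.println(Arrays.toString(sortedIgnoringCase(words))); // [apple, banana, Cherry]
        System.out.println("ABC".compareToIgnoreCase("abc"));           // 0
    }
}
```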


3.2 compareTo

    /**
     * Compares two strings lexicographically.
     * The comparison is based on the Unicode value of each character in
     * the strings. The character sequence represented by this
     * {@code String} object is compared lexicographically to the
     * character sequence represented by the argument string. The result is
     * a negative integer if this {@code String} object
     * lexicographically precedes the argument string. The result is a
     * positive integer if this {@code String} object lexicographically
     * follows the argument string. The result is zero if the strings
     * are equal; {@code compareTo} returns {@code 0} exactly when
     * the {@link #equals(Object)} method would return {@code true}.
     * <p>
     * This is the definition of lexicographic ordering. If two strings are
     * different, then either they have different characters at some index
     * that is a valid index for both strings, or their lengths are different,
     * or both. If they have different characters at one or more index
     * positions, let <i>k</i> be the smallest such index; then the string
     * whose character at position <i>k</i> has the smaller value, as
     * determined by using the &lt; operator, lexicographically precedes the
     * other string. In this case, {@code compareTo} returns the
     * difference of the two character values at position {@code k} in
     * the two string -- that is, the value:
     * <blockquote><pre>
     * this.charAt(k)-anotherString.charAt(k)
     * </pre></blockquote>
     * If there is no index position at which they differ, then the shorter
     * string lexicographically precedes the longer string. In this case,
     * {@code compareTo} returns the difference of the lengths of the
     * strings -- that is, the value:
     * <blockquote><pre>
     * this.length()-anotherString.length()
     * </pre></blockquote>
     *
     * @param   anotherString   the {@code String} to be compared.
     * @return  the value {@code 0} if the argument string is equal to
     *          this string; a value less than {@code 0} if this string
     *          is lexicographically less than the string argument; and a
     *          value greater than {@code 0} if this string is
     *          lexicographically greater than the string argument.
     */
    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        return len1 - len2;
    }

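As we know, when Arrays.sort or Collections.sort is called without an explicit Comparator, the elements are ordered by their compareTo(T o) method. This is how String implements comparability.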
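The return values described in the Javadoc can be checked directly. The sketch below (class and helper names are illustrative) shows the character difference, the length difference, and the effect on Arrays.sort:

```java
import java.util.Arrays;

public class CompareToDemo {
    // Sorts a copy of the input using String's natural ordering (compareTo).
    static String[] sortedNaturally(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // First difference at index 2: 'c' (99) - 'f' (102)
        System.out.println("abc".compareTo("abf"));   // -3
        // No differing index within the shorter string: 3 - 5
        System.out.println("abc".compareTo("abcde")); // -2
        // Equal contents
        System.out.println("abc".compareTo("abc"));   // 0

        System.out.println(Arrays.toString(sortedNaturally("pear", "apple", "fig")));
        // [apple, fig, pear]
    }
}
```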

3.3 Summary of String comparability


    public int compareToIgnoreCase(String str) {
        return CASE_INSENSITIVE_ORDER.compare(this, str);
    }


public int compareTo(String anotherString) 




4.1 The hashCode method

    /**
     * Returns a hash code for this string. The hash code for a
     * {@code String} object is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using {@code int} arithmetic, where {@code s[i]} is the
     * <i>i</i>th character of the string, {@code n} is the length of
     * the string, and {@code ^} indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }


s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

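Effective Java explains the choice of 31: it is an odd prime. If the multiplier were even and the multiplication overflowed, information would be lost, because multiplying by 2 is equivalent to a shift. The advantage of using a prime is less clear, but it is the traditional choice for computing hash values.
Another nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i.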
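Both points can be checked with a short sketch. The helper names below are made up for illustration: polynomialHash evaluates the documented formula with Horner's rule, as the loop in hashCode does, and shiftHash replaces the multiplication by 31 with a shift and a subtraction.

```java
public class HashCodeDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] in int arithmetic (Horner's rule).
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Same computation, with 31 * h written as (h << 5) - h.
    static int shiftHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = ((h << 5) - h) + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // (97 * 31 + 98) * 31 + 99 = 96354
        System.out.println(polynomialHash("abc")); // 96354
        System.out.println("abc".hashCode());      // 96354
        System.out.println(shiftHash("abc"));      // 96354
        System.out.println("".hashCode());         // 0
    }
}
```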



4.2 The equals method

    /**
     * Compares this string to the specified object.  The result is {@code
     * true} if and only if the argument is not {@code null} and is a {@code
     * String} object that represents the same sequence of characters as this
     * object.
     *
     * @param  anObject
     *         The object to compare this {@code String} against
     *
     * @return  {@code true} if the given object represents a {@code String}
     *          equivalent to this string, {@code false} otherwise
     *
     * @see  #compareTo(String)
     * @see  #equalsIgnoreCase(String)
     */
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

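The source is shown above. The method first checks whether the two references point to the same object and returns true immediately if they do. Otherwise it checks whether the argument is an instance of String, casts it, and compares the lengths first and then the characters one by one. It returns true only if every character matches, and false in all other cases.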
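Each branch of equals can be exercised with a short sketch (the class and method names are illustrative):

```java
public class EqualsDemo {
    // Returns { sameReference, equalContent, differentLength, nonString, nullArgument }.
    static boolean[] checks() {
        String a = "hello";
        String b = new String("hello");
        return new boolean[] {
            a.equals(a),                          // same reference: the first branch returns true
            a.equals(b),                          // distinct objects with the same characters
            a.equals("hell"),                     // lengths differ, so no characters are compared
            a.equals(new StringBuilder("hello")), // not a String: the instanceof check fails
            a.equals(null)                        // null is never an instance of String
        };
    }

    public static void main(String[] args) {
        boolean[] r = checks();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // true
        System.out.println(r[2]); // false
        System.out.println(r[3]); // false
        System.out.println(r[4]); // false
    }
}
```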



    /**
     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     * <p>
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     * <p>
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     * <p>
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * <cite>The Java&trade; Language Specification</cite>.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     */
    public native String intern();
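The behavior described in this Javadoc can be observed with a small sketch (the class name, method name, and the sample text "intern-demo" are arbitrary choices for illustration):

```java
public class InternDemo {
    // Returns { built == literal, built.intern() == literal, built.equals(literal) }.
    static boolean[] check() {
        String literal = "intern-demo"; // literals are interned automatically
        // Built at runtime, so this is a distinct object that is not taken from the pool.
        String built = new StringBuilder("intern").append("-demo").toString();
        return new boolean[] {
            built == literal,
            built.intern() == literal, // intern() returns the pooled instance
            built.equals(literal)
        };
    }

    public static void main(String[] args) {
        boolean[] r = check();
        System.out.println(r[0]); // false
        System.out.println(r[1]); // true
        System.out.println(r[2]); // true
    }
}
```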


