JDK 8 String Class Source Code Analysis

2017-12-01, trons先生

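At work I constantly use classes, tools, and frameworks whose general behavior I know but whose source code I have never actually read. To fix that, and to help the details stick, I am writing a series of source code analysis articles.

Let's start with the class comment in the source: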

/**
 * Strings are constant; their values cannot be changed after they
 * are created. String buffers support mutable strings.
 */

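This says that a String in Java is a "constant": an immutable object whose value cannot be changed once it has been created. Any operation that appears to modify a String actually creates a new String. If you need to perform many modifications, use StringBuffer, or more commonly StringBuilder, which does the same job without synchronization.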
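As a small illustration of the point above, the following sketch shows that concat leaves the original String untouched and that StringBuilder is the usual tool for repeated appends. The class and method names are made up for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class ImmutabilityDemo {

    // concat returns a new String; the original keeps its value.
    static boolean concatLeavesOriginalUntouched() {
        String s = "abc";
        String t = s.concat("def");
        return s.equals("abc") && t.equals("abcdef") && s != t;
    }

    // Appends n digits using a single mutable StringBuilder instead of
    // creating a new String on every step.
    static String buildDigits(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i % 10);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatLeavesOriginalUntouched());
        System.out.println(buildDigits(12));
    }
}
```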

/**
 * <blockquote><pre>
 *     String str = "abc";
 * </pre></blockquote><p>
 * is equivalent to:
 * <blockquote><pre>
 *     char data[] = {'a', 'b', 'c'};
 *     String str = new String(data);
 * </pre></blockquote><p>
 */

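Because a String is immutable, Java can safely share instances through a string constant pool. When a string literal is loaded, the JVM first checks the pool for an instance with the same value and returns it if one exists, which saves memory and speeds up string loading. Note that the "equivalence" in the comment above is about the value: calling new String(...) explicitly always creates a new object, and only literals and strings returned by intern() come from the pool. Immutable objects are also thread-safe, since they can be shared between threads without synchronization.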
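The pool behavior can be observed by comparing references with ==. This is only a sketch with made-up class and method names; it relies on the language guarantee that identical string literals refer to the same instance.

```java
// Hypothetical demo class; the names are illustrative only.
class StringPoolDemo {

    // Two identical literals resolve to the same pooled instance.
    static boolean literalsShareOneInstance() {
        String a = "abc";
        String b = "abc";
        return a == b;
    }

    // new String(...) creates a distinct object with an equal value.
    static boolean newStringIsDistinctButEqual() {
        String a = "abc";
        String b = new String(new char[] {'a', 'b', 'c'});
        return a != b && a.equals(b);
    }

    // intern() maps an equal string back to the pooled instance.
    static boolean internReturnsPooledInstance() {
        String a = "abc";
        String b = new String(new char[] {'a', 'b', 'c'}).intern();
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneInstance());
        System.out.println(newStringIsDistinctButEqual());
        System.out.println(internReturnsPooledInstance());
    }
}
```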

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {

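String implements the Serializable, Comparable<String>, and CharSequence interfaces, so a String can be serialized, compared for ordering, and queried for its length, its individual characters, and its subsequences.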
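A brief sketch of what two of these interfaces buy in practice: Comparable lets Arrays.sort order strings without a comparator, and CharSequence lets a method accept a String alongside other character sequences such as StringBuilder. The class and helper names are invented for this example.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class StringInterfacesDemo {

    // Sorts a copy of the input using String's natural (Comparable) ordering.
    static String[] sorted(String... items) {
        String[] copy = items.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Works on any CharSequence, so it accepts String and StringBuilder alike.
    static int countUpperCase(CharSequence cs) {
        int count = 0;
        for (int i = 0; i < cs.length(); i++) {
            if (Character.isUpperCase(cs.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sorted("pear", "apple", "fig")));
        System.out.println(countUpperCase("JdK"));
        System.out.println(countUpperCase(new StringBuilder("ABc")));
    }
}
```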

/** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * <a href="{@docRoot}/../platform/serialization/spec/output.html">
     * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

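The value field stores the contents of the String. It is a final char array, so the reference cannot be reassigned, and because the class never modifies or exposes the array, the contents are effectively immutable as well.
The hash field caches the result of hashCode() so that it is not recomputed on every call.
The serialVersionUID field is used by Java serialization to identify the class version, and its value was fixed back in JDK 1.0.2. In general, if the serialVersionUID in the stream does not match the one in the class, deserialization fails.
The serialPersistentFields field declares which fields are serialized. Here it is an empty array, because String is special-cased in the serialization stream protocol, as the comment above notes.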
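The lazy caching done through the hash field can be sketched on a small stand-in class. CachedHashText is hypothetical and is not the JDK class; it only follows the same pattern, where 0 doubles as the "not computed yet" marker, so a string whose hash really is 0 is simply recomputed on each call.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; not the real java.lang.String.
final class CachedHashText {
    private final char[] value;
    private int hash; // defaults to 0, meaning "not computed yet"

    CachedHashText(char[] chars) {
        // Defensive copy so later changes to the caller's array are not visible.
        this.value = Arrays.copyOf(chars, chars.length);
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], evaluated left to right
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h; // cache for subsequent calls
        }
        return h;
    }

    // Convenience helper for comparing against String.hashCode().
    static int hashOf(String s) {
        return new CachedHashText(s.toCharArray()).hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hashOf("hello") == "hello".hashCode());
    }
}
```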

public String() { this.value = "".value; }
public String(String original){ ... }
public String(char value[]) { this.value = Arrays.copyOf(value, value.length); }
public String(char value[], int offset, int count){ ... }
public String(int[] codePoints, int offset, int count){ ... }
public String(byte ascii[], int hibyte, int offset, int count){ ... }
public String(byte ascii[], int hibyte) { this(ascii, hibyte, 0, ascii.length); }
public String(byte bytes[], int offset, int length, String charsetName){ ... }
public String(byte bytes[], int offset, int length, Charset charset) { ... }
public String(byte bytes[], String charsetName){ ... }
public String(byte bytes[], Charset charset) { this(bytes, 0, bytes.length, charset); }
public String(byte bytes[], int offset, int length) { ... }
public String(byte bytes[]) { this(bytes, 0, bytes.length); }
public String(StringBuffer buffer) { ... }
public String(StringBuilder builder) { ... }
String(char[] value, boolean share) { ... }

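The constructors of String fall into roughly six categories:

From another String
From a char array
From an array of code points
From ASCII bytes (the deprecated hibyte constructors)
From a byte array, decoded with a charset
From a StringBuffer or StringBuilder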

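When a String is built from another String, the immutability of the source can be trusted, so the new instance simply reuses the original's value array and hash. When it is built from a char array or a byte array, the data has to be copied (or decoded into a new array) so that later changes to the caller's array cannot break the immutability of the String. When it is built from an array of code points, each supplementary character takes up two chars (a surrogate pair). For the byte array constructors, if no charset is given, the platform default charset is used: in JDK 8 it is taken from the file.encoding system property, falling back to UTF-8 if that property does not name a supported charset. The StringBuffer and StringBuilder constructors copy the chars directly out of the builder. As for why they do not simply call the builder's own toString() method, my personal guess is that this saves one object allocation.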
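The following sketch exercises three of the constructor families described above: the defensive copy of a char array, the two chars taken by a supplementary code point, and explicit decoding of bytes with a named charset. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class StringConstructorDemo {

    // Mutating the source array after construction does not affect the String.
    static String copyIsDefensive() {
        char[] data = {'a', 'b', 'c'};
        String s = new String(data);
        data[0] = 'x';
        return s;
    }

    // Number of chars a single code point occupies once stored in a String.
    static int charsForCodePoint(int codePoint) {
        return new String(new int[] {codePoint}, 0, 1).length();
    }

    // Passing the charset explicitly avoids depending on the platform default.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(copyIsDefensive());
        System.out.println(charsForCodePoint('A'));
        System.out.println(charsForCodePoint(0x1F600)); // a supplementary character
        System.out.println(decodeUtf8(new byte[] {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD}));
    }
}
```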

A brief tour of the String methods

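Commonly used instance methods:
- length: returns the length of the string (the number of chars)
- isEmpty: checks whether the length is 0
- charAt: returns the char at the given index
- getChars: copies the chars in the given range into a destination array
- equals, equalsIgnoreCase: compare, in order, the references, the char array lengths, and then the char array contents (equals also checks that the argument is a String)
- compareTo: compares two strings lexicographically
- startsWith, endsWith: check for a prefix or a suffix
- hashCode: computes the hash value using the formula s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
- indexOf: finds the position of the first occurrence
- lastIndexOf: finds the position of the last occurrence
- substring: returns a substring (older versions returned a new String that shared the parent's char array, which saved an allocation; but if a small substring referenced a very large parent, the parent's array could not be garbage collected for as long as the substring stayed in use, so newer versions copy the char range on every call)
- concat: concatenates two strings (joins the char arrays and creates a new String)
- replace: replaces every occurrence of an old char with a new char (it first scans the char array to check whether the old char occurs at all, and only then allocates and replaces, so no new array is allocated when there is nothing to replace)
- matches: checks whether the string matches a regular expression (delegates to Pattern.matches())
- contains: checks whether the string contains a given substring (delegates to indexOf())
- replaceFirst: replaces only the first regex match
- replaceAll: replaces every regex match
- split: splits the string around matches of a regular expression
- toLowerCase: returns the lower-case form
- toUpperCase: returns the upper-case form
- trim: removes leading and trailing whitespace (more precisely, any chars less than or equal to U+0020)
- toCharArray: returns a fresh copy of the char array
- intern: a native method that returns the canonical instance from the string pool, adding this string to the pool first if no equal string is there yet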
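A few of the instance methods listed above have behavior that is easy to misremember, so here is a short sketch of them in action. The class and method names are invented; the note that replace returns the same instance when nothing matches reflects the JDK 8 source and should be treated as an implementation detail rather than a documented guarantee.

```java
// Hypothetical demo class; the names are illustrative only.
class StringMethodsDemo {

    // trim strips every leading/trailing char <= U+0020, including tabs and newlines.
    static String trimmed() {
        return "\t hello \n".trim();
    }

    // replace(char, char) swaps every occurrence.
    static String replaced() {
        return "hello".replace('l', 'L');
    }

    // When the old char does not occur, the JDK 8 implementation returns this.
    static boolean replaceWithoutMatchReturnsSameInstance() {
        String s = "hello";
        return s.replace('z', 'Z') == s;
    }

    // split keeps empty pieces in the middle of the input.
    static String[] pieces() {
        return "a,b,,c".split(",");
    }

    public static void main(String[] args) {
        System.out.println(trimmed());
        System.out.println(replaced());
        System.out.println(replaceWithoutMatchReturnsSameInstance());
        System.out.println(pieces().length);
        System.out.println("Hello".equalsIgnoreCase("hELLO"));
        System.out.println("apple".compareTo("banana") < 0);
        System.out.println("hello world".substring(6));
    }
}
```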
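Commonly used static utility methods:
- join: joins the given pieces, separated by a delimiter
- format: builds a formatted string
- valueOf: converts a value of another type to a string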
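The static helpers can be tried out the same way. This is a minimal sketch with invented class and method names; Locale.ROOT is passed to format here only to keep the output independent of the machine's default locale.

```java
import java.util.Locale;

// Hypothetical demo class; the names are illustrative only.
class StringStaticsDemo {

    // join places the delimiter between the pieces, not before or after them.
    static String joined() {
        return String.join("-", "a", "b", "c");
    }

    // Zero-pads the number to a width of five digits.
    static String padded(int n) {
        return String.format(Locale.ROOT, "%05d", n);
    }

    // valueOf has overloads for primitives, char arrays and objects.
    static String describe(boolean flag, double number, char[] chars) {
        return String.valueOf(flag) + "/" + String.valueOf(number) + "/" + String.valueOf(chars);
    }

    public static void main(String[] args) {
        System.out.println(joined());
        System.out.println(padded(42));
        System.out.println(describe(true, 3.5, new char[] {'o', 'k'}));
    }
}
```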
