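Object Memory Layout & HotSpot Object Model
Object composition
An object's memory layout can be divided into three regions: the object header, the instance data, and alignment padding.
Object header
Runtime data, implemented by the Mark Word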
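- Includes the hash code, the GC generational age, the lock state flags, the lock held by a thread, the biased thread ID, and the bias timestamp (epoch).
- Officially called the Mark Word; it is 32 bits long on a 32-bit VM.
- On a 64-bit VM it is 64 bits long.
- It is not a fixed data structure: the same bits are reused for different purposes so that as much information as possible fits into a very limited space.
- In the unlocked state, a 32-bit Mark Word is laid out as follows (most significant bits first):
  25-bit hash code, 4-bit generational age, 1 biased-lock bit fixed at 0, 2-bit lock flag
Keep in mind that the layout of the Mark Word is not static. What it records, and therefore its structure, depends on the object's state. The table below lists what the Mark Word stores in each state (the two lock flag bits are always present):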
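State | Stored content | Flag bits |
---|---|---|
Lightweight locked | Pointer to the lock record | 00 |
Heavyweight locked | Pointer to the heavyweight lock (monitor) | 10 |
GC mark | Empty | 11 |
Biasable | Biased thread ID, epoch, generational age | 01 (biased-lock bit = 1) |
Unlocked | Object hash code, generational age (as in the example above) | 01 (biased-lock bit = 0) |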
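To make the table concrete, here is a minimal sketch that classifies a 32-bit Mark Word by its low three bits and extracts the age and hash fields. The names `MarkState`, `classify`, `age_of`, and `hash_of` are hypothetical and are not HotSpot's; the bit widths assume the 32-bit layout described above (hash:25, age:4, biased_lock:1, lock:2).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not part of HotSpot.
enum class MarkState { Unlocked, Biased, LightweightLocked, HeavyweightLocked, GcMarked };

// Classify a 32-bit mark word by its lock bits and biased-lock bit.
inline MarkState classify(std::uint32_t mark) {
    const std::uint32_t lock   = mark & 0x3u;         // lowest two bits
    const std::uint32_t biased = (mark >> 2) & 0x1u;  // bit just above the lock bits
    switch (lock) {
        case 0x0: return MarkState::LightweightLocked;
        case 0x2: return MarkState::HeavyweightLocked;
        case 0x3: return MarkState::GcMarked;
        default:  return biased ? MarkState::Biased : MarkState::Unlocked;
    }
}

// The age occupies the four bits above the biased-lock bit
// in both the unlocked and the biased layout.
inline std::uint32_t age_of(std::uint32_t mark) {
    return (mark >> 3) & 0xFu;
}

// The hash occupies the top 25 bits, but only in the unlocked layout.
inline std::uint32_t hash_of(std::uint32_t mark) {
    assert(classify(mark) == MarkState::Unlocked);
    return mark >> 7;
}
```

The biasable and unlocked states share the lock bits 01, so the decoder has to look at the third bit to tell them apart.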
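Implementation of the Mark Word
HotSpot implements the Mark Word with the markOop type (a pointer to the markOopDesc class), declared in markOop.hpp. All excerpts below are from OpenJDK 9; other versions may differ: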
// The markOop describes the header of an object.
//
// Note that the mark is not a real oop but just a word.
// It is placed in the oop hierarchy for historical reasons.
//
// Bit-format of an object header (most significant first, big endian layout below):
//
// 32 bits:
// --------
// hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object)
// size:32 ------------------------------------------>| (CMS free block)
// PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
// 64 bits:
// --------
// unused:25 hash:31 -->| unused:1 age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object)
// PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
// size:64 ----------------------------------------------------->| (CMS free block)
//
// unused:25 hash:31 -->| cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && normal object)
// JavaThread*:54 epoch:2 cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && biased object)
// narrowOop:32 unused:24 cms_free:1 unused:4 promo_bits:3 ----->| (COOPs && CMS promoted object)
// unused:21 size:35 -->| cms_free:1 unused:7 ------------------>| (COOPs && CMS free block)
//
// - hash contains the identity hash value: largest value is
// 31 bits, see os::random(). Also, 64-bit vm's require
// a hash value no bigger than 32 bits because they will not
// properly generate a mask larger than that: see library_call.cpp
// and c1_CodePatterns_sparc.cpp.
//
// - the biased lock pattern is used to bias a lock toward a given
// thread. When this pattern is set in the low three bits, the lock
// is either biased toward a given thread or "anonymously" biased,
// indicating that it is possible for it to be biased. When the
// lock is biased toward a given thread, locking and unlocking can
// be performed by that thread without using atomic operations.
// When a lock's bias is revoked, it reverts back to the normal
// locking scheme described below.
//
// Note that we are overloading the meaning of the "unlocked" state
// of the header. Because we steal a bit from the age we can
// guarantee that the bias pattern will never be seen for a truly
// unlocked object.
//
// Note also that the biased state contains the age bits normally
// contained in the object header. Large increases in scavenge
// times were seen when these bits were absent and an arbitrary age
// assigned to all biased objects, because they tended to consume a
// significant fraction of the eden semispaces and were not
// promoted promptly, causing an increase in the amount of copying
// performed. The runtime system aligns all JavaThread* pointers to
// a very large value (currently 128 bytes (32bVM) or 256 bytes (64bVM))
// to make room for the age bits & the epoch bits (used in support of
// biased locking), and for the CMS "freeness" bit in the 64bVM (+COOPs).
//
// [JavaThread* | epoch | age | 1 | 01] lock is biased toward given thread
// [0 | epoch | age | 1 | 01] lock is anonymously biased
//
// - the two lock bits are used to describe three states: locked/unlocked and monitor.
//
// [ptr | 00] locked ptr points to real header on stack
// [header | 0 | 01] unlocked regular object header
// [ptr | 10] monitor inflated lock (header is wapped out)
// [ptr | 11] marked used by markSweep to mark an object
// not valid at any other time
//
// We assume that stack/thread pointers have the lowest two bits cleared.
 public:
  // Constants
  enum { age_bits                 = 4,
         lock_bits                = 2,
         biased_lock_bits         = 1,
         max_hash_bits            = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits                = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits                 = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits               = 2
  };

  // The biased locking code currently requires that the age bits be
  // contiguous to the lock bits.
  enum { lock_shift               = 0,
         biased_lock_shift        = lock_bits,
         age_shift                = lock_bits + biased_lock_bits,
         cms_shift                = age_shift + age_bits,
         hash_shift               = cms_shift + cms_bits,
         epoch_shift              = hash_shift
  };
The structure of the Mark Word is not fixed; each object state has its own layout.
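The shift constants in the enum excerpt are easier to follow with concrete numbers. The sketch below repeats the same arithmetic for a 64-bit VM, assuming `BitsPerWord` is 64 and `cms_bits` is 1 as the `LP64_ONLY` macro implies. The constant names are copied from the excerpt; `mask`, `get_field`, and `set_age` are hypothetical helpers, not HotSpot functions.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as the enum excerpt, fixed to a 64-bit word (assumption).
constexpr int BitsPerWord       = 64;
constexpr int age_bits          = 4;
constexpr int lock_bits         = 2;
constexpr int biased_lock_bits  = 1;
constexpr int max_hash_bits     = BitsPerWord - age_bits - lock_bits - biased_lock_bits;
constexpr int hash_bits         = max_hash_bits > 31 ? 31 : max_hash_bits;
constexpr int cms_bits          = 1;

constexpr int lock_shift        = 0;
constexpr int biased_lock_shift = lock_bits;
constexpr int age_shift         = lock_bits + biased_lock_bits;
constexpr int cms_shift         = age_shift + age_bits;
constexpr int hash_shift        = cms_shift + cms_bits;

static_assert(max_hash_bits == 57, "57 bits remain after age, lock and biased_lock");
static_assert(hash_bits == 31, "the hash is capped at 31 bits");
static_assert(age_shift == 3 && cms_shift == 7 && hash_shift == 8, "field offsets");

// Hypothetical helpers: build a mask of `bits` ones and read one field.
constexpr std::uint64_t mask(int bits) {
    return (std::uint64_t{1} << bits) - 1;
}
constexpr std::uint64_t get_field(std::uint64_t mark, int shift, int bits) {
    return (mark >> shift) & mask(bits);
}

// Replace the age field and leave every other bit untouched.
inline std::uint64_t set_age(std::uint64_t mark, std::uint64_t age) {
    assert(age <= mask(age_bits));
    return (mark & ~(mask(age_bits) << age_shift)) | (age << age_shift);
}
```

With four age bits the largest representable age is 15, which is why `set_age` rejects anything larger.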
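Type pointer
The type pointer in the object header points to the object's class metadata; the JVM uses it to determine which class the object is an instance of.
If the object is an array, the header also needs a field that records the array length.
Instance data
The fields of every type defined in the program code, including those inherited from superclasses.
The storage order of this part is affected by the JVM's field allocation strategy and by the order in which the fields are declared in the source code.
Alignment padding
HotSpot requires an object's start address to be a multiple of 8 bytes, which means the object's size must also be a multiple of 8 bytes.
The header (Mark Word plus type pointer) is 8 bytes on a 32-bit VM and 16 bytes on a 64-bit VM, or 12 bytes on a 64-bit VM with compressed class pointers. Whenever the header plus the instance data is not a multiple of 8 bytes, alignment padding fills the gap.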
1 byte = 8 bit
1B = 1 byte
1b = 1 bit
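The padding rule is plain round-up arithmetic. The sketch below assumes 8-byte alignment and takes the header size as a parameter (for example 8 bytes on a 32-bit VM, 16 bytes on a 64-bit VM, or 12 bytes with compressed class pointers). `align_up` and `padding_bytes` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Round size up to the next multiple of alignment (alignment must be a power of two).
constexpr std::size_t align_up(std::size_t size, std::size_t alignment = 8) {
    return (size + alignment - 1) & ~(alignment - 1);
}

static_assert(align_up(16) == 16, "already aligned sizes are unchanged");
static_assert(align_up(17) == 24, "anything past a boundary rounds to the next one");
static_assert(align_up(20) == 24, "16-byte header + one 4-byte int");

// Padding appended after header + instance data so the total is 8-byte aligned.
inline std::size_t padding_bytes(std::size_t header_bytes, std::size_t field_bytes) {
    const std::size_t raw    = header_bytes + field_bytes;
    const std::size_t padded = align_up(raw);
    assert(padded >= raw && padded - raw < 8);  // padding is always 0..7 bytes
    return padded - raw;
}
```

For instance, an object with a 16-byte header and a single 4-byte field needs 4 bytes of padding, while the same field behind a 12-byte header needs none.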
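HotSpot object model
OOP/Klass
HotSpot does not map each Java object directly onto a newly created C++ object of a matching class; instead it uses the oop/klass model.
OOP: Ordinary Object Pointer, which represents an object's instance data
Klass: holds the metadata that describes the class
klass.hpp describes Klass as follows: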
// A Klass provides:
// 1: language level class object (method dictionary etc.)
// 2: provide vm dispatch behavior for the object
// Both functions are combined into one C++ class.
// One reason for the oop/klass dichotomy in the implementation is
// that we don't want a C++ vtbl pointer in every object. Thus,
// normal oops don't have any virtual functions. Instead, they
// forward all "virtual" functions to their klass, which does have
// a vtbl and does the C++ dispatch depending on the object's
// actual type. (See oop.inline.hpp for some of the forwarding code.)
// ALL FUNCTIONS IMPLEMENTING THIS DISPATCH ARE PREFIXED WITH "oop_"!
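The oop/klass model exists because HotSpot does not want every object to carry a C++ vtbl (virtual function table) pointer. An oop therefore has no virtual functions; calls that need virtual dispatch are forwarded to its klass, which does have a vtbl.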
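The forwarding idea can be shown in miniature. The sketch below is not HotSpot code: `MiniOop`, `MiniKlass`, and the two subclasses are hypothetical types that only imitate the split between a vtbl-free object header and a polymorphic klass.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

struct MiniOop;

// The "klass" side owns the C++ vtbl and the per-type behaviour.
struct MiniKlass {
    virtual ~MiniKlass() = default;
    virtual std::string oop_name(const MiniOop* obj) const = 0;
};

// The "oop" side is a plain two-word header with no virtual functions,
// so the compiler adds no hidden vtbl pointer to it.
struct MiniOop {
    std::uintptr_t   mark;   // stand-in for the mark word
    const MiniKlass* klass;  // type pointer

    // A "virtual" call is forwarded to the klass, which dispatches on the real type.
    std::string name() const {
        assert(klass != nullptr);
        return klass->oop_name(this);
    }
};

struct InstanceMiniKlass : MiniKlass {
    std::string oop_name(const MiniOop*) const override { return "instance"; }
};

struct ArrayMiniKlass : MiniKlass {
    std::string oop_name(const MiniOop*) const override { return "array"; }
};

static_assert(!std::is_polymorphic<MiniOop>::value, "the oop carries no vtbl pointer");
static_assert(std::is_polymorphic<MiniKlass>::value, "the klass carries the vtbl");
static_assert(sizeof(MiniOop) == sizeof(std::uintptr_t) + sizeof(void*),
              "the header is exactly two words");
```

The cost of one vtbl pointer is paid once per class instead of once per object, which matters when a heap holds millions of small objects.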
OOP
oop is implemented on top of oopDesc; see oop.hpp. Part of the code follows:
class oopDesc {
  friend class VMStructs;
  friend class JVMCIVMStructs;
 private:
  volatile markOop _mark;
  union _metadata {
    Klass*      _klass;
    narrowKlass _compressed_klass;
  } _metadata;

  // Fast access to barrier set. Must be initialized.
  static BarrierSet* _bs;

 public:
  markOop  mark()      const { return _mark; }
  markOop* mark_addr() const { return (markOop*) &_mark; }

  void set_mark(volatile markOop m) { _mark = m; }
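Focus on the private section: this is the data an oopDesc itself holds. (_bs is static, so it is not stored in each object.)
As the excerpt shows, an oopDesc consists of two parts: _mark and _metadata.
_mark
_mark has type markOop. It is the Mark Word, that is, the runtime-data part of the object header.
See the section on object header runtime data above.
Its size matches the word size of the JVM.
_metadata
A union. Both _klass and _compressed_klass refer to the object's Klass (an InstanceKlass for an ordinary Java object). _compressed_klass is a compressed (narrow) form of the same pointer, used when compressed class pointers are enabled; it does not point to a compressed object.
The _klass field links the oop to its klass.
In a union all members share the same storage, and the size of the union equals the size of its largest member.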
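The union's size behaviour can be checked directly. The sketch below uses simplified stand-ins: `Metadata`, `Header`, and `set_compressed` are hypothetical names, `Klass` is left opaque, and `narrowKlass` is assumed here to be a 32-bit unsigned integer.

```cpp
#include <cassert>
#include <cstdint>

struct Klass;                       // opaque in this sketch
typedef std::uint32_t narrowKlass;  // assumption: a 32-bit compressed class pointer

// Both members start at the same address and share one slot.
union Metadata {
    Klass*      _klass;             // full-width pointer
    narrowKlass _compressed_klass;  // compressed form of the same reference
};

// Stand-in for the per-object part of oopDesc: mark word plus metadata.
struct Header {
    std::uintptr_t _mark;
    Metadata       _metadata;
};

static_assert(sizeof(Metadata) == sizeof(Klass*),
              "a union is as large as its largest member");
static_assert(sizeof(Header) == 2 * sizeof(void*),
              "mark word plus one pointer-sized slot");

// Writing the compressed member reuses the storage of the full pointer.
inline void set_compressed(Metadata& m, narrowKlass v) {
    m._compressed_klass = v;
    assert(static_cast<void*>(&m._klass) == static_cast<void*>(&m._compressed_klass));
}
```

In this sketch the slot stays pointer-sized even when only the 32-bit member is used, so the two-word header comes out as 16 bytes on a 64-bit build.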
instanceKlass.hpp describes InstanceKlass as follows:
// An InstanceKlass is the VM level representation of a Java class.
// It contains all information needed for at class at execution runtime.
// InstanceKlass embedded field layout (after declared fields):
// [EMBEDDED Java vtable ] size in words = vtable_len
// [EMBEDDED nonstatic oop-map blocks] size in words = nonstatic_oop_map_size
// The embedded nonstatic oop-map blocks are short pairs (offset, length)
// indicating where oops are located in instances of this klass.
// [EMBEDDED implementor of the interface] only exist for interface
// [EMBEDDED host klass ] only exist for an anonymous class (JSR 292 enabled)
// [EMBEDDED fingerprint ] only if should_store_fingerprint()==true