Cache_t结构分析

2020-09-19 本文已影响0人 _Luyouli

Cache_t初识

我们在前面对类的结构探索中知道了类结构体成员如下

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    ...
}

我们通过地址偏移探索知道在bits中包含了类的属性和方法，那么cache_t cache又是什么呢？
从名字上我们可以简单的认知就是缓存，那究竟是不是缓存呢？它到底缓存了类中的什么信息呢？我们接下来就探索来cache_t。

Cache_t源码探索

还是老规矩，从源码下手分析其结构，我们来到其源码的地方如下:

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
    explicit_atomic<struct bucket_t *> _buckets;
    explicit_atomic<mask_t> _mask;
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;
    
    // How much the mask is shifted by.
    static constexpr uintptr_t maskShift = 48;
    
    // Additional bits after the mask which must be zero. msgSend
    // takes advantage of these additional bits to construct the value
    // `mask << 4` from `_maskAndBuckets` in a single instruction.
    static constexpr uintptr_t maskZeroBits = 4;
    
    // The largest mask value we can store.
    static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1;
    
    // The mask applied to `_maskAndBuckets` to retrieve the buckets pointer.
    static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << (maskShift - maskZeroBits)) - 1;
    
    // Ensure we have enough bits for the buckets pointer.
    static_assert(bucketsMask >= MACH_VM_MAX_ADDRESS, "Bucket field doesn't have enough bits for arbitrary pointers.");
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
    // _maskAndBuckets stores the mask shift in the low 4 bits, and
    // the buckets pointer in the remainder of the value. The mask
    // shift is the value where (0xffff >> shift) produces the correct
    // mask. This is equal to 16 - log2(cache_size).
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;

    static constexpr uintptr_t maskBits = 4;
    static constexpr uintptr_t maskMask = (1 << maskBits) - 1;
    static constexpr uintptr_t bucketsMask = ~maskMask;
#else
#error Unknown cache mask storage type.
#endif
    
#if __LP64__
    uint16_t _flags;
#endif
    uint16_t _occupied;
...

}

从源码我们可以看出cache_t依然是一个结构体，在里面做了诸多判断，我们首先弄清楚这些判断是什么意思，点进其中一个发现

#if defined(__arm64__) && __LP64__
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16 //真机
#elif defined(__arm64__) && !__LP64__
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_LOW_4 //真机 非64位
#else
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_OUTLINED //MacOS、模拟器
#endif

显然这是对我们的架构进行了区分。可以看到在真机中把mask和buckets合并到一起为_maskAndBuckets。
通过源码得到cache_t包含了以下成员

buckets探索

点进buckets可以看到下面信息

#if __arm64__
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
#else
    explicit_atomic<SEL> _sel;
    explicit_atomic<uintptr_t> _imp;
#endif

类结构

对真机和非真机进行了判断，我们可以看到两个很重要的东西imp和sel，那么这是不是代表里面缓存类类的方法呢？同样我们可以通过地址偏移去分析。首先还是新建一个类SYPerson,添加个方法

- (void)helloWorld;

- (void)sayGoodJo;

//调用
SYPerson *person = [[SYPerson alloc]init];
        
[person helloWorld];

在方法调用前我们打下断点就行lldb调试打印如下

//获取类的首地址
(lldb) p/x pClass
(Class) $0 = 0x0000000100002420 CCPerson
//地址偏移0x10 即16位打印出cache_t地址
(lldb) p (cache_t *)0x0000000100002430
(cache_t *) $1 = 0x0000000100002430
//打印cache_t信息
(lldb) p *$1
(cache_t) $2 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x0000000100794100 {
      _sel = {
        std::__1::atomic<objc_selector *> = ""
      }
      _imp = {
        std::__1::atomic<unsigned long> = 3265552
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 3
  }
  _flags = 32784
  _occupied = 1
}
//调用buckets()方法
(lldb) p $2.buckets()
(bucket_t *) $3 = 0x0000000100794100
//打印buckets中的信息
(lldb) p *$3
(bucket_t) $4 = {
  _sel = {
    std::__1::atomic<objc_selector *> = ""
  }
  _imp = {
    std::__1::atomic<unsigned long> = 3265552
  }
}
//打印sel,发现了初始化的init方法
(lldb) p $4.sel()
(SEL) $5 = "init"
//试图打印其它方法
(lldb) p *($3+1)
(bucket_t) $6 = {
  _sel = {
    std::__1::atomic<objc_selector *> = (null) //发现为空，说明只有一个init方法
  }
  _imp = {
    std::__1::atomic<unsigned long> = 0
  }
}
//接下来执行一个实例方法打印
2020-09-19 18:59:40.848267+0800 KCObjc[8767:115568] SYPerson -- -[CCPerson helloWorld]
//再次打印
(lldb) p *($3+1)
(bucket_t) $7 = {
  _sel = {
    std::__1::atomic<objc_selector *> = ""
  }
  _imp = {
    std::__1::atomic<unsigned long> = 10496
  }
}
//打印sel,这里可以看出在执行完helloWorld方法后在cache中就可以找到了，说明已经缓存进去了
(lldb) p $7.sel()
(SEL) $8 = "helloWorld"
//打印imp
(lldb) p $7.imp(pClass)
(IMP) $9 = 0x0000000100000d20 (KCObjc`-[CCPerson helloWorld])

通过lldb调试发现init和helloWorld都加入了缓存。验证了之前所说的buckets中缓存了方法.

_occupied & _mask

_occupied表示哈希表中 sel-imp 的占用大小 (即可以理解为分配的内存中已经存储了sel-imp的的个数)，
_mask是指掩码数据，用于在哈希算法或者哈希冲突算法中计算哈希下标，其中mask 等于capacity - 1

cache_t在下层通过系统的算法分配内存空间时候会根据_occupied的值增加进行扩容，扩容后会将原来的内存都清除，重新开辟内存。

Cache_t结构分析

Cache_t初识

Cache_t源码探索

buckets探索

_occupied & _mask

猜你喜欢

热点阅读