iOS runtime, Part 4: A Detailed Analysis of entsize_list_

2022-09-06 Trigger_o

entsize_list_tt

In class_ro_t:
the member baseMethods has type method_list_t;
the member ivars has type ivar_list_t;
the member baseProperties has type property_list_t.
All three types derive from entsize_list_tt, which is defined in objc-runtime-new.h and is a very important basic type.

entsize_list_tt<Element, List, FlagMask, PointerModifier>
Generic implementation of an array of non-fragile structs.
Element is the struct type (e.g. method_t)
List is the specialization of entsize_list_tt (e.g. method_list_t)
FlagMask is used to stash extra bits in the entsize field (e.g. method list fixup markers)
PointerModifier is applied to the element pointers retrieved from the array.

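The comment first explains what this type is for: Element is the element type, List is the list type, and together they implement a collection.
The element must be a struct, the list must be a struct that derives from entsize_list_tt, and FlagMask marks the extra bits stored alongside the element size.
Since this is a collection, experience with Swift tells us that a collection must first be a sequence, and a sequence has only one requirement: an iterator. So an iterator plus a finite number of elements is enough to make a collection.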

struct PointerModifierNop {
    template <typename ListType, typename T>
    static T *modify(__unused const ListType &list, T *ptr) { return ptr; }
};

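PointerModifier is a policy type whose static modify function is applied to every element pointer as it is retrieved. The default, PointerModifierNop, returns the pointer unchanged.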
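To make the policy idea concrete, here is a minimal sketch. NopModifier, SkipOneModifier and IntList are hypothetical names invented for this example; they are not part of the runtime, and the "skip one" behavior is only there to make the effect of a non-trivial modifier visible.

```cpp
#include <cassert>

// Hypothetical policies in the style of PointerModifierNop.
struct NopModifier {
    template <typename ListType, typename T>
    static T *modify(const ListType &, T *ptr) { return ptr; }
};

// Illustrative only: returns the pointer shifted by one element.
struct SkipOneModifier {
    template <typename ListType, typename T>
    static T *modify(const ListType &, T *ptr) { return ptr + 1; }
};

// A tiny list whose element access is routed through the policy.
template <typename Modifier = NopModifier>
struct IntList {
    int items[3];
    int &get(int i) { return *Modifier::modify(*this, &items[i]); }
};
```

With the default policy, get(0) returns the first item; with SkipOneModifier the same call lands on the second one. The list code itself never changes, only the template argument does.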

template <typename Element, typename List, uint32_t FlagMask, typename PointerModifier = PointerModifierNop>
struct entsize_list_tt {
    uint32_t entsizeAndFlags;
    uint32_t count;

    Element& getOrEnd(uint32_t i) const {
        ASSERT(i <= count);
        return *PointerModifier::modify(*this, (Element *)((uint8_t *)this + sizeof(*this) + i*entsize()));
    }

    template <bool authenticated>
    struct iteratorImpl;
    using iterator = iteratorImpl<false>;
    using signedIterator = iteratorImpl<true>;

    template <bool authenticated>
    struct iteratorImpl {
        uint32_t entsize;
        uint32_t index;  // keeping track of this saves a divide in operator-

        using ElementPtr = std::conditional_t<authenticated, Element * __ptrauth(ptrauth_key_process_dependent_data, 1, 0xdead), Element *>;

        ElementPtr element;

        typedef std::random_access_iterator_tag iterator_category;
        typedef Element value_type;
        typedef ptrdiff_t difference_type;
        typedef Element* pointer;
        typedef Element& reference;

//...

To see how it is used, take ivar_list_t as an example:

struct ivar_list_t : entsize_list_tt<ivar_t, ivar_list_t, 0> { //... }

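The template arguments are the ivar_t struct, the ivar_list_t struct, 0, and the default PointerModifierNop.
ivar_list_t derives from entsize_list_tt and only adds one function. Because it passes itself as the List argument, inside entsize_list_tt the type List is the derived type (here ivar_list_t), which is why static_cast<const List*>(this) is valid.
entsizeAndFlags stores the element size and the flags in different bits: entsizeAndFlags & ~FlagMask yields entsize, and entsizeAndFlags & FlagMask yields the flags.
count is the number of elements.
getOrEnd returns a reference to the element at index i. The assert allows i == count, which is the one-past-the-end position, but nothing beyond it.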
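The layout just described can be sketched in a few lines. This is an illustration under assumptions, not the runtime's code: MiniList, Item, ItemList, makeItemList and the 0x3 flag mask are all hypothetical, and the elements live in a member array instead of compiler-emitted metadata. It also relies on the usual ABI behavior of placing the derived member right after the base header.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of entsize_list_tt: a header followed by
// `count` elements that are entsize() bytes apart.
template <typename Element, typename List, uint32_t FlagMask>
struct MiniList {
    uint32_t entsizeAndFlags;
    uint32_t count;

    uint32_t entsize() const { return entsizeAndFlags & ~FlagMask; }
    uint32_t flags()   const { return entsizeAndFlags & FlagMask; }

    // i == count is allowed: it names the one-past-the-end slot.
    Element& getOrEnd(uint32_t i) const {
        assert(i <= count);
        return *(Element *)((uint8_t *)this + sizeof(*this) + i * entsize());
    }
};

struct Item { uint32_t a; uint32_t b; };

// The derived type passes itself as the List parameter.
struct ItemList : MiniList<Item, ItemList, 0x3> {
    Item storage[2];   // assumed to sit directly after the 8-byte header
};

// Builds a two-element list with flag bit 0 set.
inline ItemList makeItemList() {
    ItemList l{};
    l.entsizeAndFlags = (uint32_t)sizeof(Item) | 0x1;
    l.count = 2;
    l.storage[0] = {1, 2};
    l.storage[1] = {3, 4};
    return l;
}
```

Since sizeof(Item) is 8, its low two bits are free, so the mask 0x3 can carry flags without disturbing the size. getOrEnd(2) is legal here and yields the address just past the last element, but it must not be read through.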

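Next comes the nested struct template iteratorImpl. Its bool template parameter, authenticated, selects whether the stored element pointer is signed. Two aliases are declared: iterator is iteratorImpl<false> (unsigned) and signedIterator is iteratorImpl<true> (signed).
The first `struct iteratorImpl;` is only a forward declaration of the nested template, so that the two aliases can be written before the definition.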

After that comes the actual definition of iteratorImpl. This struct implements the iterator, and it feels a bit like Sequence and IteratorProtocol in Swift.

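std::conditional_t<condition, Type1, Type2> works like a compile-time ternary: if the condition is true it names Type1, otherwise Type2.
authenticated says whether signing is wanted, so through the using alias ElementPtr is a plain Element * for iterator and a __ptrauth-qualified Element * for signedIterator.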
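A small sketch of the same selection mechanism follows. PickPtr is a hypothetical stand-in for iteratorImpl, and `const int *` stands in for the __ptrauth-qualified pointer, which needs an arm64e toolchain to compile.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for iteratorImpl: the bool parameter picks the
// pointer type at compile time.
template <bool authenticated>
struct PickPtr {
    using ElementPtr = std::conditional_t<authenticated, const int *, int *>;
};

// false selects the second type, true selects the first.
static_assert(std::is_same<PickPtr<false>::ElementPtr, int *>::value,
              "unauthenticated iterator uses the plain pointer");
static_assert(std::is_same<PickPtr<true>::ElementPtr, const int *>::value,
              "authenticated iterator uses the qualified pointer");
```

Both instantiations share one definition; only the type of the stored pointer differs, which is the effect the runtime is after.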

// continued from the code above
      iteratorImpl() { }
       iteratorImpl(const List& list, uint32_t start = 0)
            : entsize(list.entsize())
            , index(start)
            , element(&list.getOrEnd(start))
        { }

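These are iteratorImpl's two constructors. The second one accepts a start argument, which becomes the value of index, that is, the element the iterator is currently positioned at.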

  iterator begin() {
        return iterator(*static_cast<const List*>(this), 0);
    }
    iterator end() {
        return iterator(*static_cast<const List*>(this), count);
    }

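entsize_list_tt also has several begin and end methods. iterator is an alias of iteratorImpl, so these call iteratorImpl's constructor and return an iteratorImpl.
begin() passes 0, placing the iterator at the start; end() passes count, placing it at the end. This is very similar to startIndex and endIndex on a Swift Collection.
Note that, just like Swift's endIndex, end() is not a valid index: it sits one position past the last valid one. For example, when count is 1 the only valid index is 0 and end() is at offset 1; reading the element at offset 1 would be out of bounds. end() is only a marker.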

const iteratorImpl& operator += (ptrdiff_t delta) {
            element = (Element*)((uint8_t *)element + delta*entsize);
            index += (int32_t)delta;
            return *this;
        }
        const iteratorImpl& operator -= (ptrdiff_t delta) {
            element = (Element*)((uint8_t *)element - delta*entsize);
            index -= (int32_t)delta;
            return *this;
        }
        const iteratorImpl operator + (ptrdiff_t delta) const {
            return iteratorImpl(*this) += delta;
        }
        const iteratorImpl operator - (ptrdiff_t delta) const {
            return iteratorImpl(*this) -= delta;
        }

        iteratorImpl& operator ++ () { *this += 1; return *this; }
        iteratorImpl& operator -- () { *this -= 1; return *this; }
        iteratorImpl operator ++ (int) {
            iteratorImpl result(*this); *this += 1; return result;
        }
        iteratorImpl operator -- (int) {
            iteratorImpl result(*this); *this -= 1; return result;
        }

        ptrdiff_t operator - (const iteratorImpl& rhs) const {
            return (ptrdiff_t)this->index - (ptrdiff_t)rhs.index;
        }

        Element& operator * () const { return *element; }
        Element* operator -> () const { return element; }

        operator Element& () const { return *element; }

        bool operator == (const iteratorImpl& rhs) const {
            return this->element == rhs.element;
        }
        bool operator != (const iteratorImpl& rhs) const {
            return this->element != rhs.element;
        }

        bool operator < (const iteratorImpl& rhs) const {
            return this->element < rhs.element;
        }
        bool operator > (const iteratorImpl& rhs) const {
            return this->element > rhs.element;
        }

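This is how the iterator does its work. Much like offsetting an index in Swift, a whole set of overloaded operators moves the iterator around.
Two iterators can also be compared by position, using >, ==, < and !=.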

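All of this is implemented with pointer offsets.
element holds the address of the element the iterator currently points at (element 0 for an iterator created by begin()). += 1 moves it forward by entsize bytes, that is, to the address of the next element.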
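The byte-stride arithmetic can be tried out in isolation. The sketch below is an assumption-laden miniature: Elem, StrideIter and sumValues are hypothetical names, and only a few of the operators are reproduced. The point it shows is that the step is entsize, not sizeof(Elem), so it keeps working when each stored record is larger than the struct the code knows about.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Elem { uint32_t value; };

// Hypothetical stride iterator: advances by `entsize` bytes per step.
struct StrideIter {
    uint32_t entsize;
    uint32_t index;
    Elem *element;

    StrideIter& operator+=(ptrdiff_t delta) {
        element = (Elem *)((uint8_t *)element + delta * (ptrdiff_t)entsize);
        index += (int32_t)delta;
        return *this;
    }
    StrideIter& operator++() { return *this += 1; }
    Elem& operator*() const { return *element; }
    bool operator!=(const StrideIter& rhs) const { return element != rhs.element; }
    // Uses the tracked index, so no division by entsize is needed.
    ptrdiff_t operator-(const StrideIter& rhs) const {
        return (ptrdiff_t)index - (ptrdiff_t)rhs.index;
    }
};

// Sums `count` elements laid out `entsize` bytes apart starting at `base`.
inline uint32_t sumValues(void *base, uint32_t entsize, uint32_t count) {
    StrideIter it{entsize, 0, (Elem *)base};
    StrideIter end = it;
    end += count;                     // one past the last element
    uint32_t sum = 0;
    for (; it != end; ++it) sum += (*it).value;
    return sum;
}
```

If the records are 8 bytes wide with an Elem at the front of each, calling sumValues with entsize 8 visits every Elem and skips the trailing 4 bytes of each record.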

list_array_tt

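list_array_tt nests entsize_list_tt lists, matryoshka-style. It can act as a two-dimensional array, but that is not the whole story.
It can be empty, it can be a one-dimensional array, or it can be a two-dimensional array, and it uses different data layouts to get these three results (mainly the latter two). Keep this conclusion in mind and the code becomes much easier to follow.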

In class_rw_ext_t:
methods has type method_array_t;
properties has type property_array_t;
protocols has type protocol_array_t.
All three types derive from list_array_tt, unlike their counterparts in ro.

list_array_tt<Element, List, Ptr>
Generic implementation for metadata that can be augmented by categories.
Element is the underlying metadata type (e.g. method_t)
List is the metadata's list type (e.g. method_list_t)
Ptr is a template applied to Element to make Element *. Useful for applying qualifiers to the pointer type.
A list_array_tt has one of three values:

empty
a pointer to a single list
an array of pointers to lists
countLists/beginLists/endLists iterate the metadata lists
count/begin/end iterate the underlying metadata elements

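The comment first says that Element is something like method_t and is called the metadata, while List is something like method_list_t and is called the metadata's list.
So this is a list of Lists, which gives a two-level collection. It can be in one of three states: empty, a pointer to a single list, or an array of pointers to lists.
Note what this means: it may have no level at all, one level, or two levels. The structure is not fixed.
countLists/beginLists/endLists iterate over the first level (the lists).
count/begin/end iterate over the second level (the elements).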

template <typename Element, typename List, template<typename> class Ptr>
class list_array_tt {
//...

Like entsize_list_tt, list_array_tt is a template. As before, let's first look at what gets passed at a point of use.

class property_array_t : 
    public list_array_tt<property_t, property_list_t, RawPtr>
{

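Take property_array_t: it passes the property_t struct,
then property_list_t, the type mentioned earlier that stores properties in ro. It derives from entsize_list_tt, which means list_array_tt is built on top of entsize_list_tt.
The last argument, Ptr, must itself be a template: a template template parameter.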

 struct array_t {
        uint32_t count;
        Ptr<List> lists[0];
    };

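First there is a struct array_t. It holds count, the number of entries, and an array whose entries are pointers to structs derived from entsize_list_tt.
array_t is what embodies the two-level structure. If it has not been set up, the list_array_tt is not in its two-level form.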

protected:
    template <bool authenticated>
    class iteratorImpl {
        const Ptr<List> *lists;
        const Ptr<List> *listsEnd;

        template<bool B>
        struct ListIterator {
            using Type = typename List::signedIterator;
            static Type begin(Ptr<List> ptr) { return ptr->signedBegin(); }
            static Type end(Ptr<List> ptr) { return ptr->signedEnd(); }
        };
        template<>
        struct ListIterator<false> {
            using Type = typename List::iterator;
            static Type begin(Ptr<List> ptr) { return ptr->begin(); }
            static Type end(Ptr<List> ptr) { return ptr->end(); }
        };
        typename ListIterator<authenticated>::Type m, mEnd;

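Next is a two-level iterator: the outer class is iteratorImpl and the inner helper is ListIterator.
ListIterator has two definitions, a primary template and a specialization for false, each with its own Type alias: one for the signed case and one for the unsigned case.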

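In `using Type = typename List::iterator;`, `List::` is a scope qualifier: it takes the iterator type declared inside List and gives it the alias Type. List here is a type derived from entsize_list_tt.
The begin method defined below it simply calls begin() on the entsize_list_tt list, so the inner iteration still relies on entsize_list_tt.
Also note that begin and end are both just iterators; only their positions differ.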

typename ListIterator<authenticated>::Type m, mEnd;
iteratorImpl(const Ptr<List> *begin, const Ptr<List> *end)
            : lists(begin), listsEnd(end)
        {
            if (begin != end) {
                m = ListIterator<authenticated>::begin(*begin);
                mEnd = ListIterator<authenticated>::end(*begin);
            }
        }

        const Element& operator * () const {
            return *m;
        }
        Element& operator * () {
            return *m;
        }

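Two variables are declared, m and mEnd. Both are inner-list iterators (List::iterator or List::signedIterator), and dereferencing one yields an Element such as method_t.
Then comes iteratorImpl's constructor, which takes two pointers into the array of lists, begin and end.
ListIterator<authenticated>::begin(*begin) calls ListIterator's begin on the first list and returns an iterator positioned at that list's first element, which is assigned to m. mEnd is set the same way to the first list's end iterator.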

 bool operator != (const iteratorImpl& rhs) const {...
const iteratorImpl& operator ++ () {...

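These are two operator overloads (bodies omitted here): != compares two iterators, and ++ advances to the next element.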
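Since the two bodies are elided above, here is a sketch of how such a two-level iterator can work. It is an assumption about the general shape, not a copy of the runtime: FlatIter, Inner and flatten are hypothetical, std::vector<int> replaces the metadata lists, and every inner list is assumed to be non-empty.

```cpp
#include <cassert>
#include <vector>

using Inner = std::vector<int>;

// Hypothetical flattening iterator over a range of Inner* pointers.
struct FlatIter {
    const Inner *const *lists;
    const Inner *const *listsEnd;
    Inner::const_iterator m, mEnd;

    FlatIter(const Inner *const *begin, const Inner *const *end)
        : lists(begin), listsEnd(end) {
        if (begin != end) { m = (*begin)->begin(); mEnd = (*begin)->end(); }
    }
    int operator*() const { return *m; }
    bool operator!=(const FlatIter& rhs) const {
        if (lists != rhs.lists) return true;
        if (lists == listsEnd) return false;   // both are at the very end
        return m != rhs.m;
    }
    FlatIter& operator++() {
        ++m;
        if (m == mEnd) {                        // current list exhausted
            ++lists;                            // step to the next list
            if (lists != listsEnd) { m = (*lists)->begin(); mEnd = (*lists)->end(); }
        }
        return *this;
    }
};

// Collects every element of every list, in order.
inline std::vector<int> flatten(const std::vector<const Inner *>& lists) {
    std::vector<int> out;
    const Inner *const *b = lists.data();
    const Inner *const *e = b + lists.size();
    for (FlatIter it(b, e), end(e, e); it != end; ++it) out.push_back(*it);
    return out;
}
```

The outer pointer only moves when the inner iterator hits the end of its list, so a caller sees one continuous run of elements.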

union {
        Ptr<List> list;
        uintptr_t arrayAndFlag;
    };

    bool hasArray() const {
        return arrayAndFlag & 1;
    }

    array_t *array() const {
        return (array_t *)(arrayAndFlag & ~1);
    }

    void setArray(array_t *array) {
        arrayAndFlag = (uintptr_t)array | 1;
    }

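This union is a member of list_array_tt. list is what is used when list_array_tt acts as a one-dimensional array, while arrayAndFlag records whether it is a two-dimensional array.
hasArray() tests only the lowest bit of arrayAndFlag. A plain list pointer is aligned, so its lowest bit is 0; setArray sets that bit to 1 to mark that the value is an array_t pointer.
The two functions below it are the getter and the setter; the getter masks the flag bit off again.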
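The low-bit tagging can be shown on its own. In this sketch Slot, List and Array are hypothetical, and a single uintptr_t replaces the union so that no type punning is involved; it assumes, as the runtime does, that the pointed-to objects are at least 2-byte aligned so bit 0 of a real pointer is always 0.

```cpp
#include <cassert>
#include <cstdint>

struct List  { uint32_t count; };
struct Array { uint32_t count; };

// Hypothetical miniature: one word holds either a List* (bit 0 clear)
// or an Array* with bit 0 set as a tag.
struct Slot {
    uintptr_t bits = 0;

    bool hasArray() const { return bits & 1; }
    Array *array() const { return (Array *)(bits & ~(uintptr_t)1); }
    List *list() const { return hasArray() ? nullptr : (List *)bits; }

    void setArray(Array *a) { bits = (uintptr_t)a | 1; }
    void setList(List *l)   { bits = (uintptr_t)l; }
};
```

A non-null list pointer leaves hasArray() false even though the word is non-zero, which is exactly why the check is `& 1` and not a comparison against 0.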

list_array_tt() : list(nullptr) { }
    list_array_tt(List *l) : list(l) { }
    list_array_tt(const list_array_tt &other) {
        *this = other;
    }

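At last we reach list_array_tt's constructors. There are three: the first creates an empty one, the second takes a single list, and the third copies another list_array_tt.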

inline uint32_t countLists(const std::function<const array_t * (const array_t *)> & peek) const {
        if (hasArray()) {
            return peek(array())->count;
        } else if (list) {
            return 1;
        } else {
            return 0;
        }
    }

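This returns the number of Lists. If there is an array, the structure is two-dimensional and the result is the count stored in array_t;
if it is one-dimensional, the union's list is non-null and the result is 1;
if it is empty, the result is 0.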

   const Ptr<List>* beginLists() const {
        if (hasArray()) {
            return array()->lists;
        } else {
            return &list;
        }
    }

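beginLists (above) and endLists (below) delimit the range of Lists.
beginLists: if there is an array, it returns the array name, because the array name decays to the address of the first entry; if there is no array, it returns the address of the union's list.
endLists: if there is an array, it returns the array name plus an offset equal to the number of entries. If there is only a list, it returns the position just past list. If there is nothing at all, it returns &list, the same address beginLists returns, so the range is empty.
The reason was given earlier: like Swift's endIndex, the end position is not a valid index. It sits one past the last valid one. When the count is 1, the only valid index is 0 and the end is at offset 1; reading at offset 1 would be out of bounds. It is only a marker.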

using iterator = iteratorImpl<false>;
    const Ptr<List>* endLists() const {
        if (hasArray()) {
            return array()->lists + array()->count;
        } else if (list) {
            return &list + 1;
        } else {
            return &list;
        }
    }

iterator begin() const {
        return iterator(beginLists(), endLists());
    }

    iterator end() const {
        auto e = endLists();
        return iterator(e, e);
    }

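These two place the iterator at the start and at the end. Both call the iterator's constructor with a first List position and an end List position, and the iterator moves between the two.
After begin(), m is positioned at the first Element of the first List and mEnd is that same List's end iterator, one past its last Element. end() passes (e, e), an empty range, so m and mEnd are never set.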

uint32_t count() const {
        uint32_t result = 0;
        for (auto lists = beginLists(), end = endLists();  lists != end; ++lists)
        {
            result += (*lists)->count;
        }
        return result;
    }

This returns the total number of Elements: it walks from the first List to the last and adds up their element counts.
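The three states and the two kinds of counting can be modeled with standard containers. This is a simplified sketch, not the runtime's layout: ListArray and List are hypothetical, and a std::vector plus a separate pointer stand in for the tagged union.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct List { uint32_t count; };

// Hypothetical model of the three states: empty, one list, many lists.
struct ListArray {
    const List *single = nullptr;       // used when there is exactly one list
    std::vector<const List *> many;     // used when there are several

    const List *const *beginLists() const {
        return many.empty() ? &single : many.data();
    }
    const List *const *endLists() const {
        if (!many.empty()) return many.data() + many.size();
        return single ? &single + 1 : &single;   // empty: end == begin
    }
    // Number of lists (first level).
    uint32_t countLists() const {
        return (uint32_t)(endLists() - beginLists());
    }
    // Number of elements across all lists (second level).
    uint32_t count() const {
        uint32_t result = 0;
        for (auto p = beginLists(), e = endLists(); p != e; ++p)
            result += (*p)->count;
        return result;
    }
};
```

In every state the pair beginLists()/endLists() forms a half-open range, so count() needs no special case for the empty or single-list form.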

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            array_t *newArray = (array_t *)malloc(array_t::byteSize(newCount));
            newArray->count = newCount;
            array()->count = newCount;

            for (int i = oldCount - 1; i >= 0; i--)
                newArray->lists[i + addedCount] = array()->lists[i];
            for (unsigned i = 0; i < addedCount; i++)
                newArray->lists[i] = addedLists[i];
            free(array());
            setArray(newArray);
            validate();
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
            validate();
        } 
        else {
            // 1 list -> many lists
            Ptr<List> oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            for (unsigned i = 0; i < addedCount; i++)
                array()->lists[i] = addedLists[i];
            validate();
        }
    }

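This function shows how list_array_tt really works.
The arguments are an array of List * and the number of Lists in that array.
1. If there is already an array, the structure is two-dimensional and a new array has to be created. The new length is newCount, and memory is allocated for it.
The old entries are then moved to the back of the new array and the new entries are placed at the front.
2. If there is no array and no list, that is, it is empty, and only one List is being added, list is simply pointed at that List.
3. If there is already a list, or more than one List is passed in, an array has to be created, and the old list (if any) is placed last.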
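The ordering rule is the part worth remembering, so here is a sketch that models only that. It is not the runtime's memory handling: this attachLists is a hypothetical free function, and strings in a std::vector stand in for the List pointers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the ordering: newly attached lists are placed in
// front of the lists that were already there.
inline void attachLists(std::vector<std::string>& lists,
                        const std::vector<std::string>& added) {
    if (added.empty()) return;
    std::vector<std::string> merged;
    merged.reserve(lists.size() + added.size());
    merged.insert(merged.end(), added.begin(), added.end());   // new first
    merged.insert(merged.end(), lists.begin(), lists.end());   // old after
    lists.swap(merged);
}
```

Attaching "class" and then two category lists leaves the category lists ahead of "class". Anything that walks the lists front to back therefore sees the most recently attached entries first.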
