大师兄的Python源码学习笔记(四十七): Python的内存
2021-12-24 本文已影响0人
superkmi
大师兄的Python源码学习笔记(四十六): Python的内存管理机制(一)
大师兄的Python源码学习笔记(四十八): Python的内存管理机制(三)
二、小块空间的内存池
- 在Python中,经常需要申请小块内存,这些小块内存在申请后,很快会被释放。
- 由于这些小块内存的申请并不是为了创建对象,所以没有对象一级的内存池机制,这意味着需要执行大量的
malloc
和free
操作。 - 为了提高效率,Python引入了一个内存池机制,用于管理对小块内存的申请和释放,也就是Pymalloc机制,通过
PyObject_Malloc
、PyObject_Realloc
和PyObject_Free
三个接口显示给Python。 - 这个小块内存的内存池可以视为一个层次结构,从下至上分别是block、pool、arena和内存池。
- 其中block、pool、arena在代码中可以找到实体,而内存池只是一个概念,表示Python对整个小块内存分配和释放行为的内存管理机制。
1. block
- 在最底层,block是一个确定大小的内存块。
- block有很多种,不同种类的block有不同的内存大小,这个值为size class。
- 为了在当前32位和64位平台获得最佳性能,所有block的size class都是8字节对齐的。
Objects\obmalloc.c
* Allocation strategy abstract:
*
* For small requests, the allocator sub-allocates <Big> blocks of memory.
* Requests greater than SMALL_REQUEST_THRESHOLD bytes are routed to the
* system's allocator.
*
* Small requests are grouped in size classes spaced 8 bytes apart, due
* to the required valid alignment of the returned address. Requests of
* a particular size are serviced from memory pools of 4K (one VMM page).
* Pools are fragmented on demand and contain free lists of blocks of one
* particular size class. In other words, there is a fixed-size allocator
* for each size class. Free pools are shared by the different allocators
* thus minimizing the space reserved for a particular size class.
*
* This allocation strategy is a variant of what is known as "simple
* segregated storage based on array of free lists". The main drawback of
* simple segregated storage is that we might end up with lot of reserved
* memory for the different free lists, which degenerate in time. To avoid
* this, we partition each free list in pools and we share dynamically the
* reserved space between all free lists. This technique is quite efficient
* for memory intensive programs which allocate mainly small-sized blocks.
*
* For small requests we have the following table:
*
* Request in bytes Size of allocated block Size class idx
* ----------------------------------------------------------------
* 1-8 8 0
* 9-16 16 1
* 17-24 24 2
* 25-32 32 3
* 33-40 40 4
* 41-48 48 5
* 49-56 56 6
* 57-64 64 7
* 65-72 72 8
* ... ... ...
* 497-504 504 62
* 505-512 512 63
*
* 0, SMALL_REQUEST_THRESHOLD + 1 and up: routed to the underlying
* allocator.
*/
/*==========================================================================*/
/*
* -- Main tunable settings section --
*/
/*
* Alignment of addresses returned to the user. 8-bytes alignment works
* on most current architectures (with 32-bit or 64-bit address busses).
* The alignment value is also used for grouping small requests in size
* classes spaced ALIGNMENT bytes apart.
*
* You shouldn't change this unless you know what you are doing.
*/
#define ALIGNMENT 8 /* must be 2^N */
#define ALIGNMENT_SHIFT 3
/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
- 此外,block的上限被设定为512,当申请内存小于这个上限时,交给不同种类的block处理;当申请的内存超过上限,则交给第一层的内存管理机制,即PyMem函数来处理。
Objects\obmalloc.c
#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES (SMALL_REQUEST_THRESHOLD / ALIGNMENT)
- 根据SMALL_REQUEST_THRESHOLD和ALIGNMENT的限定,可以看到不同种类block的size class对应一个Size class idx,下面两个公式用来转换size class和size class idx:
Objects\obmalloc.c
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
- 但是在Python中,block只是一个概念,源码中并没有与之对应的实体存在,我们知道他是具有一定大小的内存,但不与Python中的某个实体对应。
- 然后Python提供了一个用来管理block的实体,就是pool。
2. pool
- 一组block的集合称为一个pool,一个pool管理着一堆固定大小的内存块。
- 事实上,pool管理着一大块内存,并通过策略将这块大内存划分为多个小内存块。
- 一个pool的大小通常为一个系统内存页,由于当前大多数Python支持的系统内存页都是4KB,所以pool的大小通常定义为4KB。
Objects\obmalloc.c
/*
* The system's VMM page size can be obtained on most unices with a
* getpagesize() call or deduced from various header files. To make
* things simpler, we assume that it is 4K, which is OK for most systems.
* It is probably better if this is the native page size, but it doesn't
* have to be. In theory, if SYSTEM_PAGE_SIZE is larger than the native page
* size, then `POOL_ADDR(p)->arenaindex' could rarely cause a segmentation
* violation fault. 4K is apparently OK for all the platforms that python
* currently targets.
*/
#define SYSTEM_PAGE_SIZE (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK (SYSTEM_PAGE_SIZE - 1)
... ...
/*
* Size of the pools used for small blocks. Should be a power of 2,
* between 1K and SYSTEM_PAGE_SIZE, that is: 1k, 2k, 4k.
*/
#define POOL_SIZE SYSTEM_PAGE_SIZE /* must be 2^N */
#define POOL_SIZE_MASK SYSTEM_PAGE_SIZE_MASK
- pool在Python对应的结构体是pool_header:
Objects\obmalloc.c
/* Pool for small blocks. */
struct pool_header {
union { block *_padding;
uint count; } ref; /* number of allocated blocks */
block *freeblock; /* pool's free list head */
struct pool_header *nextpool; /* next pool of this size class */
struct pool_header *prevpool; /* previous pool "" */
uint arenaindex; /* index into arenas of base adr */
uint szidx; /* block size class index */
uint nextoffset; /* bytes to virgin block */
uint maxnextoffset; /* largest valid nextoffset */
};
- 从名字可以看出,pool_header是一个pool的头部。
- 一个pool管理的所有block大小都是一样的,也就是他们的size class idx是一样的,由pool_header中的szidx定义。
- 假如有一块4KB的内存,看看Python如何将其改造成pool:
Objects\obmalloc.c
static int
pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
{
block *bp;
poolp pool;
poolp next;
uint size;
... ...
init_pool:
/* Frontlink to used pools. */
next = usedpools[size + size]; /* == prev */
pool->nextpool = next;
pool->prevpool = next;
next->nextpool = pool;
next->prevpool = pool;
pool->ref.count = 1;
if (pool->szidx == size) {
/* Luckily, this pool last contained blocks
* of the same size class, so its header
* and free list are already initialized.
*/
bp = pool->freeblock;
assert(bp != NULL);
pool->freeblock = *(block **)bp;
goto success;
}
/*
* Initialize the pool header, set up the free list to
* contain just the second block, and return the first
* block.
*/
pool->szidx = size;
size = INDEX2SIZE(size);
bp = (block *)pool + POOL_OVERHEAD;
pool->nextoffset = POOL_OVERHEAD + (size << 1);
pool->maxnextoffset = POOL_SIZE - size;
pool->freeblock = bp + size;
*(block **)(pool->freeblock) = NULL;
goto success;
}
... ...
success:
UNLOCK();
assert(bp != NULL);
*ptr_p = (void *)bp;
return 1;
}
- 最后传出的*ptr_p(bp)也就是从pool中取出的第一块block的指针。
- 由于第一块block已经分配了,也就是已经分配了1块block,所以
pool->ref.count = 1
。 - 此外,*ptr_p(bp)指向的地址后有将近4KB内存都是可用的,但申请内存的函数只会使用[bp,bp+size]这个区间的内存,这是由size class idx保证的,所以经过改造后的4KB内存如下:
- 图中的实线箭头是指针,虚线箭头是偏移位置,在nextoffset和maxnextoffset中存储的是相对于pool头部的偏移位置
- 现在假设申请5块28字节的内存,由于28个字节对应的size class idx为3,所以实际会申请5块32字节的内存:
Objects\obmalloc.c
static int
pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
{
block *bp;
poolp pool;
poolp next;
uint size;
... ...
if (pool != pool->nextpool) {
/*
* There is a used pool for this size class.
* Pick up the head block of its free list.
*/
++pool->ref.count;
bp = pool->freeblock;
assert(bp != NULL);
if ((pool->freeblock = *(block **)bp) != NULL) {
goto success;
}
/*
* Reached the end of the free list, try to extend it.
*/
if (pool->nextoffset <= pool->maxnextoffset) {
/* There is room for another block. */
pool->freeblock = (block*)pool +
pool->nextoffset;
pool->nextoffset += INDEX2SIZE(size);
*(block **)(pool->freeblock) = NULL;
goto success;
}
... ...
}
}
- 可以看出freeblock是下一个可用block的起始地址,当再次申请32字节的block时,只需要返回freeblock只想的地址就可以。
- nextoffset和maxoffset是两个用于对pool中的block集合进行迭代的变量,它所指的偏移位置正好指向了freeblock之后的下一个可用的block地址,在分配block之后,freeblock和nextoffset都会向前移动一个block的距离,如此反复进行遍历。
- maxoffset指向pool中的最后一个可用block,它界定了pool的边界。
- 除了freeblock之外,Python还设置了*freeblock:
Objects\obmalloc.c
static int
pymalloc_free(void *ctx, void *p)
{
poolp pool;
block *lastfree;
poolp next, prev;
uint size;
... ...
/* Link p to the start of the pool's freeblock list. Since
* the pool had at least the p block outstanding, the pool
* wasn't empty (so it's already in a usedpools[] list, or
* was full and is in no list -- it's not in the freeblocks
* list in any case).
*/
assert(pool->ref.count > 0); /* else it was empty */
*(block **)p = lastfree = pool->freeblock;
pool->freeblock = (block *)p;
}
- 当一个block被释放时,freeblock虽然指向了一个有效的pool内地址,但*freeblock为NULL。
- 假设Python释放的是blockA,在代码清单1之后,A中的第一个字节的值被设置成为当前的freeblock值,而在代码清单2之后,freeblock的值被更新了,指向block A的首地址。
- 这样,一个block被插入到了离散自由的block链表中。
- 当第2块和第4块block都被释放后,可以看到一个初具规模的离散自由block链表:
- 对于这个离散自由block链表,可以从freeblock开始很容易的以
freeblock = *freeblock
的方式遍历这条链表,当*freeblock为NULL时,表明到达了链表尾部。
Objects\obmalloc.c
static int
pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
{
block *bp;
poolp pool;
poolp next;
uint size;
... ...
if (pool != pool->nextpool) {
/*
* There is a used pool for this size class.
* Pick up the head block of its free list.
*/
++pool->ref.count;
bp = pool->freeblock;
assert(bp != NULL);
if ((pool->freeblock = *(block **)bp) != NULL) {
goto success;
}
... ...
}
- 在前面出现过的这段代码中,可以看到判断
pool->freeblock = *(block **)bp
,如果为真,则表示离散链表已经不存在了。 - 如果可能则继续分配pool的nextoffset指定的下一块block,否则就提供另一个pool。
- 这意味着也存在一个pool的集合