__gnu_cxx::__mt_alloc is the memory allocator in the libstdc++ extension library that supports multithreaded applications.
__gnu_cxx::__mt_alloc is a fixed-size (power-of-2) memory allocator, originally designed for multithreaded applications (hereafter MT programs). After years of refinement it now also performs well in single-threaded applications (hereafter ST programs).
template<bool _Thread> class __pool
This class parameterizes thread support and is explicitly specialized for the multithreaded (bool == true) and single-threaded (bool == false) cases. It can be swapped out for a class with custom parameters.
For the policy classes there are at least two distinct flavors, each of which can be combined independently with the pool parameters above:
Policy one:
__common_pool_policy implements a common pool: even objects of different types, such as char and long, share the same pool. This is the default policy.
template<bool _Thread> struct __common_pool_policy
template<typename _Tp, bool _Thread> struct __per_type_pool_policy
Policy two:
__per_type_pool_policy implements a separate pool for each object type, so char and long use different pools; this lets individual types be tuned on their own.
The actual allocator:
template<typename _Tp, typename _Poolp = __default_policy>
  class __mt_alloc : public __mt_alloc_base<_Tp>, _Poolp
This class provides the interface required by the standard library, such as the allocate and deallocate functions.
Several configuration parameters can be modified and tuned. They live in a nested class that holds all of the tunables:
struct __pool_base::_Tune
Code: the default configuration (8-byte alignment; allocations above 128 bytes go directly to new; the smallest allocatable block is 8 bytes; each chunk requested from the OS is 4096 - 4 * sizeof(void*) bytes, i.e. 4080 on 32-bit platforms; at most 4096 threads are supported; a single thread may keep 10% of its blocks on its freelist; whether new and delete are used directly depends on getenv("GLIBCXX_FORCE_NEW")):
// Variables used to configure the behavior of the allocator,
// assigned and explained in detail below.
struct _Tune
{
  // Compile time constants for the default _Tune values.
  enum { _S_align = 8 };
  enum { _S_max_bytes = 128 };
  enum { _S_min_bin = 8 };
  enum { _S_chunk_size = 4096 - 4 * sizeof(void*) };
  enum { _S_max_threads = 4096 };
  enum { _S_freelist_headroom = 10 };
The tunables are:
1) byte alignment
2) the size above which allocations go directly to new
3) the size of the smallest allocatable block
4) the size of each chunk requested from the OS
5) the maximum number of supported threads
6) the percentage of free blocks a single thread may keep (excess free blocks are returned to the global freelist)
7) whether to use new and delete directly
These parameters can be set and read through accessors:
const _Tune&
_M_get_options() const
{ return _M_options; }

void
_M_set_options(_Tune __t)
{
  if (!_M_init)
    _M_options = __t;
}
The parameters must be adjusted before any allocation takes place, i.e. when the allocator is initialized, for example:
#include <ext/mt_allocator.h>

struct pod
{
  int i;
  int j;
};

int main()
{
  typedef pod value_type;
  typedef __gnu_cxx::__mt_alloc<value_type> allocator_type;
  typedef __gnu_cxx::__pool_base::_Tune tune_type;

  tune_type t_default;
  // 16-byte alignment; allocations above 5120 bytes use new directly;
  // smallest allocatable block 32 bytes; 5120-byte chunks from the OS;
  // at most 20 threads; each thread keeps at most 10% free blocks.
  tune_type t_opt(16, 5120, 32, 5120, 20, 10, false);
  tune_type t_single(16, 5120, 32, 5120, 1, 10, false);

  tune_type t;
  t = allocator_type::_M_get_options();
  allocator_type::_M_set_options(t_opt);
  t = allocator_type::_M_get_options();

  allocator_type a;
  allocator_type::pointer p1 = a.allocate(128);
  allocator_type::pointer p2 = a.allocate(5128);
  a.deallocate(p1, 128);
  a.deallocate(p2, 5128);
  return 0;
}
The first call to allocate() also invokes _S_init(). In an MT program, to guarantee it runs exactly once, we use __gthread_once (with arguments _S_once_mt and _S_init); in an ST program we check the static bool _S_initialized instead.
_S_init(): if the GLIBCXX_FORCE_NEW environment variable is set, it sets _S_force_new to true, making allocate() request memory directly with new and deallocate() release it with delete.
The following assumes GLIBCXX_FORCE_NEW is not set.
Both ST and MT mode:
1) Compute the number of power-of-2 size classes (the number of _M_bin entries).
A bin (_M_bin entry) manages the set of blocks of one power-of-2 size. By default __mt_alloc only handles small allocations of up to 128 bytes (changeable via _S_max_bytes in _S_init()), giving bins of the following byte sizes: 1, 2, 4, 8, 16, 32, 64, 128.
2) Build the _M_binmap array (the size-to-bin lookup table):
Every request is rounded up to a power of 2, so a 29-byte request is served by the 32-byte bin. The _M_binmap array exists to locate the right bin quickly: the value 29, for example, maps to 5 (bin 5 = 32 bytes).
3) Create the _M_bin array (the array of _Bin_record block managers):
This array consists of _Bin_record entries; its length is the bin count computed above, e.g. 8 when _S_max_bytes = 128.
4) Initialize each block manager (_Bin_record):
_M_first is an array of _Block_record pointers, with one slot per thread the program may have (an ST program has a single thread; an MT program allows up to _S_max_threads). _M_first stores, per thread, the address of the first free block in this bin; for example, to find a 32-byte free block for thread 3 we just use _M_bin[5]._M_first[3]. Initially every element of _M_first is NULL.
MT mode only:
5) Build a freelist of available thread ids (values from 1 to _S_max_threads), with _S_thread_freelist_first as its head. __gthread_self() does not return the value between 1 and _S_max_threads that we need; it returns something like a process id, essentially a random number. So we build a linked list of _S_max_threads thread_record entries, initializing the thread_id field of each in turn to 1, 2, 3, up to _S_max_threads; these ids serve as the index into _M_first from step 4). When a thread calls allocate() or deallocate(), we call _S_get_thread_id(), which checks the thread-local variable _S_thread_key. If it is NULL the thread is new, and we hand it an entry from the _S_thread_freelist_first list. On subsequent calls _S_get_thread_id() finds this record, and from its thread_id field locates the right slot in each bin. Thus the first thread to call allocate() receives the thread_record with thread_id = 1, so its bin index is 1, and we fetch a free 32-byte block for it with _M_bin[5]._M_first[1]. When _S_thread_key is created we register a destructor, so that when a thread exits its thread_record is returned to _S_thread_freelist_first for reuse. The _S_thread_freelist_first list is protected by a mutex, taken whenever entries are added or removed.
6) Initialize the free and used block counters of each _Bin_record. _Bin_record->free is an array of size_t recording the number of free blocks per thread; _Bin_record->used is likewise an array of size_t recording the number of blocks in use per thread. All elements start at 0.
7) Initialize each _Bin_record's mutex. _Bin_record->mutex protects the bin's global freelist and is taken whenever blocks are added to or removed from it, which only happens when a thread fetches memory from the global freelist or returns some to it.
Source code (pool base class definition):
/// @brief Base class for pool object.
struct __pool_base
{
  // Using short int as type for the binmap implies we are never
  // caching blocks larger than 32768 with this allocator.
  typedef unsigned short int _Binmap_type;

  // Variables used to configure the behavior of the allocator,
  // assigned and explained in detail below.
  struct _Tune
  {
    // Compile time constants for the default _Tune values.
    enum { _S_align = 8 };                              // byte alignment
    enum { _S_max_bytes = 128 };                        // allocations above this use new
    enum { _S_min_bin = 8 };                            // smallest allocatable block
    enum { _S_chunk_size = 4096 - 4 * sizeof(void*) };  // chunk size requested from the OS
    enum { _S_max_threads = 4096 };                     // maximum number of supported threads
    enum { _S_freelist_headroom = 10 };                 // % of free blocks a thread may keep
                                                        // (excess goes to the global freelist)

    // Alignment needed.
    // NB: In any case must be >= sizeof(_Block_record), that
    // is 4 on 32 bit machines and 8 on 64 bit machines.
    size_t _M_align;

    // Allocation requests (after round-up to power of 2) below
    // this value will be handled by the allocator. A raw new/
    // call will be used for requests larger than this value.
    // NB: Must be much smaller than _M_chunk_size and in any
    // case <= 32768.
    size_t _M_max_bytes;

    // Size in bytes of the smallest bin.
    // NB: Must be a power of 2 and >= _M_align (and of course
    // much smaller than _M_max_bytes).
    size_t _M_min_bin;

    // In order to avoid fragmenting and minimize the number of
    // new() calls we always request new memory using this
    // value. Based on previous discussions on the libstdc++
    // mailing list we have choosen the value below.
    // See http://gcc.gnu.org/ml/libstdc++/2001-07/msg00077.html
    // NB: At least one order of magnitude > _M_max_bytes.
    size_t _M_chunk_size;

    // The maximum number of supported threads. For
    // single-threaded operation, use one. Maximum values will
    // vary depending on details of the underlying system. (For
    // instance, Linux 2.4.18 reports 4070 in
    // /proc/sys/kernel/threads-max, while Linux 2.6.6 reports
    // 65534)
    size_t _M_max_threads;

    // Each time a deallocation occurs in a threaded application
    // we make sure that there are no more than
    // _M_freelist_headroom % of used memory on the freelist. If
    // the number of additional records is more than
    // _M_freelist_headroom % of the freelist, we move these
    // records back to the global pool.
    size_t _M_freelist_headroom;

    // Set to true forces all allocations to use new().
    bool _M_force_new;

    explicit
    _Tune()
    : _M_align(_S_align), _M_max_bytes(_S_max_bytes),
      _M_min_bin(_S_min_bin), _M_chunk_size(_S_chunk_size),
      _M_max_threads(_S_max_threads),
      _M_freelist_headroom(_S_freelist_headroom),
      _M_force_new(getenv("GLIBCXX_FORCE_NEW") ? true : false)
    { }

    explicit
    _Tune(size_t __align, size_t __maxb, size_t __minbin, size_t __chunk,
          size_t __maxthreads, size_t __headroom, bool __force)
    : _M_align(__align), _M_max_bytes(__maxb), _M_min_bin(__minbin),
      _M_chunk_size(__chunk), _M_max_threads(__maxthreads),
      _M_freelist_headroom(__headroom), _M_force_new(__force)
    { }
  };

  struct _Block_address
  {
    void*           _M_initial;
    _Block_address* _M_next;
  };

  const _Tune&
  _M_get_options() const
  { return _M_options; }

  void
  _M_set_options(_Tune __t)
  {
    if (!_M_init)
      _M_options = __t;
  }

  bool
  _M_check_threshold(size_t __bytes)
  { return __bytes > _M_options._M_max_bytes || _M_options._M_force_new; }

  size_t
  _M_get_binmap(size_t __bytes)
  { return _M_binmap[__bytes]; }

  const size_t
  _M_get_align()
  { return _M_options._M_align; }

  explicit
  __pool_base()
  : _M_options(_Tune()), _M_binmap(NULL), _M_init(false) { }

  explicit
  __pool_base(const _Tune& __options)
  : _M_options(__options), _M_binmap(NULL), _M_init(false) { }

private:
  explicit
  __pool_base(const __pool_base&);

  __pool_base&
  operator=(const __pool_base&);

protected:
  // Configuration options.
  _Tune         _M_options;

  _Binmap_type* _M_binmap;

  // Configuration of the pool object via _M_options can happen
  // after construction but before initialization. After
  // initialization is complete, this variable is set to true.
  bool _M_init;   // initialization flag
};
Memory layout of the freelists:
In an ST program all operations go through the global pool, i.e. thread_id 0 (an id that no thread in an MT program is ever assigned).
When the program requests memory (calls allocate()), we first check whether the request is larger than _S_max_bytes; if so, new is used directly.
Otherwise the right bin is found through _S_binmap. A look at _M_bin[bin]._M_first[0] tells us whether a free block is available. If so, we unlink it from _M_bin[bin]._M_first[0] and return the address of its data.
If no free block is available, memory must be requested from the system and a freelist built from it. Knowing the size of a block_record and the block size this bin manages, we compute how many blocks the new chunk yields, link them into a list, and hand the data of the first block to the user.
Deallocation is equally simple. The pointer is cast back to a block_record pointer, the right bin is found from the block size, and the block is pushed onto the front of that bin's freelist.
A series of performance tests showed that pushing onto the front of the freelist, rather than the back, gives about a 10% speedup.
Source code (mt_allocator.h):
/// Specialization for single thread.
template<>
  class __pool<false> : public __pool_base
  {
  public:
    union _Block_record   // freelist node
    {
      // Points to the block_record of the next free block.
      _Block_record* volatile _M_next;
    };

    struct _Bin_record
    {
      // An "array" of pointers to the first free block.
      _Block_record** volatile _M_first;

      // A list of the initial addresses of all allocated blocks.
      _Block_address* _M_address;
    };

    void
    _M_initialize_once()
    {
      if (__builtin_expect(_M_init == false, false))
        _M_initialize();
    }

    void
    _M_destroy() throw();

    char*
    _M_reserve_block(size_t __bytes, const size_t __thread_id);

    void
    _M_reclaim_block(char* __p, size_t __bytes);

    size_t
    _M_get_thread_id() { return 0; }

    const _Bin_record&
    _M_get_bin(size_t __which)
    { return _M_bin[__which]; }

    void
    _M_adjust_freelist(const _Bin_record&, _Block_record*, size_t)
    { }

    explicit __pool()
    : _M_bin(NULL), _M_bin_size(1) { }

    explicit __pool(const __pool_base::_Tune& __tune)
    : __pool_base(__tune), _M_bin(NULL), _M_bin_size(1) { }

  private:
    // An "array" of bin_records each of which represents a specific
    // power of 2 size. Memory to this "array" is allocated in
    // _M_initialize().
    _Bin_record* volatile _M_bin;   // array of per-size bin records

    // Actual value calculated in _M_initialize().
    size_t _M_bin_size;

    void
    _M_initialize();
  };
void
__pool<false>::_M_initialize()
{
  // _M_force_new must not change after the first allocate(), which
  // in turn calls this method, so if it's false, it's false forever
  // and we don't need to return here ever again.
  if (_M_options._M_force_new)   // forced new: the pool machinery is not needed
    {
      _M_init = true;
      return;
    }

  // Create the bins.
  // Calculate the number of bins required based on _M_max_bytes.
  // _M_bin_size is statically-initialized to one.
  size_t __bin_size = _M_options._M_min_bin;
  while (_M_options._M_max_bytes > __bin_size)
    {
      __bin_size <<= 1;
      ++_M_bin_size;   // count the power-of-2 size classes
    }

  // Setup the bin map for quick lookup of the relevant bin.
  // One _Binmap_type entry per request size from 0 to _M_max_bytes.
  const size_t __j = (_M_options._M_max_bytes + 1) * sizeof(_Binmap_type);
  _M_binmap = static_cast<_Binmap_type*>(::operator new(__j));

  _Binmap_type* __bp = _M_binmap;
  _Binmap_type __bin_max = _M_options._M_min_bin;
  _Binmap_type __bint = 0;
  for (_Binmap_type __ct = 0; __ct <= _M_options._M_max_bytes; ++__ct)
    {
      if (__ct > __bin_max)
        {
          __bin_max <<= 1;   // next power-of-2 block size
          ++__bint;
        }
      // Map each request size to the index of the bin that serves it.
      *__bp++ = __bint;
    }

  // Initialize _M_bin and its members.
  void* __v = ::operator new(sizeof(_Bin_record) * _M_bin_size);
  _M_bin = static_cast<_Bin_record*>(__v);
  for (size_t __n = 0; __n < _M_bin_size; ++__n)
    {
      _Bin_record& __bin = _M_bin[__n];
      __v = ::operator new(sizeof(_Block_record*));
      __bin._M_first = static_cast<_Block_record**>(__v);
      __bin._M_first[0] = NULL;
      __bin._M_address = NULL;
    }
  _M_init = true;
}
void
__pool<false>::_M_reclaim_block(char* __p, size_t __bytes)
{
  // Round up to power of 2 and figure out which bin to use.
  const size_t __which = _M_binmap[__bytes];   // map request size to bin index
  _Bin_record& __bin = _M_bin[__which];

  char* __c = __p - _M_get_align();
  _Block_record* __block = reinterpret_cast<_Block_record*>(__c);

  // Single threaded application - return to global pool.
  __block->_M_next = __bin._M_first[0];
  __bin._M_first[0] = __block;
}
The thread_id variable is never used in ST programs, so let us start by explaining what it is for.
MT programs that allocate and free memory through shared containers introduce the notion of block "ownership". One problem is that a thread returns freed memory only to its own freelist. (For example, if one thread does nothing but allocate memory and hand it to other threads for use, those threads' freelists grow longer and longer until memory runs out.)
Whenever a block moves from the global list (which has no owner) onto some thread's freelist, its thread_id is set. The other situations that set thread_id are building a thread's freelist directly from fresh memory, and a deallocation where the id of the thread that requested the block differs from the id of the thread releasing it.
So what is thread_id actually for? When a block is freed, we compare the block's thread_id with the current thread's, and decrement the used counter of the thread that produced the block, keeping the free and used counters correct. This is important because those counters decide when memory must be returned to the global pool.
When the program requests memory (calls allocate()), we first check whether the request is larger than _S_max_bytes; if so, new is used directly.
Otherwise the right bin is found through _S_binmap. _S_get_thread_id() returns the current thread's thread_id; if this is the thread's first call to allocate(), it is handed a new thread_id, which is stored in _S_thread_key.
A look at _M_bin[bin].first[thread_id] tells us whether a free block is available. If so, the first block is unlinked and returned to the user, not forgetting to update the used and free counters.
If not, we first search the global list (freelist 0). If blocks are found there, the bin is locked and at most block_count blocks (the number of blocks of this bin's size that one OS chunk yields) are moved from the global freelist to the current thread's freelist, transferring their ownership and updating counters and pointers. The bin is then unlocked and the first block in _S_bin[bin].first[thread_id] is returned to the user.
At most block_count blocks are moved in order to reduce the give-backs that later deallocations might trigger (computed through _S_freelist_headroom, detailed below).
If the global list has no free blocks either, memory must be requested from the OS. This works just as in an ST program, with one difference: the freelist built from the newly requested chunk (_S_chunk_size bytes) is handed directly to the current thread rather than being put on the global freelist.
The basic deallocation step is simple: push the block onto the current thread's freelist and update the counters and pointers (handling, as described above, the case where the current thread's id differs from the block's thread_id). Now the free and used counters come into play: the length of the thread's freelist (free) and the number of blocks the thread currently has in use (used).
Recall the earlier model of a program with one dedicated allocating thread. Suppose each thread starts out with 516 32-byte blocks in use, so every used counter stands at 516. The allocating thread then obtains another 1000 32-byte blocks, bringing its used counter to 1516.
If some thread now frees 500 blocks, each deallocation decrements its used counter while its freelist (free) grows longer. deallocate(), however, keeps free within _S_freelist_headroom % of used (10% by default), so once free exceeds 52 (516 / 10), freed blocks are returned to the global freelist, where the allocating thread can reuse them.
To reduce lock contention (giving blocks back requires locking the bin), blocks are returned in batches of block_count (just as when fetching from the global freelist). This "rule" could be refined further to lower the chance of blocks bouncing back and forth.
Source code (mt_allocator.h):
/// Specialization for thread enabled, via gthreads.h.
template<>
  class __pool<true> : public __pool_base
  {
  public:
    // Each requesting thread is assigned an id ranging from 1 to
    // _S_max_threads. Thread id 0 is used as a global memory pool.
    // In order to get constant performance on the thread assignment
    // routine, we keep a list of free ids. When a thread first
    // requests memory we remove the first record in this list and
    // stores the address in a __gthread_key. When initializing the
    // __gthread_key we specify a destructor. When this destructor
    // (i.e. the thread dies) is called, we return the thread id to
    // the front of this list.
    struct _Thread_record
    {
      // Points to next free thread id record. NULL if last record in list.
      _Thread_record* volatile _M_next;

      // Thread id ranging from 1 to _S_max_threads.
      size_t _M_id;
    };

    union _Block_record
    {
      // Points to the block_record of the next free block.
      _Block_record* volatile _M_next;

      // The thread id of the thread which has requested this block.
      size_t _M_thread_id;
    };

    struct _Bin_record   // per-size block manager
    {
      // An "array" of pointers to the first free block for each
      // thread id. Memory to this "array" is allocated in
      // _S_initialize() for _S_max_threads + global pool 0.
      _Block_record** volatile _M_first;

      // A list of the initial addresses of all allocated blocks.
      _Block_address* _M_address;

      // An "array" of counters used to keep track of the amount of
      // blocks that are on the freelist/used for each thread id.
      // Memory to these "arrays" is allocated in _S_initialize() for
      // _S_max_threads + global pool 0.
      size_t* volatile _M_free;   // free-block counters
      size_t* volatile _M_used;   // used-block counters

      // Each bin has its own mutex which is used to ensure data
      // integrity while changing "ownership" on a block. The mutex
      // is initialized in _S_initialize().
      __gthread_mutex_t* _M_mutex;
    };

    // XXX GLIBCXX_ABI Deprecated
    void
    _M_initialize(__destroy_handler);

    void
    _M_initialize_once()
    {
      if (__builtin_expect(_M_init == false, false))
        _M_initialize();
    }

    void
    _M_destroy() throw();

    char*
    _M_reserve_block(size_t __bytes, const size_t __thread_id);

    void
    _M_reclaim_block(char* __p, size_t __bytes);

    const _Bin_record&
    _M_get_bin(size_t __which)
    { return _M_bin[__which]; }   // look up the bin record by size index

    void
    _M_adjust_freelist(const _Bin_record& __bin, _Block_record* __block,
                       size_t __thread_id)
    {
      if (__gthread_active_p())
        {
          __block->_M_thread_id = __thread_id;
          --__bin._M_free[__thread_id];
          ++__bin._M_used[__thread_id];
        }
    }

    // XXX GLIBCXX_ABI Deprecated
    void
    _M_destroy_thread_key(void*);

    size_t
    _M_get_thread_id();

    explicit __pool()
    : _M_bin(NULL), _M_bin_size(1), _M_thread_freelist(NULL) { }

    explicit __pool(const __pool_base::_Tune& __tune)
    : __pool_base(__tune), _M_bin(NULL), _M_bin_size(1),
      _M_thread_freelist(NULL) { }

  private:
    // An "array" of bin_records each of which represents a specific
    // power of 2 size. Memory to this "array" is allocated in
    // _M_initialize().
    _Bin_record* volatile _M_bin;

    // Actual value calculated in _M_initialize().
    size_t _M_bin_size;

    _Thread_record* _M_thread_freelist;
    void* _M_thread_freelist_initial;

    void
    _M_initialize();
  };
void
__pool<true>::_M_initialize(__destroy_handler __d)
{
  // _M_force_new must not change after the first allocate(),
  // which in turn calls this method, so if it's false, it's false
  // forever and we don't need to return here ever again.
  if (_M_options._M_force_new)
    {
      _M_init = true;
      return;
    }

  // Create the bins.
  // Calculate the number of bins required based on _M_max_bytes.
  // _M_bin_size is statically-initialized to one.
  size_t __bin_size = _M_options._M_min_bin;
  while (_M_options._M_max_bytes > __bin_size)
    {
      __bin_size <<= 1;
      ++_M_bin_size;
    }

  // Setup the bin map for quick lookup of the relevant bin.
  const size_t __j = (_M_options._M_max_bytes + 1) * sizeof(_Binmap_type);
  _M_binmap = static_cast<_Binmap_type*>(::operator new(__j));

  _Binmap_type* __bp = _M_binmap;
  _Binmap_type __bin_max = _M_options._M_min_bin;
  _Binmap_type __bint = 0;
  for (_Binmap_type __ct = 0; __ct <= _M_options._M_max_bytes; ++__ct)
    {
      if (__ct > __bin_max)
        {
          __bin_max <<= 1;
          ++__bint;
        }
      *__bp++ = __bint;
    }

  // Initialize _M_bin and its members: _M_bin_size bin records.
  void* __v = ::operator new(sizeof(_Bin_record) * _M_bin_size);
  _M_bin = static_cast<_Bin_record*>(__v);

  // If __gthread_active_p() create and initialize the list of
  // free thread ids. Single threaded applications use thread id 0
  // directly and have no need for this.
  if (__gthread_active_p())
    {
      const size_t __k = sizeof(_Thread_record) * _M_options._M_max_threads;
      __v = ::operator new(__k);
      _M_thread_freelist = static_cast<_Thread_record*>(__v);
      _M_thread_freelist_initial = __v;

      // NOTE! The first assignable thread id is 1 since the
      // global pool uses id 0
      size_t __i;
      for (__i = 1; __i < _M_options._M_max_threads; ++__i)
        {
          _Thread_record& __tr = _M_thread_freelist[__i - 1];
          __tr._M_next = &_M_thread_freelist[__i];
          __tr._M_id = __i;   // assign thread ids 1, 2, 3, ...
        }

      // Set last record.
      _M_thread_freelist[__i - 1]._M_next = NULL;
      _M_thread_freelist[__i - 1]._M_id = __i;

      // Initialize per thread key to hold pointer to
      // _M_thread_freelist.
      __gthread_key_create(&__gnu_internal::freelist_key, __d);

      const size_t __max_threads = _M_options._M_max_threads + 1;
      for (size_t __n = 0; __n < _M_bin_size; ++__n)
        {
          _Bin_record& __bin = _M_bin[__n];
          __v = ::operator new(sizeof(_Block_record*) * __max_threads);
          __bin._M_first = static_cast<_Block_record**>(__v);
          __bin._M_address = NULL;

          __v = ::operator new(sizeof(size_t) * __max_threads);
          __bin._M_free = static_cast<size_t*>(__v);

          __v = ::operator new(sizeof(size_t) * __max_threads);
          __bin._M_used = static_cast<size_t*>(__v);

          __v = ::operator new(sizeof(__gthread_mutex_t));
          __bin._M_mutex = static_cast<__gthread_mutex_t*>(__v);

#ifdef __GTHREAD_MUTEX_INIT
          {
            // Do not copy a POSIX/gthr mutex once in use.
            __gthread_mutex_t __tmp = __GTHREAD_MUTEX_INIT;
            *__bin._M_mutex = __tmp;
          }
#else
          { __GTHREAD_MUTEX_INIT_FUNCTION(__bin._M_mutex); }
#endif

          for (size_t __threadn = 0; __threadn < __max_threads; ++__threadn)
            {
              __bin._M_first[__threadn] = NULL;
              __bin._M_free[__threadn] = 0;
              __bin._M_used[__threadn] = 0;
            }
        }
    }
  else
    {
      // Single-threaded: only slot 0 of the per-thread freelist
      // array is needed for each bin.
      for (size_t __n = 0; __n < _M_bin_size; ++__n)
        {
          _Bin_record& __bin = _M_bin[__n];
          __v = ::operator new(sizeof(_Block_record*));
          __bin._M_first = static_cast<_Block_record**>(__v);
          __bin._M_first[0] = NULL;
          __bin._M_address = NULL;
        }
    }
  _M_init = true;
}
void
__pool<true>::_M_reclaim_block(char* __p, size_t __bytes)
{
  // Round up to power of 2 and figure out which bin to use.
  const size_t __which = _M_binmap[__bytes];
  const _Bin_record& __bin = _M_bin[__which];

  // Know __p not null, assume valid block.
  char* __c = __p - _M_get_align();
  _Block_record* __block = reinterpret_cast<_Block_record*>(__c);

  if (__gthread_active_p())
    {
      // Calculate the number of records to remove from our freelist:
      // in order to avoid too much contention we wait until the
      // number of records is "high enough".
      const size_t __thread_id = _M_get_thread_id();
      const _Tune& __options = _M_get_options();
      const unsigned long __limit = 100 * (_M_bin_size - __which)
                                    * __options._M_freelist_headroom;

      unsigned long __remove = __bin._M_free[__thread_id];
      __remove *= __options._M_freelist_headroom;
      if (__remove >= __bin._M_used[__thread_id])
        __remove -= __bin._M_used[__thread_id];
      else
        __remove = 0;
      if (__remove > __limit && __remove > __bin._M_free[__thread_id])
        {
          _Block_record* __first = __bin._M_first[__thread_id];
          _Block_record* __tmp = __first;
          __remove /= __options._M_freelist_headroom;
          const unsigned long __removed = __remove;
          while (--__remove > 0)
            __tmp = __tmp->_M_next;
          __bin._M_first[__thread_id] = __tmp->_M_next;
          __bin._M_free[__thread_id] -= __removed;

          __gthread_mutex_lock(__bin._M_mutex);
          __tmp->_M_next = __bin._M_first[0];
          __bin._M_first[0] = __first;
          __bin._M_free[0] += __removed;
          __gthread_mutex_unlock(__bin._M_mutex);
        }

      // Return this block to our list and update counters and
      // owner id as needed.
      --__bin._M_used[__block->_M_thread_id];

      __block->_M_next = __bin._M_first[__thread_id];
      __bin._M_first[__thread_id] = __block;

      ++__bin._M_free[__thread_id];
    }
  else
    {
      // Not using threads, so single threaded application - return
      // to global pool.
      __block->_M_next = __bin._M_first[0];
      __bin._M_first[0] = __block;
    }
}