__gnu_cxx::__mt_alloc: A Memory Allocator with Multi-Thread Support

__gnu_cxx::__mt_alloc is a memory allocator in the libstdc++ extension library designed for multi-threaded applications.

1. Components of __mt_alloc

__gnu_cxx::__mt_alloc is a fixed-size (powers of 2) allocator, originally designed for multi-threaded applications (hereafter "MT programs"). After years of refinement, it now performs well in single-threaded applications (hereafter "ST programs") too.

(1) The thread-support parameter

template<bool _Thread>  class __pool

This class encodes whether threading is supported, and is explicitly specialized for the multi-threaded (bool == true) and single-threaded (bool == false) cases. A custom class can be substituted for this parameter.

(2) Policies binding the pool to a common or per-type scheme

There are at least two flavors of policy class, each of which can be combined independently with the pool parameter above:

Policy one:

__common_pool_policy implements a common pool: objects of different types, such as char and long, still share a single pool. This is the default policy.

template<bool _Thread>
    struct __common_pool_policy

    template<typename _Tp, bool _Thread>
    struct __per_type_pool_policy

Policy two:

__per_type_pool_policy implements a separate pool for each object type, so char and long use different pools. This allows individual types to be tuned separately.

(3) The actual allocator class

The allocator itself:

template<typename _Tp, typename _Poolp = __default_policy>
    class __mt_alloc : public __mt_alloc_base<_Tp>,  _Poolp

This class provides the interface required by the standard library, such as the allocate and deallocate member functions.
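A minimal usage sketch of that interface, assuming GCC's <ext/mt_allocator.h> is available (the helper function here is illustrative, not part of the library):

```cpp
#include <ext/mt_allocator.h>
#include <vector>

// __mt_alloc satisfies the standard allocator interface, so it can be
// plugged into any container, or used directly via allocate()/deallocate().
int sum_with_mt_alloc()
{
    typedef __gnu_cxx::__mt_alloc<int> allocator_type;

    // As a container allocator (using the default __common_pool_policy).
    std::vector<int, allocator_type> v;
    for (int i = 1; i <= 4; ++i)
        v.push_back(i);

    // Direct use of the required interface.
    allocator_type a;
    int* p = a.allocate(1);          // one int from the pool
    *p = v[0] + v[1] + v[2] + v[3];
    int result = *p;
    a.deallocate(p, 1);
    return result;                   // 10
}
```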

(4) Parameters describing pool characteristics

Some configuration parameters can be modified or tuned. A nested class holds all of the tunable parameters:

struct __pool_base::_Tune

Code: the default configuration (8-byte alignment; allocations above 128 bytes go directly to new; the smallest allocatable block is 8 bytes; each chunk requested from the OS is 4096 bytes minus four pointers, i.e. 4080 bytes on a 32-bit system; at most 4096 threads are supported; a single thread may keep free blocks up to 10% of its used count; and whether to bypass the pool with new and delete depends on whether getenv("GLIBCXX_FORCE_NEW") is set):

    // Variables used to configure the behavior of the allocator,
    // assigned and explained in detail below.
    struct _Tune
     {
      // Compile time constants for the default _Tune values.
      enum { _S_align = 8 };
      enum { _S_max_bytes = 128 };
      enum { _S_min_bin = 8 };
      enum { _S_chunk_size = 4096 - 4 * sizeof(void*) };
      enum { _S_max_threads = 4096 };
      enum { _S_freelist_headroom = 10 };

The tunables:

1) Byte alignment

2) The threshold above which memory is allocated directly with new

3) The size of the smallest allocatable block

4) The size of each chunk requested from the OS

5) The maximum number of supported threads

6) The percentage of free blocks a single thread may keep (excess free blocks are returned to the global freelist)

7) Whether to use new and delete directly


These options can be set and retrieved through the following interface:

const _Tune&
    _M_get_options() const
    { return _M_options; }

    void
    _M_set_options(_Tune __t)
    { 
      if (!_M_init)
	_M_options = __t;
    }

These parameters must be adjusted before any allocation takes place, i.e. when the allocator is initialized, for example:

#include <ext/mt_allocator.h>
struct pod
{
  int i;
  int j;
};

int main()
{
  typedef pod value_type;
  typedef __gnu_cxx::__mt_alloc<value_type> allocator_type;
  typedef __gnu_cxx::__pool_base::_Tune tune_type;

  tune_type t_default;
  tune_type t_opt(16, 5120, 32, 5120, 20, 10, false); // 16-byte alignment; allocations above 5120 bytes use new; smallest block 32 bytes; 5120-byte chunks from the OS; at most 20 threads; each thread keeps at most 10% free blocks
  tune_type t_single(16, 5120, 32, 5120, 1, 10, false);

  tune_type t;
  t = allocator_type::_M_get_options();  
  allocator_type::_M_set_options(t_opt);
  t = allocator_type::_M_get_options();  

  allocator_type a;
  allocator_type::pointer p1 = a.allocate(128);
  allocator_type::pointer p2 = a.allocate(5128);

  a.deallocate(p1, 128);
  a.deallocate(p2, 5128);

  return 0;
}

2. Base-class configuration and initialization

(1) Base-class configuration

The first call to allocate() also triggers _S_init(). In MT programs, the __gthread_once function (with _S_once_mt and _S_init as arguments) guarantees it is called only once; ST programs instead check the static bool variable _S_initialized.

_S_init(): if the GLIBCXX_FORCE_NEW environment variable is set, it sets _S_force_new to true, so that allocate() obtains memory directly with new and deallocate() releases it with delete.

The following describes the case where GLIBCXX_FORCE_NEW is not set.

Steps common to ST and MT modes:

1) Compute the number of power-of-2 size classes (the number of _M_bin entries)

A bin (_M_bin) manages blocks of one power-of-2 size. By default, __mt_alloc only handles small allocations of up to 128 bytes (this limit can be changed by setting _S_max_bytes in _S_init()), which gives bins of the following sizes: 1, 2, 4, 8, 16, 32, 64, and 128 bytes.

2) Create the _M_binmap array (the size-to-bin lookup table):

All allocation requests are rounded up to the next power of 2, so a 29-byte request is handled by the 32-byte bin. The _M_binmap array maps a size directly to the right bin: for example, the value 29 maps to bin 5 (bin 5 = 32 bytes).
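The round-up-and-lookup step can be sketched independently of the library internals. A simplified model, assuming bins sized 1, 2, 4, ..., 128 bytes as described above (the real allocator precomputes these answers into the _M_binmap table rather than looping):

```cpp
#include <cstddef>

// Index of the power-of-2 bin serving a request of `bytes` bytes,
// where bin k holds blocks of 2^k bytes (1, 2, 4, 8, ...).
std::size_t bin_index(std::size_t bytes)
{
    std::size_t bin = 0;
    std::size_t bin_size = 1;
    while (bin_size < bytes)   // round up to the next power of 2
    {
        bin_size <<= 1;
        ++bin;
    }
    return bin;
}
```

So bin_index(29) yields 5, matching the example: the 29-byte request is served by the 32-byte bin.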

3) Create the _M_bin array (the array of _Bin_record managers):

This array consists of _Bin_record entries; its length is the bin count computed above, e.g. 8 when _S_max_bytes = 128.

4) Initialize each bin manager (_Bin_record)

_M_first is an array of _Block_record pointers with one entry per possible thread (an ST program has a single thread; an MT program allows up to _S_max_threads). For each thread, _M_first holds the address of that thread's first free block in this bin: to find a free 32-byte block for thread 3, look at _M_bin[5]._M_first[3]. Initially, every entry of _M_first is NULL.

MT mode (these steps exist only in MT mode):

5) Create a list of free thread ids (values between 1 and _S_max_threads) whose head is _S_thread_freelist_first. Since __gthread_self() does not return a value between 1 and _S_max_threads but rather something like a random process id, we build a thread_record list of length _S_max_threads, initializing the thread_id field of each element to 1, 2, 3, ... up to _S_max_threads; these ids serve as the indices into _M_first from step 4). When a thread calls allocate() or deallocate(), _S_get_thread_id() checks the thread-local variable _S_thread_key. If it is NULL, the thread is new, so an element is taken from the _S_thread_freelist_first list for it; the next call to _S_get_thread_id() finds this object and uses its thread_id field to locate the corresponding slot in each bin. Thus the first thread to call allocate() gets the thread_record with thread_id = 1, so its bin index is 1 and _M_bin[5]._M_first[1] yields its free 32-byte blocks. _S_thread_key is created with a destructor, so that when a thread exits, its thread_record is returned to _S_thread_freelist_first for reuse. The _S_thread_freelist_first list is protected by a lock that is held while elements are added or removed.
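The id-recycling scheme in step 5) can be sketched as a locked free list of small integers. This is a simplified model with illustrative names, not the library's own types; it assumes at least one id is available when acquire() is called:

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Simplified model of the thread-id freelist: ids 1..max_threads are
// kept in a linked list; a new thread pops the head, an exiting thread
// pushes its id back. A mutex guards insertions and removals, as the
// text describes for _S_thread_freelist_first.
class ThreadIdFreelist
{
    struct ThreadRecord { ThreadRecord* next; std::size_t id; };

    std::vector<ThreadRecord> records_;
    ThreadRecord* head_;
    std::mutex mutex_;

public:
    explicit ThreadIdFreelist(std::size_t max_threads)
    : records_(max_threads), head_(&records_[0])
    {
        for (std::size_t i = 0; i < max_threads; ++i)
        {
            records_[i].id = i + 1;  // ids run from 1 to max_threads
            records_[i].next = (i + 1 < max_threads) ? &records_[i + 1] : nullptr;
        }
    }

    // First call from a new thread: take the head record's id.
    std::size_t acquire()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        ThreadRecord* r = head_;
        head_ = r->next;
        return r->id;
    }

    // Thread-exit destructor path: return the id to the front.
    void release(std::size_t id)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        ThreadRecord& r = records_[id - 1];
        r.next = head_;
        head_ = &r;
    }
};
```

Pushing released ids onto the front keeps both operations O(1), which is why the library gets constant-time thread assignment.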

6) Initialize the free/used block counters of each bin manager (_Bin_record). _Bin_record->free is an array of size_t recording the number of free blocks per thread; _Bin_record->used is likewise an array of size_t recording the number of blocks each thread has in use. All entries start at 0.

7) Initialize each bin manager's (_Bin_record's) lock. _Bin_record->mutex protects the bin's global free-block list; it is locked whenever blocks are added to or removed from the bin, which happens only when a thread fetches memory from the global freelist or returns memory to it.

Source code (pool base class definition):

 /// @brief  Base class for pool object.
  struct __pool_base
  {
    // Using short int as type for the binmap implies we are never
    // caching blocks larger than 32768 with this allocator.
    typedef unsigned short int _Binmap_type; // so this allocator never caches blocks larger than 32768 bytes

    // Variables used to configure the behavior of the allocator,
    // assigned and explained in detail below.
    struct _Tune
     {
      // Compile time constants for the default _Tune values.
      enum { _S_align = 8 };                             // byte alignment
      enum { _S_max_bytes = 128 };                       // allocations above 128 bytes use new
      enum { _S_min_bin = 8 };                           // smallest allocatable block: 8 bytes
      enum { _S_chunk_size = 4096 - 4 * sizeof(void*) }; // chunk size requested from the OS each time
      enum { _S_max_threads = 4096 };                    // maximum number of supported threads
      enum { _S_freelist_headroom = 10 };                // percentage of free blocks a thread may keep (excess returns to the global freelist)

      // Alignment needed.
      // NB: In any case must be >= sizeof(_Block_record), that
      // is 4 on 32 bit machines and 8 on 64 bit machines.
      size_t	_M_align;
      
      // Allocation requests (after round-up to power of 2) below
      // this value will be handled by the allocator. A raw new/
      // call will be used for requests larger than this value.
      // NB: Must be much smaller than _M_chunk_size and in any
      // case <= 32768.
      size_t	_M_max_bytes; 

      // Size in bytes of the smallest bin.
      // NB: Must be a power of 2 and >= _M_align (and of course
      // much smaller than _M_max_bytes).
      size_t	_M_min_bin;

      // In order to avoid fragmenting and minimize the number of
      // new() calls we always request new memory using this
      // value. Based on previous discussions on the libstdc++
      // mailing list we have choosen the value below.
      // See http://gcc.gnu.org/ml/libstdc++/2001-07/msg00077.html
      // NB: At least one order of magnitude > _M_max_bytes. 
      size_t	_M_chunk_size;

      // The maximum number of supported threads. For
      // single-threaded operation, use one. Maximum values will
      // vary depending on details of the underlying system. (For
      // instance, Linux 2.4.18 reports 4070 in
      // /proc/sys/kernel/threads-max, while Linux 2.6.6 reports
      // 65534)
      size_t 	_M_max_threads;

      // Each time a deallocation occurs in a threaded application
      // we make sure that there are no more than
      // _M_freelist_headroom % of used memory on the freelist. If
      // the number of additional records is more than
      // _M_freelist_headroom % of the freelist, we move these
      // records back to the global pool.
      size_t 	_M_freelist_headroom;
      
      // Set to true forces all allocations to use new().
      bool 	_M_force_new; 
      
      explicit
      _Tune()
      : _M_align(_S_align), _M_max_bytes(_S_max_bytes), _M_min_bin(_S_min_bin),
      _M_chunk_size(_S_chunk_size), _M_max_threads(_S_max_threads), 
      _M_freelist_headroom(_S_freelist_headroom), 
      _M_force_new(getenv("GLIBCXX_FORCE_NEW") ? true : false)
      { }

      explicit
      _Tune(size_t __align, size_t __maxb, size_t __minbin, size_t __chunk, 
	    size_t __maxthreads, size_t __headroom, bool __force) 
      : _M_align(__align), _M_max_bytes(__maxb), _M_min_bin(__minbin),
      _M_chunk_size(__chunk), _M_max_threads(__maxthreads),
      _M_freelist_headroom(__headroom), _M_force_new(__force)
      { }
    };
    
    struct _Block_address
    {
      void* 			_M_initial;
      _Block_address* 		_M_next;
    };
    
    const _Tune&
    _M_get_options() const
    { return _M_options; }

    void
    _M_set_options(_Tune __t)
    { 
      if (!_M_init)
	_M_options = __t;
    }

    bool
    _M_check_threshold(size_t __bytes)
    { return __bytes > _M_options._M_max_bytes || _M_options._M_force_new; }

    size_t
    _M_get_binmap(size_t __bytes)
    { return _M_binmap[__bytes]; }

    const size_t
    _M_get_align()
    { return _M_options._M_align; }

    explicit 
    __pool_base() 
    : _M_options(_Tune()), _M_binmap(NULL), _M_init(false) { }

    explicit 
    __pool_base(const _Tune& __options)
    : _M_options(__options), _M_binmap(NULL), _M_init(false) { }

  private:
    explicit 
    __pool_base(const __pool_base&);

    __pool_base&
    operator=(const __pool_base&);

  protected:
    // Configuration options.
    _Tune 	       		_M_options;
    
    _Binmap_type* 		_M_binmap;

    // Configuration of the pool object via _M_options can happen
    // after construction but before initialization. After
    // initialization is complete, this variable is set to true.
    bool 			_M_init; // initialization flag
  };

3. The single-threaded model

Memory layout of the free-block list:

In ST programs, all operations use the global pool, i.e. thread_id 0 (no thread in an MT program is ever assigned this id).

When the program requests memory (calls allocate()), we first check whether the requested size exceeds _S_max_bytes; if so, new is used directly.

Otherwise _S_binmap locates the right bin. Checking _M_bin[bin]._M_first[0] tells us whether a free block exists. If one does, it is unlinked from _M_bin[bin]._M_first[0] and the address of its data is returned.

If there is no free block, memory must be requested from the system and a free-block list built from it. Knowing the size of a block_record and the block size this bin manages, we compute how many blocks the new chunk yields, link them into a list, and return the data of the first block to the user.

Deallocation is just as simple: the pointer is converted back into a block_record pointer, the right bin is found from the size, and the block is pushed onto the front of the free list.

A series of performance tests showed that pushing onto the front of the free list is about 10% faster than appending to the back.
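The ST allocate/deallocate path above amounts to carving a chunk into an intrusive LIFO list and pushing/popping at the head. A simplified sketch with illustrative names (the real logic lives in __pool<false>::_M_reserve_block and _M_reclaim_block):

```cpp
#include <cstddef>

// A free block reuses its own storage to hold the next-pointer.
struct Block { Block* next; };

// Simplified single-threaded bin: an intrusive LIFO free list,
// refilled by carving a raw chunk into fixed-size blocks.
struct StBin
{
    Block* first = nullptr;   // head of the free list (thread 0)

    // Carve a chunk into chunk_size / block_size blocks and link them.
    void refill(char* chunk, std::size_t chunk_size, std::size_t block_size)
    {
        const std::size_t count = chunk_size / block_size;
        for (std::size_t i = 0; i < count; ++i)
        {
            Block* b = reinterpret_cast<Block*>(chunk + i * block_size);
            b->next = first;      // push onto the front
            first = b;
        }
    }

    void* pop()                   // allocate: unlink the head
    {
        Block* b = first;
        first = b->next;
        return b;
    }

    void push(void* p)            // deallocate: push onto the front
    {
        Block* b = static_cast<Block*>(p);
        b->next = first;
        first = b;
    }
};
```

Both pop() and push() touch only the list head, which is exactly why the front-of-list strategy measured faster than appending.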

(1) Definition of the single-threaded pool

Source code (mt_allocator.h):

/// Specialization for single thread.
template<>
  class __pool<false> : public __pool_base
  {
  public:
    union _Block_record // node of the free-block list
    {
	// Points to the block_record of the next free block.
	_Block_record* volatile         _M_next;
    };

    struct _Bin_record
    {
	// An "array" of pointers to the first free block.
	_Block_record** volatile        _M_first;
	// A list of the initial addresses of all allocated blocks.
	_Block_address*		     	_M_address;
    };
    
    void _M_initialize_once()
    {
	if (__builtin_expect(_M_init == false, false))
	  _M_initialize();
    }

    void _M_destroy() throw();

    char* _M_reserve_block(size_t __bytes, const size_t __thread_id);

    void _M_reclaim_block(char* __p, size_t __bytes);

    size_t _M_get_thread_id() { return 0; }

    const _Bin_record& _M_get_bin(size_t __which) { return _M_bin[__which]; }

    void _M_adjust_freelist(const _Bin_record&, _Block_record*, size_t) { }

    explicit __pool()
    : _M_bin(NULL), _M_bin_size(1) { }

    explicit __pool(const __pool_base::_Tune& __tune)
    : __pool_base(__tune), _M_bin(NULL), _M_bin_size(1) { }
  private:
    // An "array" of bin_records each of which represents a specific
    // power of 2 size. Memory to this "array" is allocated in
    // _M_initialize().
    _Bin_record* volatile	_M_bin; // array of bin records
 
    // Actual value calculated in _M_initialize().
    size_t 	       	     	_M_bin_size;     
    void _M_initialize();
};

(2) Single-threaded pool initialization

Source code (mt_allocator.cc):

void
  __pool<false>::_M_initialize()
  {
    // _M_force_new must not change after the first allocate(), which
    // in turn calls this method, so if it's false, it's false forever
    // and we don't need to return here ever again.
    if (_M_options._M_force_new) // new is forced, so the pool machinery is unnecessary
      {
	_M_init = true;
	return;
      }
      
    // Create the bins.
    // Calculate the number of bins required based on _M_max_bytes.
    // _M_bin_size is statically-initialized to one.
    size_t __bin_size = _M_options._M_min_bin;
    while (_M_options._M_max_bytes > __bin_size)
      {
	__bin_size <<= 1;
	++_M_bin_size; // count the bin size classes
      }
      
    // Setup the bin map for quick lookup of the relevant bin.
    const size_t __j = (_M_options._M_max_bytes + 1) * sizeof(_Binmap_type); // one _Binmap_type entry per size from 0 to _M_max_bytes
    _M_binmap = static_cast<_Binmap_type*>(::operator new(__j)); // allocate the binmap (fast size-to-bin lookup)
    _Binmap_type* __bp = _M_binmap;
    _Binmap_type __bin_max = _M_options._M_min_bin;
    _Binmap_type __bint = 0;
    for (_Binmap_type __ct = 0; __ct <= _M_options._M_max_bytes; ++__ct)
      {
	if (__ct > __bin_max)
	  {
	    __bin_max <<= 1; // advance to the next bin's maximum block size
	    ++__bint;
	  }
	*__bp++ = __bint; // map each request size (the index) to the index of its bin record
      }
      
    // Initialize _M_bin and its members.
    void* __v = ::operator new(sizeof(_Bin_record) * _M_bin_size);
    _M_bin = static_cast<_Bin_record*>(__v);
    for (size_t __n = 0; __n < _M_bin_size; ++__n)
      {
	_Bin_record& __bin = _M_bin[__n]; // bin record for this size class
	__v = ::operator new(sizeof(_Block_record*)); // a single slot: the free-list head for thread 0
	__bin._M_first = static_cast<_Block_record**>(__v);
	__bin._M_first[0] = NULL;
	__bin._M_address = NULL;
      }
    _M_init = true;
  }

(3) Reclaiming a block to the global pool

Source code (mt_allocator.cc):

void
  __pool<false>::_M_reclaim_block(char* __p, size_t __bytes)
  {
    // Round up to power of 2 and figure out which bin to use.
    const size_t __which = _M_binmap[__bytes]; // map the byte size to a bin index
    _Bin_record& __bin = _M_bin[__which];

    char* __c = __p - _M_get_align();
    _Block_record* __block = reinterpret_cast<_Block_record*>(__c);
      
    // Single threaded application - return to global pool.
    __block->_M_next = __bin._M_first[0];
    __bin._M_first[0] = __block;
  }


4. The multi-threaded model

ST programs never use the thread_id variable, so let us start with what it is for.

MT programs that allocate and free memory through shared containers have a notion of "ownership", and a problem arises if a thread only ever returns free memory to its own freelist. (For example, if one thread does all the allocating and hands the memory to other threads for use, the other threads' freelists grow without bound and memory is eventually exhausted.)

thread_id is set whenever a block moves from the global list (which has no owner) to some thread's freelist. It is also set when a thread's freelist is built directly from fresh memory, and when a block is freed by a thread whose id differs from that of the thread which allocated it.

So what is thread_id actually for? When a block is freed, we compare the block's thread_id with the current thread's id, then decrement the used counter of the thread that allocated the block, keeping the free and used counters correct. This matters because those counters decide whether memory should be returned to the global pool.

When the program requests memory (calls allocate()), we first check whether the requested size exceeds _S_max_bytes; if so, new is used directly.

Otherwise _S_binmap locates the right bin, and _S_get_thread_id() returns the current thread's thread_id; on a thread's first call to allocate(), it receives a new thread_id, stored in _S_thread_key.

Checking _M_bin[bin].first[thread_id] tells us whether a free block exists. If so, the first block is unlinked and returned to the user, and the used and free counters are updated.

If not, we first look in the global list (freelist (0)). If blocks are found there, the bin is locked, and at most block_count blocks (the number of blocks of this bin's size that one OS chunk yields) are moved from the global freelist to the current thread's freelist, changing their ownership and updating the counters and pointers. The bin is then unlocked, and the first block of _S_bin[bin].first[thread_id] is returned to the user.

At most block_count blocks are moved in order to reduce the give-back operations that subsequent deallocations might trigger (computed from _S_freelist_headroom, detailed below).

If the global list has no free blocks either, memory must be requested from the OS. This works just like the ST case, with one difference: the free list built from the newly requested chunk (of _S_chunk_size bytes) goes directly to the current thread rather than onto the global freelist.

The basic deallocation step is simple: the block is pushed onto the current thread's freelist and the counters and pointers are updated (the handling when the current thread's id differs from the block's thread id was described above). This is where the free and used counters come into play: the length of the thread's freelist (free) and the number of blocks it currently has in use (used).

Recall the model above, where one thread does all the allocating. Suppose each thread starts out using 516 blocks of 32 bytes, so every thread's used counter is 516. The allocating thread then obtains another 1000 32-byte blocks, bringing its used counter to 1516.

If a thread then frees 500 blocks, each free decrements the used counter, and that thread's freelist (free) keeps growing. However, deallocate() keeps free within _S_freelist_headroom% of used (10% by default), so once free exceeds 52 (roughly 516 / 10), the freed blocks are returned to the global freelist, where the allocating thread can reuse them.

To reduce lock contention (giving blocks back requires locking the bin), the give-back is performed block_count blocks at a time (just as when fetching from the global freelist). This rule could be refined further to reduce the chance of blocks shuttling back and forth.
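The headroom rule in the example above can be written out as arithmetic. This is a simplified model of the prose rule only; the actual check in _M_reclaim_block, shown later, also scales a contention limit by the bin index:

```cpp
#include <cstddef>

// Number of blocks a thread should give back to the global freelist:
// it may keep at most headroom_percent of its in-use count on its own
// freelist, and anything beyond that is a candidate for return.
std::size_t blocks_to_return(std::size_t free_count, std::size_t used_count,
                             std::size_t headroom_percent)
{
    const std::size_t allowed = used_count * headroom_percent / 100;
    return free_count > allowed ? free_count - allowed : 0;
}
```

With used = 516 and the default 10% headroom, a thread may keep about 51 free blocks before any are given back, matching the example above.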

(1) Definition of the multi-threaded pool

Source code (mt_allocator.h):

/// Specialization for thread enabled, via gthreads.h.
  template<>
    class __pool<true> : public __pool_base
    {
    public:
      // Each requesting thread is assigned an id ranging from 1 to
      // _S_max_threads. Thread id 0 is used as a global memory pool.
      // In order to get constant performance on the thread assignment
      // routine, we keep a list of free ids. When a thread first
      // requests memory we remove the first record in this list and
      // stores the address in a __gthread_key. When initializing the
      // __gthread_key we specify a destructor. When this destructor
      // (i.e. the thread dies) is called, we return the thread id to
      // the front of this list.
      struct _Thread_record
      {
	// Points to next free thread id record. NULL if last record in list.
	_Thread_record* volatile        _M_next;
	// Thread id ranging from 1 to _S_max_threads.
	size_t                          _M_id;
      };
      
      union _Block_record
      {
	// Points to the block_record of the next free block.
	_Block_record* volatile         _M_next;
	// The thread id of the thread which has requested this block.
	size_t                          _M_thread_id;
      };
 
      struct _Bin_record // bin record (per-size-class manager)
      {
	// An "array" of pointers to the first free block for each
	// thread id. Memory to this "array" is allocated in
	// _S_initialize() for _S_max_threads + global pool 0.
	_Block_record** volatile        _M_first; // per-thread free-list heads
	
	// A list of the initial addresses of all allocated blocks.
	_Block_address*		     	_M_address; // list of initial addresses of all allocated chunks

	// An "array" of counters used to keep track of the amount of
	// blocks that are on the freelist/used for each thread id.
	// Memory to these "arrays" is allocated in _S_initialize() for
	// _S_max_threads + global pool 0.
	size_t* volatile                _M_free; // per-thread free-block counters
	size_t* volatile                _M_used; // per-thread in-use block counters
	
	// Each bin has its own mutex which is used to ensure data
	// integrity while changing "ownership" on a block.  The mutex
	// is initialized in _S_initialize().
	__gthread_mutex_t*              _M_mutex; // mutex guarding this bin's global list
      };
      
      // XXX GLIBCXX_ABI Deprecated
      void
      _M_initialize(__destroy_handler);

      void
      _M_initialize_once()
      {
	if (__builtin_expect(_M_init == false, false))
	  _M_initialize();
      }

      void
      _M_destroy() throw();

      char* 
      _M_reserve_block(size_t __bytes, const size_t __thread_id);
    
      void
      _M_reclaim_block(char* __p, size_t __bytes);
    
      const _Bin_record&
      _M_get_bin(size_t __which)
      { return _M_bin[__which]; } // index the bin record by size class
      
      void
      _M_adjust_freelist(const _Bin_record& __bin, _Block_record* __block, size_t __thread_id)
      {
	if (__gthread_active_p())
	  {
	    __block->_M_thread_id = __thread_id;
	    --__bin._M_free[__thread_id]; // one fewer free block for this thread
	    ++__bin._M_used[__thread_id]; // one more block in use by this thread
	  }
      }

      // XXX GLIBCXX_ABI Deprecated
      void 
      _M_destroy_thread_key(void*);

      size_t 
      _M_get_thread_id();

      explicit __pool() 
      : _M_bin(NULL), _M_bin_size(1), _M_thread_freelist(NULL) 
      { }

      explicit __pool(const __pool_base::_Tune& __tune) 
      : __pool_base(__tune), _M_bin(NULL), _M_bin_size(1), 
      _M_thread_freelist(NULL) 
      { }

    private:
      // An "array" of bin_records each of which represents a specific
      // power of 2 size. Memory to this "array" is allocated in
      // _M_initialize().
      _Bin_record* volatile	_M_bin; // array of bin records

      // Actual value calculated in _M_initialize().
      size_t 	       	     	_M_bin_size; // number of bins

      _Thread_record* 		_M_thread_freelist;
      void*			_M_thread_freelist_initial;

      void
      _M_initialize();
    };

(2) Multi-threaded pool initialization

Source code (mt_allocator.cc):

void
  __pool<true>::_M_initialize(__destroy_handler __d)
  {
    // _M_force_new must not change after the first allocate(),
    // which in turn calls this method, so if it's false, it's false
    // forever and we don't need to return here ever again.
    if (_M_options._M_force_new) 
      {
	_M_init = true;
	return;
      }
      
    // Create the bins.
    // Calculate the number of bins required based on _M_max_bytes.
    // _M_bin_size is statically-initialized to one.
    size_t __bin_size = _M_options._M_min_bin;
    while (_M_options._M_max_bytes > __bin_size)
      {
	__bin_size <<= 1;
	++_M_bin_size;
      }
      
    // Setup the bin map for quick lookup of the relevant bin.
    const size_t __j = (_M_options._M_max_bytes + 1) * sizeof(_Binmap_type);
    _M_binmap = static_cast<_Binmap_type*>(::operator new(__j));
    _Binmap_type* __bp = _M_binmap;
    _Binmap_type __bin_max = _M_options._M_min_bin;
    _Binmap_type __bint = 0;
    for (_Binmap_type __ct = 0; __ct <= _M_options._M_max_bytes; ++__ct)
      {
	if (__ct > __bin_max)
	  {
	    __bin_max <<= 1;
	    ++__bint;
	  }
	*__bp++ = __bint;
      }
      
    // Initialize _M_bin and its members.
    void* __v = ::operator new(sizeof(_Bin_record) * _M_bin_size); // _M_bin_size bin records
    _M_bin = static_cast<_Bin_record*>(__v);
      
    // If __gthread_active_p() create and initialize the list of
    // free thread ids. Single threaded applications use thread id 0
    // directly and have no need for this.
    if (__gthread_active_p())
      {
	const size_t __k = sizeof(_Thread_record) * _M_options._M_max_threads;
	__v = ::operator new(__k);
	_M_thread_freelist = static_cast<_Thread_record*>(__v);
	_M_thread_freelist_initial = __v;
	  
	// NOTE! The first assignable thread id is 1 since the
	// global pool uses id 0
	size_t __i;
	for (__i = 1; __i < _M_options._M_max_threads; ++__i)
	  {
	    _Thread_record& __tr = _M_thread_freelist[__i - 1];
	    __tr._M_next = &_M_thread_freelist[__i];
	    __tr._M_id = __i; // initialize the thread-id list entries
	  }
	  
	// Set last record.
	_M_thread_freelist[__i - 1]._M_next = NULL;
	_M_thread_freelist[__i - 1]._M_id = __i;
	  
	// Initialize per thread key to hold pointer to
	// _M_thread_freelist.
	__gthread_key_create(&__gnu_internal::freelist_key, __d);
	  
	const size_t __max_threads = _M_options._M_max_threads + 1;
	for (size_t __n = 0; __n < _M_bin_size; ++__n)
	  {
	    _Bin_record& __bin = _M_bin[__n];
	    __v = ::operator new(sizeof(_Block_record*) * __max_threads);
	    __bin._M_first = static_cast<_Block_record**>(__v);

	    __bin._M_address = NULL;

	    __v = ::operator new(sizeof(size_t) * __max_threads);
	    __bin._M_free = static_cast<size_t*>(__v);
	      
	    __v = ::operator new(sizeof(size_t) * __max_threads);
	    __bin._M_used = static_cast<size_t*>(__v);
	      
	    __v = ::operator new(sizeof(__gthread_mutex_t));
	    __bin._M_mutex = static_cast<__gthread_mutex_t*>(__v);
	      
#ifdef __GTHREAD_MUTEX_INIT
	    {
	      // Do not copy a POSIX/gthr mutex once in use.
	      __gthread_mutex_t __tmp = __GTHREAD_MUTEX_INIT;
	      *__bin._M_mutex = __tmp;
	    }
#else
	    { __GTHREAD_MUTEX_INIT_FUNCTION(__bin._M_mutex); }
#endif
	    for (size_t __threadn = 0; __threadn < __max_threads; ++__threadn)
	      {
		__bin._M_first[__threadn] = NULL;
		__bin._M_free[__threadn] = 0;
		__bin._M_used[__threadn] = 0;
	      }
	  }
      }
    else
      {
	for (size_t __n = 0; __n < _M_bin_size; ++__n)
	  {
	    _Bin_record& __bin = _M_bin[__n]; // initialize a bin record for each size class
	    __v = ::operator new(sizeof(_Block_record*));
	    __bin._M_first = static_cast<_Block_record**>(__v);
	    __bin._M_first[0] = NULL; // _M_first is the per-thread array of free-list heads; only slot 0 is used here
	    __bin._M_address = NULL;
	  }
      }
    _M_init = true;
  }

(3) Reclaiming a block to the global pool

Source code (mt_allocator.cc):

void
  __pool<true>::_M_reclaim_block(char* __p, size_t __bytes)
  {
    // Round up to power of 2 and figure out which bin to use.
    const size_t __which = _M_binmap[__bytes];
    const _Bin_record& __bin = _M_bin[__which];

    // Know __p not null, assume valid block.
    char* __c = __p - _M_get_align();
    _Block_record* __block = reinterpret_cast<_Block_record*>(__c);
    if (__gthread_active_p())
      {
	// Calculate the number of records to remove from our freelist:
	// in order to avoid too much contention we wait until the
	// number of records is "high enough".
	const size_t __thread_id = _M_get_thread_id();
	const _Tune& __options = _M_get_options();	
	const unsigned long __limit = 100 * (_M_bin_size - __which)
		                      * __options._M_freelist_headroom;

	unsigned long __remove = __bin._M_free[__thread_id];
	__remove *= __options._M_freelist_headroom;
	if (__remove >= __bin._M_used[__thread_id])
	  __remove -= __bin._M_used[__thread_id];
	else
	  __remove = 0;
	if (__remove > __limit && __remove > __bin._M_free[__thread_id])
	  {
	    _Block_record* __first = __bin._M_first[__thread_id];
	    _Block_record* __tmp = __first;
	    __remove /= __options._M_freelist_headroom;
	    const unsigned long __removed = __remove;
	    while (--__remove > 0)
	      __tmp = __tmp->_M_next;
	    __bin._M_first[__thread_id] = __tmp->_M_next;
	    __bin._M_free[__thread_id] -= __removed;
	    
	    __gthread_mutex_lock(__bin._M_mutex);
	    __tmp->_M_next = __bin._M_first[0];
	    __bin._M_first[0] = __first;
	    __bin._M_free[0] += __removed;
	    __gthread_mutex_unlock(__bin._M_mutex);
	  }

	// Return this block to our list and update counters and
	// owner id as needed.
	--__bin._M_used[__block->_M_thread_id];
	
	__block->_M_next = __bin._M_first[__thread_id];
	__bin._M_first[__thread_id] = __block;
	
	++__bin._M_free[__thread_id];
      }
    else
      {
	// Not using threads, so single threaded application - return
	// to global pool.
	__block->_M_next = __bin._M_first[0];
	__bin._M_first[0] = __block;
      }
  }



