ConcurrentMarkSweepGeneration, Part 2

2022-11-24  程序员札记

ModUnionClosure / ModUnionClosurePar

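Both classes are defined in concurrentMarkSweepGeneration.hpp. They are applied to a MemRegion and mark the range of a BitMap that corresponds to it. Their class hierarchy is as follows:

[image: class hierarchy of ModUnionClosure and ModUnionClosurePar]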

The core do_MemRegion method is implemented as follows:

[image: implementation of do_MemRegion]

_t is the CMSBitMap pointer passed to the constructor.
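To make the idea concrete, here is a minimal sketch of marking a memory range in a bitmap. ToyBitMap, ToyModUnionClosure, and the use of plain word indices as addresses are hypothetical simplifications, not HotSpot's CMSBitMap. The shifter parameter is assumed to mean that one bit covers 2^shifter heap words, so a shifter of 6 makes one bit cover 64 words.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for CMSBitMap (hypothetical): one bit covers 2^shifter heap words.
class ToyBitMap {
 public:
  ToyBitMap(std::size_t start_word, std::size_t word_count, int shifter)
      : _start(start_word), _shifter(shifter),
        _bits((word_count + (std::size_t(1) << shifter) - 1) >> shifter, false) {}

  // Mark every bit that overlaps the half-open word range [from, to).
  void mark_range(std::size_t from, std::size_t to) {
    if (from >= to) return;
    std::size_t first = (from - _start) >> _shifter;
    std::size_t last  = (to - 1 - _start) >> _shifter;
    for (std::size_t i = first; i <= last; ++i) _bits[i] = true;
  }

  bool is_marked(std::size_t word) const {
    return _bits[(word - _start) >> _shifter];
  }

 private:
  std::size_t _start;
  int _shifter;
  std::vector<bool> _bits;
};

// Toy closure in the spirit of ModUnionClosure: it forwards a region to the bitmap.
struct ToyModUnionClosure {
  ToyBitMap* _t;
  explicit ToyModUnionClosure(ToyBitMap* t) : _t(t) {}
  void do_MemRegion(std::size_t from, std::size_t to) { _t->mark_range(from, to); }
};
```

With a shifter of 6, applying the closure to words [70, 130) sets the bits for words 64..191, because the range touches two 64-word units.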

CMSIsAliveClosure / CMSParKeepAliveClosure

CMSIsAliveClosure decides whether an object is alive, and CMSParKeepAliveClosure marks an object as alive. Both rely on the underlying BitMap. Their implementations are as follows:

CMSIsAliveClosure(MemRegion span,
                    CMSBitMap* bit_map):
    _span(span), // span is the memory region covered by the old generation
    _bit_map(bit_map) // pointer to the CMSBitMap
  {
    assert(!span.is_empty(), "Empty span could spell trouble");
  }
 
bool CMSIsAliveClosure::do_object_b(oop obj) {
  HeapWord* addr = (HeapWord*)obj;
  // An object outside the span, or one marked in the BitMap, is considered alive
  return addr != NULL &&
         (!_span.contains(addr) || _bit_map->isMarked(addr));
}
 
CMSParKeepAliveClosure::CMSParKeepAliveClosure(CMSCollector* collector,
  MemRegion span, CMSBitMap* bit_map, OopTaskQueue* work_queue):
   _span(span), // memory region covered by the old generation
   _bit_map(bit_map), // the old generation's BitMap
   _work_queue(work_queue), // queue of pending work
   _mark_and_push(collector, span, bit_map, work_queue), // CMSInnerParMarkAndPushClosure instance
   // max_elems() is the maximum capacity of _work_queue
   _low_water_mark(MIN2((uint)(work_queue->max_elems()/4),
                       // CMSWorkQueueDrainThreshold is the drain threshold of the CMS work queue, default 10
                        (uint)(CMSWorkQueueDrainThreshold * ParallelGCThreads)))
{ }
 
void CMSParKeepAliveClosure::do_oop(oop* p)       { CMSParKeepAliveClosure::do_oop_work(p); }
 
void CMSParKeepAliveClosure::do_oop(narrowOop* p) { CMSParKeepAliveClosure::do_oop_work(p); }
 
void CMSParKeepAliveClosure::do_oop(oop obj) {
  HeapWord* addr = (HeapWord*)obj;
  // If addr lies in the old generation and is not yet marked
  if (_span.contains(addr) &&
      !_bit_map->isMarked(addr)) {
    // par_mark returns false if another thread marked the object first
    if (_bit_map->par_mark(addr)) {
      // Push obj onto the work queue
      bool res = _work_queue->push(obj);
      assert(res, "Low water mark should be much less than capacity");
      // If the queue now holds more than _low_water_mark oops, process some of them
      trim_queue(_low_water_mark);
    } // Else, another thread got there first
  }
}
 
void CMSParKeepAliveClosure::trim_queue(uint max) {
  // While too many oops are pending
  while (_work_queue->size() > max) {
    oop new_oop;
    // Pop one pending oop
    if (_work_queue->pop_local(new_oop)) {
      assert(new_oop != NULL && new_oop->is_oop(), "Expected an oop");
      assert(_bit_map->isMarked((HeapWord*)new_oop),
             "no white objects on this stack!");
      assert(_span.contains((HeapWord*)new_oop), "Out of bounds oop");
      // Visit the oops referenced by this oop
      new_oop->oop_iterate(&_mark_and_push);
    }
  }
}
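The two closures can be pictured with a small model. The sketch below is hypothetical: addresses are plain word indices, address 0 plays the role of NULL, one atomic flag per word replaces the mark bitmap, and a std::deque replaces the OopTaskQueue. It only illustrates the liveness test and the mark, push, and trim sequence, not the real object scanning.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical toy model: the "old generation" is the half-open span [lo, hi),
// and each word in it has one atomic mark flag.
struct ToyMarkState {
  std::size_t lo, hi;
  std::vector<std::atomic<bool>> marks;
  ToyMarkState(std::size_t l, std::size_t h) : lo(l), hi(h), marks(h - l) {
    for (auto& m : marks) m.store(false);
  }
  bool in_span(std::size_t addr) const { return addr >= lo && addr < hi; }
  bool is_marked(std::size_t addr) const { return marks[addr - lo].load(); }
  // Returns true only for the caller that flips the flag from clear to set.
  bool par_mark(std::size_t addr) { return !marks[addr - lo].exchange(true); }
};

// Same shape as CMSIsAliveClosure::do_object_b: non-null, and either outside
// the span or already marked.
bool is_alive(const ToyMarkState& s, std::size_t addr) {
  return addr != 0 && (!s.in_span(addr) || s.is_marked(addr));
}

// Same shape as CMSParKeepAliveClosure::do_oop: mark, push, then trim.
// Returns true if this call was the one that marked and queued addr.
bool keep_alive(ToyMarkState& s, std::deque<std::size_t>& queue,
                std::size_t low_water_mark, std::size_t addr) {
  if (!s.in_span(addr) || s.is_marked(addr)) return false;
  if (!s.par_mark(addr)) return false;  // another thread won the race
  queue.push_back(addr);
  while (queue.size() > low_water_mark) {
    queue.pop_back();  // the real code scans the popped object's fields here
  }
  return true;
}
```

The atomic exchange is what makes the marking safe under contention: of several threads racing on the same address, exactly one sees the flag go from clear to set and takes responsibility for queuing the object.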

CFLS_LAB

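CFLS_LAB is defined in compactibleFreeListSpace.hpp in the same directory. It is the thread-local allocation buffer used by each GC thread in the old generation during parallel GC; its fields are declared in that header.

The following methods deserve a closer look.

1. Constructor and modify_initialization
modify_initialization is called only when the default value of CMSParPromoteBlocksToClaim or OldPLABWeight has been explicitly changed on the command line. Its call chain is as follows:

[image: call chain of CFLS_LAB::modify_initialization]

The method modifies the static field _blocks_to_claim, so it can run at startup. The call in Arguments::set_cms_and_parnew_gc_flags is shown in the following figure:

[image: the call in Arguments::set_cms_and_parnew_gc_flags]

The two are implemented as follows: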

CFLS_LAB::CFLS_LAB(CompactibleFreeListSpace* cfls) :
  _cfls(cfls)
{
  assert(CompactibleFreeListSpace::IndexSetSize == 257, "Modify VECTOR_257() macro above");
  // CompactibleFreeListSpace::set_cms_values initializes IndexSetStart to MinChunkSize and IndexSetStride to MinObjAlignment
  for (size_t i = CompactibleFreeListSpace::IndexSetStart;
       i < CompactibleFreeListSpace::IndexSetSize;
       i += CompactibleFreeListSpace::IndexSetStride) {
    _indexedFreeList[i].set_size(i);
    _num_blocks[i] = 0;
  }
}
 
static bool _CFLS_LAB_modified = false;
 
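// In the actual call, n is OldPLABSize and wt is OldPLABWeight
// OldPLABSize is the size of the LAB used for promotion into the old generation, default 1024
// OldPLABWeight is the exponential-decay percentage used when resizing CMSParPromoteBlocksToClaim, default 50
// CMSParPromoteBlocksToClaim is the number of blocks to claim when refilling a LAB during parallel GC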
void CFLS_LAB::modify_initialization(size_t n, unsigned wt) {
  assert(!_CFLS_LAB_modified, "Call only once");
  _CFLS_LAB_modified = true;
  for (size_t i = CompactibleFreeListSpace::IndexSetStart;
       i < CompactibleFreeListSpace::IndexSetSize;
       i += CompactibleFreeListSpace::IndexSetStride) {
    _blocks_to_claim[i].modify(n, wt, true /* force */);
  }
}
 
#define VECTOR_257(x)                                                                                  \
  /* 1  2  3  4  5  6  7  8  9 1x 11 12 13 14 15 16 17 18 19 2x 21 22 23 24 25 26 27 28 29 3x 31 32 */ \
  {  x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x,   \
     x }
 
// Static initialization
AdaptiveWeightedAverage CFLS_LAB::_blocks_to_claim[]    =
  VECTOR_257(AdaptiveWeightedAverage(OldPLABWeight, (float)CMSParPromoteBlocksToClaim));
size_t CFLS_LAB::_global_num_blocks[]  = VECTOR_257(0);
uint   CFLS_LAB::_global_num_workers[] = VECTOR_257(0);

The call chain of the constructor is as follows:

[image: call chain of the CFLS_LAB constructor]

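2. alloc
alloc allocates a block of the given size. If the size is greater than or equal to IndexSetSize, it tries to allocate from _dictionary; otherwise it allocates from the local FreeList for that size, refilling the list first if it holds no free blocks. The implementation is as follows: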

HeapWord* CFLS_LAB::alloc(size_t word_sz) {
  FreeChunk* res;
  assert(word_sz == _cfls->adjustObjectSize(word_sz), "Error");
  if (word_sz >=  CompactibleFreeListSpace::IndexSetSize) {
    // At or above IndexSetSize: take _parDictionaryAllocLock and allocate from _dictionary
    MutexLockerEx x(_cfls->parDictionaryAllocLock(),
                    Mutex::_no_safepoint_check_flag);
    res = _cfls->getChunkFromDictionaryExact(word_sz);
    // Allocation failed: return NULL
    if (res == NULL) return NULL;
  } else {
    // Get the FreeList for this size
    AdaptiveFreeList<FreeChunk>* fl = &_indexedFreeList[word_sz];
    if (fl->count() == 0) {
      // The list is empty: try to refill it
      get_from_global_pool(word_sz, fl);
      // Refill failed: return NULL
      if (fl->count() == 0) return NULL;
    }
    // Take the FreeChunk at the head of the list
    res = fl->get_chunk_at_head();
    assert(res != NULL, "Why was count non-zero?");
  }
  // Mark it as not free
  res->markNotFree();
  assert(!res->is_free(), "shouldn't be marked free");
  assert(oop(res)->klass_or_null() == NULL, "should look uninitialized");
  return (HeapWord*)res;
}
 
void CFLS_LAB::get_from_global_pool(size_t word_sz, AdaptiveFreeList<FreeChunk>* fl) {
  // Number of FreeChunks to claim for this refill
  size_t n_blks = (size_t)_blocks_to_claim[word_sz].average();
  assert(n_blks > 0, "Error");
  // ResizeOldPLAB controls whether the number of blocks in the promotion LAB is resized dynamically, default true
  assert(ResizeOldPLAB || n_blks == OldPLABSize, "Error");
  // CMSOldPLABResizeQuicker defaults to false
  if (ResizeOldPLAB && CMSOldPLABResizeQuicker) {
    size_t multiple = _num_blocks[word_sz]/(CMSOldPLABToleranceFactor*CMSOldPLABNumRefills*n_blks);
    n_blks +=  CMSOldPLABReactivityFactor*multiple*n_blks;
    n_blks = MIN2(n_blks, CMSOldPLABMax);
  }
  assert(n_blks > 0, "Error");
  // Claim up to n_blks blocks of the given size from _cfls and put them on fl
  _cfls->par_get_chunk_of_blocks(word_sz, n_blks, fl);
  // Update the number of blocks of this size recorded in _num_blocks
  _num_blocks[word_sz] += fl->count();
}
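The small-size path of alloc (local list first, batch refill from the shared space when the list is empty) can be sketched as follows. ToyGlobalPool and ToyLab are hypothetical names; the sketch tracks only block counts per size, uses a fixed batch size instead of the adaptive average, and leaves out the dictionary path for large blocks and all locking.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shared pool: free_blocks[sz] is the number of free blocks of size sz.
struct ToyGlobalPool {
  std::vector<std::size_t> free_blocks;
  explicit ToyGlobalPool(std::size_t sizes) : free_blocks(sizes, 0) {}
  // Hand out up to n blocks of size sz; returns how many were actually taken.
  std::size_t take(std::size_t sz, std::size_t n) {
    std::size_t got = free_blocks[sz] < n ? free_blocks[sz] : n;
    free_blocks[sz] -= got;
    return got;
  }
};

// Hypothetical per-thread buffer in the spirit of CFLS_LAB.
struct ToyLab {
  ToyGlobalPool* pool;
  std::size_t blocks_to_claim;           // batch size per refill (fixed here)
  std::vector<std::size_t> local_count;  // blocks currently cached per size
  std::vector<std::size_t> num_blocks;   // blocks obtained since the last retire

  ToyLab(ToyGlobalPool* p, std::size_t sizes, std::size_t claim)
      : pool(p), blocks_to_claim(claim),
        local_count(sizes, 0), num_blocks(sizes, 0) {}

  // Returns true if a block of size sz could be allocated.
  bool alloc(std::size_t sz) {
    if (local_count[sz] == 0) {
      // Local list is empty: refill it in one batch from the shared pool.
      std::size_t got = pool->take(sz, blocks_to_claim);
      local_count[sz] += got;
      num_blocks[sz] += got;
      if (got == 0) return false;  // refill failed
    }
    --local_count[sz];
    return true;
  }
};
```

The point of the batch refill is that a worker touches the shared pool only once per batch, so most promotions are served from thread-local state without contention.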

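3. retire
retire is called by the VMThread after promotion completes. It returns all unused free blocks held in the per-size FreeLists and resets each FreeList and its _num_blocks entry to the initial state. The implementation is as follows: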

void CFLS_LAB::retire(int tid) {
  // Called only by the VMThread while it performs a GC
  assert(Thread::current()->is_VM_thread(), "Error");
  for (size_t i =  CompactibleFreeListSpace::IndexSetStart;
       i < CompactibleFreeListSpace::IndexSetSize;
       i += CompactibleFreeListSpace::IndexSetStride) {
    // _num_blocks changes only when the FreeList is refilled, while count drops by 1 each time a block is handed out, so the former is >= the latter
    assert(_num_blocks[i] >= (size_t)_indexedFreeList[i].count(),
           "Can't retire more than what we obtained");
    // _num_blocks == 0 means no refill happened, so the FreeList must be empty
    if (_num_blocks[i] > 0) {
      // Number of free blocks left in the FreeList
      size_t num_retire =  _indexedFreeList[i].count();
      assert(_num_blocks[i] > num_retire, "Should have used at least one");
      {
        
        // Accumulate the number of blocks actually used
        _global_num_blocks[i] += (_num_blocks[i] - num_retire);
        _global_num_workers[i]++;
        // Each GC thread has its own CFLS_LAB instance, so this cannot exceed ParallelGCThreads
        assert(_global_num_workers[i] <= ParallelGCThreads, "Too big");
        if (num_retire > 0) {
          // Return the remaining free blocks to the FreeList of the same size in _cfls
          _cfls->_indexedFreeList[i].prepend(&_indexedFreeList[i]);
          // Reset the FreeList
          _indexedFreeList[i] = AdaptiveFreeList<FreeChunk>();
          _indexedFreeList[i].set_size(i);
        }
      }
      if (PrintOldPLAB) {
        gclog_or_tty->print_cr("%d[" SIZE_FORMAT "]: " SIZE_FORMAT "/" SIZE_FORMAT "/" SIZE_FORMAT,
                               tid, i, num_retire, _num_blocks[i], (size_t)_blocks_to_claim[i].average());
      }
      // Reset _num_blocks to 0
      _num_blocks[i]         = 0;
    }
  }
}

Its call chain is as follows:

[image: call chain of CFLS_LAB::retire]

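4. compute_desired_plab_size
compute_desired_plab_size dynamically adjusts how many FreeChunks are claimed when the FreeList of a given size is refilled. It combines the number of blocks of that size that were actually used, the number of GC threads that claimed blocks of that size, and other configuration parameters. The implementation is as follows: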

void CFLS_LAB::compute_desired_plab_size() {
  for (size_t i =  CompactibleFreeListSpace::IndexSetStart;
       i < CompactibleFreeListSpace::IndexSetSize;
       i += CompactibleFreeListSpace::IndexSetStride) {
    // The two conditions are either both true or both false
    assert((_global_num_workers[i] == 0) == (_global_num_blocks[i] == 0),
           "Counter inconsistency");
    if (_global_num_workers[i] > 0) {
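      // Greater than 0 means some GC thread claimed blocks of this size
      // ResizeOldPLAB defaults to true
      if (ResizeOldPLAB) {
        // CMSOldPLABMin (default 16) and CMSOldPLABMax (default 1024) bound the number of free blocks pre-allocated for promotion into the CMS old generation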
        _blocks_to_claim[i].sample(
          MAX2((size_t)CMSOldPLABMin,
          MIN2((size_t)CMSOldPLABMax,
               _global_num_blocks[i]/(_global_num_workers[i]*CMSOldPLABNumRefills))));
      }
      // Reset to the initial state
      _global_num_workers[i] = 0;
      _global_num_blocks[i] = 0;
      if (PrintOldPLAB) {
        gclog_or_tty->print_cr("[" SIZE_FORMAT "]: " SIZE_FORMAT, i, (size_t)_blocks_to_claim[i].average());
      }
    }
  }
}

Its call chain is as follows:

[image: call chain of CFLS_LAB::compute_desired_plab_size]
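The interplay of retire and compute_desired_plab_size can be followed with concrete numbers. The sketch below is a simplification under stated assumptions: ToyWeightedAverage is a plain exponentially weighted average without the warm-up behaviour that AdaptiveWeightedAverage may apply to early samples, and the bounds and the refill count are passed in as ordinary parameters rather than read from CMSOldPLABMin, CMSOldPLABMax, and CMSOldPLABNumRefills.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Simplified exponentially weighted average (assumption: no warm-up phase).
// weight is a percentage: how much the newest sample contributes.
struct ToyWeightedAverage {
  float avg;
  unsigned weight;
  ToyWeightedAverage(unsigned w, float initial) : avg(initial), weight(w) {}
  void sample(float s) {
    avg = ((100.0f - weight) * avg + weight * s) / 100.0f;
  }
};

// Per-size bookkeeping, following the field names used in CFLS_LAB.
struct ToySizeStats {
  std::size_t global_num_blocks = 0;  // blocks actually consumed by all workers
  unsigned global_num_workers = 0;    // workers that refilled this size
};

// What one worker contributes when it retires its buffer for one size.
void retire_one(ToySizeStats& st, std::size_t obtained, std::size_t unused) {
  assert(obtained > unused);
  st.global_num_blocks += obtained - unused;
  st.global_num_workers++;
}

// Clamp the per-worker, per-refill usage into [min_blocks, max_blocks],
// feed it to the average, then reset the counters.
void compute_desired(ToySizeStats& st, ToyWeightedAverage& claim,
                     std::size_t min_blocks, std::size_t max_blocks,
                     std::size_t num_refills) {
  if (st.global_num_workers == 0) return;
  std::size_t per_refill =
      st.global_num_blocks / (st.global_num_workers * num_refills);
  claim.sample((float)std::max(min_blocks, std::min(max_blocks, per_refill)));
  st = ToySizeStats();
}
```

For example, with a weight of 50, an initial average of 16, bounds of 16 and 1024, and 4 refills per worker: two workers that used 360 and 280 blocks give a sample of 640 / (2 * 4) = 80, and the average moves from 16 to 48.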

ConcurrentMarkSweepGeneration

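1. Definition
ConcurrentMarkSweepGeneration represents the CMS old generation. It is also defined in concurrentMarkSweepGeneration.hpp, where its fields are declared.

CMSExpansionCause describes why the CMS old generation was expanded. It is defined as follows:

[image: definition of CMSExpansionCause]

The to_string method returns the string description of a Cause; it is used when printing GC logs.

CollectionTypes is an enum, defined as follows:

[image: definition of CollectionTypes]

CMSParGCThreadState is a simple data structure used to pre-allocate memory for GC threads during promotion into the old generation. It acts like a TLAB for a GC thread: it avoids requesting space from the heap for every object copied during promotion, which makes promotion faster.

Note that both fields of CMSParGCThreadState are public, so callers can invoke the public methods of these two fields directly. For PromotionInfo, see the discussion of CompactibleFreeListSpace.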

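CMSCollector is defined in the same file. The two classes are closely related, so they are covered together. Its fields are as follows:

[image: fields of CMSCollector]

The state transitions are as follows:

[image: CMSCollector state transitions]

CMSStats is a data structure that holds statistics about CMS memory allocation and garbage collection. Its fields are as follows:

[image: fields of CMSStats]

The following methods deserve a closer look.

2. Constructor and ref_processor_init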

ConcurrentMarkSweepGeneration::ConcurrentMarkSweepGeneration(
     ReservedSpace rs, size_t initial_byte_size, int level,
     CardTableRS* ct, bool use_adaptive_freelists,
     FreeBlockDictionary<FreeChunk>::DictionaryChoice dictionaryChoice) :
  CardGeneration(rs, initial_byte_size, level, ct),
  // MinChunkSize is initialized in CompactibleFreeListSpace::set_cms_values
  _dilatation_factor(((double)MinChunkSize)/((double)(CollectedHeap::min_fill_size()))),
  _debug_collection_type(Concurrent_collection_type),
  _did_compact(false)
{
  HeapWord* bottom = (HeapWord*) _virtual_space.low();
  HeapWord* end    = (HeapWord*) _virtual_space.high();
 
  _direct_allocated_words = 0;
  // Initialize _cmsSpace
  _cmsSpace = new CompactibleFreeListSpace(_bts, MemRegion(bottom, end),
                                           use_adaptive_freelists,
                                           dictionaryChoice);
  
  if (_cmsSpace == NULL) {
    vm_exit_during_initialization(
      "CompactibleFreeListSpace allocation failure");
  }
  _cmsSpace->_gen = this;
 
  _gc_stats = new CMSGCStats();
 
  
  if (CollectedHeap::use_parallel_gc_threads()) {
    typedef CMSParGCThreadState* CMSParGCThreadStatePtr;
    // Create an array of CMSParGCThreadStatePtr
    _par_gc_thread_states =
      NEW_C_HEAP_ARRAY(CMSParGCThreadStatePtr, ParallelGCThreads, mtGC);
    if (_par_gc_thread_states == NULL) {
      vm_exit_during_initialization("Could not allocate par gc structs");
    }
    // Initialize the array elements
    for (uint i = 0; i < ParallelGCThreads; i++) {
      _par_gc_thread_states[i] = new CMSParGCThreadState(cmsSpace());
      if (_par_gc_thread_states[i] == NULL) {
        vm_exit_during_initialization("Could not allocate par gc structs");
      }
    }
  } else {
    _par_gc_thread_states = NULL;
  }
  _incremental_collection_failed = false;
  assert(MinChunkSize >= CollectedHeap::min_fill_size(), "just checking");
  assert(_dilatation_factor >= 1.0, "from previous assert");
}
 
void ConcurrentMarkSweepGeneration::ref_processor_init() {
  assert(collector() != NULL, "no collector");
  collector()->ref_processor_init();
}
 
void CMSCollector::ref_processor_init() {
  if (_ref_processor == NULL) {
    // Allocate and initialize a reference processor
    _ref_processor =
      new ReferenceProcessor(_span,                               // span
                             (ParallelGCThreads > 1) && ParallelRefProcEnabled, // mt processing; ParallelRefProcEnabled controls whether References are processed in parallel, default false
                             (int) ParallelGCThreads,             // mt processing degree
                             _cmsGen->refs_discovery_is_mt(),     // mt discovery
                             (int) MAX2(ConcGCThreads, ParallelGCThreads), // mt discovery degree
                             _cmsGen->refs_discovery_is_atomic(), // discovery is not atomic
                             &_is_alive_closure);                 // closure for liveness info
    _cmsGen->set_ref_processor(_ref_processor);
 
  }
}
 
bool refs_discovery_is_mt()     const {
    return ConcGCThreads > 1;
  }
 
bool refs_discovery_is_atomic() const { return false; }
 
static size_t min_fill_size() {
    return size_t(align_object_size(oopDesc::header_size()));
  }
 
CompactibleFreeListSpace*  cmsSpace() const { return _cmsSpace;  }
 
CMSCollector::CMSCollector(ConcurrentMarkSweepGeneration* cmsGen,
                           CardTableRS*                   ct,
                           ConcurrentMarkSweepPolicy*     cp):
  _cmsGen(cmsGen),
  _ct(ct),
  _ref_processor(NULL),    // will be set later
  _conc_workers(NULL),     // may be set later
  _abort_preclean(false),
  _start_sampling(false),
  _between_prologue_and_epilogue(false),
  _markBitMap(0, Mutex::leaf + 1, "CMS_markBitMap_lock"), // shifter is 0
  _modUnionTable((CardTableModRefBS::card_shift - LogHeapWordSize),
                 -1 /* lock-free */, "No_lock" /* dummy */), // shifter is CardTableModRefBS::card_shift - LogHeapWordSize; the former is 9 and the latter is 3 on 64-bit
  _modUnionClosure(&_modUnionTable),
  _modUnionClosurePar(&_modUnionTable),
  _span(cmsGen->reserved()),
  _is_alive_closure(_span, &_markBitMap),
  _restart_addr(NULL),
  _overflow_list(NULL),
  _stats(cmsGen), // CMSStats instance that collects GC data such as memory allocation and object copying
  _eden_chunk_lock(new Mutex(Mutex::leaf + 1, "CMS_eden_chunk_lock", true)),
  _eden_chunk_array(NULL),     // may be set in ctor body
  _eden_chunk_capacity(0),     // -- ditto --
  _eden_chunk_index(0),        // -- ditto --
  _survivor_plab_array(NULL),  // -- ditto --
  _survivor_chunk_array(NULL), // -- ditto --
  _survivor_chunk_capacity(0), // -- ditto --
  _survivor_chunk_index(0),    // -- ditto --
  _ser_pmc_preclean_ovflw(0),
  _ser_kac_preclean_ovflw(0),
  _ser_pmc_remark_ovflw(0),
  _par_pmc_remark_ovflw(0),
  _ser_kac_ovflw(0),
  _par_kac_ovflw(0),
  _collection_count_start(0),
  _verifying(false),
  _icms_start_limit(NULL),
  _icms_stop_limit(NULL),
  _verification_mark_bm(0, Mutex::leaf + 1, "CMS_verification_mark_bm_lock"),
  _completed_initialization(false),
  _collector_policy(cp),
  _should_unload_classes(CMSClassUnloadingEnabled), // CMSClassUnloadingEnabled controls whether class unloading is allowed under CMS, default true
  _concurrent_cycles_since_last_unload(0),
  _roots_scanning_options(GenCollectedHeap::SO_None), // root scanning options
  _inter_sweep_estimate(CMS_SweepWeight, CMS_SweepPadding), // CMS_SweepWeight defaults to 75, CMS_SweepPadding defaults to 1
  _intra_sweep_estimate(CMS_SweepWeight, CMS_SweepPadding),
  _gc_tracer_cm(new (ResourceObj::C_HEAP, mtGC) CMSTracer()),
  _gc_timer_cm(new (ResourceObj::C_HEAP, mtGC) ConcurrentGCTimer()),
  _cms_start_registered(false)
{
  // ExplicitGCInvokesConcurrentAndUnloadsClasses is only used with CMS and defaults to false; when true, System.gc() triggers a concurrent GC that also unloads classes
  if (ExplicitGCInvokesConcurrentAndUnloadsClasses) {
    ExplicitGCInvokesConcurrent = true;
  }
  
  // Set the _collector field of cmsSpace
  _cmsGen->cmsSpace()->set_collector(this);
 
  // Take the _markBitMap lock and allocate the two CMSBitMaps, _markBitMap and _modUnionTable
  {
    MutexLockerEx x(_markBitMap.lock(), Mutex::_no_safepoint_check_flag);
    if (!_markBitMap.allocate(_span)) {
      warning("Failed to allocate CMS Bit Map");
      return;
    }
    assert(_markBitMap.covers(_span), "_markBitMap inconsistency?");
  }
  {
    _modUnionTable.allocate(_span);
    assert(_modUnionTable.covers(_span), "_modUnionTable inconsistency?");
  }
  // MarkStackSize is the initial capacity of the marking stack, default 4M; allocate _markStack
  if (!_markStack.allocate(MarkStackSize)) {
    warning("Failed to allocate CMS Marking Stack");
    return;
  }
 
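  // CMSConcurrentMTEnabled controls whether the concurrent phases run multi-threaded, default true
  if (CMSConcurrentMTEnabled) {
    // ConcGCThreads is the number of concurrent GC threads, default 0
    if (FLAG_IS_DEFAULT(ConcGCThreads)) {
      // Still the default: derive it from ParallelGCThreads, the number of parallel GC threads, whose declared default is also 0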
      FLAG_SET_DEFAULT(ConcGCThreads, (ParallelGCThreads + 3)/4);
    }
    if (ConcGCThreads > 1) {
      _conc_workers = new YieldingFlexibleWorkGang("Parallel CMS Threads",
                                 ConcGCThreads, true);
      if (_conc_workers == NULL) {
        warning("GC/CMS: _conc_workers allocation failure: "
              "forcing -CMSConcurrentMTEnabled");
        CMSConcurrentMTEnabled = false;
      } else {
        // Initialize the worker threads
        _conc_workers->initialize_workers();
      }
    } else {
      CMSConcurrentMTEnabled = false;
    }
  }
  if (!CMSConcurrentMTEnabled) {
    ConcGCThreads = 0;
  } else {
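    // CMSCleanOnEnter defaults to true; it is an optimization that reduces dirty card-table entries, and it is reset to false when multi-threaded concurrent GC is enabled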
    CMSCleanOnEnter = false;
  }
  assert((_conc_workers != NULL) == (ConcGCThreads > 1),
         "Inconsistency");
 
  {
    uint i;
    // Take the larger of ParallelGCThreads and ConcGCThreads
    uint num_queues = (uint) MAX2(ParallelGCThreads, ConcGCThreads);
    // CMSParallelRemarkEnabled controls whether remark runs in parallel, default true
    // ParallelRefProcEnabled controls whether Reference instances are processed in parallel, default false
    if ((CMSParallelRemarkEnabled || CMSConcurrentMTEnabled
         || ParallelRefProcEnabled)
        && num_queues > 0) {
      // Create the task queue set
      _task_queues = new OopTaskQueueSet(num_queues);
      if (_task_queues == NULL) {
        warning("task_queues allocation failure.");
        return;
      }
      // Allocate the array that holds the hash seeds
      _hash_seed = NEW_C_HEAP_ARRAY(int, num_queues, mtGC);
      if (_hash_seed == NULL) {
        warning("_hash_seed array allocation failure");
        return;
      }
 
      typedef Padded<OopTaskQueue> PaddedOopTaskQueue;
      for (i = 0; i < num_queues; i++) {
        PaddedOopTaskQueue *q = new PaddedOopTaskQueue();
        if (q == NULL) {
          warning("work_queue allocation failure.");
          return;
        }
        _task_queues->register_queue(i, q);
      }
      for (i = 0; i < num_queues; i++) {
        _task_queues->queue(i)->initialize();
        _hash_seed[i] = 17;  // copied from ParNew
      }
    }
  }
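  // CMSInitiatingOccupancyFraction is the occupancy percentage at which an old-generation collection is triggered (the _initiating_occupancy field); the default is -1, in which case CMSTriggerRatio is used
  // CMSTriggerRatio is a percentage of MinHeapFreeRatio, default 80; MinHeapFreeRatio defaults to 40; together they yield the triggering occupancy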
  _cmsGen ->init_initiating_occupancy(CMSInitiatingOccupancyFraction, CMSTriggerRatio);
 
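  // CMSBootstrapOccupancy is the occupancy percentage that triggers the first old-generation collection, default 50
  _bootstrap_occupancy = ((double)CMSBootstrapOccupancy)/(double)100;
  // Number of full GCs since the last concurrent GC
  _full_gcs_since_conc_gc = 0;
 
  // Set the collector field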
  ConcurrentMarkSweepGeneration::set_collector(this);
 
  // Start the ConcurrentMarkSweepThread, which also creates CGC_lock
  _cmsThread = ConcurrentMarkSweepThread::start(this);
  assert(cmsThread() != NULL, "CMS Thread should have been created");
  assert(cmsThread()->collector() == this,
         "CMS Thread should refer to this gen");
  assert(CGC_lock != NULL, "Where's the CGC_lock?");
 
  GenCollectedHeap* gch = GenCollectedHeap::heap();
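  // Get a reference to the young generation
  _young_gen = gch->prev_gen(_cmsGen);
  // Whether the young generation supports inline contiguous allocation; DefNewGeneration does
  if (gch->supports_inline_contig_alloc()) {
    // Get the addresses of eden's top and end pointers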
    _top_addr = gch->top_addr();
    _end_addr = gch->end_addr();
    assert(_young_gen != NULL, "no _young_gen");
    _eden_chunk_index = 0;
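    // CMSSamplingGrain is the minimum distance between two CMS samples in eden, default 16K
    // Compute how many chunks of that size fit into the young generation's maximum capacity
    _eden_chunk_capacity = (_young_gen->max_capacity()+CMSSamplingGrain)/CMSSamplingGrain;
    // Allocate the array that holds the eden chunk addresses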
    _eden_chunk_array = NEW_C_HEAP_ARRAY(HeapWord*, _eden_chunk_capacity, mtGC);
    if (_eden_chunk_array == NULL) {
      // Allocation failed: set _eden_chunk_capacity to 0
      _eden_chunk_capacity = 0;
      warning("GC/CMS: _eden_chunk_array allocation failure");
    }
  }
  assert(_eden_chunk_array != NULL || _eden_chunk_capacity == 0, "Error");
 
  // CMSParallelSurvivorRemarkEnabled controls whether the survivor space is remarked in parallel, default true
  // CMSParallelInitialMarkEnabled controls whether initial mark runs in parallel, default true
  if ((CMSParallelRemarkEnabled && CMSParallelSurvivorRemarkEnabled) || CMSParallelInitialMarkEnabled) {
    const size_t max_plab_samples =
      ((DefNewGeneration*)_young_gen)->max_survivor_size() / plab_sample_minimum_size();
    // Allocate the three arrays
    _survivor_plab_array  = NEW_C_HEAP_ARRAY(ChunkArray, ParallelGCThreads, mtGC);
    _survivor_chunk_array = NEW_C_HEAP_ARRAY(HeapWord*, 2*max_plab_samples, mtGC);
    _cursor               = NEW_C_HEAP_ARRAY(size_t, ParallelGCThreads, mtGC);
    // If any of the allocations failed
    if (_survivor_plab_array == NULL || _survivor_chunk_array == NULL
        || _cursor == NULL) {
      warning("Failed to allocate survivor plab/chunk array");
      if (_survivor_plab_array  != NULL) {
        FREE_C_HEAP_ARRAY(ChunkArray, _survivor_plab_array, mtGC);
        _survivor_plab_array = NULL;
      }
      if (_survivor_chunk_array != NULL) {
        FREE_C_HEAP_ARRAY(HeapWord*, _survivor_chunk_array, mtGC);
        _survivor_chunk_array = NULL;
      }
      if (_cursor != NULL) {
        FREE_C_HEAP_ARRAY(size_t, _cursor, mtGC);
        _cursor = NULL;
      }
    } else {
      // All allocations succeeded
      _survivor_chunk_capacity = 2*max_plab_samples;
      for (uint i = 0; i < ParallelGCThreads; i++) {
        // Initialize the elements of _survivor_plab_array, which are of type ChunkArray
        HeapWord** vec = NEW_C_HEAP_ARRAY(HeapWord*, max_plab_samples, mtGC);
        if (vec == NULL) {
          warning("Failed to allocate survivor plab array");
          for (int j = i; j > 0; j--) {
            FREE_C_HEAP_ARRAY(HeapWord*, _survivor_plab_array[j-1].array(), mtGC);
          }
          FREE_C_HEAP_ARRAY(ChunkArray, _survivor_plab_array, mtGC);
          FREE_C_HEAP_ARRAY(HeapWord*, _survivor_chunk_array, mtGC);
          _survivor_plab_array = NULL;
          _survivor_chunk_array = NULL;
          _survivor_chunk_capacity = 0;
          break;
        } else {
          ChunkArray* cur =
            ::new (&_survivor_plab_array[i]) ChunkArray(vec,
                                                        max_plab_samples);
          assert(cur->end() == 0, "Should be 0");
          assert(cur->array() == vec, "Should be vec");
          assert(cur->capacity() == max_plab_samples, "Error");
        }
      }
    }
  }
  assert(   (   _survivor_plab_array  != NULL
             && _survivor_chunk_array != NULL)
         || (   _survivor_chunk_capacity == 0
             && _survivor_chunk_index == 0),
         "Error");
 
  _gc_counters = new CollectorCounters("CMS", 1);
  _completed_initialization = true;
  _inter_sweep_timer.start();  // start of time
}
 
void ConcurrentMarkSweepGeneration::init_initiating_occupancy(intx io, uintx tr) {
  assert(io <= 100 && tr <= 100, "Check the arguments");
  if (io >= 0) {
    _initiating_occupancy = (double)io / 100.0;
  } else {
    _initiating_occupancy = ((100 - MinHeapFreeRatio) +
                             (double)(tr * MinHeapFreeRatio) / 100.0)
                            / 100.0;
  }
}
 
static void set_collector(CMSCollector* collector) {
    assert(_collector == NULL, "already set");
    _collector = collector;
  }
 
bool GenCollectedHeap::supports_inline_contig_alloc() const {
  return _gens[0]->supports_inline_contig_alloc();
}
 
HeapWord** GenCollectedHeap::top_addr() const {
  return _gens[0]->top_addr();
}
 
HeapWord** GenCollectedHeap::end_addr() const {
  return _gens[0]->end_addr();
}
 
HeapWord** DefNewGeneration::top_addr() const { return eden()->top_addr(); }
HeapWord** DefNewGeneration::end_addr() const { return eden()->end_addr(); }
 
// Returns the maximum capacity of the young generation: the reserved size minus one survivor space
size_t DefNewGeneration::max_capacity() const {
  const size_t alignment = GenCollectedHeap::heap()->collector_policy()->space_alignment();
  const size_t reserved_bytes = reserved().byte_size();
  return reserved_bytes - compute_survivor_size(reserved_bytes, alignment);
}
 
size_t CMSCollector::plab_sample_minimum_size() {
  // Based on the minimum TLAB size (MinTLABSize, default 2K); the result is never smaller than 2K
  return MAX2(ThreadLocalAllocBuffer::min_size() * HeapWordSize, 2 * K);
}
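As a worked example of init_initiating_occupancy, the sketch below repeats its arithmetic in a free function. The function name and the idea of passing the flags as parameters are illustrative only; the formula is the one shown in the listing above.

```cpp
#include <cassert>
#include <cmath>

// Same arithmetic as init_initiating_occupancy, with the flag values passed in
// explicitly (hypothetical helper for illustration).
// io: CMSInitiatingOccupancyFraction, tr: CMSTriggerRatio, mhfr: MinHeapFreeRatio.
double initiating_occupancy(long io, unsigned tr, unsigned mhfr) {
  if (io >= 0) {
    return (double)io / 100.0;
  }
  return ((100.0 - mhfr) + (double)(tr * mhfr) / 100.0) / 100.0;
}
```

With the defaults quoted in the comments (io = -1, tr = 80, mhfr = 40) this gives ((100 - 40) + 80 * 40 / 100) / 100 = (60 + 32) / 100 = 0.92, so a concurrent cycle starts once the old generation is about 92% full. Setting CMSInitiatingOccupancyFraction to 70 would give 0.70 instead.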

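Its call chain is as follows:

[images: call chains of the ConcurrentMarkSweepGeneration constructor, the CMSCollector constructor, and ref_processor_init]

The constructor is called first to create the ConcurrentMarkSweepGeneration instance. Then create_cms_collector is called to create the CMSCollector instance, whose constructor registers itself in the collector field of ConcurrentMarkSweepGeneration. ref_processor_init is called last; it initializes the ReferenceProcessor and stores it in the _ref_processor field of ConcurrentMarkSweepGeneration.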

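3. ConcGCThreads / ParallelGCThreads
The first parameter is the number of threads used for concurrent marking; only CMS and G1 use it. The second is the number of parallel threads used for GC work such as reference traversal and promotion; every GC algorithm uses it. The former is normally smaller than the latter. Both default to 0 initially; see their definitions in runtime\globals.hpp:

[images: definitions of ConcGCThreads and ParallelGCThreads in globals.hpp]

Searching for the callers of ConcGCThreads gives the following:

[image: call chain of ConcGCThreads]

Only one place modifies ConcGCThreads, the CMSCollector constructor:

[image: the code in the CMSCollector constructor that sets ConcGCThreads]

That is, if ConcGCThreads still has its default value of 0, it is reset to (ParallelGCThreads + 3)/4.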
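The derivation is a quarter of ParallelGCThreads, rounded up. The sketch below, with hypothetical function names, also folds in what the CMSCollector constructor shown earlier does afterwards: if the result is not greater than 1, CMSConcurrentMTEnabled is switched off and ConcGCThreads is set back to 0.

```cpp
#include <cassert>

// Default derivation: ceil(ParallelGCThreads / 4) using integer arithmetic.
unsigned default_conc_gc_threads(unsigned parallel_gc_threads) {
  return (parallel_gc_threads + 3) / 4;
}

// Value left in ConcGCThreads after the constructor's follow-up checks:
// a worker gang is only created when more than one thread is requested.
unsigned effective_conc_gc_threads(unsigned parallel_gc_threads) {
  unsigned n = default_conc_gc_threads(parallel_gc_threads);
  return n > 1 ? n : 0;
}
```

So with 8 parallel GC threads the concurrent phases get 2 threads, while with 4 or fewer the derived value is at most 1 and the concurrent phases fall back to the single CMS thread.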

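ParallelGCThreads has many callers; only some of the CMS-related ones are shown here:

[image: CMS-related call chain of ParallelGCThreads]

The set_flags method differs slightly from one GC algorithm to another. Taking set_parallel_gc_flags as an example, ParallelGCThreads is set as follows:

[image: the code in set_parallel_gc_flags that sets ParallelGCThreads]

Abstract_VM_Version::parallel_worker_threads is implemented as follows: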

unsigned int Abstract_VM_Version::parallel_worker_threads() {
  // If _parallel_worker_threads has not been initialized yet
  if (!_parallel_worker_threads_initialized) {
    // If ParallelGCThreads still has its default value
    if (FLAG_IS_DEFAULT(ParallelGCThreads)) {
      _parallel_worker_threads = VM_Version::calc_parallel_worker_threads();
    } else {
      _parallel_worker_threads = ParallelGCThreads;
    }
    _parallel_worker_threads_initialized = true;
  }
  return _parallel_worker_threads;
}
 
unsigned int Abstract_VM_Version::calc_parallel_worker_threads() {
  return nof_parallel_worker_threads(5, 8, 8);
}
 
unsigned int Abstract_VM_Version::nof_parallel_worker_threads(
                                                      unsigned int num,
                                                      unsigned int den,
                                                      unsigned int switch_pt) {
  if (FLAG_IS_DEFAULT(ParallelGCThreads)) {
    // If ParallelGCThreads still has its default value, it must be 0
    assert(ParallelGCThreads == 0, "Default ParallelGCThreads is not 0");
    // initial_active_processor_count returns the number of CPUs available on this machine
    // If ncpus <= 8, return ncpus; if larger, e.g. 72, return 8 + (72 - 8) * 5 / 8 == 48
    unsigned int ncpus = (unsigned int) os::initial_active_processor_count();
    return (ncpus <= switch_pt) ?
           ncpus :
          (switch_pt + ((ncpus - switch_pt) * num) / den);
  } else {
    return ParallelGCThreads;
  }
}

That is, with default settings ParallelGCThreads is derived automatically from the number of CPU cores on the machine, and ConcGCThreads is derived from ParallelGCThreads.
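The scaling rule is easy to check by hand. The sketch below repeats the formula from nof_parallel_worker_threads as a standalone function; the function name and the explicit ncpus parameter are illustrative, and the default arguments are the 5, 8, 8 passed by calc_parallel_worker_threads in the listing above.

```cpp
#include <cassert>

// Same formula as nof_parallel_worker_threads, with the CPU count passed in
// explicitly instead of being read from the OS.
unsigned parallel_worker_threads(unsigned ncpus, unsigned num = 5,
                                 unsigned den = 8, unsigned switch_pt = 8) {
  return (ncpus <= switch_pt) ? ncpus
                              : switch_pt + ((ncpus - switch_pt) * num) / den;
}
```

Up to 8 CPUs every CPU gets a GC thread; beyond that only 5/8 of the additional CPUs do. A 16-CPU machine therefore gets 8 + (8 * 5) / 8 = 13 threads and a 72-CPU machine gets 48. Because the division is integer division, 9 CPUs still yield 8 threads.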
