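Implementation of fielddata_cache in Elasticsearch
Background
After an incident in which fielddata_cache entries were evicted even though the cache had not reached its size threshold, I wanted to look at how fielddata_cache is implemented, to determine whether fielddata stays resident in memory or is only held through soft or weak references. This article is based on v1.0.0.
Implementation
Let's start from the startup class, Elasticsearch.java, and work downward: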
Elasticsearch.java {
    public static void main(String[] args) {
        Bootstrap.main(args);
    }
}
Elasticsearch starts through the Bootstrap class. Looking at Bootstrap next, and skipping some code, here is how it is instantiated and initialized:
Bootstrap.java {
    public static void main(String[] args) {
        bootstrap = new Bootstrap();
        Tuple<Settings, Environment> tuple = null; // our configuration settings
        try {
            tuple = initialSettings();
            setupLogging(tuple);
        } catch (Exception e) {
            ...
        }
        try {
            bootstrap.setup(true, tuple);
            ...
        } catch (Throwable e) {
            ...
        }
    }
}
Bootstrap's setup() creates the Elasticsearch node instance:
Bootstrap.java {
    private Node node;
    private void setup(boolean addShutdownHook, Tuple<Settings, Environment> tuple) throws Exception {
        NodeBuilder nodeBuilder = NodeBuilder.nodeBuilder().settings(tuple.v1()).loadConfigSettings(false);
        node = nodeBuilder.build();
        ...
    }
}
NodeBuilder creates an InternalNode instance. Looking at how InternalNode is initialized, the important part is that it adds an IndicesModule:
InternalNode.java {
    public InternalNode(Settings pSettings, boolean loadConfigSettings) throws ElasticsearchException {
        logger.info("initializing ...");
        ...
        ModulesBuilder modules = new ModulesBuilder();
        modules.add(new IndicesModule(settings));
        ...
        logger.info("initialized");
    }
}
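Next, IndicesModule: it binds the IndicesFieldDataCache class, which provides the fielddata_cache at the indices level (configured through the indices.fielddata.cache.* settings):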
IndicesModule.java {
    protected void configure() {
        ...
        bind(IndicesFieldDataCache.class).asEagerSingleton();
        ...
    }
}
Now the key part, the implementation of IndicesFieldDataCache. The code below shows that Elasticsearch builds this cache with Guava's CacheBuilder; see the Guava documentation for details on CacheBuilder:
IndicesFieldDataCache.java {
    Cache<Key, AtomicFieldData> cache;
    private volatile String size;
    private volatile long sizeInBytes;
    private volatile TimeValue expire;
    @Inject
    public IndicesFieldDataCache(Settings settings) {
        super(settings);
        this.size = componentSettings.get("size", "-1"); // raw value of indices.fielddata.cache.size
        this.sizeInBytes = componentSettings.getAsMemory("size", "-1").bytes(); // indices.fielddata.cache.size parsed into bytes
        this.expire = componentSettings.getAsTime("expire", null); // value of indices.fielddata.cache.expire
        buildCache();
    }
    private void buildCache() {
        CacheBuilder<Key, AtomicFieldData> cacheBuilder = CacheBuilder.newBuilder()
                .removalListener(this);
        if (sizeInBytes > 0) { // set the size threshold for LRU eviction
            cacheBuilder.maximumWeight(sizeInBytes).weigher(new FieldDataWeigher());
        }
        cacheBuilder.concurrencyLevel(16);
        if (expire != null && expire.millis() > 0) { // set the expiry time of cache entries
            cacheBuilder.expireAfterAccess(expire.millis(), TimeUnit.MILLISECONDS);
        }
        logger.debug("using size [{}] [{}], expire [{}]", size, new ByteSizeValue(sizeInBytes), expire);
        cache = cacheBuilder.build();
    }
    ...
}
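To make the maximumWeight / weigher idea above more concrete, here is a minimal sketch of a weight-bounded LRU cache written with the JDK only. It is not Guava's implementation and not Elasticsearch code: the class name WeightedLruSketch and its methods are hypothetical, and it ignores concurrency levels, expiry and removal listeners. It only illustrates that entries are held by ordinary (strong) references and leave the cache when the total weight exceeds the limit.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical stand-in for a weight-bounded cache; NOT Guava's implementation. */
final class WeightedLruSketch<K, V> {
    private final long maxWeight;            // plays the role of maximumWeight(sizeInBytes)
    private final ToLongFunction<V> weigher; // plays the role of FieldDataWeigher
    // accessOrder = true: iteration goes from least to most recently accessed entry
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long totalWeight;
    private int evictions;

    WeightedLruSketch(long maxWeight, ToLongFunction<V> weigher) {
        this.maxWeight = maxWeight;
        this.weigher = weigher;
    }

    synchronized V get(K key) {
        return entries.get(key); // also marks the entry as most recently used
    }

    synchronized void put(K key, V value) {
        V old = entries.put(key, value);
        if (old != null) {
            totalWeight -= weigher.applyAsLong(old);
        }
        totalWeight += weigher.applyAsLong(value);
        // Evict least recently used entries until the total weight fits again.
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (totalWeight > maxWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            totalWeight -= weigher.applyAsLong(eldest.getValue());
            it.remove();
            evictions++;
        }
    }

    synchronized boolean containsKey(K key) {
        return entries.containsKey(key);
    }

    synchronized int evictions() {
        return evictions;
    }
}
```

With a limit of 100 and three 40-byte values "a", "b", "c", reading "a" before inserting "c" leaves "a" and "c" in the cache and evicts "b", the least recently used entry.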
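Finally, let's see how the cache built by CacheBuilder is used (by default, CacheBuilder holds both keys and values through strong references). IndicesFieldDataCache exposes itself to the upper layers by returning an IndexFieldCache. When an index's fielddata needs to be loaded, it calls Cache.get(key, Callable), which follows the "get the cached value; if absent, compute it" principle: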
IndexFieldCache.java {
    @Nullable
    private final IndexService indexService;
    final Index index;
    final FieldMapper.Names fieldNames;
    final FieldDataType fieldDataType;
    IndexFieldCache(@Nullable IndexService indexService, Index index, FieldMapper.Names fieldNames, FieldDataType fieldDataType) {
        this.indexService = indexService;
        this.index = index;
        this.fieldNames = fieldNames;
        this.fieldDataType = fieldDataType;
    }
    @Override
    public <FD extends AtomicFieldData, IFD extends IndexFieldData<FD>> FD load(final AtomicReaderContext context, final IFD indexFieldData) throws Exception {
        final Key key = new Key(this, context.reader().getCoreCacheKey());
        return (FD) cache.get(key, new Callable<AtomicFieldData>() {
            @Override
            public AtomicFieldData call() throws Exception {
                SegmentReaderUtils.registerCoreListener(context.reader(), IndexFieldCache.this);
                AtomicFieldData fieldData = indexFieldData.loadDirect(context);
                ...
                return fieldData;
            }
        });
    }
}
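The "get the cached value, compute it if absent" pattern used by load() can be sketched with the JDK's ConcurrentHashMap.computeIfAbsent. This is only an analogy: Guava's Cache.get(key, Callable) has different exception handling, and the class LoadOnMissSketch, the String key and the long[] value are placeholders I made up for the segment key and the loaded fielddata.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of load-on-miss; NOT the Elasticsearch or Guava code. */
final class LoadOnMissSketch {
    // Values are held by ordinary strong references, like a default CacheBuilder cache.
    private final ConcurrentHashMap<String, long[]> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    /** Returns the cached value, running the expensive load only on a miss. */
    long[] load(String segmentKey) {
        return cache.computeIfAbsent(segmentKey, k -> {
            loads.incrementAndGet();
            return new long[] { k.length() }; // placeholder for indexFieldData.loadDirect(context)
        });
    }

    /** Explicit removal, comparable to invalidating a key in the real cache. */
    void invalidate(String segmentKey) {
        cache.remove(segmentKey);
    }

    int loadCount() {
        return loads.get();
    }
}
```

Calling load("seg_0") twice runs the loader once and returns the same array both times; after invalidate("seg_0"), the next load("seg_0") runs the loader again.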
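Summary
This was a brief walk through the fielddata_cache implementation in Elasticsearch 1.0.0. The analysis shows that the cache holds its entries through strong references by default, so the GC will not reclaim them; entries leave only through the cache's own eviction (size-based LRU, plus expiry when indices.fielddata.cache.expire is set). Why entries were evicted below the threshold in my case still needs a closer look at how the metrics are counted.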
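The difference between a strongly referenced cache value and a weakly referenced one can be illustrated with a small JDK-only sketch. The class ReferenceStrengthSketch and its methods are hypothetical. Note that System.gc() is only a hint, so the sketch makes no promise about when the weakly held value is cleared; the only guaranteed behaviour is that the strongly held value stays reachable.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of reference strength; not Elasticsearch code. */
final class ReferenceStrengthSketch {

    /** A value stored in a plain map is strongly reachable and survives GC. */
    static boolean strongValueSurvivesGc() {
        Map<String, byte[]> strong = new HashMap<>();
        strong.put("field", new byte[1024]);
        System.gc(); // only a hint to the JVM
        return strong.get("field") != null; // always true: the map keeps the array alive
    }

    /** A value reachable only through a WeakReference may be cleared by any GC cycle. */
    static WeakReference<byte[]> weaklyHeldValue() {
        return new WeakReference<>(new byte[1024]);
    }

    public static void main(String[] args) {
        System.out.println("strong value still present: " + strongValueSurvivesGc());
        WeakReference<byte[]> weak = weaklyHeldValue();
        System.gc();
        // May print true or false depending on whether the JVM collected the array.
        System.out.println("weak value cleared: " + (weak.get() == null));
    }
}
```

A cache built without weakKeys(), weakValues() or softValues() behaves like the first case, which is why its entries can only disappear through the cache's own eviction or invalidation.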
(This is my own analysis; corrections are welcome.)