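Measuring Cache Usage in Microservices
Caching is probably one of the performance optimizations programmers know best. Two earlier posts, "A Ramble on Microservice Caching: Guava Cache" and "Building and Monitoring a Redis Cluster", introduced the most commonly used local in-memory cache, Guava Cache, and the remote Redis cache, respectively. Here we focus on cache metrics.
Common cache concerns
For a cache, we care about the following:
- Cache hit ratio
- Cache key count: the number of keys held in the cache
- Cache resource usage
- Cache loading performance
- Cache capacity
- Cache lifetime
A cache cannot grow without bound or stay valid forever, so its eviction and expiration policies deserve careful thought.
Data placed in a cache should also have a high read-to-write ratio: read often, written rarely, and not updated frequently.
A cache is no silver bullet. Used carelessly, it leads to cache penetration, cache breakdown, and cache avalanche. Here is a brief explanation of each: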
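- Penetration
A record does not exist at all, so it is never found in the cache; every request falls through to the database, which finds nothing either.
Common countermeasures are a Bloom filter (it may report false positives but never false negatives, so a "not present" answer can safely skip the database) or negative caching (store an entry in the cache that marks the record as nonexistent).
- Breakdown
A record expires and is removed just as a large number of queries request it, so a burst of requests bypasses the cache and hits the database at once.
A common countermeasure is to put a lock around the load from the database, so that only one request loads the record while the others wait for it instead of bypassing the cache.
Alternatively, set no expiration time at all and let a background job refresh the cache periodically, so external requests can always read the data from the cache.
- Avalanche
Many records cached on many servers expire at the same moment, so a burst of requests bypasses the cache and hits the database; this is more severe than breakdown.
A common countermeasure is to give the records on different servers different expiration times, for example by adding a small random offset, which spreads the concurrent requests from a single instant over a period of time.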
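The three countermeasures above can be sketched in plain Java. This is only a minimal illustration under stated assumptions: `CacheGuards`, its `database` function and the `jitteredTtl` helper are hypothetical names invented here, the "database" is just a function that returns null for a missing record, and entries never expire.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper illustrating the three countermeasures; not part of any library. */
final class CacheGuards {
    // Optional.empty() is stored for keys the database does not have (negative caching).
    private final ConcurrentHashMap<String, Optional<String>> cache = new ConcurrentHashMap<>();
    // Assumed stand-in for the database: returns null when the record is absent.
    private final Function<String, String> database;
    private final AtomicInteger databaseCalls = new AtomicInteger();

    CacheGuards(Function<String, String> database) {
        this.database = database;
    }

    /**
     * Penetration and breakdown: computeIfAbsent applies the loader at most once
     * per key even with concurrent callers, and a missing record is cached as
     * Optional.empty() so later lookups do not reach the database again.
     */
    Optional<String> get(String key) {
        return cache.computeIfAbsent(key, k -> {
            databaseCalls.incrementAndGet();
            return Optional.ofNullable(database.apply(k));
        });
    }

    /** Number of times the database function was actually invoked. */
    int databaseCalls() {
        return databaseCalls.get();
    }

    /** Avalanche: the base TTL plus a random offset in [0, maxJitter). */
    static Duration jitteredTtl(Duration base, Duration maxJitter) {
        long bound = maxJitter.toMillis();
        long jitterMillis = bound > 0 ? ThreadLocalRandom.current().nextLong(bound) : 0L;
        return base.plusMillis(jitterMillis);
    }
}
```

With a base of 60 minutes and a jitter of up to 5 minutes, entries written at the same instant expire somewhere between minute 60 and minute 65 instead of all at once. A real negative cache would also give the "not found" marker a short lifetime, so a record created later becomes visible.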
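Cache metrics
Hit ratio, loading performance, and the like are what we care about most, for example:
- Performance: the latency of cache loads
- Throughput: requests per second, CPS (Calls Per Second)
- Hit ratio: success_ratio = hitCount / (hitCount + missCount)
- Resource usage: how much memory is used
- Saturation: the number of records evicted from the cache because of the capacity limit, and the number of records that could not be added because the cache was full
Note: saturation is the degree to which the load on a resource exceeds what it can handle.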
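To make the formulas concrete, here is a small sketch that derives these numbers from raw counters collected over one observation window. The `CacheWindow` class and its field names are hypothetical, chosen only for this illustration. Reporting a hit ratio of 1.0 when there were no requests follows the convention Guava uses for `hitRate()`.

```java
/** Hypothetical snapshot of cache counters collected over one observation window. */
final class CacheWindow {
    private final long hitCount;
    private final long missCount;
    private final long evictionCount;
    private final long loadCount;
    private final long totalLoadTimeNanos;
    private final double windowSeconds;

    CacheWindow(long hitCount, long missCount, long evictionCount,
                long loadCount, long totalLoadTimeNanos, double windowSeconds) {
        this.hitCount = hitCount;
        this.missCount = missCount;
        this.evictionCount = evictionCount;
        this.loadCount = loadCount;
        this.totalLoadTimeNanos = totalLoadTimeNanos;
        this.windowSeconds = windowSeconds;
    }

    /** Hit ratio = hitCount / (hitCount + missCount); 1.0 when nothing was requested. */
    double hitRatio() {
        long requests = hitCount + missCount;
        return requests == 0 ? 1.0 : (double) hitCount / requests;
    }

    /** Throughput: lookups per second over the window. */
    double callsPerSecond() {
        return (hitCount + missCount) / windowSeconds;
    }

    /** Performance: average latency of one cache load, in milliseconds. */
    double averageLoadMillis() {
        return loadCount == 0 ? 0.0 : totalLoadTimeNanos / 1_000_000.0 / loadCount;
    }

    /** Saturation signal: entries pushed out by the capacity limit, per second. */
    double evictionsPerSecond() {
        return evictionCount / windowSeconds;
    }
}
```

For instance, 90 hits and 10 misses in a 10-second window give a hit ratio of 0.9 and a throughput of 10 calls per second.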
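Taking Guava Cache as an example, its cache statistics are incremented according to the following rules:
- When a cache lookup encounters an existing cache entry, hitCount is incremented.
- When a cache lookup first encounters a missing cache entry, a new entry is loaded.
- After an entry is loaded successfully, missCount and loadSuccessCount are incremented, and the total loading time, in nanoseconds, is added to totalLoadTime.
- When an exception is thrown while loading an entry, missCount and loadExceptionCount are incremented, and the total loading time, in nanoseconds, is added to totalLoadTime.
- A cache lookup that encounters a missing cache entry that is still loading waits for the load to complete (whether successful or not) and then increments missCount.
- When an entry is evicted from the cache, evictionCount is incremented.
- No statistics are modified when a cache entry is invalidated or manually removed.
- Operations invoked on the cache's asMap view do not modify any statistics.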
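The counting rules are easier to follow with a toy model. The sketch below is not Guava's implementation; `CountingCache` is a hypothetical, single-threaded class with no eviction, written only to mirror the hit, miss and load rules listed above. It assumes the loader never returns null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical toy cache that mirrors the counting rules; single-threaded, no eviction. */
final class CountingCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> loader; // assumed never to return null

    long hitCount;
    long missCount;
    long loadSuccessCount;
    long loadExceptionCount;
    long totalLoadTimeNanos;

    CountingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = entries.get(key);
        if (cached != null) {
            hitCount++;              // the lookup found an existing entry
            return cached;
        }
        missCount++;                 // a miss is counted whether the load succeeds or fails
        long start = System.nanoTime();
        try {
            V loaded = loader.apply(key);
            loadSuccessCount++;
            entries.put(key, loaded);
            return loaded;
        } catch (RuntimeException e) {
            loadExceptionCount++;    // failed loads are counted separately
            throw e;
        } finally {
            // load time is accumulated for successful and failed loads alike
            totalLoadTimeNanos += System.nanoTime() - start;
        }
    }

    /** Manual removal leaves every counter untouched. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

Loading a key, reading it again, failing to load a second key and then invalidating the first leaves the counters at one hit, two misses, one successful load and one load exception.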
In code, we can call recordStats() on the CacheBuilder to record these metrics:
@Bean
public LoadingCache<String, CityWeather> cityWeatherCache() {
    LoadingCache<String, CityWeather> cache = CacheBuilder.newBuilder()
            .recordStats()
            .maximumSize(1000)
            .expireAfterWrite(60, TimeUnit.MINUTES)
            .build(weatherCacheLoader());
    recordCacheMetrics("cityWeatherCache", cache);
    return cache;
}

public void recordCacheMetrics(String cacheName, Cache cache) {
    MetricRegistry metricRegistry = metricRegistry();
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "hitCount"), () -> () -> cache.stats().hitCount());
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "hitRate"), () -> () -> cache.stats().hitRate());
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "missCount"), () -> () -> cache.stats().missCount());
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "missRate"), () -> () -> cache.stats().missRate());
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "requestCount"), () -> () -> cache.stats().requestCount());
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "loadCount"), () -> () -> cache.stats().loadCount());
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "loadSuccessCount"), () -> () -> cache.stats().loadSuccessCount());
    metricRegistry.gauge(generateMetricsKeyForCache(cacheName, "loadExceptionCount"), () -> () -> cache.stats().loadExceptionCount());
}

public String generateMetricsKeyForCache(String cacheName, String keyName) {
    String metricKey = MetricRegistry.name("cache", cacheName, keyName);
    log.info("metric key generated for cache: {}", metricKey);
    return metricKey;
}
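For Spring Boot applications, we can use the Micrometer library to expose metrics.
Monitoring software comes in many flavors. Just as SLF4J applies the facade pattern to hide the differences among libraries such as log4j and logback, Micrometer hides the details of the various monitoring and metrics stacks, so application developers can concentrate on the metrics of the application itself.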
Micrometer provides a simple facade over the instrumentation clients for the most popular monitoring systems, allowing you to instrument your JVM-based application code without vendor lock-in. Think SLF4J, but for application metrics! Application metrics recorded by Micrometer are intended to be used to observe, alert, and react to the current/recent operational state of your environment.
It is simple to use. Add the following dependencies to pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-core</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
The WeatherCacheConfig configuration is as follows:
package com.github.walterfan.hellocache;

import com.codahale.metrics.MetricRegistry;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.LoadingCache;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.composite.CompositeMeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.EnableAspectJAutoProxy;
import org.springframework.context.annotation.Lazy;
import org.springframework.core.env.Environment;
import org.springframework.http.MediaType;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.http.converter.json.MappingJackson2HttpMessageConverter;
import org.springframework.web.client.RestTemplate;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;

/**
 * Created by yafan on 14/10/2017.
 */
@EnableAspectJAutoProxy
@ComponentScan
@Configuration
@Slf4j
public class WeatherCacheConfig {//implements EnvironmentAware

    @Autowired
    private Environment environment;

    @Bean
    public WeatherCacheLoader weatherCacheLoader() {
        return new WeatherCacheLoader();
    }

    @Bean
    public RestTemplate restTemplate() {
        final RestTemplate restTemplate = new RestTemplate();
        List<HttpMessageConverter<?>> messageConverters = new ArrayList<>();
        MappingJackson2HttpMessageConverter converter = new MappingJackson2HttpMessageConverter();
        converter.setSupportedMediaTypes(Collections.singletonList(MediaType.ALL));
        messageConverters.add(converter);
        restTemplate.setMessageConverters(messageConverters);
        return restTemplate;
    }

    @Bean
    public String appToken() {
        return this.environment.getProperty("BAIDU_AK");
    }

    @Bean
    public LoadingCache<String, CityWeather> cityWeatherCache() {
        LoadingCache<String, CityWeather> cache = CacheBuilder.newBuilder()
                .recordStats()
                .maximumSize(1000)
                .expireAfterWrite(60, TimeUnit.MINUTES)
                .build(weatherCacheLoader());
        recordCacheMetrics("cityWeatherCache", cache);
        recordCacheMeters("cityWeatherCache", cache);
        return cache;
    }

    public void recordCacheMetrics(String cacheName, Cache cache) {
        MetricRegistry metricRegistry = metricRegistry();
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "hitCount"), () -> () -> cache.stats().hitCount());
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "hitRate"), () -> () -> cache.stats().hitRate());
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "missCount"), () -> () -> cache.stats().missCount());
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "missRate"), () -> () -> cache.stats().missRate());
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "requestCount"), () -> () -> cache.stats().requestCount());
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "loadCount"), () -> () -> cache.stats().loadCount());
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "loadSuccessCount"), () -> () -> cache.stats().loadSuccessCount());
        metricRegistry.gauge(makeMetricsKeyName(cacheName, "loadExceptionCount"), () -> () -> cache.stats().loadExceptionCount());
    }

    public void recordCacheMeters(String cacheName, Cache cache) {
        MeterRegistry meterRegistry = meterRegistry();
        Gauge.builder(makeMetricsKeyName(cacheName, "hitCount"), () -> cache.stats().hitCount()).register(meterRegistry);
        Gauge.builder(makeMetricsKeyName(cacheName, "hitRate"), () -> cache.stats().hitRate()).register(meterRegistry);
        Gauge.builder(makeMetricsKeyName(cacheName, "missCount"), () -> cache.stats().missCount()).register(meterRegistry);
        Gauge.builder(makeMetricsKeyName(cacheName, "missRate"), () -> cache.stats().missRate()).register(meterRegistry);
        Gauge.builder(makeMetricsKeyName(cacheName, "requestCount"), () -> cache.stats().requestCount()).register(meterRegistry);
        Gauge.builder(makeMetricsKeyName(cacheName, "loadCount"), () -> cache.stats().loadCount()).register(meterRegistry);
        Gauge.builder(makeMetricsKeyName(cacheName, "loadSuccessCount"), () -> cache.stats().loadSuccessCount()).register(meterRegistry);
        Gauge.builder(makeMetricsKeyName(cacheName, "loadExceptionCount"), () -> cache.stats().loadExceptionCount()).register(meterRegistry);
    }

    public String makeMetricsKeyName(String cacheName, String keyName) {
        String metricKey = MetricRegistry.name("cache", cacheName, keyName);
        log.info("metric key generated for cache: {}", metricKey);
        return metricKey;
    }

    @Bean
    public DurationTimerAspect durationTimerAspect() {
        return new DurationTimerAspect();
    }

    @Bean
    @Lazy
    public MetricRegistry metricRegistry() {
        return new MetricRegistry();
    }

    @Bean
    public MeterRegistry meterRegistry() {
        CompositeMeterRegistry compositeRegistry = new CompositeMeterRegistry();
        SimpleMeterRegistry simpleMeter = new SimpleMeterRegistry();
        PrometheusMeterRegistry prometheusMeterRegistry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
        compositeRegistry.add(simpleMeter);
        compositeRegistry.add(prometheusMeterRegistry);
        return compositeRegistry;
    }
}
Now open http://localhost:8080/actuator/metrics
and the following metrics are listed:
{
    "names": [
        "cache.cityWeatherCache.hitCount",
        "cache.cityWeatherCache.hitRate",
        "cache.cityWeatherCache.loadCount",
        "cache.cityWeatherCache.loadExceptionCount",
        "cache.cityWeatherCache.loadSuccessCount",
        "cache.cityWeatherCache.missCount",
        "cache.cityWeatherCache.missRate",
        "cache.cityWeatherCache.requestCount",
        "http.server.requests",
        "jvm.buffer.count",
        "jvm.buffer.memory.used",
        "jvm.buffer.total.capacity",
        "jvm.classes.loaded",
        "jvm.classes.unloaded",
        "jvm.gc.live.data.size",
        "jvm.gc.max.data.size",
        "jvm.gc.memory.allocated",
        "jvm.gc.memory.promoted",
        "jvm.gc.pause",
        "jvm.memory.committed",
        "jvm.memory.max",
        "jvm.memory.used",
        "jvm.threads.daemon",
        "jvm.threads.live",
        "jvm.threads.peak",
        "jvm.threads.states",
        "logback.events",
        "process.cpu.usage",
        "process.files.max",
        "process.files.open",
        "process.start.time",
        "process.uptime",
        "system.cpu.count",
        "system.cpu.usage",
        "system.load.average.1m",
        "tomcat.sessions.active.current",
        "tomcat.sessions.active.max",
        "tomcat.sessions.alive.max",
        "tomcat.sessions.created",
        "tomcat.sessions.expired",
        "tomcat.sessions.rejected"
    ]
}
For the details of a single metric, open http://localhost:8080/actuator/metrics/cache.cityWeatherCache.hitCount
{
    "name": "cache.cityWeatherCache.hitCount",
    "description": null,
    "baseUnit": null,
    "measurements": [
        {
            "statistic": "VALUE",
            "value": 3
        }
    ],
    "availableTags": [ ]
}