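Memory leak risk in Jackson 2.x: the wrapped intern logic
1. The String#intern method
Java has 8 primitive types plus one rather special type, String. To make these types faster and more memory-efficient at run time, each of them is given the notion of a constant pool. A constant pool is similar to a cache provided at the Java system level.
The constant pools of the 8 primitive types are managed by the system; the constant pool of the String type is special. There are two main ways to use it:
A String object declared directly with double quotes is stored directly in the constant pool.
A String object that is not declared with double quotes can be added with the intern method that String provides. intern looks up the current string in the string constant pool and, if it is not there, puts it into the pool.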
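The two cases above can be illustrated with a minimal sketch. The class name InternDemo and the helper buildAtRuntime are hypothetical and exist only for this illustration; the value "user-42" is likewise arbitrary.

```java
final class InternDemo {
    /** Builds "user-42" at run time, so the result is not the pooled literal object. */
    static String buildAtRuntime() {
        return new String(new char[] {'u', 's', 'e', 'r', '-', '4', '2'});
    }

    public static void main(String[] args) {
        String literal = "user-42";       // double-quoted: lives in the string pool
        String built = buildAtRuntime();  // separate heap object with the same content

        System.out.println(built.equals(literal));      // prints true: same characters
        System.out.println(built == literal);           // prints false: different objects
        System.out.println(built.intern() == literal);  // prints true: intern() returns the pooled object
    }
}
```

The last comparison is the point: after intern(), every equal string resolves to one shared object, which is where the memory saving comes from.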
Used correctly, intern can greatly reduce memory usage; used improperly, it can cause a sharp drop in performance. A real-world case follows.
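2. Memory leak caused by Jackson's use of intern
2.1 Problem analysis
Project versions: Jackson 2.x, JDK 1.8
Cause: during Jackson deserialization, String#intern is called, which triggers a JDK bug (https://bugs.openjdk.java.net/browse/JDK-8180048). Because of this bug, interned strings are not reclaimed, which leaks memory.
Trigger scenario: the object being deserialized is a Map<Long,String> whose keys are userIds, so the key set is unbounded. The string form of every userId enters the constant pool and, because of the G1 bug, is not reclaimed during GC, so memory keeps leaking.
Solution 1: Jackson's string-interning feature can be turned off through configuration: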
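// Requires imports of com.fasterxml.jackson.core.JsonFactory and com.fasterxml.jackson.databind.ObjectMapper
private static final ObjectMapper MAPPER = new ObjectMapper(new JsonFactory().disable(JsonFactory.Feature.INTERN_FIELD_NAMES));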
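The JsonFactory.Feature.INTERN_FIELD_NAMES feature is discussed on GitHub at https://github.com/FasterXML/jackson-core/issues/332. It is enabled by default in 2.x and disabled by default in 3.x.
Solution 2: Upgrade the JDK to 8u192.
2.2 Source code analysis
The root of the problem: during deserialization the keys are normally a fixed set, so interning them saves a great deal of space. The keys of this Map<Long,String>, however, are userIds, so a large amount of data is put into the constant pool, which leads to the memory leak.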
com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer#_addSymbol
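To make the effect of the interning switch concrete, here is a simplified sketch of the behaviour described above. It is not Jackson's code: the class NameTableSketch, its internNames flag (standing in for INTERN_FIELD_NAMES), and the sample key "10086" are all assumptions made for illustration.

```java
final class NameTableSketch {
    private final boolean internNames;  // hypothetical stand-in for INTERN_FIELD_NAMES

    NameTableSketch(boolean internNames) {
        this.internNames = internNames;
    }

    /** Turns a slice of the parse buffer into a field-name String. */
    String addSymbol(char[] buffer, int start, int len) {
        String name = new String(buffer, start, len);  // always allocates a new String
        if (internNames) {
            name = name.intern();  // additionally registers it in the JVM string table
        }
        return name;
    }

    public static void main(String[] args) {
        char[] key = "10086".toCharArray();  // a userId used as a map key
        NameTableSketch on = new NameTableSketch(true);
        NameTableSketch off = new NameTableSketch(false);

        // With interning, equal keys collapse to one shared object.
        System.out.println(on.addSymbol(key, 0, key.length) == on.addSymbol(key, 0, key.length));   // prints true
        // Without interning, each call yields an ordinary, collectable heap object.
        System.out.println(off.addSymbol(key, 0, key.length) == off.addSymbol(key, 0, key.length)); // prints false
    }
}
```

With a fixed set of field names the interning branch pays off, because the string table stays small. With unbounded keys such as userIds, every new key adds one more string-table entry, which is the growth pattern described in this section.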
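[Figure: source code analysis (源码分析.png)]
2.3 Learning from the source: how to reduce the performance cost of intern
Calling String#intern() here involves two operations:
- the new String() operation;
- the intern() operation (this one is extra and carries a performance cost).
Jackson provides a cache class that is worth studying: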
package com.fasterxml.jackson.core.util;

import java.util.LinkedHashMap;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Singleton class that adds a simple first-level cache in front of
 * regular String.intern() functionality. This is done as a minor
 * performance optimization, to avoid calling native intern() method
 * in cases where same String is being interned multiple times.
 *<p>
 * Note: that this class extends {@link LinkedHashMap} is an implementation
 * detail -- no code should ever directly call Map methods.
 */
public final class InternCache
    extends ConcurrentHashMap<String,String> // since 2.3
{
    private static final long serialVersionUID = 1L;

    /**
     * Size to use is somewhat arbitrary, so let's choose something that's
     * neither too small (low hit ratio) nor too large (waste of memory).
     *<p>
     * One consideration is possible attack via colliding {@link String#hashCode};
     * because of this, limit to reasonably low setting.
     */
    private final static int MAX_ENTRIES = 180;

    public final static InternCache instance = new InternCache();

    /**
     * As minor optimization let's try to avoid "flush storms",
     * cases where multiple threads might try to concurrently
     * flush the map.
     */
    private final Object lock = new Object();

    private InternCache() { super(MAX_ENTRIES, 0.8f, 4); }

    public String intern(String input) {
        String result = get(input);
        if (result != null) { return result; }

        /* 18-Sep-2013, tatu: We used to use LinkedHashMap, which has simple LRU
         *   method. No such functionality exists with CHM; and let's use simplest
         *   possible limitation: just clear all contents. This because otherwise
         *   we are simply likely to keep on clearing same, commonly used entries.
         */
        if (size() >= MAX_ENTRIES) {
            /* Not incorrect wrt well-known double-locking anti-pattern because underlying
             * storage gives close enough answer to real one here; and we are
             * more concerned with flooding than starvation.
             */
            synchronized (lock) {
                if (size() >= MAX_ENTRIES) {
                    clear();
                }
            }
        }
        result = input.intern();
        put(result, result);
        return result;
    }
}
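The same idea can be tried outside Jackson with a small standalone sketch. BoundedInternCache below is a hypothetical class written for this illustration. It uses composition rather than extending ConcurrentHashMap, and it leaves out the lock that guards against concurrent flushes, so it shows the caching pattern only and is not a drop-in replacement for InternCache.

```java
import java.util.concurrent.ConcurrentHashMap;

final class BoundedInternCache {
    private final int maxEntries;
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    BoundedInternCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    String intern(String input) {
        String cached = cache.get(input);
        if (cached != null) {
            return cached;  // cache hit: String.intern() is not called at all
        }
        if (cache.size() >= maxEntries) {
            cache.clear();  // crude bound: drop everything once the cache is full
        }
        String result = input.intern();
        cache.put(result, result);
        return result;
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        BoundedInternCache c = new BoundedInternCache(2);
        String a = c.intern(new String("name"));
        String b = c.intern(new String("name"));  // served from the cache
        System.out.println(a == b);               // prints true

        c.intern("id");    // cache now holds 2 entries
        c.intern("type");  // cache is full, so it is cleared before this entry is added
        System.out.println(c.size());             // prints 1
    }
}
```

Note that clearing the map only bounds this first-level cache. Strings that have already gone through String.intern() remain in the JVM string table until the garbage collector reclaims them, so a cache of this kind reduces the cost of repeated intern() calls but does not by itself address the leak from section 2.1.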