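Gson Conversion Pitfall: Int Values Turning into Double
Problem Description
Our event-tracking system receives user-behavior tracking data sent by clients, H5 pages, and other systems. After unified ingestion and parsing, the data is published to Kafka for downstream business teams to consume. While testing a change, we found that values that were originally integers came out as doubles after conversion. To keep the description simple and easy to follow, the problem is reduced to the example below. The source code quoted in this article is from gson-2.8.0.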
{
"rate": 1.0,
"extend": {
"number": 30,
"amount": 120.3
}
}
After processing, it becomes:
{
"rate":1.0,
"extend":{
"number":30.0,
"amount":120.3
}
}
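As shown, the value of number in the extend field changed from the integer 30 to the double 30.0. Some downstream consumers happen to convert numeric values to strings for their logic, for example as the key when joining two tables. Because the type changed, the join finds no match: the strings "30" and "30.0" are not equal.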
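The mismatch is easy to reproduce without any JSON library. The sketch below uses a hypothetical joinKey helper as a stand-in for whatever string conversion a downstream job performs; it is only an illustration, not the consumers' actual code.

```java
final class JoinKeyDemo {
    private JoinKeyDemo() {}

    /** Hypothetical stand-in for a downstream job that turns a numeric field into a join key. */
    static String joinKey(Number value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        String before = joinKey(30L);  // the value as the client reported it
        String after = joinKey(30.0);  // the same value after being read as a double
        // The two keys differ textually, so a string-based join cannot match them.
        System.out.println(before + " vs " + after + " -> equal? " + before.equals(after));
    }
}
```

Running main prints `30 vs 30.0 -> equal? false`.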
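Code Analysis
At this point some readers may ask why we don't simply declare the number field as an integer to avoid the problem. The reason is that extend is an extension field: we don't know in advance which fields it contains, since that depends entirely on the business teams reporting the data.
Definition of the Data class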
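public class Data {
    private Double rate;
    // Extension field: its contents are not known in advance, hence Object.
    private Object extend;

    public Double getRate() {
        return rate;
    }

    public void setRate(Double rate) {
        this.rate = rate;
    }

    // Accessors for extend; DataTypeAdaptor later in this article calls them.
    public Object getExtend() {
        return extend;
    }

    public void setExtend(Object extend) {
        this.extend = extend;
    }

    @Override
    public String toString() {
        return "Data{" +
                "rate=" + rate +
                ", extend=" + extend +
                '}';
    }
}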
Now the test code:
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

public class GsonTest {
    public static void main(String[] args) {
        String dataJson = "{\"rate\" : 1.0, \"extend\" : {\"number\" : 30, \"amount\" : 120.3}}";
        Gson gson = buildGson();
        Data data = gson.fromJson(dataJson, Data.class);
        System.out.println(data.toString());
    }

    // Default configuration: no custom adapters registered.
    private static Gson buildGson() {
        GsonBuilder gsonBuilder = new GsonBuilder();
        return gsonBuilder.create();
    }
}
Output:
Data{rate=1.0, extend={number=30.0, amount=120.3}}
Root Cause Analysis
Next, a brief look at the deserialization process. Gson uses the target type to locate a concrete TypeAdapter<T>, whose main methods are:
public abstract class TypeAdapter<T> {
/**
* Writes one JSON value (an array, object, string, number, boolean or null)
* for {@code value}.
*
* @param value the Java object to write. May be null.
*/
public abstract void write(JsonWriter out, T value) throws IOException;
/**
* Reads one JSON value (an array, object, string, number, boolean or null)
* and converts it to a Java object. Returns the converted object.
*
* @return the converted Java object. May be null.
*/
public abstract T read(JsonReader in) throws IOException;
}
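The read method pulls data from the JsonReader and assembles the final object. Because the extend field of Data is declared as Object, Gson resolves it to the built-in ObjectTypeAdapter. Let's walk through that class: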
/**
* Adapts types whose static type is only 'Object'. Uses getClass() on
* serialization and a primitive/Map/List on deserialization.
*/
public final class ObjectTypeAdapter extends TypeAdapter<Object> {
public static final TypeAdapterFactory FACTORY = new TypeAdapterFactory() {
@SuppressWarnings("unchecked")
@Override public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type) {
if (type.getRawType() == Object.class) {
return (TypeAdapter<T>) new ObjectTypeAdapter(gson);
}
return null;
}
};
private final Gson gson;
ObjectTypeAdapter(Gson gson) {
this.gson = gson;
}
@Override public Object read(JsonReader in) throws IOException {
JsonToken token = in.peek();
switch (token) {
case BEGIN_ARRAY:
List<Object> list = new ArrayList<Object>();
in.beginArray();
while (in.hasNext()) {
list.add(read(in));
}
in.endArray();
return list;
case BEGIN_OBJECT:
Map<String, Object> map = new LinkedTreeMap<String, Object>();
in.beginObject();
while (in.hasNext()) {
map.put(in.nextName(), read(in));
}
in.endObject();
return map;
case STRING:
return in.nextString();
// every numeric value is read as a Double here
case NUMBER:
return in.nextDouble();
case BOOLEAN:
return in.nextBoolean();
case NULL:
in.nextNull();
return null;
default:
throw new IllegalStateException();
}
}
@SuppressWarnings("unchecked")
@Override public void write(JsonWriter out, Object value) throws IOException {
if (value == null) {
out.nullValue();
return;
}
TypeAdapter<Object> typeAdapter = (TypeAdapter<Object>) gson.getAdapter(value.getClass());
if (typeAdapter instanceof ObjectTypeAdapter) {
out.beginObject();
out.endObject();
return;
}
typeAdapter.write(out, value);
}
}
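From this logic we can see that when the JSON value is an object, it is ultimately parsed into a Map<String, Object>, where the runtime type of each value depends on the concrete JSON value; for example, a double-quoted value is read as STRING. Note that every numeric value (NUMBER) is converted with nextDouble(), and that is the source of our problem: integer data is read as a Double, so 30 becomes 30.0. At this point you may be thinking that NUMBER should be split so that integers and floating-point values are handled separately. Let's check whether JsonToken has a finer-grained type: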
public enum JsonToken {
/**
* The opening of a JSON array. Written using {@link JsonWriter#beginArray}
* and read using {@link JsonReader#beginArray}.
*/
BEGIN_ARRAY,
/**
* The closing of a JSON array. Written using {@link JsonWriter#endArray}
* and read using {@link JsonReader#endArray}.
*/
END_ARRAY,
/**
* The opening of a JSON object. Written using {@link JsonWriter#beginObject}
* and read using {@link JsonReader#beginObject}.
*/
BEGIN_OBJECT,
/**
* The closing of a JSON object. Written using {@link JsonWriter#endObject}
* and read using {@link JsonReader#endObject}.
*/
END_OBJECT,
/**
* A JSON property name. Within objects, tokens alternate between names and
* their values. Written using {@link JsonWriter#name} and read using {@link
* JsonReader#nextName}
*/
NAME,
/**
* A JSON string.
*/
STRING,
/**
* A JSON number represented in this API by a Java {@code double}, {@code
* long}, or {@code int}.
*/
NUMBER,
/**
* A JSON {@code true} or {@code false}.
*/
BOOLEAN,
/**
* A JSON {@code null}.
*/
NULL,
/**
* The end of the JSON stream. This sentinel value is returned by {@link
* JsonReader#peek()} to signal that the JSON-encoded value has no more
* tokens.
*/
END_DOCUMENT
}
It has no finer-grained type. What now? No matter, let's look at the JsonReader.peek method:
/**
* Returns the type of the next token without consuming it.
*/
public JsonToken peek() throws IOException {
int p = peeked;
if (p == PEEKED_NONE) {
p = doPeek();
}
switch (p) {
case PEEKED_BEGIN_OBJECT:
return JsonToken.BEGIN_OBJECT;
case PEEKED_END_OBJECT:
return JsonToken.END_OBJECT;
case PEEKED_BEGIN_ARRAY:
return JsonToken.BEGIN_ARRAY;
case PEEKED_END_ARRAY:
return JsonToken.END_ARRAY;
case PEEKED_SINGLE_QUOTED_NAME:
case PEEKED_DOUBLE_QUOTED_NAME:
case PEEKED_UNQUOTED_NAME:
return JsonToken.NAME;
case PEEKED_TRUE:
case PEEKED_FALSE:
return JsonToken.BOOLEAN;
case PEEKED_NULL:
return JsonToken.NULL;
case PEEKED_SINGLE_QUOTED:
case PEEKED_DOUBLE_QUOTED:
case PEEKED_UNQUOTED:
case PEEKED_BUFFERED:
return JsonToken.STRING;
case PEEKED_LONG:
case PEEKED_NUMBER:
return JsonToken.NUMBER;
case PEEKED_EOF:
return JsonToken.END_DOCUMENT;
default:
throw new AssertionError();
}
}
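As we can see, JsonReader does distinguish integers (PEEKED_LONG) from other numbers (PEEKED_NUMBER) while reading, but once the token is exposed to callers the distinction is gone. One fix would be to modify the source directly: define an additional integer token such as LONG in JsonToken, distinguish the types while reading, and change ObjectTypeAdapter's read method roughly as follows: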
@Override public Object read(JsonReader in) throws IOException {
    JsonToken token = in.peek();
    switch (token) {
        // ..........................
        case LONG: // hypothetical token; it does not exist in JsonToken today
            return in.nextLong();
        case NUMBER:
            return in.nextDouble();
        // ..........................
    }
}
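But modifying the library source is far too invasive. Recall the earlier point: Gson picks a concrete TypeAdapter based on the type being parsed, and we would rather not change the implementation of classes such as JsonToken. So we first define an adapter for Data, named DataTypeAdaptor, implemented as follows: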
import com.google.gson.Gson;
import com.google.gson.TypeAdapter;
import com.google.gson.TypeAdapterFactory;
import com.google.gson.internal.LinkedTreeMap;
import com.google.gson.reflect.TypeToken;
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;
import com.google.gson.stream.JsonWriter;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DataTypeAdaptor extends TypeAdapter<Data> {
    public static final TypeAdapterFactory FACTORY = new TypeAdapterFactory() {
        @SuppressWarnings("unchecked")
        @Override
        public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type) {
            if (type.getRawType() == Data.class) {
                return (TypeAdapter<T>) new DataTypeAdaptor(gson);
            }
            return null;
        }
    };

    private final Gson gson;

    DataTypeAdaptor(Gson gson) {
        this.gson = gson;
    }

    @Override
    public void write(JsonWriter out, Data value) throws IOException {
        if (value == null) {
            out.nullValue();
            return;
        }
        out.beginObject();
        out.name("rate");
        gson.getAdapter(Double.class).write(out, value.getRate());
        out.name("extend");
        gson.getAdapter(Object.class).write(out, value.getExtend());
        out.endObject();
    }

    @SuppressWarnings("unchecked")
    @Override
    public Data read(JsonReader in) throws IOException {
        Object parsed = readInternal(in);
        if (parsed == null) {
            // A top-level JSON null maps to a null Data.
            return null;
        }
        Map<String, Object> dataMap = (Map<String, Object>) parsed;
        Data data = new Data();
        // readInternal yields a Long for a literal such as 1, so convert through
        // Number instead of casting straight to Double.
        Number rate = (Number) dataMap.get("rate");
        data.setRate(rate == null ? null : rate.doubleValue());
        data.setExtend(dataMap.get("extend"));
        return data;
    }

    private Object readInternal(JsonReader in) throws IOException {
        JsonToken token = in.peek();
        switch (token) {
            case BEGIN_ARRAY:
                List<Object> list = new ArrayList<Object>();
                in.beginArray();
                while (in.hasNext()) {
                    list.add(readInternal(in));
                }
                in.endArray();
                return list;
            case BEGIN_OBJECT:
                Map<String, Object> map = new LinkedTreeMap<String, Object>();
                in.beginObject();
                while (in.hasNext()) {
                    map.put(in.nextName(), readInternal(in));
                }
                in.endObject();
                return map;
            case STRING:
                return in.nextString();
            case NUMBER:
                // Read the number as its literal text; nextString() never returns null here.
                String numberStr = in.nextString();
                if (numberStr.contains(".") || numberStr.contains("e")
                        || numberStr.contains("E")) {
                    return Double.parseDouble(numberStr);
                }
                // Note: an integer literal outside the long range makes
                // Long.parseLong throw NumberFormatException.
                return Long.parseLong(numberStr);
            case BOOLEAN:
                return in.nextBoolean();
            case NULL:
                in.nextNull();
                return null;
            default:
                throw new IllegalStateException();
        }
    }
}
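The rule used in the NUMBER branch can also be looked at on its own, away from Gson. The following is a minimal sketch under stated assumptions: NumberLiterals and parse are hypothetical names introduced for illustration, and the BigInteger fallback for integers that do not fit in a long is an addition that the adapter above does not have.

```java
import java.math.BigInteger;

final class NumberLiterals {
    private NumberLiterals() {}

    /**
     * Hypothetical helper: maps the text of a JSON number to a Long when it is an
     * integer literal, and to a Double when it has a fraction or an exponent.
     */
    static Number parse(String literal) {
        boolean floatingPoint = literal.indexOf('.') >= 0
                || literal.indexOf('e') >= 0
                || literal.indexOf('E') >= 0;
        if (floatingPoint) {
            return Double.parseDouble(literal);
        }
        try {
            return Long.parseLong(literal);
        } catch (NumberFormatException outOfRange) {
            // Assumption of this sketch: keep oversized integers exact instead of failing.
            return new BigInteger(literal);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("30"));                   // 30
        System.out.println(parse("120.3"));                // 120.3
        System.out.println(parse("1e3"));                  // 1000.0
        System.out.println(parse("9223372036854775808"));  // 9223372036854775808
    }
}
```

Whether oversized integers should become a BigInteger, a Double, or an error depends on what downstream consumers expect; the sketch only shows that the choice has to be made somewhere.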
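The change is that numeric values are read as strings: if the raw text contains a decimal point or uses scientific notation, it is treated as a floating-point number; otherwise it is treated as an integer. Let's return to the original example: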
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

public class GsonTest {
    public static void main(String[] args) {
        String dataJson = "{\"rate\" : 1.0, \"extend\" : {\"number\" : 30, \"amount\" : 120.3}}";
        Gson gson = buildGson();
        Data data = gson.fromJson(dataJson, Data.class);
        System.out.println(data.toString());
        System.out.println(gson.toJson(data, Data.class));
    }

    private static Gson buildGson() {
        GsonBuilder gsonBuilder = new GsonBuilder();
        // Register the custom adapter so that Data is handled by DataTypeAdaptor.
        gsonBuilder.registerTypeAdapterFactory(DataTypeAdaptor.FACTORY);
        return gsonBuilder.create();
    }
}
Output:
Data{rate=1.0, extend={number=30, amount=120.3}}
{"rate":1.0,"extend":{"number":30,"amount":120.3}}
Process finished with exit code 0
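The result is correct: integers remain integers and floating-point values remain floating-point, so the problem is solved. The proper fix for the underlying issue is to push the reporting teams to conform to a schema with declared types; since this article focuses on Gson, other solutions are not covered here. Personally, I also think Gson itself should distinguish integers from floating-point numbers. Judging from the code (JsonReader tracks PEEKED_LONG separately), it appears to have considered the issue but never exposed the distinction to users. I don't yet know why, and plan to ask the community about it.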
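For readers who are not tied to gson-2.8.0: to my knowledge, Gson 2.8.9 and later expose this choice through GsonBuilder.setObjectToNumberStrategy together with ToNumberPolicy.LONG_OR_DOUBLE, which applies to values deserialized as Object. The sketch below assumes such a newer Gson version is on the classpath; LongOrDoubleDemo and parse are hypothetical names, and the behavior should be checked against the version actually in use.

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.ToNumberPolicy;

import java.util.Map;

final class LongOrDoubleDemo {
    private LongOrDoubleDemo() {}

    // Assumption: Gson 2.8.9 or later. The 2.8.0 release analyzed above lacks this API.
    private static final Gson GSON = new GsonBuilder()
            .setObjectToNumberStrategy(ToNumberPolicy.LONG_OR_DOUBLE)
            .create();

    /** Parses a JSON object into a Map; numbers are expected to come back as Long or Double. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> parse(String json) {
        return (Map<String, Object>) GSON.fromJson(json, Object.class);
    }

    public static void main(String[] args) {
        Map<String, Object> extend = parse("{\"number\": 30, \"amount\": 120.3}");
        // Print the runtime class of each value to see which numeric type was chosen.
        System.out.println(extend.get("number").getClass().getSimpleName());
        System.out.println(extend.get("amount").getClass().getSimpleName());
    }
}
```

With this configuration the Data class from this article should need no custom adapter, because its extend field is declared as Object and goes through the same number strategy.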