The FlatBuffers Deserialization Process
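The previous article covered the FlatBuffers serialization process in detail; this one covers the reverse process: deserialization.
The code is still SampleBinary.java, with the output at the end slightly modified.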
import MyGame.Sample.Color;
import MyGame.Sample.Equipment;
import MyGame.Sample.Monster;
import MyGame.Sample.Vec3;
import MyGame.Sample.Weapon;

import com.google.flatbuffers.FlatBufferBuilder;

import java.nio.ByteBuffer;

class SampleBinary {
    // Example how to use FlatBuffers to create and read binary buffers.
    public static void main(String[] args) {
        FlatBufferBuilder builder = new FlatBufferBuilder(0);
        // Create some weapons for our Monster ('Sword' and 'Axe').
        int weaponOneName = builder.createString("Sword");
        short weaponOneDamage = 3;
        int weaponTwoName = builder.createString("Axe");
        short weaponTwoDamage = 5;
        // Use the `createWeapon()` helper function to create the weapons, since we set every field.
        int[] weaps = new int[2];
        weaps[0] = Weapon.createWeapon(builder, weaponOneName, weaponOneDamage);
        weaps[1] = Weapon.createWeapon(builder, weaponTwoName, weaponTwoDamage);
        // Serialize the FlatBuffer data.
        int name = builder.createString("Orc");
        byte[] treasure = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        int inv = Monster.createInventoryVector(builder, treasure);
        int weapons = Monster.createWeaponsVector(builder, weaps);
        int pos = Vec3.createVec3(builder, 1.0f, 2.0f, 3.0f);
        Monster.startMonster(builder);
        Monster.addPos(builder, pos);
        Monster.addName(builder, name);
        Monster.addColor(builder, Color.Red);
        Monster.addHp(builder, (short)300);
        Monster.addInventory(builder, inv);
        Monster.addWeapons(builder, weapons);
        Monster.addEquippedType(builder, Equipment.Weapon);
        Monster.addEquipped(builder, weaps[1]);
        int orc = Monster.endMonster(builder);
        builder.finish(orc); // You could also call `Monster.finishMonsterBuffer(builder, orc);`.
        // We now have a FlatBuffer that can be stored on disk or sent over a network.
        // ...Code to store to disk or send over a network goes here...
        // Instead, we are going to access it right away, as if we just received it.
        ByteBuffer buf = builder.dataBuffer();
        // Get access to the root:
        Monster monster = Monster.getRootAsMonster(buf);
        System.out.println(monster.hp());
        System.out.println(monster.mana());
        System.out.println(monster.name());
        System.out.println(monster.pos());
        System.out.println(monster.inventory(1));
        System.out.println(monster.weapons(0).name());
        System.out.println(monster.weapons(0).damage());
    }
}
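Memory layout:
The 192 bytes of data occupy positions 8 to 199 of a 200-byte ByteBuffer. For each region, "position" is where it starts in the ByteBuffer and "from end" is the distance from that position to the end of the buffer. The dump was taken from a buffer built with hp set to 500 and with a path vector of two Vec3 values, so it differs slightly from the listing above.

region            position   from end
root_table               8        192
Monster vtable          14        186
Monster                 40        160
path                    84        116
weapons                112         88
inv                    124         76
name                   140         60
axe                    148         52
Weapon vtable          160         40
sword                  168         32
weaponTwoName          180         20
weaponOneName          188         12
(last byte)            199          1

Raw bytes, printed as signed Java byte values: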
32 0 0 0 0 0 26 0 44 0 32 0 0 0 24 0 28 0 0 0 20 0 27 0 16 0 15 0 8 0 4 0 26 0 0 0 40 0 0 0 100 0 0 0 0 0 0 1 56 0 0 0 64 0 0 0 -12 1 0 0 72 0 0 0 0 0 -128 63 0 0 0 64 0 0 64 64 2 0 0 0 0 0 -128 64 0 0 -96 64 0 0 -64 64 0 0 -128 63 0 0 0 64 0 0 64 64 2 0 0 0 52 0 0 0 28 0 0 0 10 0 0 0 0 1 2 3 4 5 6 7 8 9 0 0 3 0 0 0 79 114 99 0 -12 -1 -1 -1 0 0 5 0 24 0 0 0 8 0 12 0 8 0 6 0 8 0 0 0 0 0 3 0 12 0 0 0 3 0 0 0 65 120 101 0 5 0 0 0 83 119 111 114 100 0 0 0
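The code is simple: obtain a Monster instance, then call its accessor methods to read values. The lookup logic is just as simple: use the field's vtable_offset to read the field's offset from the vtable, then read the value at that offset inside the object's memory. A non-reference type is read directly. For a reference type (string/vector/table), the stored value is a relative offset that must first be resolved to the position of the referenced data.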
1. getRootAsMonster: obtaining the root type
Monster monster = Monster.getRootAsMonster(buffer);
|
public static Monster getRootAsMonster(ByteBuffer _bb, Monster obj) { _bb.order(ByteOrder.LITTLE_ENDIAN); return (obj.__assign(_bb.getInt(_bb.position()) + _bb.position(), _bb)); }
|
public Monster __assign(int _i, ByteBuffer _bb) { __init(_i, _bb); return this; }
|
public void __init(int _i, ByteBuffer _bb) { bb_pos = _i; bb = _bb; }
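getRootAsMonster only does bookkeeping: (1) it records where the root table starts (the int stored at _bb.position() is the offset from that position to the start of the root table), and (2) it records which ByteBuffer is in use.
Serialization writes data from the high end of the ByteBuffer toward the low end; deserialization goes the other way and reads from low to high.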
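The arithmetic in getRootAsMonster can be tried by hand. The following is a minimal sketch, not library code: the class RootPositionDemo and its hand-built 200-byte buffer are assumptions for illustration, and the two values it uses (data starting at position 8, root offset 32) are copied from the memory layout above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the FlatBuffers library.
final class RootPositionDemo {
    // Same arithmetic as getRootAsMonster: the int stored at the buffer's
    // current position is the distance from there to the root table.
    static int rootPosition(ByteBuffer bb) {
        bb.order(ByteOrder.LITTLE_ENDIAN);
        return bb.getInt(bb.position()) + bb.position();
    }

    // Builds a buffer that holds only the 4-byte root offset from the dump.
    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(8, 32); // the bytes "32 0 0 0" at the start of the data
        bb.position(8);   // the data occupies positions 8..199
        return bb;
    }

    public static void main(String[] args) {
        System.out.println(rootPosition(sampleBuffer())); // 8 + 32 = 40
    }
}
```

The result, 40, is the bb_pos of the Monster table that the following sections rely on.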
2. Reading an offset from the vtable, using the __offset method of the Table class
/**
 * Look up a field in the vtable.
 *
 * @param vtable_offset An `int` offset to the vtable in the Table's ByteBuffer.
 * @return Returns an offset into the object, or `0` if the field is not present.
 */
protected int __offset(int vtable_offset) {
    // Find the start of the vtable: the table begins with the distance back to its vtable.
    int vtable = bb_pos - bb.getInt(bb_pos);
    // bb.getShort(vtable) is the size of the vtable; bb.getShort(vtable + vtable_offset) is the
    // value stored for the field at vtable_offset. If the result is 0, the default value is used.
    return vtable_offset < bb.getShort(vtable) ? bb.getShort(vtable + vtable_offset) : 0;
}
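To see the lookup at work on the sample data, here is a small sketch that copies the Monster vtable from the dump into a buffer and runs the same logic against it. VtableLookupDemo and fieldOffset are hypothetical names; fieldOffset mirrors __offset but takes the buffer and table position as parameters instead of reading them from fields.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the FlatBuffers library.
final class VtableLookupDemo {
    static final int BB_POS = 40; // start of the Monster table in the layout above

    // Mirrors Table.__offset with the buffer and table position passed explicitly.
    static int fieldOffset(ByteBuffer bb, int bbPos, int vtableOffset) {
        int vtable = bbPos - bb.getInt(bbPos);
        return vtableOffset < bb.getShort(vtable) ? bb.getShort(vtable + vtableOffset) : 0;
    }

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        // Monster vtable at position 14: vtable size, object size, then one slot per field.
        short[] vtable = {26, 44, 32, 0, 24, 28, 0, 20, 27, 16, 15, 8, 4};
        for (int i = 0; i < vtable.length; i++) {
            bb.putShort(14 + 2 * i, vtable[i]);
        }
        bb.putInt(BB_POS, 26); // the table starts with the distance back to its vtable
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        System.out.println(fieldOffset(bb, BB_POS, 8));  // hp slot: 24
        System.out.println(fieldOffset(bb, BB_POS, 6));  // mana slot: 0, field absent
        System.out.println(fieldOffset(bb, BB_POS, 30)); // past the 26-byte vtable: 0
    }
}
```

The last call shows why the size check exists: a reader generated from a newer schema can ask for a field that an older, shorter vtable does not contain, and it simply gets 0.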
3. Reading primitive types
There are two cases:
- Reading a non-default value
monster.hp()
|
public short hp() { int o = __offset(8); return o != 0 ? bb.getShort(o + bb_pos) : 100; }
Here __offset(8) returns 24 and o + bb_pos is 64 (bb_pos is 40), so the value is read at position 64. bb.getShort(o + bb_pos) reads the bytes -12 1, which is 500 in little-endian order.
- Reading a default value
monster.mana()
|
public short mana() { int o = __offset(6); return o != 0 ? bb.getShort(o + bb_pos) : 150; }
__offset(6) returns 0, so the default value 150 is used.
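Both cases can be reproduced with a few lines. This is a sketch under stated assumptions: ScalarReadDemo and shortField are hypothetical names, the field offsets 24 and 0 are taken as given from the walkthrough above rather than looked up in a vtable, and only the two hp bytes from the dump are placed in the buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the FlatBuffers library.
final class ScalarReadDemo {
    static final int BB_POS = 40; // start of the Monster table

    // Same pattern as the generated hp() and mana() accessors:
    // offset 0 means "field absent", so the schema default is returned.
    static short shortField(ByteBuffer bb, int bbPos, int o, short defaultValue) {
        return o != 0 ? bb.getShort(o + bbPos) : defaultValue;
    }

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.put(64, (byte) -12); // low byte 0xF4
        bb.put(65, (byte) 1);   // high byte 0x01
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        System.out.println(shortField(bb, BB_POS, 24, (short) 100)); // hp: 0x01F4 = 500
        System.out.println(shortField(bb, BB_POS, 0, (short) 150));  // mana: default 150
    }
}
```

A field left at its default costs no space in the table itself; only its zeroed vtable slot remains.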
4. Reading a string
monster.name()
|
public String name() { int o = __offset(10); return o != 0 ? __string(o + bb_pos) : null; }
|
/**
 * Create a Java `String` from UTF-8 data stored inside the FlatBuffer.
 *
 * This allocates a new string and converts to wide chars upon each access,
 * which is not very efficient. Instead, each FlatBuffer string also comes with an
 * accessor based on __vector_as_bytebuffer below, which is much more efficient,
 * assuming your Java program can handle UTF-8 data directly.
 *
 * @param offset An `int` index into the Table's ByteBuffer.
 * @return Returns a `String` from the data stored inside the FlatBuffer at `offset`.
 */
protected String __string(int offset) {
    CharsetDecoder decoder = UTF8_DECODER.get();
    decoder.reset();
    // bb.getInt(offset) is relative to the current position; adding the current
    // position gives the actual position of the string.
    offset += bb.getInt(offset);
    ByteBuffer src = bb.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    // The first field of a string stores its length.
    int length = src.getInt(offset);
    src.position(offset + SIZEOF_INT);
    src.limit(offset + SIZEOF_INT + length);
    int required = (int)((float)length * decoder.maxCharsPerByte());
    CharBuffer dst = CHAR_BUFFER.get();
    if (dst == null || dst.capacity() < required) {
        dst = CharBuffer.allocate(required);
        CHAR_BUFFER.set(dst);
    }
    dst.clear();
    try {
        CoderResult cr = decoder.decode(src, dst, true);
        if (!cr.isUnderflow()) {
            cr.throwException();
        }
    } catch (CharacterCodingException x) {
        throw new Error(x);
    }
    return dst.flip().toString();
}
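In a table, a string is stored as a reference: the distance between the slot that holds the reference and the start of the string. (Note that the stored value is not the absolute offset of the referenced object but an offset relative to the slot itself; adding the two gives the actual position. This follows from how the addOffset method writes it.)
__offset returns 28, so __string is called with 68. bb.getInt(68) yields 72, and offset += bb.getInt(offset) gives 140, which is the position of the string.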
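The same resolution can be sketched in isolation. StringLookupDemo and readString are hypothetical names, and the decoding is deliberately simplified: it copies the bytes and uses new String with UTF-8 instead of the thread-local CharsetDecoder that __string uses. The values 68, 72, 140 and the bytes of "Orc" are copied from the dump.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the FlatBuffers library.
final class StringLookupDemo {
    // Resolves the relative offset stored at `slot`, then decodes the string.
    static String readString(ByteBuffer bb, int slot) {
        int start = slot + bb.getInt(slot); // 68 + 72 = 140
        int length = bb.getInt(start);      // 4-byte length prefix
        byte[] utf8 = new byte[length];
        for (int i = 0; i < length; i++) {
            utf8[i] = bb.get(start + 4 + i); // data follows the length
        }
        return new String(utf8, StandardCharsets.UTF_8);
    }

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(68, 72); // name slot inside the Monster table
        bb.putInt(140, 3); // string length
        byte[] orc = {79, 114, 99, 0}; // "Orc" plus the terminating zero byte
        for (int i = 0; i < orc.length; i++) {
            bb.put(144 + i, orc[i]);
        }
        return bb;
    }

    public static void main(String[] args) {
        System.out.println(readString(sampleBuffer(), 68)); // Orc
    }
}
```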
5. Reading a struct
monster.pos().x()
|
public Vec3 pos() { return pos(new Vec3()); }
|
public Vec3 pos(Vec3 obj) { int o = __offset(4); return o != 0 ? obj.__assign(o + bb_pos, bb) : null; }
|
public Vec3 __assign(int _i, ByteBuffer _bb) { __init(_i, _bb); return this; }
|
public void __init(int _i, ByteBuffer _bb) { bb_pos = _i; bb = _bb; }
|
public float x() { return bb.getFloat(bb_pos + 0); }
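This works like getRootAsMonster: it only records the ByteBuffer and the start position for the Vec3 object, after which the fields are read directly at fixed byte offsets from that start.
__offset(4) returns 32 and o + bb_pos is 72, which is exactly where pos sits in memory.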
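Because a struct is stored inline, reading it needs no further indirection. The sketch below places the 12 pos bytes from the dump at position 72 and reads them back. StructReadDemo and readVec3 are hypothetical names, and the layout of y at +4 and z at +8 is an assumption based on Vec3 being three consecutive floats, since only x() is shown above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical demo class, not part of the FlatBuffers library.
final class StructReadDemo {
    // Reads three consecutive little-endian floats starting at `pos`.
    static float[] readVec3(ByteBuffer bb, int pos) {
        return new float[] {bb.getFloat(pos), bb.getFloat(pos + 4), bb.getFloat(pos + 8)};
    }

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        // The pos bytes from the dump, stored inline in the Monster table at 40 + 32 = 72.
        byte[] posBytes = {0, 0, -128, 63, 0, 0, 0, 64, 0, 0, 64, 64};
        for (int i = 0; i < posBytes.length; i++) {
            bb.put(72 + i, posBytes[i]);
        }
        return bb;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readVec3(sampleBuffer(), 72))); // [1.0, 2.0, 3.0]
    }
}
```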
6. Reading a vector
monster.inventory(1)
|
public int inventory(int j) { int o = __offset(14); return o != 0 ? bb.get(__vector(o) + j * 1) & 0xFF : 0; }
|
/**
 * Get the start data of a vector.
 *
 * @param offset An `int` index into the Table's ByteBuffer.
 * @return Returns the start of the vector data whose offset is stored at `offset`.
 */
protected int __vector(int offset) {
    offset += bb_pos;
    return offset + bb.getInt(offset) + SIZEOF_INT; // data starts after the length
}
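Reading a vector is similar to reading a string. The main logic is in __vector(int offset), which returns the actual position of the first element; the elements can then be read byte by byte from there.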
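Applied to the inventory field of the sample data, the steps look as follows. This is a sketch, not library code: VectorReadDemo and its methods are hypothetical names, the field offset 20 is the inventory slot value from the Monster vtable in the dump, and only the bytes the lookup touches are placed in the buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the FlatBuffers library.
final class VectorReadDemo {
    static final int BB_POS = 40; // start of the Monster table

    // Mirrors __vector: resolve the relative offset, then skip the 4-byte length.
    static int vectorStart(ByteBuffer bb, int bbPos, int o) {
        int slot = o + bbPos;
        return slot + bb.getInt(slot) + 4;
    }

    // The element count is the int stored where the vector begins.
    static int vectorLength(ByteBuffer bb, int bbPos, int o) {
        int slot = o + bbPos;
        return bb.getInt(slot + bb.getInt(slot));
    }

    // Same shape as the generated inventory(j): unsigned byte at start + j.
    static int inventory(ByteBuffer bb, int j) {
        int o = 20; // inventory slot value in the Monster vtable
        return bb.get(vectorStart(bb, BB_POS, o) + j) & 0xFF;
    }

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(60, 64);  // inventory slot: 60 + 64 = 124
        bb.putInt(124, 10); // vector length
        for (int i = 0; i < 10; i++) {
            bb.put(128 + i, (byte) i); // elements 0..9
        }
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        System.out.println(vectorStart(bb, BB_POS, 20));  // 128
        System.out.println(vectorLength(bb, BB_POS, 20)); // 10
        System.out.println(inventory(bb, 1));             // 1
    }
}
```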
7. Reading a table
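Reading a table is also similar to getRootAsMonster: once the offset has been resolved, __assign records the table's ByteBuffer and bb_pos, and the methods described above can then be used to read its fields.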
monster.weapons(0)
|
public Weapon weapons(int j) { return weapons(new Weapon(), j); }
|
public Weapon weapons(Weapon obj, int j) { int o = __offset(18); return o != 0 ? obj.__assign(__indirect(__vector(o) + j * 4), bb) : null; }
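A vector of tables stores one relative offset per element, so each access adds one more indirection before the usual vtable lookup. The sketch below follows weapons(0) and weapons(1) through the sample data. TableInVectorDemo and its methods are hypothetical names; the vtable offset 6 for damage is an assumption based on the Weapon vtable in the dump, and all positions and values are copied from it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the FlatBuffers library.
final class TableInVectorDemo {
    static final int MONSTER_POS = 40;

    // Same arithmetic as __indirect: follow a relative offset.
    static int indirect(ByteBuffer bb, int offset) {
        return offset + bb.getInt(offset);
    }

    // Mirrors __offset for a table that starts at tablePos.
    static int fieldOffset(ByteBuffer bb, int tablePos, int vtableOffset) {
        int vtable = tablePos - bb.getInt(tablePos);
        return vtableOffset < bb.getShort(vtable) ? bb.getShort(vtable + vtableOffset) : 0;
    }

    // Position of the j-th Weapon table; 16 is the weapons slot value in the Monster vtable.
    static int weaponPosition(ByteBuffer bb, int j) {
        int slot = 16 + MONSTER_POS;
        int firstElement = slot + bb.getInt(slot) + 4; // skip the length prefix
        return indirect(bb, firstElement + j * 4);
    }

    static short damage(ByteBuffer bb, int weaponPos) {
        int o = fieldOffset(bb, weaponPos, 6); // assumed vtable offset of damage
        return o != 0 ? bb.getShort(o + weaponPos) : 0;
    }

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(56, 56);  // weapons slot: 56 + 56 = 112
        bb.putInt(112, 2);  // vector length
        bb.putInt(116, 52); // element 0: 116 + 52 = 168 (sword)
        bb.putInt(120, 28); // element 1: 120 + 28 = 148 (axe)
        short[] vtable = {8, 12, 8, 6}; // Weapon vtable at 160, shared by both tables
        for (int i = 0; i < vtable.length; i++) {
            bb.putShort(160 + 2 * i, vtable[i]);
        }
        bb.putInt(148, -12); // axe: vtable lies after the table, so the distance is negative
        bb.putShort(154, (short) 5);
        bb.putInt(168, 8);   // sword: vtable lies 8 bytes before the table
        bb.putShort(174, (short) 3);
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        System.out.println(weaponPosition(bb, 0));             // 168
        System.out.println(damage(bb, weaponPosition(bb, 0))); // 3
        System.out.println(damage(bb, weaponPosition(bb, 1))); // 5
    }
}
```

Both Weapon tables point at the same vtable at position 160, which is why the dump contains only one Weapon vtable for two weapons.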
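Summary
This essentially concludes the FlatBuffers series. FlatBuffers offers a fresh perspective on serialization and deserialization: it stays efficient in both memory and speed, and its underlying principle is simple. It is not as well known as Protocol Buffers, but quite a few projects use it. Arrow, which recently became an Apache top-level project, uses FlatBuffers as the serialization format for its schemas. FlatBuffers does have one notable drawback: the style of the generated code does not match usual calling habits, as the code above shows. When building a buffer, every string/vector/table has to be created first, and only then can the root type be constructed; construction cannot be nested. That said, this is largely a matter of habit and does not affect functionality, so most users can ignore it.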