Getting Started with Protocol Buffers (10 minutes)

2020-02-15 衣介书生

protobuf

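Introduction

Protocol Buffers (protobuf) is a language-neutral, platform-neutral, extensible mechanism for serializing structured data, commonly used for communication protocols and data storage. Compared with JSON and XML, its encoded messages are smaller and faster to parse.

Protobuf has two major versions, proto2 and proto3. Compared with proto2, proto3 supports more languages (Ruby, C#, etc.), removes some complex syntax and features, and introduces more conventions.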

Comparison of protobuf, XML, and JSON

	XML	JSON	protobuf
Data structure	Complex	Simple	Moderate
Storage format	Text	Text	Binary
Encoded size	Large	Medium	Small
Parsing speed	Slow	Medium	Fast

Example program

The program has two parts, a writer and a reader, both developed with protobuf and C++.

1. Write the proto file (push.recall.userinfo.proto)

syntax = "proto3"; // protobuf syntax version

package push.recall; // package name

message userinfo
{
   // proto3 rejects the "required" label, so the fields are declared without a modifier
   int32  age = 1;
   string name = 2;
   string address = 3;
}

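2. Compile

Once the proto file is written, use the protobuf compiler (protoc) to compile it into the target language, which is C++ in this example. Assuming the proto file is in $SRC_DIR (the current directory here) and the generated files should go into the same directory, run: protoc -I=./ --cpp_out=./ ./push.recall.userinfo.proto. The command generates the following two files in that directory: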

push.recall.userinfo.pb.h
push.recall.userinfo.pb.cc

3. Write reader.cpp and writer.cpp

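// reader.cpp: reads a userinfo message from ./log and prints its fields
#include <iostream>
#include <fstream>
#include "push.recall.userinfo.pb.h"
using namespace std;

// Print each field of the message on its own line.
void ListMsg(const push::recall::userinfo &msg) {
  cout << msg.name() << endl;
  cout << msg.age() << endl;
  cout << msg.address() << endl;
}

int main(int argc, char* argv[]) {
    push::recall::userinfo msg1;
    fstream input("./log", ios::in | ios::binary);
    if (!input) {
        cerr << "Failed to open ./log." << endl;
        return -1;
    }
    // Parse the binary-encoded message from the file stream.
    if (!msg1.ParseFromIstream(&input)) {
        cerr << "Failed to parse userinfo message." << endl;
        return -1;
    }
    ListMsg(msg1);
    return 0;
}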

// writer.cpp: fills in a userinfo message and serializes it to ./log
#include <iostream>
#include <fstream>
#include "push.recall.userinfo.pb.h"
using namespace std;

int main(void)
{
    push::recall::userinfo msg1;
    msg1.set_name("yilonghao");
    msg1.set_age(18);
    msg1.set_address("Peking");
    // Truncate any existing file and write the message in binary form.
    fstream output("./log", ios::out | ios::trunc | ios::binary);
    if (!msg1.SerializeToOstream(&output)) {
        cerr << "Failed to write msg." << endl;
        return -1;
    }
    return 0;
}

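protobuf syntax basics

1. Namespaces

Use the package keyword to define a namespace (package name), e.g.: package push.recall;

2. Imports

The import keyword is followed by the path of the file being imported, e.g.: import "user.proto";

3. Comments

Proto files are commented with "//" and "/* */".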

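4. Common data types

proto	C++	Notes
double	double	Double-precision floating point.
float	float	Single-precision floating point.
int32	int32	Variable-length encoding. Inefficient for negative numbers; if the field is likely to hold negative values, use sint32 instead.
int64	int64	Variable-length encoding. Inefficient for negative numbers; if the field is likely to hold negative values, use sint64 instead.
uint32	uint32	Unsigned integer, variable-length encoding.
uint64	uint64	Unsigned integer, variable-length encoding.
sint32	int32	Signed integer, variable-length encoding. Encodes negative numbers more efficiently than a regular int32.
sint64	int64	Signed integer, variable-length encoding. Encodes negative numbers more efficiently than a regular int64.
fixed32	uint32	Always 4 bytes. More efficient than uint32 if values are often greater than 2^28.
fixed64	uint64	Always 8 bytes. More efficient than uint64 if values are often greater than 2^56.
sfixed32	int32	Always 4 bytes.
sfixed64	int64	Always 8 bytes.
bool	bool	Boolean value.
string	string	Must contain UTF-8 encoded or 7-bit ASCII text.
bytes	string	May contain any arbitrary sequence of bytes.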

5. Enum types

enum Recode {
    SUCCESS = 0;
    ERROR = 1;
};

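6. Structure of a message field

[field modifier] <data type | message type | imported message type> <name> = <number>;

7. Field modifiers

required: the value must be set and cannot be empty (removed in proto3)
optional: the value may be left unset (dropped in the original proto3 syntax; protoc 3.15 and later accept it again to track whether a field was explicitly set)
repeated: the field holds zero or more values, like an array
// proto3 has no required modifier, and a field without a modifier may be left unset, in which case it reads back as the default value of its type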

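8. Field numbering rules

Field numbers must be positive integers and need not be consecutive; each number must be unique within its message.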