ovs datapath
OVS has two kinds of datapath (and only these two).
One is the kernel datapath, where packets are received and transmitted in kernel space.
The other is the userspace datapath, which supports two types of ports: non-pmd and pmd. A non-pmd port can be a physical NIC bound to a kernel driver on Linux, or a virtual NIC such as a tap device. A pmd port requires DPDK support, so the DPDK libraries must be specified at build time; such ports are handled by DPDK PMD threads and can be physical NICs or vhost-user virtual NICs.
The kernel datapath and the userspace datapath can coexist. The figure below shows the architecture of the two datapaths after OVS is integrated with DPDK.
[Figure: kernel datapath and userspace datapath architecture after integrating DPDK]
a. Each datapath type can host multiple bridges, but bridges of different datapath types cannot communicate with each other.
b. Patch ports exist only at the ofproto layer and are never pushed down into the datapath, so they do not show up in the output of "ovs-appctl dpctl/show".
c. The kernel datapath sends and receives packets by calling the NIC driver directly from the openvswitch.ko module.
d. In the userspace datapath, non-pmd ports send and receive packets through an AF_PACKET socket or by reading/writing /dev/net/tun (see the sketch below), while pmd ports use their DPDK drivers to do packet I/O entirely in user space.
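For non-pmd ports, the tap read/write path mentioned in d. works roughly as in the minimal standalone sketch below. This is illustrative code, not the actual OVS netdev implementation, and the interface name tap0 is an assumption:
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <linux/if_tun.h>

int main(void)
{
    /* Open the clone device and attach to a tap interface. */
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) {
        perror("open /dev/net/tun");
        return 1;
    }

    struct ifreq ifr;
    memset(&ifr, 0, sizeof ifr);
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;          /* raw Ethernet frames, no extra header */
    strncpy(ifr.ifr_name, "tap0", IFNAMSIZ - 1);  /* hypothetical interface name */
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        perror("TUNSETIFF");
        close(fd);
        return 1;
    }

    /* Each read() returns one Ethernet frame that the kernel sends out via tap0;
     * each write() injects one frame as if it had arrived on tap0. */
    char frame[2048];
    ssize_t n = read(fd, frame, sizeof frame);
    printf("received %zd bytes\n", n);

    close(fd);
    return 0;
}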
Configuration
The datapath type is chosen with datapath_type when creating a bridge: system or netdev.
//kernel datapath; the openvswitch.ko kernel module must be loaded first -- modprobe openvswitch
ovs-vsctl set bridge br0 datapath_type=system
//userspace datapath
ovs-vsctl set bridge br1 datapath_type=netdev
Adding ports
//Add the NIC eth0 to a kernel-datapath bridge
ovs-vsctl add-port br0 eth0
//Add the NIC eth1 to a userspace-datapath bridge
ovs-vsctl add-port br1 eth1
//Add the pmd-type NIC dpdk0 to a userspace-datapath bridge
ovs-vsctl add-port br1 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:06:00.1
//Add the pmd-type vhost-user virtual NIC dpdkvhostuser0 to a userspace-datapath bridge
ovs-vsctl add-port br1 dpdkvhostuser0 -- set Interface dpdkvhostuser0 type=dpdkvhostuser
Number of PMD threads
With the userspace datapath, how many PMD threads are started once OVS is configured with DPDK?
//DPDK support must be enabled first
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
The following code decides how many PMD threads are started:
static void
reconfigure_pmd_threads(struct dp_netdev *dp)
    OVS_REQUIRES(dp->port_mutex)
{
    struct dp_netdev_pmd_thread *pmd;
    struct ovs_numa_dump *pmd_cores;
    struct ovs_numa_info_core *core;
    struct hmapx to_delete = HMAPX_INITIALIZER(&to_delete);
    struct hmapx_node *node;
    bool changed = false;
    bool need_to_adjust_static_tx_qids = false;

    /* The pmd threads should be started only if there's a pmd port in the
     * datapath. If the user didn't provide any "pmd-cpu-mask", we start
     * NR_PMD_THREADS per numa node. */
    //If there is no pmd-type port, pmd_cores ends up empty
    if (!has_pmd_port(dp)) {
        pmd_cores = ovs_numa_dump_n_cores_per_numa(0);
    }
    //If pmd-cpu-mask is specified, start PMD threads on the given cores
    else if (dp->pmd_cmask && dp->pmd_cmask[0]) {
        pmd_cores = ovs_numa_dump_cores_with_cmask(dp->pmd_cmask);
    }
    //Otherwise, start NR_PMD_THREADS (i.e. one) PMD thread per NUMA node by default
    else {
        pmd_cores = ovs_numa_dump_n_cores_per_numa(NR_PMD_THREADS);
    }
    ....
If the userspace datapath has no pmd-type ports, no PMD threads are started.
If there are pmd-type ports and pmd-cpu-mask is configured, PMD threads are started on the cores given by the mask, and each PMD thread is pinned to its core by default. For example, the mask below, 0x6 (binary 110), starts PMD threads on cores 1 and 2:
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6
If pmd-cpu-mask is not set, one PMD thread is started per NUMA node by default.
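To illustrate how the hex mask maps to cores: bit N of pmd-cpu-mask selects core N, so 0x6 selects cores 1 and 2. A simplified sketch of that mapping (the real OVS parser accepts arbitrarily long hex strings):
#include <stdio.h>

int main(void)
{
    unsigned long long mask = 0x6;  /* the pmd-cpu-mask value from the example above */

    /* Bit N set in the mask means a PMD thread is created and pinned to core N. */
    for (int core = 0; core < 64; core++) {
        if (mask & (1ULL << core)) {
            printf("PMD thread pinned to core %d\n", core);  /* prints cores 1 and 2 */
        }
    }
    return 0;
}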
The following command controls which PMD thread (i.e. which CPU core) handles each receive queue of dpdk0; the affinity is a list of queue-id:core-id pairs, here pinning queue 0 to core 47.
ovs-vsctl set interface dpdk0 options:n_rxq=1 other_config:pmd-rxq-affinity="0:47"
Use the following command to see which queue of which port each PMD thread handles:
root# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 42:
isolated : true
port: dpdk0 queue-id: 0
port: dpdk1 queue-id: 0
pmd thread numa_id 1 core_id 43:
isolated : true
port: dpdk0 queue-id: 1
port: dpdk1 queue-id: 1
Use the following command to see per-PMD statistics:
root# ovs-appctl dpif-netdev/pmd-stats-show
pmd thread numa_id 0 core_id 42:
emc hits:875898311
megaflow hits:1782277167
avg. subtable lookups per hit:1.44
miss:1933315462
lost:19
idle cycles:49668309598846806 (99.78%)
processing cycles:109907759597880 (0.22%)
avg cycles per packet: 10841405.99 (49778217358444686/4591490940)
avg processing cycles per packet: 23937.27 (109907759597880/4591490940)
pmd thread numa_id 1 core_id 43:
emc hits:839760161
megaflow hits:327153813
avg. subtable lookups per hit:1.34
miss:617222608
lost:6
idle cycles:49736207941507126 (99.92%)
processing cycles:42009278307486 (0.08%)
avg cycles per packet: 27900452.09 (49778217219814612/1784136582)
avg processing cycles per packet: 23546.00 (42009278307486/1784136582)
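As a sanity check of the output above, the per-packet averages divide by a packet count equal to emc hits + megaflow hits + miss; for the core 42 thread this gives exactly the 4591490940 packets shown in parentheses. A small calculation using those counters:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Counters copied from the core 42 thread in the pmd-stats-show output above. */
    uint64_t emc_hits          = 875898311ULL;
    uint64_t megaflow_hits     = 1782277167ULL;
    uint64_t miss              = 1933315462ULL;
    uint64_t processing_cycles = 109907759597880ULL;

    uint64_t packets = emc_hits + megaflow_hits + miss;   /* 4591490940 */
    printf("packets: %llu\n", (unsigned long long) packets);
    printf("avg processing cycles per packet: %.2f\n",
           (double) processing_cycles / packets);          /* ~23937.27 */
    return 0;
}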
References
https://ovs-dpdk-1808-merge.readthedocs.io/en/latest/intro/install/userspace.html