epoll (updated 2018.07.29)

2016-10-29 linjinhe

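Getting started

epoll is the I/O event notification facility provided by Linux. When the number of fds to monitor is large (thousands or tens of thousands) while only a few of them (a handful, dozens, or a few hundred) are readable or writable at any given moment, epoll performs significantly better than select and poll.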

API

The APIs directly related to epoll are:
Create an epoll fd: epoll_create/epoll_create1
Manipulate an epoll fd: epoll_ctl
Wait on an epoll fd: epoll_wait/epoll_pwait

Creating

int epoll_create(int size);
int epoll_create1(int flags);

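epoll_create creates an epoll fd. Since Linux 2.6.8 the size argument is ignored, but the value passed in must still be greater than 0.
The flags argument of epoll_create1 can be 0 or EPOLL_CLOEXEC.
- flags is 0: epoll_create1 behaves the same as epoll_create.
- flags is EPOLL_CLOEXEC: the returned epoll fd is set to close-on-exec.

Controlling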
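
As a minimal sketch of the two flag values, the helpers below create an epoll fd with EPOLL_CLOEXEC and check the close-on-exec flag through fcntl. The names create_epoll_cloexec and has_cloexec are made up for this illustration and are not part of the epoll API.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll fd that is closed automatically
 * on exec(). Returns the epoll fd, or -1 with errno set. */
int create_epoll_cloexec(void)
{
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd == -1)
        perror("epoll_create1");
    return epfd;
}

/* Hypothetical helper: returns 1 if FD_CLOEXEC is set on fd,
 * 0 if it is not set, -1 if fcntl fails. */
int has_cloexec(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

An epoll fd created with epoll_create1(0) should report 0 from has_cloexec, while one created by create_epoll_cloexec should report 1.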

int epoll_ctl(int epfd, int op, int fd, struct epoll_event* event);

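Applies the operation op to the file descriptor fd on epfd, with the events of interest given by event.
epfd is the return value of epoll_create or epoll_create1.
fd is the file descriptor to operate on.
op is the type of operation. Its possible values are:
- EPOLL_CTL_ADD
- EPOLL_CTL_MOD
- EPOLL_CTL_DEL
event describes the events that fd is interested in. For EPOLL_CTL_DEL it can be NULL (BUG: kernels before Linux 2.6.9 do not accept NULL here).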
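
A small sketch of the add and delete operations follows. The helper names add_fd and remove_fd are invented for this example. remove_fd passes a dummy non-NULL event so that it would also work on kernels older than 2.6.9.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd for readability on epfd.
 * Fails with errno == EEXIST if fd is already registered. */
int add_fd(int epfd, int fd)
{
    assert(epfd >= 0 && fd >= 0);
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: remove fd from epfd.
 * Fails with errno == ENOENT if fd is not registered. */
int remove_fd(int epfd, int fd)
{
    assert(epfd >= 0 && fd >= 0);
    /* The event is ignored for EPOLL_CTL_DEL; a dummy is passed instead
     * of NULL only for portability to kernels before 2.6.9. */
    struct epoll_event dummy = {0};
    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &dummy);
}
```

Adding the same fd twice fails with EEXIST, and deleting an fd that is not registered fails with ENOENT.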

typedef union epoll_data {
    void* ptr;
    int fd;
    uint32_t u32;
    uint64_t u64;
} epoll_data_t;
struct epoll_event {
    uint32_t events;     // events is a bit set
    epoll_data_t data;
};
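
Because epoll_data is a union, only one of its members can carry information at a time. A common pattern is to store a pointer to a per-connection structure in data.ptr instead of the bare fd. The sketch below assumes a made-up struct conn and helper names (watch_conn, wait_conn); none of them come from the epoll API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection context used only for this example. */
struct conn {
    int fd;
    unsigned long bytes_seen;
};

/* Register c->fd for readability and attach c as the user data. */
int watch_conn(int epfd, struct conn *c)
{
    assert(c != NULL);
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.ptr = c; /* union: do not also set ev.data.fd */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Wait up to timeout_ms for one event. Returns the context that was
 * attached at registration time, or NULL on timeout or error. */
struct conn *wait_conn(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? (struct conn *)ev.data.ptr : NULL;
}
```

The fd is kept inside the context structure, so nothing is lost by not using data.fd.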

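The events member of epoll_event is a bit set. Its value can be the bitwise OR of the following:
- EPOLLIN: watch for readable events.
- EPOLLOUT: watch for writable events.
- EPOLLRDHUP: watch for the peer closing the connection.
- EPOLLPRI: out-of-band data.
- EPOLLERR: Error condition happened on the associated file descriptor.
- EPOLLHUP: Hang up happened on the associated file descriptor.
- EPOLLET: edge-triggered. fds monitored by epoll are level-triggered by default.
- EPOLLONESHOT: once an event on the fd has been reported, epoll_ctl must be called again with EPOLL_CTL_MOD to re-enable notifications for it.

Waiting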
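
To make the EPOLLONESHOT behaviour concrete, here is a minimal sketch. The helper names (watch_oneshot, rearm_oneshot, poll_ready) are invented for this example. It registers an fd as one-shot, and the fd has to be re-armed with EPOLL_CTL_MOD before it is reported again, even if unread data is still pending.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Shared by the two helpers below: op is EPOLL_CTL_ADD or EPOLL_CTL_MOD. */
static int set_oneshot(int epfd, int op, int fd)
{
    assert(op == EPOLL_CTL_ADD || op == EPOLL_CTL_MOD);
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLONESHOT; /* bitwise OR of event flags */
    ev.data.fd = fd;
    return epoll_ctl(epfd, op, fd, &ev);
}

/* Register fd for readability, reported at most once until re-armed. */
int watch_oneshot(int epfd, int fd)
{
    return set_oneshot(epfd, EPOLL_CTL_ADD, fd);
}

/* Re-enable notifications for an fd whose one-shot event has fired. */
int rearm_oneshot(int epfd, int fd)
{
    return set_oneshot(epfd, EPOLL_CTL_MOD, fd);
}

/* Number of ready events right now (timeout 0 means do not block). */
int poll_ready(int epfd)
{
    struct epoll_event out[8];
    return epoll_wait(epfd, out, 8, 0);
}
```

With a pipe that has one unread byte, poll_ready reports the fd once, then reports nothing until rearm_oneshot is called.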

int epoll_wait(int epfd, struct epoll_event* events, int maxevents, int timeout);
int epoll_pwait(int epfd, struct epoll_event* events, int maxevents, int timeout, const sigset_t* sigmask);

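The waiting interface is fairly simple, so there is not much to say about it for now.

Thread safety

On Zhihu I came across a question discussing the thread safety of epoll. It mentions an issue on GitHub that contains this passage: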
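
For reference, a typical call looks roughly like the sketch below. It assumes that every fd was registered with its own value stored in data.fd. The function name wait_and_read and the constant MAX_EVENTS are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 16

/* Wait once on epfd (timeout in milliseconds, -1 blocks forever) and
 * read from every fd that was reported. Returns the total number of
 * bytes read, 0 on timeout, or -1 if epoll_wait fails. */
long wait_and_read(int epfd, int timeout_ms)
{
    assert(epfd >= 0);
    struct epoll_event events[MAX_EVENTS];
    int n;

    /* Retry if interrupted by a signal. Note that this simple retry
     * restarts the full timeout. */
    do {
        n = epoll_wait(epfd, events, MAX_EVENTS, timeout_ms);
    } while (n == -1 && errno == EINTR);
    if (n == -1)
        return -1;

    long total = 0;
    for (int i = 0; i < n; i++) {
        /* Assumption: data.fd was set to the fd at registration time. */
        if (events[i].events & (EPOLLIN | EPOLLHUP | EPOLLERR)) {
            char buf[4096];
            ssize_t r = read(events[i].data.fd, buf, sizeof buf);
            if (r > 0)
                total += r;
        }
    }
    return total;
}
```

epoll_wait returns at most maxevents entries per call, so a real event loop would call it repeatedly.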

One thread is calling epoll_wait() and blocks there. After that another thread is adding a new file descriptor to the epoll file. Because epoll_wait() will not take this change into consideration, it will not be woken up on that file descriptor's events.

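In short: if one thread is blocked in epoll_wait and another thread calls epoll_ctl to register an fd, there is supposedly a problem.

However, what I read in man epoll_wait says that doing this is safe: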

While one thread is blocked in a call to epoll_pwait(), it is possible for another thread to add a file descriptor to the waited-upon epoll instance. If the new file descriptor becomes ready, it will cause the epoll_wait() call to unblock.

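I wrote a simple test tool: one thread accepts TCP connections and calls epoll_ctl to register each new connection with epoll, while another thread calls epoll_wait, reads the data sent by clients, and prints it. The problem described in that GitHub issue did not show up...

From what I have seen online, when one thread handles accept and other threads handle epoll_wait, the mainstream approach is to pass the fd through a pipe or eventfd and perform the registration in the thread that calls epoll_wait. (This also makes it possible to load-balance across multiple epoll_wait threads.)

Summary

My impression is that the Linux kernel has been changing quite fast in recent years, so when you need something, it is more reliable to read the man page for the version you are actually running...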
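
The original test tool is not reproduced here. The following is a much smaller stand-in that checks the same behaviour with a pipe instead of TCP connections: one thread blocks in epoll_wait on an empty epoll instance, and the main thread registers an fd afterwards and makes it readable. All names (struct waiter, wait_thread, add_while_waiting) are invented for this sketch, and the 200 ms sleep is only a crude way of letting the waiter block first. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <unistd.h>

struct waiter {
    int epfd;
    int ready_fd; /* fd reported by epoll_wait, or -1 on timeout/error */
};

/* Waiter thread: blocks in epoll_wait for up to 5 seconds. */
static void *wait_thread(void *arg)
{
    assert(arg != NULL);
    struct waiter *w = arg;
    struct epoll_event ev;
    int n = epoll_wait(w->epfd, &ev, 1, 5000);
    w->ready_fd = (n == 1) ? ev.data.fd : -1;
    return NULL;
}

/* Returns 1 if the blocked waiter was woken by an fd that was added
 * after it started waiting, 0 if it was not, -1 on setup failure. */
int add_while_waiting(void)
{
    int p[2];
    pthread_t t;
    struct waiter w = { .epfd = epoll_create1(0), .ready_fd = -1 };

    if (w.epfd == -1)
        return -1;
    if (pipe(p) == -1) {
        close(w.epfd);
        return -1;
    }
    if (pthread_create(&t, NULL, wait_thread, &w) != 0) {
        close(p[0]);
        close(p[1]);
        close(w.epfd);
        return -1;
    }

    usleep(200 * 1000); /* crude: give the waiter time to block */

    /* Register the pipe's read end from this thread, then make it
     * readable. */
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = p[0];
    int ok = epoll_ctl(w.epfd, EPOLL_CTL_ADD, p[0], &ev) == 0
             && write(p[1], "x", 1) == 1;

    pthread_join(t, NULL);
    int woken = ok && w.ready_fd == p[0];
    close(p[0]);
    close(p[1]);
    close(w.epfd);
    return woken;
}
```

On the kernels described by the man page quoted above, add_while_waiting should return 1, which matches what the test tool showed.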
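
A minimal sketch of the pipe variant of this handoff is shown below. The accepting thread writes the new fd into a pipe, and the worker, which has the pipe's read end registered in its own epoll instance, reads the fd and registers it itself. The function names handoff_fd and adopt_fd are hypothetical. An int is far smaller than PIPE_BUF, so each write to the pipe is atomic.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Acceptor side: hand a newly accepted fd to the worker through the
 * write end of the handoff pipe. Returns 0 on success, -1 on failure. */
int handoff_fd(int pipe_wr, int conn_fd)
{
    assert(conn_fd >= 0);
    ssize_t n = write(pipe_wr, &conn_fd, sizeof conn_fd);
    return n == (ssize_t)sizeof conn_fd ? 0 : -1;
}

/* Worker side: call this when epoll_wait reports the pipe's read end
 * as readable. Reads one fd from the pipe and registers it on the
 * worker's own epoll instance. Returns the adopted fd, or -1. */
int adopt_fd(int epfd, int pipe_rd)
{
    int conn_fd;
    ssize_t n = read(pipe_rd, &conn_fd, sizeof conn_fd);
    if (n != (ssize_t)sizeof conn_fd)
        return -1;

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = conn_fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, conn_fd, &ev) == -1)
        return -1;
    return conn_fd;
}
```

With this layout every epoll_ctl call happens in the thread that also calls epoll_wait, and giving each worker its own pipe lets the acceptor choose which worker receives a connection.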

Also, this is good to learn about, but leave I/O multiplexing itself to libraries and frameworks ^_^

References

(The "Thread safety" section was updated on 2018.07.29.)
