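Ceph RGW: the RGW I/O path, part 2
In the previous article I walked through the rgw main function, where fe->run() starts the frontend. This article picks up from run().
rgw supports several frontends; the analysis here uses the default one, civetweb.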
RGWCivetWebFrontend::run
run is long, but most of it deals with configuration. The functional code is just these few lines:
struct mg_callbacks cb;
memset((void *)&cb, 0, sizeof(cb));
cb.begin_request = civetweb_callback;
cb.log_message = rgw_civetweb_log_callback;
cb.log_access = rgw_civetweb_log_access_callback;
ctx = mg_start(&cb, this, options.data());
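The code is simple: it is plain use of civetweb. It first registers our own event handlers, then calls mg_start to start the server (which runs in a new thread).
For mg_callbacks and mg_start, see: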
https://github.com/civetweb/civetweb/blob/master/docs/api/mg_callbacks.md
https://github.com/civetweb/civetweb/blob/master/docs/api/mg_start.md
civetweb_callback
The request handler, civetweb_callback, is implemented as follows:
static int civetweb_callback(struct mg_connection* conn)
{
const struct mg_request_info* const req_info = mg_get_request_info(conn);
return static_cast<RGWCivetWebFrontend *>(req_info->user_data)->process(conn);
}
As you can see, this function only fetches the parameter and forwards the call. The real handler is RGWCivetWebFrontend::process.
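The forwarding step above is a common way to connect a C-style callback to a C++ member function: the library hands back an opaque user pointer, and a free function casts it to the object and calls the method. The following is a minimal sketch of that pattern only. Connection, Callbacks, Frontend, on_begin_request and dispatch are hypothetical stand-ins invented for illustration; they are not the civetweb or RGW types.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the library's connection object (not civetweb's).
struct Connection {
  void* user_data;   // opaque pointer handed back to every callback
  std::string uri;
};

// C-compatible callback signature, as a C library would require.
using BeginRequestFn = int (*)(Connection*);

// Hypothetical stand-in for a callback table such as mg_callbacks.
struct Callbacks {
  BeginRequestFn begin_request = nullptr;
};

// Hypothetical frontend object that owns the real request logic.
class Frontend {
 public:
  int process(Connection* conn) {
    last_uri_ = conn->uri;
    ++handled_;
    return 1;  // non-zero, mirroring "mark as processed" in the original code
  }
  int handled() const { return handled_; }
  const std::string& last_uri() const { return last_uri_; }

 private:
  int handled_ = 0;
  std::string last_uri_;
};

// Free function with the C signature: recover the object, then forward.
static int on_begin_request(Connection* conn) {
  return static_cast<Frontend*>(conn->user_data)->process(conn);
}

// Plays the role of the server loop: builds a connection and fires the callback.
int dispatch(const Callbacks& cb, Frontend* fe, const std::string& uri) {
  Connection conn{fe, uri};
  return cb.begin_request ? cb.begin_request(&conn) : 0;
}
```

The free function holds no state of its own, so the same callback can serve any number of frontend instances; which object handles a request is decided entirely by the user pointer.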
RGWCivetWebFrontend::process
int RGWCivetWebFrontend::process(struct mg_connection* const conn)
{
/* Hold a read lock over access to env.store for reconfiguration. */
RWLock::RLocker lock(env.mutex);
RGWCivetWeb cw_client(conn);
auto real_client_io = rgw::io::add_reordering(
rgw::io::add_buffering(dout_context,
rgw::io::add_chunking(
rgw::io::add_conlen_controlling(
&cw_client))));
RGWRestfulIO client_io(dout_context, &real_client_io);
RGWRequest req(env.store->get_new_req_id());
// the actual processing function
int ret = process_request(env.store, env.rest, &req, env.uri_prefix,
*env.auth_registry, &client_io, env.olog);
if (ret < 0) {
/* We don't really care about return code. */
dout(20) << "process_request() returned " << ret << dendl;
}
/* Mark as processed. */
return 1;
}
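The nested add_reordering / add_buffering / add_chunking / add_conlen_controlling calls in process stack I/O filters around the raw client, each layer wrapping the next. As a rough illustration of that layering idea, here is a small decorator sketch with only two filters. ClientIO, RawClient, BufferingFilter and ChunkingFilter are hypothetical names and behaviors made up for this example, and the sketch makes no claim about how the rgw::io filters are actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical minimal client interface; the real RGW interface is much richer.
struct ClientIO {
  virtual ~ClientIO() = default;
  virtual std::size_t send_body(const std::string& data) = 0;
  virtual void complete() {}
};

// Innermost layer: stands in for the socket-facing client.
struct RawClient : ClientIO {
  std::string wire;  // everything that reached "the network"
  std::size_t send_body(const std::string& data) override {
    wire += data;
    return data.size();
  }
};

// Filter that frames each write as "<hex length>\r\n<data>\r\n".
struct ChunkingFilter : ClientIO {
  explicit ChunkingFilter(ClientIO* next) : next_(next) {}
  std::size_t send_body(const std::string& data) override {
    if (data.empty()) return 0;  // a zero-length chunk would end the stream
    std::ostringstream head;
    head << std::hex << data.size() << "\r\n";
    next_->send_body(head.str() + data + "\r\n");
    return data.size();
  }
  void complete() override {
    next_->send_body("0\r\n\r\n");  // terminating chunk
    next_->complete();
  }

 private:
  ClientIO* next_;
};

// Filter that holds data back until complete() is called.
struct BufferingFilter : ClientIO {
  explicit BufferingFilter(ClientIO* next) : next_(next) {}
  std::size_t send_body(const std::string& data) override {
    buf_ += data;
    return data.size();
  }
  void complete() override {
    next_->send_body(buf_);  // flush everything as one write
    buf_.clear();
    next_->complete();
  }

 private:
  ClientIO* next_;
  std::string buf_;
};
```

Wrapping the chunking filter inside the buffering filter, in the same outer-to-inner order as the original call chain, means two small writes leave as a single chunk once the request completes.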
rgw_process.cc/process_request
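process prepares the request and the environment needed to handle it, then calls process_request to do the work. That function is fairly long, so only the key snippets are shown: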
struct req_state rstate(g_ceph_context, &rgw_env, &userinfo);
struct req_state *s = &rstate;
......
RGWRESTMgr *mgr;
RGWHandler_REST *handler = rest->get_handler(store, s,
auth_registry,
frontend_prefix,
client_io, &mgr, &init_error);
......
ret = rgw_process_authenticated(handler, op, req, s);
......
client_io->complete_request();
......
RGWREST::get_handler
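process_request stores the request state and some necessary environment in the rstate object, then calls rest->get_handler to obtain the handler for the requested API. Note that rest here is the env.rest passed to process earlier, so let's trace what env.rest actually is.
Back in rgw_main.cc/main: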
RGWREST rest;
......
if (apis_map.count("s3") > 0 || s3website_enabled) {
if (! swift_at_root) {
rest.register_default_mgr(set_logging(rest_filter(store, RGW_REST_S3,new RGWRESTMgr_S3(s3website_enabled))));
} else {
derr << "Cannot have the S3 or S3 Website enabled together with "
<< "Swift API placed in the root of hierarchy" << dendl;
return EINVAL;
}
}
......
RGWProcessEnv env = { store, &rest, olog, 0, uri_prefix, auth_registry };
fe = new RGWCivetWebFrontend(env, config);
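The code above makes it clear: what env.rest holds depends on which APIs are configured. Next we continue the analysis of get_handler, using the S3 API as the example.
rest->get_handler (RGWHandler_REST* RGWREST::get_handler) is fairly complex, so only the key snippet is listed: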
RGWRESTMgr *m = mgr.get_manager(s, frontend_prefix, s->decoded_uri,&s->relative_uri);
RGWHandler_REST* handler = m->get_handler(s, auth_registry, frontend_prefix);
return handler;
RGWRESTMgr_S3::get_handler
As you can see, it in turn calls the get_handler of the manager for the specific API. For S3, that is RGWHandler_REST* RGWRESTMgr_S3::get_handler(..):
RGWHandler_REST* RGWRESTMgr_S3::get_handler(struct req_state* const s,
const rgw::auth::StrategyRegistry& auth_registry,
const std::string& frontend_prefix)
{
// decide between HTML and XML output based on the configuration
bool is_s3website = enable_s3website && (s->prot_flags & RGW_REST_WEBSITE);
int ret =
RGWHandler_REST_S3::init_from_header(s,
is_s3website ? RGW_FORMAT_HTML :
RGW_FORMAT_XML, true);
if (ret < 0)
return NULL;
RGWHandler_REST* handler;
// HTML-based handlers
if (is_s3website) {
// return a different handler depending on what the request targets
if (s->init_state.url_bucket.empty()) {
handler = new RGWHandler_REST_Service_S3Website(auth_registry);
} else if (s->object.empty()) {
handler = new RGWHandler_REST_Bucket_S3Website(auth_registry);
} else {
handler = new RGWHandler_REST_Obj_S3Website(auth_registry);
}
// XML-based handlers
} else {
// return a different handler depending on what the request targets
if (s->init_state.url_bucket.empty()) {
handler = new RGWHandler_REST_Service_S3(auth_registry);
} else if (s->object.empty()) {
handler = new RGWHandler_REST_Bucket_S3(auth_registry);
} else {
handler = new RGWHandler_REST_Obj_S3(auth_registry);
}
}
ldout(s->cct, 20) << __func__ << " handler=" << typeid(*handler).name()
<< dendl;
return handler;
}
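The selection logic above comes down to two decisions: the output format (HTML for website requests, XML otherwise) and the scope of the request (service, bucket, or object, depending on which names are empty). A condensed sketch of that decision follows. Request, Format, Handler and make_handler are hypothetical simplifications written for this example, with three handler classes parameterized by format instead of the six S3 classes.

```cpp
#include <cassert>
#include <memory>
#include <string>

enum class Format { XML, HTML };

// Hypothetical, stripped-down view of the request state.
struct Request {
  std::string bucket;   // empty when the URL names no bucket
  std::string object;   // empty when the URL names no object
  bool website_flag;    // the request arrived on the website endpoint
};

struct Handler {
  explicit Handler(Format f) : format(f) {}
  virtual ~Handler() = default;
  virtual std::string scope() const = 0;
  Format format;
};

struct ServiceHandler : Handler {
  using Handler::Handler;
  std::string scope() const override { return "service"; }
};

struct BucketHandler : Handler {
  using Handler::Handler;
  std::string scope() const override { return "bucket"; }
};

struct ObjectHandler : Handler {
  using Handler::Handler;
  std::string scope() const override { return "object"; }
};

// Pick the format first, then the handler type from what the request targets.
std::unique_ptr<Handler> make_handler(const Request& req, bool website_enabled) {
  const bool is_website = website_enabled && req.website_flag;
  const Format fmt = is_website ? Format::HTML : Format::XML;
  if (req.bucket.empty()) {
    return std::unique_ptr<Handler>(new ServiceHandler(fmt));
  }
  if (req.object.empty()) {
    return std::unique_ptr<Handler>(new BucketHandler(fmt));
  }
  return std::unique_ptr<Handler>(new ObjectHandler(fmt));
}
```

For instance, a request for bucket "photos" with no object name yields a bucket-scoped handler, and it only gets the HTML format when the website feature is enabled and the request carries the website flag.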
Back to rgw_process.cc/process_request
struct req_state rstate(g_ceph_context, &rgw_env, &userinfo);
struct req_state *s = &rstate;
......
RGWRESTMgr *mgr;
RGWHandler_REST *handler = rest->get_handler(store, s,
auth_registry,
frontend_prefix,
client_io, &mgr, &init_error);
......
// the analysis below starts from here
ret = rgw_process_authenticated(handler, op, req, s);
......
client_io->complete_request();
......
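Earlier we covered the first part of process_request, that is, how the handler is obtained.
Once the handler is obtained, and after various parameter checks and authorization, the request is actually executed in rgw_process_authenticated. When that finishes, complete_request is called to finish the request.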
rgw_process.cc/rgw_process_authenticated
This is the execution logic in rgw_process_authenticated:
req->log(s, "pre-executing");
op->pre_exec(); // assemble the response headers and send them to the client
req->log(s, "executing");
op->execute(); // run the operation
req->log(s, "completing");
op->complete(); // call send_response to return the result to the client
As for how op is obtained, a brief note:
op = handler->get_op(store);
get_op uses the request information to call the matching op_xxx function of the handler. For example, RGWHandler_REST_Obj_S3 implements the following operations:
RGWOp *op_get() override;
RGWOp *op_head() override;
RGWOp *op_put() override;
RGWOp *op_delete() override;
RGWOp *op_post() override;
RGWOp *op_options() override;
Each operation corresponds to a subclass of RGWOp, such as RGWGetObj_ObjStore_S3, RGWGetObjTags_ObjStore_S3, RGWListBucket_ObjStore_S3, and so on.
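Putting the last two steps together, the handler maps the HTTP method to an op object, and the caller then drives that op through the same three phases. The sketch below illustrates this flow under simplifying assumptions: Op, GetObjOp, PutObjOp, ObjHandler and run are hypothetical names for this example, the method is a plain string, and each phase only records its name instead of talking to a client.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class: shared pre_exec/complete, per-op execute.
struct Op {
  virtual ~Op() = default;
  virtual void pre_exec() { trace.push_back("pre_exec"); }
  virtual void execute() = 0;
  virtual void complete() { trace.push_back("complete"); }
  std::vector<std::string> trace;  // records the phases that ran, in order
};

struct GetObjOp : Op {
  void execute() override { trace.push_back("get"); }
};

struct PutObjOp : Op {
  void execute() override { trace.push_back("put"); }
};

// Hypothetical object-level handler with one factory method per HTTP method.
struct ObjHandler {
  std::unique_ptr<Op> op_get() { return std::unique_ptr<Op>(new GetObjOp()); }
  std::unique_ptr<Op> op_put() { return std::unique_ptr<Op>(new PutObjOp()); }

  // Choose the op from the request method; nullptr means "not supported".
  std::unique_ptr<Op> get_op(const std::string& method) {
    if (method == "GET") return op_get();
    if (method == "PUT") return op_put();
    return nullptr;
  }
};

// The fixed three-phase sequence every op goes through.
std::vector<std::string> run(Op& op) {
  op.pre_exec();
  op.execute();
  op.complete();
  return op.trace;
}
```

Because only execute differs between the two ops here, reading a new operation mostly means reading its execute function, which matches the reading strategy suggested at the end of the article.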
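At this point the path from the frontend to the execution of an operation is complete. From here you can read whichever operation you want to study in detail. You only need to look at the execute function of the corresponding op object, because pre_exec and complete are essentially the same across operations (see the comments in the rgw_process_authenticated snippet).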