PostgreSQL Executor (3): Executing Optimizable Statements

2019-05-19 DavidLi2010

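After the optimizer processes an optimizable statement, it produces a query plan tree, which the Executor then runs. The Executor exposes four interface functions: ExecutorStart, ExecutorRun, ExecutorFinish, and ExecutorEnd. Its input is a query descriptor (QueryDesc); its output is the result data or related execution information.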

The query descriptor bundles all the information the Executor needs to run a query. QueryDesc is defined in src/include/executor/execdesc.h.

To execute a query plan tree, it is enough to build a QueryDesc and call the four interface functions above in order.

All of the Executor's interface functions are called while a Portal executes. Take the PORTAL_ONE_SELECT execution strategy as an example:

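PortalStart first determines the execution strategy. For PORTAL_ONE_SELECT it creates a QueryDesc, stores the query plan tree in it, and then calls ExecutorStart to initialize the Executor.
PortalRun calls PortalRunSelect, which calls ExecutorRun to execute the query plan.
PortalDrop calls PortalCleanup, which calls ExecutorFinish and ExecutorEnd to clean up the environment and finally frees the QueryDesc.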
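
The call order in these three steps can be sketched with a toy model. Nothing below is PostgreSQL code: ToyQueryDesc, ToyPortal, and the toy_* functions are hypothetical names, and each "executor phase" only appends its name to a trace string so that the ordering becomes visible.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for QueryDesc: it only records which phases have run. */
typedef struct ToyQueryDesc
{
    char trace[64];             /* e.g. "Start,Run,Finish,End" */
} ToyQueryDesc;

/* Toy stand-in for a Portal that owns one query descriptor. */
typedef struct ToyPortal
{
    ToyQueryDesc *queryDesc;    /* NULL until toy_portal_start() */
} ToyPortal;

/* Append a phase name to the trace, comma-separated, never overflowing. */
static void
trace_phase(ToyQueryDesc *qd, const char *phase)
{
    size_t used = strlen(qd->trace);

    snprintf(qd->trace + used, sizeof(qd->trace) - used,
             "%s%s", used > 0 ? "," : "", phase);
}

/* The four executor phases, reduced to trace entries. */
static void toy_executor_start(ToyQueryDesc *qd)  { trace_phase(qd, "Start"); }
static void toy_executor_run(ToyQueryDesc *qd)    { trace_phase(qd, "Run"); }
static void toy_executor_finish(ToyQueryDesc *qd) { trace_phase(qd, "Finish"); }
static void toy_executor_end(ToyQueryDesc *qd)    { trace_phase(qd, "End"); }

/* Step 1: create the descriptor and initialize the executor. */
void
toy_portal_start(ToyPortal *portal)
{
    portal->queryDesc = calloc(1, sizeof(ToyQueryDesc));
    assert(portal->queryDesc != NULL);
    toy_executor_start(portal->queryDesc);
}

/* Step 2: run the plan. */
void
toy_portal_run(ToyPortal *portal)
{
    assert(portal->queryDesc != NULL);
    toy_executor_run(portal->queryDesc);
}

/*
 * Step 3: finish, end, then free the descriptor. The trace is copied to
 * 'out' first so that the caller can still inspect it afterwards.
 */
void
toy_portal_drop(ToyPortal *portal, char *out, size_t outlen)
{
    ToyQueryDesc *qd = portal->queryDesc;

    assert(qd != NULL);
    toy_executor_finish(qd);
    toy_executor_end(qd);
    snprintf(out, outlen, "%s", qd->trace);
    free(qd);
    portal->queryDesc = NULL;
}
```

Calling toy_portal_start, toy_portal_run, and toy_portal_drop in that order on a zero-initialized ToyPortal leaves the trace "Start,Run,Finish,End" in the output buffer. The real functions take more arguments and do far more work; the sketch only mirrors who calls what, and when.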

The Executor's Processing Model

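Executing a query plan tree really means processing each node in the tree. Every node of the plan tree represents one operation, and node processing is demand-driven. A node takes the data produced by its child nodes as input, applies its own logic, and returns result data to its parent. In the implementation, processing starts at the root, and each node calls its children's processing routines whenever it needs more data. This recursion drives the traversal and execution of the whole plan tree.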

Initialization and cleanup follow the same pattern: they start at the root and recurse into the child nodes.

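Each node in the plan tree is an operator that performs one concrete physical operation. In PostgreSQL an operator is defined to have zero to two inputs and one output, so all operators can be arranged in a binary tree in which the output of a lower node is the input of the node above it, up to the root, which emits the result data. Data (tuples) flows upward from the leaf nodes until the root finishes processing it.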

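In the Executor, three entry functions, ExecInitNode, ExecProcNode, and ExecEndNode, handle initialization, execution, and cleanup uniformly for all node types. Every node type implements its own initialization, execution, and cleanup functions, and the three entry functions invoke them recursively, starting from the root.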

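PostgreSQL executes one tuple at a time: each node returns a single tuple to its parent per call. The nodes of the query plan tree therefore form a pipeline, and executing the plan can be viewed as pulling tuples through that pipeline.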
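
The demand-driven, one-tuple-at-a-time model can be illustrated with a minimal sketch. This is not the real executor: ToyNode and toy_exec_proc_node are hypothetical, a "tuple" is a single int, and only three made-up node kinds exist (an array scan, a filter that keeps values at or above a threshold, and a limit). The point is that the root asks its child for a row only when it needs one, and the request propagates down recursively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum ToyTag { TOY_SCAN, TOY_FILTER, TOY_LIMIT } ToyTag;

/* One struct for all toy node kinds; unused fields stay zero. */
typedef struct ToyNode
{
    ToyTag          tag;
    struct ToyNode *left;       /* input node; NULL for a leaf */

    /* TOY_SCAN: source rows and read cursor */
    const int      *rows;
    size_t          nrows;
    size_t          next;

    /* TOY_FILTER: keep values >= min_value */
    int             min_value;

    /* TOY_LIMIT: maximum rows to emit, rows emitted so far */
    size_t          limit;
    size_t          emitted;
} ToyNode;

/*
 * Return the next row of 'node' in *out, or false when it is exhausted.
 * Each call produces at most one row and pulls from the child on demand.
 */
bool
toy_exec_proc_node(ToyNode *node, int *out)
{
    assert(node != NULL && out != NULL);

    switch (node->tag)
    {
        case TOY_SCAN:
            if (node->next >= node->nrows)
                return false;
            *out = node->rows[node->next++];
            return true;

        case TOY_FILTER:
            /* Keep pulling from the child until a row passes the test. */
            while (toy_exec_proc_node(node->left, out))
            {
                if (*out >= node->min_value)
                    return true;
            }
            return false;

        case TOY_LIMIT:
            /* Once the limit is reached, the child is never asked again. */
            if (node->emitted >= node->limit)
                return false;
            if (!toy_exec_proc_node(node->left, out))
                return false;
            node->emitted++;
            return true;
    }
    return false;
}
```

With a scan over {1, 5, 2, 7, 9, 3}, a filter with min_value 5, and a limit of 2 stacked on top, three calls on the limit node yield 5, then 7, then false. The scan cursor stops at index 4, so the last two rows are never read, which is the practical benefit of pulling rather than pushing.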

Plan Node Data Structures

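PostgreSQL designs its node data structures in an object-oriented style: every plan node inherits from Plan, which is the common abstract type for all plan nodes.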

Plan is defined in src/include/nodes/plannodes.h as follows:

/* ----------------
 *        Plan node
 *
 * All plan nodes "derive" from the Plan structure by having the
 * Plan structure as the first field.  This ensures that everything works
 * when nodes are cast to Plan's.  (node pointers are frequently cast to Plan*
 * when passed around generically in the executor)
 *
 * We never actually instantiate any Plan nodes; this is just the common
 * abstract superclass for all Plan-type nodes.
 * ----------------
 */
typedef struct Plan
{
    NodeTag     type;

    /*
     * estimated execution costs for plan (see costsize.c for more info)
     */
    Cost        startup_cost;   /* cost expended before fetching any tuples */
    Cost        total_cost;     /* total cost (assuming all tuples fetched) */

    /*
     * planner's estimate of result size of this plan step
     */
    double      plan_rows;      /* number of rows plan is expected to emit */
    int         plan_width;     /* average row width in bytes */

    /*
     * information needed for parallel query
     */
    bool        parallel_aware; /* engage parallel-aware logic? */
    bool        parallel_safe;  /* OK to use as part of parallel plan? */

    /*
     * Common structural data for all Plan types.
     */
    int         plan_node_id;   /* unique across entire final plan tree */
    List       *targetlist;     /* target list to be computed at this node */
    List       *qual;           /* implicitly-ANDed qual conditions */
    struct Plan *lefttree;      /* input plan tree(s) */
    struct Plan *righttree;
    List       *initPlan;       /* Init Plan nodes (un-correlated expr
                                 * subselects) */

    /*
     * Information for management of parameter-change-driven rescanning
     *
     * extParam includes the paramIDs of all external PARAM_EXEC params
     * affecting this plan node or its children.  setParam params from the
     * node's initPlans are not included, but their extParams are.
     *
     * allParam includes all the extParam paramIDs, plus the IDs of local
     * params that affect the node (i.e., the setParams of its initplans).
     * These are _all_ the PARAM_EXEC params that affect this node.
     */
    Bitmapset  *extParam;
    Bitmapset  *allParam;
} Plan;

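Plan defines the fields shared by all nodes, including the left and right subtrees (lefttree, righttree), the node type (type), the selection quals (qual), and the projection target list (targetlist).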
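
The "inheritance" mentioned in the header comment relies on a plain C idiom: a derived struct places the base struct as its first field, so a pointer to the derived struct can be cast to a pointer to the base. The following sketch shows the idiom in isolation. All Toy* names and their fields are hypothetical and much smaller than the real node types.

```c
#include <assert.h>
#include <stddef.h>

typedef enum ToyNodeTag { T_ToySeqScan, T_ToyLimit } ToyNodeTag;

/* Base "class": every derived node starts with this struct. */
typedef struct ToyPlan
{
    ToyNodeTag      type;
    double          plan_rows;
    struct ToyPlan *lefttree;
    struct ToyPlan *righttree;
} ToyPlan;

/* Derived node: the base struct must be the first field. */
typedef struct ToySeqScan
{
    ToyPlan      plan;
    unsigned int scanrelid;     /* extra field specific to this node */
} ToySeqScan;

typedef struct ToyLimit
{
    ToyPlan plan;
    long    limit_count;        /* extra field specific to this node */
} ToyLimit;

/* Generic code only sees ToyPlan pointers and the shared fields. */
int
toy_count_nodes(const ToyPlan *plan)
{
    if (plan == NULL)
        return 0;
    return 1 + toy_count_nodes(plan->lefttree)
             + toy_count_nodes(plan->righttree);
}

/* Node-specific code checks the tag, then casts back to the derived type. */
long
toy_limit_count(const ToyPlan *plan)
{
    assert(plan->type == T_ToyLimit);
    return ((const ToyLimit *) plan)->limit_count;
}
```

Because the base struct sits at offset 0, the cast between ToyLimit * and ToyPlan * does not change the address. The tag check before the downcast is what keeps the idiom safe: generic tree walkers never need to know the concrete type, and node-specific code verifies it first.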

PostgreSQL groups all plan nodes into four categories by function:

Control nodes
Scan nodes
Join nodes
Materialization nodes

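Of these, the scan and join categories define the common parent types Scan and Join. Each concrete node inherits from its common parent and adds fields specific to its own operation.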

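Nodes link to their children through the left and right subtree pointers. The pointer to the root node is stored in PlannedStmt, which in turn is stored in QueryDesc.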

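PostgreSQL defines a state node for each kind of plan node. As with plan nodes, all state nodes inherit from PlanState, which holds a pointer to the plan node, a pointer to the executor's global state structure, projection information, selection quals, and pointers to the left and right child state nodes. Together the state nodes form a state tree that mirrors the plan tree.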

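During executor initialization, ExecutorStart builds the executor's global state (EState) and the tree of plan state nodes from the query plan tree. While the plan tree executes, the executor uses the state nodes to record each plan node's execution state and data, and passes tuples between nodes through the global state. The executor's cleanup function, ExecutorEnd, releases the global state and the state nodes.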
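
The relationship between the plan tree and the state tree can be sketched as follows. This is a simplified illustration, not the real ExecInitNode or ExecEndNode: the Toy* types, their fields, and the live_states counter are hypothetical. Initialization recurses from the root and allocates one state node per plan node; cleanup recurses the same way and frees them.

```c
#include <assert.h>
#include <stdlib.h>

/* Read-only plan node (toy version). */
typedef struct ToyPlan
{
    int             plan_node_id;
    struct ToyPlan *lefttree;
    struct ToyPlan *righttree;
} ToyPlan;

/* Global state shared by every state node of one execution. */
typedef struct ToyEState
{
    int live_states;            /* number of state nodes currently allocated */
} ToyEState;

/* Mutable per-execution state for one plan node. */
typedef struct ToyPlanState
{
    const ToyPlan       *plan;      /* the plan node this state belongs to */
    ToyEState           *estate;    /* back-pointer to the global state */
    struct ToyPlanState *lefttree;
    struct ToyPlanState *righttree;
    long                 tuples_returned;   /* example of runtime data */
} ToyPlanState;

/* Build a state tree with the same shape as the plan tree. */
ToyPlanState *
toy_exec_init_node(const ToyPlan *plan, ToyEState *estate)
{
    ToyPlanState *ps;

    if (plan == NULL)
        return NULL;

    ps = calloc(1, sizeof(*ps));
    assert(ps != NULL);
    ps->plan = plan;
    ps->estate = estate;
    ps->lefttree = toy_exec_init_node(plan->lefttree, estate);
    ps->righttree = toy_exec_init_node(plan->righttree, estate);
    estate->live_states++;
    return ps;
}

/* Release the state tree; the plan tree itself is left untouched. */
void
toy_exec_end_node(ToyPlanState *ps)
{
    if (ps == NULL)
        return;
    toy_exec_end_node(ps->lefttree);
    toy_exec_end_node(ps->righttree);
    ps->estate->live_states--;
    free(ps);
}
```

Keeping the plan tree read-only and putting everything mutable into a separate state tree means that one plan can, in principle, be executed more than once, each time with a fresh state tree. In this sketch that separation is expressed by the const qualifier on the plan pointer.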
