Postgres兼容Oracle研究——orafce调研
一、背景
PostgreSQL是和Oracle最接近的企业数据库,包括数据类型,功能,架构和语法等几个方面。甚至大多数的日常应用的性能也不会输给Oracle。
但是Oracle有些函数或者包,默认PostgreSQL是没有的,需要安装orafce包来实现这些兼容性。
orafce是PostgreSQL的一个extension,主要是为PostgreSQL提供Oracle的部分语法、函数、字典表等兼容。
二、安装orafce
版本:orafce 3.7 + PostgreSQL 10.5
注意,目前GitHub的主线版本是3.7版本,未发布。
orafce源码:https://github.com/orafce/orafce
编译安装:
解压后进入源码目录执行 make & make install
进入postgresql执行 create extension orafce
【注意事项】
下载的源码中,orafce--3.7.sql文件中下面这段sql,需要将 'BASE_TABLE' 改成 'BASE TABLE' 再编译安装
create view oracle.user_tables as
select table_name
from information_schema.tables
where table_type = 'BASE_TABLE';
三、orafce包含的内容
- 类型 date, varchar2 and nvarchar2
- 函数 concat, nvl, nvl2, lnnvl, decode, bitand, nanvl, sinh, cosh, tanh and oracle.substr
- dual 表
- package :
dbms_alert
dbms_assert
dbms_output
dbms_pipe
dbms_random
dbms_utility
plunit
plvchr
plvdate
plvlex
plvstr
plvsubst
utl_file
##Oracle兼容 包列表:
##在PostgreSQL里用 schema+函数 来实现。
atlas=# \dn
List of schemas
Name | Owner
--------------+--------
dbms_alert | appusr
dbms_assert | appusr
dbms_output | appusr
dbms_pipe | appusr
dbms_random | appusr
dbms_utility | appusr
oracle | appusr
plunit | appusr
plvchr | appusr
plvdate | appusr
plvlex | appusr
plvstr | appusr
plvsubst | appusr
public | appusr
utl_file | appusr
(15 rows)
## 查看包
例如dbms_output包:
atlas=# \df dbms_output.*
List of functions
Schema | Name | Result data type | Argument data types | Type
-------------+--------------+------------------+------------------------------------------+--------
dbms_output | disable | void | | normal
dbms_output | enable | void | | normal
dbms_output | enable | void | buffer_size integer | normal
dbms_output | get_line | record | OUT line text, OUT status integer | normal
dbms_output | get_lines | record | OUT lines text[], INOUT numlines integer | normal
dbms_output | new_line | void | | normal
dbms_output | put | void | a text | normal
dbms_output | put_line | void | a text | normal
dbms_output | serveroutput | void | boolean | normal
(9 rows)
##Oracle兼容 公共函数 列表:
atlas=# \df
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+---------------------+-----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------
public | bitand | bigint | bigint, bigint | normal
public | cosh | double precision | double precision | normal
public | decode | bigint | anyelement, anyelement, bigint | normal
public | decode | bigint | anyelement, anyelement, bigint, anyelement, bigint | normal
public | decode | bigint | anyelement, anyelement, bigint, anyelement, bigint, anyelement, bigint | normal
public | decode | bigint | anyelement, anyelement, bigint, anyelement, bigint, anyelement, bigint, bigint | normal
public | decode | bigint | anyelement, anyelement, bigint, anyelement, bigint, bigint | normal
public | decode | bigint | anyelement, anyelement, bigint, bigint | normal
public | decode | character | anyelement, anyelement, character | normal
public | decode | character | anyelement, anyelement, character, anyelement, character | normal
public | decode | character | anyelement, anyelement, character, anyelement, character, anyelement, character | normal
public | decode | character | anyelement, anyelement, character, anyelement, character, anyelement, character, character | normal
public | decode | character | anyelement, anyelement, character, anyelement, character, character | normal
public | decode | character | anyelement, anyelement, character, character | normal
public | decode | date | anyelement, anyelement, date | normal
public | decode | date | anyelement, anyelement, date, anyelement, date | normal
public | decode | date | anyelement, anyelement, date, anyelement, date, anyelement, date | normal
public | decode | date | anyelement, anyelement, date, anyelement, date, anyelement, date, date | normal
public | decode | date | anyelement, anyelement, date, anyelement, date, date | normal
public | decode | date | anyelement, anyelement, date, date | normal
public | decode | integer | anyelement, anyelement, integer | normal
public | decode | integer | anyelement, anyelement, integer, anyelement, integer | normal
public | decode | integer | anyelement, anyelement, integer, anyelement, integer, anyelement, integer | normal
public | decode | integer | anyelement, anyelement, integer, anyelement, integer, anyelement, integer, integer | normal
public | decode | integer | anyelement, anyelement, integer, anyelement, integer, integer | normal
public | decode | integer | anyelement, anyelement, integer, integer | normal
public | decode | numeric | anyelement, anyelement, numeric | normal
public | decode | numeric | anyelement, anyelement, numeric, anyelement, numeric | normal
public | decode | numeric | anyelement, anyelement, numeric, anyelement, numeric, anyelement, numeric | normal
public | decode | numeric | anyelement, anyelement, numeric, anyelement, numeric, anyelement, numeric, numeric | normal
public | decode | numeric | anyelement, anyelement, numeric, anyelement, numeric, numeric | normal
public | decode | numeric | anyelement, anyelement, numeric, numeric | normal
public | decode | text | anyelement, anyelement, text | normal
public | decode | text | anyelement, anyelement, text, anyelement, text | normal
public | decode | text | anyelement, anyelement, text, anyelement, text, anyelement, text | normal
public | decode | text | anyelement, anyelement, text, anyelement, text, anyelement, text, text | normal
public | decode | text | anyelement, anyelement, text, anyelement, text, text | normal
public | decode | text | anyelement, anyelement, text, text | normal
public | decode | timestamp without time zone | anyelement, anyelement, timestamp without time zone | normal
public | decode | timestamp without time zone | anyelement, anyelement, timestamp without time zone, anyelement, timestamp without time zone | normal
public | decode | timestamp without time zone | anyelement, anyelement, timestamp without time zone, anyelement, timestamp without time zone, anyelement, timestamp without time zone | normal
public | decode | timestamp without time zone | anyelement, anyelement, timestamp without time zone, anyelement, timestamp without time zone, anyelement, timestamp without time zone, timestamp without time zone | normal
public | decode | timestamp without time zone | anyelement, anyelement, timestamp without time zone, anyelement, timestamp without time zone, timestamp without time zone | normal
public | decode | timestamp without time zone | anyelement, anyelement, timestamp without time zone, timestamp without time zone | normal
public | decode | timestamp with time zone | anyelement, anyelement, timestamp with time zone | normal
public | decode | timestamp with time zone | anyelement, anyelement, timestamp with time zone, anyelement, timestamp with time zone | normal
public | decode | timestamp with time zone | anyelement, anyelement, timestamp with time zone, anyelement, timestamp with time zone, anyelement, timestamp with time zone | normal
public | decode | timestamp with time zone | anyelement, anyelement, timestamp with time zone, anyelement, timestamp with time zone, anyelement, timestamp with time zone, timestamp with time zone | normal
public | decode | timestamp with time zone | anyelement, anyelement, timestamp with time zone, anyelement, timestamp with time zone, timestamp with time zone | normal
public | decode | timestamp with time zone | anyelement, anyelement, timestamp with time zone, timestamp with time zone | normal
public | decode | time without time zone | anyelement, anyelement, time without time zone | normal
public | decode | time without time zone | anyelement, anyelement, time without time zone, anyelement, time without time zone | normal
public | decode | time without time zone | anyelement, anyelement, time without time zone, anyelement, time without time zone, anyelement, time without time zone | normal
public | decode | time without time zone | anyelement, anyelement, time without time zone, anyelement, time without time zone, anyelement, time without time zone, time without time zone | normal
public | decode | time without time zone | anyelement, anyelement, time without time zone, anyelement, time without time zone, time without time zone | normal
public | decode | time without time zone | anyelement, anyelement, time without time zone, time without time zone | normal
public | dump | character varying | "any" | normal
public | dump | character varying | "any", integer | normal
public | dump | character varying | text | normal
public | dump | character varying | text, integer | normal
public | nanvl | double precision | double precision, character varying | normal
public | nanvl | double precision | double precision, double precision | normal
public | nanvl | numeric | numeric, character varying | normal
public | nanvl | numeric | numeric, numeric | normal
public | nanvl | real | real, character varying | normal
public | nanvl | real | real, real | normal
public | nvarchar2 | nvarchar2 | nvarchar2, integer, boolean | normal
public | nvarchar2_transform | internal | internal | normal
public | nvarchar2in | nvarchar2 | cstring, oid, integer | normal
public | nvarchar2out | cstring | nvarchar2 | normal
public | nvarchar2recv | nvarchar2 | internal, oid, integer | normal
public | nvarchar2send | bytea | nvarchar2 | normal
public | nvarchar2typmodin | integer | cstring[] | normal
public | nvarchar2typmodout | cstring | integer | normal
public | nvl | anyelement | anyelement, anyelement | normal
public | nvl2 | anyelement | anyelement, anyelement, anyelement | normal
public | sinh | double precision | double precision | normal
public | tanh | double precision | double precision | normal
public | to_multi_byte | text | str text | normal
public | to_single_byte | text | str text | normal
public | varchar2 | varchar2 | varchar2, integer, boolean | normal
public | varchar2_transform | internal | internal | normal
public | varchar2in | varchar2 | cstring, oid, integer | normal
public | varchar2out | cstring | varchar2 | normal
public | varchar2recv | varchar2 | internal, oid, integer | normal
public | varchar2send | bytea | varchar2 | normal
public | varchar2typmodin | integer | cstring[] | normal
public | varchar2typmodout | cstring | integer | normal
(88 rows)
atlas=# \df oracle.*
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+-----------------------+-----------------------------+----------------------------------------------------+--------
oracle | add_days_to_timestamp | timestamp without time zone | oracle.date, bigint | normal
oracle | add_days_to_timestamp | timestamp without time zone | oracle.date, integer | normal
oracle | add_days_to_timestamp | timestamp without time zone | oracle.date, numeric | normal
oracle | add_days_to_timestamp | timestamp without time zone | oracle.date, smallint | normal
oracle | add_months | timestamp without time zone | timestamp with time zone, integer | normal
oracle | btrim | text | character | normal
oracle | btrim | text | character, character | normal
oracle | btrim | text | character, nvarchar2 | normal
oracle | btrim | text | character, text | normal
oracle | btrim | text | character, varchar2 | normal
oracle | btrim | text | nvarchar2 | normal
oracle | btrim | text | nvarchar2, character | normal
oracle | btrim | text | nvarchar2, nvarchar2 | normal
oracle | btrim | text | nvarchar2, text | normal
oracle | btrim | text | nvarchar2, varchar2 | normal
oracle | btrim | text | text | normal
oracle | btrim | text | text, character | normal
oracle | btrim | text | text, nvarchar2 | normal
oracle | btrim | text | text, text | normal
oracle | btrim | text | text, varchar2 | normal
oracle | btrim | text | varchar2 | normal
oracle | btrim | text | varchar2, character | normal
oracle | btrim | text | varchar2, nvarchar2 | normal
oracle | btrim | text | varchar2, text | normal
oracle | btrim | text | varchar2, varchar2 | normal
oracle | dbtimezone | text | | normal
oracle | get_full_version_num | text | | normal
oracle | get_major_version | text | | normal
oracle | get_major_version_num | text | | normal
oracle | get_platform | text | | normal
oracle | get_status | text | | normal
oracle | last_day | timestamp without time zone | timestamp with time zone | normal
oracle | length | integer | character | normal
oracle | lpad | text | bigint, integer, integer | normal
oracle | lpad | text | character, integer | normal
oracle | lpad | text | character, integer, character | normal
oracle | lpad | text | character, integer, nvarchar2 | normal
oracle | lpad | text | character, integer, text | normal
oracle | lpad | text | character, integer, varchar2 | normal
oracle | lpad | text | integer, integer, integer | normal
oracle | lpad | text | numeric, integer, integer | normal
oracle | lpad | text | nvarchar2, integer | normal
oracle | lpad | text | nvarchar2, integer, character | normal
oracle | lpad | text | nvarchar2, integer, nvarchar2 | normal
oracle | lpad | text | nvarchar2, integer, text | normal
oracle | lpad | text | nvarchar2, integer, varchar2 | normal
oracle | lpad | text | smallint, integer, integer | normal
oracle | lpad | text | text, integer | normal
oracle | lpad | text | text, integer, character | normal
oracle | lpad | text | text, integer, nvarchar2 | normal
oracle | lpad | text | text, integer, text | normal
oracle | lpad | text | text, integer, varchar2 | normal
oracle | lpad | text | varchar2, integer | normal
oracle | lpad | text | varchar2, integer, character | normal
oracle | lpad | text | varchar2, integer, nvarchar2 | normal
oracle | lpad | text | varchar2, integer, text | normal
oracle | lpad | text | varchar2, integer, varchar2 | normal
oracle | ltrim | text | character | normal
oracle | ltrim | text | character, character | normal
oracle | ltrim | text | character, nvarchar2 | normal
oracle | ltrim | text | character, text | normal
oracle | ltrim | text | character, varchar2 | normal
oracle | ltrim | text | nvarchar2 | normal
oracle | ltrim | text | nvarchar2, character | normal
oracle | ltrim | text | nvarchar2, nvarchar2 | normal
oracle | ltrim | text | nvarchar2, text | normal
oracle | ltrim | text | nvarchar2, varchar2 | normal
oracle | ltrim | text | text | normal
oracle | ltrim | text | text, character | normal
oracle | ltrim | text | text, nvarchar2 | normal
oracle | ltrim | text | text, text | normal
oracle | ltrim | text | text, varchar2 | normal
oracle | ltrim | text | varchar2 | normal
oracle | ltrim | text | varchar2, character | normal
oracle | ltrim | text | varchar2, nvarchar2 | normal
oracle | ltrim | text | varchar2, text | normal
oracle | ltrim | text | varchar2, varchar2 | normal
oracle | months_between | numeric | timestamp with time zone, timestamp with time zone | normal
oracle | next_day | timestamp without time zone | timestamp with time zone, integer | normal
oracle | next_day | timestamp without time zone | timestamp with time zone, text | normal
oracle | numtodsinterval | interval | double precision, text | normal
oracle | nvl | bigint | bigint, integer | normal
oracle | nvl | numeric | numeric, integer | normal
oracle | round | numeric | double precision, integer | normal
oracle | round | numeric | real, integer | normal
oracle | rpad | text | character, integer | normal
oracle | rpad | text | character, integer, character | normal
oracle | rpad | text | character, integer, nvarchar2 | normal
oracle | rpad | text | character, integer, text | normal
oracle | rpad | text | character, integer, varchar2 | normal
oracle | rpad | text | nvarchar2, integer | normal
oracle | rpad | text | nvarchar2, integer, character | normal
oracle | rpad | text | nvarchar2, integer, nvarchar2 | normal
oracle | rpad | text | nvarchar2, integer, text | normal
oracle | rpad | text | nvarchar2, integer, varchar2 | normal
oracle | rpad | text | text, integer | normal
oracle | rpad | text | text, integer, character | normal
oracle | rpad | text | text, integer, nvarchar2 | normal
oracle | rpad | text | text, integer, text | normal
oracle | rpad | text | text, integer, varchar2 | normal
oracle | rpad | text | varchar2, integer | normal
oracle | rpad | text | varchar2, integer, character | normal
oracle | rpad | text | varchar2, integer, nvarchar2 | normal
oracle | rpad | text | varchar2, integer, text | normal
oracle | rpad | text | varchar2, integer, varchar2 | normal
oracle | rtrim | text | character | normal
oracle | rtrim | text | character, character | normal
oracle | rtrim | text | character, nvarchar2 | normal
oracle | rtrim | text | character, text | normal
oracle | rtrim | text | character, varchar2 | normal
oracle | rtrim | text | nvarchar2 | normal
oracle | rtrim | text | nvarchar2, character | normal
oracle | rtrim | text | nvarchar2, nvarchar2 | normal
oracle | rtrim | text | nvarchar2, text | normal
oracle | rtrim | text | nvarchar2, varchar2 | normal
oracle | rtrim | text | text | normal
oracle | rtrim | text | text, character | normal
oracle | rtrim | text | text, nvarchar2 | normal
oracle | rtrim | text | text, text | normal
oracle | rtrim | text | text, varchar2 | normal
oracle | rtrim | text | varchar2 | normal
oracle | rtrim | text | varchar2, character | normal
oracle | rtrim | text | varchar2, nvarchar2 | normal
oracle | rtrim | text | varchar2, text | normal
oracle | rtrim | text | varchar2, varchar2 | normal
oracle | sessiontimezone | text | | normal
oracle | substr | text | character varying, numeric | normal
oracle | substr | text | character varying, numeric, numeric | normal
oracle | substr | text | numeric, numeric | normal
oracle | substr | text | numeric, numeric, numeric | normal
oracle | substr | text | str text, start integer | normal
oracle | substr | text | str text, start integer, len integer | normal
oracle | subtract | timestamp without time zone | oracle.date, bigint | normal
oracle | subtract | timestamp without time zone | oracle.date, integer | normal
oracle | subtract | timestamp without time zone | oracle.date, numeric | normal
oracle | subtract | double precision | oracle.date, oracle.date | normal
oracle | subtract | timestamp without time zone | oracle.date, smallint | normal
oracle | sysdate | oracle.date | | normal
oracle | to_char | text | timestamp without time zone | normal
oracle | to_date | oracle.date | text | normal
oracle | to_date | oracle.date | text, text | normal
oracle | trunc | numeric | double precision, integer | normal
oracle | trunc | numeric | real, integer | normal
(143 rows)
## Oracle兼容 系统表 视图:
atlas=# \dv oracle.*
List of relations
Schema | Name | Type | Owner
--------+---------------------------+------+--------
oracle | dba_segments | view | appusr
oracle | product_component_version | view | appusr
oracle | user_cons_columns | view | appusr
oracle | user_constraints | view | appusr
oracle | user_ind_columns | view | appusr
oracle | user_objects | view | appusr
oracle | user_procedures | view | appusr
oracle | user_source | view | appusr
oracle | user_tab_columns | view | appusr
oracle | user_tables | view | appusr
oracle | user_views | view | appusr
(11 rows)
## Oracle兼容 dual表,在PG里用了一个视图来实现。
atlas=# \dv
List of relations
Schema | Name | Type | Owner
--------+------+------+--------
public | dual | view | appusr
(1 row)
atlas=# \d+ dual
View "public.dual"
Column | Type | Collation | Nullable | Default | Storage | Description
--------+-------------------+-----------+----------+---------+----------+-------------
dummy | character varying | | | | extended |
View definition:
SELECT 'X'::character varying AS dummy;
atlas=# select * from dual;
dummy
-------
X
(1 row)
atlas=# select 1 from dual;
?column?
----------
1
(1 row)
四、实现过程
1.添加自定义类型
orafce添加了varchar2 和nvarchar2两种类型,varchar2的实现过程如下:
## 自定义类型
/* CREATE TYPE */
CREATE TYPE varchar2 (
internallength = VARIABLE,
input = varchar2in,
output = varchar2out,
receive = varchar2recv,
send = varchar2send,
category = 'S',
typmod_in = varchar2typmodin,
typmod_out = varchar2typmodout,
collatable = true
);
## 创建转换函数
/* CREATE CAST */
CREATE CAST (varchar2 AS text)
WITHOUT FUNCTION
AS IMPLICIT;
CREATE CAST (text AS varchar2)
WITHOUT FUNCTION
AS IMPLICIT;
CREATE CAST (varchar2 AS char)
WITHOUT FUNCTION
AS IMPLICIT;
...
## 其他varchar2的操作函数:
## 此函数使用pg内核提供的C函数,部分函数会采用orafce动态库自定义的C函数
CREATE OR REPLACE FUNCTION pg_catalog.substrb(varchar2, integer, integer) RETURNS varchar2
AS 'bytea_substr'
LANGUAGE internal
STRICT IMMUTABLE;
...
2.对date类型增加oracle相关的特性
orafce为postgres的date类型做了增强,尽量减少oracle迁移至postgres的代码修改量
## 创建oracle对应的date类型:
CREATE DOMAIN oracle.date AS timestamp(0);
## 对oracle.date类型增强操作符支持
CREATE OPERATOR oracle.+ (
LEFTARG = oracle.date,
RIGHTARG = INTEGER,
PROCEDURE = oracle.add_days_to_timestamp
);
CREATE OPERATOR oracle.- (
LEFTARG = oracle.date,
RIGHTARG = INTEGER,
PROCEDURE = oracle.subtract
);
CREATE OPERATOR oracle.+ (
LEFTARG = oracle.date,
RIGHTARG = bigint,
PROCEDURE = oracle.add_days_to_timestamp
);
CREATE OPERATOR oracle.- (
LEFTARG = oracle.date,
RIGHTARG = bigint,
PROCEDURE = oracle.subtract
);
CREATE OPERATOR oracle.+ (
LEFTARG = oracle.date,
RIGHTARG = smallint,
PROCEDURE = oracle.add_days_to_timestamp
);
CREATE OPERATOR oracle.- (
LEFTARG = oracle.date,
RIGHTARG = smallint,
PROCEDURE = oracle.subtract
);
CREATE OPERATOR oracle.+ (
LEFTARG = oracle.date,
RIGHTARG = numeric,
PROCEDURE = oracle.add_days_to_timestamp
);
CREATE OPERATOR oracle.- (
LEFTARG = oracle.date,
RIGHTARG = numeric,
PROCEDURE = oracle.subtract
);
CREATE OPERATOR oracle.- (
LEFTARG = oracle.date,
RIGHTARG = oracle.date,
PROCEDURE = oracle.subtract
);
## 操作符实现(以 date + int 为例):
CREATE OR REPLACE FUNCTION oracle.add_days_to_timestamp(oracle.date,integer)
RETURNS timestamp AS $$
SELECT $1 + interval '1 day' * $2;
$$ LANGUAGE SQL IMMUTABLE;
...
3.用视图代替oracle的dual虚表
在oracle中,查询系统变量或者函数返回值时,经常会用到虚表dual。
例如:
select 1 from dual
但是在postgres中,对应的sql为:
select 1
orafce为了兼容oracle的dual用法,添加了一个名为dual的视图,并授权给public:
CREATE VIEW public.dual AS SELECT 'X'::varchar AS dummy;
REVOKE ALL ON public.dual FROM PUBLIC;
GRANT SELECT, REFERENCES ON public.dual TO PUBLIC;
4.oracle的sysdate实现
在oracle中,提供了部分系统变量,可以通过select获取其中值。例如oracle的sysdate
## Oracle 数据库使用
select sysdate from dual
在postgres中,没有sysdate,为了实现此功能,提供一个oracle.sysdate()函数。实现思路:
CREATE FUNCTION oracle.sysdate()
RETURNS oracle.date
AS 'MODULE_PATHNAME','orafce_sysdate'
LANGUAGE C STABLE STRICT;
COMMENT ON FUNCTION oracle.sysdate() IS 'Ruturns statement timestamp at server time zone';
注意,这个函数是C语言函数,由动态库实现过程,代码如下:
/* src:datefce.c */
/********************************************************************
*
* ora_sysdate - sysdate
*
* Syntax:
*
* timestamp sysdate()
*
* Purpose:
*
* Returns statement_timestamp in server time zone
* Note - server time zone doesn't exists on PostgreSQL - emulated
* by orafce_timezone
*
********************************************************************/
Datum
orafce_sysdate(PG_FUNCTION_ARGS)
{
Datum sysdate;
Datum sysdate_scaled;
sysdate = DirectFunctionCall2(timestamptz_zone,
CStringGetTextDatum(orafce_timezone),
TimestampTzGetDatum(GetCurrentStatementStartTimestamp()));
/* necessary to cast to timestamp(0) to emulate Oracle's date */
sysdate_scaled = DirectFunctionCall2(timestamp_scale,
sysdate,
Int32GetDatum(0));
PG_RETURN_DATUM(sysdate_scaled);
}
postgres+orafce的用法:
select oracle.sysdate() from dual
5.语法解析
整个orafce都没有实现hook的切入。使用extension的用意,估计是为了初始化和挂载so文件。
因为在“create extension orafce”,就会创建所有的自定义类型、函数、字典表等对象。
但是在orafce的源码中,却发现了词法分析和语法分析两个文件:
image.png
经过代码分析,语法分析时作为函数调用的,对应的函数为 plvlex.tokens。
其实现过程:
## 创建函数plvlex.tokens,为C语言函数
CREATE SCHEMA plvlex;
CREATE FUNCTION plvlex.tokens(IN str text, IN skip_spaces bool, IN qualified_names bool,
OUT pos int, OUT token text, OUT code int, OUT class text, OUT separator text, OUT mod text)
RETURNS SETOF RECORD
AS 'MODULE_PATHNAME','plvlex_tokens'
LANGUAGE C IMMUTABLE STRICT;
COMMENT ON FUNCTION plvlex.tokens(text,bool,bool) IS 'Parse SQL string';
## C语言实现代码
Datum
plvlex_tokens(PG_FUNCTION_ARGS)
{
#ifdef _MSC_VER
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("plvlex.tokens is not available in the built")));
PG_RETURN_VOID();
#else
FuncCallContext *funcctx;
TupleDesc tupdesc;
TupleTableSlot *slot;
AttInMetadata *attinmeta;
tokensFctx *fctx;
if (SRF_IS_FIRSTCALL ())
{
MemoryContext oldcontext;
List *lexems;
text *src = PG_GETARG_TEXT_P(0);
bool skip_spaces = PG_GETARG_BOOL(1);
bool qnames = PG_GETARG_BOOL(2);
/* 此处调用了语法分析器 */
orafce_sql_scanner_init(CSTRING(src));
if (orafce_sql_yyparse(&lexems) != 0)
orafce_sql_yyerror(NULL, "bogus input");
orafce_sql_scanner_finish();
funcctx = SRF_FIRSTCALL_INIT ();
oldcontext = MemoryContextSwitchTo (funcctx->multi_call_memory_ctx);
fctx = (tokensFctx*) palloc (sizeof (tokensFctx));
funcctx->user_fctx = (void *)fctx;
fctx->nodes = filterList(lexems, skip_spaces, qnames);
fctx->nnodes = list_length(fctx->nodes);
fctx->cnode = 0;
fctx->values = (char **) palloc (6 * sizeof (char *));
fctx->values [0] = (char*) palloc (16 * sizeof (char));
fctx->values [1] = (char*) palloc (1024 * sizeof (char));
fctx->values [2] = (char*) palloc (16 * sizeof (char));
fctx->values [3] = (char*) palloc (16 * sizeof (char));
fctx->values [4] = (char*) palloc (255 * sizeof (char));
fctx->values [5] = (char*) palloc (255 * sizeof (char));
tupdesc = CreateTemplateTupleDesc (6 , false);
TupleDescInitEntry (tupdesc, 1, "start_pos", INT4OID, -1, 0);
TupleDescInitEntry (tupdesc, 2, "token", TEXTOID, -1, 0);
TupleDescInitEntry (tupdesc, 3, "keycode", INT4OID, -1, 0);
TupleDescInitEntry (tupdesc, 4, "class", TEXTOID, -1, 0);
TupleDescInitEntry (tupdesc, 5, "separator", TEXTOID, -1, 0);
TupleDescInitEntry (tupdesc, 6, "mod", TEXTOID, -1, 0);
slot = TupleDescGetSlot (tupdesc);
funcctx -> slot = slot;
attinmeta = TupleDescGetAttInMetadata (tupdesc);
funcctx -> attinmeta = attinmeta;
MemoryContextSwitchTo (oldcontext);
}
funcctx = SRF_PERCALL_SETUP ();
fctx = (tokensFctx*) funcctx->user_fctx;
while (fctx->cnode < fctx->nnodes)
{
char **values;
Datum result;
HeapTuple tuple;
char *back_vals[6];
orafce_lexnode *nd = (orafce_lexnode*) list_nth(fctx->nodes, fctx->cnode++);
values = fctx->values;
back_vals[2] = values[2];
back_vals[4] = values[4];
back_vals[5] = values[5];
snprintf(values[0], 16, "%d", nd->lloc);
snprintf(values[1], 10000, "%s", SF(nd->str));
snprintf(values[2], 16, "%d", nd->keycode);
snprintf(values[3], 16, "%s", nd->classname);
snprintf(values[4], 255, "%s", SF(nd->sep));
snprintf(values[5], 48, "%s", SF(nd->modificator));
if (nd->keycode == -1)
values[2] = NULL;
if (!nd->sep)
values[4] = NULL;
if (!nd->modificator)
values[5] = NULL;
tuple = BuildTupleFromCStrings (funcctx -> attinmeta,
fctx -> values);
result = TupleGetDatum (funcctx -> slot, tuple);
values[2] = back_vals[2];
values[4] = back_vals[4];
values[5] = back_vals[5];
SRF_RETURN_NEXT (funcctx, result);
}
SRF_RETURN_DONE (funcctx);
#endif
}
orafce的函数plvlex.tokens的用法:
atlas=# select * from plvlex.tokens('select * from a.b.c join d ON x=y', true, true);
pos | token | code | class | separator | mod
-----+--------+------+---------+-----------+------
0 | select | 597 | KEYWORD | |
7 | * | 42 | OTHERS | | self
9 | from | 417 | KEYWORD | |
14 | a.b.c | | IDENT | |
20 | join | 464 | KEYWORD | |
25 | d | | IDENT | |
27 | on | 521 | KEYWORD | |
30 | x | | IDENT | |
31 | = | 61 | OTHERS | | self
32 | y | | IDENT | |
(10 rows)
对plvlex.tokens的官方说明:
Package PLVlex
This package isn’t compatible with original PLVlex.
Warning: Keyword’s codes can be changed between PostgreSQL versions! o plvlex.tokens(str text, skip_spaces bool, qualified_names bool) - Returns table of lexical elements in str.
总结
orafce的实现都是基于函数、视图来实现的。所以如果要做语法兼容,orafce的做法是无法实现的。因为postgres的语法分析在调用视图和函数之前。必须要在语法分析之前切入hook才能使用extension的实现做语法兼容性。未来postgres也许会提供相应的hook切入点,orafce实现更加完美的oracle兼容性。