Optimizing Oracle JDBC Metadata Retrieval
1. Background
A business requirement calls for a Java program that automatically collects metadata (tables, columns, indexes, constraints, and so on) from Oracle databases.
In some customer environments, collection was extremely slow, as low as 6 tables per minute even with 4 threads working in parallel; converted to a single thread, each table's metadata took roughly 30-40 s. At that rate, for a customer database holding 200,000 tables, a full collection would take about: 200,000 tables * 32 s per table / 4 threads = 1,600,000 s = 444.44 hours = 18.52 days.
Spending over 18 days just to collect the metadata of 200K tables would hold up other business and operations.
2. Root-Cause Analysis
Since the customer could not accept such slow processing, they worked with us to investigate, capturing an Oracle AWR report during a metadata scan for us to analyze.
A complete AWR report contains most of the processing information from an Oracle database.
image.png
Looking at the SQL Statistics section:
image.png
Looking at SQL ordered by Elapsed Time:
image.png
The most expensive SQL statement is visible here: it ran 450 times within one hour, averaging 29 seconds per execution.
The full SQL text:
image.png
SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C WHERE T.OWNER LIKE :B3 ESCAPE '/' AND 
T.TABLE_NAME LIKE :B2 ESCAPE '/' AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION
Formatted:
SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003,
DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93),
DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103),
DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME,
DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH,
DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL,
DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN
FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C
WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/' AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+)
ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
SQL performance analysis:
- an outer join between two views
- a subquery
- LIKE (fuzzy) matching
- heavy use of Oracle's DECODE function
- a sort on 3 columns
Where the SQL comes from
Judging from its result set, the query fetches the column information of one table, but it is not SQL written anywhere in our code.
Going back to the program, the code that fetches Oracle column information looks like this:
...
/**
 * Fetch extended column information.
 *
 * @param conn
 *            database connection
 * @param catalog
 *            database (catalog) name
 * @param schema
 *            user (schema) name
 * @param tabname
 *            table name
 * @return map of extended column information
 */
protected Map<String, Map> getColumnExt(Connection conn, String catalog, String schema, String tabname) {
Map<String, Map> m_ext = new HashMap<String, Map>();
String sql = "select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used"
+ " from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=?";
String colname, coltype, virtualCol, charUsed;
long charLength, dataLength;
boolean isPrecNull, isScaleNull;
try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
pstmt.setFetchSize(FETCH_SIZE);
pstmt.setString(1, schema);
pstmt.setString(2, tabname);
try (ResultSet rs = pstmt.executeQuery()) {
while (rs.next()) {
...
}
}
} catch (SQLException ex) {
...
throw new RuntimeException(ex);
}
return m_ext;
}
// Fetch all columns of a table together with their definitions
protected List<Document> getColumns(Connection conn, DatabaseMetaData metadata, String catalog, String schemaname, String tabname, String tabid, String dbcharset) {
...
ResultSet rs = null;
try {
...
Map<String, Map> m_ext = this.getColumnExt(conn, catalog, schemaname, tabname);
...
rs = metadata.getColumns(catalog, schema_tmp, table_tmp, null);
rs.setFetchSize(1);
...
while (rs.next()) {
...
}
} catch (SQLException e) {
...
throw new RuntimeException(e);
} finally {
if (rs != null) { // close the result set
try {
rs.close();
} catch (SQLException e) {
logger.error("Resultset close failed with " + e.getMessage(), e);
}
}
}
return cols;
}
...
The most expensive SQL identified above is issued internally by the Oracle JDBC driver when the following call runs:
rs = metadata.getColumns(catalog, schema_tmp, table_tmp, null);
So besides that driver-internal SQL, fetching one table's information also executes the following query:
select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=?
Summary
Fetching the column information of a single table queries the database twice, and one of those queries is extremely inefficient, averaging close to 30 seconds.
3. Optimization
Sub-goals:
- Eliminate LIKE matching: Oracle fuzzy matching generally means a full scan plus pattern matching, which is expensive
- Eliminate the subquery: it is inefficient here
- Eliminate the DECODE calls: fewer function calls means less processing load on the database
- Reduce server-side sorting
- Minimize joins between tables and views
- Reduce the number of queries
Overall goal:
- Cut the time and resource cost of fetching column information and raise throughput
However, this SQL is generated inside the Oracle JDBC driver's implementation of the JDBC metadata API, so optimizing it means first abandoning the standard JDBC metadata interface.
Consider the following two queries:
# Query 1
select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=? ;
# Query 2
SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003,
DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93),
DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103),
DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME,
DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH,
DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL,
DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN
FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C
WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/' AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+)
ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
Tables and views involved:
- all_tab_cols
- ALL_TAB_COLUMNS
- ALL_COL_COMMENTS
- ALL_TYPES (in the subquery)
Implementing the sub-goals one by one:
1. Eliminate LIKE matching
Based on the business requirements, adjust the complex SQL and drop the fuzzy matching:
...
WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/' AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+)
...
becomes
...
WHERE T.OWNER = :B3 AND T.TABLE_NAME = :B2 AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+)
...
2. Eliminate the subquery and the DECODE calls
The complex SQL contains the following subquery, wrapped in many layers of DECODE:
DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003,
DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93),
DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103),
DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE
Analysis:
The subquery reads ALL_TYPES, Oracle's type catalog, and the nested DECODE calls then branch on the values of DATA_TYPE and TYPECODE to return the corresponding JDBC DATA_TYPE code.
Optimization:
- Replace the DECODE cascade with an equivalent Java function.
- Cache the contents of ALL_TYPES on the client, so the correlated lookup no longer incurs any extra database cost.
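The DECODE cascade translates almost mechanically into Java. A minimal sketch under that assumption (class and method names are illustrative, not the project's actual code; the OBJECT/COLLECTION branch that consults ALL_TYPES would read from the client-side cache instead of the correlated subquery):

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: replaces the driver's DECODE cascade with a plain
// Java lookup. Names here are hypothetical.
public class OracleTypeMapper {
    private static final Map<String, Integer> SIMPLE = new HashMap<>();
    static {
        SIMPLE.put("VARCHAR2", Types.VARCHAR);   // 12
        SIMPLE.put("NUMBER",   Types.NUMERIC);   // 2
        SIMPLE.put("CHAR",     Types.CHAR);      // 1
        SIMPLE.put("DATE",     Types.TIMESTAMP); // 93, same as the DECODE above
        SIMPLE.put("CLOB",     Types.CLOB);      // 2005
        SIMPLE.put("BLOB",     Types.BLOB);      // 2004
        SIMPLE.put("FLOAT",    Types.FLOAT);     // 6
        SIMPLE.put("RAW",      Types.VARBINARY); // -3
        // ... the remaining entries follow the DECODE list above
    }

    /** Maps an Oracle DATA_TYPE string to a java.sql.Types code. */
    public static int toJdbcType(String dataType) {
        if (dataType.startsWith("TIMESTAMP")) {
            if (dataType.endsWith("WITH LOCAL TIME ZONE")) return -102; // TIMESTAMPLTZ
            if (dataType.endsWith("WITH TIME ZONE"))       return -101; // TIMESTAMPTZ
            return Types.TIMESTAMP;                                     // 93
        }
        if (dataType.startsWith("INTERVAL DAY"))  return -104;
        if (dataType.startsWith("INTERVAL YEAR")) return -103;
        Integer code = SIMPLE.get(dataType);
        return code != null ? code : Types.OTHER; // 1111, the DECODE fallback
    }
}
```

This moves the per-row branching out of the database entirely; the query then returns the raw DATA_TYPE string and the mapping runs on the client.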
3. Reduce server-side sorting
Each call currently fetches a single table, so
ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
becomes
ORDER BY ORDINAL_POSITION;
The database sort could even be dropped entirely in favor of sorting in Java (not implemented yet).
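The client-side sort would be straightforward. A sketch of the idea, assuming a hypothetical row holder (not the project's actual classes):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: drop the ORDER BY entirely and sort the fetched rows
// in Java instead. "ColumnRow" is a hypothetical holder.
public class ColumnSorter {
    static class ColumnRow {
        final int columnId;      // COLUMN_ID from ALL_TAB_COLS
        final String columnName;
        ColumnRow(int columnId, String columnName) {
            this.columnId = columnId;
            this.columnName = columnName;
        }
    }

    /** Returns a copy of rows ordered by COLUMN_ID, i.e. ordinal position. */
    public static List<ColumnRow> sortByOrdinal(List<ColumnRow> rows) {
        List<ColumnRow> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt(r -> r.columnId));
        return sorted;
    }
}
```

Since one table rarely has more than a few hundred columns, this sort is effectively free on the client while saving the database a sort step per query.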
4. Reduce the number of queries
Merge the two column queries above into a single query, avoiding repeated round trips to the database.
5. Reduce outer-join queries
The queries above involve these tables and views:
- all_tab_cols
- ALL_TAB_COLUMNS
- ALL_COL_COMMENTS
- ALL_TYPES (subquery; already cached above)
To cut the cost of the outer join as much as possible, we looked more closely at the three views all_tab_cols, ALL_TAB_COLUMNS, and ALL_COL_COMMENTS. It turns out that ALL_TAB_COLUMNS is itself a view over all_tab_cols:
CREATE VIEW
ALL_TAB_COLUMNS
(
OWNER,
TABLE_NAME,
COLUMN_NAME,
DATA_TYPE,
DATA_TYPE_MOD,
DATA_TYPE_OWNER,
DATA_LENGTH,
DATA_PRECISION,
DATA_SCALE,
NULLABLE,
COLUMN_ID,
DEFAULT_LENGTH,
DATA_DEFAULT,
NUM_DISTINCT,
LOW_VALUE,
HIGH_VALUE,
DENSITY,
NUM_NULLS,
NUM_BUCKETS,
LAST_ANALYZED,
SAMPLE_SIZE,
CHARACTER_SET_NAME,
CHAR_COL_DECL_LENGTH,
GLOBAL_STATS,
USER_STATS,
AVG_COL_LEN,
CHAR_LENGTH,
CHAR_USED,
V80_FMT_IMAGE,
DATA_UPGRADED,
HISTOGRAM,
DEFAULT_ON_NULL,
IDENTITY_COLUMN,
EVALUATION_EDITION,
UNUSABLE_BEFORE,
UNUSABLE_BEGINNING
) AS
SELECT
OWNER,
TABLE_NAME,
COLUMN_NAME,
DATA_TYPE,
DATA_TYPE_MOD,
DATA_TYPE_OWNER,
DATA_LENGTH,
DATA_PRECISION,
DATA_SCALE,
NULLABLE,
COLUMN_ID,
DEFAULT_LENGTH,
DATA_DEFAULT,
NUM_DISTINCT,
LOW_VALUE,
HIGH_VALUE,
DENSITY,
NUM_NULLS,
NUM_BUCKETS,
LAST_ANALYZED,
SAMPLE_SIZE,
CHARACTER_SET_NAME,
CHAR_COL_DECL_LENGTH,
GLOBAL_STATS,
USER_STATS,
AVG_COL_LEN,
CHAR_LENGTH,
CHAR_USED,
V80_FMT_IMAGE,
DATA_UPGRADED,
HISTOGRAM,
DEFAULT_ON_NULL,
IDENTITY_COLUMN,
EVALUATION_EDITION,
UNUSABLE_BEFORE,
UNUSABLE_BEGINNING
FROM
ALL_TAB_COLS
WHERE
USER_GENERATED = 'YES';
So all the information the query needs lives in just two views, all_tab_cols and ALL_COL_COMMENTS. Dropping the join to ALL_TAB_COLUMNS and adjusting the SQL to the business requirements gives:
SELECT T.DATA_DEFAULT, T.COLUMN_ID, T.COLUMN_NAME, T.DATA_TYPE, T.DATA_TYPE_OWNER, T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, T.NULLABLE, T.CHAR_LENGTH, T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN
FROM ALL_TAB_COLS T LEFT JOIN ALL_COL_COMMENTS C ON T.OWNER = C.OWNER AND T.TABLE_NAME = C.TABLE_NAME AND T.COLUMN_NAME = C.COLUMN_NAME
WHERE T.USER_GENERATED = 'YES' AND T.OWNER=? AND T.TABLE_NAME=?
ORDER BY T.COLUMN_ID ASC
This SQL covers all the information we need.
We asked the customer to run a before/after comparison:
schema:ABM
table:ABM_IMP_BILL_LINES
-------Before-------
SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003,
DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93),
DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103),
DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME,
DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH,
DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL,
DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN
FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C
WHERE T.OWNER LIKE 'ABM' ESCAPE '/' AND T.TABLE_NAME LIKE 'ABM/_IMP/_BILL/_LINES' ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+)
ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
Elapsed: 00:00:26.14
-------After--------
--Oracle 10g/11g
SELECT T.DATA_DEFAULT, T.COLUMN_ID, T.COLUMN_NAME, T.DATA_TYPE, T.DATA_TYPE_OWNER, T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, T.NULLABLE, T.CHAR_LENGTH, T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN
FROM ALL_TAB_COLS T
LEFT JOIN ALL_COL_COMMENTS C ON T.OWNER = C.OWNER AND T.TABLE_NAME = C.TABLE_NAME AND T.COLUMN_NAME = C.COLUMN_NAME
WHERE T.HIDDEN_COLUMN = 'NO' AND T.OWNER='ABM' AND T.TABLE_NAME='ABM_IMP_BILL_LINES'
ORDER BY T.COLUMN_ID ASC ;
Elapsed: 00:00:00.16
This completes the optimization of the most expensive SQL: fetching one table's column information drops from about 29 s to about 0.15 s.
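In the application, the standard `DatabaseMetaData.getColumns()` call can then be replaced by running the merged query directly with equality binds. A minimal sketch of that replacement (class and method names are hypothetical; the Oracle 10g/11g variant would filter on `HIDDEN_COLUMN = 'NO'` instead of `USER_GENERATED`):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: issue the merged column query directly instead of
// going through DatabaseMetaData.getColumns().
public class ColumnQuery {
    // Oracle 12c variant (USER_GENERATED); equality binds, no LIKE.
    static final String SQL =
        "SELECT T.DATA_DEFAULT, T.COLUMN_ID, T.COLUMN_NAME, T.DATA_TYPE, "
      + "T.DATA_TYPE_OWNER, T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, "
      + "T.NULLABLE, T.CHAR_LENGTH, T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN "
      + "FROM ALL_TAB_COLS T LEFT JOIN ALL_COL_COMMENTS C "
      + "ON T.OWNER = C.OWNER AND T.TABLE_NAME = C.TABLE_NAME "
      + "AND T.COLUMN_NAME = C.COLUMN_NAME "
      + "WHERE T.USER_GENERATED = 'YES' AND T.OWNER = ? AND T.TABLE_NAME = ? "
      + "ORDER BY T.COLUMN_ID ASC";

    /** Fetches the column names of one table in ordinal order. */
    public static List<String> columnNames(Connection conn, String owner, String table)
            throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, owner);
            ps.setString(2, table);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("COLUMN_NAME"));
                }
            }
        }
        return names;
    }
}
```

The type-code and precision/scale logic that the driver used to compute in SQL would then run in Java over the raw DATA_TYPE, DATA_PRECISION, and DATA_SCALE values.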
4. Other Optimizations
The SQL above was only the single most expensive statement; when the AWR report was captured, the other collection steps had been disabled. Among them, two are also relatively expensive:
- table data size
- table row count
Both showed up as slow during optimization and internal testing. To improve performance while keeping the collected information complete, the process was changed:
- When metadata collection for a schema begins, query the row counts and data sizes of all of that schema's tables in a single pass and cache them.
- Later in the scan, whenever a row count or data size is needed, read it from the cache.
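The bulk-load-then-cache pattern can be sketched as follows. This is a simplified illustration, not the project's actual code: the loader function stands in for whatever single query (e.g. over ALL_TABLES) fills the cache for one schema.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch: load stats (row counts or segment sizes) for a whole
// schema once, then serve per-table lookups from memory. The loader is
// injected, so the cache itself needs no database connection.
public class SchemaStatsCache {
    private final Map<String, Map<String, Long>> bySchema = new HashMap<>();
    private final Function<String, Map<String, Long>> loader;

    public SchemaStatsCache(Function<String, Map<String, Long>> loader) {
        this.loader = loader; // e.g. one bulk query per schema
    }

    /** Returns the cached value for (schema, table), loading the schema once. */
    public long rowCount(String schema, String table) {
        Map<String, Long> tables =
            bySchema.computeIfAbsent(schema, loader); // single bulk load
        Long n = tables.get(table);
        return n != null ? n : 0L;
    }
}
```

The key property is that `computeIfAbsent` invokes the loader at most once per schema, turning N per-table queries into one bulk query.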
5. Performance Gains
Internal test (Oracle 12c, a database with 200K tables, scanned with 4 threads):
Before: metadata for 6-10 tables scanned per second
After: an average of 54 tables per second; 200,000 tables finish in about one hour
Customer environment (HR database, 30K+ tables):
Before: 3 h 49 min
image.png
After: 4 min 35 s
image.png
6. Further Optimization
Although the optimized scan already satisfied the customer, the scan data shows there is still room for improvement in the customer environment.
So the remaining expensive scan queries get the same treatment as table row counts and data sizes (bulk-load per schema, then cache):
- check constraints
- primary key constraints
- foreign key constraints
- unique constraints
- others (miscellaneous table information, temporary-table flags, ...)
Internal test (Oracle 12c, a database with 200K tables, scanned with 4 threads):
throughput rises from 54 to 65 tables scanned per second.
In the customer environment (Oracle 10.2, roughly 200K tables in the database):
the scan time is expected to drop to about 0.2 s per table (from the original 32 s per table).
Appendix: generating an AWR report
1. Create two snapshots, with some interval between them:
SQL> exec dbms_workload_repository.create_snapshot();
2. Run the AWR report script:
@/app/oracle/product/12c/db_1/rdbms/admin/awrrpt.sql
Enter the report type, the beginning and ending snapshot IDs, and the report file name, and the corresponding report file is generated.