hive 操作表报错：Caused by: org.apache

2019-11-02 本文已影响0人 alexlee666

有一次在 hive shell 中向hive 表插入数据时报了如下错误：

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
Unable to determine if hdfs://nameservice1:9001/user/hive/warehouse/test.db/tbl1 is encrypted: java.lang.IllegalArgumentException: Wrong FS: hdfs://nameservice1:9000/user/hive/warehouse/test.db/tbl1, expected: hdfs://127.0.0.1:9001

后经过仔细排查，才发现了原因所在。

一、原因分析

hive 通常使用 mysql 数据库作为metastore（存储metadata），当初在配置hive的时候，需要在 mysql中新建一个数据库（比如名为hive）并创建了对应的user/password。关于hive 的metastore配置可以参考博客：https://www.jianshu.com/p/ce4c5826a078。
在mysql中的名为hive的数据库中包含了两张表：DBS 和 SDS，其记录了查询引擎 hive 中的所有数据库的信息（数据存储在hdfs中的location、user等）。
接下来划重点：
当时配置hive的时候，DBS 和 SDS中记录的location是HDFS的配置项fs.defaultFS的值为：

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://nameservice1:9000</value>   
    </property>

但是之后做了修改:

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://127.0.0.1:9000</value>   
    </property>

因此，这就导致 hive metastore中的信息和表在HDFS真实存储路径信息不一致，进而导致了本文所提到的错误。

二、解决方法

因此，只需要修改hive metastore中的信息，使其和表在HDFS真实存储路径信息一致即可。为此修改mysql中hive数据库的两张表的内容:

DBS 表的DB_LOCATION_URI字段；
SDS 表的LOCATION字段。

注意：修改为hadoop配置文件hdfs-site.xml中的配置项fs.defaultFS对应的值。

可以在mysql中执行如下命令：

update DBS set DB_LOCATION_URI=REPLACE (DB_LOCATION_URI, 'nameservice1:9000', '127.0.0.1:9000');

update SDS set LOCATION=REPLACE (LOCATION,'nameservice1:9000', '127.0.0.1:9000');

DBS表修改前：

mysql> select * from DBS;

+-------+-----------------------+---------------------------------------------------------------------------------+------------------------------------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                                            | NAME       | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+---------------------------------------------------------------------------------+------------------------------------+------------+------------+
|     1 | Default Hive database | hdfs://nameservice1:9000/user/hive/warehouse                | default      | public     | ROLE       |
|     6 | NULL                  | hdfs://nameservice1:9000/user/hive/warehouse/test.db        | test         | user1      | USER       |
+-------+-----------------------+---------------------------------------------------------------------------------+------------------------------------+------------+------------+

DBS表修改后：

mysql> select * from DBS;

+-------+-----------------------+---------------------------------------------------------------------------------+------------------------------------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                                            | NAME       | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+---------------------------------------------------------------------------------+------------------------------------+------------+------------+
|     1 | Default Hive database | hdfs://127.0.0.1:9000/user/hive/warehouse                | default      | public     | ROLE       |
|     6 | NULL                  | hdfs://127.0.0.1:9000/user/hive/warehouse/test.db        | test         | user1      | USER       |
+-------+-----------------------+---------------------------------------------------------------------------------+------------------------------------+------------+------------+

三、测试结果

重启hive shell，并重新向hive 表中插入数据（通过mapreduce），成功了。

hive> use test;

hive> describe tbl1;
OK
id                      int                                         
name                    string                                      
gender                  string                                      
age                     int                                         
Time taken: 0.154 seconds, Fetched: 4 row(s)

hive> insert into tbl1 values(1, 'a', 'm', 1);

Query ID = ......
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2019-11-02 16:09:44,943 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local32255317_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://.../test.db/tbl1/.hive-staging_hive_2019-11-02_16-09-41_685_8562247153682161957-1/-ext-10000
......
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 8 HDFS Write: 101 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 3.609 seconds

笔者水平有限，如有错误，敬请指正！

hive 操作表报错：Caused by: org.apache

一、原因分析

二、解决方法

三、测试结果

猜你喜欢

热点阅读