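Hive Study Notes 1: Setting Up Remote Server Mode
Official Hive installation guide: https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration
Background
Hive architecture diagram
[Figure: Hive架构.jpg]
Hive's remote server mode lets non-Java clients access the metastore database. A MetaStoreServer runs on the server side, and clients reach the metastore database through it over the Thrift protocol.
[Figure: Hive远程服务器模式.jpg (Hive remote server mode)]
Server preparation
The environment used here builds on [Hadoop Study Notes 3: High-Availability Cluster Setup (Hadoop 2.x)] https://www.jianshu.com/p/666ff9bbf784. The servers are planned as follows.
1. First, set up single-node mode with these servers: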
node01 mysql
node02 single Hive node
2. Then set up multi-node mode with these servers:
node01 mysql
node03 metastore service
node04 Hive client
I. Install MySQL
Install mysql-server on node01:
yum install mysql-server
Logging in to MySQL at this point fails because the service has not been started:
[root@node01 ~]# mysql
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
Start the MySQL service:
service mysqld start
Initially the root account has no password. Set one, and allow other machines to connect remotely.
show databases;
use mysql;
show tables;
desc user;
select host,user,password from user;
+-----------+------+----------+
| host | user | password |
+-----------+------+----------+
| localhost | root | |
| node01 | root | |
| 127.0.0.1 | root | |
| localhost | | |
| node01 | | |
+-----------+------+----------+
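The table only allows access from the local machine; other hosts cannot connect. Grant root access from any host, then remove the local-only entries:
grant all privileges on *.* to 'root'@'%' identified by '123' with grant option;
select host,user,password from user;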
delete from user where host != '%';
mysql> select host,user,password from user;
+------+------+-------------------------------------------+
| host | user | password |
+------+------+-------------------------------------------+
| % | root | *23AE809DDACAF96AF0FD78ED04B6A265E05AA257 |
+------+------+-------------------------------------------+
If you quit now and log in again with the new password, access is denied:
mysql> quit
Bye
[root@node01 ~]# mysql -uroot -p
Enter password:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
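Direct changes to the user table take effect only after the privileges are reloaded, so flush them (ideally before quitting the session above):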
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
# Quit and log in again; the password now works
mysql> quit
[root@node01 ~]# mysql -uroot -p
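The statements above can be collected into one script so that the privilege reload is never forgotten. This is a minimal sketch, not the original author's procedure: the function name remote_access_sql is hypothetical, and it assumes an old MySQL 5.x server like the one in these notes, where GRANT ... IDENTIFIED BY is still accepted.

```shell
# Print the SQL that lets USER connect from any host with PASSWORD,
# removes every account that is not bound to '%', and reloads the
# grant tables so the direct DELETE takes effect.
# usage: remote_access_sql USER PASSWORD
remote_access_sql() {
    user=$1
    password=$2
    cat <<EOF
GRANT ALL PRIVILEGES ON *.* TO '$user'@'%' IDENTIFIED BY '$password' WITH GRANT OPTION;
DELETE FROM mysql.user WHERE host != '%';
FLUSH PRIVILEGES;
EOF
}
```

On node01, while root still has no password, the output could be piped into the client with remote_access_sql root 123 | mysql. Running mysql -h node01 -uroot -p from another node then checks that remote access works.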
II. Install and start single-node Hive
MySQL runs on node01; Hive is installed on node02.
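Extract the package on node02. The later steps rely on the prebuilt bin/ and lib/ directories, which ship in the binary distribution (apache-hive-1.2.1-bin.tar.gz) rather than in a source-only archive: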
tar xf apache-hive-1.2.1-src.tar.gz -C /opt/sxt/
cd /opt/sxt/
mv apache-hive-1.2.1-src/ hive
Configure the environment variables:
vi /etc/profile
Add the following variables, then reload the profile:
export HIVE_HOME=/opt/sxt/hive
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
source /etc/profile
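Because the same variables have to be added on every Hive node, it can help to make the edit repeatable. The sketch below is one way to do that; append_once and add_hive_env are hypothetical helper names, and the profile path and Hive directory are passed in by the caller.

```shell
# Append LINE to FILE only if that exact line is not already there.
# usage: append_once FILE LINE
append_once() {
    file=$1
    line=$2
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Register HIVE_HOME and extend PATH in the given profile file.
# usage: add_hive_env PROFILE_FILE HIVE_HOME_DIR
add_hive_env() {
    profile=$1
    hive_home=$2
    append_once "$profile" "export HIVE_HOME=$hive_home"
    # Single quotes keep $PATH and $HIVE_HOME unexpanded in the file.
    append_once "$profile" 'export PATH=$PATH:$HIVE_HOME/bin'
}
```

For example, add_hive_env /etc/profile /opt/sxt/hive followed by source /etc/profile. Running it a second time leaves the file unchanged.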
Edit the configuration:
vi /opt/sxt/hive/conf/hive-site.xml
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node01/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123</value>
</property>
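The properties above are only the body of the file; hive-site.xml also needs a single <configuration> root element around them. As a rough sketch, the whole file could be generated from the shell. The function name write_hive_site is hypothetical, and every value is supplied by the caller.

```shell
# Write a complete hive-site.xml for a metastore backed by MySQL.
# usage: write_hive_site FILE WAREHOUSE_DIR JDBC_URL DB_USER DB_PASSWORD
# Note: a value containing '&' must be passed as '&amp;' to stay valid XML.
write_hive_site() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>$2</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>$3</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>$4</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>$5</value>
  </property>
</configuration>
EOF
}
```

With the values from this section, the call would be write_hive_site /opt/sxt/hive/conf/hive-site.xml /user/hive/warehouse 'jdbc:mysql://node01/hive?createDatabaseIfNotExist=true' root 123.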
After starting the HDFS and YARN clusters, running hive directly fails with:
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
Copy the MySQL driver jar into Hive's lib directory:
cp ~/software/mysql-connector-java-5.1.32-bin.jar /opt/sxt/hive/lib/
Running hive again fails with a different error:
[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
at jline.TerminalFactory.create(TerminalFactory.java:101)
Cause: the jline version bundled with Hadoop does not match the one Hive uses. Replace Hadoop's copy with Hive's:
cd /opt/sxt/hadoop-2.6.5/share/hadoop/yarn/lib
rm -rf jline-0.9.94.jar
cp /opt/sxt/hive/lib/jline-2.12.jar ./
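A slightly more cautious variant of the jar swap keeps the old file as a backup instead of deleting it. This is only a sketch: replace_jline is a hypothetical helper, and both directories are passed in rather than hard-coded.

```shell
# Replace the jline jar(s) in Hadoop's YARN lib directory with Hive's.
# usage: replace_jline HADOOP_YARN_LIB_DIR HIVE_LIB_DIR
replace_jline() {
    hadoop_lib=$1
    hive_lib=$2
    # Rename each existing jline jar to *.jar.bak so it is no longer
    # a .jar file on the classpath but can still be restored.
    for jar in "$hadoop_lib"/jline-*.jar; do
        [ -e "$jar" ] || continue
        mv "$jar" "$jar.bak"
    done
    # Copy Hive's jline jar(s) into the Hadoop directory.
    cp "$hive_lib"/jline-*.jar "$hadoop_lib"/
}
```

For the paths in this section: replace_jline /opt/sxt/hadoop-2.6.5/share/hadoop/yarn/lib /opt/sxt/hive/lib.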
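The hive command now starts in single-node mode.
III. Install and start multi-node Hive
MySQL runs on node01, node03 is the metastore server, and node04 is the client.
Building on the previous step, copy the extracted Hive directory from node02 to node03 and node04: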
scp -r /opt/sxt/hive/ node03:/opt/sxt/
scp -r /opt/sxt/hive/ node04:/opt/sxt/
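With more target nodes, the two scp commands generalize to a loop. The helper below is a hypothetical sketch; it assumes SSH access to each node is already set up, as it was for the Hadoop cluster.

```shell
# Copy SRC_DIR into DEST_PARENT on every listed node, stopping at the
# first failure.
# usage: distribute_dir SRC_DIR DEST_PARENT NODE...
distribute_dir() {
    src=$1
    dest=$2
    shift 2
    for node in "$@"; do
        scp -r "$src" "$node:$dest" || return 1
    done
}
```

The two copies above correspond to distribute_dir /opt/sxt/hive/ /opt/sxt/ node03 node04.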
Likewise, configure the Hive environment variables on node03 and node04.
On node03, edit hive-site.xml (metastore service):
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node01/hive_remote?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123</value>
</property>
On node04, edit hive-site.xml (client):
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://node03:9083</value>
</property>
Start the metastore service on node03:
hive --service metastore
# The command blocking here means the service started successfully
Starting Hive Metastore Server
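Open another terminal and run ss -nal; a listener on port 9083 confirms that the service is up.
Now run hive on node04. When hive was run on node02 earlier, single-node mode in effect brought up its own metastore (embedded in the same process) and then started the Hive client. In multi-node mode, the client instead connects to the metastore service named in its configuration file on port 9083, that is, the service on node03.
Note: the metastore service and the client can also run on the same machine, configured as follows (adjust the JDBC URL, user name, and password to your own MySQL instance):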
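The port check can be scripted rather than read by eye. The filter below is a small sketch with a hypothetical name, listening_on_port; it reads ss -nal output on standard input and looks only at lines in the LISTEN state.

```shell
# Succeed if the ss output on stdin shows a listener on PORT.
# usage: ss -nal | listening_on_port PORT
listening_on_port() {
    # Keep LISTEN rows only, then look for ":PORT" followed by
    # whitespace, which is how the local address column ends.
    grep LISTEN | grep -Eq ":$1[[:space:]]"
}
```

On node03 this could be used as ss -nal | listening_on_port 9083 && echo "metastore is listening".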
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node03:3306/hive_remote?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://node03:9083</value>
</property>