大数据BigData

Hive的serde,beeline,jdbc

2018-11-29  本文已影响18人  geekAppke

Hive SerDe - Serializer and Deserializer

row_format
: DELIMITED 
          [FIELDS TERMINATED BY char [ESCAPED BY char]] 
          [COLLECTION ITEMS TERMINATED BY char] 
          [MAP KEYS TERMINATED BY char] 
          [LINES TERMINATED BY char] 
: SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]

Hive正则匹配

CREATE TABLE logtbl (
    host STRING,
    identity STRING,
    t_user STRING,
    time STRING,
    request STRING,
    referer STRING,
    agent STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
    "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) \\[(.*)\\] \"(.*)\" (-|[0-9]*) (-|[0-9]*)"
)
STORED AS TEXTFILE;
把各个字段取出来,提取时间,修改提取规则
满足读取规则就展示,不满足这行全是NULL
    写数据你随便,hive是度检查
    上传数据还检查格式,浪费时间、资源

192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-upper.png HTTP/1.1" 304 -
([^ ]*) ([^ ]*) ([^ ]*) \\[(.*)\\] \"(.*)\" (-|[0-9]*) (-|[0-9]*)

hive> load data local inpath '/root/log' into table logtbl;
hive> select * from logtbl;
OK
192.168.57.4 - - 29/Feb/2016:18:14:35 +0800 GET /bg-upper.png HTTP/1.1 304 -
192.168.57.4 - - 29/Feb/2016:18:14:35 +0800 GET /bg-nav.png HTTP/1.1 304 -

Tomcat日志

192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-upper.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-nav.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /asf-logo.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-button.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-middle.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.css HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /asf-logo.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-middle.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-button.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-nav.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-upper.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.css HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.css HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-button.png HTTP/1.1" 304 -
192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-upper.png HTTP/1.1" 304 -

Hive Beeline

node002既是服务端,有时客户端,开多个连接
[root@node002 ~]# hiveserver2
让它后台启动!

查看10000端口有没有启动
# ss -nal


[root@node002 ~]# beeline -u jdbc:hive2://node002:10000/default -n root
0: jdbc:hive2://node002:10000/default>!quit


默认 用户名、密码不验证(但不能不输!)权限控制!
[root@node002 ~]# beeline
beeline> !connect jdbc:hive2://node002:10000/default root 123
0: jdbc:hive2://node002:10000/default> select * from psn;
+---------+-----------+-------------------------+-------------------------------------------+--+
| psn.id  | psn.name  |        psn.likes        |                psn.address                |
+---------+-----------+-------------------------+-------------------------------------------+--+
| 1       | 小明1       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 2       | 小明2       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 3       | 小明3       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 4       | 小明4       | ["lol","movie"]         | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 5       | 小明5       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 6       | 小明6       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 7       | 小明7       | ["lol","book","game"]   | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 8       | 小明8       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
| 9       | 小明9       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
+---------+-----------+-------------------------+-------------------------------------------+--+
9 rows selected (0.219 seconds)
0: jdbc:hive2://node002:10000/default> !quit
Closing: 0: jdbc:hive2://node002:10000/default
[root@node002 ~]# 

更加以易读的格式输出而已,2种不同客户端的访问方式!
beeline的本质是jdbc

Hive JDBC

Hive JDBC运行方式
服务端启动hiveserver2后,在java代码中通过调用hive的jdbc访问默认端口10000进行连接、访问
上一篇下一篇

猜你喜欢

热点阅读