Hive-3.1.2（三）数据类型

2021-08-07 本文已影响0人 _大叔_

常用的基本数据类型

基本数据类型	所占字节
int
boolean
float
double
string

复杂数据类型

基本数据类型	说明
array	array 类型是由一系列相同数据类型的元素组成。并且可以通过下标来进行访问。下标从0开始计
map	map包含key-value键值对，可以通过key来访问元素
struct	struct 可以包含不同的数据类型元素。相当于一个对象结构。可以通过对象.属性来访问

array类型

现有外部数据路径为 /a1/a1.txt 结构如下：

100,300,200 aa,v,cc
442,245,214 dd,ee,dd

把外部数据创建到hive里，terminated by ' ' 为列分隔符，terminated by ',' 集合元素分隔符

create external table table_name(t1 array<int>,t2 array<string>) row format delimited fields terminated by ' ' collection items terminated by ',' location '/a1'

查看 t1 列的元素个数

select size(t1) from table_name;

通过下标获取 t1 列的集合元素的第一个数据

select t1[0] from table_name;

map类型

现有外部数据路径为 /a2/a2.txt 结构如下：

test1,100
test2,200
test3,300
test1,200
test2,200

把外部数据创建到hive里，terminated by '\t' 为列分隔符（必须），terminated by ',' key-value 分隔符

create external table table_name(t1 map<string,int>) row format delimited fields terminated by '\t' map keys terminated by ',' location '/a2'

查询 t1 列为 test2 的key的值，并去重，distinct 函数会触发 mapreduce。

select distinct(t1['test2']) from table_name where  t1['test1' is not null;

struct类型

现有外部数据路径为 /a3/a3.txt 结构如下：

1 zs 22
1 ls 23
1 ww 18
1 zl 25