电商项目(Hive实现)

2021-11-29  本文已影响0人  V_6619

查看表:

desc formatted <table_name>

MANAGED_TABLE:内部表
删除内部表:HDFS上的数据被删除 & Meta也被删除

创建外部表:

CREATE EXTERNAL TABLE emp_external(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
location '/external/emp/';

删除外部表: HDFS上的数据不被删除 & Meta被删除

上次ETL的数据:

create external table vvtrack_info(
ip string,
country string,
province string,
city string,
url string,
time string,
page string
) partitioned by (day string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
location '/project/vvtrackinfo/';

LOAD DATA INPATH 'hdfs://hadoop000:8020/project/input/etl' OVERWRITE INTO TABLE track_info partition(day='2013-07-21');

统计浏览量

select count(*) from vvtrack_info where day='2013-07-21';

统计省份:

select province, count(*) from vvtrack_info where day='2013-07-21 group by province;

省份统计表

create table vvtrack_info_province_stat(
province string,
cnt bigint
) partitioned by (day string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
insert overwrite table vvtrack_info_province_stat partition(day='2013-07-21') 
select province,count(*) as cnt from vvtrack_info where day='2013-07-21' group by province ;

1)ETL
2)把ETL输出的数据加载到track_info分区表里
3)各个维度统计结果的数据输出到各自维度的表里(track_info_province_stat)
4)将数据导出(optional)


查看函数

show functions

查看具体函数用法

desc function <function_name>

查看具体例子

desc function extended <function_name>
上一篇下一篇

猜你喜欢

热点阅读