Oozie Workflow: A Hive Action Usage Example
2018-04-03
明明德撩码
Official documentation
http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.3.6/DG_HiveActionExtension.html
Copy the bundled Hive example, rename it, and then modify it
cp -r examples/apps/hive oozie-apps/
mv oozie-apps/hive oozie-apps/hive-select
Edit job.properties in hive-select
nameNode=hdfs://hadoop-senior.beifeng.com:8020
jobTracker=hadoop-senior.beifeng.com:8032
queueName=default
examplesRoot=examples
oozieAppsRoot=user/beifeng/oozie-apps
oozieDataRoot=user/beifeng/oozie/datas
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/${oozieAppsRoot}/hive-select/workflow.xml
inputDir=hive-select/input
outputDir=hive-select/output
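oozie.use.system.libpath=true tells Oozie to use the shared dependency libraries (the share directory) stored on HDFS under the beifeng user.
Note: check that the port numbers are correct: 8020 for HDFS (nameNode) and 8032 for jobTracker (on YARN this is the ResourceManager address).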
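The port check in the note above can be scripted. The following is a minimal sketch, assuming the property values from the listing above; the scratch file and the helper name get_prop are illustrative only and are not part of Oozie.

```shell
# Scratch copy of the relevant properties (values taken from job.properties above).
props="$(mktemp)"
cat > "$props" <<'EOF'
nameNode=hdfs://hadoop-senior.beifeng.com:8020
jobTracker=hadoop-senior.beifeng.com:8032
EOF

# get_prop KEY FILE: print the value of KEY (hypothetical helper, first match wins).
get_prop() {
  sed -n "s/^$1=//p" "$2" | head -n 1
}

name_node="$(get_prop nameNode "$props")"
job_tracker="$(get_prop jobTracker "$props")"

# The port is whatever follows the last colon in each value.
nn_port="${name_node##*:}"
jt_port="${job_tracker##*:}"

echo "nameNode port:   $nn_port"
echo "jobTracker port: $jt_port"
rm -f "$props"
```

To check a real application, point get_prop at oozie-apps/hive-select/job.properties instead of the scratch file.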
Check whether Hive is using the new or the old API
- [beifeng@hadoop-senior hive-0.13.1-cdh5.3.6]$ bin/hive
- select count(1) from dept;
- Then inspect the resulting job in the YARN web UI: http://hadoop-senior.beifeng.com:8088/cluster
Create the dept table in Hive
CREATE TABLE IF NOT EXISTS default.dept
(
dept_no string COMMENT 'id',
dept_name string ,
dept_url string
)
COMMENT 'dept'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/dept';
Write the Hive SQL script (dept-select.sql, the name referenced by the workflow)
load data local inpath '/opt/datas/dept.txt' overwrite into table dept;
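Because the table is declared with tab-separated fields and three columns, it can help to check the input file before loading it. Hive does not reject short rows; missing trailing fields simply come back as NULL. The sketch below is only an illustration: it uses a scratch file in place of /opt/datas/dept.txt, and the two rows are made-up placeholders, not real data.

```shell
# Scratch stand-in for /opt/datas/dept.txt; the rows are made-up placeholders.
data="$(mktemp)"
printf '10\tsales\thttp://example.com/sales\n'  > "$data"
printf '20\tresearch\n'                        >> "$data"   # deliberately short: 2 fields

# Count lines whose tab-separated field count is not 3
# (dept_no, dept_name, dept_url).
bad_lines="$(awk -F'\t' 'NF != 3 { n++ } END { print n + 0 }' "$data")"
echo "lines with wrong field count: $bad_lines"
rm -f "$data"
```

Run the same awk command against the real file to find rows that would load with NULL columns.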
Write the workflow XML file
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-wf">
<start to="hive-node"/>
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/${oozieAppsRoot}/${outputDir}"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<script>dept-select.sql</script>
<param>OUTPUT=${nameNode}/${oozieAppsRoot}/${outputDir}</param>
</hive>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
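Note: pay attention to the schema versions of the workflow and of the hive action. Follow the official Cloudera Oozie documentation for the versions your release supports.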
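A quick way to see which schema versions a workflow file declares is to grep for the namespace URIs. This is a minimal sketch that works on a scratch file containing only the two relevant elements copied from the workflow above; the variable names are illustrative.

```shell
# Scratch file holding only the two elements that carry schema versions;
# point wf at your real workflow.xml instead.
wf="$(mktemp)"
cat > "$wf" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-wf">
<hive xmlns="uri:oozie:hive-action:0.5">
EOF

# Pull out each namespace URI declared in the file.
wf_ns="$(grep -o 'uri:oozie:workflow:[0-9.]*' "$wf" | head -n 1)"
hive_ns="$(grep -o 'uri:oozie:hive-action:[0-9.]*' "$wf" | head -n 1)"

echo "workflow schema:    $wf_ns"
echo "hive action schema: $hive_ns"
rm -f "$wf"
```

Compare the printed versions with those listed in the documentation for your Oozie release.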
Create the oozie-apps directory on HDFS (run from the Hadoop installation directory)
bin/hdfs dfs -mkdir -p /user/beifeng/oozie-apps
Copy the hive-select workflow from the Oozie directory to HDFS
../hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put oozie-apps/hive-select /user/beifeng/oozie-apps/
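Copy the Hive configuration file and update the workflow file
cp ../hive-0.13.1-cdh5.3.6/conf/hive-site.xml oozie-apps/hive-select/
Note: the workflow change itself is not shown here. In a Hive action the copied file is normally referenced with a <job-xml>hive-site.xml</job-xml> element placed before <configuration>.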
Create the lib directory for Hive's dependency JARs and add the JDBC driver
mkdir -p oozie-apps/hive-select/lib
cp ../hive-0.13.1-cdh5.3.6/lib/mysql-connector-java-5.1.27-bin.jar oozie-apps/hive-select/lib
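Copy hive-select to HDFS again (run from the Hadoop installation directory; -f overwrites the files uploaded earlier)
bin/hdfs dfs -put -f ../oozie-4.0.0-cdh5.3.6/oozie-apps/hive-select/* /user/beifeng/oozie-apps/hive-select/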
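Before uploading, it is worth confirming that the local application directory contains everything the steps above put there. The sketch below assumes the file names used in this article; check_app_dir is a hypothetical helper, and the demo runs on a scratch directory that deliberately lacks hive-site.xml.

```shell
# check_app_dir DIR: print every expected file that is missing from DIR
# (hypothetical helper; the file list follows the steps in this article).
check_app_dir() {
  dir="$1"
  missing=""
  for f in job.properties workflow.xml dept-select.sql hive-site.xml \
           lib/mysql-connector-java-5.1.27-bin.jar; do
    [ -f "$dir/$f" ] || missing="$missing $f"
  done
  echo "${missing# }"
}

# Demo on a scratch directory that deliberately lacks hive-site.xml.
app="$(mktemp -d)"
mkdir -p "$app/lib"
touch "$app/job.properties" "$app/workflow.xml" "$app/dept-select.sql" \
      "$app/lib/mysql-connector-java-5.1.27-bin.jar"
missing_files="$(check_app_dir "$app")"
echo "missing: ${missing_files:-none}"
rm -rf "$app"
```

For the real application, call check_app_dir oozie-apps/hive-select from the Oozie directory; an empty result means nothing is missing.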
Set the Oozie server URL
export OOZIE_URL=http://hadoop-senior.beifeng.com:11000/oozie
Run the job
bin/oozie job -config oozie-apps/hive-select/job.properties -run
Check the job status
bin/oozie job -info 0000001-180315133250705-oozie-beif-W
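The run and status steps can be chained so the job ID does not have to be copied by hand. When a job is submitted with -run, the Oozie client prints a line of the form "job: <id>". The sketch below extracts that ID; extract_job_id is a hypothetical helper, and the demo feeds it a canned line (using the job ID shown above) rather than contacting a live server.

```shell
# extract_job_id: read `oozie job ... -run` output on stdin and print the job ID.
extract_job_id() {
  sed -n 's/^job: *//p' | head -n 1
}

# Demo with a canned line instead of a live Oozie server.
job_id="$(printf 'job: 0000001-180315133250705-oozie-beif-W\n' | extract_job_id)"
echo "$job_id"

# Against a live server (not executed here):
# job_id="$(bin/oozie job -config oozie-apps/hive-select/job.properties -run | extract_job_id)"
# bin/oozie job -info "$job_id"
```

The commented lines show how the helper would slot into the two commands above once OOZIE_URL is exported.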