oozie工作流定义
介绍
流程定义
流程节点
启动节点
结束节点
Kill Control Node
Map-Reduce Action
-
A map-reduce action can be configured to perform file system cleanup and directory creation before starting the map reduce job. This capability enables Oozie to retry a Hadoop job in the situation of a transient failure (Hadoop checks the non-existence of the job output directory and then creates it when the Hadoop job is starting, thus a retry without cleanup of the job output directory would fail).运行文件系统前清理以前的输出
-
The workflow job will wait until the Hadoop map/reduce job completes before continuing to the next action in the workflow execution path.工作流中,一个动作完成后才能进行到下一个动作
-
The counters of the Hadoop job and job exit status (=FAILED=, KILLED or SUCCEEDED ) must be available to the workflow job after the Hadoop jobs ends. This information can be used from within decision nodes and other actions configurations.在Hadoop作业结束后,Hadoop作业和作业退出状态的计数器(=FILIL=、KIND或WITED)必须可用于工作流作业
-
The map-reduce action has to be configured with all the necessary Hadoop JobConf properties to run the Hadoop map/reduce job.
-
任务定义(job define)
<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.5">
...
<action name="[NODE-NAME]">
<map-reduce>
...
<job-xml>[JOB-XML-FILE]</job-xml>
<configuration>
<property>
<name>[PROPERTY-NAME]</name>
<value>[PROPERTY-VALUE]</value>
</property>
...
</configuration>
<config-class>com.example.MyConfigClass</config-class>
...
</map-reduce>
<ok to="[NODE-NAME]"/>
<error to="[NODE-NAME]"/>
</action>
...
</workflow-app>
shell action
The shell action runs a Shell command.
The workflow job will wait until the Shell command completes before continuing to the next action.
To run the Shell job, you have to configure the shell action with the =job-tracker=, name-node and Shell exec elements as well as the necessary arguments and configuration.
A shell action can be configured to create or delete HDFS directories before starting the Shell job.
- Syntax:
<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.3">
...
<action name="[NODE-NAME]">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>[JOB-TRACKER]</job-tracker>
<name-node>[NAME-NODE]</name-node>
<prepare>
<delete path="[PATH]"/>
...
<mkdir path="[PATH]"/>
...
</prepare>
<job-xml>[SHELL SETTINGS FILE]</job-xml>
<configuration>
<property>
<name>[PROPERTY-NAME]</name>
<value>[PROPERTY-VALUE]</value>
</property>
...
</configuration>
<exec>[SHELL-COMMAND]</exec>
<argument>[ARG-VALUE]</argument>
...
<argument>[ARG-VALUE]</argument>
<env-var>[VAR1=VALUE1]</env-var>
...
<env-var>[VARN=VALUEN]</env-var>
<file>[FILE-PATH]</file>
...
<archive>[FILE-PATH]</archive>
...
<capture-output/>
</shell>
<ok to="[NODE-NAME]"/>
<error to="[NODE-NAME]"/>
</action>
...
</workflow-app>
- Example:How to run any shell script or perl script or CPP executable
<workflow-app xmlns='uri:oozie:workflow:0.3' name='shell-wf'>
<start to='shell1' />
<action name='shell1'>
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>${EXEC}</exec>
<argument>A</argument>
<argument>B</argument>
<file>${EXEC}#${EXEC}</file> <!--Copy the executable to compute node's current working directory -->
</shell>
<ok to="end" />
<error to="fail" />
</action>
<kill name="fail">
<message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name='end' />
</workflow-app>