Pig从入门到精通2:Pig安装配置
Pig的安装模式有两种:本地模式和MapReduce模式(集群模式)。二者的区别是:本地模式操作的是本地Linux文件系统,不需要Hadoop的支持,所以不需要配置PIG_CLASSPATH参数,本地模式常用于测试;MapReduce模式又叫做集群模式,顾名思义,该模式需要Hadoop的支持,操作的是HDFS文件系统,需要通过PIG_CLASSPATH参数指定Hadoop配置文件的所在路径,MapReduce模式常用于实际的生产环境。本节就来介绍这两种模式详细的安装配置过程。
本节用到的安装介质:
1.Pig本地模式的安装配置
(1)官网下载Pig的最新安装包
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# ls /root/tools/pig-0.17.0.tar.gz
/root/tools/pig-0.17.0.tar.gz</pre>
(2)解压至安装目录
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# tar -zxvf /root/tools/pig-0.17.0.tar.gz -C /root/trainings/</pre>
(3)设置PATH环境变量
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# vim /root/.bash_profile
PIG_HOME=/root/trainings/pig-0.17.0
export PIG_HOME
PATH=PATH
export PATH
[root@bigdata ~]# source /root/.bash_profile</pre>
(4)启动本地模式
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# pig -x local
[root@bigdata tools]# pig -x local
18/09/24 16:17:06 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
18/09/24 16:17:06 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2018-09-24 16:17:06,985 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2018-09-24 16:17:06,985 [main] INFO org.apache.pig.Main - Logging error messages to: /root/tools/pig_1537777026984.log
2018-09-24 16:17:07,010 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2018-09-24 16:17:07,174 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2018-09-24 16:17:07,175 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2018-09-24 16:17:07,349 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2018-09-24 16:17:07,362 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-4c59aed0-92e1-4fe5-b974-d8d55c254695
2018-09-24 16:17:07,363 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
grunt></pre>
说明:-x选项用于指明工作模式,local表示本地模式。
根据下面的打印出的日志可以知道当前Pig运行模式为本地模式,操作的是本地文件系统file:
Picked LOCAL as the ExecType
Connecting to hadoop file system at: file:///
(5)测试本地模式
查看当前工作目录:
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">grunt> pwd
file:/root/tools</pre>
查看根目录下面的内容:
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">grunt> ls /
file:/boot <dir>
file:/dev <dir>
file:/proc <dir>
file:/run <dir>
file:/sys <dir>
file:/etc <dir>
file:/root <dir>
file:/var <dir>
file:/tmp <dir>
file:/usr <dir>
file:/bin <dir>
file:/sbin <dir>
file:/lib <dir>
file:/lib64 <dir>
file:/home <dir>
file:/media <dir>
file:/mnt <dir>
file:/opt <dir>
file:/srv <dir>
file:/.autorelabel<r 1> 0</pre>
(6)退出pig
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">grunt> quit;
2018-09-24 16:23:25,663 [main] INFO org.apache.pig.Main - Pig script completed in 6 minutes, 19 seconds and 259 milliseconds (379259 ms)</pre>
2.Pig MapReduce模式的安装配置
(1)官网下载Pig的最新安装包
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# ls /root/tools/pig-0.17.0.tar.gz
/root/tools/pig-0.17.0.tar.gz</pre>
(2)解压至安装目录
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# tar -zxvf /root/tools/pig-0.17.0.tar.gz -C /root/trainings/</pre>
(3)设置PATH环境变量
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# vim /root/.bash_profile
PIG_HOME=/root/trainings/pig-0.17.0
export PIG_HOME
PATH=PATH
export PATH
[root@bigdata ~]# source /root/.bash_profile</pre>
(4)设置PIG_CLASSPATH环境变量
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# vim /root/.bash_profile
PIG_CLASSPATH=/root/trainings/hadoop-2.7.3/etc/hadoop
export PIG_CLASSPATH
[root@bigdata ~]# source /root/.bash_profile</pre>
(5)启动Hadoop集群
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata hadoop]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [bigdata]
bigdata: starting namenode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-namenode-bigdata.out
localhost: starting datanode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-bigdata.out
starting yarn daemons
starting resourcemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-resourcemanager-bigdata.out
localhost: starting nodemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata.out
[root@bigdata hadoop]# jps
2720 DataNode
2900 SecondaryNameNode
2582 NameNode
3064 ResourceManager
3322 NodeManager
3487 Jps</pre>
(6)启动Pig的MapReduce模式
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">[root@bigdata ~]# pig
18/09/24 16:32:12 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
18/09/24 16:32:12 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
18/09/24 16:32:12 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2018-09-24 16:32:12,188 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2018-09-24 16:32:12,188 [main] INFO org.apache.pig.Main - Logging error messages to: /root/pig_1537777932187.log
2018-09-24 16:32:12,208 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2018-09-24 16:32:12,652 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2018-09-24 16:32:12,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://bigdata:9000
2018-09-24 16:32:13,144 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-137e17fb-e88a-4d00-821a-06b59a0349e0
2018-09-24 16:32:13,144 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
grunt>
说明:pig命令不加任何参数,默认启动MapReduce模式。
根据打印出的日志可以知道当前Pig的运行模式是MapReduce模式,操作的是HDFS文件系统:
Picked MAPREDUCE as the ExecType
Connecting to hadoop file system at: hdfs://bigdata:9000</pre>
(6)测试MapReduce模式
查看当前工作目录:
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">grunt> pwd
hdfs://bigdata:9000/user/root</pre>
查看HDFS根目录下的内容:
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">grunt> ls /
hdfs://bigdata:9000/hbase <dir>
hdfs://bigdata:9000/input <dir>
hdfs://bigdata:9000/output <dir>
hdfs://bigdata:9000/tmp <dir>
hdfs://bigdata:9000/user <dir></pre>
(7)退出pig
<pre style="box-sizing: border-box; border: 0px; font-family: "Courier 10 Pitch", Courier, monospace; font-size: 1.5rem; font-style: normal; font-weight: 400; margin: 0px 0px 1.6em; outline: 0px; padding: 1.6em; vertical-align: baseline; background: rgb(238, 238, 238); line-height: 1.6; max-width: 100%; overflow: auto; color: rgb(64, 64, 64); font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">grunt> quit;
2018-09-24 16:36:24,915 [main] INFO org.apache.pig.Main - Pig script completed in 4 minutes, 12 seconds and 907 milliseconds (252907 ms)</pre>
本节详细介绍了Pig两种工作模式的安装配置过程,祝你玩得愉快!