How a Big Data Engineer Earning 400K a Year Installs Storm
Installing and Configuring a Storm Cluster
Host planning
Hostname    IP address       Roles
storm01     192.168.33.31    Storm (nimbus), ZooKeeper
storm02     192.168.33.32    Storm (supervisor), ZooKeeper
storm03     192.168.33.33    Storm (supervisor), ZooKeeper
storm04     192.168.33.34    Storm (supervisor)
I. Prepare the servers
Turn off the firewall and set SELinux to permissive (as root):
chkconfig iptables off && service iptables stop && setenforce 0
Create the hadoop group and user:
groupadd hadoop && useradd -g hadoop hadoop
Create the working directory and set its permissions (as the hadoop user, in /home/hadoop):
mkdir apps/
chmod 755 -R apps/
Add the hostname mappings to /etc/hosts on every machine (editing this file requires root privileges):
[hadoop@storm01 ~]$ sudo vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.33.31 storm01 zk01
192.168.33.32 storm02 zk02
192.168.33.33 storm03 zk03
192.168.33.34 storm04
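After saving, a quick way to confirm that every name resolves and is reachable (an optional check, not part of the original steps):
# optional: confirm each hostname resolves and responds
for host in storm01 storm02 storm03 storm04; do ping -c 1 $host; done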
Log in as the hadoop user.
II. The Storm cluster depends on ZooKeeper, so a ZooKeeper cluster must be installed on zk01, zk02, and zk03
1. Copy the ZooKeeper tarball to zk01 and extract it into /home/hadoop/apps:
[hadoop@storm01 ~]$ tar -zxvf zookeeper-3.4.5.tar.gz -C apps/
Then delete the tarball you just copied:
[hadoop@storm01 ~]$ rm -f zookeeper-3.4.5.tar.gz
2. Go into the ZooKeeper installation directory:
cd /home/hadoop/apps/zookeeper-3.4.5
Delete some files that are not needed:
[hadoop@storm01 zookeeper-3.4.5]$ rm -rf *.xml *.txt docs src *.asc *.md5 *.sha1
[hadoop@storm01 zookeeper-3.4.5]$ rm -rf dist-maven/
The directory now looks like this:
[hadoop@storm01 zookeeper-3.4.5]$ ll
total 1308
drwxr-xr-x. 2 hadoop hadoop 4096 Jan 22 22:34 bin
drwxr-xr-x. 2 hadoop hadoop 4096 Jan 22 22:34 conf
drwxr-xr-x. 10 hadoop hadoop 4096 Jan 22 22:34 contrib
drwxr-xr-x. 4 hadoop hadoop 4096 Jan 22 22:34 lib
drwxr-xr-x. 5 hadoop hadoop 4096 Jan 22 22:34 recipes
-rw-r--r--. 1 hadoop hadoop 1315806 Nov 5 2012 zookeeper-3.4.5.jar
3. Go into conf/ and edit the configuration file.
[hadoop@storm01 zookeeper-3.4.5]$ cd conf/
Rename the sample configuration file:
[hadoop@storm01 conf]$ mv zoo_sample.cfg zoo.cfg
4. Edit zoo.cfg:
[hadoop@storm01 conf]$ vi zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# directory where ZooKeeper stores its data
dataDir=/home/hadoop/zkdata
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=storm01:2888:3888
server.2=storm02:2888:3888
server.3=storm03:2888:3888
5. Create the /home/hadoop/zkdata directory and create a file named myid in it containing the node id "1":
[hadoop@storm01 ~]$ mkdir zkdata
[hadoop@storm01 ~]$ echo 1 > zkdata/myid
Check the home directory with ll:
[hadoop@storm01 ~]$ ll
total 8
drwxrwxr-x. 3 hadoop hadoop 4096 Jan 22 23:37 apps
drwxrwxr-x. 2 hadoop hadoop 4096 Jan 22 23:38 zkdata
6. Copy zkdata/ to storm02 and storm03:
[hadoop@storm01 ~]$ scp -r zkdata/ storm02:/home/hadoop/
[hadoop@storm01 ~]$ scp -r zkdata/ storm03:/home/hadoop/
7. On storm02 and storm03, change the contents of /home/hadoop/zkdata/myid to 2 and 3 respectively, for example as sketched below.
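A minimal way to do this from storm01 over SSH (assuming the hadoop user can SSH to the other nodes):
# set the node id on the other two ZooKeeper nodes (run from storm01)
ssh storm02 'echo 2 > /home/hadoop/zkdata/myid'
ssh storm03 'echo 3 > /home/hadoop/zkdata/myid'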
8. Copy /home/hadoop/apps/zookeeper-3.4.5 to /home/hadoop/apps on storm02 and storm03:
[hadoop@storm01 ~]$ scp -r /home/hadoop/apps/zookeeper-3.4.5/ storm02:/home/hadoop/apps/
[hadoop@storm01 ~]$ scp -r /home/hadoop/apps/zookeeper-3.4.5/ storm03:/home/hadoop/apps/
9. Start ZooKeeper on all three machines:
[hadoop@storm01 ~]$ cd apps/zookeeper-3.4.5/
[hadoop@storm01 zookeeper-3.4.5]$ bin/zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/apps/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Check ZooKeeper's status:
[hadoop@storm01 zookeeper-3.4.5]$ bin/zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/apps/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
Or check the Java processes:
[hadoop@storm03 zookeeper-3.4.5]$ jps
1599 Jps
1489 QuorumPeerMain
If the service starts and reports its status on every node, the ZooKeeper cluster is up. To start and check all three nodes from one place, see the sketch below.
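A small loop over SSH can start and check ZooKeeper on all three nodes from storm01 (a convenience sketch; it assumes the same install path on zk01-zk03 and that Java is found in non-interactive SSH sessions):
# start, then check, ZooKeeper on all three nodes from storm01
for host in zk01 zk02 zk03; do ssh $host '/home/hadoop/apps/zookeeper-3.4.5/bin/zkServer.sh start'; done
for host in zk01 zk02 zk03; do ssh $host '/home/hadoop/apps/zookeeper-3.4.5/bin/zkServer.sh status'; done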
III. Install Storm
1. Upload the Storm package apache-storm-0.9.6.tar.gz to /home/hadoop, making sure it ends up owned by the hadoop user; for example:
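One possible way to do the upload, assuming the tarball sits in the current directory of your local workstation (adjust the path and tool to taste):
# run on your local workstation, not on the cluster nodes
scp apache-storm-0.9.6.tar.gz hadoop@192.168.33.31:/home/hadoop/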
2. Extract the Storm package into /home/hadoop/apps, rename the extracted directory to storm, and delete the original tarball.
[hadoop@storm01 ~]$ mkdir -p apps/
[hadoop@storm01 ~]$ ll
total 19716
-rw-rw-r--. 1 hadoop hadoop 20183501 Feb 3 10:14 apache-storm-0.9.6.tar.gz
drwxrwxr-x. 2 hadoop hadoop 4096 Feb 3 19:28 apps
[hadoop@storm01 ~]$ chmod -R 755 apps/
[hadoop@storm01 ~]$ tar -zxvf apache-storm-0.9.6.tar.gz -C apps/
[hadoop@storm01 ~]$ ll
total 19716
-rw-rw-r--. 1 hadoop hadoop 20183501 Feb 3 10:14 apache-storm-0.9.6.tar.gz
drwxr-xr-x. 3 hadoop hadoop 4096 Feb 3 19:30 apps
[hadoop@storm01 ~]$ cd apps/
[hadoop@storm01 apps]$ ll
total 4
drwxrwxr-x. 9 hadoop hadoop 4096 Feb 3 19:30 apache-storm-0.9.6
[hadoop@storm01 apps]$ mv apache-storm-0.9.6/ storm/
[hadoop@storm01 apps]$ ll
total 4
drwxrwxr-x. 9 hadoop hadoop 4096 Feb 3 19:30 storm
[hadoop@storm01 apps]$ cd ..
[hadoop@storm01 ~]$ ll
total 19716
-rw-rw-r--. 1 hadoop hadoop 20183501 Feb 3 10:14 apache-storm-0.9.6.tar.gz
drwxr-xr-x. 3 hadoop hadoop 4096 Feb 3 19:31 apps
[hadoop@storm01 ~]$ rm -rf apache-storm-0.9.6.tar.gz
[hadoop@storm01 ~]$ ll
total 4
drwxr-xr-x. 3 hadoop hadoop 4096 Feb 3 19:31 apps
3. Go into the Storm installation directory:
[hadoop@storm01 storm]$ pwd
/home/hadoop/apps/storm
[hadoop@storm01 storm]$ ll
total 124
drwxrwxr-x. 2 hadoop hadoop 4096 Feb 3 19:30 bin
-rw-r--r--. 1 hadoop hadoop 41732 May 29 2015 CHANGELOG.md
drwxrwxr-x. 2 hadoop hadoop 4096 Feb 3 19:30 conf
-rw-r--r--. 1 hadoop hadoop 538 May 29 2015 DISCLAIMER
drwxr-xr-x. 3 hadoop hadoop 4096 May 29 2015 examples
drwxrwxr-x. 5 hadoop hadoop 4096 Feb 3 19:30 external
drwxrwxr-x. 2 hadoop hadoop 4096 Feb 3 19:30 lib
-rw-r--r--. 1 hadoop hadoop 23004 May 29 2015 LICENSE
drwxrwxr-x. 2 hadoop hadoop 4096 Feb 3 19:30 logback
-rw-r--r--. 1 hadoop hadoop 981 May 29 2015 NOTICE
drwxrwxr-x. 6 hadoop hadoop 4096 Feb 3 19:30 public
-rw-r--r--. 1 hadoop hadoop 10987 May 29 2015 README.markdown
-rw-r--r--. 1 hadoop hadoop 6 May 29 2015 RELEASE
-rw-r--r--. 1 hadoop hadoop 3581 May 29 2015 SECURITY.md
4. Go into Storm's conf directory to edit the configuration file:
[hadoop@storm01 storm]$ cd conf
[hadoop@storm01 conf]$ ll
total 8
-rw-r--r--. 1 hadoop hadoop 1128 May 29 2015 storm_env.ini
-rw-r--r--. 1 hadoop hadoop 1613 May 29 2015 storm.yaml
5. Back up the original storm.yaml by renaming it, then create and edit a new storm.yaml.
[hadoop@storm01 conf]$ mv storm.yaml storm.yaml.bak
[hadoop@storm01 conf]$ ll
total 8
-rw-r--r--. 1 hadoop hadoop 1128 May 29 2015 storm_env.ini
-rw-r--r--. 1 hadoop hadoop 1613 May 29 2015 storm.yaml.bak
[hadoop@storm01 conf]$ vi storm.yaml
"storm.yaml" [New File]
storm.zookeeper.servers:
- "zk01"
- "zk02"
- "zk03"
# the node in the Storm cluster that runs the nimbus daemon
nimbus.host: "storm01"
# maximum JVM heap size for the nimbus daemon
nimbus.childopts: "-Xmx1024m"
# maximum JVM heap size for each supervisor daemon
supervisor.childopts: "-Xmx1024m"
# maximum JVM heap size for each worker launched on a supervisor node
worker.childopts: "-Xmx768m"
# maximum JVM heap size for the ui daemon; the ui usually runs on the same node as nimbus
ui.childopts: "-Xmx768m"
# ports on each supervisor node used to launch workers; each port is one slot, and each slot runs one worker
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
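Before distributing the file, it can be worth checking that Storm parses it; the storm client's localconfvalue command prints a single value from the local storm.yaml (an optional sanity check):
# optional: read a value back from the local storm.yaml
cd /home/hadoop/apps/storm
bin/storm localconfvalue nimbus.host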
6. Distribute /home/hadoop/apps/storm from storm01 to /home/hadoop/apps/ on storm02, storm03, and storm04:
scp -r /home/hadoop/apps/storm/ storm02:/home/hadoop/apps/
scp -r /home/hadoop/apps/storm/ storm03:/home/hadoop/apps/
scp -r /home/hadoop/apps/storm/ storm04:/home/hadoop/apps/
IV. Start the cluster
1. Start ZooKeeper first (see section II, step 9).
2. Start the nimbus service on the node specified by nimbus.host, i.e. storm01:
[hadoop@storm01 ~]$ cd /home/hadoop/apps/storm/bin
[hadoop@storm01 bin]$ nohup ./storm nimbus &
[1] 1603
[hadoop@storm01 bin]$ nohup: ignoring input and appending output to `nohup.out'
3. Start the ui service on storm01:
[hadoop@storm01 bin]$ nohup ./storm ui &
[2] 1665
[hadoop@storm01 bin]$ nohup: ignoring input and appending output to `nohup.out'
[hadoop@storm01 ~]$ jps
1733 Jps
1665 core
1603 nimbus
1518 QuorumPeerMain
4. Start the supervisor service on storm02, storm03, and storm04:
[hadoop@storm02 ~]$ cd /home/hadoop/apps/storm/bin
[hadoop@storm02 bin]$ nohup ./storm supervisor &
[1] 1647
[hadoop@storm02 bin]$ nohup: ignoring input and appending output to `nohup.out'
[hadoop@storm02 ~]$ jps
1717 Jps
1647 supervisor
1503 QuorumPeerMain
If the following environment variables are configured on every node:
[hadoop@storm01 ~]$ sudo vi /etc/profile
export JAVA_HOME=/usr/local/jdk1.7.0_45
export PATH=$PATH:$JAVA_HOME/bin
export STORM_HOME=/home/hadoop/apps/storm
export PATH=$PATH:$STORM_HOME/bin
[hadoop@storm01 ~]$ source /etc/profile
then the services can be started like this:
1. Start nimbus and the ui:
[hadoop@storm01 ~]$ nohup storm nimbus &
[hadoop@storm01 ~]$ nohup storm ui &
2. Start the supervisors:
[hadoop@storm04 ~]$ nohup storm supervisor &
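After everything is started, the daemons on all four nodes can be checked from storm01 in one pass (a convenience sketch; it uses the JDK path from /etc/profile above because non-interactive SSH sessions do not source that file):
# list the Java processes on every node from storm01
for host in storm01 storm02 storm03 storm04; do echo "== $host"; ssh $host '/usr/local/jdk1.7.0_45/bin/jps'; done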
From a client machine, open storm01's web UI to check the state of the Storm cluster:
http://192.168.33.31:8080/
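The UI also exposes a REST API (available since Storm 0.9.2), so the cluster summary can be fetched from the command line as well; for example:
curl http://192.168.33.31:8080/api/v1/cluster/summary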
Go into the Storm installation directory and submit a sample topology.
Without the environment variables configured:
[hadoop@storm01 storm]$ bin/storm jar examples/storm-starter/storm-starter-topologies-0.9.6.jar storm.starter.WordCountTopology wordcount
With the environment variables configured:
[hadoop@storm01 ~]$ storm jar /home/hadoop/apps/storm/examples/storm-starter/storm-starter-topologies-0.9.6.jar storm.starter.WordCountTopology wordcount
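Once the topology is submitted, the standard storm client commands can list running topologies and stop the sample when you are done (assuming the environment variables are set; otherwise use bin/storm from the install directory):
# wordcount is the topology name given at submit time
storm list
storm kill wordcount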