[Cluster Automation 4] Installing a Hadoop Cluster with Docker
2019-02-08
LinuxMan_yang
Preface:
This post presents a Hadoop auto-install script that builds a full Hadoop cluster (HDFS, YARN, HBASE, HIVE, SPARK, HUE, Jupyter-notebook, WeaveScope, etc.) on top of 5 KVM virtual machines and Docker.
Component versions
HDFS 3.1.1
YARN 3.1.1
HBASE 2.1.1
HIVE 3.1.1
SPARK 2.4.0
HUE 4.3.0
Jupyter-notebook 4.4.0
WeaveScope 1.10.1
MySQL: MariaDB 10.3.11
Solr: 7.6.0
Livy: 0.5.0
OOZIE: 5.0.0
Myweb: home-grown service index and service status monitor
Architecture and service layout
| nn1 | nn2 | dn1 | dn2 | dn3 |
|---|---|---|---|---|
| zookeeper1 | zookeeper2 | zookeeper3 | | |
| JournalNode | JournalNode | JournalNode | | |
| NameNode | NameNode | DataNode | DataNode | DataNode |
| NodeManager | NodeManager | NodeManager | NodeManager | NodeManager |
| ResourceManager | ResourceManager | | | |
| HbaseMaster&Rest&Thrift | HiveMetadata&HiveServer2 | HbaseRegion | HbaseRegion | HbaseRegion |
| SparkMaster | SparkWorker | SparkWorker | SparkWorker | |
| YarnHistory&WebProxy | SparkHistory&Livy&Solr&Oozie | | | |
| Myweb | Hue&Mysql&Jupyter | | | |
| WeaveScope | WeaveScope | WeaveScope | WeaveScope | WeaveScope |
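The table places a three-member ZooKeeper ensemble on nn1, nn2 and dn1; HA for HDFS and YARN depends on two of the three staying alive. A quick liveness sketch using ZooKeeper's `ruok` four-letter command; the host names here are the table's role names (not real DNS entries), so substitute the IPs from your CONFIG. The function only prints the probe commands; pipe its output to `sh` on the cluster to run them.

```shell
# Print "ruok" probes for each ZooKeeper member; a healthy server answers "imok".
# Host names are assumptions taken from the layout table; replace with real IPs.
zk_check() {
  for host in "$@"; do
    echo "echo ruok | nc $host 2181"   # expect: imok
  done
}
zk_check nn1 nn2 dn1   # echoes the probe commands; pipe to sh to execute
```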
Preparation

- Clone 5 virtual machines (see the earlier post in this series, [Infrastructure 14: Turning CentOS 7 into CoreOS]); raise nn2's memory to 8 GB.
- Download the required Docker images (the 9 images the installer uses) and extract them into the directory cloned in the next step.
- Fetch the install scripts: git clone https://github.com/Thomas-YangHT/hadoopHA-autoins.git
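The image-distribution step above is what install.sh's `p0|pimages` option automates. A rough sketch of the equivalent manual loop; the node names, the `images/` path, and root SSH access are assumptions for illustration, not the script's actual code.

```shell
# Copy every downloaded image tarball to each node and docker-load it there.
# NODES and the tarball directory are assumptions; adjust to your CONFIG.
NODES="nn1 nn2 dn1 dn2 dn3"
push_and_load() {
  for node in $NODES; do
    for tgz in "$1"/*.tgz; do
      [ -e "$tgz" ] || continue   # skip cleanly if no tarballs are present
      echo "scp $tgz root@$node:/tmp/ && ssh root@$node docker load -i /tmp/${tgz##*/}"
    done
  done
}
push_and_load images   # echoes the commands; pipe the output to sh to execute
```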
Configuration
cd hadoopHA-autoins; vim CONFIG   # set the IPs of the 5 machines
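Since the installer drives all five machines over SSH/SCP, it is worth confirming that the IPs you wrote into CONFIG are reachable before running it. A minimal sketch; the example IPs are placeholders, not values from the repo.

```shell
# Ping each cluster IP once and report reachability.
check_nodes() {
  for ip in "$@"; do
    if ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
      echo "$ip reachable"
    else
      echo "$ip UNREACHABLE"
    fi
  done
}
# Placeholder IPs; use the five addresses from your CONFIG.
check_nodes 192.168.122.11 192.168.122.12 192.168.122.13 192.168.122.14 192.168.122.15
```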
Installation
sh install.sh all   # finishes in about 10 minutes
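While the install runs, you can watch containers come up on each node. A hedged helper that only prints the `docker ps` commands to run (node names are assumptions; pipe the output to `sh` to execute):

```shell
# Print a per-node "docker ps" command to inspect container names and status.
watch_containers() {
  for node in "$@"; do
    echo "== $node =="
    echo "ssh root@$node docker ps --format '{{.Names}}: {{.Status}}'"
  done
}
watch_containers nn1 nn2 dn1 dn2 dn3   # echoes the commands; pipe to sh to run
```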
Verification
Open the link printed at the end of the install: http://<your nn1 IP>
Screenshots (omitted here) show the service index page, the service-port status page, and the web UIs of HDFS, YARN, HBASE, SPARK, HUE, Solr, Oozie, Jupyter-notebook, and WeaveScope.
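You can reproduce a rough version of the port-status page from a shell by probing the usual web-UI ports on nn1. The port numbers below are the stock defaults for these component versions (HDFS NameNode 9870, YARN ResourceManager 8088, HBase Master 16010, Spark Master 8080, Hue 8888) and may differ in this build; `nn1` is a placeholder for your nn1 IP.

```shell
# Probe a TCP port using bash's /dev/tcp pseudo-device; the redirect in the
# subshell succeeds only if something is listening on host:port.
check_port() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed"
  fi
}
for p in 9870 8088 16010 8080 8888; do
  check_port nn1 "$p"   # replace nn1 with your nn1 IP from CONFIG
done
```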
More install.sh usage:
usage: install.sh [option]
option:
p0|pimages :cp&load all tgz&images to all nodes.
p|pconfig :cp config&shell to all nodes.
zookeeper :install zookeeper cluster on ZKX
journalnode :install JN on JNX
format :format ZKFC&NameNode on nn1
startnn1 :start NN/ZKFC/RM on nn1
standby :sync namenode info on nn2
startnn2 :start NN/ZKFC/RM on nn2
datanode :start datanode on DNX
nodemanager :start nodemanager on all nodes
hmaster :HBASE master
hregion :HBASE region
spark :start spark master
sparkslave :start spark slaves
oozie :oozie for schedule jobs
hue :HUE manager page
scope :weavescope monitor
myweb :index for all services
genindex :generate svc-hadoop.html
finish :print finish page
status :get status of NameNode&ResourceManager(NN&RM) & zookeeper
timezone8 :set timezone CST-8
route :add route temporarily
all :install all units
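The `status` option reports NameNode/ResourceManager HA state; the standard Hadoop 3 commands for that are `hdfs haadmin -getServiceState` and `yarn rmadmin -getServiceState`. A sketch that only prints the commands; the HA service IDs `nn1`/`nn2` and `rm1`/`rm2` are assumptions inferred from the layout table, and must match the IDs in your hdfs-site.xml and yarn-site.xml.

```shell
# Print the HA state queries for both NameNodes and both ResourceManagers.
# Run the printed commands on nn1; each answers "active" or "standby".
ha_status_cmds() {
  for id in nn1 nn2; do echo "hdfs haadmin -getServiceState $id"; done
  for id in rm1 rm2; do echo "yarn rmadmin -getServiceState $id"; done
}
ha_status_cmds   # echoes the commands; pipe to sh on the cluster to execute
```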