Complete Hadoop VM Setup Guide
Hadoop Cluster Installation
1 Overview
HDFS (bundled with Hadoop) 2.8.5
YARN (bundled with Hadoop) 2.8.5
Zookeeper 3.4.10
HBase 2.0.3
Hive 1.2.2
Java 1.8.0_191 (with Java 11 the HDFS web file browser has a bug)
Host OS: Windows 10 with VMware 15; the virtual machines run CentOS 7
(1) Hosts file mapping
Windows hosts file (C:\Windows\System32\drivers\etc\hosts) and
CentOS hosts file (/etc/hosts) both get the entries below:
192.168.10.101 hadoop1
192.168.10.102 hadoop2
192.168.10.103 hadoop3
192.168.10.101 hadoop1.com
192.168.10.102 hadoop2.com
192.168.10.103 hadoop3.com
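A quick way to confirm the mapping works once the entries are in place (run from a CentOS shell; on Windows use ping -n instead of -c):
ping -c 1 hadoop1
ping -c 1 hadoop2
ping -c 1 hadoop3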
(2) Static IP configuration
VMware -> Edit -> Virtual Network Editor -> VMnet8
Choose NAT mode and disable the DHCP service
Set the subnet to 192.168.10.0 with netmask 255.255.255.0
In NAT Settings, set the gateway to 192.168.10.2
For each of the three VMs, set the network adapter to Custom: VMnet8
On the Windows host, open the Network and Sharing Center, edit the VMnet8 adapter, and configure IPv4 as
IP: 192.168.10.1  Gateway: 192.168.10.2  Netmask: 255.255.255.0
Next, give each VM a static IP
The VM IPs I use are:
hadoop1 192.168.10.101
hadoop2 192.168.10.102
hadoop3 192.168.10.103
vim /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.10.103
GATEWAY=192.168.10.2
NETMASK=255.255.255.0
DNS1=192.168.10.2
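The file above is hadoop3's; only IPADDR differs per node (.101/.102/.103). After saving, the new address has to be applied; on CentOS 7 with network-scripts this is typically:
systemctl restart network
ip addr show ens33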
(3) Disable the firewall
These VMs run CentOS 7, where the default firewall is firewalld (the older service iptables stop / chkconfig iptables off commands only apply if the iptables-services package is in use):
Stop the firewall: systemctl stop firewalld
Disable it at boot: systemctl disable firewalld
(4) Passwordless SSH between the nodes
cd /root/.ssh
ssh-keygen -t rsa
cat id_rsa.pub >> authorized_keys
ssh-copy-id -i hadoop1    (run on every machine so that each node's public key ends up on hadoop1)
chmod 600 authorized_keys
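ssh-copy-id aggregates every node's public key into hadoop1's authorized_keys; for logins to the other nodes to work as well, that file is then pushed back out (which is why hadoop3's authorized_keys below contains all three keys). A minimal sketch, run on hadoop1:
scp /root/.ssh/authorized_keys hadoop2:/root/.ssh/
scp /root/.ssh/authorized_keys hadoop3:/root/.ssh/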
[root@hadoop3 .ssh]# more authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDBWG+cVAf07PQ7x9H++Cz3pgCx5jvbdIt2nLF+DUfwkA94B3Er82DNqOAwbbPJdL+wFlvMgnr5phGmrws65Z
j0WqEycLt167IhwvllOqoOmQHUPoGoa3mIQk7CU12KM80xNgKfmMh25SpSfrIY4OYzlGagV5iuY9Jf0odKB2N9EISxKNIFysc6P5aQw+8ZQebdfxtZJ+VXi9yP
ZARsY/WpQtORooacWJ61ybQFJ/drkzz89xDx1zi0qKrL73EaI0TeY3bmzEM7eptmkWX/SNXnMBECiqt7IYszSgZJrY8/Tnkj93plEJL/aOhdzLVyrVnqI5KzLG
R4UFoOC1sZzLUZ root@hadoop1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUGzXK4hasRrjvT7EB8uk/NzpAT/zt+ti2rHIo1HtFfNlwyDAYeqtJUb1zvuzyYtJlzWtnZi8aSeIs2rbyFS
uu7fX/kixRrxUQ/szX00vMqAER/Zt8uW4u4WnxQjHr8/Q/EuIjrAV6YU7AaWa15+Hrz9uHsXl26x9umjYZytevkpQKKh9nthMJg0so/6Fg0azO7tFiJKUp4VEt
vC/n8ceiG7bnZHcIdvQfAWJGW9uatB/GzFeobA1/9nObeMpsOK0ZLY7utG8q/ZhgnXeG/tN90AzjV04G144V6DDeOm/lutZwX17G9NRmnioH2gstx+mswUSfIu
KNwkRe+hSuPdYX root@hadoop2
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCdrqjg+Dk63yBtqPULIzvV8veRrkMYPZeostbwHCdj4I0TxzyfPWsIXkanXPbSLbsU6QGfA/dlDNOf17XOwR
qSReH15/G3xhdcMtU43PjcQznOX1iKZaXKpaPpeIrMdJDlu53x1pbGE64vovzhjj0CRIuxPs4hUKi4RqDtJlmSfbaL9/lUunAwNHhsw3uAuV5XmDU8+vAiTtO0
lnhyU9mB1fJfEx0CAzFFJm5Aw+dv2dufhPNkESIrbwuG/CUMfiNbfCthNxtVc0q2+bVLosIpc5aFV4O3p8WEN+IFuWiWRQl+7ApxaHyYzhboan6HRuE5c7QDwW
jUHSSULlAtXiur root@hadoop3
(5) Startup
HDFS: start-dfs.sh
YARN: start-yarn.sh
Zookeeper: started with a script, sh zkmanage.sh start (zkmanage.sh is shown below)
#!/bin/bash
# zkmanage.sh: forward the given action (start/stop/...) to zkServer.sh on every node
for host in hadoop1 hadoop2 hadoop3
do
echo "${host}:${1}..."
ssh $host "source /etc/profile;/root/apps/zookeeper-3.4.10/bin/zkServer.sh $1"
done
sleep 2
# after a short pause, report each node's status
for host in hadoop1 hadoop2 hadoop3
do
ssh $host "source /etc/profile;/root/apps/zookeeper-3.4.10/bin/zkServer.sh status"
done
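Because the script simply forwards its first argument to zkServer.sh, the same script also stops or queries the ensemble:
sh zkmanage.sh stop
sh zkmanage.sh status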
HBase: bin/start-hbase.sh
Hive: bin/hive
(6) Check the processes
Hadoop1:
[root@hadoop1 ~]# jps
126769 ResourceManager
126368 NameNode
5314 RunJar
127346 HMaster
129319 Jps
127480 HRegionServer
121467 Elasticsearch
126173 QuorumPeerMain
126876 NodeManager
126478 DataNode
Hadoop2:
[root@hadoop2 ~]# jps
99015 SecondaryNameNode
86742 Elasticsearch
102601 HRegionServer
112409 Jps
96029 QuorumPeerMain
100541 NodeManager
98524 DataNode
hdfs: NameNode, DataNode, SecondaryNameNode
yarn: ResourceManager, NodeManager
zookeeper: QuorumPeerMain
hbase: HMaster, HRegionServer
hive: RunJar
Elasticsearch: Elasticsearch
(7) Web UIs
hdfs: http://hadoop1:50070
yarn: http://hadoop1:8088/cluster
Elasticsearch head plugin: http://hadoop1:9100/
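If no browser is handy, a quick reachability check from the shell (same hosts and ports as above):
curl -sI http://hadoop1:50070 | head -n 1
curl -sI http://hadoop1:8088/cluster | head -n 1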
2 HDFS and YARN configuration
All of the files below live in hadoop-2.8.5/etc/hadoop
hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_191
slaves
192.168.10.102
192.168.10.103
192.168.10.101
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.10.101:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/root/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/root/dfs/data</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop2:50090</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/root/dfs/namesecondary</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
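In the Hadoop 2.x distribution this file is usually shipped only as a template, so it may first have to be created from it:
cp mapred-site.xml.template mapred-site.xml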
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
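These files only need to be edited once if they are then synced to the other nodes, and the NameNode has to be formatted once before the very first start-dfs.sh. A minimal sketch, run on hadoop1, assuming Hadoop is installed under /root/apps/hadoop-2.8.5 (same layout as the Zookeeper script above) and its bin directory is on the PATH:
scp -r /root/apps/hadoop-2.8.5/etc/hadoop hadoop2:/root/apps/hadoop-2.8.5/etc/
scp -r /root/apps/hadoop-2.8.5/etc/hadoop hadoop3:/root/apps/hadoop-2.8.5/etc/
hdfs namenode -format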
3 Zookeeper
All of the files below live in zookeeper-3.4.10/conf
zoo.cfg
dataDir=/root/zkdata
# the port at which the clients will connect
clientPort=2181
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
Give each machine its ID (every node gets its own myid file under dataDir):
mkdir /root/zkdata
echo 1 > /root/zkdata/myid    (on hadoop1)
echo 2 > /root/zkdata/myid    (on hadoop2)
echo 3 > /root/zkdata/myid    (on hadoop3)
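With passwordless SSH in place, the three myid files can also be written in one pass from hadoop1; a small sketch (the numbers must match the server.N entries in zoo.cfg):
id=1
for host in hadoop1 hadoop2 hadoop3
do
ssh $host "mkdir -p /root/zkdata; echo $id > /root/zkdata/myid"
id=$((id+1))
done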
4 HBase
All of the files below live in hbase-2.0.3/conf
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop1:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
</configuration>
regionservers
hadoop1
hadoop2
hadoop3
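Not shown above: hbase-env.sh typically also needs JAVA_HOME, and HBASE_MANAGES_ZK set to false so HBase uses the external Zookeeper quorum configured above instead of launching its own:
export JAVA_HOME=/usr/java/jdk1.8.0_191
export HBASE_MANAGES_ZK=false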
5 Hive
MySQL must be installed first (Hive keeps its metastore there)
MySQL 5.7 is fine; the matching JDBC connector JAR also needs to be downloaded
Hive configuration
All of the files below live in hive-1.2.2/conf
hive-site.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop1:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
</configuration>
Copy a MySQL driver JAR into the lib directory of the Hive installation (I used mysql-connector-java-5.1.47.jar; the 8.0 connector is said to be compatible with everything)
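For example, assuming Hive is installed under /root/apps/hive-1.2.2 (the path is an assumption) and the JAR was downloaded to the current directory:
cp mysql-connector-java-5.1.47.jar /root/apps/hive-1.2.2/lib/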
Run Hive as a service
nohup bin/hiveserver2 1>/dev/null 2>&1 &
On another machine, open a command line against hadoop1 (hiveserver2 must already be running):
beeline -u jdbc:hive2://hadoop1:10000 -n root
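A quick smoke test without an interactive session (beeline's -e runs a single statement):
beeline -u jdbc:hive2://hadoop1:10000 -n root -e "show databases;"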