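Fully Configuring Hadoop HA
Node layout (NN = NameNode, DN = DataNode, JN = JournalNode, ZK = ZooKeeper, ZKFC = ZKFailoverController, RM = ResourceManager; a 1 marks a role that runs on that host)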
NN DN JN ZK ZKFC RM
master 1 1
slave1 1 1 1 1 1
slave2 1 1 1 1
slave3 1 1 1
1. Time synchronization
date -s "2015-5-8 19:48:00" (the exact value does not matter, as long as every machine ends up with the same time)
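Setting the clock by hand only helps if the machines really end up close to each other. The following is a minimal sketch for checking that, assuming passwordless SSH from the current host already works (step 4) and using the node names from this article. The helper names max_skew and check_cluster_clocks are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the largest difference (in seconds) between
# the epoch timestamps passed as arguments.
max_skew() {
    min=$1
    max=$1
    for t in "$@"; do
        [ "$t" -lt "$min" ] && min=$t
        [ "$t" -gt "$max" ] && max=$t
    done
    echo $((max - min))
}

# Collect `date +%s` from each node over SSH and report the spread.
# HOSTS defaults to the node names used in this article (an assumption).
# The SSH round trips themselves add a little noise to the result.
check_cluster_clocks() {
    stamps=""
    for h in ${HOSTS:-master slave1 slave2 slave3}; do
        stamps="$stamps $(ssh "$h" date +%s)"
    done
    # Word splitting on $stamps is intentional here.
    max_skew $stamps
}
```

Calling check_cluster_clocks prints the spread in seconds. A result of a few seconds is usually just SSH latency, while a spread of minutes suggests the date -s step was missed on some node.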
Disable the firewall
service iptables stop (this does not persist, so run it again after every reboot)
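2. Install the JDK
3. Unpack Hadoop
4. Passwordless SSH
From master to all nodes
From slave1 to all nodes
The two NNs also need passwordless SSH to each other (see the earlier article for details)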
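On slave1 (the same steps apply on master):
1. cd ~/.ssh
2. ssh-keygen -t rsa (press Enter at each prompt to accept the defaults)
The private key id_rsa and the public key id_rsa.pub now appear in this directory.
3. ssh-copy-id slave1; ssh-copy-id slave2; ssh-copy-id slave3; ssh-copy-id master
This copies the generated public key to every node (including master).
An authorized_keys file is created in the .ssh directory on each target node, which is what allows login without a password.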
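The key steps above can be wrapped in a small loop so they are easy to repeat on both NameNode hosts. This is only a sketch: the run and distribute_key helpers and the DRY_RUN switch are assumptions added for illustration, and the host list is taken from the node names in this article.

```shell
#!/bin/sh
# Sketch: distribute this node's public key to every node in the cluster.
# Run it once on master and once on slave1 (the two NameNode hosts).
# DRY_RUN=1 only prints the commands instead of running them.

HOSTS="${HOSTS:-master slave1 slave2 slave3}"   # node names from this article
KEY="${KEY:-$HOME/.ssh/id_rsa}"                 # default ssh-keygen location

# Hypothetical wrapper: execute a command, or just print it in dry-run mode.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

distribute_key() {
    # Generate a key pair only if none exists yet.
    # -N "" sets an empty passphrase, -f sets the output file.
    if [ ! -f "$KEY" ]; then
        run ssh-keygen -t rsa -N "" -f "$KEY"
    fi
    for h in $HOSTS; do
        # ssh-copy-id appends the public key to ~/.ssh/authorized_keys on $h.
        run ssh-copy-id -i "$KEY.pub" "$h"
    done
}
```

Running DRY_RUN=1 distribute_key first shows what would be executed. Each ssh-copy-id call still asks for the target node's password once, since that is the last time it is needed.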
5. Edit the configuration files
5.1 Set the Hadoop environment variables
export HADOOP_HOME=(directory where Hadoop was unpacked)
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
5.2 Configure hdfs-site.xml
<property>
<name>dfs.nameservices</name>
<value>sxt</value>
</property>
<property>
<name>dfs.ha.namenodes.sxt</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.sxt.nn1</name>
<value>master:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.sxt.nn2</name>
<value>slave1:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.sxt.nn1</name>
<value>master:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.sxt.nn2</name>
<value>slave1:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://slave1:8485;slave2:8485;slave3:8485/sxt</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.sxt</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/journal/data</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
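A typo in a property name in hdfs-site.xml is silently ignored, so it can help to ask Hadoop which values it actually resolved. A minimal sketch, assuming the hdfs command is on the PATH and the nameservice is called sxt as above. The function name show_ha_conf is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the HA-related settings as Hadoop resolves them.
# `hdfs getconf -confKey <key>` prints the effective value of one key.
show_ha_conf() {
    for key in dfs.nameservices \
               dfs.ha.namenodes.sxt \
               dfs.namenode.shared.edits.dir \
               dfs.ha.automatic-failover.enabled; do
        printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
    done
}
```

If show_ha_conf prints an empty value or a default for any of these keys, the corresponding property block is probably misspelled or sits in the wrong file.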
5.3 Configure core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://sxt</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
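The ha.zookeeper.quorum value is a plain comma-separated list of host:port pairs. Once ZooKeeper is running (step 10), a quick reachability probe of each member can rule out hostname or port mistakes. This is a sketch under assumptions: the helper names are hypothetical, and it relies on an nc variant that supports -z and -w.

```shell
#!/bin/sh
# Split a ZooKeeper quorum string ("host:port,host:port,...") into
# one "host port" pair per line.
quorum_hosts() {
    echo "$1" | tr ',' '\n' | tr ':' ' '
}

# Probe each quorum member's client port.
# Assumes an `nc` that supports -z (connect without sending data)
# and -w (timeout in seconds).
check_quorum() {
    quorum_hosts "$1" | while read -r host port; do
        # </dev/null keeps nc from consuming the loop's input.
        if nc -z -w 2 "$host" "$port" </dev/null; then
            echo "$host:$port reachable"
        else
            echo "$host:$port NOT reachable"
        fi
    done
}
```

For the configuration above the call would be check_quorum "master:2181,slave1:2181,slave2:2181". A reachable port only shows that something is listening, not that the ensemble has elected a leader.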
5.4 Configure yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>sxt2yarn</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>slave2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>slave3</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>slave2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>slave3:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
5.5 Configure mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>fireslate.cis.umac.mo:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>fireslate.cis.umac.mo:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
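Notes
Set JAVA_HOME in hadoop-env.sh.
hadoop.tmp.dir in core-site.xml is /opt/hadoop: make sure this directory is empty or does not exist on every server.
The slaves file lists the DNs (slave2, slave3).
Delete the masters file on every server.
The configuration files must be identical on every server.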
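The last note is easy to get wrong after a late edit on a single node. One way to check it is to compare checksums of a file across all nodes. This is a sketch, assuming passwordless SSH, md5sum on every node, and the node names from this article. The helper names are hypothetical, and the path in the usage note assumes the Hadoop 2.x layout where the config lives under etc/hadoop.

```shell
#!/bin/sh
HOSTS="${HOSTS:-master slave1 slave2 slave3}"   # node names from this article

# Read md5sum-style lines on stdin and print how many distinct hashes occur.
count_distinct() {
    awk '{print $1}' | sort -u | wc -l | tr -d ' '
}

# Succeed (exit status 0) only if the given file has the same checksum
# on every host in $HOSTS.
check_conf_identical() {
    file=$1
    n=$(for h in $HOSTS; do ssh "$h" md5sum "$file"; done | count_distinct)
    [ "$n" = "1" ]
}
```

A possible call is check_conf_identical "$HADOOP_HOME/etc/hadoop/hdfs-site.xml" || echo "hdfs-site.xml differs", repeated for core-site.xml, yarn-site.xml, mapred-site.xml and slaves.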
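7. Start the JNs: on slave1, slave2 and slave3 run hadoop-daemon.sh start journalnode
8. Format the NN on one NameNode only (master) with hdfs namenode -format, then start the NN on that server: hadoop-daemon.sh start namenode
9. Sync: on the other, unformatted NN (slave1) run hdfs namenode -bootstrapStandby
10. Start the ZK cluster: on master, slave1 and slave2 run zkServer.sh start
11. Format ZK: on one NN (master) run hdfs zkfc -formatZK
12. Start HDFS: start-dfs.sh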
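The order of steps 7 to 12 matters, and each command belongs on a specific host. The sketch below writes the sequence down in one place. It assumes passwordless SSH and that the Hadoop and ZooKeeper scripts are on the PATH of non-interactive SSH sessions, which is not guaranteed. The on and first_start helpers are hypothetical, and DRY_RUN defaults to 1 because formatting is destructive.

```shell
#!/bin/sh
# Hypothetical helper: run a command on a host over SSH, or only print it
# when DRY_RUN is 1 (the default, since formatting wipes existing metadata).
on() {
    host=$1
    shift
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "[$host] $*"
    else
        ssh "$host" "$*"
    fi
}

# First-time bring-up only. The subshell with `set -e` stops at the first
# command that fails instead of continuing with a half-initialized cluster.
first_start() (
    set -e
    for h in slave1 slave2 slave3; do           # step 7: JournalNodes first
        on "$h" hadoop-daemon.sh start journalnode
    done
    on master hdfs namenode -format             # step 8: format one NN only
    on master hadoop-daemon.sh start namenode
    on slave1 hdfs namenode -bootstrapStandby   # step 9: copy metadata to standby
    for h in master slave1 slave2; do           # step 10: ZooKeeper ensemble
        on "$h" zkServer.sh start
    done
    on master hdfs zkfc -formatZK               # step 11: create the HA znode
    on master start-dfs.sh                      # step 12: start HDFS
)
```

Calling first_start with the default settings prints the eleven commands with their target hosts. Setting DRY_RUN=0 would actually execute them, which should only ever happen once per cluster.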
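The cluster is now set up.
To restart the whole cluster after the initial setup, start it in this order:
1. Start the ZK cluster: zkServer.sh start (on master, slave1 and slave2)
2. Start HDFS: start-dfs.sh
The ResourceManagers have to be started separately: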
yarn-daemons.sh start resourcemanager
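After a start or restart it is worth confirming that exactly one NameNode and one ResourceManager are active. A minimal sketch, assuming the hdfs and yarn commands are on the PATH and the IDs nn1, nn2, rm1 and rm2 from the configuration above. The function name ha_states is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the HA state of both NameNodes and both ResourceManagers.
# `hdfs haadmin -getServiceState <id>` and `yarn rmadmin -getServiceState <id>`
# each print "active" or "standby" for the given service ID.
ha_states() {
    for id in nn1 nn2; do
        echo "NameNode $id: $(hdfs haadmin -getServiceState "$id")"
    done
    for id in rm1 rm2; do
        echo "ResourceManager $id: $(yarn rmadmin -getServiceState "$id")"
    done
}
```

If both NameNodes report standby, the ZKFC processes or the ZooKeeper ensemble are the first things to check, since automatic failover depends on them.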