Big Data Study Notes

03_HADOOP_03_HDFS Distributed Environment Setup

2019-08-18  超级小小张

In the previous section we deployed Hadoop in pseudo-distributed mode. A distributed environment simply splits the different Hadoop roles across separate machines.

Environment preparation

Machine role distribution
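The role layout used in this setup, as can be read off the configuration in the later steps:

node01  NameNode
node02  SecondaryNameNode + DataNode
node03  DataNode
node04  DataNode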

1. Configure the JDK
2. Passwordless SSH: distribute node01's public key to the other machines (key generation is sketched below if node01 does not have one yet)
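If node01 does not already have a key pair carried over from the previous section, one way to generate it (a DSA key, matching the id_dsa.pub used below) is:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa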

On node01, distribute the public key to the other nodes:
scp ~/.ssh/id_dsa.pub  root@node02:~/.ssh/node01.pub
scp ~/.ssh/id_dsa.pub  root@node03:~/.ssh/node01.pub
scp ~/.ssh/id_dsa.pub  root@node04:~/.ssh/node01.pub

On each of the other machines, append the public key to the authorized keys file:
cat ~/.ssh/node01.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys

From node01, log in to the other nodes to verify that passwordless login works:
[root@node01 ~]# ssh node02
Last login: Sun Aug 18 17:11:46 2019 from localhost
[root@node02 ~]# exit
logout
Connection to node02 closed.
[root@node01 ~]# ssh node03
Last login: Sun Aug 18 17:11:59 2019 from localhost
[root@node03 ~]# exit
logout
Connection to node03 closed.
[root@node01 ~]# ssh node04
Last login: Sun Aug 18 17:12:16 2019 from localhost
[root@node04 ~]# exit
logout
Connection to node04 closed.
[root@node01 ~]# 

3. Back up $HADOOP_PREFIX/etc/hadoop so the pseudo-distributed configuration is preserved

cd $HADOOP_PREFIX/etc
cp -r hadoop hadoop-local

4. If you are setting up the distributed environment directly, refer to the previous section for the secondary PATH and JAVA_HOME configuration; a rough sketch follows below
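A rough sketch of that secondary configuration, with example paths (adjust to your own JDK and Hadoop locations):

# In $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh, point JAVA_HOME at an absolute path,
# since the remote ssh sessions started by start-dfs.sh do not load /etc/profile:
export JAVA_HOME=/usr/java/default

# In /etc/profile, add the Hadoop binaries to PATH:
export HADOOP_PREFIX=/opt/hadoop-2.6.5
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin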
5. Configure core-site.xml

<configuration>
    <!-- Specify the NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node01:9000</value>
    </property>
    <!-- Override the default hadoop.tmp.dir; otherwise it falls back to /tmp, where data is easily lost. Other settings resolve against this value. -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/hadoop/full</value>
    </property>
</configuration>
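For reference, the "other settings" mentioned in the comment are mainly the HDFS storage directories, whose defaults resolve against hadoop.tmp.dir; with the value above they end up as:

# dfs.namenode.name.dir        -> file:///var/hadoop/full/dfs/name           (NameNode metadata)
# dfs.namenode.checkpoint.dir  -> file:///var/hadoop/full/dfs/namesecondary  (SecondaryNameNode checkpoints)
# dfs.datanode.data.dir        -> file:///var/hadoop/full/dfs/data           (DataNode blocks)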

6. Configure hdfs-site.xml

<configuration>
    <!-- Number of block replicas -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Location of the SecondaryNameNode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node02:50090</value>
    </property>
</configuration>

7. Configure the slaves file with the DataNode hosts (a way to write the file is sketched after the list)

node02
node03
node04
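The slaves file lives in $HADOOP_PREFIX/etc/hadoop; one way to write it on node01, assuming the hostnames above:

cd $HADOOP_PREFIX/etc/hadoop
cat > slaves <<EOF
node02
node03
node04
EOF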

8. Copy the /opt/hadoop-2.6.5 directory to the other nodes

scp -r /opt/hadoop-2.6.5/ root@node02:/opt/
scp -r /opt/hadoop-2.6.5/ root@node03:/opt/
scp -r /opt/hadoop-2.6.5/ root@node04:/opt/

9. Copy /etc/profile to the other nodes

scp /etc/profile root@node02:/etc/
scp /etc/profile root@node03:/etc/
scp /etc/profile root@node04:/etc/

10. On each of the other machines, source /etc/profile so the new settings take effect

.   /etc/profile

11. Format the HDFS NameNode file system (command below)
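On node01, the file system is formatted once before the first start:

hdfs namenode -format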
12. Start HDFS

[root@node01 ~]# start-dfs.sh 
Starting namenodes on [node01]
node01: starting namenode, logging to /opt/hadoop-2.6.5/logs/hadoop-root-namenode-node01.out
node04: starting datanode, logging to /opt/hadoop-2.6.5/logs/hadoop-root-datanode-node04.out
node03: starting datanode, logging to /opt/hadoop-2.6.5/logs/hadoop-root-datanode-node03.out
node02: starting datanode, logging to /opt/hadoop-2.6.5/logs/hadoop-root-datanode-node02.out
Starting secondary namenodes [node02]
node02: starting secondarynamenode, logging to /opt/hadoop-2.6.5/logs/hadoop-root-secondarynamenode-node02.out
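A quick check that every role came up is to run jps on each node (the NameNode web UI should also be reachable at http://node01:50070):

# expected processes: node01 -> NameNode; node02 -> DataNode + SecondaryNameNode;
# node03 / node04 -> DataNode
jps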

For everything else, refer to the previous section.

# Create a test text file
for i in `seq 100000`;do  echo "hello sxt $i" >> test.txt;done
# Upload the file with a block size of 1 MB (1048576 bytes)
hdfs dfs -D dfs.blocksize=1048576 -put ./test.txt /user/root
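To confirm the upload and see how the 1 MB blocks were spread across the DataNodes:

hdfs dfs -ls /user/root
hdfs fsck /user/root/test.txt -files -blocks -locations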