Hadoop Study Notes 01

2020-05-11  Demons_LLL

Hadoop study notes - HDFS - pseudo-distributed mode

Preparation

Download the required packages && configure the JDK environment

Download the Hadoop package

Extract the package && configure the environment

tar -zxvf hadoop-2.10.0.tar.gz -C /opt/module/
vim /etc/profile
Append the following Hadoop environment configuration:
##JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_231
export PATH=$PATH:$JAVA_HOME/bin

##HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.10.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME

export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

source /etc/profile

Check the Hadoop version
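A quick sanity check after sourcing the profile (the exact banner depends on your build):

hadoop version     # should report Hadoop 2.10.0 if PATH is set correctly
which hadoop       # should resolve to /opt/module/hadoop-2.10.0/bin/hadoop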

In fact, the official documentation explains the configuration of most Hadoop files; see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html

Configuration Files

Hadoop has three modes:

- Local (Standalone) Mode
- Pseudo-Distributed Mode
- Fully-Distributed Mode

Here we use Pseudo-Distributed Mode.

Pseudo-Distributed Operation

Configuration
<configuration>
    <property>
         <name>fs.defaultFS</name>
         <value>hdfs://hadoop03:9000</value> 
         <description>The HDFS URI: filesystem://namenode-host:port</description>
     </property>
     <property>
         <name>hadoop.tmp.dir</name>
         <value>/opt/hadoop</value>
         <description>Base directory for Hadoop temporary files; HDFS metadata and block data are stored under it by default</description>
     </property>
</configuration>
core-site.xml
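With fs.defaultFS set, hdfs://hadoop03:9000 becomes the default filesystem, so plain paths passed to hdfs dfs resolve against it. A small illustration (once HDFS is running, see below; the hostname is the one configured above):

hdfs dfs -ls /                       # equivalent to the fully qualified form below
hdfs dfs -ls hdfs://hadoop03:9000/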
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Number of block replicas; the default is 3, and it should not exceed the number of DataNodes</description>
    </property>
</configuration>
hdfs-site.xml
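Both files live under $HADOOP_HOME/etc/hadoop/. Optionally, hdfs getconf can confirm the values Hadoop actually picks up (assuming the values configured above):

hdfs getconf -confKey fs.defaultFS      # expect hdfs://hadoop03:9000
hdfs getconf -confKey dfs.replication   # expect 1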
Set up passwordless SSH
# Generate a key pair: -t sets the key type, -P the passphrase (empty here), -f the file to store the key
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

# Append the generated public key to the authorized_keys file
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# Restrict the file permissions
chmod 0600 ~/.ssh/authorized_keys

# After the steps above, you can log in to the local machine without a password
ssh localhost

# For passwordless login to a remote machine, copy the local public key to it (username@ip or username@hostname)
ssh-copy-id -i ~/.ssh/id_rsa.pub root@hadoop01
# Verify the passwordless login
ssh root@hadoop01
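A stricter optional check: with BatchMode, ssh fails instead of falling back to a password prompt, so a broken key setup shows up immediately:

ssh -o BatchMode=yes localhost exit && echo "passwordless SSH to localhost OK"
ssh -o BatchMode=yes root@hadoop01 exit && echo "passwordless SSH to hadoop01 OK"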
Execution
# Format the filesystem (only needed on first setup)
[root@hadoop03 hadoop-2.10.0]# ./bin/hdfs namenode -format
# Start the NameNode and DataNode daemons
[root@hadoop03 hadoop-2.10.0]# ./sbin/start-dfs.sh

The Hadoop daemon logs are written to the $HADOOP_LOG_DIR directory (by default $HADOOP_HOME/logs).
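If a daemon does not come up, the .log files there are the first place to look; the file names follow the user-daemon-hostname pattern visible in the startup output below (the example assumes the hostnames used in this post):

ls $HADOOP_HOME/logs/
tail -n 50 $HADOOP_HOME/logs/hadoop-root-namenode-hadoop03.log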

[root@hadoop03 hadoop-2.10.0]# ./sbin/start-dfs.sh 
20/05/10 18:24:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop03]
hadoop03: starting namenode, logging to /opt/module/hadoop-2.10.0/logs/hadoop-root-namenode-hadoop03.out
localhost: starting datanode, logging to /opt/module/hadoop-2.10.0/logs/hadoop-root-datanode-hadoop03.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/module/hadoop-2.10.0/logs/hadoop-root-secondarynamenode-hadoop03.out
20/05/10 18:24:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@hadoop03 hadoop-2.10.0]# jps
6032 DataNode
5905 NameNode
2804 QuorumPeerMain
6199 SecondaryNameNode
6317 Jps
[root@hadoop03 hadoop-2.10.0]# 
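Once the daemons are up, the NameNode web interface is also available; in Hadoop 2.x it listens on port 50070 by default (replace the hostname with your own):

# In a browser: http://hadoop03:50070/
curl -s http://hadoop03:50070/ | head -n 5    # quick reachability check from the shell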
# Create the HDFS directories required to run MapReduce jobs
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
# Copy the input files into HDFS
$ bin/hdfs dfs -mkdir input
$ bin/hdfs dfs -put etc/hadoop/*.xml input
# Run the provided grep example
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.10.0.jar grep input output 'dfs[a-z.]+'
# Copy the output files from HDFS to the local filesystem and examine them
$ bin/hdfs dfs -get output output
$ cat output/*
# or view them directly on HDFS
$ bin/hdfs dfs -cat output/*
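Before stopping, you can also confirm the job output directly in HDFS; a successful MapReduce run leaves an _SUCCESS marker next to the part-r-* result files:

$ bin/hdfs dfs -ls output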
# When finished, stop the daemons
$ sbin/stop-dfs.sh
YARN on a Single Node

You can run MapReduce jobs on YARN in pseudo-distributed mode by adding the configuration below and starting the ResourceManager and NodeManager daemons.
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop03:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop03:19888</value>
    </property>
</configuration>
mapred-site.xml
export JAVA_HOME=/opt/module/jdk1.8.0_231
mapred-env.sh
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
</configuration>
yarn-site.xml
export JAVA_HOME=/opt/module/jdk1.8.0_231
yarn-env.sh
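The two log-aggregation properties above make the container logs of finished applications retrievable through the YARN CLI once the cluster is running; a minimal sketch (the application id is a placeholder, taken from the ResourceManager UI or from yarn application -list):

$ bin/yarn logs -applicationId <application_id>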
# Start the ResourceManager and NodeManager daemons
$ sbin/start-yarn.sh
# Stop them when you are done
$ sbin/stop-yarn.sh
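After start-yarn.sh, jps should additionally list a ResourceManager and a NodeManager, and the ResourceManager web UI is on port 8088 by default. The job history server configured in mapred-site.xml above has to be started separately (a sketch, assuming the standard Hadoop 2.x scripts and the hostnames configured above):

$ jps                                                 # expect ResourceManager and NodeManager in the list
$ sbin/mr-jobhistory-daemon.sh start historyserver    # serves hadoop03:10020, web UI at http://hadoop03:19888/
# ResourceManager web UI: http://hadoop01:8088/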

A few final reminders

  1. Read the official documentation often
  2. When starting or stopping daemons, use jps to check what is actually running