Big Data: Installing Hadoop (macOS Mojave)

2019-01-05  etrols

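This tutorial uses the CDH distribution to avoid errors caused by version dependency conflicts. It also applies to Linux (CentOS recommended).
Hadoop is set up in pseudo-distributed mode.

Hadoop run modes

Local (standalone) mode

By default Hadoop runs in non-distributed (local) mode, which needs no configuration. Everything runs in a single Java process, which makes debugging easy.

Pseudo-distributed mode

Hadoop can run on a single node in pseudo-distributed mode, where each Hadoop daemon runs in its own Java process. The node acts as both NameNode and DataNode, and files are read from HDFS.

Fully distributed mode

Multiple nodes form a cluster that runs Hadoop.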

Downloading the CDH release of Hadoop

Download URL: https://archive.cloudera.com/cdh5/cdh/5/
Version: hadoop-2.6.0-cdh5.9.3.tar.gz

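Preparing the environment

Passwordless SSH login (optional, but if you skip it Hadoop asks for a password every time it starts)

Run the following in a terminal:

zhangzhaodeMacBook-Pro:~ zhangzhao$ ssh-keygen -t rsa -P ""   # press Enter at each prompt
zhangzhaodeMacBook-Pro:~ zhangzhao$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Verify passwordless login (on macOS, Remote Login must be enabled under System Preferences > Sharing for ssh localhost to work):

zhangzhaodeMacBook-Pro:~ zhangzhao$ ssh localhost
Last login: Fri Jan  4 13:45:54 2019

Seeing a login banner like this without a password prompt means passwordless login works.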
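Running the cat command above more than once appends the same key again each time. A minimal sketch of an idempotent variant is shown here; the function name append_key_once is made up for illustration, and the chmod reflects the usual sshd requirement that authorized_keys not be writable by other users.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present.
append_key_once() {
    pub_key_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x matches whole lines, -F treats the key as a fixed string
    if ! grep -qxF -- "$(cat "$pub_key_file")" "$auth_file"; then
        cat "$pub_key_file" >> "$auth_file"
    fi
    # Restrict the file to the owner
    chmod 600 "$auth_file"
}

# Example usage (not run here):
#   append_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```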

Installing the JDK

JDK version:
        macOS: jdk-8u192-macosx-x64.dmg
        Linux: jdk-8u192-linux-x64.tar.gz
On macOS, double-click the installer; on Linux, just extract the archive.

JDK environment variables:

macOS:

Open .bash_profile in your home directory (~):

zhangzhaodeMacBook-Pro:~ zhangzhao$ vim .bash_profile

Add the following:

JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME
export CLASSPATH
export PATH

Finally, reload the file so the variables take effect:

zhangzhaodeMacBook-Pro:~ zhangzhao$ source .bash_profile

Verify the JDK:

zhangzhaodeMacBook-Pro:~ zhangzhao$ java -version
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
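If java -version reports a different JDK than expected, it can help to confirm that JAVA_HOME really points at a JDK. The following is a minimal sketch; the function name check_java_home is made up for illustration.

```shell
# Hypothetical helper: succeeds only if the directory looks like a JDK home,
# i.e. it contains executable bin/java and bin/javac.
check_java_home() {
    java_home=$1
    [ -x "$java_home/bin/java" ] && [ -x "$java_home/bin/javac" ]
}

# Example usage (not run here):
#   check_java_home "$JAVA_HOME" && echo "JAVA_HOME looks fine"
# On macOS, /usr/libexec/java_home -v 1.8 prints the home of an installed JDK 8.
```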

Linux (skip this if a default OpenJDK is already installed):

Open .bash_profile in your home directory (~):

vim .bash_profile

Add the following:

JAVA_HOME=/usr/lib/jdk1.8.0_192
CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
PATH=$JAVA_HOME/bin:$HOME/bin:$HOME/.local/bin:$PATH
export JAVA_HOME CLASSPATH PATH

Finally, reload the file so the variables take effect:

source .bash_profile

Verify the JDK:

java -version
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)

Downloading Hadoop

Use wget, or download the archive manually.
Here it is downloaded to /Users/zhangzhao/develop/hadoop.

zhangzhaodeMacBook-Pro:hadoop zhangzhao$ wget https://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.9.3.tar.gz

macOS does not ship with wget; install it with Homebrew (Linux users can skip this):

zhangzhaodeMacBook-Pro:~ zhangzhao$ brew install wget

To install Homebrew itself (Linux users can skip this):

zhangzhaodeMacBook-Pro:~ zhangzhao$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

See the Homebrew website (https://brew.sh) for usage.

Extracting Hadoop

zhangzhaodeMacBook-Pro:hadoop zhangzhao$ tar -zxvf hadoop-2.6.0-cdh5.9.3.tar.gz
zhangzhaodeMacBook-Pro:hadoop zhangzhao$ ls
hadoop-2.6.0-cdh5.9.3
hadoop-2.6.0-cdh5.9.3.tar.gz

Hadoop directory layout

zhangzhaodeMacBook-Pro:hadoop zhangzhao$ cd hadoop-2.6.0-cdh5.9.3/
zhangzhaodeMacBook-Pro:hadoop-2.6.0-cdh5.9.3 zhangzhao$ ls
LICENSE.txt        cloudera                     lib
NOTICE.txt         etc                          libexec
README.txt         examples                     sbin
bin                examples-mapreduce1          share
bin-mapreduce1     include                      src

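bin: basic management and usage scripts. The management scripts under sbin are built on these, and users can run them directly to manage and use Hadoop.
etc: configuration files, including core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml. Files ending in .template are templates.
lib: Hadoop's native libraries (used for compressing and decompressing data).
sbin: scripts that start or stop the Hadoop cluster services.
share: Hadoop's dependency jars, documentation and official examples.
libexec: shell configuration files for the individual services, used to set basics such as log output directories and startup parameters (for example JVM options).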

Configuring Hadoop's core configuration files

Configuration directory: ~/develop/hadoop/hadoop-2.6.0-cdh5.9.3/etc/hadoop

hadoop-env.sh

Open the file with vim hadoop-env.sh and set the JDK installation path:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/

core-site.xml

Open the file with vim core-site.xml and add the following properties inside the <configuration> element:

<!-- HDFS NameNode address and port -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
</property>
<!-- Hadoop temporary data directory -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/zhangzhao/develop/tmp</value>
</property>

hdfs-site.xml

Open the file with vim hdfs-site.xml and add the following configuration:

<configuration>
    <!-- Number of replicas for each HDFS block -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Where the NameNode stores the fsimage -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/Users/zhangzhao/develop/tmp/dfs/name</value>
    </property>
    <!-- Where the DataNode stores data blocks -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/Users/zhangzhao/develop/tmp/dfs/data</value>
    </property>
</configuration>

Hadoop environment variables

vim ~/.bash_profile

Add the following:

# added by Hadoop installer
export HADOOP_HOME=/Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

Reload the file so the settings take effect:

source ~/.bash_profile
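Before formatting HDFS it is worth checking that the Hadoop commands are on PATH and that the configuration files are being picked up. The sketch below uses hadoop version and hdfs getconf -confKey, both standard Hadoop commands; the wrapper function show_conf_key is made up for illustration.

```shell
# Hypothetical helper: print one configuration value as Hadoop resolves it.
# Returns 1 when the hdfs command is not on PATH.
show_conf_key() {
    key=$1
    if command -v hdfs >/dev/null 2>&1; then
        hdfs getconf -confKey "$key"
    else
        echo "hdfs not found on PATH; check HADOOP_HOME and PATH" >&2
        return 1
    fi
}

# Example usage (not run here):
#   hadoop version
#   show_conf_key fs.defaultFS      # should match the value in core-site.xml
#   show_conf_key dfs.replication   # should match the value in hdfs-site.xml
```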

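Formatting, starting and stopping HDFS

Formatting HDFS

Note: run this step only once, during initial setup. Formatting again wipes all data stored in HDFS.

zhangzhaodeMacBook-Pro:bin zhangzhao$ hdfs namenode -format

Log output like the following indicates that formatting succeeded:

[Figure: HDFS format log]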
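Because formatting a second time wipes HDFS, a small guard can help avoid doing it by accident. This is a minimal sketch under one assumption: a formatted NameNode directory (dfs.namenode.name.dir) contains a current/VERSION file. The function name is_formatted is made up for illustration.

```shell
# Hypothetical helper: succeeds if the NameNode directory was already formatted.
# Assumption: a formatted dfs.namenode.name.dir contains current/VERSION.
is_formatted() {
    name_dir=$1
    [ -f "$name_dir/current/VERSION" ]
}

# Example usage (not run here):
#   NAME_DIR=/Users/zhangzhao/develop/tmp/dfs/name
#   if is_formatted "$NAME_DIR"; then
#       echo "already formatted, skipping"
#   else
#       hdfs namenode -format
#   fi
```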

Starting HDFS

zhangzhaodeMacBook-Pro:hadoop-2.6.0-cdh5.9.3 zhangzhao$ cd sbin/
zhangzhaodeMacBook-Pro:sbin zhangzhao$ start-dfs.sh 
19/01/05 12:43:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/hadoop-zhangzhao-namenode-zhangzhaodeMacBook-Pro.local.out
localhost: starting datanode, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/hadoop-zhangzhao-datanode-zhangzhaodeMacBook-Pro.local.out
Starting secondary namenodes [account.jetbrains.com]
account.jetbrains.com: starting secondarynamenode, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/hadoop-zhangzhao-secondarynamenode-zhangzhaodeMacBook-Pro.local.out
19/01/05 12:44:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Verifying that HDFS started

zhangzhaodeMacBook-Pro:sbin zhangzhao$ jps
87715 NameNode
87781 DataNode
87871 SecondaryNameNode
87950 Jps

If NameNode, DataNode and SecondaryNameNode all appear, HDFS started successfully.
HDFS web UI: http://localhost:50070

[Figure: HDFS web UI]
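The jps check can also be scripted. The sketch below takes jps output and a list of expected daemon names and prints the ones that are missing; the function name missing_daemons is made up for illustration. It compares the whole name column, so NameNode is not confused with SecondaryNameNode.

```shell
# Hypothetical helper: given jps output and expected daemon names,
# print the names that do not appear (empty line if all are running).
missing_daemons() {
    jps_output=$1
    shift
    missing=""
    for daemon in "$@"; do
        # jps prints "<pid> <name>"; compare the name as the whole second field
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done
    # Strip the leading space before printing
    printf '%s\n' "${missing# }"
}

# Example usage (not run here):
#   missing_daemons "$(jps)" NameNode DataNode SecondaryNameNode
```

The same function works for the full cluster by adding ResourceManager and NodeManager to the list of expected names.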

Stopping HDFS

zhangzhaodeMacBook-Pro:sbin zhangzhao$ stop-dfs.sh 
19/01/05 12:47:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [account.jetbrains.com]
account.jetbrains.com: stopping secondarynamenode
19/01/05 12:48:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
zhangzhaodeMacBook-Pro:sbin zhangzhao$ jps
88263 Jps

Starting the Hadoop cluster

zhangzhaodeMacBook-Pro:sbin zhangzhao$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
19/01/05 13:13:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: namenode running as process 88426. Stop it first.
localhost: datanode running as process 88500. Stop it first.
Starting secondary namenodes [account.jetbrains.com]
account.jetbrains.com: secondarynamenode running as process 88592. Stop it first.
19/01/05 13:13:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/yarn-zhangzhao-resourcemanager-zhangzhaodeMacBook-Pro.local.out
localhost: starting nodemanager, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/yarn-zhangzhao-nodemanager-zhangzhaodeMacBook-Pro.local.out
zhangzhaodeMacBook-Pro:sbin zhangzhao$ jps
88592 SecondaryNameNode
88500 DataNode
89591 NodeManager
88426 NameNode
89519 ResourceManager
89615 Jps

If jps lists the five services above (besides Jps itself), the cluster is running normally.

Stopping the Hadoop cluster

Use stop-all.sh to stop the cluster (like start-all.sh it is deprecated; stop-yarn.sh and stop-dfs.sh are the replacements):

zhangzhaodeMacBook-Pro:sbin zhangzhao$ stop-all.sh

Running start-all.sh again does not stop anything. While the daemons are up, it only reports that each one is already running:

zhangzhaodeMacBook-Pro:sbin zhangzhao$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
19/01/05 13:15:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: namenode running as process 88426. Stop it first.
localhost: datanode running as process 88500. Stop it first.
Starting secondary namenodes [account.jetbrains.com]
account.jetbrains.com: secondarynamenode running as process 88592. Stop it first.
19/01/05 13:15:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
resourcemanager running as process 89519. Stop it first.
localhost: nodemanager running as process 89591. Stop it first.