Installing and Configuring a Single-Node Hadoop Setup

2018-04-01, by 尚学先生

# Purpose #

This document describes how to install and configure a single-node Hadoop setup and how to use it for simple operations with Hadoop MapReduce and HDFS.

# Prerequisites #

## Supported Platforms ##

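GNU/Linux is supported as both a development and a production platform. Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.

Win32 is supported only as a development platform. Distributed operation has not been well tested on Win32, so it is not recommended as a production platform.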

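## Required Software ##

Required on both Linux and Windows:

Java 1.6 or later, preferably from Sun.

ssh must be installed and sshd must be running, because the Hadoop scripts use them to manage remote Hadoop daemons.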

Additionally required on Windows:

Cygwin

OpenSSH

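# Preparing to Start the Hadoop Cluster #

Unpack the downloaded Hadoop distribution. In the unpacked directory, edit conf/hadoop-env.sh to set JAVA_HOME to the root of your Java installation. Then run bin/hadoop, which prints the usage information for the hadoop script.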
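One way to set JAVA_HOME is to append an export line to conf/hadoop-env.sh. The sketch below is only an illustration: `set_java_home` is a hypothetical helper name, and the JDK path in the usage comment is a placeholder that must be replaced with the root of your own Java installation.

```shell
# set_java_home FILE JDK_DIR (hypothetical helper):
# append an export line for JAVA_HOME to FILE so the Hadoop scripts can find Java.
set_java_home() {
  env_file="$1"
  jdk_dir="$2"
  printf 'export JAVA_HOME=%s\n' "$jdk_dir" >> "$env_file"
}

# Example usage, run from the unpacked Hadoop directory.
# The JDK path is an assumption; use the location of your own installation.
#   set_java_home conf/hadoop-env.sh /usr/lib/jvm/java-6-sun
#   bin/hadoop
```

Because the line is appended at the end of the file, it takes effect even if an earlier JAVA_HOME line in the file is commented out.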

You are now ready to start your Hadoop cluster in one of the three supported modes:

Local (Standalone) Mode

Pseudo-Distributed Mode

Fully-Distributed Mode

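# Standalone Mode #

This is Hadoop's default mode: it runs as a single Java process, which is very useful for debugging. The following example demonstrates standalone mode. It copies the configuration files into an input directory, runs the bundled grep example over them, and prints the result.

    $ mkdir input
    $ cp conf/*.xml input
    $ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
    $ cat output/*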
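To make it clearer what the grep example computes, here is a rough local equivalent built from standard Unix tools. It is a sketch, not the Hadoop implementation: `local_grep_count` is a hypothetical helper name, and the real job runs as MapReduce, so its exact output formatting may differ.

```shell
# local_grep_count DIR REGEX (hypothetical helper):
# print "count match" for every regex match found in the files of DIR,
# most frequent match first.
local_grep_count() {
  grep -ohE "$2" "$1"/* | sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Example usage, after the input directory has been filled:
#   local_grep_count input 'dfs[a-z.]+'
```

Comparing this output with `cat output/*` is a quick sanity check that the standalone job processed the same files.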

# Pseudo-Distributed Mode #

Hadoop can also run on a single node in pseudo-distributed mode, where each Hadoop daemon runs as a separate Java process.

## Configuration ##

conf/core-site.xml:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

conf/hdfs-site.xml:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

conf/mapred-site.xml:

    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
    </configuration>

## Setup Passphraseless ssh ##

Check whether you can ssh to localhost without a passphrase:

    $ ssh localhost

For example, the first connection asks you to confirm the host key:

    $ ssh localhost
    The authenticity of host '[localhost]:11201 ([::1]:11201)' can't be established.
    RSA key fingerprint is 01:05:83:c6:d3:a7:7a:92:c6:c0:0c:3e:55:60:85:b1.
    Are you sure you want to continue connecting (yes/no)?

If you cannot log in to localhost without a passphrase, run the following commands to set up a local key:

    $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Running ssh localhost again should now log in without asking for a passphrase:

    xxx@xxx:~/programs/hadoop-1.2.1$ ssh localhost
    Linux xxx 2.6.32-5-amd64 #1 SMP Fri Feb 15 15:39:52 UTC 2013 x86_64

    The programs included with the Debian GNU/Linux system are free software;
    the exact distribution terms for each program are described in the
    individual files in /usr/share/doc/*/copyright.

    Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
    permitted by applicable law.
    You have new mail.
    Last login: Mon Apr 21 14:10:46 2014 from localhost
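The key setup above can be made repeatable with a small script. The following is a sketch under stated assumptions: `authorize_key` and `can_ssh_without_prompt` are hypothetical helper names, not part of Hadoop or OpenSSH. Note also that recent OpenSSH releases disable DSA keys by default, so on a newer system an RSA key (`ssh-keygen -t rsa`) may be needed instead of the DSA key used in this article.

```shell
# authorize_key PUBKEY_FILE AUTH_FILE (hypothetical helper):
# add the public key to AUTH_FILE unless it is already listed, then tighten
# permissions, because sshd's StrictModes check can reject loosely protected files.
authorize_key() {
  pub="$1"
  auth="$2"
  mkdir -p "$(dirname "$auth")"
  touch "$auth"
  grep -qxF -e "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
  chmod 700 "$(dirname "$auth")"
  chmod 600 "$auth"
}

# can_ssh_without_prompt [HOST] (hypothetical helper):
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
can_ssh_without_prompt() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "${1:-localhost}" true
}

# Example usage:
#   authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
#   can_ssh_without_prompt && echo "passphraseless ssh works"
```

Unlike a plain `cat >>`, running `authorize_key` twice does not duplicate the key. In batch mode ssh also cannot answer the host key question, so accept the host key interactively once before relying on `can_ssh_without_prompt`.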

## Execution ##

Format a new distributed filesystem:

    $ bin/hadoop namenode -format

Start the Hadoop daemons:

    $ bin/start-all.sh

Hadoop is now running in pseudo-distributed mode.

The Hadoop daemon logs are written to the ${HADOOP_LOG_DIR} directory (default: ${HADOOP_HOME}/logs).

Hadoop also provides web interfaces, which by default are available at:

NameNode: http://localhost:50070/

JobTracker: http://localhost:50030/
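Besides the web interfaces, the JDK's `jps` tool lists running Java processes, which gives a quick way to confirm that the daemons came up. The sketch below assumes the five daemons that a Hadoop 1.x start-all.sh normally launches; `missing_daemons` is a hypothetical helper name.

```shell
# missing_daemons (hypothetical helper):
# read `jps` output on stdin and print each expected daemon that is absent.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # -w matches whole words only, so SecondaryNameNode does not count as NameNode.
    printf '%s\n' "$jps_output" | grep -qw "$daemon" || echo "$daemon"
  done
}

# Example usage:
#   jps | missing_daemons
```

If the command prints nothing, all five daemons are running. For any daemon it does print, the corresponding log file under the log directory is the first place to look.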

## Test Run ##

Copy the input files into the distributed filesystem:

    $ bin/hadoop fs -put conf input

Run some of the examples provided:

    $ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

Examine the output files:

Copy the output files from the distributed filesystem to the local filesystem and examine them:

    $ bin/hadoop fs -get output output
    $ cat output/*

Alternatively, view the output files directly on the distributed filesystem:

    $ bin/hadoop fs -cat output/*

When you're done, stop the daemons with:

    $ bin/stop-all.sh

# Fully-Distributed Mode #
