
Setting up a Hadoop environment on Alibaba Cloud with Cloudera Manager

2015-10-09  ff7e4f9cb3e3

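Prepare four machines and perform the same steps on each of them

Assume they are named cmf, hdp1, hdp2, and hdp3.

All systems run CentOS 6.5 x64

CDH version: 5.x

Change the hostname and hosts file on every machine

Taking cmf as an example: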

(1) vi /etc/sysconfig/network

Change this line to HOSTNAME=cmf.local  # use the corresponding name on each host

(2) vi /etc/hosts

127.0.0.1 localhost

::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

10.45.236.66    cmf.local      cmf

10.45.235.201  hdp1.local      hdp1

10.45.232.251  hdp2.local      hdp2

10.45.236.21  hdp3.local      hdp3

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback

fe00::0 ip6-localnet

ff00::0 ip6-mcastprefix

ff02::1 ip6-allnodes

ff02::2 ip6-allrouters
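Because the same host entries have to be written on all four machines, it may help to generate them from a single list. The following is a minimal sketch and not part of the original procedure: the function name render_hosts and its input format (one "IP shortname" pair per line) are assumptions made for illustration.

```shell
# Hypothetical helper: reads "IP SHORTNAME" pairs on stdin and prints
# /etc/hosts lines of the form "IP<TAB>SHORTNAME.DOMAIN<TAB>SHORTNAME".
render_hosts() {
  domain="$1"
  while read -r ip name; do
    # Skip blank lines in the input
    [ -n "$ip" ] || continue
    printf '%s\t%s.%s\t%s\n' "$ip" "$name" "$domain" "$name"
  done
}

# Example: print the four entries used in this article
render_hosts local <<'EOF'
10.45.236.66 cmf
10.45.235.201 hdp1
10.45.232.251 hdp2
10.45.236.21 hdp3
EOF
```

The output can be reviewed and then pasted into /etc/hosts on each machine.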

Set up time synchronization

chkconfig ntpd on

service ntpd restart

ntpdate -u ntp1.aliyun.com

Configure DNS

vi /etc/resolv.conf

options timeout:1 attempts:1 rotate

nameserver 223.5.5.5

nameserver 223.6.6.6

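Edit fstab and mount the data disk

Note: provision four data disks of the same size in advance

Run the following on every machine that needs a data disk mounted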

fdisk -l

fdisk /dev/vdb

Enter n, p, and 1 in turn

Press Enter twice

Press w to write the partition table and exit

mkfs.ext4 /dev/vdb1

echo '/dev/vdb1       /mnt/data    ext4    defaults    0 0' >>/etc/fstab

mkdir /mnt/data

mount -a

rm -rf /mnt/data/*

df -h
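Running the echo line above a second time would append a duplicate entry to /etc/fstab. A minimal sketch of an idempotent variant follows. The function name add_fstab_entry is hypothetical, and it takes the fstab path as an argument so that it can be tried on a copy of the file first.

```shell
# Hypothetical helper: append an fstab entry only if no existing line
# already starts with the same device field.
add_fstab_entry() {
  fstab="$1"
  entry="$2"
  # The device is the first whitespace-separated field of the entry
  device=$(printf '%s\n' "$entry" | awk '{ print $1 }')
  # awk exits 0 if some line has that device in column 1, 1 otherwise
  if awk -v d="$device" '$1 == d { found = 1 } END { exit !found }' "$fstab"; then
    return 0
  fi
  printf '%s\n' "$entry" >> "$fstab"
}

# Usage (as root), with the dump and fsck-pass fields set to 0:
#   add_fstab_entry /etc/fstab '/dev/vdb1 /mnt/data ext4 defaults 0 0'
```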

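Configure iptables

If iptables is enabled on the hosts, the hosts must be configured to trust one another, as follows:

Edit /etc/sysconfig/iptables with vi and add lines like the following: one line per host if the hosts have no private network, or a single line with the subnet if they do.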

# Accept packets from trusted IP addresses

-A INPUT -s xxx.xxx.xxx.xxx -j ACCEPT

Then restart the service: service iptables restart
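When there is no private network, one rule per host is needed. As a sketch, the lines can be generated from the list of addresses. The function name trust_rules is an assumption, and the addresses are the ones from the hosts file earlier in this article.

```shell
# Hypothetical helper: print one ACCEPT rule per source address or subnet.
trust_rules() {
  for src in "$@"; do
    printf '%s\n' "-A INPUT -s $src -j ACCEPT"
  done
}

# One rule per cluster host
trust_rules 10.45.236.66 10.45.235.201 10.45.232.251 10.45.236.21
```

The generated lines need to sit above any REJECT rule of the INPUT chain in /etc/sysconfig/iptables, because iptables evaluates rules in order. With a private network, a single call with the subnet in CIDR form (for example trust_rules 10.45.0.0/16, an illustrative value that is not taken from the article) would replace the per-host list.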

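When finished, reboot the machine

Things to check:

1. df -h shows that the data disk was mounted automatically

2. Make sure that /mnt/data is empty

3. Run hostname to confirm that the machine name has changed.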
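The second check can be scripted. A minimal sketch, with dir_is_empty as a hypothetical helper name:

```shell
# Succeeds only if the directory exists and contains no entries.
# ls -A also lists hidden files, so they count as content.
dir_is_empty() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

if dir_is_empty /mnt/data; then
  echo "/mnt/data is empty"
else
  echo "/mnt/data is missing or not empty"
fi
```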

Disable transparent hugepage defrag

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

echo 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag' >>/etc/rc.local
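The path above is specific to RHEL/CentOS 6 kernels. Other kernels expose the same setting under /sys/kernel/mm/transparent_hugepage instead. Below is a sketch that writes to whichever of the two paths exists. The function name disable_thp_defrag and its optional root-directory argument are assumptions, added so the logic can be exercised outside /sys.

```shell
# Hypothetical helper: write "never" to the defrag setting under either
# the RHEL 6 directory name or the mainline kernel directory name.
disable_thp_defrag() {
  root="${1:-/sys/kernel/mm}"
  for dir in "$root/redhat_transparent_hugepage" "$root/transparent_hugepage"; do
    # Only touch the file if it exists and is writable (requires root on /sys)
    if [ -w "$dir/defrag" ]; then
      echo never > "$dir/defrag"
    fi
  done
}

# Usage (as root):
#   disable_thp_defrag
```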

Download the installer on the cmf host

1. Download

wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin

2. Run

chmod 755 cloudera-manager-installer.bin

./cloudera-manager-installer.bin

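3. After installation finishes, open the web interface in a browser, usually at http://ip:7180

Log in with admin/admin to reach the installation wizard

Move the log directories

After installation, logs are stored under /var/log by default. Because the system disk on Alibaba Cloud hosts is fairly small, warnings will be raised. To clear them, point all log directories to locations under /mnt/data.

Steps:

Figure 1

Click the configuration alert, as shown in Figure 1.

Figure 2

Find the log directory free space alert and click into it, as shown in Figure 2.

Figure 3

Type "Log Directory" in the search box to find all related settings, change every directory under /var to a directory under /mnt/data, then click Save. See Figure 3.

Move the parcel directory

The agent's parcel directory will also report insufficient space. To fix it:

As in the figure above, find the parcel directory setting and change it to a directory under /mnt/data.

Then SSH into every agent machine and run the following commands to migrate the parcel directory to /mnt/data: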

mkdir -p /mnt/data/cloudera/parcels

cp -a /opt/cloudera/parcels/* /mnt/data/cloudera/parcels/
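A sketch of the same migration as a reusable function follows. It assumes GNU cp, and copy_tree is a hypothetical name. The -a flag preserves ownership, permissions, and any symlinks inside the directory, and copying "$src"/. also picks up hidden files.

```shell
# Hypothetical helper: copy the contents of one directory tree into another,
# creating the destination if needed.
copy_tree() {
  src="$1"
  dst="$2"
  if [ ! -d "$src" ]; then
    echo "source directory $src not found" >&2
    return 1
  fi
  mkdir -p "$dst"
  cp -a "$src"/. "$dst"/
}

# Usage on an agent host (as root):
#   copy_tree /opt/cloudera/parcels /mnt/data/cloudera/parcels
```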

Troubleshooting

1. Be sure to choose to install the JDK

and set up Java correctly

rm -f /usr/java/latest

ln -sf /usr/java/jdk1.7.0_67-cloudera /usr/java/latest

export JAVA_HOME=/usr/java/latest

export PATH=$PATH:$JAVA_HOME/bin

echo 'export JAVA_HOME=/usr/java/latest' >>/etc/profile.d/java.sh

echo 'export PATH=$PATH:$JAVA_HOME/bin' >>/etc/profile.d/java.sh
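If lines like these are written with an unquoted echo, the shell expands $PATH and $JAVA_HOME at write time, so the profile script ends up with the current values hard-coded. A quoted here-document avoids that. This is a minimal sketch: write_java_profile is a hypothetical helper, and the target path is passed in so it can be tried on a scratch file.

```shell
# Hypothetical helper: write the Java profile script with the variable
# references kept literal, so they are expanded at login time.
write_java_profile() {
  target="$1"
  # The quoted 'EOF' delimiter disables expansion inside the here-document
  cat > "$target" <<'EOF'
export JAVA_HOME=/usr/java/latest
export PATH=$PATH:$JAVA_HOME/bin
EOF
}

# Usage (as root):
#   write_java_profile /etc/profile.d/java.sh
```

Unlike the appending echo commands, this overwrites the file, so running it twice does not duplicate the lines.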

2. If HDFS deployment fails while installing services, try the following command

Run on all hosts:

chown -R hdfs:hdfs /mnt/data/dfs

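3. If the HDFS NFS service fails to start

If the error says that connecting to port 111 failed, the rpcbind service may not be running. On the affected NFS host, run: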

service rpcbind start

4. Reinstalling after an error

On the cmf server, remove the management server:

service cloudera-scm-server stop

service cloudera-scm-server-db stop

yum remove cloudera-manager-*

yum remove cloudera-manager-server-db

yum clean all

/usr/share/cmf/uninstall-cloudera-manager.sh

service cloudera-scm-agent hard_stop_confirmed

rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /etc/cloudera-scm-* /var/lib/cloudera-scm-server-db

yum remove 'cloudera-manager-*' hadoop hue-common 'bigtop-*'

yum clean all

rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /mnt/data/*

On every machine where the agent is installed:

service cloudera-scm-agent hard_stop_confirmed

yum remove 'cloudera-manager-*' hadoop hue-common 'bigtop-*'

yum clean all

rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /mnt/data/*

5. If you accidentally close the page

Installation wizard start page: http://cmf.jinchongzi.com:7180/cmf/express-wizard/welcome

6. Clock not synchronized warning

Check the output of ntpdc -c loopinfo. If it is:

Name or service not known

then the /etc/hosts file is probably wrong and is missing the line 127.0.0.1 localhost. Check the hosts file carefully and correct it to resolve the problem.
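The hosts file check can be automated. A minimal sketch, where has_localhost_entry is a hypothetical helper name:

```shell
# Succeeds if some non-comment line maps 127.0.0.1 to the name "localhost".
has_localhost_entry() {
  awk '$1 == "127.0.0.1" {
         for (i = 2; i <= NF; i++) if ($i == "localhost") found = 1
       }
       END { exit !found }' "${1:-/etc/hosts}"
}

if has_localhost_entry /etc/hosts; then
  echo "localhost entry present"
else
  echo "localhost entry missing: add '127.0.0.1 localhost' to /etc/hosts"
fi
```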

7. After a reinstall, HBase fails to start and reports a master startup error

First check that Java is set up correctly; see item 1 above

(1)

service hbase-regionserver stop

service hbase-master stop

(2)

hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair

(3) Delete the data in ZooKeeper

cd /opt/cloudera/parcels/CDH/lib/zookeeper/bin/

./zkCli.sh

This opens the ZooKeeper shell

Run:

ls /

rmr /hbase

(4) Start HBase

service hbase-master restart

service hbase-regionserver restart

8. Installing a newer version of Cloudera Manager fails because an RPM package cannot be found

Error message:

Error Downloading Packages:

cloudera-manager-daemons-5.4.0-1.cm540.p0.165.el6.x86_64: failure: RPMS/x86_64/cloudera-manager-daemons-5.4.0-1.cm540.p0.165.el6.x86_64.rpm from cloudera-manager: [Errno 256]

No more mirrors to try.

Solution:

rm -r /etc/yum.repos.d/cloudera-manager.repo.*

vi /etc/yum.repos.d/cloudera-manager.repo

Paste in the following:

[cloudera-manager]

name = Cloudera Manager, Version 5.5.1

baseurl = http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.5.1/

gpgkey = http://archive.cloudera.com/redhat/cdh/RPM-GPG-KEY-cloudera

gpgcheck = 1

Then run the installer again.

9. Parcel installation is sometimes very slow, stalls during download, or fails outright

Open a new window, go to http://xxxx:7180/cmf/parcel/status, cancel the download of the affected parcel (click Cancel on the right), then click Download again.

10. Acquiring Installation lock

rm /tmp/.scm_prepare_node.lock

11. Rebalancing HDFS after adding machines so that files are evenly distributed

Running the Balancer

Go to the HDFS service.

Select Actions > Rebalance.

Click Rebalance that appears in the next screen to confirm. If you see a Finished status, the Balancer ran successfully.

Configuring the Balancer Threshold

The Balancer has a default threshold of 10%, which ensures that disk usage on each DataNode differs from the overall usage in the cluster by no more than 10%. For example, if overall usage across all the DataNodes in the cluster is 40% of the cluster's total disk-storage capacity, the script ensures that DataNode disk usage is between 30% and 50% of the DataNode disk-storage capacity. To change the threshold:

Go to the HDFS service.

Click the Configuration tab.

Expand the Balancer Default Group category.

Set the Rebalancing Threshold property.

Click Save Changes to commit the changes.
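The arithmetic in the threshold example above can be checked with a few lines of shell. This only illustrates the rule as described and is not how the Balancer itself is implemented. The name balancer_range is hypothetical, and the function works in whole percentage points.

```shell
# Hypothetical helper: given overall cluster usage and a threshold (both in
# percent), print the lowest and highest DataNode usage considered balanced.
balancer_range() {
  usage="$1"
  threshold="${2:-10}"
  low=$((usage - threshold))
  high=$((usage + threshold))
  # Clamp to the valid percentage range
  if [ "$low" -lt 0 ]; then low=0; fi
  if [ "$high" -gt 100 ]; then high=100; fi
  echo "$low $high"
}

balancer_range 40 10   # prints "30 50"
```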
