20. Hadoop: HttpFS installation and basic usage
Main content of this section:
HttpFS installation and basic usage
HttpFS is an HTTP gateway for Hadoop HDFS originally contributed by Cloudera; through its WebHDFS REST API you can read from and write to HDFS over plain HTTP.
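All requests in this section follow the same URL pattern. As a rough sketch (the host and port below are just this cluster's values; 14000 is the HttpFS default port):
http://<httpfs-host>:14000/webhdfs/v1/<hdfs-path>?op=<OPERATION>&user.name=<user>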
1. System environment:
OS: CentOS Linux release 7.5.1804 (Core)
CPU: 2 cores
Memory: 1 GB
Run-as user: root
JDK version: 1.8.0_252
Hadoop version: CDH 5.16.2
2. Cluster node role assignments:
172.26.37.245 node1.hadoop.com---->namenode,zookeeper,journalnode,hadoop-hdfs-zkfc,resourcemanager,historyserver,hbase,hbase-master,hive,hive-metastore,hive-server2,hive-hbase,sqoop,impala,impala-server,impala-state-store,impala-catalog,pig,spark-core,spark-master,spark-worker,spark-python
172.26.37.246 node2.hadoop.com---->datanode,zookeeper,journalnode,nodemanager,hadoop-client,mapreduce,hbase-regionserver,impala,impala-server,hive,spark-core,spark-worker,spark-history-server,spark-python
172.26.37.247 node3.hadoop.com---->datanode,nodemanager,hadoop-client,mapreduce,hive,mysql-server,impala,impala-server,hadoop-httpfs
172.26.37.248 node4.hadoop.com---->namenode,zookeeper,journalnode,hadoop-hdfs-zkfc,hive,hive-server2,impala-shell
3. Environment notes:
In this section we additionally deploy:
172.26.37.247 node3.hadoop.com---->hadoop-httpfs
I. Installation
On node3 (HttpFS can be installed on any host that can access HDFS):
# yum install -y hadoop-httpfs
二.配置
Node1、Node4节点core-site.xml配置添加如下
# cp -p /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/core-site.xml.20200507
# vi /etc/hadoop/conf/core-site.xml
Add the following:
<property>
  <name>hadoop.proxyuser.httpfs.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.httpfs.groups</name>
  <value>*</value>
</property>
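The wildcard * lets the httpfs user impersonate any user from any host, which is convenient for a test cluster. As a hedged sketch only (the host and group values below are illustrative, not taken from this cluster), the same properties can be tightened:
<property>
  <name>hadoop.proxyuser.httpfs.hosts</name>
  <value>node3.hadoop.com</value>
</property>
<property>
  <name>hadoop.proxyuser.httpfs.groups</name>
  <value>supergroup</value>
</property>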
After the configuration change, reboot the hosts (restarting the HDFS NameNode service is also enough for the new proxyuser settings to take effect):
# reboot
Start the service on node3:
# service hadoop-httpfs start
# service hadoop-httpfs status
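Before moving on, it can be worth confirming that HttpFS is actually listening; a quick check (14000 is the default HttpFS port, and GETHOMEDIRECTORY is a standard WebHDFS operation):
# ss -lntp | grep 14000
# curl "http://172.26.37.247:14000/webhdfs/v1?op=GETHOMEDIRECTORY&user.name=httpfs"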
III. Testing
Access from a browser:
http://172.26.37.247:14000/webhdfs/v1?op=LISTSTATUS&user.name=httpfs
Note: /webhdfs/v1 is the HttpFS root path; it maps to hdfs://cluster1, as can be seen in the HttpFS logs.
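The same listing can be fetched from the command line; a minimal example against the /user directory (the response comes back as a JSON FileStatuses object):
# curl "http://172.26.37.247:14000/webhdfs/v1/user?op=LISTSTATUS&user.name=httpfs"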
Create a directory through the API (run on node3):
# curl -i -X PUT "http://172.26.37.247:14000/webhdfs/v1/user/abc?op=MKDIRS&user.name=httpfs"
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Notes:
op=MKDIRS invokes the mkdirs operation to create the directory
/user/abc is the HDFS path the MKDIRS operation is applied to
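MKDIRS also accepts an optional permission parameter; a sketch (the path /user/abc2 and the mode 755 are just illustrative values):
# curl -i -X PUT "http://172.26.37.247:14000/webhdfs/v1/user/abc2?op=MKDIRS&permission=755&user.name=httpfs"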
Check in Pig:
# sudo -u hdfs pig
grunt> cd /user
grunt> ls
hdfs://cluster1/user/abc <dir>
hdfs://cluster1/user/cloudera <dir>
hdfs://cluster1/user/hdfs <dir>
hdfs://cluster1/user/history <dir>
hdfs://cluster1/user/hive <dir>
hdfs://cluster1/user/pig <dir>
hdfs://cluster1/user/root <dir>
hdfs://cluster1/user/spark <dir>
grunt>
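If Pig is not handy, the plain HDFS CLI shows the same listing:
# sudo -u hdfs hdfs dfs -ls /user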
Viewing a file
Create a test file (written to /test.txt so it matches the path used in the put command below):
# echo "test" > /test.txt
Upload it to HDFS:
# sudo -u hdfs hdfs dfs -put /test.txt /user/abc/test.txt
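The upload can also be done through HttpFS itself; a sketch using the CREATE operation (the target name test2.txt is illustrative; with HttpFS the file body can be sent in a single request by adding data=true and an octet-stream Content-Type):
# curl -i -T /test.txt -H "Content-Type: application/octet-stream" \
  "http://172.26.37.247:14000/webhdfs/v1/user/abc/test2.txt?op=CREATE&data=true&user.name=httpfs"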
Check the file in Pig:
# sudo -u hdfs pig
grunt> ls /user/abc/
hdfs://cluster1/user/abc/test.txt<r 3> 5
View it through HttpFS:
# curl -i -X GET "http://172.26.37.247:14000/webhdfs/v1/user/abc/test.txt?op=OPEN&user.name=httpfs"
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
test
Notes:
op=OPEN invokes the open (read) operation
/user/abc/test.txt is the file being read
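To clean up, the test directory can be removed through the same API; DELETE is another standard WebHDFS operation (recursive=true removes the directory together with its contents):
# curl -i -X DELETE "http://172.26.37.247:14000/webhdfs/v1/user/abc?op=DELETE&recursive=true&user.name=httpfs"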