12. Apache Hadoop pseudo-distributed setup - part2-H
Basic HDFS usage
The basic dfs commands are documented on the official site.
There are three commands for working with HDFS:
hadoop fs: the most general; it can operate on any file system.
hadoop dfs: HDFS only (deprecated)
hdfs dfs: HDFS only
Quoting the official docs:
hadoop fs <args>
FS relates to a generic file system, which can point to any file system: local, HDFS, etc. So it can be used when dealing with different file systems, such as the local FS, HFTP FS, S3 FS, and others.
hadoop fs can operate on the local file system:
hadoop fs -cat file:///app/zpy/hadoop/etc/hadoop/kms-site.xml
hadoop fs can also operate on HDFS:
hadoop fs -cat hdfs://host1:port1/file1 hdfs://host2:port2/file2
hadoop dfs <args>
dfs is specific to HDFS and works only for HDFS-related operations. It has been deprecated; use hdfs dfs instead.
Operates only on the HDFS file system.
hdfs dfs <args>
Same as the previous command: it works for all HDFS-related operations and is the recommended replacement for hadoop dfs.
[hadoop@bogon /]$ hadoop fs -mkdir /dailiang
[hadoop@bogon /]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2017-08-31 15:12 /dailiang
[hadoop@bogon /]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2017-08-31 15:12 /dailiang
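The -ls listings above follow a fixed column layout. A small pure-Python parsing sketch, assuming the Hadoop 2.x output format shown here (and that paths have no leading whitespace):

```python
def parse_ls_line(line):
    # Columns in Hadoop 2.x `hdfs dfs -ls` output: permissions, replication
    # factor ("-" for directories), owner, group, size in bytes,
    # modification date, modification time, path.
    perms, repl, owner, group, size, date, time, path = line.split(None, 7)
    return {
        "permissions": perms,
        "replication": None if repl == "-" else int(repl),
        "owner": owner,
        "group": group,
        "size": int(size),
        "modified": date + " " + time,
        "path": path,
        "is_dir": perms.startswith("d"),
    }

entry = parse_ls_line("drwxr-xr-x   - hadoop supergroup          0 2017-08-31 15:12 /dailiang")
print(entry["path"], entry["is_dir"], entry["size"])  # → /dailiang True 0
```

Because the path is the last field and the split is capped at 7, paths containing spaces survive intact.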
[hadoop@bogon /]$ hadoop fs -ls file:///app/zpy
Found 4 items
drwxr-xr-x - hadoop hadoop 161 2017-08-31 14:52 file:///app/zpy/hadoop
-rw-r--r-- 1 root root 195515434 2015-10-31 05:04 file:///app/zpy/hadoop-2.6.2.tar.gz
drwxr-xr-x - root root 25 2017-07-27 12:24 file:///app/zpy/java
drwxr-xr-x - root root 20 2017-07-27 12:24 file:///app/zpy/ntp
Upload a local file to an HDFS directory:
Usage: hdfs dfs -put <localsrc> ... <dst>
Copy single src, or multiple srcs from local file system to the destination file system.
hdfs dfs -put /etc/fstab /test/fstsb
/etc/fstab is the local source file.
/test/fstsb is the destination file on the HDFS file system.
lsr recursively lists a directory; it is deprecated in favor of ls -R:
hdfs dfs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x - hadoop supergroup 0 2017-04-17 14:44 /test
-rw-r--r-- 1 hadoop supergroup 805 2017-04-17 14:44 /test/fstsb
We created some files above; let's look at where the data actually lives on disk.
Under /data/hadoop/hdfs/dn/current there are many files, and at first glance they are hard to make sense of.
Let's do a small experiment:
cd /data/hadoop/hdfs/dn/current/BP-47003996-127.0.0.1-1492394837095/current/finalized/subdir0/subdir0
[hadoop@localhost subdir0]$ cat blk_1073741825
# /etc/fstab
# Created by anaconda on Mon Nov 7 10:03:45 2016
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=78ab5608-8689-4ba6-90e5-dfb1c97895f9 / ext4 defaults 1 1
UUID=5629de06-8894-4df3-a822-6a4d8a846d79 /boot ext4 defaults 1 2
UUID=ad837aa0-307a-4957-a27e-bfe34809dcc9 swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
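The blk_* files can also be located programmatically instead of cd-ing through the subdir tree. A pure-Python sketch; the directory layout is the one shown above, and the demo builds a throwaway look-alike tree rather than touching a real DataNode:

```python
import os
import tempfile

def find_block_files(data_dir):
    # Walk a DataNode data directory and collect the block data files
    # (blk_<id>), skipping the blk_<id>_<genstamp>.meta checksum companions.
    blocks = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.startswith("blk_") and not name.endswith(".meta"):
                blocks.append(os.path.join(root, name))
    return sorted(blocks)

# Demo on a temporary directory mimicking the finalized/subdir layout above.
with tempfile.TemporaryDirectory() as d:
    sub = os.path.join(d, "BP-demo", "current", "finalized", "subdir0", "subdir0")
    os.makedirs(sub)
    for name in ("blk_1073741825", "blk_1073741825_1001.meta"):
        open(os.path.join(sub, name), "w").close()
    print([os.path.basename(p) for p in find_block_files(d)])  # → ['blk_1073741825']
```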
We are running pseudo-distributed; on a real Hadoop cluster the data is split into blocks (and replicated across DataNodes), so you cannot rely on reading a single block file.
On a real cluster, view the file through HDFS instead:
hdfs dfs -cat /test/fstsb
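Our 805-byte fstab fits in one block either way. How a file size maps to a block count can be sketched as follows, assuming the Hadoop 2.x default dfs.blocksize of 128 MB (the actual value is per-cluster configuration):

```python
import math

def num_blocks(file_size, block_size=128 * 1024 * 1024):
    # An HDFS file occupies ceil(size / blocksize) blocks; the last block
    # may be partially filled and only consumes its actual size on disk.
    return math.ceil(file_size / block_size) if file_size else 0

print(num_blocks(805))          # the /test/fstsb file above → 1
print(num_blocks(195515434))    # the hadoop-2.6.2.tar.gz tarball listed earlier → 2
```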
Check the version:
[hadoop@localhost hadoop]$ hdfs version
Hadoop 2.6.2