HDFS

2018-01-16  BlackChen

HDFS

Hadoop Distributed File System

An easily scalable distributed file system
Runs on large numbers of ordinary, inexpensive machines and provides fault tolerance

Advantages

Disadvantages

HDFS basic architecture and principles

HDFS architecture diagram

HDFS core concepts

Active NameNode

Standby NameNode

NameNode metadata files

DataNode

Block (data block)

Client

Why HDFS is not suited to storing small files
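
One commonly cited reason is NameNode memory: the NameNode keeps an in-memory object for every file and every block, so many small files cost far more metadata than the same volume of data held in a few large files. The sketch below is back-of-the-envelope arithmetic only. The figure of roughly 150 bytes per object is a widely quoted rule of thumb rather than a measured value, the 128 MB block size is an assumed default, and `estimate_nn_bytes` is a hypothetical helper name.

```shell
# Rough NameNode heap estimate: one object per file plus one per block.
# 150 bytes per object is an assumed rule of thumb, not an exact figure.
estimate_nn_bytes() {
  local files=$1 blocks_per_file=$2 bytes_per_object=${3:-150}
  echo $(( files * (1 + blocks_per_file) * bytes_per_object ))
}

# About 10 TB stored as 10 million files of 1 MB (one block each)
estimate_nn_bytes 10000000 1

# About 10 TB stored as 10 thousand files of 1 GB
# (8 blocks each, assuming 128 MB blocks)
estimate_nn_bytes 10000 8
```

Under these assumptions the first layout needs about 3 GB of NameNode memory and the second about 13.5 MB, for roughly the same amount of data.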

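HDFS high availability

HDFS internal mechanisms

Block replica placement policy

HDFS write flow

HDFS read flow

HDFS data integrity

HDFS service scripts

The sbin directory

HDFS file operation commands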

bin/hadoop fs or bin/hdfs dfs
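
As an illustration of the most common file operations, here is a minimal sketch that sends a local file on a round trip through HDFS. The function name `hdfs_roundtrip`, the file `notes.txt` and the directory `/user/hadoop/demo` are placeholders, not names from this cluster. It assumes `hdfs` is on the PATH and the current user may write to the target directory.

```shell
# Round trip of one local file through HDFS using `hdfs dfs`.
# All paths passed in are the caller's; nothing here is cluster-specific.
hdfs_roundtrip() {
  local src=$1 dir=$2
  local name
  name=$(basename "$src")
  hdfs dfs -mkdir -p "$dir"                 # create the target directory
  hdfs dfs -put "$src" "$dir/"              # upload the local file
  hdfs dfs -ls "$dir"                       # list the directory
  hdfs dfs -cat "$dir/$name"                # print the file contents
  hdfs dfs -get "$dir/$name" "$src.copy"    # download it under a new name
  hdfs dfs -rm -r "$dir"                    # remove the demo directory
}
```

A call would look like `hdfs_roundtrip notes.txt /user/hadoop/demo`. The same subcommands should also work when written as `hadoop fs -mkdir`, `hadoop fs -put`, and so on.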



hdfs dfsadmin -report

[hadoop@hadoop0 bin]$ hdfs dfsadmin -report
Configured Capacity: 36182937600 (33.70 GB)
Present Capacity: 31441104896 (29.28 GB)
DFS Remaining: 31432712192 (29.27 GB)
DFS Used: 8392704 (8.00 MB)
DFS Used%: 0.03%
Under replicated blocks: 10
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 10.211.55.8:50010 (hadoop3)
Hostname: hadoop3
Decommission Status : Normal
Configured Capacity: 12060979200 (11.23 GB)
DFS Used: 2797568 (2.67 MB)
Non DFS Used: 945188864 (901.40 MB)
DFS Remaining: 10476720128 (9.76 GB)
DFS Used%: 0.02%
DFS Remaining%: 86.86%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Jan 22 20:54:24 CST 2018


Name: 10.211.55.7:50010 (hadoop4)
Hostname: hadoop4
Decommission Status : Normal
Configured Capacity: 12060979200 (11.23 GB)
DFS Used: 2797568 (2.67 MB)
Non DFS Used: 944205824 (900.46 MB)
DFS Remaining: 10477703168 (9.76 GB)
DFS Used%: 0.02%
DFS Remaining%: 86.87%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Jan 22 20:54:24 CST 2018


Name: 10.211.55.9:50010 (hadoop2)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 12060979200 (11.23 GB)
DFS Used: 2797568 (2.67 MB)
Non DFS Used: 943620096 (899.91 MB)
DFS Remaining: 10478288896 (9.76 GB)
DFS Used%: 0.02%
DFS Remaining%: 86.88%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Jan 22 20:54:24 CST 2018
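
The report is long, so it can help to boil it down to one line per DataNode. The following is a minimal sketch that assumes the report keeps the `Hostname:` and `DFS Remaining%:` labels shown above. `summarize_report` is a hypothetical helper name, and the sample input is an excerpt of the output above.

```shell
# Print "hostname remaining%" for each DataNode in a dfsadmin report.
# Reads the report text on standard input.
summarize_report() {
  awk -F': ' '
    $1 == "Hostname" { host = $2 }
    $1 == "DFS Remaining%" && host != "" { print host, $2; host = "" }
  '
}

# Try it on an excerpt of the report
summarize_report <<'EOF'
Hostname: hadoop3
DFS Used%: 0.02%
DFS Remaining%: 86.86%
Hostname: hadoop4
DFS Used%: 0.02%
DFS Remaining%: 86.87%
EOF
```

Against a live cluster the call would be `hdfs dfsadmin -report | summarize_report`.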

hdfs dfsadmin -safemode (safe mode)

[hadoop@hadoop0 bin]$ hdfs dfsadmin -safemode
Usage: hdfs dfsadmin [-safemode enter | leave | get | wait]

[hadoop@hadoop0 bin]$ hdfs dfsadmin -safemode get
Safe mode is OFF in hadoop0/10.211.55.5:9000
Safe mode is OFF in hadoop1/10.211.55.10:9000
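
With two NameNodes the command prints one line per NameNode, so a script has to check all of them. Here is a minimal sketch that assumes the `Safe mode is OFF in ...` line format shown above; `all_safemode_off` is a hypothetical helper name.

```shell
# Succeed only if every "Safe mode is ..." line on stdin reports OFF.
# Fails when any NameNode reports another state or when no such line is seen.
all_safemode_off() {
  awk '/^Safe mode is / { seen = 1; if ($4 != "OFF") bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}

# Try it on the output shown above
if printf '%s\n' \
     'Safe mode is OFF in hadoop0/10.211.55.5:9000' \
     'Safe mode is OFF in hadoop1/10.211.55.10:9000' | all_safemode_off
then
  echo "safe mode is off on all NameNodes"
fi
```

Against a live cluster the check would be `hdfs dfsadmin -safemode get | all_safemode_off`. To simply block until safe mode ends, the `wait` option listed in the usage line is the simpler choice.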

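Adding and removing HDFS nodes

Adding a new DataNode to the cluster
• Copy a Hadoop installation package, including its configuration files, to the new machine
• Start the DataNode on its own
sbin/hadoop-daemon.sh start datanode
Removing a failed or retired DataNode from the cluster
• Add the hostname or IP of the DataNode to the NameNode's exclude list (blacklist). To do this, edit hdfs-site.xml on the NameNode and set dfs.hosts.exclude to the path of a file that lists, one per line, the hostnames or IPs of the DataNodes to remove
• Refresh the exclude list
bin/hdfs dfsadmin -refreshNodes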
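
The exclude-list step can be sketched as a small helper. This assumes `dfs.hosts.exclude` points at a plain-text file with one hostname or IP per line; `add_to_exclude` is a hypothetical name, and the temporary file stands in for the real exclude file on the NameNode.

```shell
# Append a host to the exclude file once only; repeated calls change nothing.
add_to_exclude() {
  local file=$1 host=$2
  touch "$file"
  grep -qxF -- "$host" "$file" || printf '%s\n' "$host" >> "$file"
}

# Demo against a temporary file (a stand-in for the real exclude file)
exclude_file=$(mktemp)
add_to_exclude "$exclude_file" hadoop3
add_to_exclude "$exclude_file" hadoop3   # second call adds nothing
cat "$exclude_file"

# On the NameNode, the list would then be reloaded with:
#   hdfs dfsadmin -refreshNodes
```

Running `hdfs dfsadmin -report` afterwards shows the node's `Decommission Status`, which is one way to follow its progress.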

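HDFS data balancer

Handling cold data in HDFS

"Cold" data files
• Data files that have not been accessed for a long time (for example, within the last six months)
How to handle them
• Compress them with a high-compression-ratio algorithm such as Gzip or bzip2
• Merge small files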
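
Both steps can be combined before the data is written back to HDFS. The following is a minimal sketch that assumes the cold files are line-oriented text (so plain concatenation is meaningful) and that they sit in a local directory containing only regular files. `merge_and_compress` and every path shown are placeholders.

```shell
# Concatenate every file in a directory into a single gzip archive.
# The output file must live outside the source directory.
merge_and_compress() {
  local dir=$1 out=$2
  cat "$dir"/* | gzip -9 > "$out"
}

# Demo with two tiny files in a temporary directory
work=$(mktemp -d)
printf 'a\n' > "$work/part-0"
printf 'b\n' > "$work/part-1"
merge_and_compress "$work" "$work.gz"
gzip -dc "$work.gz"   # decompress to stdout to check the merged content

# The single compressed file could then be uploaded, for example:
#   hdfs dfs -put "$work.gz" /archive/cold/
```

If the small files already live in HDFS, `hdfs dfs -getmerge <src> <localdst>` can first pull them down as one local file. bzip2 can be swapped in for gzip when a higher compression ratio matters more than speed.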
