This post introduces the basic shell commands of the Hadoop HDFS client.
First, print a status report for the cluster:

hdfs dfsadmin -report
The output looks like this:
Configured Capacity: 123250802688 (114.79 GB)
Present Capacity: 100918267904 (93.99 GB)
DFS Remaining: 100473401344 (93.57 GB)
DFS Used: 444866560 (424.26 MB)
DFS Used%: 0.44%
Under replicated blocks: 7
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.37.130:50010 (anode2.mrbcy.tech)
Hostname: anode2.mrbcy.tech
Decommission Status : Normal
Configured Capacity: 41083600896 (38.26 GB)
DFS Used: 220000256 (209.81 MB)
Non DFS Used: 7411515392 (6.90 GB)
DFS Remaining: 33452085248 (31.15 GB)
DFS Used%: 0.54%
DFS Remaining%: 81.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Feb 19 19:42:23 CST 2017

Name: 192.168.37.129:50010 (anode1.mrbcy.tech)
Hostname: anode1.mrbcy.tech
Decommission Status : Normal
Configured Capacity: 41083600896 (38.26 GB)
DFS Used: 220393472 (210.18 MB)
Non DFS Used: 7484911616 (6.97 GB)
DFS Remaining: 33378295808 (31.09 GB)
DFS Used%: 0.54%
DFS Remaining%: 81.24%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Feb 19 19:42:23 CST 2017

Name: 192.168.37.131:50010 (anode3.mrbcy.tech)
Hostname: anode3.mrbcy.tech
Decommission Status : Normal
Configured Capacity: 41083600896 (38.26 GB)
DFS Used: 4472832 (4.27 MB)
Non DFS Used: 7436107776 (6.93 GB)
DFS Remaining: 33643020288 (31.33 GB)
DFS Used%: 0.01%
DFS Remaining%: 81.89%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Feb 19 19:42:23 CST 2017
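The full report is long; a plain grep is enough to pull out just the cluster-wide and per-node usage figures:

hdfs dfsadmin -report | grep "DFS Used%"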
hadoop fs

Running hadoop fs with no arguments prints its help, which is as follows:
Usage: hadoop fs [generic options]
	[-appendToFile <localsrc> ... <dst>]
	[-cat [-ignoreCrc] <src> ...]
	[-checksum <src> ...]
	[-chgrp [-R] GROUP PATH...]
	[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
	[-chown [-R] [OWNER][:[GROUP]] PATH...]
	[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
	[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-count [-q] [-h] <path> ...]
	[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
	[-createSnapshot <snapshotDir> [<snapshotName>]]
	[-deleteSnapshot <snapshotDir> <snapshotName>]
	[-df [-h] [<path> ...]]
	[-du [-s] [-h] <path> ...]
	[-expunge]
	[-find <path> ... <expression> ...]
	[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-getfacl [-R] <path>]
	[-getfattr [-R] {-n name | -d} [-e en] <path>]
	[-getmerge [-nl] <src> <localdst>]
	[-help [cmd ...]]
	[-ls [-d] [-h] [-R] [<path> ...]]
	[-mkdir [-p] <path> ...]
	[-moveFromLocal <localsrc> ... <dst>]
	[-moveToLocal <src> <localdst>]
	[-mv <src> ... <dst>]
	[-put [-f] [-p] [-l] <localsrc> ... <dst>]
	[-renameSnapshot <snapshotDir> <oldName> <newName>]
	[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
	[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
	[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
	[-setfattr {-n name [-v value] | -x name} <path>]
	[-setrep [-R] [-w] <rep> <path> ...]
	[-stat [format] <path> ...]
	[-tail [-f] <file>]
	[-test -[defsz] <path>]
	[-text [-ignoreCrc] <src> ...]
	[-touchz <path> ...]
	[-truncate [-w] <length> <path> ...]
	[-usage [cmd ...]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

Now list the root directory of HDFS:

hadoop fs -ls /
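The usage listing above also includes a -help option, which prints the detailed description of any single subcommand and is handy before trying one out. For example:

hadoop fs -help put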
Next, upload a local file to HDFS:

hadoop fs -put /root/GitHubLog.txt /
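Note that -put fails if the destination already exists. As the usage listing above shows, adding -f forces an overwrite:

hadoop fs -put -f /root/GitHubLog.txt /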
Then use ls again to view the file list:
hadoop fs -ls /
The output is:
Found 1 items
-rw-r--r--   2 root supergroup      23562 2017-02-18 12:53 /GitHubLog.txt

The actual data files live on the DataNodes, under the data directory /root/apps/hadoop-2.7.3/data/hdfs/data/current/BP-2098819308-192.168.37.143-1487177379733/current/finalized/subdir0/subdir0.
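Rather than digging through DataNode directories by hand, you can also ask HDFS directly which blocks a file is made of. fsck prints the block IDs and, with these flags, the DataNodes holding each replica:

hdfs fsck /GitHubLog.txt -files -blocks -locations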
It contains the following file list:

-rw-r--r-- 1 root root     23562 Feb 18 12:53 blk_1073741825
-rw-r--r-- 1 root root       195 Feb 18 12:53 blk_1073741825_1001.meta
You can inspect the contents of the blk_1073741825 file with the following command.
cat blk_1073741825
You will find that the output is exactly the contents of GitHubLog.txt. That is because the file is too small to be split: by default, HDFS splits a file into multiple blocks only once it exceeds the 128 MB block size.
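The 128 MB default comes from the dfs.blocksize property, and the generic -D option shown in the usage listing lets you override it for a single upload. A minimal sketch, assuming a 64 MB block size (the destination path /blocksize-test is just an example):

hadoop fs -D dfs.blocksize=67108864 -put /root/GitHubLog.txt /blocksize-test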
Now upload a file larger than 128 MB.
hadoop fs -put /root/temp/hadoop-2.7.3.tar.gz /
Then check the result with the ls command.
hadoop fs -ls /
The result is:
Found 2 items
-rw-r--r--   2 root supergroup      23562 2017-02-18 12:53 /GitHubLog.txt
-rw-r--r--   2 root supergroup  214092195 2017-02-18 13:09 /hadoop-2.7.3.tar.gz

Now look inside the DataNode's local file system again. The current file list is:
-rw-r--r-- 1 root root     23562 Feb 18 12:53 blk_1073741825
-rw-r--r-- 1 root root       195 Feb 18 12:53 blk_1073741825_1001.meta
-rw-r--r-- 1 root root 134217728 Feb 18 13:09 blk_1073741826
-rw-r--r-- 1 root root   1048583 Feb 18 13:09 blk_1073741826_1002.meta
-rw-r--r-- 1 root root  79874467 Feb 18 13:09 blk_1073741827
-rw-r--r-- 1 root root    624027 Feb 18 13:09 blk_1073741827_1003.meta

The 214092195-byte tarball was cut into a 134217728-byte block (exactly 128 MB) and a 79874467-byte remainder. Next, let's try concatenating blk_1073741826 and blk_1073741827 and see whether the result can be extracted.
cat blk_1073741826 >> tmp.file
cat blk_1073741827 >> tmp.file
tar -zxvf tmp.file

After these operations the file list is:
-rw-r--r-- 1 root root     23562 Feb 18 12:53 blk_1073741825
-rw-r--r-- 1 root root       195 Feb 18 12:53 blk_1073741825_1001.meta
-rw-r--r-- 1 root root 134217728 Feb 18 13:09 blk_1073741826
-rw-r--r-- 1 root root   1048583 Feb 18 13:09 blk_1073741826_1002.meta
-rw-r--r-- 1 root root  79874467 Feb 18 13:09 blk_1073741827
-rw-r--r-- 1 root root    624027 Feb 18 13:09 blk_1073741827_1003.meta
drwxr-xr-x 9 root root      4096 Aug 18  2016 hadoop-2.7.3/
-rw-r--r-- 1 root root 214092195 Feb 18 13:16 tmp.file

The extraction succeeded, which shows that HDFS does not do anything special to the data: it simply cuts a large file into raw blocks.
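To confirm the concatenated blocks are byte-for-byte identical to the original archive, rather than merely extractable, compare checksums. This assumes the source tarball is still at /root/temp/hadoop-2.7.3.tar.gz:

md5sum tmp.file /root/temp/hadoop-2.7.3.tar.gz

The two digests should match.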
View a file's contents (hadoop fs provides -cat rather than a -less subcommand):

hadoop fs -cat /GitHubLog.txt
Download a file to the local working directory:

hadoop fs -get /GitHubLog.txt
Delete a file:

hadoop fs -rm /GitHubLog.txt
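Per the usage listing above, -rm also accepts -r (or -R) to delete directories recursively and -skipTrash to bypass the trash so space is reclaimed immediately. For example, to permanently remove the test upload from the earlier sketch:

hadoop fs -rm -skipTrash /blocksize-test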