How to find the size of an HDFS file

Size of an HDFS file in the Hadoop file system

Sometimes we need to check the size of an HDFS file to understand how much space it occupies. In that situation, we can use a few Hadoop commands to get the size of the file.

hadoop fs -ls command:

The hadoop fs -ls command is mainly used to list the files under a given HDFS directory. Its output also includes the size of each file in that directory, so it can be used to check file sizes as well.

Example: hadoop fs -ls /apps/cnn_bnk

This command returns the list of files under the directory /apps/cnn_bnk. For each file it shows the read/write permissions, the replication factor, the owner and group, the size in bytes, the modification date and time, and the name of the file.

Here there are 2 files stored under the directory /apps/cnn_bnk, and their sizes are 137087 and 825 bytes.
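The size column can also be pulled out of the listing with a small script. The following is a minimal sketch, not part of Hadoop itself: ls_sizes is a hypothetical helper name, and it assumes the usual -ls column order (permissions, replication, owner, group, size, date, time, path) and paths without spaces.

```shell
# ls_sizes: hypothetical helper (not part of Hadoop).
# Reads `hadoop fs -ls` output on stdin and prints "<size-in-bytes> <path>"
# for each entry. Assumes the 8-column layout:
#   permissions replication owner group size date time path
# The "Found N items" header has fewer columns and is skipped.
ls_sizes() {
    awk 'NF >= 8 { print $5, $8 }'
}

# Run against the directory from this article only if a Hadoop client is installed.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -ls /apps/cnn_bnk | ls_sizes
fi
```

Each output line then holds only the byte count and the file path, which is easier to sort or sum than the full listing.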


hadoop fs -du -s -h command:

The hadoop fs -du -s -h command is used to check the size of an HDFS file or directory in human-readable format. Since HDFS replicates every file, the actual physical space a file consumes is its size multiplied by its replication factor.
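The replication arithmetic can be sketched in a few lines of shell. The helper name physical_size is hypothetical, and the replication factor of 3 used in the example is an assumption (it is a common HDFS default, but the article does not state the real value). On a live cluster, hadoop fs -stat can report both numbers through its %b (size in bytes) and %r (replication) format specifiers.

```shell
# physical_size: hypothetical helper (not part of Hadoop).
# Multiplies a logical file size in bytes by a replication factor.
physical_size() {
    echo $(( $1 * $2 ))
}

# A 137087-byte file with an ASSUMED replication factor of 3.
physical_size 137087 3   # prints 411261

# With a Hadoop client available, read size and replication from the cluster.
# %b = file size in bytes, %r = replication factor.
if command -v hadoop >/dev/null 2>&1 &&
   stat_out=$(hadoop fs -stat "%b %r" /apps/cnn_bnk/customer1_txn.txt 2>/dev/null); then
    # stat_out is left unquoted on purpose so it splits into two arguments.
    physical_size $stat_out
fi
```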

Example: hadoop fs -du -s -h /apps/cnn_bnk/customer1_txn.txt

This command returns the size of the file /apps/cnn_bnk/customer1_txn.txt in units such as KB, MB or GB. Here the size of the given HDFS file is 133 KB.
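To relate a byte count from the -ls listing to a human-readable figure, the conversion can be sketched as follows. The helper bytes_to_kb is a hypothetical name, and the sketch assumes 1 KB = 1024 bytes; the exact rounding and unit label printed by the du command may differ between Hadoop versions.

```shell
# bytes_to_kb: hypothetical helper (not part of Hadoop).
# Converts a byte count to kilobytes with one decimal place,
# assuming 1 KB = 1024 bytes.
bytes_to_kb() {
    awk -v b="$1" 'BEGIN { printf "%.1f KB\n", b / 1024 }'
}

bytes_to_kb 137087   # prints 133.9 KB

if command -v hadoop >/dev/null 2>&1; then
    # -s summarizes to a single line, -h switches to human-readable units.
    hadoop fs -du -s -h /apps/cnn_bnk/customer1_txn.txt
    # Without -h the same command reports the size in bytes.
    hadoop fs -du -s /apps/cnn_bnk/customer1_txn.txt
fi
```

Depending on the Hadoop version, the du output may contain a second size column showing the disk space consumed across all replicas.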

