[Image via BSB]

In a typical Hadoop slave node, a node running as a datanode and tasktracker, it is typical to provide 75% of the disk for HDFS storage and the remaining 25% for MapReduce intermediate data. The MapReduce intermediate data is the data created after a map task has run over an input split, typically a HDFS block.  Given a single disk, this is simple task, but with multiple slavenode disks the decision becomes more complicated. We want to choose the best disk configuration to maximize all available resources.

Assume we have 4x 1TB disks in our example slavenode.

The logical assumption would be to assign 3x 1TB disks for HDFS storage and 1x 1TB disk for MapReduce intermediate data storage. The problem with this approach is that we sacrifice potential HDFS throughout by assigning one full volume to only store MapReduce intermediate data.

The better approach is to store both HDFS and MapReduce intermediate data on each disk on the slave node. This can be accomplished a few different ways. One way would be to use separate partitions, but using this method would leave you stuck if you ever needed to change the percentage split (e.g. allocated more storage for HDFS or MapReduce intermidate data). Another way is to use the dfs.datanode.du.reserve configuration property in hdfs-site.xml  to control the split by specifying separate directories for each on the same volume. This would allow you to modify the capacity available to HFDS on the fly after a namenode restart.

Another inserting solution, I heard a classmate say, is to use the native Linux file system user disk quota system. This should work because the MapReduce daemon and HDFS daemon both run under separate user accounts, but I’m not sure how thoroughly it has been tested.