SC14 New Orleans, LA

The International Conference for High Performance Computing, Networking, Storage and Analysis

I/O Monitoring in a Hadoop Cluster


Authors: Carson L. Wiens (Los Alamos National Laboratory), Joshua M. C. Long (Los Alamos National Laboratory), Joel R. Ornstein (Los Alamos National Laboratory)

Abstract: As data sizes for scientific computations grow, more of these computations are bottlenecked by disk input and output rather than by processing speed. Monitoring reads and writes has therefore become an important component of distributed computing clusters. Our team investigated several options for monitoring I/O on a Hadoop cluster, including the Splunk app for HadoopOps, Ganglia, and log file output. Each of these three methods was evaluated for compatibility, ease of use, and display. Surprisingly, although Splunk HadoopOps is made specifically for Hadoop clusters, the other monitoring programs and techniques still proved useful. Using our monitoring tools, we were also able to observe input and output behavior across different cluster architectures. LA-UR-14-25814

Poster: pdf
Two-page extended abstract: pdf

