Publications

MOSAIC: Detection and Categorization of I/O Patterns in HPC Applications

Abstract: With the gap between computing power and I/O performance growing ever wider on HPC systems, it is becoming crucial to optimize how applications perform I/O on storage resources. To achieve this, a good understanding of application I/O behavior is an essential preliminary step. In this paper, we introduce MOSAIC, a method for categorizing applications according to their I/O behavior. We first propose an abstraction for characterizing I/O operations in terms of periodicity, temporality, and metadata access. We then present a set of segmentation-based techniques for quickly and automatically detecting meaningful data access patterns. Applied to a full set of real-world I/O traces from the Blue Waters supercomputer, MOSAIC characterizes them with 92% accuracy.