Uppsala Architecture Research Team
Online Phase Detection
Efficient Software-based Online Phase Classification
Many programs exhibit execution phases with time-varying behavior. Phase detection has been used extensively to find short, representative simulation points that quickly yield representative results for long-running applications. Several hardware-assisted phase detection schemes have also been proposed to guide various optimizations and hardware configurations. This paper explores the feasibility of low-overhead phase detection at runtime based entirely on existing features found in modern processors. If successful, such a technology would be useful for cache management, frequency adjustment, runtime scheduling, and profiling techniques. The paper evaluates several existing and new alternatives for efficient runtime data collection and online phase detection. It presents ScarPhase (Sample-based Classification and Analysis for Runtime Phases), a new online phase detection library. ScarPhase makes extensive use of new hardware counter features, introduces a new phase classification heuristic, and suggests a way to dynamically adjust the sample rate. ScarPhase exhibits runtime overhead below 2%.
Figure: Phase detection using PEBS to sample and cluster branches.
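The general idea of classifying phases online from sampled branches can be sketched as follows. This is a minimal illustration and not ScarPhase's actual heuristic: it assumes each execution window is summarized by hashing sampled branch addresses into a small frequency vector, and it uses simple leader-follower clustering with a Manhattan-distance threshold. The names `signature`, `OnlinePhaseClassifier`, the bucket count, and the threshold value are all hypothetical choices made for this example.

```python
def signature(branch_addrs, buckets=32):
    """Hash sampled branch addresses into a normalized frequency vector.

    Assumption: the low two address bits are dropped and the rest is
    folded into `buckets` slots; a real tool may hash differently.
    """
    counts = [0] * buckets
    for addr in branch_addrs:
        counts[(addr >> 2) % buckets] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def manhattan(a, b):
    """Manhattan (L1) distance between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


class OnlinePhaseClassifier:
    """Leader-follower clustering: one stored signature per known phase."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.leaders = []           # leaders[i] is the signature of phase i

    def classify(self, branch_addrs):
        """Return the phase id for one window of sampled branch addresses."""
        sig = signature(branch_addrs)
        best_id, best_dist = None, None
        for pid, leader in enumerate(self.leaders):
            dist = manhattan(sig, leader)
            if best_dist is None or dist < best_dist:
                best_id, best_dist = pid, dist
        if best_id is not None and best_dist <= self.threshold:
            return best_id
        # No sufficiently similar phase seen so far: start a new one.
        self.leaders.append(sig)
        return len(self.leaders) - 1


# Synthetic windows standing in for hardware branch samples.
loop_a = [0x1000, 0x1004, 0x1008] * 10
loop_b = [0x2040, 0x2044] * 15
noisy_a = loop_a + [0x2040]

clf = OnlinePhaseClassifier()
phases = [clf.classify(w) for w in (loop_a, loop_b, loop_a, noisy_a)]
print(phases)
```

In this sketch a window is matched only against one leader per phase, so the cost per window grows with the number of phases seen rather than with the number of windows, which is one reason leader-follower schemes suit online use.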
Phase Behavior in Serial and Parallel Applications
It is well known that most serial programs exhibit time-varying behavior, for example, alternating between memory- and compute-bound phases. However, most research into program phase behavior has focused on the serial SPEC benchmark suite, with little investigation into large-scale phase behavior in parallel applications. In this study we compare and examine the time-varying behavior of the SPEC2006 (serial) and PARSEC 2.1 (parallel) benchmark suites, and investigate the program phase behavior found in parallel applications with different parallelization models. To this end, we extend a general-purpose runtime phase detection library to handle parallel applications. Our results reveal that serial applications have significantly more program phases (2.4x) with larger variation in CPI (1.5x) compared to parallel applications. While parallel applications have fewer phases, they still exhibit interesting phase behavior. In particular, we find that data-parallel applications have shorter phases with more threads. This makes phase-guided runtime optimizations (e.g., dynamic voltage and frequency scaling) less attractive as the number of threads grows, and such optimizations are therefore much more difficult to exploit in parallel applications.
Figure: Parallel phase detection demonstrating thread-local phases.
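The kind of per-thread comparison described above (number of phases, CPI variation, phase length) can be illustrated with a small sketch. It is an assumption-laden example rather than the study's methodology: the abstract does not say how CPI variation was measured, so the coefficient of variation is used here as a stand-in, and the `phase_summary` function and the interval tuple layout are hypothetical.

```python
from itertools import groupby
from statistics import mean, pstdev


def phase_summary(intervals):
    """Summarize one thread's phase behavior.

    `intervals` is a list of (phase_id, cycles, instructions) tuples,
    one per fixed-size execution window (hypothetical layout).
    """
    cpis = [cycles / instrs for _, cycles, instrs in intervals]
    phase_ids = [pid for pid, _, _ in intervals]
    # Length of each run of consecutive windows sharing a phase id.
    runs = [len(list(group)) for _, group in groupby(phase_ids)]
    return {
        "num_phases": len(set(phase_ids)),
        # Assumption: coefficient of variation as the CPI variation metric.
        "cpi_cv": pstdev(cpis) / mean(cpis),
        "mean_run_length": mean(runs),
    }


# Synthetic per-thread data: thread 0 alternates phases, thread 1 is stable.
threads = {
    0: [(0, 100, 100), (0, 100, 100), (1, 300, 100), (1, 300, 100)],
    1: [(0, 150, 100), (0, 150, 100), (0, 150, 100), (0, 150, 100)],
}
summaries = {tid: phase_summary(ivs) for tid, ivs in threads.items()}
for tid, s in summaries.items():
    print(tid, s)
```

Keeping the summary per thread, rather than per process, matches the thread-local view of phases: two threads of the same application can be in different phases at the same time.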