Date Awarded

2020

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.)

Department

Computer Science

Advisor

Xu Liu

Committee Member

Weizhen Mao

Committee Member

Andreas Stathopoulos

Committee Member

Bin Ren

Abstract

Complex codebases with several layers of abstractions have abundant inefficiencies that affect the performance. These inefficiencies arise due to various causes such as developers' inattention to performance, inappropriate choice of algorithms and inefficient code generation among others. To eliminate the redundancies, lots of work has been done during the compiling phase. However, not all redundancies can be easily detected or eliminated with compiler optimization passes due to aliasing, limited optimization scopes, and insensitivity to input and execution contexts act as severe deterrents to static program analysis. There are also profiling tools which can reveal how resources are used. However, they can hard to distinguish whether the resource is worth fully used. More profiling tools are in needed to diagnose resource wastage and pinpoint inefficiencies. We have developed three tools to pinpoint different types of inefficiencies in different granularity. We build Runtime Value Numbering (RVN), a dynamic fine-grained profiler to pinpoint and quantify redundant computations in an execution. It is based on the classical value numbering technique but works at runtime instead of compile-time. We developed RedSpy, a fine-grained profiler to pinpoint and quantify value redundancies in program executions. Value redundancy may happen overtime at the same locations or in adjacent locations, and thus it has temporal and spatial locality. RedSpy identifies both temporal and spatial value locality. Furthermore, RedSpy is capable of identifying values that are approximately the same, enabling optimization opportunities in HPC codes that often use floating-point computations. RVN and RedSpy are both instrumentation based tools. They provide comprehensive result while introducing high space and time overhead. Our lightweight framework, Witch, samples consecutive accesses to the same memory location by exploiting two ubiquitous hardware features: the performance monitoring units (PMU) and debug registers. Witch performs no instrumentation. Hence, witchcraft - tools built atop Witch - can detect a variety of software inefficiencies while introducing negligible slowdown and insignificant memory consumption and yet maintaining accuracy comparable to exhaustive instrumentation tools. Witch allowed us to scale our analysis to a large number of codebases. All the tools work on fully optimized binary executable and provide insightful optimization guidance by apportioning redundancies to their provenance - source lines and full calling contexts. We apply RVN, RedSpy, and Witch on programs that were optimization targets for decades and guided by the tools, we were able to eliminate redundancies that resulted in significant speedups.

DOI

http://dx.doi.org/10.21220/s2-t14d-3d56

Rights

© The Author

Share

COinS