Date Awarded

2004

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.)

Department

Computer Science

Advisor

andreas Stathopoulos

Abstract

As commodity computers and networking technologies have become faster and more affordable, fairly capable machines have become nearly ubiquitous while the effective "distance" between them has decreased as network connectivity and capacity has multiplied. There is considerable interest in developing means to readily access such vast amounts of computing power to solve scientific problems, but the complexity of these modern computing environments pose problems for conventional computer codes designed to run on a static, homogeneous set of resources. One source of problems is the heterogeneity that is naturally present in these settings. More problematic is the competition that arises between programs for shared resources in these semi-autonomous environments. Fluctuations in the availability of CPU, memory, and other resources can cripple application performance. Contention for CPU time between jobs may introduce significant load imbalance in parallel applications. Contention for limited memory resources may cause even more severe performance problems, as thrashing may increase execution times by an order of magnitude or more.;Our goal is to develop techniques that enable scientific applications to achieve good performance in non-dedicated environments by monitoring system conditions and adapting their behavior accordingly. We focus on two important shared resources, CPU and memory, and pursue our goal on two distinct but complementary fronts: First, we present some simple algorithmic modifications that can significantly improve load balance in a class of iterative methods that form the computational core of many scientific and engineering applications. Second, we introduce a framework for enabling scientific applications to dynamically adapt their memory usage according to current availability of main memory. An application-specific caching policy is used to keep as much of the data set as possible in main memory, while the remainder of the data are accessed in an out-of-core fashion.;We have developed modular code libraries to facilitate implementation of our techniques, and have deployed them in a variety of scientific application kernels. Experimental evaluation of their performance indicates that our techniques provide some important classes of scientific applications with robust and low-overhead means for mitigating the effects of fluctuations in CPU and memory availability.

DOI

https://dx.doi.org/doi:10.21220/s2-my1g-nn08

Rights

© The Author

Share

COinS