VIRTUAL REALITY: UNDERSTANDING MASSIVELY PARALLEL COMPUTER SYSTEMS

Daniel A. Reed

Department of Computer Science
University of Illinois
Urbana, Illinois 61801

CONTACT INFORMATION

Email reed@cs.uiuc.edu
Telephone (217) 333-3807
FAX (217) 244-6869

WWW PAGE

http://www-pablo.cs.uiuc.edu/Projects/VR/

PROGRAM AREA

Virtual Environments

KEYWORDS

Virtual reality, parallel computing, performance analysis, adaptive control, WWW behavior, sonification

PROJECT SUMMARY

The crux of our ongoing work is an exploration of dynamic, immersive data presentation and interaction techniques applicable to performance data captured from massively parallel computer systems, distributed applications, and information servers. Although there is no general theory that predicts the performance effects of software changes, a cycle of software changes and performance measurements does permit application tuning. Hence, recording and analyzing the dynamics of applications, system software, and hardware interactions are the keys to understanding and tuning performance.

By exploiting our Pablo performance analysis software, which supports machine-independent capture and manipulation of dynamic performance data, we have developed an immersive virtual environment, called Avatar, for displaying, analyzing, and interacting with the dynamic performance data obtained via Pablo. Avatar operates with (a) a head-mounted display and tracker, (b) the CAVE virtual reality theater, or (c) a workstation display with stereo glasses. It supports three domain-independent display metaphors (a three-dimensional generalization of scatterplot matrices, a ``time tunnel,'' and a geographic display) with data sonification and sound spatialization.

In a two-dimensional scatterplot matrix, all possible pairs of dimensions for a set of n-dimensional data are plotted against each other in separate scatterplots arranged in an n by n matrix. This shows all possible two-dimensional data projections and can be used to determine data dispersion and bivariate correlations. Our three-dimensional generalization of scatterplot matrices, which we call a scattercube, contains n^3 three-dimensional scatterplots, allowing users to walk around and inside the data.
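As a minimal sketch of the counting argument above (not Avatar's implementation), the following Python fragment enumerates the n^3 axis triples of a scattercube and projects one n-dimensional sample onto a chosen triple. The metric names and the dictionary-based sample format are assumptions made for illustration.

```python
from itertools import product

def scattercube_axes(metric_names):
    """Return every ordered triple of metrics, one per scatterplot cube."""
    return list(product(metric_names, repeat=3))

def project(sample, axes):
    """Project one n-dimensional sample (a dict) onto a triple of metrics."""
    return tuple(sample[name] for name in axes)

# Hypothetical metric names; any n performance metrics would do.
metrics = ["cpu_busy", "msgs_sent", "io_wait"]
cubes = scattercube_axes(metrics)          # n = 3 gives 3**3 = 27 cubes

sample = {"cpu_busy": 0.8, "msgs_sent": 12.0, "io_wait": 0.1}
point = project(sample, ("io_wait", "cpu_busy", "msgs_sent"))
```

Ordered triples with repetition are used so that the count matches n^3; a display that wanted only distinct, unordered triples would use combinations instead.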

In each scatterplot cube, the coordinate axes correspond to three of the n performance metrics, and the time-varying position of each processor in the scatterplot cube is determined by the current values of the associated performance metrics. Geometrically, the behaviors of the p processors then define a set of p curves in an n-dimensional performance metric space, with each scatterplot cube showing a different three-dimensional projection of these trajectories. To help analyze data point trajectories, it is possible to display them using history ribbons (i.e., markers of data paths). Interactively enabling history ribbons for a subset of the data points allows one to see if the selected points cluster in one or more scattercubes.
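The idea of a history ribbon can be sketched as a bounded trail of recent positions for one processor within one scatterplot cube. The class below is an illustrative assumption, not Avatar code; the ribbon length and the list-of-metrics sample format are hypothetical.

```python
from collections import deque

class HistoryRibbon:
    """Keep the most recent positions of one processor in one scatterplot cube."""

    def __init__(self, axes, length=4):
        self.axes = axes                      # indices of the three metrics shown
        self.points = deque(maxlen=length)    # oldest positions are dropped

    def update(self, metrics):
        """Record the processor's current position from its n metric values."""
        i, j, k = self.axes
        self.points.append((metrics[i], metrics[j], metrics[k]))

    def path(self):
        """Return the retained positions, oldest first."""
        return list(self.points)

ribbon = HistoryRibbon(axes=(0, 2, 3), length=3)
for step in range(5):
    # Hypothetical 4-metric sample for one processor at each time step.
    ribbon.update([step, 10 * step, 100 * step, 1000 * step])
```

Each update is one three-dimensional projection of the processor's point in the n-dimensional metric space; the retained path is what a ribbon would trace.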

Scattercubes provide a statistical view of behavior, but no data that can be mapped directly to specific source code in an application program. The time tunnel shows the time-evolutionary behavior of a parallel code via a display consisting of a cylinder whose major axis is time. Along the cylinder periphery, each line is composed of segments, where the color and length of each segment indicate the type and duration of a behavior in a parallel program. Cross-processor interactions (e.g., via message passing) are represented by chords that cut through the interior of the cylinder.
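The time tunnel geometry can be sketched as follows, assuming (this is not stated above) that each peripheral line corresponds to one processor and that lines are spaced evenly around the cylinder. The state names and colors are hypothetical.

```python
import math

def periphery_point(proc, nprocs, t, radius=1.0):
    """Point on the cylinder surface for processor `proc` at time `t` (z axis)."""
    theta = 2.0 * math.pi * proc / nprocs
    return (radius * math.cos(theta), radius * math.sin(theta), t)

def segment(proc, nprocs, state, start, end, colors):
    """One colored line segment: color encodes state, length encodes duration."""
    return {
        "color": colors[state],
        "from": periphery_point(proc, nprocs, start),
        "to": periphery_point(proc, nprocs, end),
    }

def chord(src, dst, nprocs, t_send, t_recv):
    """A message drawn as a chord through the cylinder interior."""
    return (periphery_point(src, nprocs, t_send),
            periphery_point(dst, nprocs, t_recv))

COLORS = {"compute": "green", "send": "red", "wait": "blue"}  # hypothetical
seg = segment(0, 4, "compute", 0.0, 2.5, COLORS)   # processor 0 computes for 2.5
msg = chord(0, 2, 4, 2.5, 2.7)                     # message from 0 to 2
```

Because processors 0 and 2 sit on opposite sides of a four-processor tunnel, the chord for this message passes close to the cylinder's central axis.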

In general, we have found that a combination of the scattercube and time tunnel metaphors provides the most insight into qualitative and quantitative behavior. The scattercubes allow one to understand performance metric correlations, whereas the time tunnel allows one to see state transitions and behavioral skews in parallel program executions.

Performance data for distributed systems, communication networks, and the WWW are geographically distributed, making it appealing to display and interact with such data when mapped to geographic location. Although there is a plethora of possible projections, to date we have relied on a ``globe in space'' view for global perspective and a simple flat projection for local views. The globe consists of a texture map of the world on a sphere whose surface includes altitude relief and political boundaries.
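The two views can be sketched as coordinate mappings from latitude and longitude. The spherical mapping below is the standard one; the flat mapping is an equirectangular projection, which is an assumption here because the text does not name the flat projection actually used.

```python
import math

def globe_point(lat_deg, lon_deg, radius=1.0, height=0.0):
    """Map latitude/longitude to a 3-D point on (or above) a sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    r = radius + height                      # height lifts a glyph off the surface
    return (r * math.cos(lat) * math.cos(lon),
            r * math.cos(lat) * math.sin(lon),
            r * math.sin(lat))

def flat_point(lat_deg, lon_deg):
    """Assumed equirectangular projection: x is longitude, y is latitude."""
    return (lon_deg, lat_deg)

origin = globe_point(0.0, 0.0)               # equator at the prime meridian
pole = globe_point(90.0, 0.0, height=0.5)    # raised above the north pole
local = flat_point(40.1, -88.2)
```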

On the globe or its projection, data can be displayed either as arcs between source and destination or as stacked bars. The former can be used to display communication traffic (e.g., from the Internet or the experimental vBNS network), with the thickness, height, and color of the arc representing specific data attributes. Stacked bars convey information through three mechanisms: position, height, and color bands. When displaying the characteristics of requests to WWW servers, each bar is placed at the geographic origin of a WWW request, with the bar heights showing attributes of the requests from that location, typically the number of bytes or the number of requests relative to other sites.
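A stacked bar of this kind can be sketched as a per-site aggregation. The record layout, the use of document type for the color bands, and the normalization against the busiest site are all assumptions made for illustration.

```python
from collections import defaultdict

def stacked_bars(requests):
    """Group requests by origin and stack per-category byte counts into bands."""
    totals = defaultdict(lambda: defaultdict(int))
    for origin, category, nbytes in requests:
        totals[origin][category] += nbytes
    if not totals:
        return {}
    peak = max(sum(bands.values()) for bands in totals.values())
    bars = {}
    for origin, bands in totals.items():
        # Heights are relative to the busiest site, so the tallest bar is 1.0.
        bars[origin] = {cat: n / peak for cat, n in sorted(bands.items())}
    return bars

# Hypothetical records: ((latitude, longitude), document type, bytes sent).
requests = [
    ((40.1, -88.2), "html", 300),
    ((40.1, -88.2), "image", 700),
    ((51.5, -0.1), "html", 250),
    ((51.5, -0.1), "html", 250),
]
bars = stacked_bars(requests)
```

Position comes from the origin key, total height from the sum of a bar's bands, and each band would be drawn in its own color.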

We have successfully used Avatar to study the correlations of large numbers of dynamic performance metrics from parallel applications, to interactively control input/output caching and prefetching policies for parallel applications, and to display real-time data on the behavior of NCSA's World Wide Web (WWW) server. At Supercomputing '95, we used this extended version of Avatar to display the real-time behavior of WWW servers using the I-WAY. (The I-WAY is an experimental ATM network connecting geographically distributed parallel systems.) In general, our experience with Avatar has shown that the combination of performance metric correlation and the ability to interactively modify application behavior provides a powerful mechanism for gaining insight into behavioral dynamics and for performance optimization.

PROJECT REFERENCES

D. A. Reed, Shields, K. A., Scullin, W. H., Tavera, L. F., and Elford, C. L., ``Virtual Reality and Parallel Systems Performance Analysis,'' IEEE Computer, November 1995

T. T. Kwan, McGrath, R. E., and Reed, D. A., ``NCSA's World Wide Web Server: Design and Performance,'' IEEE Computer, November 1995

W. H. Scullin, Kwan, T. T., and Reed, D. A., ``Real-Time Visualization of World Wide Web Traffic,'' Symposium on Visualizing Time-Varying Data, September 1995

AREA BACKGROUND

As scalable parallel systems based on commodity microprocessor and memory building blocks become the standard architecture for high-performance computing, there is growing realization that maximizing achieved performance requires an understanding of software and hardware component interactions. Not only do hundreds of processors interact on a microsecond time scale, but the space of possible performance optimizations is also large, complex, and highly sensitive to software behavior.

Given the complexity of parallel systems and the space of possible performance optimizations, one of the keys to tuning application performance is the capture and analysis of dynamic performance data and a concomitant understanding of the effects of software changes on performance. Just as a logic analyzer allows a hardware designer to study signal transitions, software instrumentation provides the raw data needed to understand the spatial and temporal interactions of parallel tasks.

Event tracing, by combining elements of both counting and timing, is the most general instrumentation mechanism. Intuitively, at each event of interest, event tracing generates a performance data record that specifies the identity of the event, the time the event occurred, and any other ancillary information associated with the event. Because a typical event trace implicitly defines 5--50 dynamic performance metrics for each of tens or hundreds of processors, visualizing and understanding the time-varying correlations of these trajectories is key.
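To illustrate how a trace combines counting and timing, the sketch below records events with an identity, a timestamp, a processor number, and ancillary information, then derives both counts and durations from the same records. It is not Pablo's trace format; the record fields, event names, and injected clock are hypothetical.

```python
import time
from collections import Counter, defaultdict
from dataclasses import dataclass, field

@dataclass
class TraceRecord:
    event: str                      # identity of the event
    timestamp: float                # time the event occurred
    processor: int
    info: dict = field(default_factory=dict)   # ancillary information

class Tracer:
    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.records = []

    def record(self, event, processor, **info):
        self.records.append(TraceRecord(event, self.clock(), processor, info))

def event_counts(records):
    """Counting: occurrences of each event on each processor."""
    return Counter((r.processor, r.event) for r in records)

def durations(records, begin, end):
    """Timing: total time between matching begin/end events per processor."""
    open_at, total = {}, defaultdict(float)
    for r in sorted(records, key=lambda r: r.timestamp):
        if r.event == begin:
            open_at[r.processor] = r.timestamp
        elif r.event == end and r.processor in open_at:
            total[r.processor] += r.timestamp - open_at.pop(r.processor)
    return dict(total)

# A fake clock makes the example deterministic.
ticks = iter([0.0, 1.0, 4.0, 6.0])
tracer = Tracer(clock=lambda: next(ticks))
tracer.record("read_begin", processor=0, nbytes=4096)
tracer.record("read_begin", processor=1, nbytes=512)
tracer.record("read_end", processor=0)
tracer.record("read_end", processor=1)

counts = event_counts(tracer.records)
read_time = durations(tracer.records, "read_begin", "read_end")
```

Each derived quantity, sampled over time for every processor, is one of the dynamic performance metrics that the trace implicitly defines.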

AREA REFERENCES

M. L. Simmons, Hayes, A. H., Reed, D. A., and Brown, J., Debugging and Performance Tuning for Parallel Computing Systems, IEEE Computer Society Press, 1995

B. P. Miller, Clark, M., Hollingsworth, J., Kierstead, S., Lim, S-S., and Torzewski, T., ``IPS-2: The Second Generation of a Parallel Program Measurement System,'' IEEE Transactions on Parallel and Distributed Systems, Vol. 1, No. 2, pages 206--217, April 1990

RELATED PROGRAM AREAS

Adaptive human interfaces, other communication modalities

POTENTIAL RELATED PROJECTS

Using captured performance data, one can manually identify performance bottlenecks in both application and system software. However, integrating dynamic performance instrumentation and on-the-fly performance data reduction with configurable, malleable resource management algorithms opens the possibility of interactive and automatic real-time adaptive control mechanisms that choose and configure resource management algorithms based on application request patterns and observed system performance. Such an approach is suitable for use with parallel task scheduling and input/output as well as management of distributed information servers based on the WWW.
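A minimal sketch of such a control loop, applied to input/output prefetching, might look as follows. The controller, its policy names, the sliding window, and the sequentiality threshold are all hypothetical; a real mechanism would also weigh observed system performance.

```python
from collections import deque

class AdaptivePrefetchController:
    """Choose a prefetch policy from the recent pattern of read requests."""

    def __init__(self, window=8, threshold=0.75):
        self.recent = deque(maxlen=window)   # sliding window of (offset, size)
        self.threshold = threshold
        self.policy = "no_prefetch"

    def observe(self, offset, size):
        """Record one request and re-evaluate the policy."""
        self.recent.append((offset, size))
        self.policy = self._choose()
        return self.policy

    def _sequential_fraction(self):
        reqs = list(self.recent)
        if len(reqs) < 2:
            return 0.0
        pairs = zip(reqs, reqs[1:])
        # A request is sequential if it starts where the previous one ended.
        hits = sum(1 for (off, size), (nxt, _) in pairs if nxt == off + size)
        return hits / (len(reqs) - 1)

    def _choose(self):
        if self._sequential_fraction() >= self.threshold:
            return "sequential_prefetch"
        return "no_prefetch"

ctl = AdaptivePrefetchController(window=4, threshold=0.75)
sequential = [ctl.observe(off, 100) for off in (0, 100, 200, 300)]
random_like = [ctl.observe(off, 100) for off in (9000, 50, 7000, 20)]
```

The controller switches to prefetching once the request stream looks sequential and backs off as soon as the window fills with scattered offsets, which is the kind of decision a user could also override interactively.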

There is a plethora of open problems related to direct manipulation and real-time control of computer systems and information servers. The bulk of these relate to developing effective display and interaction metaphors that represent abstract entities in intuitive ways and creating software for distributed adaptive control and information gathering.