Relative debugging is a technique that allows a user to compare data between two executing programs. It was devised to aid the testing and debugging of programs that are either modified in some way, or are ported to other computer platforms. It is effectively a hybrid test & debug methodology.
The challenge a relative debugger can assist with is this: what do you do when you move your application to another node of the Grid and it stops working? Subtle errors can be introduced through changes:
- By the programmer
- In the environment (e.g. DLL Hell)
The programmer must understand the application intimately to be able to locate the source of errors, and can spend much time tracing program state to find an error and to understand how code changes may have caused it. Relative debugging is about automating this process.
Whilst traditional debuggers force the programmer to understand the expected state and internal operation of a program, relative debugging makes it possible to trace errors by comparing the contents of data structures between programs at run time. In this way, the programmer is less concerned with the actual state of the program, and more concerned with finding when and where differences between the old and new codes occur. The methodology requires users to start by observing that two programs generate different results, and then iteratively moving back through the data flow of the codes to determine the point at which they start producing different answers.
We have developed a relative debugger called Guard. Guard supports both sequential and parallel programs on a range of platforms, and exists for a number of different development environments.
Traditional debuggers allow a user to control a program and examine its state at any point of the execution. The user sets breakpoints in the code, interactively examines program variables, and verifies that these variables have expected values. Erroneous values can be traced to erroneous code by using information about program data flow.
Relative debugging differs from traditional debugging in two important respects. First, program variables are compared not with user expectations, but with variables in another reference program that is known to be correct. Second, because the reference program is available to compute correct values, the comparison process can be automated.
Hence, the relative debugging process proceeds as follows. The user first formulates a set of assertions about key data structures in the reference and the development versions. These assertions specify locations at which data structures should be identical: violations of the assertions indicate errors. The relative debugger is then responsible for managing the execution of the two program versions, for validating the supplied assertions by comparing the data structures, and for reporting any differences. If differences are reported, the user proceeds to isolate erroneous code by repeatedly refining the assertions and reinvoking the relative debugger. Once the erroneous region is small enough, traditional debugging techniques can be used to correct the development version. Thus, the relative debugger provides a quick and effective way of locating problems in the development version.
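The assertion-driven process described above can be illustrated with a minimal Python sketch. It is not Guard's command syntax or implementation: the two "program versions" are hypothetical functions that return snapshots of their data at named locations, and the names `reference_version`, `development_version` and `violated_assertions` are invented for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for two program versions: each returns the values
# of key data structures captured at named locations.
Snapshot = Dict[str, List[float]]


def reference_version() -> Snapshot:
    data = [1.0, 2.0, 3.0, 4.0]
    scaled = [2.0 * x for x in data]
    return {"after_init": data, "after_scale": scaled, "after_sum": [sum(scaled)]}


def development_version() -> Snapshot:
    data = [1.0, 2.0, 3.0, 4.0]
    # Deliberate bug for illustration: the last element is never scaled.
    scaled = [2.0 * x for x in data[:-1]] + [data[-1]]
    return {"after_init": data, "after_scale": scaled, "after_sum": [sum(scaled)]}


def violated_assertions(ref: Snapshot, dev: Snapshot,
                        assertions: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the assertions whose two data structures are not identical."""
    return [(r, d) for r, d in assertions if ref[r] != dev[d]]


# Each assertion pairs a location in the reference version with a location
# in the development version where the data should be identical.
assertions = [("after_init", "after_init"),
              ("after_scale", "after_scale"),
              ("after_sum", "after_sum")]

violations = violated_assertions(reference_version(), development_version(), assertions)
# The earliest violated assertion bounds the region that needs closer inspection.
first_difference = violations[0] if violations else None
```

In this sketch the data agree after initialisation and first differ after the scaling step, so the user would refine the assertions inside that step before falling back to a traditional debugger.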
A relative debugger provides all the functionality of a traditional debugger, including commands for program control, state access and breakpoints. However, the heart of the relative debugger is a set of new commands, not available in conventional debuggers. These commands support the relative debugging methodology.
Because the reference and development versions of the program are executed concurrently, a relative debugger must be capable of handling two programs at the same time. It is useful for the relative debugger to support the debugging of programs written in different programming languages and executing on different computers in a heterogeneous network. This makes it possible to use the relative debugger when porting programs from one language or computer to another.
A relative debugger checks user-supplied assertions by comparing data structures in the reference and development versions. It performs necessary transformations of different internal data representations on different computers or in different languages. When performing comparisons, the debugger must take into account different data types, allowing for such issues as inexact equality in floating point numbers, and differences in dynamic pointer values. This aspect of the debugger will be illustrated in the next section. Violations of assertions are reported to the user. A number of approaches are possible for reporting differences in data structures, ranging from text to advanced data visualization techniques. If there are only a few differences, then the numeric values of differences are printed out. If differences are numerous, then visualization techniques are required to present them in a meaningful way.
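As a rough illustration of inexact equality, the following Python sketch compares nested data with a tolerance on floating point values. The function name `values_match` and the default tolerances are assumptions made for this example; the source does not specify how Guard defines or configures its tolerances.

```python
import math
from typing import Any


def _is_number(x: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def values_match(ref: Any, dev: Any,
                 rel_tol: float = 1e-6, abs_tol: float = 1e-12) -> bool:
    """Compare two values, allowing small floating point differences."""
    if _is_number(ref) and _is_number(dev):
        return math.isclose(ref, dev, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(ref, (list, tuple)) and isinstance(dev, (list, tuple)):
        # Containers match when they have the same shape and matching elements.
        return len(ref) == len(dev) and all(
            values_match(r, d, rel_tol, abs_tol) for r, d in zip(ref, dev))
    return ref == dev
```

With these assumed tolerances, `values_match(0.1 + 0.2, 0.3)` holds even though the two values differ in their last bits, while `values_match(1.0, 1.1)` does not.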
Guard supports three approaches to the reporting of differences: text, bitmaps, and advanced visualization techniques through external visualisation utilities. Text output is the simplest; the actual values and differences are printed on standard output.
The second approach is more suited to array comparisons, where text output may be excessive. In this case, only two values are printed: the maximum difference between corresponding array elements, and the total cumulative difference between all elements. Most of the information is reported in a rectangular bitmap displayed on the screen. In this bitmap, white pixels denote values that are the same, and black pixels denote values that are different. This simple array visualization is particularly useful for detecting erroneous loop bounds and addressing expressions, because these types of error tend to generate regular patterns on the display. An example of such a visualisation is shown in Figure 2. In this case, a number of columns of the two arrays differ, and this is clearly seen as a black stripe on the right hand side of the array visualisation. Arrays with more than two indices can be folded onto two dimensions using a number of standard techniques.
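The array summary described above can be sketched in a few lines of Python. This is an illustration only, not Guard's code: the function name `compare_arrays` is invented, and the bitmap is rendered as text, with `.` standing for a white pixel (same) and `#` for a black pixel (different).

```python
from typing import List, Tuple


def compare_arrays(ref: List[List[float]], dev: List[List[float]],
                   tol: float = 0.0) -> Tuple[float, float, List[str]]:
    """Return (maximum difference, total cumulative difference, text bitmap)."""
    max_diff = 0.0
    total_diff = 0.0
    bitmap = []
    for ref_row, dev_row in zip(ref, dev):
        row = []
        for r, d in zip(ref_row, dev_row):
            diff = abs(r - d)
            max_diff = max(max_diff, diff)
            total_diff += diff
            row.append("#" if diff > tol else ".")
        bitmap.append("".join(row))
    return max_diff, total_diff, bitmap


# A 3x4 reference array, and a development array whose last column was never
# filled in, as might happen with an erroneous loop bound.
ref = [[float(i * 4 + j) for j in range(4)] for i in range(3)]
dev = [row[:-1] + [0.0] for row in ref]

max_diff, total_diff, bitmap = compare_arrays(ref, dev)
print(max_diff, total_diff)   # 11.0 21.0
print("\n".join(bitmap))      # three rows of "...#"
```

The column of `#` characters on the right is the text analogue of the black stripe described for Figure 2.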
The most powerful technique supported by Guard involves the use of commercial data exploration and visualization software such as IBM's Data Explorer (DX). A complete set of differences can be saved to a file using a parameter on the assert command. Values from the file can then be displayed using DX. This use of advanced visualization techniques is well suited to the display of differences in arrays with more than two dimensions. Animations can be used to convey the development of differences as the two programs execute. Figures 3 and 4 show examples of the visualisation of errors using DX. In both of these cases an error surface is plotted which shows the regions of three dimensional space in which the error exceeds a threshold. For example, in Figure 3 the difference in the temperature variable is shown in red where it exceeds 0.1% relative error.
In Figure 4, the red region indicates where the error exceeds 10%, and the yellow region indicates where the error exceeds 5%. This image shows that the error only occurs in half of the space, which corresponds to a region which models the physics of pollution transport over land rather than water. The picture also shows that the error is transported vertically, implicating the physics code responsible for vertical advection. Displaying the error surface in this way provides a great deal of information about the potential source of the error, including its possible location. For example, the position of the error may exclude some routines of the code which do not manipulate that region of space.
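The thresholding behind such an error surface can be approximated as follows. This Python sketch only classifies grid cells by relative error; it does not reproduce DX or Guard, and the function name `error_regions` and the small test grid are made up for illustration. The 5% and 10% thresholds are taken from the Figure 4 description.

```python
from typing import Dict, List, Tuple

Grid = List[List[List[float]]]
Cell = Tuple[int, int, int]


def error_regions(ref: Grid, dev: Grid,
                  yellow: float = 0.05, red: float = 0.10) -> Dict[str, List[Cell]]:
    """Group grid cells by relative error: above `red`, or above `yellow` only."""
    regions: Dict[str, List[Cell]] = {"red": [], "yellow": []}
    for i, plane in enumerate(ref):
        for j, row in enumerate(plane):
            for k, r in enumerate(row):
                d = dev[i][j][k]
                if r == 0.0:
                    # Relative error is undefined for a zero reference value;
                    # treat any non-zero difference as exceeding every threshold.
                    rel = 0.0 if d == 0.0 else float("inf")
                else:
                    rel = abs(d - r) / abs(r)
                if rel > red:
                    regions["red"].append((i, j, k))
                elif rel > yellow:
                    regions["yellow"].append((i, j, k))
    return regions


# A 2x2x2 grid in which two cells of the development version have drifted.
ref = [[[100.0, 100.0], [100.0, 100.0]], [[100.0, 100.0], [100.0, 100.0]]]
dev = [[[112.0, 100.0], [107.0, 100.0]], [[100.0, 100.0], [100.0, 100.0]]]
regions = error_regions(ref, dev)
```

Knowing which cells fall into each region is what allows the position of the error to rule out routines that never touch that part of the space.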
More complex dynamic data structures can also be visualised. In Figure 5, we show a set of particles in a particle-in-cell code, together with the pointer links between the particles. In this visualisation, the black links are where the pointer values are the same in the two codes, whereas the coloured ones are where they differ.
It is also possible to create movies that show the dynamic behaviour of the differences between data structures as the programs execute. Figure 6 shows the same isosurface information that appears in Figure 3, but it can be seen to evolve as the programs execute. Seeing this "evolution" can be very helpful in isolating errors.
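Raw pointer values generally differ between two processes even when the linked structure is the same, so a comparison has to work on what each link refers to. The Python sketch below shows one possible approach, in which each link is translated into the position of its target within the particle list. This is an assumption made for illustration, not a description of how Guard compares pointers, and the names `Particle`, `link_targets` and `differing_links` are hypothetical.

```python
from typing import List, Optional


class Particle:
    def __init__(self, name: str) -> None:
        self.name = name
        self.next: Optional["Particle"] = None


def link_targets(particles: List[Particle]) -> List[Optional[int]]:
    """Translate each particle's link into the list position of its target."""
    position = {id(p): i for i, p in enumerate(particles)}
    return [position.get(id(p.next)) if p.next is not None else None
            for p in particles]


def differing_links(ref: List[Particle], dev: List[Particle]) -> List[int]:
    """Return the positions of particles whose links point to different targets."""
    pairs = zip(link_targets(ref), link_targets(dev))
    return [i for i, (r, d) in enumerate(pairs) if r != d]


# Reference version: a simple chain 0 -> 1 -> 2.
ref = [Particle("a"), Particle("b"), Particle("c")]
ref[0].next, ref[1].next = ref[1], ref[2]

# Development version: particle 0 wrongly skips ahead to particle 2.
dev = [Particle("a"), Particle("b"), Particle("c")]
dev[0].next, dev[1].next = dev[2], dev[2]

changed = differing_links(ref, dev)
```

Here only the link leaving the first particle would be drawn in colour; the others would remain black.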
|Classic Guard||Our earliest implementation of Relative Debugging is embodied in a version called "Classic Guard". This version has been implemented on most Unix machines, and is command line driven. It has a client-server architecture, and uses GDB as the portable debug server implementation.|
Classic Guard supports both sequential and parallel relative debugging, and has novel features for describing the data decomposition in parallel codes. It supports a range of conventional programming languages, like C, C++, Fortran, etc, and also a data parallel research language called .
|Eclipse Guard||Eclipse Guard is a version of Guard integrated into the IBM Eclipse platform. EclipseGuard leverages the flexibility and extensibility of Eclipse. It currently works with Java and C/C++; however, this will be extended as new language plug-ins are produced.|
|VSGuard||VSGuard is an implementation of relative debugging for the .NET® environment. VSGuard leverages the .NET Framework, and the Visual Studio Integration Project. It supports all .NET Framework languages, and also legacy Microsoft languages like Visual Basic 6.0® and Visual C++®. VSGuard is produced and marketed by|
|OneGuard||OneGuard is a version of Guard that is integrated into the Sun Microsystems Sun ONE Studio environment.|
|GuardLite||GuardLite is a command line parallel debugger, without relative debugging. It supports the High Performance Debugger Forum (HPDF) command syntax and works on a range of parallel machines.|
The table below summarizes which features have been implemented in the various IDEs.
|Feature||Visual Studio .Net||Eclipse||NetBeans|
|Improved User Interaction|
|Building assertions interactively|
|Integrated Data Visualization|
Because the plug-in architecture and platforms for IDEs vary so significantly, we have designed the Guard-IDE to be as generic as possible. Figure 1 shows the architecture and identifies two main parts:
- an IDE package which resides within the target IDE. This provides an implementation of the graphical user interface for managing assertions, and exposes important debugging interfaces in the IDE that are required by the Guard Interpreter;
- the Guard package which is external to the IDE. This provides the core relative debugging functionality (including a sophisticated data driven execution engine), as well as necessary infrastructure for communication between the IDE and Guard.
The Guard package is built from the original Classic Guard components. It encapsulates Guard's core relative debugging functionality within a module called the Guard Interpreter. In Classic Guard, debug functions (such as managing breakpoints, evaluating expressions, etc.) are provided by a separate client-server based debugger, built on top of GDB. In Guard-IDE, however, these debug interfaces are provided by the target IDE itself. This leverages the functions that are already available in the IDE, and allows us to support multiple languages in a way that is consistent with the basic IDE operation.
The Guard package and the specific IDE packages use different communication mechanisms depending on the available infrastructure. The Java-based IDEs, such as Eclipse, WebSphere and NetBeans, all use the Java Native Interface (JNI) as the communication infrastructure, whereas VS.Net uses COM under Windows.
Since the Guard Interpreter is written in C, we implemented an interface layer as a wrapper on top of it to interact with the IDE package. The communication mechanism (JNI or COM) is used to facilitate calls out of the Guard Interpreter to the IDE package, such as those required to place breakpoints and perform expression evaluation.
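The division of labour described above, in which the interpreter calls out to debug functions supplied by the IDE, can be pictured with the following Python sketch. Everything in it is hypothetical: the real Guard Interpreter is written in C and talks to the IDE through JNI or COM, and the names `DebugServices`, `FakeIDE`, `InterpreterSketch` and the location `solver.c:42` are invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class DebugServices(ABC):
    """Hypothetical interface for the debug functions supplied by the IDE."""

    @abstractmethod
    def set_breakpoint(self, program: str, location: str) -> None: ...

    @abstractmethod
    def evaluate(self, program: str, expression: str) -> object: ...


class FakeIDE(DebugServices):
    """Stand-in for an IDE package; returns canned values for expressions."""

    def __init__(self, values: Dict[Tuple[str, str], object]) -> None:
        self.values = values
        self.breakpoints: List[Tuple[str, str]] = []

    def set_breakpoint(self, program: str, location: str) -> None:
        self.breakpoints.append((program, location))

    def evaluate(self, program: str, expression: str) -> object:
        return self.values[(program, expression)]


class InterpreterSketch:
    """Stand-in for the interpreter: it only depends on DebugServices."""

    def __init__(self, services: DebugServices) -> None:
        self.services = services

    def check(self, location: str, expression: str) -> bool:
        # Ask the IDE to stop both programs at the location, then compare
        # the expression's value in each of them.
        for program in ("reference", "development"):
            self.services.set_breakpoint(program, location)
        ref_value = self.services.evaluate("reference", expression)
        dev_value = self.services.evaluate("development", expression)
        return ref_value == dev_value


ide = FakeIDE({("reference", "total"): 20.0, ("development", "total"): 16.0})
interpreter = InterpreterSketch(ide)
same = interpreter.check("solver.c:42", "total")
```

Because the interpreter sees only the interface, a different IDE package could be substituted without changing the comparison logic, which is the point of the generic design.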
The Eclipse IDE package is responsible for implementing Guard’s graphical user interface, and provides functions such as debugger control, setting breakpoints, receiving breakpoint events and expression evaluation. This package uses a model-view-controller (MVC) design pattern, and provides a generic interface that handles controlling and monitoring of relative debugging:
- A generic high-level relative debug model that represents the assertions and comparison results.
- Views that give the user a graphical visualization of the model's status and provide a user interface to control relative debugging by transferring commands.
- A controller that keeps the model updated by generating events based on the messages from the external Guard package.
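The three roles listed above can be sketched as follows. This is a simplified Python illustration, not the Eclipse package itself, which is built on Eclipse's own APIs; the class names and the message format (a dictionary with `assertion` and `passed` keys) are assumptions.

```python
from typing import Callable, Dict, List


class RelativeDebugModel:
    """Model: holds assertions and their latest comparison results."""

    def __init__(self) -> None:
        self.results: Dict[str, bool] = {}
        self._listeners: List[Callable[["RelativeDebugModel"], None]] = []

    def add_listener(self, listener: Callable[["RelativeDebugModel"], None]) -> None:
        self._listeners.append(listener)

    def set_result(self, assertion: str, passed: bool) -> None:
        self.results[assertion] = passed
        for listener in self._listeners:
            listener(self)


class TextView:
    """View: renders the model's status (as text lines in this sketch)."""

    def __init__(self, model: RelativeDebugModel) -> None:
        self.lines: List[str] = []
        model.add_listener(self.refresh)

    def refresh(self, model: RelativeDebugModel) -> None:
        self.lines = [f"{name}: {'same' if passed else 'DIFFERENT'}"
                      for name, passed in model.results.items()]


class Controller:
    """Controller: turns messages from the external package into model updates."""

    def __init__(self, model: RelativeDebugModel) -> None:
        self.model = model

    def on_message(self, message: Dict[str, object]) -> None:
        self.model.set_result(str(message["assertion"]), bool(message["passed"]))


model = RelativeDebugModel()
view = TextView(model)
controller = Controller(model)
controller.on_message({"assertion": "grid@step1", "passed": True})
controller.on_message({"assertion": "grid@step2", "passed": False})
```

After the two messages, the view holds one line per assertion and marks the second as different, without the controller knowing anything about how results are displayed.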
EclipseGuard is an implementation of relative debugging for the Eclipse environment. Using this approach, you can test whether a new project performs the same tasks as a previous version, and if it doesn't, you can debug it using EclipseGuard's powerful assertion mechanism. EclipseGuard builds on the already powerful techniques used in other test tools, but allows you to trace errors down to an individual source line.
How it works
EclipseGuard makes significant use of Eclipse's plug-in and extensibility concept. At its core, Eclipse has a modular Java runtime called Equinox. Equinox is an implementation of the OSGi R4 core framework specification, which provides a set of bundles and services required to support running OSGi-based systems. The platform is the middle tier of the architecture, and consists of a set of components that provide core services and frameworks to higher tiers. The top tier of the architecture incorporates plug-in features that provide the functionality most visible to the users. Importantly, Eclipse plug-ins allow other plug-ins to extend or customize portions of their functionality by declaring extension points and extensions. For example, development tools for many programming languages are already available in Eclipse, such as JDT for Java and CDT for C/C++, and these provide language-specific editors, views and debugger tools, to name a few. Each of the language plug-ins can be extended by Guard, which allows us to perform relative debugging across multiple languages and development environments.
VSGuard is an implementation of relative debugging for Microsoft's Visual Studio® environment. VSGuard builds on the rich environment provided by Visual Studio, and adds relative debugging functionality seamlessly. Thus, users already familiar with Visual Studio only need to learn a few new commands and concepts to make full use of VSGuard.
Visual Studio allows users to manage two different projects concurrently by building them into a Solution. If you are porting from Visual Basic 6.0 to Visual Basic .NET, VSGuard provides a Wizard that makes it very easy and quick to build two different projects into one Solution. From this point, VSGuard allows you to build assertions between data structures in the two projects. It will run them concurrently and compare data automatically, reporting differences as they occur.
Using this approach, you can test whether a new project performs the same tasks as a previous version, and if it doesn't, you can debug it using VSGuard's powerful assertion mechanism. VSGuard builds on the already powerful techniques used in other test tools, but allows you to trace errors down to an individual source line. What's more, you don't need to capture large trace files, saving both space and time.
How it works
VSGuard is integrated into the architecture of Microsoft's Visual Studio® .NET. Visual Studio .NET is built around a core shell, with functionality being provided by commands that are implemented by a set of packages. These packages are conventional COM objects that are activated as a result of user interaction (such as menu selection) within Visual Studio .NET, and also when various asynchronous events occur. This component architecture makes it possible to integrate new functionality into the environment by loading additional packages.
|Dinh, M., Abramson, D., Chao, J., Kurniawan, D., Gontarek, D., Moench, B. and DeRose, L. "Debugging Scientific Applications With Statistical Assertions", in International Conference on Computational Science (ICCS), Omaha, Nebraska, USA.||Abstract|
|Chao, J, Abramson, D., Dinh, M., Kurniawan, D., Gontarek, A, Moench, B. and DeRose, L., “A Scalable Parallel Debugging Library with Pluggable Communication Protocols”, CCGrid 2012, The 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 13-16, 2012, Ottawa, Canada.||Abstract|
|Dinh, M., Abramson, D., Chao, J., Kurniawan, D., Gontarek, A., Moench, B. and DeRose, L. "Scalable parallel debugging with statistical assertions", in 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP) - Poster, New Orleans, LA, USA, pp. 311-312.||Abstract|
|Dinh, M., Abramson, D., Kurniawan, D., Moench, B. and DeRose, L., “Assertion based parallel debugging”, CCGrid, 2011, Newport Beach, CA, 24-26th May, 2011.||Abstract||ccgrid2011_final.pdf|
|Abramson, D., Dinh, M.N., Kurniawan, D., Moench, B. and DeRose, L. “Data Centric Highly Parallel Debugging”, 2010 International Symposium on High Performance Distributed Computing (HPDC 2010), Chicago, USA, June 2010||Abstract||HPDC2010.pdf|
|Abramson, D., Finkel, R., Kurniawan, D., Kowalenko, V. and Watson, G. “Parallel Relative Debugging with Dynamic Data Structures”, 16th International Conference on Parallel and Distributed Computing Systems, pp 22 – 29, August 13 - 15, 2003 Reno, Nevada, USA||Abstract||GuardPDCS03.pdf|
|Searle, A, Gough, J. K. and Abramson, D. A. “DUCT: An Interactive Define-Use Chain Navigation Tool for Relative Debugging”, AADebug’03. Ghent, Belgium, September 2003||Abstract||AADebug03.pdf|
|Abramson, D., Watson, G. and Dung, L. “Guard: A Tool for Migrating Scientific Applications to the .NET Framework”, 2002 International Conference on Computational Science (ICCS 2002), Amsterdam, The Netherlands, April 21st 2002, pp 834 - 843||Abstract||ICCS2002.pdf|
|Watson, G., Abramson, D. “Parallel Relative Debugging for Distributed Memory Applications: A Case Study”, International Conference on Parallel and Distributed Processing Techniques and Applications June 25-28, 2001, Las Vegas, Nevada, USA||Abstract||pdpta2001.pdf|
|Abramson, D.A., Sosic, R. and Watson, G. "Implementation Techniques for a Parallel Relative Debugger ", International Conference on Parallel Architectures and Compilation Techniques - PACT '96, October 20-23, 1996, Boston, Massachusetts, USA||Abstract||pact.pdf|
|Abramson D., Foster, I., Michalakes, J. and Sosic R., "Relative Debugging and its Application to the Development of Large Numerical Models", Proceedings of IEEE Supercomputing 1995, San Diego, December 95. Selected as best paper||Abstract||sc95.pdf|
|Abramson, D.A. and Sosic, R. "Relative Debugging using Multiple Program Versions", Key Note Address, 8th Int. Symp. on Languages for Intensional Programming , Sydney, 3-5th May, 1995. In Intensional Programming I, World Scientific, ISBN 981 - 02 - 2400 - 1.||Abstract||islip.pdf|
|Abramson, D.A. and Sosic, R. "A Debugging Tool for Software Evolution", CASE-95, 7th International Workshop on Computer-Aided Software Engineering, Toronto, Ontario, Canada, July 1995, pp 206 - 214. Also appeared in proceedings of 2nd Working Conference on Reverse Engineering, Toronto, Ontario, Canada, July 1995||Abstract||case95.pdf|
|Abramson, D.A., Chu, C., Kurniawan, D., Searle, A., 2009, Relative Debugging in an integrated development environment, in Software - Practice and Experience, 2009, pp 1157-1183; John Wiley & Sons Ltd.||Abstract|
|Abramson, D and Watson, G. “Debugging Scientific Applications in the .NET Framework”, Future Generation Computer Systems, Vol. 19 issue 5, June 2003., pp 665 - 678||Abstract||VSGuard1.pdf|
|Watson, G., Abramson, D., 2000, Relative Debugging for Data-Parallel Programs: A ZPL Case Study, IEEE Concurrency, vol 8 issue 4, IEEE Computer Society, New York NY USA, pp. 42-52||Abstract||reldebug_zpl.pdf|
|Sosic, R., Abramson, D. A., 1997:Guard: a relative debugger, Software - Practice & Experience, vol 27, John Wiley & Sons, Ltd, California USA, pp. 185-206.||Abstract||guard.pdf|
|Abramson D., Foster, I., Michalakes, J. and Sosic R., "Relative Debugging: A new paradigm for debugging scientific applications", the Communications of the Association for Computing Machinery (CACM), Vol 39, No 11, pp 67 - 77, Nov 1996||Abstract||cacm1.pdf|
|Abramson, D.A. and Sosic, R. "A Debugging and Testing Tool for Supporting Software Evolution", Journal of Automated Software Engineering, 3 (1996), pp 369 - 390||Abstract||jase.pdf|
|Abramson, D and Watson, G. “Relative Debugging for Parallel Systems”, PCW '97, September 25 and 26, 1997, Australian National University, Canberra, pp P1-C-1 – P1-C-8.||Abstract||guardpcw.pdf|