SLO - Suggestions for Locality Optimizations


The SLO tool analyzes the causes of poor temporal data locality, and suggests program refactorings that are needed to increase locality. As a result, the number of data cache misses may be reduced, and execution speed may be enhanced.

The following PowerPoint presentation (presented at the HPCC2006 conference) gives a short introduction to the underlying ideas and the operation of the SLO tool.

The SLO tool is discussed in the following papers:

Documentation of both exploring locality optimizations using SLO and how to instrument programs using GCC-SLO is available as HTML and PDF.

Principles of Analysis performed by SLO

Effectively using SLO to optimize a program's locality requires understanding the program and locality model used by SLO to analyze programs. These are explained here.
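The example file names and the results table on this page refer to reuse distance histograms, so it may help to see what such a histogram measures. The sketch below is not SLO's implementation. It assumes the usual definition of reuse distance (the number of distinct addresses accessed between two consecutive accesses to the same address) and uses hypothetical helper names (`reuse_distances`, `histogram`) and a toy address trace. It compares two loops that each sweep an array with a fused loop that reuses each element immediately.

```python
from collections import Counter


def reuse_distances(trace):
    """For each access, count the distinct addresses touched since the
    previous access to the same address (None for a first access)."""
    last_seen = {}  # address -> index of its most recent access
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            # Distinct addresses strictly between the two accesses.
            between = set(trace[last_seen[addr] + 1:i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_seen[addr] = i
    return distances


def histogram(distances):
    """Count how often each reuse distance occurs (first accesses skipped)."""
    return Counter(d for d in distances if d is not None)


N = 4  # toy array size, chosen only for illustration

# Two separate loops over array a: a[i] is reused only after the
# rest of the array has been touched in between.
separate = [("a", i) for i in range(N)] + [("a", i) for i in range(N)]

# One fused loop: each a[i] is used twice in a row.
fused = [("a", i) for i in range(N) for _ in range(2)]

print("separate loops:", dict(histogram(reuse_distances(separate))))
print("fused loop:    ", dict(histogram(reuse_distances(fused))))
```

In the toy trace, fusing the loops moves every reuse from distance N - 1 down to distance 0. Shortening long reuse distances in this way is the kind of effect a locality refactoring aims for, because reuses at short distances are more likely to hit in the cache.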

Want to give it a try?

The SLO tool consists of two parts:
  1. GCC-SLO: an extended version of the GCC compiler that can instrument programs to perform the analysis described above.
  2. SLO: a Java-based visualizer of the data produced by GCC-SLO.
For both GCC-SLO and SLO, the most recent version can be downloaded by following the download link from http://sourceforge.net/projects/slo.

You can download slo-1.1.jar, and run SLO from the command line using a command like java -jar slo-1.1.jar example1.slo.zip.

Example input files

The table below provides some examples of .slo.zip files produced by GCC-SLO. The left column lists the .slo.zip files that serve as input to the SLO tool. The middle column gives a short description of the program that was instrumented. The right column links to an HTML page automatically produced by SLO, so that you can browse the results of the locality analysis without installing SLO. (For full functionality in these pages, enable JavaScript in your browser.)

SLO input file | Description | HTML-output
example1.slo.zip | The example used in the explanation of the principles behind SLO. | No HTML output available right now.
applu_orig_ref1_reuse_distance.slo.zip | 173.applu from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO
vpr_orig_ref2_reuse_distance.slo.zip | 175.vpr from the SPEC2000 benchmark suite, run with the second reference input. | HTML-output generated by SLO
galgel_orig_ref1_reuse_distance.slo.zip | 178.galgel from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO
art_orig_ref2_reuse_distance.slo.zip | 179.art from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO
optimized_histograms/equake_orig_ref1_reuse_distance.slo.zip | 183.equake from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO

GCC-SLO

The GCC compiler has been adapted to instrument programs when given the command line option -fslo-instrument. To build your own GCC compiler that can instrument programs, download gcc-slo-1.1.0-4.1.0.tar.gz, extract it, and execute the script build-gcc-slo.sh in the directory gcc-slo-1.1.0-4.1.0. This will build the gcc-slo compiler and install it in $HOME/gcc-slo.

Instrumenting programs with GCC-SLO

See documentation, available as HTML and PDF.

Other information about the way GCC-SLO instruments.

Documentation

The documentation is under construction. You may find the preliminary documentation already useful. It is available as HTML and PDF.

Download

Both SLO (the interactive visualizer) and GCC-SLO (the instrumenting compiler) can be downloaded from the file releases area of the SLO SourceForge site.

Results after using SLO to refactor selected SPEC2000 programs

We have tested the usefulness of SLO by using it to analyze a few SPEC2000 programs. The source code of both the original and refactored programs can be downloaded here. The table below shows, for five programs, the speedup obtained by the refactoring on five different processors.
Program | Speedup on Pentium4 2.66 GHz | Speedup on Itanium1 733 MHz | Speedup on Alpha EV67 667 MHz | Speedup on PA-RISC 8500 400 MHz | Speedup on UltraSPARC IV 1.05 GHz
173.applu | 1.63 | 2.46 | 1.69 | 1.17 | 2.71
175.vpr | 1.51 | 1.40 | 1.41 | 1.17 | 1.09
178.galgel | 2.14 | 2.63 | 2.48 | 1.23 | 1.46
179.art | 4.11 | 1.54 | 1.16 | 2.30 | 1.89
183.equake | 1.10 | 2.93 | 3.09 | 1.54 | 1.57
(Columns "Original reuse distance histogram" and "Reuse distance histogram after refactoring": images, not reproduced here.)

Links


Comments can be sent to Kristof Beyls, e-mail: Kristof.Beyls at www.elis.ugent.be
