The SLO tool analyzes the causes of poor temporal data locality, and suggests program refactorings that are needed to increase locality. As a result, the number of data cache misses may be reduced, and execution speed may be enhanced.
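Temporal locality is commonly quantified by reuse distance, the metric named in the example files and histograms further down this page. The sketch below is only an illustration of that metric, not SLO's implementation: it assumes the usual definition (the number of distinct addresses touched between two consecutive accesses to the same address), and the function names and toy trace are hypothetical.

```python
from collections import Counter

def reuse_distances(trace):
    """Return one reuse distance per access; None marks a first (cold) access.

    Assumed definition: number of distinct addresses accessed strictly
    between two consecutive accesses to the same address.
    """
    last_seen = {}   # address -> index of its previous access
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            between = set(trace[last_seen[addr] + 1:i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_seen[addr] = i
    return distances

def histogram(distances):
    """Count how often each reuse distance occurs, ignoring cold accesses."""
    return Counter(d for d in distances if d is not None)

# Hypothetical toy trace of symbolic addresses.
trace = ["a", "b", "c", "a", "b", "b"]
print(reuse_distances(trace))
print(sorted(histogram(reuse_distances(trace)).items()))
```

Under this definition, an access with a short reuse distance is likely to find its data still in the cache, while a long reuse distance makes a cache miss more likely.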
The following PowerPoint presentation (presented at the HPCC2006 conference) gives a short introduction to the underlying ideas and the operation of the SLO tool.
The SLO tool is discussed in the following papers:
Documentation on both exploring locality optimizations with SLO and instrumenting programs with GCC-SLO is available as HTML and PDF.
You can download slo-1.1.jar and run SLO from the command line with a command such as `java -jar slo-1.1.jar example1.slo.zip`.
The table below provides some examples of .slo.zip files produced by GCC-SLO. The left column lists the .slo.zip files that are input to the SLO tool. The middle column gives a short description of the program that was instrumented. The right column links to an HTML page automatically produced by SLO, so that you can browse the results of the locality analysis without installing SLO. (For full functionality in these pages, enable JavaScript in your browser.)
SLO input file | Description | HTML-output |
---|---|---|
example1.slo.zip | The example used in the explanation of the principles behind SLO. | No HTML output available right now. |
applu_orig_ref1_reuse_distance.slo.zip | 173.applu from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO |
vpr_orig_ref2_reuse_distance.slo.zip | 175.vpr from the SPEC2000 benchmark suite, run with the second reference input. | HTML-output generated by SLO |
galgel_orig_ref1_reuse_distance.slo.zip | 178.galgel from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO |
art_orig_ref2_reuse_distance.slo.zip | 179.art from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO |
optimized_histograms/equake_orig_ref1_reuse_distance.slo.zip | 183.equake from the SPEC2000 benchmark suite, run with the reference input. | HTML-output generated by SLO |
The GCC compiler has been adapted to instrument programs when the command-line option `-fslo-instrument` is given. To build your own GCC compiler that can instrument programs, download gcc-slo-1.1.0-4.1.0.tar.gz, extract it, and execute the script build-gcc-slo.sh in the directory gcc-slo-1.1.0-4.1.0. This builds the gcc-slo compiler and installs it in $HOME/gcc-slo.
See documentation, available as HTML and PDF.
Program | Original reuse distance histogram | Reuse distance histogram after refactoring | Speedup on Pentium4 2.66 GHz | Speedup on Itanium1 733 MHz | Speedup on Alpha EV67 667 MHz | Speedup on PA-RISC 8500 400 MHz | Speedup on UltraSPARC IV 1.05 GHz |
---|---|---|---|---|---|---|---|
173.applu | | | 1.63 | 2.46 | 1.69 | 1.17 | 2.71 |
175.vpr | | | 1.51 | 1.40 | 1.41 | 1.17 | 1.09 |
178.galgel | | | 2.14 | 2.63 | 2.48 | 1.23 | 1.46 |
179.art | | | 4.11 | 1.54 | 1.16 | 2.30 | 1.89 |
183.equake | | | 1.10 | 2.93 | 3.09 | 1.54 | 1.57 |
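
To give a feel for how a refactoring can shift a reuse distance histogram toward shorter distances, the sketch below compares the address trace of two separate passes over an array with the trace of a single fused pass. Loop fusion is chosen here purely for illustration; it is not claimed to be one of the refactorings applied to the benchmarks in the table, and the helper names (`trace_separate`, `trace_fused`, `max_reuse`) are hypothetical.

```python
def reuse_distances(trace):
    """Reuse distance per access; None marks a first (cold) access.

    Assumed definition: number of distinct addresses accessed strictly
    between two consecutive accesses to the same address.
    """
    last_seen = {}
    out = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            out.append(len(set(trace[last_seen[addr] + 1:i])))
        else:
            out.append(None)
        last_seen[addr] = i
    return out

def trace_separate(n):
    """Addresses touched by two separate loops that each read a[0..n-1]."""
    first_pass = [("a", i) for i in range(n)]
    second_pass = [("a", i) for i in range(n)]
    return first_pass + second_pass

def trace_fused(n):
    """Addresses touched when both reads of a[i] happen in one fused loop."""
    trace = []
    for i in range(n):
        trace.append(("a", i))  # read by the first computation
        trace.append(("a", i))  # read again immediately by the second
    return trace

def max_reuse(trace):
    """Largest reuse distance in the trace, ignoring cold accesses."""
    return max(d for d in reuse_distances(trace) if d is not None)

n = 8  # hypothetical array length
print(max_reuse(trace_separate(n)))  # n - 1: the whole array lies between reuses
print(max_reuse(trace_fused(n)))     # 0: each element is reused immediately
```

With separate passes every reuse has distance n - 1, so for arrays larger than the cache each reuse is likely to miss; after fusion every reuse has distance 0 regardless of n.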
Comments can be sent to Kristof Beyls, e-mail: Kristof.Beyls at www.elis.ugent.be