Intel Trace Analyzer and Collector (ITAC)
Intel Trace Analyzer and Collector (ITAC) is a MPI profiling and tracing tool that can be used to understand and visualize the behavior of a MPI code and to identify hotspots and reasons for poor parallel scaling and MPI performance. The tool is available only on Puhti and at the moment it supports only the applications compiled with the Intel MPI library.
Usage is possible for both academic and commercial purposes.
For simple MPI tracing, no recompilation is needed, and it is enough to add following settings into a normal batch job script:
module load intel-itac export LD_PRELOAD=libVT.so srun myprog
libVTwith the particular component. More details about different components can be found in the Intel documentation.
Trace Collector allows also tracing of user defined events, however, this requires always recompilation of the application. As tracing can generate very large files even for relatively small applications, it is often useful to filter the collected data.
The collected data are saved in a series of
<executable>.stf files in the running directory.
- In Fortran programs MPI tracing works only with
mpimodule (i.e. not with
- Collector exits with error "Failed writing buffer to flush file "/tmp/xxx.dat": No space left on device".
- As the
/tmp/in compute nodes is small, temporary files might need to be stored in the running directory by setting
Analyzing the traces
In order to improve the performance of the graphical user interface,
we recommend to use the Puhti web interface remote desktop when carrying out the analysis.
The analyzer is started in the host terminal with the command (note that the
intel-itac module needs to be loaded):
The Trace Analyzer can show the timeline of each process and map each MPI call between the tasks. For each performance issue the following information is provided: description, affected processes, and source locations.
Intel Trace Analyzer can be used also for investigating OTF2 formatted traces produced by other performance tools, such as ScoreP/Scalasca. This can be achieved by starting the analyzer
For more details please check the Intel documentation