ncu: GPU CUDA Kernel Profiler
Available versions: Puhti: 2020.2.0.18, Mahti: 2020.3.1.0
NVIDIA Nsight Compute is a CUDA kernel profiler that provides detailed performance data and offers guidance for optimizing CUDA kernels. Its command-line interface, ncu, is a low-level kernel profiling and debugging tool that collects and displays profiling data from the command line. It can collect a timeline of CUDA-related activities on both CPU and GPU, including kernel execution, memory transfers, memory sets, and CUDA API calls, as well as events or metrics for CUDA kernels. Profiling results are displayed in the console after the data has been collected, and may also be saved for later viewing with the ncu-ui tool.
To use ncu, first load a CUDA environment. Start by loading the appropriate gcc module for the system in use:
module load gcc/9.1.0
module load gcc/10.3.0
module load cuda
module load nsight-compute
An example of usage of ncu:
ncu --set full -o myreport ./a.out
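Collecting the full set for every kernel launch can take a long time in larger applications. The following is a minimal sketch of limiting the profiling to a single launch of one kernel. The kernel name mykernel and the helper function profile_kernel are hypothetical and only serve as illustration. Short options are used here; check ncu --help on the installed version for the corresponding long forms.

```shell
# Hypothetical names: mykernel (kernel to profile) and myreport (output file).
# Usage: profile_kernel KERNEL REPORT APPLICATION [ARGS...]
profile_kernel() {
    kernel=$1
    report=$2
    shift 2
    # -k selects kernels by name, -c 1 stops after one profiled launch,
    # -f overwrites an existing report file of the same name.
    ncu --set full -k "$kernel" -c 1 -f -o "$report" "$@"
}

# Run only where ncu and the example binary are actually available.
if command -v ncu >/dev/null 2>&1 && [ -x ./a.out ]; then
    profile_kernel mykernel myreport ./a.out
else
    echo "ncu or ./a.out not found; load the modules and build the application first"
fi
```

Restricting the launch count keeps both the profiling overhead and the report size small, at the cost of seeing only one invocation of the kernel.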
The resulting report can be viewed with ncu-ui on the CSC supercomputers or on the user's local machine. The performance of the program can be compared to the theoretical peak (speed-of-light) performance or to a custom baseline (for example, a previous release).
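A saved report can also be printed in the console instead of being opened in the graphical viewer. The sketch below assumes that the report was written as myreport.ncu-rep; the file extension may differ between Nsight Compute versions, so check the file that was actually created. The helper function show_report is hypothetical.

```shell
# Assumption: the earlier profiling run produced myreport.ncu-rep.
# Usage: show_report REPORT_FILE
show_report() {
    # --import reads an existing report instead of profiling an application;
    # --page details prints the per-section results.
    ncu --import "$1" --page details
}

# Run only where ncu and the report file are actually available.
if command -v ncu >/dev/null 2>&1 && [ -f myreport.ncu-rep ]; then
    show_report myreport.ncu-rep
else
    echo "ncu or myreport.ncu-rep not found"
fi
```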
ncu supports many useful run options and is fully customizable. Use the command-line argument --query-metrics to list the metrics that are available on the current platform. For more details, please check the NVIDIA documentation.
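As a rough sketch of how these options fit together, the commands below list the predefined section sets and the available metrics, and then collect a small, explicitly chosen group of metrics. The metric names are an illustrative selection and should be confirmed against the output of --query-metrics on the target GPU; the helper function collect_metrics is hypothetical. Querying metrics may require running on a node that has a GPU.

```shell
# Illustrative selection; confirm the names with --query-metrics on the target GPU.
METRICS="gpu__time_duration.sum,sm__throughput.avg.pct_of_peak_sustained_elapsed"

# Usage: collect_metrics APPLICATION [ARGS...]
collect_metrics() {
    # --metrics takes a comma-separated list of metric names.
    ncu --metrics "$METRICS" "$@"
}

if command -v ncu >/dev/null 2>&1; then
    ncu --list-sets        # predefined section sets, such as "full"
    ncu --query-metrics    # metrics available on the current platform
    if [ -x ./a.out ]; then
        collect_metrics ./a.out
    fi
else
    echo "ncu not found; load the modules first"
fi
```

Collecting only a few named metrics is usually much faster than collecting a whole set, which makes it convenient for repeated measurements while tuning a kernel.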
Last edited Fri Aug 13 2021