Large overhead and spin time reported in MKL functions

August 7, 2013, 1:24 pm

Hello,

I ran the VTune Amplifier concurrency analysis on a dgemm example code (link here) and was surprised to see overhead and spin time cover almost 100% of the CPU usage bar (reported here). I also profiled the sparse matrix-vector multiplication kernel mkl_dcsrsymv with the same analysis and got a similar result. Since the examples mentioned here achieve very high performance, the large reported overhead seems inconsistent with that. I first asked for an explanation in the VTune Amplifier forum (here) and was advised to ask in this forum instead.

Do you have any explanation for the large overhead and spin time?

Cheers,

Note: I am using VTune Amplifier update 11 and Intel Composer XE 2013.