I have written a multithreaded program using pthreads. Each thread makes its own, independent call to dss_solve_real. I link against the following libraries to make sure that MKL runs in sequential mode:
$(MKLROOT)/lib/intel64/libmkl_intel_ilp64.a $(MKLROOT)/lib/intel64/libmkl_sequential.a $(MKLROOT)/lib/intel64/libmkl_core.a -lm -lpthread
Also, I have disabled KMP_AFFINITY using:
env KMP_AFFINITY=disabled
I also set the number of MKL threads explicitly in the code:
mkl_set_num_threads(1);
I use the following code to set the affinity of each thread. This piece of code is executed at the beginning of each thread's function:
pthread_t curThread = pthread_self();
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(threadCPUNum[threadData->numOfCurThread], &cpuset);
sched_setaffinity(curThread, sizeof(cpuset), &cpuset);
In this code, threadCPUNum[threadData->numOfCurThread] is the number of the CPU to which the current thread should be bound.
In order to make sure that MKL respects my CPU affinity settings, I initially bind all the threads to CPU0 by setting all elements of threadCPUNum array to zero. However, monitoring CPU utilization reveals that MKL does not pay attention to sched_setaffinity and uses different processors.
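For reference, the binding can also be checked from inside the thread instead of relying on an external CPU monitor. The following is only a minimal sketch, assuming Linux with glibc; the helper names (first_allowed_cpu, pin_self_to_cpu, self_is_pinned_to) are hypothetical and not part of my program or of MKL. Note that it passes 0 as the first argument of sched_setaffinity, which the Linux man page defines as "the calling thread", whereas my snippet above passes a pthread_t where that function expects a pid_t.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: lowest-numbered CPU the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
static int first_allowed_cpu(void)
{
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    if (sched_getaffinity(0, sizeof(cpuset), &cpuset) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &cpuset))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to one CPU.
 * Returns 0 on success, -1 on failure (errno is set by the kernel). */
static int pin_self_to_cpu(int cpu)
{
    cpu_set_t cpuset;
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&cpuset);
    CPU_SET(cpu, &cpuset);
    /* pid 0 means "the calling thread" for sched_setaffinity. */
    return sched_setaffinity(0, sizeof(cpuset), &cpuset);
}

/* Hypothetical helper: 1 if the calling thread's mask contains exactly
 * `cpu`, 0 if it contains anything else, -1 if the mask cannot be read. */
static int self_is_pinned_to(int cpu)
{
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    if (sched_getaffinity(0, sizeof(cpuset), &cpuset) != 0)
        return -1;
    return CPU_COUNT(&cpuset) == 1 && CPU_ISSET(cpu, &cpuset);
}
```

Calling pin_self_to_cpu at the start of the thread function and self_is_pinned_to right before dss_solve_real would show whether the mask was actually applied. It would also be worth checking the return value of the original sched_setaffinity call, since a failed call leaves the affinity unchanged.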
I would like to know what I am missing here and how I can force the MKL function (dss_solve_real) to run on a specific CPU.
Thanks in advance for your help.