Channel: Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

Problem with MKL ScaLAPACK PDGETRF


Hi all,

I am trying to use the MKL PBLAS/ScaLAPACK routine pdgetrf to compute an LU decomposition. I wrote a simple Fortran test program, and it works well with 2*2 processes on the cluster. However, when I use more processes, e.g. 'mpiexec -n 16', the program hangs.

One possible reason is that BLAS spawns too many threads, which leads to a performance disaster (for reference: https://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=6&t=3371 ). So I tried exporting OMP_NUM_THREADS=1 or MKL_NUM_THREADS=1, and submitting the job with different combinations of pbs -l select=:ncpu:mpiprocs:. None of these solved the problem.
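
For reference, the threading setup described above amounts to roughly the following sketch. The executable name ./lu_test is a placeholder (an assumption, not the real program name), and the PBS resource line is left out because its values varied between attempts.

```shell
#!/bin/sh
# Restrict threading so that each MPI rank runs single-threaded BLAS.
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1

# Hypothetical executable name; replace with the actual test program.
EXE=./lu_test
NPROCS=16   # intended as a 4*4 process grid

LAUNCH="mpiexec -n $NPROCS $EXE"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS MKL_NUM_THREADS=$MKL_NUM_THREADS"
echo "launch: $LAUNCH"

# Launch only if both mpiexec and the executable are actually available.
if command -v mpiexec >/dev/null 2>&1 && [ -x "$EXE" ]; then
    $LAUNCH
fi
```

Exporting the variables before the mpiexec call is meant to ensure that the launched ranks inherit them; whether they propagate to remote nodes depends on the MPI launcher and the batch system.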

I have no idea why it works with 2*2 processes but fails with 4*4 or more. I hope someone here can help; any suggestions would be greatly appreciated.

Cluster compiler info:

Intel® Fortran Composer 13.0.1 and MPICH 3.0. 

Sieg

