
Direct Sparse Solver for Clusters poor scaling


Hi,

We are currently developing a distributed version of our C++ finite element program. We planned to use the Intel Direct Sparse Solver for Clusters, but it seems we cannot reach good scalability with our settings. The matrix is assumed non-symmetric and is built in the distributed CSR (DCSR) format.
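Each MPI process owns a contiguous block of rows of the global matrix; a rough sketch of how the local row range could be computed is shown below (function name and integer types are illustrative, not our actual code):

#include <mpi.h>
#include <algorithm>

// Contiguous block-row partition: process 'rank' owns rows [first_row, last_row].
// These bounds are later passed to the solver through iparm[40] and iparm[41].
void local_row_range(long long n_global, long long& first_row, long long& last_row)
{
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    long long base = n_global / size;   // rows every process gets at minimum
    long long rem  = n_global % size;   // first 'rem' processes get one extra row

    first_row = rank * base + std::min<long long>(rank, rem);
    last_row  = first_row + base + (rank < rem ? 1 : 0) - 1;
}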

The test case is a simple thermal diffusion problem on a square grid. Different problem sizes, ranging from 1M to 25M DOF, have been tested with many combinations of MPI processes and OpenMP threads (usually one MPI process per node or per socket). The memory allocated in the factorization phase scales down, but we observed only a small speed-up in running time.

More precisely, we observed the following behavior:

- Symbolic factorization benefits from more MPI processes but is not affected by the number of OpenMP threads.

- Numerical factorization scales with the number of OpenMP threads and sometimes with the number of MPI processes.

- Most of the time, the results show no significant gain in the solve phase from either form of parallelism.

I must be doing something wrong, but I can't seem to find the source of the problem.

Thanks a lot for any advice.


The following iparm settings are used (zero-based indexing):

iparm[0] = 1;
iparm[1] = 10;
iparm[7] = 2;
iparm[9] = 13;
iparm[10] = 1;
iparm[12] = 1;
iparm[34] = 1;
iparm[39] = 2;
iparm[40], iparm[41] = first and last row of the local matrix
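For reference, here is a simplified sketch of how these settings are wired into the cluster_sparse_solver calls for a real non-symmetric matrix in distributed format. The array names, the helper signature and the per-parameter comments are illustrative (based on our reading of the documentation), not our actual code:

#include <mpi.h>
#include <mkl_cluster_sparse_solver.h>
#include <vector>

// Solve A x = b where each MPI process holds the rows [first_row, last_row]
// of A (zero-based CSR arrays ia/ja/a), plus the matching parts of b and x.
void solve_distributed(MKL_INT n_global, MKL_INT first_row, MKL_INT last_row,
                       const std::vector<MKL_INT>& ia, const std::vector<MKL_INT>& ja,
                       const std::vector<double>& a,
                       std::vector<double>& b, std::vector<double>& x)
{
    void*   pt[64]    = {};   // internal solver handle, must start zeroed
    MKL_INT iparm[64] = {};
    iparm[0]  = 1;            // do not use the solver defaults
    iparm[1]  = 10;           // MPI version of the nested dissection reordering
    iparm[7]  = 2;            // maximum number of iterative refinement steps
    iparm[9]  = 13;           // pivot perturbation 1e-13
    iparm[10] = 1;            // enable scaling
    iparm[12] = 1;            // enable weighted matching
    iparm[34] = 1;            // zero-based indexing
    iparm[39] = 2;            // distributed (DCSR) matrix, RHS and solution
    iparm[40] = first_row;    // first row of the local matrix
    iparm[41] = last_row;     // last row of the local matrix

    MKL_INT maxfct = 1, mnum = 1, mtype = 11;   // mtype 11: real, non-symmetric
    MKL_INT nrhs = 1, msglvl = 1, error = 0, idum = 0, phase;
    int comm = MPI_Comm_c2f(MPI_COMM_WORLD);

    // Phase 11: reordering and symbolic factorization
    // Phase 22: numerical factorization
    // Phase 33: solve with iterative refinement
    for (phase = 11; phase <= 33; phase += 11) {
        cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n_global,
                              a.data(), ia.data(), ja.data(), &idum, &nrhs,
                              iparm, &msglvl, b.data(), x.data(), &comm, &error);
    }

    // Phase -1: release all internal solver memory
    phase = -1;
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n_global,
                          a.data(), ia.data(), ja.data(), &idum, &nrhs,
                          iparm, &msglvl, b.data(), x.data(), &comm, &error);
}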

The code is compiled with the 2017 Intel compiler and Intel MPI. The compilation flags used are: -O3 -qopenmp -mkl=parallel and …
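For completeness, the build essentially reduces to the following (mpiicpc as the Intel MPI C++ wrapper and the source file name are assumptions; any additional link flags are omitted):

mpiicpc -O3 -qopenmp -mkl=parallel fem_solver.cpp -o fem_solver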

