Hello.
I am trying to solve a large linear system (1,000,000 x 1,000,000, bandwidth 100 or 1,000) with CPARDISO.
(The matrix is real and symmetric indefinite.)
I have run into problems with the reordering time and memory usage.
CPARDISO's reordering phase is much slower than the other phases, so I checked the event timeline with Trace Analyzer.
CPARDISO used only one process (rank 0) for the reordering, and rank 0 gathered the parts of the A matrix that were distributed across the processes.
As a result, when I solved the system with bandwidth 1,000, an insufficient-memory error occurred. (※ The system with bandwidth 100 was solved successfully.)
Does CPARDISO have to do the reordering and gather the whole A matrix on rank 0 only?
Does rank 0 need a lot of memory to solve a large system?
How can I solve this problem?
The MKL version is 11.3, which was bundled with Parallel Studio XE 2016 Cluster Edition.
These are my CPARDISO settings:
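iparm[ 0] = 1;  /* use the values below instead of the solver defaults */
iparm[ 1] = 0;  /* fill-in reducing ordering: minimum degree (I also tried iparm[1] = 2 and 3) */
iparm[ 5] = 0;  /* write the solution to x, do not overwrite b */
iparm[ 7] = 0;  /* max iterative refinement steps (0 = solver default) */
iparm[ 9] = 8;  /* pivot perturbation 10^-8 */
iparm[10] = 0;  /* scaling disabled */
iparm[12] = 0;  /* weighted matching disabled */
iparm[17] = 0;  /* do not report the number of nonzeros in the factors */
iparm[18] = 0;
iparm[20] = 1;  /* 1x1 and 2x2 Bunch-Kaufman pivoting */
iparm[26] = 0;  /* matrix checker off */
iparm[27] = 0;  /* double precision */
iparm[34] = 1;  /* zero-based indexing */
iparm[39] = input_value[1];  /* matrix input format (centralized or distributed) */
iparm[40] = input_value[2];  /* first row of this rank's input domain */
iparm[41] = input_value[3];  /* last row of this rank's input domain */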
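
For context on the last three entries: with a distributed input format, iparm[40] and iparm[41] give the first and last row of the part of A held by each rank. The listing does not show how input_value[] is filled, so the following is only a minimal sketch of one way such a range could be computed, assuming a contiguous, non-overlapping block-row split and bounds that follow the zero-based convention chosen with iparm[34] = 1. block_row_range is a hypothetical helper, not an MKL function.

```c
#include <assert.h>

/* Hypothetical helper (not part of MKL): computes the contiguous block of
 * rows owned by `rank` when n rows are split over `nprocs` ranks.
 * The range [*first, *last] is zero-based and inclusive. The first
 * (n % nprocs) ranks each get one extra row so the split is balanced.
 * If n < nprocs, some ranks get an empty range (*last == *first - 1). */
void block_row_range(long long n, int nprocs, int rank,
                     long long *first, long long *last)
{
    assert(n > 0 && nprocs > 0 && rank >= 0 && rank < nprocs);

    long long base  = n / nprocs;                 /* rows every rank gets   */
    long long rem   = n % nprocs;                 /* leftover rows          */
    long long extra = rank < rem ? rank : rem;    /* extras given to lower ranks */
    long long count = base + (rank < rem ? 1 : 0);

    *first = (long long)rank * base + extra;
    *last  = *first + count - 1;
}
```

For n = 1,000,000 on 4 ranks this gives rows 0 to 249999 on rank 0 and rows 750000 to 999999 on rank 3; those two values are what would go into iparm[40] and iparm[41] on the respective rank.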
I used 4 nodes connected by InfiniBand; each node has 32 GB of RAM.
Thanks.