Hi,
Once again I'm having some trouble with the Direct Sparse Solver for Clusters. I am getting the following error when running on a single process:
entering matrix solver
*** Error in PARDISO ( insufficient_memory) error_num= 1
*** Error in PARDISO memory allocation: MATCHING_REORDERING_DATA, allocation of 1 bytes failed
total memory wanted here: 142 kbyte

=== PARDISO: solving a real structurally symmetric system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON

Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.000005 s
Time spent in reordering of the initial matrix (reorder)         : 0.000000 s
Time spent in symbolic factorization (symbfct)                   : 0.000000 s
Time spent in allocation of internal data structures (malloc)    : 0.000465 s
Time spent in additional calculations                            : 0.000080 s
Total time spent                                                 : 0.000550 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
    number of equations:           6
    number of non-zeros in A:      8
    number of non-zeros in A (%): 22.222222
    number of right-hand sides:    1

< Factors L and U >
    number of columns for each panel: 128
    number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
    number of supernodes:         0
    size of largest supernode:    0
    number of non-zeros in L:     0
    number of non-zeros in U:     0
    number of non-zeros in L+U:   0

ERROR during solution: 4294967294
It just hangs when running on more than one process. Below is the CSR format of my matrix and the provided RHS to solve for.
CSR row values: 0 2 6 9 12 16 18
CSR col values: 0 1 0 1 2 3 1 2 4 1 3 4 2 3 4 5 4 5
Rank 0 rhs vector: 1 0 0 0 0 1
My calling code looks like this:
void SolveMatrixEquations(MKL_INT numRows, MatrixPointerStruct &cArrayStruct, const std::pair<MKL_INT, MKL_INT> &rowExtents)
{
    double pressureSolveTime = -omp_get_wtime();

    MKL_INT mtype = 1;      /* Real structurally symmetric matrix */
    MKL_INT nrhs = 1;       /* Number of right-hand sides */
    void *pt[64] = { 0 };   /* Internal solver memory pointer */

    /* Cluster Sparse Solver control parameters. */
    MKL_INT iparm[64] = { 0 };
    MKL_INT maxfct, mnum, phase, msglvl, error;

    /* Auxiliary variables. */
    double ddum;            /* Double dummy */
    MKL_INT idum;           /* Integer dummy */

    /* -------------------------------------------------------------------- */
    /* .. Init MPI.                                                         */
    /* -------------------------------------------------------------------- */
    int mpi_stat = 0;
    int comm, rank;
    mpi_stat = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    comm = MPI_Comm_c2f(MPI_COMM_WORLD);

    /* -------------------------------------------------------------------- */
    /* .. Setup Cluster Sparse Solver control parameters.                   */
    /* -------------------------------------------------------------------- */
    iparm[0] = 0;    /* Solver defaults overridden with the values in iparm */
    iparm[1] = 3;    /* Parallel (OpenMP) nested dissection for fill-in reordering */
    //iparm[1] = 10; /* Use parMETIS for fill-in reordering */
    iparm[5] = 0;    /* Write solution into x */
    iparm[7] = 2;    /* Max number of iterative refinement steps */
    iparm[9] = 8;    /* Perturb the pivot elements with 1E-8 */
    iparm[10] = 0;   /* Don't use non-symmetric permutation and scaling MPS */
    iparm[12] = 0;   /* Don't use Maximum Weighted Matching algorithm */
    iparm[17] = 0;   /* Output: number of non-zeros in the factor LU */
    iparm[18] = 0;   /* Output: Mflops for LU factorization */
    iparm[20] = 0;   /* Pivoting for symmetric indefinite matrices */
    iparm[26] = 1;   /* Check the input matrix for errors */
    iparm[27] = 0;   /* Double precision mode of Cluster Sparse Solver */
    iparm[34] = 1;   /* C-style (0-based) indexing for ia and ja arrays */
    iparm[39] = 2;   /* Input: matrix/rhs/solution distributed between MPI processes */
    iparm[40] = rowExtents.first + 1;   /* Beginning of this rank's row domain */
    iparm[41] = rowExtents.second + 1;  /* End of this rank's row domain */

    maxfct = 3;  /* Maximum number of numerical factorizations */
    mnum = 1;    /* Which factorization to use */
    msglvl = 1;  /* Print statistical information */
    error = 0;   /* Initialize error flag */

    //cout << "Rank " << rank << ": " << iparm[40] << " " << iparm[41] << endl;
#ifdef UNIT_TESTS
    //msglvl = 0;
#endif

    /* Reordering and symbolic factorization. */
    phase = 11;
#ifndef UNIT_TESTS
    if (rank == 0) printf("Restructuring system...\n");
#endif
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &numRows, &ddum,
                          cArrayStruct.rowIndexArray, cArrayStruct.colIndexArray,
                          &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &comm, &error);
    if (error != 0) {
        cout << "\nERROR during solution: " << error << endl;
        exit(error);
    }

    /* Numerical factorization and solve. */
    phase = 23;
#ifndef UNIT_TESTS
    if (rank == 0) printf("\nSolving system...\n");
#endif
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &numRows,
                          cArrayStruct.valArray, cArrayStruct.rowIndexArray,
                          cArrayStruct.colIndexArray, &idum, &nrhs, iparm, &msglvl,
                          cArrayStruct.rhsVector, cArrayStruct.pressureSolutionVector,
                          &comm, &error);
    if (error != 0) {
        cout << "\nERROR during solution: " << error << endl;
        exit(error);
    }

    /* Release internal memory. */
    phase = -1;
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &numRows, &ddum,
                          cArrayStruct.rowIndexArray, cArrayStruct.colIndexArray,
                          &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &comm, &error);
    if (error != 0) {
        cout << "\nERROR during release memory: " << error << endl;
        exit(error);
    }

    pressureSolveTime += omp_get_wtime();
#ifndef UNIT_TESTS
    //cout << "Pressure Solve Time: " << pressureSolveTime << endl;
#endif
    //TestPrintCsrMatrix(cArrayStruct, rowExtents.second - rowExtents.first + 1);
}
This is based on the format of one of the shipped examples. I am trying to use the ILP64 interface because my real system is very large (16 billion non-zeros). I am using the Intel C++ compiler 2017 as part of the Intel Composer XE Cluster Edition Update 1, with the following compile and link lines in my CMake files:
TARGET_COMPILE_OPTIONS(${MY_TARGET_NAME} PUBLIC "-mkl:cluster" "-DMKL_ILP64" "-I$ENV{MKLROOT}/include")
TARGET_LINK_LIBRARIES(${MY_TARGET_NAME}
    "-Wl,--start-group
     $ENV{MKLROOT}/lib/intel64/libmkl_intel_ilp64.a
     $ENV{MKLROOT}/lib/intel64/libmkl_intel_thread.a
     $ENV{MKLROOT}/lib/intel64/libmkl_core.a
     $ENV{MKLROOT}/lib/intel64/libmkl_blacs_intelmpi_ilp64.a
     -Wl,--end-group -liomp5 -lpthread -lm -ldl")
What is interesting is that this same code runs perfectly fine on my Windows development machine; porting it to my Linux cluster is what causes the issues. Any ideas?
I am currently waiting out the terribly long download of the Update 4 Composer XE package, but I don't have much hope of that fixing anything, because this code used to run fine on this system.