
Direct Sparse Solver for Clusters - Pardiso Memory Allocation Error


Hi,

Once again I am having some trouble with the Direct Sparse Solver for Clusters. I am getting the following error when running on a single process:

entering matrix solver
*** Error in PARDISO  (     insufficient_memory) error_num= 1
*** Error in PARDISO memory allocation: MATCHING_REORDERING_DATA, allocation of 1 bytes failed
total memory wanted here: 142 kbyte

=== PARDISO: solving a real structurally symmetric system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.000005 s
Time spent in reordering of the initial matrix (reorder)         : 0.000000 s
Time spent in symbolic factorization (symbfct)                   : 0.000000 s
Time spent in allocation of internal data structures (malloc)    : 0.000465 s
Time spent in additional calculations                            : 0.000080 s
Total time spent                                                 : 0.000550 s

Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >
             number of equations:           6
             number of non-zeros in A:      8
             number of non-zeros in A (%): 22.222222

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 128
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    0
             size of largest supernode:               0
             number of non-zeros in L:                0
             number of non-zeros in U:                0
             number of non-zeros in L+U:              0

ERROR during solution: 4294967294

It just hangs when running on more than one process. (The 4294967294 printed above is just -2, PARDISO's insufficient-memory error code, displayed as an unsigned 32-bit value.) Below is the CSR structure of my matrix and the provided RHS to solve for:

CSR row values
0
2
6
9
12
16
18

CSR col values
0
1
0
1
2
3
1
2
4
1
3
4
2
3
4
5
4
5

Rank 0 rhs vector :
1
0
0
0
0
1
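
For reference, the same data written out as 0-based C arrays (a sketch for clarity only; the numerical values of A are omitted since I haven't pasted them):

#include "mkl.h"

/* The 6x6 test system above in 0-based CSR form (matching iparm[34] = 1
   in the code below). ia[6] = 18 non-zeros; the sparsity pattern is
   structurally symmetric with a full diagonal. */
static MKL_INT ia[7]  = { 0, 2, 6, 9, 12, 16, 18 };
static MKL_INT ja[18] = { 0, 1,            /* row 0 */
                          0, 1, 2, 3,      /* row 1 */
                          1, 2, 4,         /* row 2 */
                          1, 3, 4,         /* row 3 */
                          2, 3, 4, 5,      /* row 4 */
                          4, 5 };          /* row 5 */
static double  rhs[6] = { 1, 0, 0, 0, 0, 1 }; /* rank 0 RHS */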

Now my calling file looks like:

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <utility>
#include <mpi.h>
#include <omp.h>
#include "mkl.h"

using std::cout;
using std::endl;

void SolveMatrixEquations(MKL_INT numRows, MatrixPointerStruct &cArrayStruct, const std::pair<MKL_INT, MKL_INT> &rowExtents)
{
	double pressureSolveTime = -omp_get_wtime();

	MKL_INT mtype = 1;  /* Real structurally symmetric matrix */
	MKL_INT nrhs = 1;   /* Number of right-hand sides */

	void *pt[64] = { 0 }; /* Internal solver memory pointer */

	/* Cluster Sparse Solver control parameters. */
	MKL_INT iparm[64] = { 0 };
	MKL_INT maxfct, mnum, phase, msglvl, error;

	/* Auxiliary variables. */
	double  ddum; /* Double dummy (the solve runs in double precision) */
	MKL_INT idum; /* Integer dummy */

	/* -------------------------------------------------------------------- */
	/* .. Init MPI.                                                         */
	/* -------------------------------------------------------------------- */

	int     mpi_stat = 0;
	int     comm, rank;
	mpi_stat = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
	comm = MPI_Comm_c2f(MPI_COMM_WORLD);

	/* -------------------------------------------------------------------- */
	/* .. Setup Cluster Sparse Solver control parameters.                                 */
	/* -------------------------------------------------------------------- */
	iparm[0] = 1;  /* Do not use solver defaults: honor the iparm settings below */
	iparm[1] = 3;  /* Parallel (OpenMP) nested dissection for fill-in reordering */
	//iparm[1] = 10; /* Distributed (MPI) nested dissection */
	iparm[5] = 0;  /* Write solution into x */
	iparm[7] = 2;  /* Max number of iterative refinement steps */
	iparm[9] = 8;  /* Perturb the pivot elements with 1E-8 */
	iparm[10] = 0; /* Disable non-symmetric permutation and scaling (MPS) */
	iparm[12] = 0; /* Disable maximum weighted matching algorithm */
	iparm[17] = 0; /* Output: number of non-zeros in the factor LU */
	iparm[18] = 0; /* Output: Mflops for LU factorization */
	iparm[20] = 0; /* Pivoting for symmetric indefinite matrices */
	iparm[26] = 1; /* Check the input matrix for errors */
	iparm[27] = 0; /* Double precision (1 would switch to single precision) */
	iparm[34] = 1; /* C-style (zero-based) indexing for ia and ja arrays */

	iparm[39] = 2; /* Input: matrix/rhs/solution distributed between MPI processes */
	iparm[40] = rowExtents.first + 1;  /* Beginning of this rank's row domain */
	iparm[41] = rowExtents.second + 1; /* End of this rank's row domain */
	maxfct = 3; /* Maximum number of numerical factorizations */
	mnum = 1;   /* Which factorization to use */
	msglvl = 1; /* Print statistical information to screen */
	error = 0;  /* Initialize error flag */
	//cout << "Rank " << rank << ": " << iparm[40] << " " << iparm[41] << endl;
#ifdef UNIT_TESTS
	//msglvl = 0;
#endif

	phase = 11; /* Reordering and symbolic factorization */
#ifndef UNIT_TESTS
	if (rank == 0) printf("Restructuring system...\n");
#endif

	cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &numRows, &ddum,
		cArrayStruct.rowIndexArray, cArrayStruct.colIndexArray, &idum, &nrhs,
		iparm, &msglvl, &ddum, &ddum, &comm, &error);
	if (error != 0)
	{
		cout << "\nERROR during solution: " << error << endl;
		exit(error);
	}


	phase = 23; /* Numerical factorization and solve */

#ifndef UNIT_TESTS
	printf("\nSolving system...\n");
#endif

	/* Use the same cluster_sparse_solver entry point for every phase: with
	   -DMKL_ILP64, MKL_INT is already 64-bit, and the pt handle must not be
	   passed back and forth between cluster_sparse_solver and the _64 variant. */
	cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &numRows,
		cArrayStruct.valArray, cArrayStruct.rowIndexArray, cArrayStruct.colIndexArray,
		&idum, &nrhs, iparm, &msglvl,
		cArrayStruct.rhsVector, cArrayStruct.pressureSolutionVector, &comm, &error);
	if (error != 0)
	{
		cout << "\nERROR during solution: " << error << endl;
		exit(error);
	}

	phase = -1; /* Release internal memory */
	cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &numRows, &ddum,
		cArrayStruct.rowIndexArray, cArrayStruct.colIndexArray, &idum, &nrhs,
		iparm, &msglvl, &ddum, &ddum, &comm, &error);
	if (error != 0)
	{
		cout << "\nERROR during release memory: " << error << endl;
		exit(error);
	}
	/* Check residual */

	pressureSolveTime += omp_get_wtime();


#ifndef UNIT_TESTS
	//cout << "Pressure Solve Time: "<< pressureSolveTime << endl;
#endif

	//TestPrintCsrMatrix(cArrayStruct,rowExtents.second-rowExtents.first +1);
}
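
For reference, a minimal single-rank driver for this function looks like the sketch below. MatrixPointerStruct is shown with just the fields the function touches, and the matrix values are placeholders since I haven't pasted them:

#include <mpi.h>
#include <utility>
#include "mkl.h"

/* Only the fields SolveMatrixEquations actually uses are sketched here. */
struct MatrixPointerStruct
{
	MKL_INT *rowIndexArray;
	MKL_INT *colIndexArray;
	double  *valArray;
	double  *rhsVector;
	double  *pressureSolutionVector;
};

void SolveMatrixEquations(MKL_INT numRows, MatrixPointerStruct &cArrayStruct,
	const std::pair<MKL_INT, MKL_INT> &rowExtents);

int main(int argc, char **argv)
{
	MPI_Init(&argc, &argv);

	/* The 6x6 test system shown earlier, in 0-based CSR form. */
	MKL_INT ia[7]  = { 0, 2, 6, 9, 12, 16, 18 };
	MKL_INT ja[18] = { 0, 1, 0, 1, 2, 3, 1, 2, 4, 1, 3, 4, 2, 3, 4, 5, 4, 5 };
	double  a[18]  = { 0 }; /* placeholder: real matrix values go here */
	double  rhs[6] = { 1, 0, 0, 0, 0, 1 };
	double  x[6]   = { 0 };

	MatrixPointerStruct mat = { ia, ja, a, rhs, x };

	/* One rank owns all six rows (rows 0 through 5). */
	SolveMatrixEquations(6, mat, std::pair<MKL_INT, MKL_INT>(0, 5));

	MPI_Finalize();
	return 0;
}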

This is based on the format of one of the provided examples. I am trying to use the ILP64 interface because my real system is very large (16 billion non-zeros). I am using the Intel C++ compiler 2017 from Intel Composer XE Cluster Edition Update 1, with the following lines in my CMake files:

TARGET_COMPILE_OPTIONS(${MY_TARGET_NAME} PUBLIC "-mkl:cluster" "-DMKL_ILP64" "-I$ENV{MKLROOT}/include")
TARGET_LINK_LIBRARIES(${MY_TARGET_NAME} "-Wl,--start-group $ENV{MKLROOT}/lib/intel64/libmkl_intel_ilp64.a $ENV{MKLROOT}/lib/intel64/libmkl_intel_thread.a $ENV{MKLROOT}/lib/intel64/libmkl_core.a $ENV{MKLROOT}/lib/intel64/libmkl_blacs_intelmpi_ilp64.a -Wl,--end-group -liomp5 -lpthread -lm -ldl")
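
One sanity check worth compiling in (a sketch, just a compile-time assertion): with -DMKL_ILP64 in effect, MKL_INT is 8 bytes, so the plain cluster_sparse_solver entry point already takes 64-bit integers and should not be mixed with the _64-suffixed variant on the same handle.

#include "mkl.h"

/* Fails to compile if the ILP64 interface is not actually in effect,
   e.g. if -DMKL_ILP64 was dropped from the compile options. */
static_assert(sizeof(MKL_INT) == 8,
	"MKL_ILP64 not defined: MKL_INT is 32-bit, so the LP64 interface is in use");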

What is interesting is that this same code runs perfectly fine on my Windows development machine; porting it to my Linux cluster is what causes issues. Any ideas?

I am currently waiting on the terribly long download of the Update 4 Composer XE package, but I don't have much hope of it fixing this, since the code used to run fine on this system.

