Non-square Matrix Transpose
Hi guys, are there any highly optimized MKL routines or maybe performance primitives that can do rectangular matrix transposition but without scaling? I've been using mkl_omatcopy but it seems to perform...
Column sorting
Given matrix A of size MxN, I would like to sort column-wise. In MATLAB this can be solved easily and quickly by sort(A) and it can even sort row-wise by sort(A,2). In Fortran, I don't find this and...
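As a rough illustration of what a column-wise sort like sort(A) does, here is a minimal C sketch. It assumes the matrix is stored column-major (as in Fortran and MATLAB) and uses the standard library's qsort rather than any MKL routine; the names cmp_double and sort_columns are hypothetical and do not come from the post.

```c
#include <stdlib.h>

/* Comparator for qsort: ascending order of doubles. */
static int cmp_double(const void *x, const void *y)
{
    double a = *(const double *)x;
    double b = *(const double *)y;
    return (a > b) - (a < b);
}

/* Sort each column of an m-by-n column-major matrix in ascending order.
   Column j occupies a[j*m] .. a[j*m + m - 1], so every column is
   contiguous in memory and can be handed to qsort directly. */
void sort_columns(double *a, size_t m, size_t n)
{
    for (size_t j = 0; j < n; ++j)
        qsort(a + j * m, m, sizeof(double), cmp_double);
}
```

A row-wise sort, like sort(A,2), would need strided access in this layout, for example by copying each row into a temporary buffer, sorting it, and copying it back.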
DTRSM in threaded applications
Hello, I have a threaded application where I call the MKL library on a Windows 7 platform. Intel Inspector is reporting a read/write collision in DTRSM (it goes through many other BLAS calls with no...
ARPACK with MKL crashes when calling zdotc
I am trying to use the ARPACK library written in Fortran to solve an eigenvalue problem from within a C++ program, together with MKL implementations of the required BLAS and LAPACK routines....
how to effectively reduce memory consumption of each compute node in cluster...
Hi: I need to use Cluster PARDISO to solve a large double-precision complex symmetric matrix. I set iparm(40)=0, which means providing the matrix in the usual centralized input format: the master MPI process...
"dlopen" warning when using "-static"
Hi there, with the makefile FC = ifort -mkl -warn nounused -warn declarations -static -O3 -parallel SRC := OwnFlag = $(LibPath)$(Own)OwnLib_ifort.a LibPath = ~/.local/lib/Fortran/ Own = OwnFunctions/...
Problem with ZCGESV
The problem I am facing is that the ZCGESV function crashes when the matrix size is 46497 or more. When the matrix size is, for example, 46202, everything works fine. From what I can see in LAPACK sources at...
Preconditioner dcsrilu0 has returned the ERROR code -106
Hi all, please give me some suggestions about this error: Preconditioner dcsrilu0 has returned the ERROR code -106. I tried to use ilu + gmres to solve Ax=b. Here, A is like
 2 -1
-1  2 -1
   -1  2 -1
and so...
Broken dgeqp3 in Version 11.2 (Update 3) (Linux)
Hi all, this was working in December 2014 when I last ran my code against MKL, but after upgrading to 11.2u3 I'm getting a response of -9 from the info parameter when calling dgeqp3... which is *really*...
Cholesky with pdpotrf()
I am performing a Cholesky factorization with pdpotrf(). I read the whole matrix on the master node and then distribute it. Then every node handles a submatrix and calls pdpotrf(). Then I...
ignore
How does MKL threading with TBB compare to the threading with OpenMP in performance? Could you guys add data for OpenMP as well in the comparison you did?...
pdpotrf() fails to identify SPD matrix
I am trying to perform a Cholesky decomposition via pdpotrf() from Intel's MKL library, which uses ScaLAPACK. I read the whole matrix on the master node and then distribute it like in this example....
Vector plus/minus one floating point number
Hi, v?Add(n,a,b) performs element-by-element addition of vector a and vector b. Sometimes only a single shift is required, i.e. in this case b can be interpreted as a floating point number, i.e. a[k] +...
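To make the requested operation concrete, the sketch below adds one scalar to every element of a vector with a plain C loop. It is not an MKL routine: the function name vec_add_scalar and its argument order are hypothetical, chosen only to mirror the v?Add(n,a,b) shape quoted above.

```c
#include <stddef.h>

/* Hypothetical helper: y[k] = a[k] + shift for k = 0 .. n-1.
   The output y may be the same array as a (in-place shift).
   Pass a negative shift to subtract. */
void vec_add_scalar(size_t n, const double *a, double shift, double *y)
{
    for (size_t k = 0; k < n; ++k)
        y[k] = a[k] + shift;
}
```

A loop this simple is usually auto-vectorized by an optimizing compiler, so it can serve as a baseline when comparing against a library call.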
Incomplete factorization...
Hello, I am testing MKL ilu factorization. As it only works for ilu0, I provide a sparse matrix pattern that corresponds to ilu(k). One thing I observe is that the performance is getting slower when I...
Question: cycle count of 65536 MKL FFT DftiComputeForward(C++)
My code is as follows: fft_mkl(int M, float * InputData, float * OutputData){ MKL_LONG status; DFTI_DESCRIPTOR my_desc1_handle; DftiCreateDescriptor( &my_desc1_handle, DFTI_SINGLE, DFTI_COMPLEX, 1, M);...
Possible dgetrf IPIV issue
Hello, I am attempting to use dgetrf to get an LU factorization of a square matrix as part of a large mex program. When I check the output of dgetrf, I find the IPIV contains both a 0 and a number...
ScaLAPACK pdgetrf_ factorises only one block of global matrix
Hi all, I am modifying a large C program to factorise a dense square matrix and solve the resulting system of 916 equations by back-substitution in parallel. I set up a 4x4 process grid and distribute...
Distributed Cholesky
I am doing a distributed Cholesky. You can find the code I am using (almost the same) here. I am gathering the submatrices (after every node has executed Cholesky) into the master node exactly as shown...
Computation of the Schur complement in MKL
Hello everyone, I recently came across a project in which I have to compute the Schur complement of a complex symmetric matrix. I know that starting from Intel® MKL 11.2 update 1, MKL supports the...