Channel: Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library
Viewing all 2652 articles

Parallel Studio XE Integration Issues with VS2019


I want to use the Math Kernel Library, so I installed Parallel Studio XE 2019 Update 4, but it does not integrate with my Microsoft Visual Studio 2019 Community, version 16.2.2. Can anyone help me out? I have followed every step mentioned in the startup guide.

 

file:///C:/Program%20Files%20(x86)/IntelSWTools/documentation_2019/en/compiler_c/ps2019/get_started_wc.htm

https://software.intel.com/en-us/articles/intel-software-development-too...

https://software.intel.com/sites/default/files/managed/0c/45/IPSXE_2019_...

https://software.intel.com/sites/default/files/parallel-studio-xe-2019u4...

 

Thanks

Abhinav Parashar
abhinavmoon20@gmail.com


Way to determine the N smallest eigenvalues in a general Hermitian eigenvalue problem?


Hello, 

Is there any way to determine the first N (e.g. 10) smallest eigenvalues of a general Hermitian eigenvalue problem?

 

Middle Square Weyl Sequence RNG


Has Intel considered including the Middle Square Weyl Sequence RNG in MKL?

It is probably the fastest RNG available that passes all the statistical tests.

See arxiv 1704.00358v4

Intel Maths Kernel Library (MKL) version differences


Dear Sir,

For one of our projects we are currently using Intel MKL (Math Kernel Library) version 10.0.4 on the Red Hat Enterprise Linux 7.2 distribution.

However, we found that the latest version listed on the site https://software.intel.com/en-us/articles/intel-math-kernel-library-rele...

is MKL 2019 Update 4.

 

Please let us know the chronology of the version history (from the initial version to the latest).

Also, please let us know how to find the differences between versions 10.x.x and MKL 2019 Update 4.

Is MKL 2019 Update 4 supported on Red Hat Linux 8.0?

  

Thanks & Regards,

Krishnakant Mehta

 

 

intel math kernel library


I want to calculate eigenvalues and eigenvectors in Fortran.
I have a few questions here.
1- Select dynamic or static linking: what do dynamic, static, and single dynamic library mean?
2- Select interface layer: 32 or 64?
3- Select threading layer: OpenMP, sequential, ...?
4- Select OpenMP library: what is Intel libiomp5md.lib?
5- Select MPI library: what happens if I choose a parallel option even though my program is not parallel?
6- Select the Fortran 95 interfaces: when should I check this?
7- Use this link line: where do we use the link line given here?
8- Compiler options: where do we write this part?

LAPACKE_dtpqrt and LAPACKE_dtpmqrt bug report


machine: MAC

mkl version: 2019.4.233

compiler: clang-1001.0.46.4

LAPACKE_dtpqrt:

layout row major

parameters m = n = l = lda = ldb = ldt = 12

If we set nb > 1,

the routine generates NaN in the lower part of the triangular matrices stored in t.

This makes LAPACKE_dtpmqrt return an error indicating an invalid t array.

LAPACKE_dtpmqrt:

layout row major, left side multiplication, no transposition

case 1:

parameters m = k = n = l = ldv = ldt = lda = ldb = 12

If nb != k,

the routine does not compute the correct product.

case 2:

parameters m = k = l = nb = ldv = ldt = 12,

n = lda = ldb

If n != k,

the routine returns error -14, meaning that lda does not have a valid value,

however, it should be valid
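
The claim in case 2 can be checked against the argument shapes. The sketch below is plain C with no LAPACKE calls. It assumes the shapes given in the reference LAPACK description of DTPMQRT for SIDE = 'L' (V is m-by-k, T is nb-by-k, A is k-by-n, B is m-by-n) and the usual row-major rule that the leading dimension is the number of columns. The names `tpmqrt_min_ld` and `tpmqrt_min_ld_left` are made up for illustration, and whether MKL's wrapper applies exactly these checks is an assumption.

```c
#include <assert.h>

/* Smallest row-major leading dimensions for side = 'L'.
   Hypothetical helper type; not part of LAPACKE or MKL. */
typedef struct { int ldv, ldt, lda, ldb; } tpmqrt_min_ld;

static int at_least_one(int x) { return x > 1 ? x : 1; }

tpmqrt_min_ld tpmqrt_min_ld_left(int m, int n, int k)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    (void)m;                   /* m sets row counts only, not column counts */
    tpmqrt_min_ld r;
    r.ldv = at_least_one(k);   /* V is m-by-k  -> k columns */
    r.ldt = at_least_one(k);   /* T is nb-by-k -> k columns */
    r.lda = at_least_one(n);   /* A is k-by-n  -> n columns */
    r.ldb = at_least_one(n);   /* B is m-by-n  -> n columns */
    return r;
}
```

Under these assumptions, a call with m = k = 12 and some n different from 12 needs lda >= n, so lda = n would be a valid value, which is consistent with the report.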

Data fitting cubic spline - failing with


Hi,

I'm aware that this question is close to that being asked in: https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/...

I'm using an older version of MKL, version 11.0.5, and I am having problems with the use of the Akima cubic spline that I can't find any concrete answer for in the documentation.

I'm using a combination of boundary conditions on the spline, depending on the case, of either a fixed value to the 1st or 2nd derivative at the left and right edge (both left and right BC are always defined). I don't define any internal conditions, and I use a non-uniform partition for the x breakpoints.

The problem I am encountering is that if there are <= 4 breakpoints, the dfdConstruct1D method fails with the error 'Error: the number of breakpoints is invalid (code -1004).' At the moment, the code enforces a linear interpolation if there are <= 2 breakpoints, which I understood to be the minimum.

However, I can't seem to find anywhere in the documentation that the minimum number of breakpoints required for each cubic spline method is defined.

Can someone clarify what the minimum number of breakpoints required is for each of the methods, and how the BC's and/or IC's change this behaviour?

Thanks,

Ewan

Pardiso gives segmentation fault


Hi,

I am trying to diagonalize a matrix with Arpack using Pardiso. For small matrices everything works fine, but for slightly larger matrices (7300x7300) Pardiso gives a segmentation fault. When I run under gdb I get the following backtrace:

#0  0x00007ffff4379220 in mkl_pds_lp64_metis_pqueueupdateup ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#1  0x00007ffff437f681 in mkl_pds_lp64_metis_fm_2waynoderefine_onesided ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#2  0x00007ffff437ff81 in mkl_pds_lp64_metis_refine2waynode ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#3  0x00007ffff437018f in mkl_pds_lp64_metis_mlevelnodebisectionmultiple ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#4  0x00007ffff19bbc7c in mkl_pds_lp64_metis_mlevelnesteddissection_pardiso ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_intel_thread.so
#5  0x00007ffff19bbf8d in mkl_pds_lp64_metis_mlevelnesteddissection_pardiso ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_intel_thread.so
#6  0x00007ffff19bbf8d in mkl_pds_lp64_metis_mlevelnesteddissection_pardiso ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_intel_thread.so
#7  0x00007ffff19bbf8d in mkl_pds_lp64_metis_mlevelnesteddissection_pardiso ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_intel_thread.so
#8  0x00007ffff437137f in mkl_pds_lp64_metis_nodend_vbsr_pardiso ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#9  0x00007ffff4391b60 in mkl_pds_lp64_pds_nested_disection ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#10 0x00007ffff438d199 in mkl_pds_lp64_pds_reordering ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#11 0x00007ffff438cbf4 in mkl_pds_lp64_dist_pardiso ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#12 0x00007ffff4618a6d in mkl_pds_lp64_pardiso ()
   from /home/aghazary/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_core.so
#13 0x000055555555c3dd in GenMatProdShift::perform_op (this=0x7fffffffd670, xIn=0x555555903270, yOut=0x55555591ff20) at diag.cpp:119
#14 0x000055555555cb0d in calcEValues (baseHam=..., op=..., evalues=0x555555865640, evecs=0x7fffeccf3010, shiftMethod=true,
    neigs=40) at diag.cpp:224
#15 0x0000555555556128 in main () at main.cpp:74

which is not particularly illuminating. I am attaching my code, which is slightly complicated, but the Pardiso code is in diag.cpp. I am also attaching the resulting sparse matrix in CSR format.

I should mention that the matrix is generally complex and neither symmetric nor Hermitian.

 

Please let me know if you need further details.

 

Thanks,

Areg Ghazaryan

Attachments:
Code.zip (application/zip, 13.08 KB)
Matrix.zip (application/zip, 350.31 KB)

dxyevx calculates eigenvectors negatively


I wonder if my energy is negative in quantum calculations

FGMRES for complex arithmetic?!


Hi,

Is there a built-in GMRES function that can deal with complex numbers? Apparently it only supports double-precision real numbers.

I need it to solve a large preconditioned system iteratively.

Any help would be appreciated.

Thanks.

Segmentation fault in mkl_sparse_d_mv for a sparse matrix with many nonzeros


Hi,

I am computing y=Ax (mkl_sparse_d_mv) for an 81,000 x 81,000 sparse matrix A (CSR format) with 2.3*10^9 nonzeros, but MKL produces a segmentation fault.

The sparse matrix is dumped from another program. I first read it from disk, and then call mkl_sparse_d_create_csr to create the matrix. mkl_sparse_d_create_csr returns SPARSE_STATUS_SUCCESS. But when I move on to mkl_sparse_d_mv, I get a segmentation fault.

Another observation is that when I decrease the number of nonzeros to 1.5*10^9, MKL finishes the computation successfully.

My hypothesis is that, since 2.3*10^9 > 2^31, MKL may have a constraint on the number of nonzeros in a sparse matrix.

Any help would be appreciated. Thanks!

Env: Linux, parallel_studio_xe_2019_update4.
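
The arithmetic behind this hypothesis can be checked in a few lines. The sketch below is plain C with no MKL calls. It assumes the program is linked against the LP64 interface, where MKL integers are 32-bit; the helper name `fits_in_int32` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a count of nonzeros can be stored in a signed 32-bit
   integer, 0 otherwise. Hypothetical helper, not an MKL function. */
int fits_in_int32(int64_t nnz)
{
    assert(nnz >= 0);                    /* a count cannot be negative */
    return nnz <= (int64_t)INT32_MAX;    /* INT32_MAX = 2^31 - 1 */
}
```

With the numbers from the post, `fits_in_int32(1500000000)` is 1 and `fits_in_int32(2300000000)` is 0, which matches the observed behaviour. If this is indeed the cause, the ILP64 interface (compiling with `-DMKL_ILP64` and linking `libmkl_intel_ilp64`) uses 64-bit integers; whether that resolves this particular crash is not confirmed here.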

How to download MKL for Linux/Windows


It appears that there is an error in a required web-based form at Intel's download pages for MKL and other libraries.

At URL:

https://software.seek.intel.com/performance-libraries

It shows the following text headers that seem to be intended as a form, but there are no entry fields displayed:

  Required Fields(*)

 Please enter a First Name.

 First Name must be at least 2 characters.

 First Name must be less than 255 characters.

 Please enter a Last Name.

 Last Name must be at least 2 characters.

 Last Name must be less than 255 characters.

 Please enter an email address.

 Please enter a valid email address.

 Email Address must be less than 255 characters.

 Please enter a company name.

etc...

There is a 'submit' button, but of course it's not functional.

=====================================================

PS: Just tested, and the form displays correctly with Microsoft Edge browser.

Problems appear under Chrome browser running under both Windows and Linux.

 

How would you justify these execution times (LAPACK, CBLAS, PARDISO)?


I am running experiments on a cluster where each node has 2*18-core Intel Xeon processors.
I have two questions regarding the execution times that I get from two sets of codes.

First: I wrote two versions of the same code, which uses both MPI and OpenMP. The code is pretty complex, but it makes one call to the LAPACK routine dgesv and one to the CBLAS routine dgemv. The two versions differ only in that one includes lapacke.h and cblas.h, while the other includes the MKL headers (mkl.h, mkl_blas.h, mkl_cblas.h, mkl_lapacke.h). From what I know, the latter case is faster because these two functions are threaded with OpenMP (https://software.intel.com/en-us/mkl-linux-developer-guide-openmp-thread...). I have tested both codes on the same input data (basically a matrix) and with the same configurations: usually I run experiments on N compute nodes of the cluster, each node has N tasks (i.e., distributed processes, and hence I have a total of N*N distributed processes), and to each process I assign 4 CPUs for multi-threading. Here are some results:
N = 4 (16 distributed processes): code 1 takes 0.52 s, code 2 takes 0.16 s.
N = 7 (49 distributed processes): code 1 takes 0.5 s, code 2 takes 0.36 s.
N = 9 (81 distributed processes): code 1 takes 0.5 s, code 2 takes 0.12 s.
These execution times are averaged, but the differences between individual runs are very small.
I was expecting the MKL version to be faster, but I am surprised to see such a big time difference. How would you justify it? Execution times are computed by MPI_Wtime().

 

Second: I am using the MKL function cluster_sparse_solver, which integrates PARDISO routines. Again the execution time is the one given by MPI_Wtime(). I ran experiments using the same configuration as the one described above (for N*N distributed processes, I reserve N compute nodes in the cluster, with N tasks per node and 4 CPUs per task). I then read the notes more carefully (page 1741 of the MKL reference manual for C), where, speaking of this function, it is stated that "A hybrid implementation combines Message Passing Interface (MPI) technology for data exchange between parallel tasks (processes) running on different nodes, and OpenMP* technology for parallelism inside each node of the cluster." Does it mean that the configuration that I have used is not valid? I ran more experiments with N*N compute nodes, 1 task per node and 4 CPUs per task, and they were much faster than the corresponding ones with the previous configuration. This might be only partially due to the larger amount of memory available when one whole node is reserved for a single process, since the input data is not very big.

Thank you in advance. 
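
The per-node core budget of the two configurations can be worked out with a small sketch. It is plain C with no MPI or MKL calls, and it assumes that "2*18-core" means 36 physical cores per node. The function names `cores_requested` and `fits_on_node` are made up for illustration.

```c
#include <assert.h>

/* Cores requested on one node by tasks_per_node MPI processes that
   each run cpus_per_task threads. Hypothetical helper. */
int cores_requested(int tasks_per_node, int cpus_per_task)
{
    assert(tasks_per_node > 0 && cpus_per_task > 0);
    return tasks_per_node * cpus_per_task;
}

/* Returns 1 if the request fits on a node with cores_per_node cores. */
int fits_on_node(int tasks_per_node, int cpus_per_task, int cores_per_node)
{
    return cores_requested(tasks_per_node, cpus_per_task) <= cores_per_node;
}
```

For the first configuration, N = 4, 7 and 9 request 16, 28 and 36 of the 36 cores per node, so none of the reported runs oversubscribes a node on paper. The second configuration requests 4 cores per node. How threads are actually pinned depends on the scheduler and MPI launcher, which the post does not specify.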

Should we generate random numbers (using VSL_RNG) on the fly or prior to the loop?


Hello,

I am currently learning how to use random number generators, and am using the MKL VSL RNG routines.

I have written this simple code, which compares the efficiency of generating all random numbers at once with generating them on the fly. The code runs in parallel, and I am using VSL_BRNG_WH+rank to get a different generator for each MPI process.

For generating nmax=1e8 numbers I get the following:

time = 0.35 seconds for generating all numbers at once (nn=1 setting in the code)

time = 16 seconds for generating on the fly (nn=2 setting in the code)

 

Is this expected behaviour? Is it generally expected that generating all the numbers at once before entering a loop is much faster?

 

include 'mkl_vsl.f90'

program rnd_test

use MKL_VSL
use MKL_VSL_TYPE
use mpi

implicit none
   real(kind=8) t1,t2  ! start and end wall-clock times
      real(kind=8) s        ! average
      real(kind=8) a, sigma ! parameters of normal distribution
      real(kind=8), allocatable :: r(:) ! buffer for random numbers

      TYPE (VSL_STREAM_STATE)::stream

      integer errcode
      integer i,j, n11, nloop, nn
      integer brng,method,seed,n, ierr, size, rank
      integer(kind=8) :: nskip, nmax
      call mpi_init(ierr)

      call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

      n = 1
      s = 0.0
      a = 5.0
      sigma  = 2.0

      nmax = 1e8

!-----------------------------------------------------------------------
      nn = 2 ! (1): all at once. >1: on the fly
!----------------------------------------------------------------------






      nloop = 0


      if(nn>1)then
         nloop=nmax
         nn = 1
      else
         nloop=1
         nn = nmax
      endif


      allocate(r(nn))

      method=VSL_RNG_METHOD_GAUSSIAN_ICDF
      seed=777
      brng = VSL_BRNG_WH+rank

!     ***** Initializing *****
      errcode=vslnewstream( stream, brng,  seed )

      t1 = 0.
      t2 = 0.
      t1 = mpi_wtime()

!     ***** Generating *****
      do i = 1, nloop
          errcode=vdrnggaussian( method, stream, nn, r, a, sigma )
!         s = s + sum(r)
      end do

      t2= mpi_wtime()

!      s = s / 10000.0

      print*, "time: ", t2-t1
      call mpi_barrier(MPI_COMM_WORLD,ierr)
!     ***** Deinitialize *****
      errcode=vsldeletestream( stream )




end program

 

best

Ali
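
A common middle ground between the two settings above is to generate numbers in blocks and hand them out one at a time. The sketch below shows that pattern in plain C. It does not call MKL: `fill_block` is a hypothetical stand-in (a simple 64-bit LCG producing uniform doubles) for a bulk generator call, and `BLOCK`, `buffered_rng`, `buffered_rng_init` and `buffered_rng_next` are names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK 1024  /* numbers produced per bulk call (tunable assumption) */

/* Stand-in for a vectorized generator call: fills out[0..n-1] with
   uniform doubles in [0,1). A real program would call the library's
   bulk routine here instead. */
void fill_block(uint64_t *state, size_t n, double *out)
{
    for (size_t i = 0; i < n; i++) {
        *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
        out[i] = (double)(*state >> 11) / 9007199254740992.0;  /* 2^53 */
    }
}

typedef struct {
    uint64_t state;     /* generator state */
    double buf[BLOCK];  /* numbers not yet handed out */
    size_t next;        /* index of next unused entry; BLOCK means empty */
} buffered_rng;

void buffered_rng_init(buffered_rng *g, uint64_t seed)
{
    assert(g != NULL);
    g->state = seed;
    g->next = BLOCK;    /* force a refill on first use */
}

/* Returns one number; refills the buffer with one bulk call when empty. */
double buffered_rng_next(buffered_rng *g)
{
    if (g->next == BLOCK) {
        fill_block(&g->state, BLOCK, g->buf);
        g->next = 0;
    }
    return g->buf[g->next++];
}
```

The loop body calls `buffered_rng_next` once per iteration, so the per-call overhead of the bulk routine is paid once per `BLOCK` numbers rather than once per number, without allocating all nmax values up front. The best block size is an assumption to be tuned by measurement.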

 

Problem running PARDISO example


Hi, I am trying to use the PARDISO solver on macOS and get an error when I compile and run the pardiso_unsym_f.f example. I'm compiling like this:

gfortran pardiso_unsym_f.f -o pardiso_unsym_fexec  ${MKLROOT}/lib/libmkl_intel_ilp64.a ${MKLROOT}/lib/libmkl_intel_thread.a ${MKLROOT}/lib/libmkl_core.a -liomp5 -lpthread -lm -ldl

and I get this error when I run it:

LMC-062490:Fortran jpolk$ ./pardiso_unsym_fexec
 Reordering completed ...
 The following ERROR was detected:           -1
STOP 1

I would appreciate any help!


Dynamically link without using CPU-specific DLLs


Hello.

How can I compile a program to link MKL dynamically and so that it only uses the basic DLLs (mkl_intel_thread.dll, mkl_core.dll, libiomp5md.dll) and won't try to load different ones like mkl_avx.dll, mkl_avx2.dll, etc. on different processors?

Or is there an environment variable or something to do that?

 

complex-valued scalar products


Is there any problem calling the BLAS functions zdotc and zdotu in recent versions of MKL?

When calling these functions from C code using a recent version of MKL, the code crashes or returns wrong values. It works properly when using a soft-coded Fortran BLAS. I am using Linux and have tried three different versions of MKL (2018.5.274, 2019.1.144 and 2019.3.199).

 

Here is how I call the MKL

cc -O -fPIC -fopenmp -m64 -mcmodel=medium -I $MKLROOT/include test.c -L $MKLROOT/lib -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lm

 

Here is a sample C-code

#include <stdio.h>
#include <stdlib.h>

typedef struct { double r, i; } doublecomplex;

doublecomplex zdotc_(int *,doublecomplex *,int *, doublecomplex *,int *),
                         zdotu_(int *,doublecomplex *,int *, doublecomplex *,int *);

#define N 4

int main(int argc, char **argv)
{
  doublecomplex *v, *w, val;
  int i=N,j=1,k=2,l,m;
  v=(doublecomplex *)malloc((size_t)N  *sizeof(doublecomplex));
  w=(doublecomplex *)malloc((size_t)2*N*sizeof(doublecomplex));

  for (l=0; l<N; l++) {
      v[l].r  = 1.0; v[l].i  =(double)l;

      w[l].r  =-1.0; w[l].i  = 1.0;
      w[N+l].r= 1.0; w[N+l].i= 0.0;
  }

  val=zdotc_(&i, v,&j, w,&k);
  printf("val=(%8.1le,%8.1le)\n",val.r,val.i);

  val=zdotu_(&i, v,&j, w,&k);
  printf("val=(%8.1le,%8.1le)\n",val.r,val.i);

  free(v);
  free(w);
 
  return 0;
}
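
For comparison, the values this test program should print can be computed without any BLAS library. The sketch below is plain C and reuses the `doublecomplex` layout from the sample; `ref_zdot` is a name made up for illustration, not a BLAS or MKL routine.

```c
#include <assert.h>

typedef struct { double r, i; } doublecomplex;  /* same layout as the sample */

/* Reference dot product with BLAS-style positive strides.
   conjugate_x != 0 gives sum(conj(x[j]) * y[j])  (what zdotc computes),
   conjugate_x == 0 gives sum(x[j] * y[j])        (what zdotu computes). */
doublecomplex ref_zdot(int n, const doublecomplex *x, int incx,
                       const doublecomplex *y, int incy, int conjugate_x)
{
    assert(incx > 0 && incy > 0);
    doublecomplex s = { 0.0, 0.0 };
    for (int j = 0; j < n; j++) {
        double xr = x[j * incx].r;
        double xi = conjugate_x ? -x[j * incx].i : x[j * incx].i;
        double yr = y[j * incy].r;
        double yi = y[j * incy].i;
        s.r += xr * yr - xi * yi;   /* real part of the product */
        s.i += xr * yi + xi * yr;   /* imaginary part of the product */
    }
    return s;
}
```

With the data from the sample (n = 4, stride 1 for v, stride 2 for w), the conjugated product is (1, -2) and the unconjugated product is (-1, 6). A possible explanation for the crash, not confirmed in the post, is a mismatch between the C declaration and the Fortran interface in how a complex function result is returned. The CBLAS entry points cblas_zdotc_sub and cblas_zdotu_sub avoid that question by writing the result through a pointer argument.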

 

FEAST multiplicity of eigenvalues


Hi,

I was wondering how the FEAST algorithm determines the multiplicity of the eigenvalues, for the case of repeated eigenvalues. More precisely, I would like to know if there exists a special tolerance for when two or more eigenvalues are treated as repeated.

The reason behind the question is that I need the derivatives of the eigenvalues, and special treatment is required for repeated eigenvalues, since the eigenvectors of a repeated eigenvalue can be linearly combined in an infinite number of ways.

Best,

Anna
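
Independently of what FEAST does internally, repeated eigenvalues can be detected on the caller's side after the solve. The sketch below is plain C and is user-side post-processing, not a description of FEAST. It assumes the eigenvalues are already sorted in ascending order and that the caller chooses the tolerance; `group_eigenvalues` is a name made up for illustration.

```c
#include <assert.h>

/* Groups sorted eigenvalues into clusters of (numerically) equal values.
   Two neighbours share a cluster when they differ by at most
   tol * max(1, |larger value|). Writes the size of cluster g to mult[g]
   and returns the number of clusters. mult must have room for n entries. */
int group_eigenvalues(int n, const double *lambda, double tol, int *mult)
{
    assert(n >= 0 && tol >= 0.0);
    int groups = 0;
    for (int i = 0; i < n; i++) {
        int same = 0;
        if (i > 0) {
            assert(lambda[i] >= lambda[i - 1]);   /* input must be sorted */
            double mag = lambda[i] < 0.0 ? -lambda[i] : lambda[i];
            double scale = mag > 1.0 ? mag : 1.0;
            same = (lambda[i] - lambda[i - 1]) <= tol * scale;
        }
        if (same)
            mult[groups - 1]++;    /* extend the current cluster */
        else
            mult[groups++] = 1;    /* start a new cluster */
    }
    return groups;
}
```

Because only neighbours are compared, a long run of closely spaced values can chain into one cluster even if its first and last members differ by more than the tolerance; whether that is acceptable depends on the derivative computation that follows.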

 

Compilation with MKL_VSL gives compiler error


Hello

 

This is probably quite simple. When I use MKL modules like MKL_VSL within my own module, I get a compiler error saying "Error in opening the compiled module file. Check INCLUDE path".

 

However, I do have -I$(MKLROOT)/include when I am compiling my code. Am I missing something? I read somewhere that I might need an "include mkl_vsl.fi". However, if I put this into my own Fortran module I get other clashes, as I am then trying to include a module within a module.

 

 

Thanks

 

best

Ali

Intel® MKL version 2019 Update 5 is now available


Intel® Math Kernel Library (Intel® MKL) is a highly optimized, extensively threaded, and thread-safe library of mathematical functions for engineering, scientific, and financial applications that require maximum performance.

Intel MKL 2019 Update 5 packages are now ready for download.

Intel MKL is available as part of the Intel® Parallel Studio XE and Intel® System Studio. Please visit the Intel® Math Kernel Library Product Page.

For what's new in Intel MKL 2019 and in MKL 2019 Update 5, follow this link: https://software.intel.com/en-us/articles/intel-math-kernel-library-rele...

The MKL 2019 bug fix list is here: https://software.intel.com/en-us/articles/intel-math-kernel-library-2019...
