Channel: Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

error on description page


There is a line of code on the two-stage-algorithm-for-inspector-executor-sparse-blas-routines page that appears to be incorrect:

 

status = mkl_sparse_x_export_csr ( csrC, &indexing, &rows, &cols, &rows_start, &rows_end, &col_indx, &values);

MKL_INT nnz = rows_end[rows] - rows_start[0];

"rows_end[rows]" reads one element past the end of the array (uninitialized memory); "rows_end[rows - 1]" addresses the last element.

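For reference, a minimal sketch of the corrected pattern, assuming an existing double-precision CSR handle csrC (the four-array export has the same shape for the other data types):

/* Export the CSR arrays of csrC and compute the number of non-zeros
   without reading past the end of rows_end. */
sparse_index_base_t indexing;
MKL_INT rows, cols;
MKL_INT *rows_start = NULL, *rows_end = NULL, *col_indx = NULL;
double *values = NULL;

sparse_status_t status = mkl_sparse_d_export_csr(csrC, &indexing, &rows, &cols,
                                                 &rows_start, &rows_end, &col_indx, &values);
if (status == SPARSE_STATUS_SUCCESS) {
    /* rows_end has 'rows' elements, so its last valid index is rows - 1 */
    MKL_INT nnz = rows_end[rows - 1] - rows_start[0];
}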

Where is the preconditioning of the coefficient matrix A in FGMRES?


Hello, everyone.

I want to implement the ILUT-preconditioned FGMRES RCI solver for a large Poisson equation in a lab CFD code based on non-uniform Cartesian grids and a standard 7-point discretization scheme. I have read the MKL Developer Reference and the example dcsrilut_exampl2.f90, and I understand the RCI mechanism, the GMRES method, and ILU well, but I am still confused about the flow of the preconditioned FGMRES method.

According to my understanding, to use ILUT+FGMRES the user should first generate a CSR matrix for both A (csrA) and the preconditioning matrix B (csrL). Next, MKL invokes the preconditioned version of FGMRES when ipar(11)=1 is set, and then requests an additional matrix(csrL)-vector step (RCI_request = 3) to precondition the rhs, which presumably corresponds to B.inv() * b in GMRES.

What confuses me is: where is the preconditioning step for the coefficient matrix A? Shouldn't there be a B.inv() * A step in the (RCI_request = 3) branch (it is not clear to me whether right or left preconditioning is adopted in MKL)? Actually, I am expecting some code like:

                    call mkl_sparse_d_spmm(op, B.inv(), csrA, preconditioned_A_CSR)

following the preconditioning of the vector TMP(ipar(22)).

I also notice that the RCI_request = 1 branch performs a set of A*V_i operations with csrA as input, not csrA and csrL. So how the preconditioning of A is handled in FGMRES is not clear to me at all.

My guess is that MKL performs the preconditioning of A automatically after the preconditioning of the vector tmp(ipar(22)), and then overwrites the original A matrix under the same variable name, which would explain why the RCI_request = 1 part remains unchanged in the preconditioned version. Can anyone confirm this or explain it to me? Thanks a lot!

Lin  Yang
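For context, in the dfgmres examples shipped with MKL the preconditioner is applied one vector at a time inside the RCI_request = 3 branch (two triangular solves with the ILU factors), and the coefficient matrix csrA is never modified; RCI_request = 1 keeps multiplying by the original A. A C sketch along the lines of those examples (the names bilut, ibilut, jbilut, trvec and ivar follow the example code and are assumptions here; in C the zero-based ipar[21]/ipar[22] correspond to ipar(22)/ipar(23) in Fortran):

/* Sketched after the shipped dfgmres examples, not verbatim. ipar[21] and
   ipar[22] hold the offsets in tmp of the input and output vectors of the
   current request. */
if (RCI_request == 3) {
    char lo = 'L', up = 'U', nontrans = 'N', unitdiag = 'U', nonunit = 'N';
    /* Apply z = (L*U)^{-1} v with the ILUT factors via two triangular solves */
    mkl_dcsrtrsv(&lo, &nontrans, &unitdiag, &ivar, bilut, ibilut, jbilut,
                 &tmp[ipar[21] - 1], trvec);
    mkl_dcsrtrsv(&up, &nontrans, &nonunit, &ivar, bilut, ibilut, jbilut,
                 trvec, &tmp[ipar[22] - 1]);
}

In this scheme there is no explicit B.inv() * A product; the preconditioned Krylov vectors are built from these per-vector applications.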

 

 

cannot find libimf.so


Situation:

After installing Intel Composer XE 19 and setting the environment variables, I cannot use my internet connection in Ubuntu 18.04 or the printing system in openSUSE Leap 15.1. The error messages say that they (cups, the networking service, ...) cannot find libimf.so and other libraries belonging to Intel Composer. But if I install Intel Composer XE 19 without sudo/root, only within my user environment, the system works smoothly. I asked the same question on the openSUSE forum, and someone suggested that I shouldn't use LD_LIBRARY_PATH; some articles also say LD_LIBRARY_PATH is evil. For now I just avoid using LD_LIBRARY_PATH.

Question:

If I install Intel Composer XE 19 as administrator, how should I set the environment variables? The installation manual says to set them with "source /opt/intel/.../compilervars.sh intel64", but that causes a lot of trouble for me.

MKL function cblas_sgemv gives different results each time


Hi, I have used cblas_sgemv, but this function gives different results each time, even though I have checked that the inputs are always the same. Sometimes the result is correct (with 1e-6 L2-norm error compared to the correct result). Could someone tell me why this happens?

 

Thanks,

Cindy
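Run-to-run variation in threaded MKL calls is often discussed together with the Conditional Numerical Reproducibility (CNR) controls. A minimal sketch of requesting a reproducible code path before the call (whether CNR is actually what matters for the variation described above is an assumption, and the matrix here is just dummy data):

#include <stdio.h>
#include "mkl.h"

int main(void)
{
    /* CNR must be requested before any other MKL routine is called. */
    if (mkl_cbwr_set(MKL_CBWR_COMPATIBLE) != MKL_CBWR_SUCCESS)
        printf("CNR mode could not be set on this CPU\n");

    float a[2 * 3] = { 1, 2, 3, 4, 5, 6 };   /* 2x3 matrix, row-major */
    float x[3] = { 1, 1, 1 };
    float y[2] = { 0, 0 };

    /* y = 1.0 * A * x + 0.0 * y */
    cblas_sgemv(CblasRowMajor, CblasNoTrans, 2, 3, 1.0f, a, 3, x, 1, 0.0f, y, 1);
    printf("%f %f\n", y[0], y[1]);
    return 0;
}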

MKL's cblas_saxpy outputs incorrect results


Hi,

I need to add two arrays in an efficient way, so I tried MKL's saxpy.

When I use the cblas_saxpy function on two dummy arrays with all values initialized to 1 and 2 respectively, I get totally wrong results, and I couldn't figure out what's wrong with my code.

(I omitted other includes)

#include "mkl.h"

#define SIZE 10000

int main()
{
    float* buf_x = (float*) malloc(SIZE * sizeof(float));
    float* buf_y = (float*) malloc(SIZE * sizeof(float));

    for (int i = 0; i < SIZE; i++) {
        buf_x[i] = 1.f;
        buf_y[i] = 2.f;
    }

    cblas_saxpy((MKL_INT)SIZE, 1, buf_x, (MKL_INT)1, buf_y, (MKL_INT)1);

    for (int i = 0; i < SIZE; i++)
        printf("%f, ", buf_y[i]);

    return 0;
}

I get 204 as output instead of 3.

Can anybody shed some light on this?

Thank you.

pardiso_handle_store Segmentation fault


Hi,

I have been using `pardiso`, and everything works fine. I can successfully factorize a large sparse matrix and later solve systems using the factorization. Now I wish to save the factorization to a file, so I don't need to do the factorization every time I run the application. I believe `pardiso_handle_store` is the correct function to use, but I keep getting the `Segmentation fault (core dumped)` error.

The relevant code and the console output can be found in this gist: https://gist.github.com/hkalexling/f8e0a22d1a29569a4012f717de8c7798.

Any help would be appreciated!

Alex

Permutation of a large sparse matrix


Hi,

What is the fastest way of permuting a large sparse_matrix_t in csr or csc format?

I could either do manual permutations on the csr arrays or I could create a sparse permutation matrix and use the mkl_sparse_spmm method.

Either method seems suboptimal: with the former I don't benefit from parallelism, and with the latter I have to create additional arrays for the permutation matrix and make a new copy of the matrix.

Also, I notice that there might be performance differences between column and row permutations depending on whether the matrix is in csr or csc.

Is there a better way to do it?
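For what it's worth, a minimal sketch of the second option: row-permuting an n x n double-precision CSR matrix A by multiplying with a one-entry-per-row permutation matrix. All names here are illustrative; perm[i] is assumed to give the old row index that becomes new row i.

sparse_matrix_t P = NULL, PA = NULL;
MKL_INT *p_start = (MKL_INT *)mkl_malloc(n * sizeof(MKL_INT), 64);
MKL_INT *p_end   = (MKL_INT *)mkl_malloc(n * sizeof(MKL_INT), 64);
MKL_INT *p_cols  = (MKL_INT *)mkl_malloc(n * sizeof(MKL_INT), 64);
double  *p_vals  = (double  *)mkl_malloc(n * sizeof(double),  64);

for (MKL_INT i = 0; i < n; i++) {
    p_start[i] = i;            /* exactly one non-zero per row of P */
    p_end[i]   = i + 1;
    p_cols[i]  = perm[i];      /* row i of P selects old row perm[i] */
    p_vals[i]  = 1.0;
}

/* The handle references these arrays, so they must stay alive while P is in use. */
mkl_sparse_d_create_csr(&P, SPARSE_INDEX_BASE_ZERO, n, n, p_start, p_end, p_cols, p_vals);
mkl_sparse_spmm(SPARSE_OPERATION_NON_TRANSPOSE, P, A, &PA);   /* PA = P * A */

A column permutation would be the analogous product applied on the right.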

use of MKL spline functions strange behavior at second run time


Hi,

I have wrapped the code required to do an Akima spline interpolation in the attached source file. When I link my main application statically against MKL, it runs fine. However, with dynamic linking (i.e. with the MKL runtime libraries required at run time), I see strange behavior. First of all, I did not know which runtime libraries I needed, so the call to the dfdNewTask1D routine simply terminated my application; after several tries (running in console mode) I found that mkl_vml_avx2.dll and mkl_vml_p4.dll were required. I load these DLLs manually, my code runs fine, and I unload them afterwards. However, without exiting my app, if I run the same calculation again (after loading the same set of DLLs again), the call to dfdNewTask1D generates an exception and my application crashes (under the debugger, the call to dfdNewTask1D never returns because an exception is raised somewhere). The runtime DLLs I load/unload at/after every run are:

  • libimalloc.dll
  • libmmd.dll
  • libifcoremd.dll
  • libifportmd.dll
  • libiomp5md.dll
  • msvcr100.dll
  • mkl_vml_avx2.dll
  • mkl_vml_p4.dll
  • mkl_core.dll
  • mkl_sequential.dll

I don't know if I need other dlls or what is going wrong.

Best regards,

Phil.

Attachment: AkimaSpline.f90 (3.32 KB)

Serious memory leak problem of mkl_sparse_d_add subroutine


Hi,

I'm currently programming with the new sparse interface and I'm seeing a serious memory leak when the routine mkl_sparse_d_add is called several thousand times: it uses up all of my 64 GB of memory and the program cannot continue. I'm not sure whether other sparse routines have similar problems, but at least mkl_sparse_d_create_coo, mkl_sparse_convert_csr, and mkl_sparse_d_mv do not.

Please take a look at it, thank you very much!
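In case it helps narrow the report down, a minimal sketch of repeated additions where each result handle is explicitly released (csrA and csrB are assumed existing CSR handles; whether the original code already does this is unknown):

for (int k = 0; k < 10000; k++) {
    sparse_matrix_t csrC = NULL;
    if (mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE, csrA, 1.0, csrB, &csrC)
            != SPARSE_STATUS_SUCCESS)
        break;

    /* ... use csrC ... */

    /* The handle returned by mkl_sparse_d_add owns newly allocated storage and
       needs an explicit mkl_sparse_destroy, otherwise its memory accumulates. */
    mkl_sparse_destroy(csrC);
}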


dtrnlspbc_solve solution outside the allowed range


We are using a Trust Region MKL API, dtrnlspbc_solve, and related functions.
We use the optimization with constraints.
The objective function we wrote worked properly with your algorithm for years,
until a customer of ours reported a strange behaviour: a valid solution outside the allowed range!

We observed that, if we move the initial conditions a little bit, it works! Why can this happen?
Any suggestions?

 

Thank you

Gianluca
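As a side note, a minimal sketch for detecting the reported behaviour programmatically, by checking the returned solution against the same LW/UP arrays that were passed to dtrnlspbc_init (the variable names follow that parameter list and are assumptions here):

/* x, LW, UP and n are the solution, bounds and size used with dtrnlspbc_init/dtrnlspbc_solve */
for (MKL_INT i = 0; i < n; i++) {
    if (x[i] < LW[i] || x[i] > UP[i])
        printf("component %lld out of bounds: %g not in [%g, %g]\n",
               (long long)i, x[i], LW[i], UP[i]);
}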

Using MKL Features: MKL Direct Call, MKL JIT, MKL Compact API, MKL Batch API, MKL Packed API on the Single Dynamic Library

Why does a different thread count make no difference in performance?


Hi everyone,

I'm testing MKL using Visual Studio 2019 and MKL 2019.5 on an Intel i7-9750H CPU with 6 cores and 12 threads. I'm interested in the time consumed by the vector mathematics and FFT functions in MKL. As I understand it, for these two categories of functions the time consumed should decrease when the maximum thread count increases, but that didn't happen for the vector mathematics functions. I have tested the vcMul and vcAdd functions, and the time consumed is not much different between a thread count of 1 and 6. It's weird to me and I can't figure out a reason for it. Can anyone help me with it? The code is attached below, thanks very much!

 

////////////////////////////////

int N = 16384;
int M = 2000;

//#define FFTTEST
#define CMULTEST

int main(void)
{
    double clkfreq = mkl_get_clocks_frequency();

    unsigned MKL_INT64 startclk, endclk;
    double time;
    double time2[16384];
    int kk = 0;

    /* Execution status */
    MKL_LONG status = 0;

    DFTI_DESCRIPTOR_HANDLE hand = 0;

    //mkl_set_dynamic(0);
    //mkl_set_num_threads(1);
    int threadnum = mkl_get_max_threads();
    printf("Max thread count: %d\n", threadnum);
    printf("FFT length: %d  FFT count: %d\n", N, M);

    /* Pointer to input/output data */
    MKL_Complex8* x = 0;
    MKL_Complex8* y = 0;
    x = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    y = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    MKL_Complex8* x2 = 0;
    MKL_Complex8* y2 = 0;
    x2 = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    y2 = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    if (x == NULL) goto failed;

    init2(x, x2);
    vmlSetMode(VML_EP);
    mkl_get_cpu_clocks(&startclk);
    for (kk = 0; kk < M; kk++)
    {
        vcAdd(N, &x[N * kk], &x2[N * kk], &y[N * kk]);
    }
    mkl_get_cpu_clocks(&endclk);
    time = (double)(endclk - startclk) / (clkfreq * 1e9) * 1e6 / M;
    printf("Complex multiply: %f us\n", time);

    mkl_free(x);
    mkl_free(y);
    mkl_free(x2);
    mkl_free(y2);

failed:
    return 0;
}
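As a side-by-side experiment, a minimal sketch that times the same vector operation at several explicit thread counts in one run (it uses vdAdd and dsecnd purely for illustration; note that a 16384-element vector may simply be too small for threading to pay off, which is an interpretation, not a confirmed explanation):

#include <stdio.h>
#include "mkl.h"

int main(void)
{
    const MKL_INT n = 16384;
    const int reps = 2000;
    double *a = (double *)mkl_malloc(n * sizeof(double), 64);
    double *b = (double *)mkl_malloc(n * sizeof(double), 64);
    double *c = (double *)mkl_malloc(n * sizeof(double), 64);
    for (MKL_INT i = 0; i < n; i++) { a[i] = 1.0; b[i] = 2.0; }

    int max_threads = mkl_get_max_threads();
    for (int t = 1; t <= max_threads; t *= 2) {
        mkl_set_num_threads(t);
        double t0 = dsecnd();                /* MKL wall-clock timer */
        for (int rep = 0; rep < reps; rep++)
            vdAdd(n, a, b, c);
        double t1 = dsecnd();
        printf("%d thread(s): %.3f us per call\n", t, (t1 - t0) / reps * 1e6);
    }

    mkl_free(a); mkl_free(b); mkl_free(c);
    return 0;
}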

 

Unexpected DftiComputeForward failures using larger inputs


Hey,

I'm using MKL 2019u3 to compute 3D FFT.

This is the function:

void CheckMklFFT(const std::string &err_prefix, MKL_LONG status) {
        if (status != 0) {
            std::string error_message(DftiErrorMessage(status));
            std::cerr << err_prefix + error_message << std::endl;
            throw std::runtime_error(err_prefix + error_message);
        }
 }

std::shared_ptr<Ipp64fc> FFT3D(float *image, RPP::DataDimensions dims) {

        std::shared_ptr<Ipp64fc> image_fft = std::shared_ptr<Ipp64fc>(ippsMalloc_64fc((dims.x / 2 + 1) * dims.y * dims.z), [](Ipp64fc *x) { ippFree(x); });
        ippsZero_64fc(image_fft.get(), (dims.x / 2 + 1) * dims.x * dims.z);

        long lengths[] = {dims.z, dims.y, dims.x};
        long strides_in[] = {0, dims.x * dims.y, dims.x, 1};
        long strides_out[] = {0, (dims.x / 2 + 1) * dims.y, dims.x / 2 + 1, 1};

        std::string err_prefix = "Error: Failed with FFT due to ";

        // create descriptor
        DFTI_DESCRIPTOR_HANDLE Desc_Handle;
        CheckMklFFT(err_prefix, DftiCreateDescriptor(&Desc_Handle, DFTI_SINGLE, DFTI_REAL, 3, lengths));

        CheckMklFFT(err_prefix, DftiSetValue(Desc_Handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE));
        CheckMklFFT(err_prefix, DftiSetValue(Desc_Handle, DFTI_PACKED_FORMAT, DFTI_CCE_FORMAT));
        CheckMklFFT(err_prefix, DftiSetValue(Desc_Handle, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX));
        CheckMklFFT(err_prefix, DftiSetValue(Desc_Handle, DFTI_INPUT_STRIDES, strides_in));
        CheckMklFFT(err_prefix, DftiSetValue(Desc_Handle, DFTI_OUTPUT_STRIDES, strides_out));
        CheckMklFFT(err_prefix, DftiCommitDescriptor(Desc_Handle));

        // direct FFT
        CheckMklFFT(err_prefix, DftiComputeForward(Desc_Handle, image, image_fft.get()));

        // Free descriptor
        CheckMklFFT(err_prefix, DftiFreeDescriptor(&Desc_Handle));

        return image_fft;
    }

My problem occurs when I use a very large array (for example, an image with dimensions 814x814x814: with float values the array size is about 2^31 bytes, and when moving to the Fourier domain (complex numbers) the array grows to ~2^32 bytes).
I compile on Linux system with Intel (R) Xeon (R) CPU E5-4627 v3 @ 2.60GHz processor and add the following flags:

  -DMKL_ILP64 -lmkl_intel_ilp64 -lmkl_core -lmkl_intel_thread

However, as soon as I get to DftiComputeForward, an error value is returned with the following error message:
Intel MKL DFTI ERROR: Inconsistent configuration parameters.
I've read on the forum that someone had a similar problem and that it was fixed in MKL 11 update 5; link to the post:

Unexpected DftiCommitDescriptor failures/interactions using larger inputs with MKL 11.1

Anyone know how to deal with the problem?

Thank you :)

Issue with mkl_sparse_z_export_csr


Hello,

I use mkl_sparse_z_export_csr to export a CSR handle from the internal representation. I double-checked the exported values and they are all correct. However, when I use the following code to convert "Values" to the conjugate of "Values", everything becomes zero (even the real part, which I didn't touch)! Any idea what is going wrong? I am using the latest version of Intel MKL 2019. Thank you for your help.

 

sparse_matrix_t C_CSR_Handle = NULL;

/* First, I use the mkl_sparse_spmm() function to multiply two matrices and save the result in C_CSR_Handle */
mkl_sparse_spmm(SPARSE_OPERATION_NON_TRANSPOSE, A_CSR_Handle, B_CSR_Handle, &C_CSR_Handle);

sparse_index_base_t indexing = 0;
MKL_INT rows, cols;
MKL_INT *JA= NULL, *PointerE = NULL, *IA= NULL;
MKL_Complex16 *Values= NULL;

sparse_index_base_t indexing1 = 0;
MKL_INT rows1, cols1; 
MKL_INT *JA1= NULL, *PointerE1 = NULL, *IA1= NULL; 
MKL_Complex16 *Values1= NULL;

mkl_sparse_z_export_csr(C_CSR_Handle, &indexing, &rows, &cols, &JA, &PointerE, &IA, &Values);
mkl_sparse_z_export_csr(C_CSR_Handle, &indexing1, &rows1, &cols1, &JA1, &PointerE1, &IA1, &Values1);

for (i = 0; i < JA[rows]; i++) {
   printf("values(%i) = %f , %f \n", i, Values[i].real, Values[i].imag); // print values before making any changes
   Values[i].imag = Values[i].imag*(-1); 
   Values1[i].imag = Values1[i].imag*(-1); 
   printf("values(%i) = %f , %f \n", i, Values[i].real, Values[i].imag); // print values after modification
}
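For comparison, a minimal sketch of the same conjugation using the four-array names from the export reference, with the non-zero count taken from rows_end rather than from the first exported array (whether this relates to the zeroed values above is an assumption):

sparse_index_base_t idx;
MKL_INT nrows, ncols;
MKL_INT *rows_start = NULL, *rows_end = NULL, *col_indx = NULL;
MKL_Complex16 *vals = NULL;

if (mkl_sparse_z_export_csr(C_CSR_Handle, &idx, &nrows, &ncols,
                            &rows_start, &rows_end, &col_indx, &vals)
        == SPARSE_STATUS_SUCCESS) {
    MKL_INT nnz = rows_end[nrows - 1] - rows_start[0];   /* zero-based indexing assumed */
    for (MKL_INT i = 0; i < nnz; i++)
        vals[i].imag = -vals[i].imag;                    /* conjugate in place */
}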

 


IMKL AVX2 DFT slower in 2019.0.4 than in 2017.0.3


Hi,

I've recently upgraded from IMKL 2017.0.3 (w/ compiler icpc 2017u4) to IMKL 2019.0.4 (w/ compiler icpc 2019u4) and noticed that one of my programs takes ~50% longer to run. Using callgrind and perf top, I've traced the issue down to the amount of time/cycles spent in DftiComputeBackward on a complex DFT. The DFT size is small (8192), and in both cases the inverse DFT is called ~24 million times; however, in IMKL 2019 a significant amount of CPU time is spent in the following methods:

compute_colbatch_bwd

mkl_dft_avx2_coDFTColTwid_Compact_Bwd_v_16_s

mkl_dft_avx2_coDFTColBatch_Compact_Bwd_v_32_s

 

Whereas in IMKL 2017 the inverse DFT time is spent in:

mkl_dft_avx2_compute_bwd_s_c2c_1d_o

mkl_dft_avx2_xipps_inv_rev_32fc

mkl_dft_avx2_ippsDFTOutOrdInv_CToC_32fc

mkl_dft_avx2_ippsFFTInv_CToC_32fc

 

The program I'm running these DFTs in is relatively large; however, >70% of the program's cycles are spent on these calls. I have attempted to reproduce the problem in a simple program that just calls DftiComputeBackward repeatedly in a loop with similar parameters, but I am unable to reproduce the issue. I was wondering if someone could shed some light on the differences in the underlying DFT functions and on why the 2019 version spends so much time in compute_colbatch_bwd while it does not even come up in the 2017 version. Any help would be appreciated. For what it's worth, I am compiling on an AVX2 platform with -xHost and -O3. My program uses ~1000 DFTI descriptors to compute the DFTs repeatedly for ~1000 different data channels.

 

Thanks,

Nick
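Not a fix, but when comparing two builds it can help to confirm at run time which MKL the binary actually loaded; a minimal sketch using mkl_get_version_string (its relevance to the slowdown itself is only an assumption):

#include <stdio.h>
#include "mkl.h"

int main(void)
{
    char version[256];
    /* Reports the MKL build this process is linked against at run time. */
    mkl_get_version_string(version, (int)sizeof(version));
    printf("%s\n", version);
    return 0;
}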

Segmentation fault in vzMul on large arrays


Hello,

I have a problem with complex multiplication on large arrays. The following code leads to a segmentation fault in vzMul:

const MKL_INT cLength = std::pow(2, 30) + 1; /* MKL_INT = int = int32 */
const size_t cLengthInBytes = cLength * sizeof(MKL_Complex16);
const int cAlignment = 64;
MKL_Complex16* pDataComplex1 = static_cast<MKL_Complex16*>(mkl_malloc(cLengthInBytes, cAlignment));
MKL_Complex16* pDataComplex2 = static_cast<MKL_Complex16*>(mkl_malloc(cLengthInBytes, cAlignment));
MKL_Complex16* pDataComplex3 = static_cast<MKL_Complex16*>(mkl_malloc(cLengthInBytes, cAlignment));
  
for (uint32_t i = 0; i < cLength; i++){
  pDataComplex1[i].real = 1.0;
  pDataComplex1[i].imag = 0.0;
  pDataComplex2[i].real = 0.0;
  pDataComplex2[i].imag = 1.0;
}
  
vzMul(cLength, pDataComplex1, pDataComplex2, pDataComplex3);
  
mkl_free(pDataComplex1);
mkl_free(pDataComplex2);
mkl_free(pDataComplex3);

According to the definition of MKL_INT as int32, it should be possible to pass arrays of length up to int32 max = 2^31 - 1. There isn't any limitation mentioned in the documentation. The same issue seems to exist for vzAbs.

I use MKL Version 2019.0.1 Build 20180928
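One thing worth ruling out first (a hypothetical check, not a diagnosis): three buffers of roughly 16 GB each can exhaust memory, mkl_malloc then returns NULL, and passing NULL buffers to vzMul would also crash. A fragment that would slot in right after the allocations above:

/* assumes <cstdio> for fprintf */
if (pDataComplex1 == NULL || pDataComplex2 == NULL || pDataComplex3 == NULL) {
    fprintf(stderr, "mkl_malloc failed for %zu bytes per buffer\n", cLengthInBytes);
    return 1;
}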

Valgrind IntelOpenMP 2018.0.3 issue


Hi,

I'm using IntelMKLML 2019.0.5, and I got several "definitely lost" memory leak notifications from Valgrind, as attached below.

Can I ignore (suppress) them, or do I need to do something?

Thank you in advance!

Regards,

Kyungsoo

------------------------------------------------------------------------------

==12495== HEAP SUMMARY:
==12495==     in use at exit: 7,875 bytes in 303 blocks
==12495==   total heap usage: 35,632 allocs, 35,329 frees, 2,208,839 bytes allocated
==12495==
==12495== 21 bytes in 1 blocks are definitely lost in loss record 5 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBDF9F: ???
==12495==    by 0xDDBDFCF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 21 bytes in 1 blocks are definitely lost in loss record 6 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBF49F: ???
==12495==    by 0xDDBF4CF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 21 bytes in 1 blocks are definitely lost in loss record 7 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBCA9F: ???
==12495==    by 0xDDBCACF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 42 bytes in 2 blocks are definitely lost in loss record 10 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDD76D9F: ???
==12495==    by 0xDD76DCF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 42 bytes in 2 blocks are definitely lost in loss record 11 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDD95F9F: ???
==12495==    by 0xDD95FCF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 63 bytes in 3 blocks are definitely lost in loss record 13 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBD19F: ???
==12495==    by 0xDDBD1CF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 84 bytes in 4 blocks are definitely lost in loss record 14 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDD96D9F: ???
==12495==    by 0xDD96DCF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 105 bytes in 5 blocks are definitely lost in loss record 15 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBD89F: ???
==12495==    by 0xDDBD8CF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 147 bytes in 7 blocks are definitely lost in loss record 19 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDD7669F: ???
==12495==    by 0xDD766CF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 252 bytes in 12 blocks are definitely lost in loss record 21 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBE69F: ???
==12495==    by 0xDDBE6CF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 420 bytes in 20 blocks are definitely lost in loss record 22 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBFB9F: ???
==12495==    by 0xDDBFBCF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 483 bytes in 23 blocks are definitely lost in loss record 24 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDD9749F: ???
==12495==    by 0xDD974CF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 924 bytes in 44 blocks are definitely lost in loss record 25 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDD9669F: ???
==12495==    by 0xDD966CF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== 1,575 bytes in 75 blocks are definitely lost in loss record 26 of 27
==12495==    at 0x4028120: malloc (vg_replace_malloc.c:299)
==12495==    by 0xD37D66C: __intel_sse2_strdup (in /workspace/lib/libiomp5.so)
==12495==    by 0xDDBED9F: ???
==12495==    by 0xDDBEDCF: ???
==12495==
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:__intel_sse2_strdup
   obj:*
   obj:*
}
==12495== LEAK SUMMARY:
==12495==    definitely lost: 4,200 bytes in 200 blocks
==12495==    indirectly lost: 0 bytes in 0 blocks
==12495==      possibly lost: 0 bytes in 0 blocks
==12495==    still reachable: 3,675 bytes in 103 blocks
==12495==         suppressed: 0 bytes in 0 blocks

IntelMKLML(Intel OpenMP) Valgrind memory leak


Hi,

I'm using IntelMKLML 2019.0.5 with Valgrind. Whenever I run my program under Valgrind, I get reports like those attached below.

Are they false positives? If not, could you please let me know what went wrong?

Thanks in advance!

---------------------------------------------------

(The attached Valgrind output is identical to the log in the previous post: the same heap summary, "definitely lost" records in __intel_sse2_strdup from libiomp5.so, and leak summary.)

Invalid parameters during initialization of the Nonlinear Least Squares Problem without Constraints


Hi,

I wrote a not-so-complicated piece of code to solve a nonlinear equation, but unfortunately at initialization I always receive an error about the input parameters. Please tell me what's wrong; the code is below. When I comment out the result check after the strnlsp_init function, the strnlsp_check function succeeds.

#include <iostream>
#include <vector>
#include <iomanip>

#include "mkl_rci.h"
#include "mkl_types.h"
#include "mkl_service.h"


int main()
{
    std::vector<float> fjac = {
        0,1.0,
        1,1.0,
        2,1.0,
        4,1.0,
        5,1.0
    };

    std::cerr << "size fjac = "<< fjac.size() << std::endl;

    /* n - number of function variables
       m - dimension of function value */
    MKL_INT n = 2, m = 5;

    std::cerr << "n = "<< n << std::endl;
    std::cerr << "m = "<< m << std::endl;

    std::vector<float> fvec = {
       2.1,
       2.4,
       2.6,
       2.8,
       3.0
    };
    std::cerr << "size fvec = "<< fvec.size() << std::endl;

    std::vector<float> x ={
       0.0,
       0.0
    };
    std::cerr << "size x = "<< x.size() << std::endl;

    _TRNSP_HANDLE_t handle = nullptr;   // TR solver handle

    /* results of input parameter checking */
    MKL_INT info[6];

    /* precisions for stop-criteria (see manual for more details) */

    std::vector< float > eps;
    eps.resize(6);
    /* set precisions for stop-criteria */
    for (int32_t i = 0; i < static_cast<int32_t>(eps.size()); ++i)
    {
        eps[i] = 0.00001;
    }
    /* iter1 - maximum number of iterations
       iter2 - maximum number of iterations of calculation of trial-step */
    MKL_INT iter1 = 1000, iter2 = 100;
    /* initial step bound */
    float rs = 0.0;

    MKL_INT res;

    if(m >= n){
        std::cerr << "YES\n";
    }
    else{
        std::cerr << "NO\n";
    }

    res = strnlsp_init(&handle,
                       &n, &m,
                       x.data(),
                       eps.data(),
                       &iter1, &iter2,
                       &rs) ;

    std::cerr << "res = "<< res << std::endl;

    if(res != TR_SUCCESS)
    {
        if(res == TR_INVALID_OPTION){
           std::cerr << "there was an error in the input parameters.\n";
        }
        if(res == TR_OUT_OF_MEMORY){
            std::cerr << "there was a memory error.\n";
        }

        /* if function does not complete successfully then print error message */
        std::cerr << "| error in dtrnlsp_init"<< std::endl;
        /* Release internal Intel(R) MKL memory that might be used for computations         */
        /* NOTE: It is important to call the routine below to avoid memory leaks   */
        /* unless you disable Intel(R) MKL Memory Manager                                   */
        MKL_Free_Buffers ();
        return -1;
    }

    /* Checks the correctness of handle and arrays containing Jacobian matrix,
           objective function, lower and upper bounds, and stopping criteria. */
    if (strnlsp_check (&handle, &n, &m, fjac.data(), fvec.data(), eps.data(), info) != TR_SUCCESS)
    {
        std::cerr << "info:\n";
        for(int32_t i = 0; i < 6; ++i){
            std::cerr << info[i] << ",";
        }
        std::cerr << std::endl;
        /* if function does not complete successfully then print error message */
        std::cerr << "| error in dtrnlsp_init\n"<< std::endl;
        /* Release internal Intel(R) MKL memory that might be used for computations         */
        /* NOTE: It is important to call the routine below to avoid memory leaks   */
        /* unless you disable Intel(R) MKL Memory Manager                                   */
        MKL_Free_Buffers ();
        /* and exit */
        return -1;
    }

    std::cerr << "info:\n";
    for(int32_t i = 0; i < 6; ++i){
        std::cerr << info[i] << ",";
    }
    std::cerr << std::endl;

/* solve code */


    return 0;
}

 
