main.txt
I built the package on Windows 10 (Intel(R) Core(TM) i5-8265U CPU, 8 GB RAM) with the command cmake -LAH -G "Ninja" -B build_slu -S superlu_mt -DPLAT="_OPENMP" -DBUILD_SHARED_LIBS=OFF -Denable_tests=OFF -Denable_examples=OFF. The toolchain is the Intel oneAPI C/C++ compiler with Intel MKL BLAS/LAPACK and OpenMP. I linked the METIS ordering the same way as it is done in the sequential version: https://github.com/ChessMastery/superlu_mt. I tested the build on several matrices and found that some of them run slower with more threads. The calling program is in the attached file.
raefsky4 from the SuiteSparse Matrix Collection (unsymmetric real): nprocs=1: 1 sec, nprocs=4: 70 sec. Options: permc_spec=MMD_AT_PLUS_A, pivoting_threshold=1.0 (default for the pdgssv driver). This is the combined factorization and solution time.
power197k from SuiteSparse: nprocs=1: 25 sec, nprocs=4: 30 sec. Options: ordering=METIS_AT_PLUS_A, pivoting_threshold=1.0.
wang3 from SuiteSparse: nprocs=1: 1.7 sec, nprocs=4: 5 sec. Options: ordering=METIS_AT_PLUS_A, pivoting_threshold=1.0.
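To put the three measurements on a common scale, here is a small sketch of the speedup arithmetic (time with nprocs=1 divided by time with nprocs=4). It only restates the timings listed above; the names timings, speedup and speedups are my own and are not part of SuperLU_MT or of the attached program.

```python
# Wall-clock times in seconds as reported above: (nprocs=1, nprocs=4).
timings = {
    "raefsky4": (1.0, 70.0),
    "power197k": (25.0, 30.0),
    "wang3": (1.7, 5.0),
}


def speedup(t_serial, t_parallel):
    """Speedup of the multi-threaded run relative to the single-threaded run."""
    return t_serial / t_parallel


# Speedup per matrix; a value below 1.0 means the 4-thread run was slower.
speedups = {name: speedup(t1, t4) for name, (t1, t4) in timings.items()}

for name, s in speedups.items():
    # Parallel efficiency is the speedup divided by the thread count (4 here).
    print(f"{name}: speedup = {s:.3f}, efficiency = {s / 4:.3f}")
```

All three speedups come out below 1, so on these matrices the 4-thread runs are slower than the single-thread runs, with raefsky4 by far the worst case.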
Is such behaviour normal for SuperLU_MT, or is it not supposed to happen?