-
Notifications
You must be signed in to change notification settings - Fork 4
Open
Description
Encountered this case where the performance of multiplication seemed to have really poor scaling:
using FillArrays, SemiseparableMatrices, BandedMatrices, LazyArrays
function gen_example(n)
A = BandedMatrix(0 => [1; 1:n-1], 1 => Fill(-1.0, n - 1))
B = ApplyMatrix(*, ones(n), OneElement(1.0, (1, n), (1, n)))
AB = AlmostBandedMatrix(A, B)
U = UpperTriangular(AB)
L = LowerTriangular(brand(n, n, 2, 0))
stats = @timed U * L
return stats.time, stats.bytes
end
n = 2 .^ (0:11)
time_byts = gen_example.(n)12-element Vector{Tuple{Float64, Int64}}:
(2.7e-6, 64)
(3.0e-7, 96)
(1.4e-6, 704)
(2.7e-6, 3648)
(1.44e-5, 16512)
(9.14e-5, 69760)
(0.0005424, 286768)
(0.0045723, 1163312)
(0.0310835, 4685872)
(0.240178, 18808880)
(1.9579295, 75366448)
(41.1409489, 301727792)If I replace the stats = ... line with literally stats = @timed Matrix(U) * Matrix(L) so that all the structure is lost,
12-element Vector{Tuple{Float64, Int64}}:
(0.0068448, 192)
(0.0004204, 288)
(6.15e-5, 704)
(1.53e-5, 2112)
(1.39e-5, 7424)
(0.0007202, 26880)
(0.0003808, 102416)
(0.0053296, 401424)
(0.0047358, 1589264)
(0.0082311, 6324240)
(0.0461804, 25231376)
(2.2484057, 100794384)If I instead keep stats = @timed U * L but do AB = Matrix(A + B) then
12-element Vector{Tuple{Float64, Int64}}:
(0.0019402, 64)
(3.8e-6, 96)
(1.0e-6, 192)
(1.7e-6, 576)
(3.5e-6, 2176)
(0.0004226, 8320)
(0.000128, 32816)
(0.0002281, 131120)
(0.0006641, 524336)
(0.0032373, 2097200)
(0.0220077, 8388656)
(0.1559884, 33554480)If I do AB = A + B, the timing is instant since the multiplication is left lazy.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels