This example from Discourse shows a slowdown when broadcasting a moderately complicated expression, compared with broadcasting a function containing the same expression:
using BenchmarkTools

arrayfun!(C, A, B) = @. C = A^2 + B^2 + A * B + A / B - A * B - A / B + A * B + A / B - A * B - A / B
scalarfun(A::Real, B::Real) = A^2 + B^2 + A * B + A / B - A * B - A / B + A * B + A / B - A * B - A / B
let N = 151
    A, B, C1, C2 = (rand(N, N, N) .+ 1 for _ in 1:4)
    @btime arrayfun!($C1, $A, $B)
    @btime $C2 .= scalarfun.($A, $B)
    C1 ≈ C2
end
# 17.306 ms (11 allocations: 352 bytes)
# 5.900 ms (0 allocations: 0 bytes)
The effect seems fairly robust: it is not particular to 3D arrays, nor to A^2.
Replacing @. with explicit .+ etc. helps a bit: the allocations disappear, although the time does not improve (according to #29120 this removes the n-ary +, here n<=4):
arrayfun!(C, A, B) = C .= A.^2 .+ B.^2 .+ A .* B .+ A ./ B .- A .* B .- A ./ B .+ A .* B .+ A ./ B .- A .* B .- A ./ B
# 17.345 ms (0 allocations: 0 bytes)
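The n-ary point can be checked at the parser level. The following is a minimal sketch using only `Meta.parse` (the variable names `nary` and `nested` are illustrative): a chain of plain `+` parses as one call with many arguments, while a chain of explicit `.+` parses as nested two-argument calls. It says nothing about where the remaining slowdown comes from.

```julia
# A chain of plain `+` parses as a single n-ary call: +(a, b, c, d).
nary = Meta.parse("a + b + c + d")

# A chain of explicit `.+` parses as nested binary calls: ((a .+ b) .+ c) .+ d.
nested = Meta.parse("a .+ b .+ c .+ d")

# Inspect the two expression trees.
dump(nary)
dump(nested)
```

Since `@.` adds dots to the already-parsed expression, the n-ary `+` call is what gets broadcast in the first version of `arrayfun!`.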
Simpler expressions also have the slowdown but no allocation:
arrayfun!(C, A, B) = @. C = A^2 + B^2 + A * B + A / B
scalarfun(A::Real, B::Real) = A^2 + B^2 + A * B + A / B
# 3.148 ms (0 allocations: 0 bytes)
# 971.000 μs (0 allocations: 0 bytes)
Even simpler expressions like arrayfun!(C, A, B) = @. C = A^2 + B^2 show no slowdown at all.
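To make the difference between the two forms concrete, here is a rough sketch of what each one hands to the broadcasting machinery, built by hand with `Base.Broadcast.broadcasted` and `materialize!`. It is an approximation rather than the exact lowering (for instance, the real lowering of `A^2` goes through `Base.literal_pow`), and the names `nested`, `single`, and `f` are illustrative only.

```julia
using Base.Broadcast: broadcasted, materialize!

A = rand(4, 4) .+ 1
B = rand(4, 4) .+ 1

# Approximately what `@. C = A^2 + B^2 + A * B + A / B` builds:
# a nested tree of lazy Broadcasted objects, one per operator.
nested = broadcasted(+,
    broadcasted(^, A, 2),
    broadcasted(^, B, 2),
    broadcasted(*, A, B),
    broadcasted(/, A, B))

# Approximately what `C .= f.(A, B)` builds:
# a single Broadcasted node wrapping one scalar function.
f(a::Real, b::Real) = a^2 + b^2 + a * b + a / b
single = broadcasted(f, A, B)

# Both are evaluated in one fused loop over the destination.
C1 = similar(A)
C2 = similar(A)
materialize!(C1, nested)
materialize!(C2, single)

println(C1 ≈ C2)
```

Both forms compute the same values; the timings above suggest the compiler handles the single-function form better once the nested tree gets deep enough.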