-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
OffsetArray support for cat/vcat/hcat #37629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| pos = first(axes(B, 1)) | ||
| if dense | ||
| for k=1:nargs | ||
| for k=eachindex(A) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spaces around =
03bf04f to
a61b4e0
Compare
This comment has been minimized.
This comment has been minimized.
| end | ||
| B = similar(A[1], T, nrows, ncols) | ||
| pos = 1 | ||
| B = similar(first(A), T, nrows, ncols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How difficult would it be to adapt this to have B = similar(first(A), T, axes(first(A),1), axes(A,1)), so that all the offsets get propagated?
I guess that's only the case of a vector of vectors; when some elements are matrices you'd have to give up on the 2nd axis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand your comment correctly.
As explained in JuliaArrays/OffsetArrays.jl#63 (comment), a meaningful offset propagation is only possible for very restricted conditions. This restriction doesn't make cats of OffsetArrays indeed useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I didn't see the issue, let me look there first.
timholy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for tackling this! The overall design seems fine.
| p1 = pos + Int(length(Vk))::Int - 1 | ||
| a[pos:p1] = Vk | ||
| pos = p1+1 | ||
| n = length(Vk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| n = length(Vk) | |
| n = Int(length(Vk))::Int |
because this often gets called with poor inference. Try @code_warntype Base._typed_vcat(Float64, AbstractVector[[1,2,3], [4.0]]).
Not putting in these annotation re-opens us to all sorts of code invalidation, see https://julialang.org/blog/2020/08/invalidations/.
This isn't a general pattern you need to start adopting in all your Julia code, but this is a very low-level function called by lots of things (all of which would be invalidated if this is invalidated) and from its signature alone you might guess it's a bit of an inference nightmare.
I'm not adding comments like these elsewhere in this function, but you'll probably need to add other annotations to the initialization of pos above. Ideally, it would be great to check with SnoopCompile and loading various packages like Makie to see if you've greatly increased the number of invalidations.
| for i = 1:N | ||
| if concat[i] | ||
| inds[i] = offsets[i] .+ cat_indices(x, i) | ||
| inds[i] = offsets[i] .+ parent(cat_indices(x, i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This strips the offsets and uses linear indexing when computing indices. parent(r::IdOffsetRange) = r.parent.
This could be fragile and a little bit tricky, though. Phehaps there's a better way to make things work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would generally try to stay away from assumptions about how x might be structured if at all possible...
| ncols = size(A[1], 2) | ||
| for j = 2:nargs | ||
| ncols = size(first(A), 2) | ||
| for j = first(axes(A))[2:end] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| for j = first(axes(A))[2:end] | |
| for j = firstindex(A)+1:lastindex(A) |
|
|
||
| const BASE_TEST_PATH = joinpath(Sys.BINDIR, "..", "share", "julia", "test") | ||
| isdefined(Main, :OffsetArrays) || @eval Main include(joinpath($(BASE_TEST_PATH), "testhelpers", "OffsetArrays.jl")) | ||
| isdefined(Main, :OffsetArrays) || @eval Main @everywhere include(joinpath($(BASE_TEST_PATH), "testhelpers", "OffsetArrays.jl")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for this change? Could this cause problems in tests running in other workers? What if it defines methods that are supposed to fail in those tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure about this, though.
Without this Distributed test is broken:
From worker 10: ERROR: LoadError: LoadError: TaskFailedException
From worker 10:
From worker 10: nested task error: On worker 25:
From worker 10: UndefVarError: OffsetArrays not defined
From worker 10: Stacktrace:
From worker 10: [1] deserialize_module
From worker 10: @ /buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:957
From worker 10: [2] handle_deserialize
From worker 10: @ /buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:856
From worker 10: [3] deserialize
From worker 10: @ /buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:774
Adding this helps remove this error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
Line 4 in e378767
| using .Main.OffsetArrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's already using .Main.OffsetArrays one line down below on this.
I think it's because eachindex now requires OffsetArrays.IdOffsetRange and triggers the error, while for the old version of OffsetArrays it uses IdentityUnitRange and doesn't require anything in OffsetArrays.
It would be better to get #37643 first and I'll drop the OffsetArrays upgrade commit here.
|
|
||
| using Base: Indices, IndexCartesian, IndexLinear, tail | ||
| using Base: Indices, tail, @propagate_inbounds | ||
| @static if !isdefined(Base, :IdentityUnitRange) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always defined on master, which is what this PR targets.
I didn't go through the rest carefully, is the idea here to basically "sync" the version here with that in OffsetArrays.jl? Seems reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codecov Report
@@ Coverage Diff @@
## master #37629 +/- ##
==========================================
+ Coverage 87.53% 87.57% +0.04%
==========================================
Files 362 362
Lines 71830 71888 +58
==========================================
+ Hits 62873 62953 +80
+ Misses 8957 8935 -22
Continue to review full report at Codecov.
|
|
Gentle bump, would be good to move this along |
When doing concatenation OffsetArrays are treated as plain lists and offsets are ignored.
This support for
OffsetArraysonly works onOffsetArrays@1whereIdOffsetRangeis defined.Blocked by #37643, the first commit will be removed after that get resolved.
closes #37493
closes #37628
cc: @timholy @jishnub