-
Notifications
You must be signed in to change notification settings - Fork 108
Add GraphsSharedArraysExt
and remove Distributed
dependency
#430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
b44e9a0
to
3476c5f
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #430 +/- ##
=========================================
Coverage ? 97.16%
=========================================
Files ? 121
Lines ? 7023
Branches ? 0
=========================================
Hits ? 6824
Misses ? 199
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Looks like 1.6 / 1.9 have a different notion of "canonical" for I probably need to disable the test on Julia <1.9 |
I don't want to be too negative here - but I am not sure if this is something we want right now. My reasons for this:
If there is really the wish to have weak dependencies I am open to it - but first we should reach a consensus on that - so we should discus it in a Github issue. |
Disabling tests for the lower bound version cannot be an option. We could raise the lower bounds to the latest long-term support release though - that would be v1.10. We can do that in a separate PR though. |
It's just a formatting check though, which is implemented differently on v1.6 vs. 1.9 / 1.10. The |
I think we are due for bumping the min version of this package anyway so I created a PR: #432 - let's see if someone objects. |
Wouldn't this strictly decrease loading time? And decrease compilation time as well? It is a small change for this specific package, but for packages that depend on Graphs.jl (and other packages) this can start adding up. It is why DiffEq.jl split out all of its solvers in separate packages. e.g. running master: this pr:
But happening to have few dependencies makes the life of downstream package maintainers betters. If Graphs unconditionally depends on Distributed, then other packages that have extensions for Distributed will run precompilation for the extensions. Changes like this are overall beneficial for the user because way less unnecessary precompilation will be happening. |
How will it become more difficult?
There's no downside to having fewer dependencies? The fewer dependencies Graphs.jl has, the lower the barrier is to some other package deciding to take a dependency on Graphs.jl. |
As the maintainer of a package that uses Graphs.jl but never Distributed or SharedArrays, I support this change. We should also move Inflate.jl to an extension, it pulls in binary dependencies that most users wont use. |
Are you running this on some kind of beefed up cluster? For me it takes around 6 to 7 seconds to import Also, what does I have not figured out how to invalidate all precompiled code - but when running this multiple times the loading time varies by quite a bit and sometimes one branch is faster and the other time the other. So unless you repeated this experiment multiple times I am not sure if we really have a speed up of 7.4%. |
The runs above were on a cold laptop that was top-of-the-line in 2019, in performance mode, and are admittedly difficult to reproduce. The runs below are on a well-cooled steady-state desktop that is not throttling. The difference between what the two of us are getting above are probably loading vs precompiling, so below I separate those. On julia 1.12-beta3 Either way though, this testing should not matter much -- putting things in extensions inherently leads to less precompilation and less loading. The improvements can be very minimal (and they are negligible in this particular case), but the improvements should be unconditional. More importantly, this is particularly useful for downstream libraries that depend of Graphs. I like the slow push for more extensions and conditional dependencies in the ecosystem as a whole, because that makes my life as a developer of a library with many dependencies easier. loading time comparison (no precompilation): very minor improvement in timing and allocationsthis PR
master:
precompilation of the entire environment: very minor improvement in allocations, negligible improvement in timingthis pr:
master:
precompilation of just Graphs.jl: no improvementthis pr:
master:
The |
@Krastanov Thanks, these are some very solid benchmarks.
Inflate.jl is a very small package. It only pulls in binary dependencies for tests but otherwise it is pure Julia code without any dependencies. If I remember correctly it was even created so that we can use it in this package. I do agree that there are downsides to having dependencies, for example versioning issues and the compilation of code that we never use. But there are also advantages if we don't have to implement everything ourselves. The extension system that Julia uses is rather clumsy - it works mostly if one wants to create a specialization for a function or a type declared in another package. But when that is not the case we require users to write [compat]
Graphs = {version: "1.8", features: ["parallel"]} and then Graphs would add the correct packages for the We had issues with this in the past in GraphIO.jl as we have multiple extensions there. This has confused users of that package. So I am still critical of this but a lot of people seem to think its a good idea so I would propose this:
@topolarity Let's merge #429 first so that the new |
On the subject of loading time - this issue #244 reminded me that there is a julia> @time_imports using Graphs
Precompiling Graphs finished.
16 dependencies successfully precompiled in 21 seconds. 16 already precompiled.
13.5 ms MacroTools
1.2 ms SimpleTraits
0.8 ms Printf
32.8 ms Dates
1.0 ms TOML
1.0 ms Preferences
0.2 ms PrecompileTools
1.3 ms StaticArraysCore
228.6 ms StaticArrays
2.8 ms ArnoldiMethod
1.1 ms Statistics
0.3 ms StaticArrays → StaticArraysStatisticsExt
0.4 ms UUIDs
0.4 ms Compat
0.2 ms Compat → CompatLinearAlgebraExt
0.9 ms Inflate
6.6 ms OrderedCollections
78.2 ms DataStructures
┌ 0.8 ms SuiteSparse_jll.__init__()
11.0 ms SuiteSparse_jll 87.02% compilation time
1.4 ms Serialization
┌ 20.0 ms SparseArrays.CHOLMOD.__init__() 97.90% compilation time
273.7 ms SparseArrays 7.15% compilation time
0.4 ms Statistics → SparseArraysExt
┌ 0.0 ms Distributed.__init__()
48.0 ms Distributed 68.80% compilation time
0.9 ms Mmap
4.6 ms SharedArrays
22.1 ms Graphs |
Thanks for setting this up Cody! Now that #429 is merged, this should be good to go as well. After it is updated, I will make sure to review it quickly and make a release after the merge. Much appreciated that you are looking into this! |
No need to worry about Julia <1.9 anymore, we dropped support for older versions. |
3476c5f
to
ada0c05
Compare
ada0c05
to
e71bfd2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this! Below I have some questions about how the "extension is not loaded" errors are being reported.
function distr_betweenness_centrality(args...; kwargs...) | ||
return error( | ||
"`parallel = :distributed` requested, but SharedArrays or Distributed is not loaded" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am interested in getting your perspective as a compiler dev, on this idiom.
In other places I use a different idiom:
function __init__()
if isdefined(Base.Experimental, :register_error_hint)
Base.Experimental.register_error_hint(MethodError) do io, exc, argtypes, kwargs
if exc.f === something_implemented_in_extension && ...
print(io, "yada yada requested, but foobar is not loaded")
end
end
end
function something_implemented_in_extension end
instead of
function something_implemented_in_extension(args...; kwargs...)
return error(
"yada yada requested, but foobar is not loaded"
)
end
I was left with the impression that the former (1) avoids invalidations and (2) has a semantically more meaningful error type (preserving the MethodError).
If these impressions of mine are correct, could you consider switching to the former idiom?
This PR follows up on #429 to enable
parallel=:threads
by default and move Distributed-based parallelism to an extension, for faster loading times and reduced dependencies.On Julia <1.9, the dependency is still there so this shouldn't break backwards compatibility (1.6 tests are passing for me locally)