Draft
257 commits
c8f4173
refactoring
dufourc1 Oct 16, 2024
db753cd
revert over-refactoring of lossfunction
dufourc1 Oct 16, 2024
b0efdcc
Fast Bernoulli data is functionning but wrong
dufourc1 Oct 16, 2024
11eee28
functionning Bernoulli implementation
dufourc1 Oct 16, 2024
10c1d76
clean and create skeleton for categorical
dufourc1 Oct 16, 2024
4b875e9
minor typos
dufourc1 Oct 16, 2024
1013bbd
add manual warning to remember what to do
dufourc1 Oct 16, 2024
85ff2e7
update docs, GA, and prepare for benchmark
dufourc1 Oct 17, 2024
4fd3fb1
format
dufourc1 Oct 17, 2024
a286874
add tests for assignments creation and swaps
dufourc1 Oct 17, 2024
c795929
found fastest ll computation
dufourc1 Oct 18, 2024
75e3f14
fix make data categorical
dufourc1 Oct 18, 2024
cc54cc4
remove useless memeory in BernoulliSwap
dufourc1 Oct 18, 2024
798ce65
working cat assginment before switching to adj list
dufourc1 Oct 20, 2024
29f59cf
fix cat assignments
dufourc1 Oct 21, 2024
5bd3dd8
clean
dufourc1 Oct 21, 2024
d597330
clean bandwidth selection
dufourc1 Oct 21, 2024
d52a6b5
make docs work
dufourc1 Oct 21, 2024
5d407ca
add more docstrings
dufourc1 Oct 21, 2024
4feb3d9
format
dufourc1 Oct 21, 2024
0fc312a
add init rules
dufourc1 Oct 21, 2024
cb668e5
add docs for init
dufourc1 Oct 21, 2024
ea12c67
put first steps for clustering with higher dims
dufourc1 Oct 21, 2024
6a8a13c
using StatsAPI as it should be
dufourc1 Oct 22, 2024
3ea8413
handle categorical arrays and fix node labels
dufourc1 Oct 22, 2024
ad9d78c
format
dufourc1 Oct 22, 2024
c3bd492
bugfix: swap being overwritten when updating assignment
dufourc1 Oct 23, 2024
0c42955
remove debug statement
dufourc1 Oct 23, 2024
0347062
fix speed and correcteness issue in categorical case
dufourc1 Oct 23, 2024
82556fc
typo
dufourc1 Nov 12, 2024
d6ec72f
first try to benchmark
dufourc1 Nov 12, 2024
42ffe61
add discretisation
dufourc1 Nov 18, 2024
77a3395
small api changes
dufourc1 Nov 21, 2024
45c9e40
remove unused import @turbo
dufourc1 Nov 21, 2024
45c1259
fix test to new api
dufourc1 Nov 22, 2024
3149409
other fit method
dufourc1 Nov 25, 2024
d10db5b
add perm util function
dufourc1 Nov 25, 2024
5448040
typo
dufourc1 Nov 25, 2024
dbde987
monkeypatch
dufourc1 Nov 25, 2024
f4b521d
remove unused DataStructures
dufourc1 Nov 26, 2024
e634ef2
add basic api
dufourc1 Nov 27, 2024
120c4e5
change log level from info to debug
dufourc1 Nov 27, 2024
014655a
add fast inplace sampling
dufourc1 Nov 28, 2024
c1d9e0f
add bootstraping capabilities
dufourc1 Nov 28, 2024
372b404
improve error messages on node labels
dufourc1 Nov 28, 2024
f4577f3
add assignment retour
dufourc1 Nov 28, 2024
84f87f8
add utils for continuous latent
dufourc1 Nov 29, 2024
33d65e5
update api test
dufourc1 Nov 29, 2024
9ede44b
remove over specification of argument type
dufourc1 Nov 29, 2024
ea58864
typo
dufourc1 Nov 29, 2024
38c0b07
change progress bar
dufourc1 Dec 2, 2024
a453dc5
backbone
dufourc1 Dec 2, 2024
44287c4
test new update
dufourc1 Dec 2, 2024
9d315dc
new method
dufourc1 Dec 2, 2024
bdf1ddf
new method
dufourc1 Dec 2, 2024
2037f33
typo
dufourc1 Dec 2, 2024
016104b
update progress bar
dufourc1 Dec 2, 2024
c82d1b7
update import
dufourc1 Dec 2, 2024
af02584
fix arpack version to avoid bug on 0.5.4
dufourc1 Dec 3, 2024
c8b207b
remove Arpack
dufourc1 Dec 3, 2024
572cfd5
update dependencies
dufourc1 Dec 3, 2024
39a1999
define convention for discretisation
dufourc1 Dec 6, 2024
5bfc699
add basic discretized distribution
dufourc1 Dec 6, 2024
f9ea503
fix DiscretizedDistribution
dufourc1 Dec 6, 2024
0d08ee5
first try of sparse assignments
dufourc1 Dec 6, 2024
5e022bb
partially solved, still issues
dufourc1 Dec 6, 2024
1b7ffd1
fixed it 💪
dufourc1 Dec 6, 2024
3e2ac10
add custom zero-inflated cat dist
dufourc1 Dec 8, 2024
09e0a0f
typo
dufourc1 Dec 8, 2024
e5fd356
another typo
dufourc1 Dec 8, 2024
7ca23a7
corner cases
dufourc1 Dec 8, 2024
a925090
type stable clamp
dufourc1 Dec 8, 2024
b5eade9
typo
dufourc1 Dec 8, 2024
a3dd9c1
approx error
dufourc1 Dec 8, 2024
1a58aff
add TODO and basic zero inflated cont dist
dufourc1 Dec 9, 2024
760c7e1
try to manage floating point error
dufourc1 Dec 9, 2024
5900676
monkey patch of convert zero inflated to cat
dufourc1 Dec 9, 2024
da86e77
forgot to take first of tuple params
dufourc1 Dec 9, 2024
0d9fa12
p vs 1-p is a pain
dufourc1 Dec 9, 2024
d36d733
hack: remove compat
dufourc1 Dec 9, 2024
7860d73
remove sbm subtype matrix
dufourc1 Dec 9, 2024
91d5164
Revert "remove sbm subtype matrix"
dufourc1 Dec 10, 2024
1e2b692
typo in name
dufourc1 Dec 11, 2024
984f310
correctly initialize discretizedDist
dufourc1 Dec 11, 2024
3149c55
add case of zero inflated discretisation
dufourc1 Dec 11, 2024
ee80912
change order include
dufourc1 Dec 11, 2024
a66b08c
typo
dufourc1 Dec 11, 2024
4cd2962
change rescaling issue on 0
dufourc1 Dec 11, 2024
230fdd4
hack for fast bin checking. Will fail if discretizer has other bins t…
dufourc1 Dec 11, 2024
cf5181d
add hacky cdf computation
dufourc1 Dec 11, 2024
e23fdf5
typo....
dufourc1 Dec 11, 2024
d447979
update import for bootstrap
dufourc1 Dec 11, 2024
b7df926
fix imports for bootstrap
dufourc1 Dec 11, 2024
9c5e454
add helper for alignment
dufourc1 Dec 13, 2024
b341676
minor fixes to cdf and fit of ZeroInflatedCategorical
dufourc1 Dec 13, 2024
a099721
?
dufourc1 Dec 13, 2024
93c3063
test encoding max bin
dufourc1 Dec 13, 2024
d22f08a
correct bin numbers
dufourc1 Dec 14, 2024
105f237
throw on edge cases for number of bins
dufourc1 Dec 14, 2024
e271a5c
revert to correct number of bins given k
dufourc1 Dec 14, 2024
e1cda21
bound check for cdf support
dufourc1 Dec 14, 2024
6bfef27
try to circumvent pdf(x,0)=Inf
dufourc1 Dec 14, 2024
e220bf1
need to take care of type generate by mixture
dufourc1 Dec 15, 2024
43f158f
force same type for both discretizers
dufourc1 Dec 15, 2024
f2b35f9
return 0 and one of same type of x
dufourc1 Dec 15, 2024
322380e
remove type instability
dufourc1 Dec 15, 2024
601935d
correct counting with missing data
dufourc1 Dec 19, 2024
79fa741
add docstrings and take care of edge cases
dufourc1 Dec 20, 2024
501f419
add tests
dufourc1 Dec 20, 2024
994b942
improve api
dufourc1 Dec 20, 2024
90cabac
clean observations
dufourc1 Dec 20, 2024
49671e2
add julia compat
dufourc1 Jan 20, 2025
6426628
format
dufourc1 Jan 20, 2025
9ded3cc
update julia version in github tests
dufourc1 Jan 20, 2025
4e4c4cd
format tests
dufourc1 Jan 20, 2025
9bcd485
remove benchmarks folder
dufourc1 Jan 20, 2025
5086208
remove dubious check and modification
dufourc1 Jan 20, 2025
4daec03
remove unused dependence
dufourc1 Jan 20, 2025
07f1580
correct missing update
dufourc1 Jan 21, 2025
ce4f1bf
Update codecov action
dufourc1 Feb 6, 2025
7acde37
update docs
dufourc1 Apr 14, 2025
c39ee21
format make.jl
dufourc1 Apr 14, 2025
9cc573c
add implementation for MC
dufourc1 Apr 23, 2025
ad1448d
typo
dufourc1 Apr 23, 2025
a39656d
typo
dufourc1 Apr 23, 2025
2ebd12e
add mc
dufourc1 Apr 25, 2025
6c615e3
rewrite and condense
dufourc1 Apr 28, 2025
8079189
new things
dufourc1 Apr 28, 2025
c7995a6
start testing
dufourc1 Apr 28, 2025
fede79b
issues with zero vs non-edge
dufourc1 Apr 30, 2025
3e88e69
only optimize on missing, need to do it for bernoulli
dufourc1 Apr 30, 2025
f4e7fa2
remove old code
dufourc1 Apr 30, 2025
aa46922
correctly compute fast group ll update
dufourc1 Apr 30, 2025
594d50a
correctly compute fast group ll update
dufourc1 Apr 30, 2025
f1f18b0
decouple neighbour list and edge values
dufourc1 May 7, 2025
5ca238c
try fixing ll
dufourc1 May 8, 2025
844fb0c
fix ll with agent
dufourc1 May 8, 2025
4ba2caa
fix swap with slow rebuild
dufourc1 May 12, 2025
c4e3c91
make shift broadcast
dufourc1 May 12, 2025
1bc74ff
add tests
dufourc1 May 12, 2025
2052df2
add comments
dufourc1 May 12, 2025
98ff8ce
update to slower version
dufourc1 May 12, 2025
83ca2ff
export kinda fixed
dufourc1 May 12, 2025
01884b7
slow recompute faster than trying to isolate behaviour for now, see c…
dufourc1 May 13, 2025
55681d2
add more debug info
dufourc1 May 13, 2025
f7b9955
add more debug info
dufourc1 May 13, 2025
4478a6e
monkey patch for missing, needs to be fixed
dufourc1 May 13, 2025
7edf2b6
hack for empty collection, introduces type instability...
dufourc1 May 20, 2025
d582f68
update block models and add conversion for symarray
dufourc1 May 20, 2025
c295bc0
Fix blockmodel from assignment
dufourc1 May 20, 2025
cbf95d0
update naming convention
dufourc1 May 20, 2025
1501d74
before speed benchmark
dufourc1 May 21, 2025
de5337b
same speed, but maybe parallelization options
dufourc1 May 24, 2025
a2d3539
sum was doing all sum we just need triangular with diag
dufourc1 May 24, 2025
c98afe8
add early warning for extreme behaviour in local search
dufourc1 May 24, 2025
8180c7d
more general sampling procedure for sbm
dufourc1 May 26, 2025
dfee0e3
restructure
dufourc1 May 26, 2025
6c3689c
restart for zero inflated
dufourc1 May 26, 2025
0fb6836
add zero inflated dist
dufourc1 May 26, 2025
6ac0c1c
remove over type specialisation
dufourc1 May 26, 2025
9378236
add fast update for heavy ll
dufourc1 May 27, 2025
85546e7
add skeleton for clustered start
dufourc1 Jun 6, 2025
d630df5
typo in zero inflated
dufourc1 Jul 8, 2025
9956406
add utilities for sbm
dufourc1 Jul 10, 2025
b663709
add more detailed readme
dufourc1 Sep 24, 2025
ee1e56c
trying to unify once more
dufourc1 Sep 24, 2025
1f98baa
formatting mostly
dufourc1 Sep 24, 2025
6282e30
add fast categorical
dufourc1 Sep 24, 2025
348bc23
remove unused dependencies
dufourc1 Sep 24, 2025
0339b88
format and make ready for extension on discretizeDist
dufourc1 Sep 24, 2025
2d51b51
extensions now work, need to code them
dufourc1 Sep 26, 2025
985e289
Generalize to Distributions.jl
dufourc1 Oct 2, 2025
0523a60
add zero inflated for categorical
dufourc1 Oct 8, 2025
e76664c
make assignment concrete and speed up optim
dufourc1 Oct 8, 2025
b61cce8
check backward compat
dufourc1 Oct 8, 2025
6f18035
remove static arrays for sbm
dufourc1 Oct 8, 2025
1f86bd7
add Makie extension and sanity check on multiplex fit
dufourc1 Oct 9, 2025
fe044ab
start with tutorials
dufourc1 Oct 10, 2025
ace7386
silence some output in tutorial
dufourc1 Oct 10, 2025
a5e3a27
add docs and clean codebase
dufourc1 Oct 15, 2025
f5f0938
create benchmark regression test
dufourc1 Oct 15, 2025
c24e842
create benchmark regression test 2
dufourc1 Oct 15, 2025
675a405
add speed benchmark
dufourc1 Oct 15, 2025
81d629e
updated categorical updates
dufourc1 Oct 15, 2025
49818b3
make symarray based on sparse array
dufourc1 Oct 17, 2025
adfbcca
clean
dufourc1 Oct 17, 2025
b71700e
start adding more extensible design
dufourc1 Oct 17, 2025
6b63708
progress towards better package
dufourc1 Oct 19, 2025
93b3c44
typo
dufourc1 Oct 19, 2025
ffa17ab
minor performance improvements
dufourc1 Oct 19, 2025
674fb3c
generalise add counts and realised for funkier use cases (e.g. MC)
dufourc1 Oct 19, 2025
401dd96
make loss compatible with extensions
dufourc1 Oct 19, 2025
7349eae
add clustering
dufourc1 Oct 24, 2025
02427fb
use Graphons.jl to generate graphs for test
dufourc1 Oct 24, 2025
2930519
reexport Graphons.jl export in NetworkHistogram
dufourc1 Oct 24, 2025
f1462fe
fix symarray
dufourc1 Oct 24, 2025
6413ec9
fix some weird similar broadcast shenanigans
dufourc1 Oct 24, 2025
1d56733
switch tutorials to use graphons.jl
dufourc1 Oct 24, 2025
7f050cf
remove old BlockModel
dufourc1 Oct 24, 2025
920a5ae
butcher api to use fast estimator for now
dufourc1 Oct 24, 2025
ee2e80d
add bare version of multiplex
dufourc1 Oct 24, 2025
40a94ed
add suff stats greedy
dufourc1 Oct 26, 2025
1813141
try to match messy perf
dufourc1 Oct 26, 2025
11baabe
almost fix broadcasting for symarray
dufourc1 Oct 27, 2025
2caf748
optimization + start new generic suffstats with indices instead
dufourc1 Oct 27, 2025
5e34062
slight improvement of loop
dufourc1 Oct 27, 2025
f567fa6
update constructors
dufourc1 Oct 27, 2025
8fea621
before the great cleanse
dufourc1 Oct 27, 2025
2d3f905
big cleanse
dufourc1 Oct 27, 2025
854400a
change structure
dufourc1 Oct 27, 2025
418f648
remove kwargs expension in score
dufourc1 Oct 27, 2025
a576654
correct typo
dufourc1 Oct 27, 2025
570f08a
tidy and clean
dufourc1 Oct 27, 2025
5684140
fix broadcasting
dufourc1 Oct 27, 2025
a3ce929
weighted seems to work
dufourc1 Oct 27, 2025
8921f78
add rudimentary api
dufourc1 Oct 27, 2025
16a41e5
remove unnecessary @inline
dufourc1 Oct 28, 2025
f4aec5c
clean parameters
dufourc1 Oct 28, 2025
2326323
add parameters struct
dufourc1 Oct 28, 2025
904e34a
add MC extension with pseudo_factory to get convertor of extensions
dufourc1 Oct 28, 2025
9e4c75a
add note
dufourc1 Oct 28, 2025
dd2aa25
remove deprecated
dufourc1 Oct 28, 2025
b8149cd
add tests for score
dufourc1 Oct 28, 2025
90c83e5
rename score to loss
dufourc1 Oct 28, 2025
180affa
add sanity check for loss cat
dufourc1 Oct 28, 2025
6f4a9a1
correct mapreduce
dufourc1 Oct 28, 2025
c45ba51
add makie ext for SymArray
dufourc1 Oct 29, 2025
247f511
add BinaryConvertor and oracle_estimator
dufourc1 Oct 29, 2025
c957503
remove overspecialisation
dufourc1 Oct 29, 2025
6d718b0
more API conveniance
dufourc1 Oct 29, 2025
49929ad
clean api
dufourc1 Oct 29, 2025
e02e287
use hungarian alg to align node labels
dufourc1 Oct 31, 2025
4174c33
start adding OT methods
dufourc1 Nov 12, 2025
68ad7c0
add PythonCall dependency for ot functions and remove align partition…
dufourc1 Nov 14, 2025
e6348cc
add loading of extension in docs
dufourc1 Nov 14, 2025
bdf54fd
add docs to pot function
dufourc1 Nov 14, 2025
2045ea0
hack the docs
dufourc1 Nov 14, 2025
d2dfd63
add kwargs for ot alignment
dufourc1 Nov 14, 2025
b187632
remove deprecated keyword arg
dufourc1 Nov 14, 2025
af6d010
remove @showprogress from litterate example
dufourc1 Nov 14, 2025
a2f40da
change default params for categorical edges and update litterate example
dufourc1 Nov 14, 2025
202956b
solve PythonCall import problem partially
dufourc1 Nov 14, 2025
6d97595
clean code
dufourc1 Nov 14, 2025
0e36b11
clean tutorials
dufourc1 Nov 14, 2025
fd2c1ec
tutorial follow example from paper
dufourc1 Nov 14, 2025
c936722
compare argmin and knee in example
dufourc1 Nov 14, 2025
7edf850
clean weighted network tutorial
dufourc1 Nov 14, 2025
88fe7b8
add OT alignment to weighted example
dufourc1 Nov 14, 2025
4b5acb4
fix tutorials and refactor OT from python
dufourc1 Nov 16, 2025
93e41d9
remove pycache
dufourc1 Dec 19, 2025
5 changes: 4 additions & 1 deletion .JuliaFormatter.toml
@@ -1 +1,4 @@
-style = "sciml"
+indent = 4
+margin = 92
+normalize_line_endings = "unix"
+style = "sciml"
10 changes: 10 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,10 @@
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "monthly"
    open-pull-requests-limit: 99
    labels:
      - "dependencies"
      - "github-actions"
4 changes: 2 additions & 2 deletions .github/workflows/CI.yml
@@ -18,7 +18,7 @@ jobs:
       fail-fast: false
       matrix:
         version:
-          - '1.8'
+          - '1.11'
           # - 'nightly'
         os:
           - ubuntu-latest
@@ -35,7 +35,7 @@ jobs:
       - uses: julia-actions/julia-buildpkg@v1
       - uses: julia-actions/julia-runtest@v1
       - uses: julia-actions/julia-processcoverage@v1
-      - uses: codecov/codecov-action@v3
+      - uses: codecov/codecov-action@v5
         with:
           files: lcov.info
           token: ${{ secrets.CODECOV_TOKEN }}
78 changes: 78 additions & 0 deletions .github/workflows/benchmark_pr.yml
@@ -0,0 +1,78 @@
name: Benchmark a pull request

on:
  pull_request_target:
    branches:
      - master

permissions:
  pull-requests: write

jobs:
  generate_plots:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
      - uses: julia-actions/setup-julia@v2
        with:
          version: "1.11"
      - uses: julia-actions/cache@v2
      - name: Extract Package Name from Project.toml
        id: extract-package-name
        run: |
          PACKAGE_NAME=$(grep "^name" Project.toml | sed 's/^name = "\(.*\)"$/\1/')
          echo "package_name=$PACKAGE_NAME" >> "$GITHUB_OUTPUT"
      - name: Build AirspeedVelocity
        env:
          JULIA_NUM_THREADS: 2
        run: |
          # Lightweight build step, as sometimes the runner runs out of memory:
          julia -e 'ENV["JULIA_PKG_PRECOMPILE_AUTO"]=0; import Pkg; Pkg.add(;url="https://github.com/MilesCranmer/AirspeedVelocity.jl.git")'
          julia -e 'ENV["JULIA_PKG_PRECOMPILE_AUTO"]=0; import Pkg; Pkg.build("AirspeedVelocity")'
      - name: Add ~/.julia/bin to PATH
        run: |
          echo "$HOME/.julia/bin" >> $GITHUB_PATH
      - name: Run benchmarks
        run: |
          echo $PATH
          ls -l ~/.julia/bin
          mkdir results
          benchpkg ${{ steps.extract-package-name.outputs.package_name }} --rev="${{github.event.repository.default_branch}},${{github.event.pull_request.head.sha}}" --url=${{ github.event.repository.clone_url }} --bench-on="${{github.event.repository.default_branch}}" --output-dir=results/ --tune
      - name: Create plots from benchmarks
        run: |
          mkdir -p plots
          benchpkgplot ${{ steps.extract-package-name.outputs.package_name }} --rev="${{github.event.repository.default_branch}},${{github.event.pull_request.head.sha}}" --npart=10 --format=png --input-dir=results/ --output-dir=plots/
      - name: Upload plot as artifact
        uses: actions/upload-artifact@v4
        with:
          name: plots
          path: plots
      - name: Create markdown table from benchmarks
        run: |
          benchpkgtable ${{ steps.extract-package-name.outputs.package_name }} --rev="${{github.event.repository.default_branch}},${{github.event.pull_request.head.sha}}" --input-dir=results/ --ratio > table.md
          echo '### Benchmark Results' > body.md
          echo '' >> body.md
          echo '' >> body.md
          cat table.md >> body.md
          echo '' >> body.md
          echo '' >> body.md
          echo '### Benchmark Plots' >> body.md
          echo 'A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.' >> body.md
          echo 'Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).' >> body.md

      - name: Find Comment
        uses: peter-evans/find-comment@v3
        id: fcbenchmark
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-author: 'github-actions[bot]'
          body-includes: Benchmark Results

      - name: Comment on PR
        uses: peter-evans/create-or-update-comment@v4
        with:
          comment-id: ${{ steps.fcbenchmark.outputs.comment-id }}
          issue-number: ${{ github.event.pull_request.number }}
          body-path: body.md
          edit-mode: replace
29 changes: 28 additions & 1 deletion .gitignore
@@ -1,7 +1,34 @@
.vscode
*dev/
.CondaPkg/


# Files generated by invoking Julia with --code-coverage
*.jl.cov
*.jl.*.cov

# Files generated by invoking Julia with --track-allocation
*.jl.mem

# System-specific files and directories generated by the BinaryProvider and BinDeps packages
# They contain absolute paths specific to the host computer, and so should not be committed
deps/deps.jl
deps/build.log
deps/downloads/
deps/usr/
deps/src/

# Build artifacts for creating documentation generated by the Documenter package
docs/build/
docs/site/
docs/Manifest.toml

Manifest.toml
# File generated by Pkg, the package manager, based on a corresponding Project.toml
# It records a fixed state of all packages used by the project. As such, it should not be
# committed for packages, but should be committed for applications that require a static
# environment.
Manifest.toml


## python
__pycache__/
2 changes: 2 additions & 0 deletions CondaPkg.toml
@@ -0,0 +1,2 @@
[deps]
pot = ""
55 changes: 34 additions & 21 deletions Project.toml
@@ -1,36 +1,49 @@
name = "NetworkHistogram"
uuid = "7806f430-7229-459c-b2e6-df35e8e4eb5d"
version = "0.6.0"
authors = ["Charles Dufour", "Jake Grainger"]
version = "0.5.2"

[deps]
ArnoldiMethod = "ec485272-7323-5ecc-a04f-4719b315124d"
Arpack = "7d9fca2a-8960-54d3-9f78-7d1dccf2cb97"
Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"
ArgCheck = "dce04be8-c92d-5529-be00-80e4d2c0e197"
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
CodecZstd = "6b39b394-51ab-5f42-8807-6242bab2b4c2"
HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
JLD = "4138dd39-2aa7-5051-a626-17a0bb65d9c8"
Kronecker = "2c470bb0-bcc8-11e8-3dad-c9649493f05e"
Clustering = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
Graphons = "e0c12bfd-47d7-434e-afb7-632611640ca5"
Hungarian = "e91730f6-4275-51fb-a7a0-7064cfbd3b39"
IntervalSets = "8197267c-284f-5f27-9208-e0e47529a953"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LogExpFunctions = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
StatsAPI = "82ae8749-77ed-4fe6-ae5f-f523153014b0"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
TranscodingStreams = "3bb67fe8-82b1-5028-8e26-92a6c54297fa"
ValueHistories = "98cad3c8-aec3-5f06-8e41-884608649ab7"

[weakdeps]
Bootstrap = "e28b5b4c-05e8-5b66-bc03-6f0c0a0a06e0"
LightMC = "b58f5c6e-c887-41d6-b553-02118416cd5d"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
PythonCall = "6099a3de-0909-46bc-b1f4-468b9a2dfc0d"

[extensions]
BootstrapExt = "Bootstrap"
LightMCExt = "LightMC"
MakieExt = "Makie"
PythonOptimalTransport = "PythonCall"

[compat]
ArnoldiMethod = "0.2.0"
Arpack = "0.5.4"
BenchmarkTools = "1.3.2"
CodecZstd = "0.7.2"
HTTP = "1.7.4"
JLD = "0.13.3"
Kronecker = "0.5"
ProgressMeter = "1.7.2"
StatsBase = "0.33.21"
TranscodingStreams = "0.9.11"
ValueHistories = "0.5.4"
julia = "1.8"
Accessors = "0.1.42"
ArgCheck = "2.5.0"
BenchmarkTools = "1.6.3"
Clustering = "0.15.8"
Hungarian = "0.7.0"
IntervalSets = "0.7.11"
LinearAlgebra = "1.12.0"
LogExpFunctions = "0.3.29"
Reexport = "1.2.2"

[extras]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
138 changes: 132 additions & 6 deletions README.md
@@ -12,9 +12,31 @@
[![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://sds-epfl.github.io/NetworkHistogram.jl/stable/)
[![DOI](https://zenodo.org/badge/572018079.svg)](https://zenodo.org/doi/10.5281/zenodo.10212851)


Implementation of the network histogram for graphon estimation from the paper [Network histograms and universality of blockmodel approximation](https://doi.org/10.1073/pnas.1400374111) by Sofia C. Olhede and Patrick J. Wolfe.

Implementation of the network histogram for graphon estimation from the paper
[Network histograms and universality of blockmodel approximation (2014)](https://doi.org/10.1073/pnas.1400374111)
by Sofia C. Olhede and Patrick J. Wolfe and its extension to decorated graphs
by Charles Dufour and Sofia C. Olhede
[Inference for decorated graphs and application to multiplex networks (2024)](https://arxiv.org/abs/2408.12339).

The network histogram is a nonparametric estimator for the generating mechanism
of an exchangeable random graph (see graphons, decorated graphons and
probability graphons). We assume our observed graph is
$A \in \mathcal{K}^{n \times n}$, where $\mathcal{K}$ is a set of edge
decorations (e.g. $\{0,1\}$ for unweighted graphs, $\mathbb{N}$ for count
edges, $\mathbb{R}$ for real-valued edges, etc.). Using the Aldous-Hoover
theorem, we know that $A$ is generated from a graphon
$W: [0,1]^2 \to \mathcal{P}\left(\mathcal{K}\right)$, where
$\mathcal{P}\left(\mathcal{K}\right)$ is the set of probability measures on
$\mathcal{K}$ in the following way:

1. Sample $U_1, \ldots, U_n \sim \text{iid } \text{Uniform}[0,1]$.
2. For each pair of nodes $i,j$, sample the edge $A_{ij} \sim W(U_i, U_j)$
independently.
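
The two sampling steps can be sketched in a few lines of Julia. This is a
minimal illustration for binary edges, not the package's implementation: `w` and
`sample_graphon` are hypothetical names introduced here, and the graphon
`w(x, y)` is assumed to return an edge probability in $[0,1]$.

```julia
using Random

# Hypothetical graphon for binary edges: w(x, y) is the probability of an edge.
w(x, y) = 0.8 * exp(-3 * abs(x - y))

# Illustrative helper (not part of NetworkHistogram): draw an undirected graph
# on n nodes from the graphon w.
function sample_graphon(rng::AbstractRNG, w, n::Integer)
    u = rand(rng, n)                      # step 1: U_1, ..., U_n iid Uniform[0,1]
    A = zeros(Int, n, n)                  # diagonal stays 0 (no self-loops)
    for j in 2:n, i in 1:(j - 1)          # step 2: one draw per unordered pair
        edge = rand(rng) < w(u[i], u[j])  # Bernoulli(w(U_i, U_j)) draw
        A[i, j] = edge
        A[j, i] = edge                    # keep the adjacency matrix symmetric
    end
    return u, A
end

u, A = sample_graphon(MersenneTwister(1), w, 100)
```

For other decoration sets $\mathcal{K}$, the Bernoulli draw would be replaced by
a draw from the distribution $W(U_i, U_j)$.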

The network histogram approximates the generating graphon
$W: [0,1]^2 \to \mathcal{P}\left(\mathcal{K}\right)$ by a piecewise constant
function, i.e. a stochastic block model with $k$ blocks. For details, see the
papers mentioned above.
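
In symbols, one way to write this approximation is the following. The notation
is an assumption made here for illustration and may differ from the papers:
$I_1, \ldots, I_k$ denotes a partition of $[0,1]$ into $k$ intervals, and
$\hat{\theta}_{ab} \in \mathcal{P}\left(\mathcal{K}\right)$ is the estimated edge
distribution between blocks $a$ and $b$.

$$
\hat{W}(x, y) = \hat{\theta}_{ab}
\quad \text{for } (x, y) \in I_a \times I_b,
\qquad 1 \le a, b \le k .
$$

For an undirected graph, $\hat{\theta}_{ab} = \hat{\theta}_{ba}$. In the binary
case each $\hat{\theta}_{ab}$ reduces to a single edge probability, so the
estimate is a symmetric $k \times k$ matrix.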

## Installation

@@ -24,7 +46,10 @@ Pkg.add("NetworkHistogram")

## Usage

We fit the estimator and then extract the estimated graphon matrix and node labels.
### Basic Usage

We fit the estimator and then extract the estimated graphon matrix and node
labels.

```julia
using NetworkHistogram, LinearAlgebra
@@ -35,7 +60,7 @@ A[diagind(A)] .= 0
# approximate the graphon with a network histogram
hist = graphhist(A)

# get the graphist structure
# get the graphhist structure
estimate = hist.graphhist

# get the estimated graphon matrix
@@ -45,4 +70,105 @@ sbm_matrix = estimate.θ
node_labels = estimate.node_labels
```

You can control the optimization process by modifying the rules used in the optimization. Check out the docs for more information.
### Advanced Usage with Custom Parameters

You can control the optimization process by modifying the rules used in the
optimization:

```julia
using NetworkHistogram, LinearAlgebra

# Binary network
A = Symmetric(rand(0:1, 100, 100))
A[diagind(A)] .= 0

# Initial partition into k groups
k = 3
initial_labels = rand(1:k, 100)

# Configure optimization parameters
params = GreedyParams(
    50_000,                   # Maximum iterations
    RandomNodeSwap(),         # How to select nodes to swap
    Strict(),                 # Only accept improvements
    PreviousBestValue(5000),  # Stop after 5000 iterations without improvement
    true                      # Show progress bar
)

# Fit the network histogram
result = nethist(A, Bernoulli(0.5), initial_labels, params)

# Extract results
ll = loglikelihood(result)
block_params = result.θ
node_groups = result.node_labels
```

### Working with Different Edge Types

The package supports various edge types through custom distributions:

```julia
using NetworkHistogram, LinearAlgebra
using Distributions # For standard distributions

# Example 1: Weighted networks with continuous edges
W = Symmetric(rand(100, 100))
W[diagind(W)] .= 0
# You can use any distribution that implements the required interface

# Example 2: Count data (e.g., number of interactions)
C = Symmetric(rand(Poisson(2), 100, 100))
C[diagind(C)] .= 0
# Use appropriate count distribution

# Example 3: Sparse networks with missing edges
A_sparse = Symmetric(rand([0, 1, missing], 100, 100))
A_sparse[diagind(A_sparse)] .= 0
# Missing values are treated as absent edges
```

### Visualizing Results (with Makie.jl)

```julia
using NetworkHistogram, LinearAlgebra
using CairoMakie # or GLMakie

# Fit model
A = Symmetric(rand(0:1, 100, 100))
A[diagind(A)] .= 0
result = nethist(A, Bernoulli(0.5), rand(1:3, 100), GreedyParams())

# Create heatmap of estimated parameters
fig = heatmap_params(result, ordering=true, colormap=:viridis)
save("network_histogram.png", fig)
```

### Sampling from a Block Model

```julia
using NetworkHistogram

# Define a 3-block model
k = 3
bm = BlockModel(k, Bernoulli(0.5))

# Set custom edge probabilities between blocks
bm[1, 1] = Bernoulli(0.8) # High within-group connectivity
bm[2, 2] = Bernoulli(0.7)
bm[3, 3] = Bernoulli(0.6)
bm[1, 2] = Bernoulli(0.1) # Low between-group connectivity
bm[1, 3] = Bernoulli(0.05)
bm[2, 3] = Bernoulli(0.05)

# Sample a network
n_nodes = 150
latents, A = sample(bm, n_nodes)

# latents contains the true block assignments
# A is the sampled adjacency matrix
```

Check out the
[documentation](https://sds-epfl.github.io/NetworkHistogram.jl/dev/) for more
examples and detailed API information.