IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning (ICLR 2025 Spotlight) [PDF]
IGL-Bench is a comprehensive benchmark for Imbalanced Graph Learning (IGL) built on PyTorch and PyTorch Geometric. It covers 16 diverse graph datasets and 24 distinct IGL algorithms under uniform data processing and splitting strategies.
To the best of our knowledge, IGL-Bench serves as the first open-sourced benchmark for graph-specific imbalanced learning. It encompasses 24 state-of-the-art IGL algorithms and 16 diverse graph datasets covering node-level and graph-level tasks, addressing both class- and topology-imbalance issues, and adopts consistent data processing and splitting approaches for fair comparison over multiple metrics with different investigation foci. By benchmarking the existing IGL algorithms for effectiveness, robustness, and efficiency, we make the following contributions:
- First Comprehensive IGL Benchmark. IGL-Bench enables a fair and unified comparison among 19 state-of-the-art node-level and 5 graph-level IGL algorithms by unifying the experimental settings across 16 graph datasets of diverse characteristics, providing a comprehensive understanding of the class-imbalance and topology-imbalance problems in IGL for the first time.
- Multi-faceted Evaluation and Analysis. We conduct a systematic analysis of IGL methods from various dimensions, including effectiveness, efficiency, and complexity. Based on the results of extensive experiments, we uncover both the potential advantages and limitations of current IGL algorithms, providing valuable insights to guide future research endeavors.
- Open-sourced Package. To facilitate future IGL research, we develop an open-sourced benchmark package for public access. Users can evaluate their own algorithms or datasets with minimal effort.
Follow the steps below to install and configure IGL-Bench properly for your local environment.
Please ensure your Python environment meets the following dependency requirements:
Dependency | Version (≥) |
---|---|
Python | 3.8.12 |
PyTorch | 1.9.1 |
PyTorch-Geometric | 2.0.1 |
DGL | 1.1 |
scipy | 1.9 |
Recommendation: use a dedicated Python virtual environment (e.g., via `conda` or `venv`), or build a Docker image from the `Dockerfile` we provide.
Clone the repository and install required dependencies:
```bash
# Clone the repository
git clone https://github.com/RingBDStack/IGL-Bench.git
cd IGL-Bench

# Install dependencies
pip install -r requirements.txt
```
If you want to install the package for local development, use:
```bash
pip install -e .
```
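To confirm the installation worked, a quick import is a low-effort sanity check. The snippet below only assumes that the top-level package and the submodules used in the Quick Start below are importable:

```python
# Minimal sanity check for the installation (assumes the submodules used
# in the Quick Start are exposed by the top-level package import).
import IGL_Bench as igl

print(igl.dataset, igl.config, igl.manage)
```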
The following demonstrates how to quickly run algorithms from IGL-Bench for both node-level and graph-level classification tasks under various imbalance settings.
You can also refer to our runnable example: demo.ipynb
Node-level task (Cora under global topology imbalance):
```python
import IGL_Bench as igl

# Load the Cora node-classification dataset with a high level of
# global topology imbalance
Dataset = igl.dataset.Dataset(
    task="node",
    data_name="Cora",
    imb_type="topo_global",
    imb_level="high"
)
dataset = Dataset.load_dataset()

# Load the default configuration for the PASTEL algorithm
config = igl.config.load_conf(
    task="node",
    imbtype="topo_global",
    algorithm="PASTEL"
)

# Run the benchmark 5 times
solver = igl.manage.Manager(config, dataset)
solver.run(num_runs=5)
```
Graph-level task (D&D under class imbalance):
```python
import IGL_Bench as igl

# Load the D&D graph-classification dataset with a low level of class imbalance
Dataset = igl.dataset.Dataset(
    task="graph",
    data_name="D&D",
    imb_type="class",
    imb_level="low"
)
dataset = Dataset.load_dataset()

# Load the default configuration for the G2GNN algorithm
config = igl.config.load_conf(
    task="graph",
    imbtype="class",
    algorithm="G2GNN"
)

# Run the benchmark 10 times
solver = igl.manage.Manager(config, dataset)
solver.run(num_runs=10)
```
You can flexibly configure datasets and algorithms through the `Dataset(...)` and `load_conf(...)` APIs with the following parameters (a combined example follows the parameter list):
`task`: Specifies the type of task: 'node' or 'graph'.
`data_name`: The dataset to load.
- Node-level datasets: 'Cora', 'CiteSeer', 'PubMed', 'Computers', 'Photo', 'ogbn-arxiv', 'Chameleon', 'Squirrel', 'Actor'
- Graph-level datasets: 'PTC-MR', 'FRANKENSTEIN', 'PROTEINS', 'D&D', 'IMDB-B', 'REDDIT-B', 'COLLAB', 'ogbg-molhiv'
`imb_type`: Imbalance type (depends on the task):
- For node-level: 'class', 'topo_local', 'topo_global'
- For graph-level: 'class', 'topology'
`imb_level`: Controls the severity of imbalance: 'low', 'mid', 'high'.
`algorithm`: The algorithm to run. Supported algorithms include:
- Node-level class imbalance: DRGCN, DPGNN, ImGAGN, GraphSMOTE, GraphENS, GraphMixup, LTE4G, TAM, TOPOAUC, GraphSHA
- Node-level local topology imbalance: DEMONet, TailGNN, COLDBREW, LTE4G, meta-tail2vec, GRAPHPATCHER, RAWLSGCN
- Node-level global topology imbalance: ReNode, TAM, PASTEL, TOPOAUC, HyperIMBA
- Graph-level class imbalance: G2GNN, TopoImb, DataDec, ImGKB
- Graph-level topology imbalance: TopoImb, SOLTGNN
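Putting these parameters together, one loaded dataset can be reused to benchmark several algorithms that share the same imbalance setting. The sketch below is a minimal example composed from the Quick Start API above; the particular dataset and algorithm choices are arbitrary, and it assumes the loaded dataset object can be passed to multiple `Manager` instances.

```python
import IGL_Bench as igl

# Node-level class imbalance on CiteSeer, medium severity
dataset = igl.dataset.Dataset(
    task="node",
    data_name="CiteSeer",
    imb_type="class",
    imb_level="mid"
).load_dataset()

# Sweep a few node-level class-imbalance algorithms on the same split
for algo in ["GraphSMOTE", "GraphENS", "GraphSHA"]:
    config = igl.config.load_conf(task="node", imbtype="class", algorithm=algo)
    solver = igl.manage.Manager(config, dataset)
    solver.run(num_runs=5)
```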
You can also manually modify the configuration files located in `config/` for fine-grained control over hyperparameters.
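If you prefer to adjust hyperparameters programmatically rather than editing files, the object returned by `load_conf` can presumably be inspected before it is handed to the `Manager`. Everything in the sketch below beyond the `load_conf` call itself is an assumption; the actual hyperparameter names are defined by the matching file in `config/`.

```python
import IGL_Bench as igl

# Load the default configuration for one algorithm
config = igl.config.load_conf(task="node", imbtype="topo_global", algorithm="PASTEL")

# Inspect what the matching file in config/ defines before overriding anything
print(config)

# Hypothetical override; the actual field names come from the config file:
# config.lr = 0.005
```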
IGL-Bench is designed with modularity and extensibility in mind. Thanks to the unified solver interface, users can seamlessly integrate their custom algorithms into our benchmarking pipeline with minimal redundant code.
We provide a step-by-step tutorial to guide you through the process: Build_Your_Own_IGL.ipynb
This tutorial covers:
- ✅ Using the pre-split datasets (with `train_mask`, `val_mask`, `test_mask`)
- ✅ Understanding the `Dataset` and `Config` structure
- ✅ Adding your own algorithm in a few easy steps
Your custom solver will automatically be discovered and benchmarked by the framework, just like any built-in baseline.
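As a rough illustration of what such an extension looks like, here is a hypothetical skeleton that mirrors how the built-in `Manager` is driven in the Quick Start: constructed from a config and a pre-split dataset, then executed via `run`. The class name, method signatures, and any registration details are placeholders, not the package's actual API; the notebook above documents the real interface.

```python
class MySolver:
    """Hypothetical custom solver skeleton (not the package's actual API)."""

    def __init__(self, config, dataset):
        self.config = config
        # Benchmark datasets come pre-split with train/val/test masks.
        self.dataset = dataset

    def run(self, num_runs=5):
        for run_id in range(num_runs):
            # Train on dataset.train_mask, tune on dataset.val_mask,
            # and report metrics on dataset.test_mask here.
            pass
```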
All of the IGL algorithms listed under the `algorithm` parameter above are implemented in IGL-Bench.
If you find this repository helpful, please consider citing us. We welcome any discussion; feel free to contact [email protected].
```bibtex
@inproceedings{qin2025iglbench,
  title={{IGL}-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning},
  author={Jiawen Qin and Haonan Yuan and Qingyun Sun and Lyujin Xu and Jiaqi Yuan and Pengfeng Huang and Zhaonan Wang and Xingcheng Fu and Hao Peng and Jianxin Li and Philip S. Yu},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}
```