CNN ensemble for automated Gleason grading of prostate cancer from whole-slide biopsy images
Gleason grading of prostate cancer biopsies is critical for treatment decisions but suffers from significant inter-pathologist variability. The PANDA challenge was the largest histopathology AI competition, with 1,290 teams from 65 countries developing algorithms on 10,616 digitized prostate biopsies.
This 2nd place solution uses an ensemble of CNNs (EfficientNet, SE-ResNeXt) with tile-based attention pooling, achieving pathologist-level agreement (quadratic weighted kappa >0.9). This work contributed to the published paper "Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge" in Nature Medicine.
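
Quadratic weighted kappa, the agreement metric quoted above, penalises a disagreement by the squared distance between the two grades, so confusing grade 1 with grade 5 costs far more than confusing grade 1 with grade 2. The following is a minimal pure-Python sketch of the metric, not code from this repository. It assumes integer labels in the range 0 to 5 (six ISUP grade groups); the function name and the `n_classes` parameter are illustrative.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes=6):
    """Cohen's kappa with quadratic disagreement weights.

    Assumes integer labels in [0, n_classes - 1] (illustrative default: 6).
    """
    n = len(y_true)
    # observed[i][j] counts slides with true grade i and predicted grade j.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            # Expected count if the two ratings were independent.
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both ratings use a single identical grade: treat as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 means perfect agreement, 0 means chance-level agreement, and negative values mean systematic disagreement.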
- Tile Extraction: Whole-slide images processed as grids of tiles at multiple resolutions
- Backbone Models: EfficientNet-B0/B1, SE-ResNeXt50
- Attention Pooling: Learnable aggregation of tile-level features to slide-level prediction
- Two-Phase Training: Initial training followed by fine-tuning with hard examples
- Ensemble: Predictions merged from 4 team members' models
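
The tile extraction and attention pooling steps listed above can be sketched roughly as follows. This is a NumPy illustration of the forward pass only, not the PyTorch code from the training folders. It assumes that tiles are ranked by how dark they are (a common heuristic for finding stained tissue on a white slide background) and that the attention scores come from a small tanh layer; `extract_tiles`, `attention_pool`, `V`, `w`, and all default sizes are hypothetical names and values.

```python
import numpy as np


def extract_tiles(image, tile_size=128, n_tiles=16):
    """Split an H x W x 3 uint8 image into square tiles, keep the darkest ones.

    Returns at most n_tiles tiles with shape (k, tile_size, tile_size, 3).
    """
    h, w, c = image.shape
    pad_h = (-h) % tile_size
    pad_w = (-w) % tile_size
    # Pad with white (255) so the padding looks like empty slide background.
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)),
                    constant_values=255)
    rows = padded.shape[0] // tile_size
    cols = padded.shape[1] // tile_size
    tiles = padded.reshape(rows, tile_size, cols, tile_size, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4).reshape(-1, tile_size, tile_size, c)
    # Assumed heuristic: lower pixel sum = darker = more stained tissue.
    scores = tiles.reshape(len(tiles), -1).sum(axis=1, dtype=np.int64)
    keep = np.argsort(scores)[:n_tiles]
    return tiles[keep]


def attention_pool(features, V, w):
    """Aggregate tile features (n_tiles, d) into one slide vector (d,).

    V has shape (d, hidden) and w has shape (hidden,); in a real model
    both would be learned together with the backbone.
    """
    scores = np.tanh(features @ V) @ w       # one score per tile
    scores = scores - scores.max()           # numerical stability for softmax
    weights = np.exp(scores)
    weights = weights / weights.sum()        # attention weights sum to 1
    return weights @ features, weights
```

In a full pipeline, each kept tile would pass through a CNN backbone to produce one row of `features`, and the pooled vector would feed a small head that predicts the slide-level grade.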
| Directory | Author | Description |
|---|---|---|
| train_drhb | DrHB | 2-phase training with EfficientNet |
| train_rguo | R Guo | Attention-based models |
| train_cateek | CatEek | SE-ResNeXt models |
| train_xie29 | xie29 | TFRecords pipeline |
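
How the four members' predictions are merged is not described here. One simple possibility, shown purely as an assumption, is to average each model's continuous grade score per slide and round the result to the nearest valid grade. The function name `merge_predictions` and the 0 to 5 grade range are illustrative.

```python
import math


def merge_predictions(model_scores, n_classes=6):
    """Average per-slide scores across models and round to an integer grade.

    model_scores: one list of continuous scores per model, all the same
    length and in the same slide order (assumed format).
    """
    n_models = len(model_scores)
    merged = []
    for per_slide in zip(*model_scores):
        mean = sum(per_slide) / n_models
        # Round half up, then clip to the valid range [0, n_classes - 1].
        grade = math.floor(mean + 0.5)
        merged.append(min(max(grade, 0), n_classes - 1))
    return merged
```

Averaging continuous scores before rounding lets a confident model outvote a borderline one, which a majority vote over already-rounded grades would not.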
- prostate-cancer-grade-assessment: competition data
- train_cateek: training code from CatEek
- train_drhb: training code from DrHB
- train_rguo: training code from R Guo
- train_xie29: training code from xie29
- prediction: prediction files
For more training details, refer to the individual training folders.
To generate predictions, prepare the environment and run:
sh make_pred.sh

Star this repo if you find it useful!