SMIT CT Lung GTV segmentation model #108

Draft
wants to merge 20 commits into
base: main
34 changes: 34 additions & 0 deletions models/msk_smit_lung_gtv/config/default.yml
@@ -0,0 +1,34 @@
general:
  data_base_dir: /app/data
  version: 1.0.0
  description: Default configuration for SMIT model (dicom to dicom)

execute:
- DicomImporter
- NiftiConverter
- SMITRunner
- DsegConverter
- DataOrganizer

modules:
  DicomImporter:
    source_dir: input_data
    import_dir: sorted_data
    sort_data: true
    meta:
      mod: '%Modality'

  SMITRunner:
    a_min: -500
    a_max: 500
    # Can add other config parameters here

  DsegConverter:
    model_name: SMIT
    body_part_examined: CHEST
    source_segs: nifti:mod=seg
    skip_empty_slices: true

  DataOrganizer:
    targets:
    - dicomseg:mod=seg-->[i:sid]/smit.seg.dcm%
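Note on `a_min` / `a_max`: the pair under `SMITRunner` reads like a CT intensity window in Hounsfield units. The following is a minimal sketch of what such a window commonly does, assuming (the PR does not confirm this) that values are clipped to `[a_min, a_max]` and rescaled linearly to `[0, 1]`. `window_intensity` is a hypothetical helper for illustration and is not part of the model code.

```python
def window_intensity(values, a_min=-500.0, a_max=500.0):
    """Clip intensities to [a_min, a_max] and rescale linearly to [0, 1].

    Assumption: this mirrors a typical CT preprocessing step; the actual
    SMIT pipeline may behave differently.
    """
    if a_max <= a_min:
        raise ValueError("a_max must be greater than a_min")
    span = a_max - a_min
    # Clip each value into the window, then map the window onto [0, 1].
    return [(min(max(v, a_min), a_max) - a_min) / span for v in values]


# Values below -500 map to 0.0, values above 500 map to 1.0.
print(window_intensity([-1000, -500, 0, 500, 1200]))
```

Under this assumption, changing `a_min` or `a_max` in the config would change which tissue densities remain distinguishable to the network.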
24 changes: 24 additions & 0 deletions models/msk_smit_lung_gtv/dockerfiles/Dockerfile
@@ -0,0 +1,24 @@
FROM mhubai/base:latest

# Update authors label
LABEL authors="[email protected],[email protected],[email protected],[email protected]"

RUN apt update

ARG MHUB_MODELS_REPO
#ENV MHUB_MODELS_REPO=https://github.com/locastre/models.git
RUN buildutils/import_mhub_model.sh msk_smit_lung_gtv ${MHUB_MODELS_REPO}

#ENV WORK_DIR=/app/models/msk_smit_lung_gtv/src
ENV WORK_DIR=/app

WORKDIR ${WORK_DIR}/msk_smit_lung_gtv/src
ENV WEIGHTS_URL=https://mskcc.box.com/shared/static/sf7jic4m2dk67413cipbbq6hddvhpj61.gz
ENV CONDA_URL=https://mskcc.box.com/shared/static/d580gfjzzmt26v8klwp8pivb6wafomag.gz
Review comment (Member):
Is this using a conda environment for dependencies?

We should avoid conda in open source projects due to its licensing scheme.

I suggest we install all dependencies with uv (which comes with our base image), e.g., you could provide a pyproject.toml to make src installable or just provide a list with dependencies (like a requirements.txt) and install with uv pip install.

RUN wget ${WEIGHTS_URL} -O weights.tar.gz && tar xvf weights.tar.gz && rm weights.tar.gz
RUN mkdir conda-pack && chmod -R 777 conda-pack
RUN cd conda-pack && wget ${CONDA_URL} -O conda.tar.gz && tar xvf conda.tar.gz && rm conda.tar.gz


ENTRYPOINT ["mhub.run"]
CMD ["--config", "/app/models/msk_smit_lung_gtv/config/default.yml"]
102 changes: 102 additions & 0 deletions models/msk_smit_lung_gtv/meta.json
@@ -0,0 +1,102 @@
{
"id": "",
"name": "msk_smit_lung_gtv",
"title": "CT Lung GTV SMIT Segmentation",
Review comment (Member):
What does SMIT stand for :)?

"summary": {
"description": "GTV segmentation from CT scan",
Review comment (Member):
Could be slightly longer, this is what the user sees on the website.

"inputs": [
{
"label": "Input Image",
"description": "The CT scan of a patient.",
"format": "NIFTI",
"modality": "CT",
"bodypartexamined": "Chest",
"slicethickness": "5mm",
"contrast": true,
"noncontrast": true
}
],
"outputs": [
{
"label": "Segmentation of the lung GTV",
"description": "Segmentation of the lung GTV from NIfTI CT images.",
"type": "Segmentation",
"classes": [
"GTV"
Review comment (Member):
See comment in SMITrunner.py.

]
}
],
"model": {
"architecture": "Swin Transformer based segmentation, self-supervised pretrained with 10k CT data",
Review comment (Member):
Just Swin3D?

"training": "supervised",
"cmpapproach": "3D"
},
"data": {
"training": {
"vol_samples": 377
},
"evaluation": {
"vol_samples": 139
},
"public": true,
"external": false
}
},
"details": {
"name": "SMIT",
"version": "1.0.0",
"devteam": "",
"authors": ["Jue Jiang, Harini Veeraraghavan"],
"type": "it is a 3D Swin transformer based segmentation net",
Review comment (Member):
Just Segmentation?

"date": {
"code": "11.03.2025",
"weights": "11.03.2025",
"pub": "15.July.2024"
Review comment (Member) on lines +52 to +54:
Must be dd.mm.yyyy consistently.
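
A quick way to check the three date fields before pushing is sketched below. It assumes zero-padded day and month are required, which is one reading of "dd.mm.yyyy"; `is_ddmmyyyy` is a hypothetical helper, not an MHub utility.

```python
from datetime import datetime


def is_ddmmyyyy(value):
    """Return True if value is a real calendar date written as dd.mm.yyyy."""
    try:
        parsed = datetime.strptime(value, "%d.%m.%Y")
    except ValueError:
        return False
    # strptime also accepts unpadded input such as "1.3.2025", so require
    # that formatting the parsed date reproduces the input exactly.
    return parsed.strftime("%d.%m.%Y") == value


for field in ("11.03.2025", "15.July.2024"):
    print(field, is_ddmmyyyy(field))
```

The first value passes and the second fails because `%m` expects a numeric month.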

},
"cite": "Jiang, Jue, and Harini Veeraraghavan. Self-supervised pretraining in the wild imparts image acquisition robustness to medical image transformers: an application to lung cancer segmentation. Proceedings of machine learning research 250 (2024): 708.",
"license": {
"code": "GNU General Public License",
"weights": "GNU General Public License"
},
"publications": [
{
"title": "Self-supervised pretraining in the wild imparts image acquisition robustness to medical image transformers: an application to lung cancer segmentation",
"url": "https://openreview.net/pdf?id=G9Te2IevNm"
},
{
"title":"Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)",
"url":"https://link.springer.com/chapter/10.1007/978-3-031-16440-8_53"
}
],
"github": "https://github.com/The-Veeraraghavan-Lab/CTRobust_Transformers.git"
},
"info": {
"use": {
"title": "Intended use",
"text": "This model is intended to be used on CT images (with or without contrast)",
"references": [],
"tables": []

},
"evaluation": {
"title": "Evaluation data",
"text": "To assess the model's segmentation performance in the NSCLC Radiogenomics dataset, we considered that the original input data is a full 3D volume. The model segmented not only the labeled tumor but also tumors that were not manually annotated. Therefore, we evaluated the model based on the manually labeled tumors. After applying the segmentation model, we extracted a 128*128*128 cubic region containing the manual segmentation to assess the model’s performance.",
"references": [],
"tables": ["validation_data_id and DSC value, Validation data is 139 data in the NSCLC Radiogenomics data:https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/, AMC-001:0.023977216,AMC-005:0.84385232,AMC-006:0.844950109,AMC-011:0.885911774,AMC-013:0.786724403,AMC-014:0.628335342,AMC-016:0.708633094,AMC-019:0.791600435,AMC-020:0.882119609,AMC-021:0.834135707,AMC-022:0.688767807,AMC-026:0.801595536,R01-001:0.738330143,R01-002:0.826459454,R01-003:0.724166437,R01-004:0.643794147,R01-005:0.8740986,R01-006:0.816578249,R01-007:0.736460458,R01-008:0.570397112,R01-010:0.901700554,R01-011:0.836905321,R01-012:0.26011073,R01-013:0.760693274,R01-014:0.605606001,R01-015:0.921568729,R01-016:0.748842593,R01-018:0.899090049,R01-019:0.777296896,R01-020:0.858735841,R01-021:0.674536904,R01-022:0.773468955,R01-023:0.851143174,R01-024:0.63791364,R01-025:0.667036976,R01-026:0.867828559,R01-027:0.849266954,R01-028:0.914362163,R01-029:0.796479193,R01-030:0.742501087,R01-031:0.771934798,R01-032:0.546395241,R01-033:0.668465959,R01-034:0.491623711,R01-035:0.861957664,R01-036:0.834929738,R01-039:0.640360767,R01-040:0.843040538,R01-041:0.255910987,R01-042:0.827863856,R01-043:0.358487119,R01-045:0.556983182,R01-046:0.798674399,R01-047:0.875100294,R01-048:0.86953796,R01-049:0.831395349,R01-050:0.736791014,R01-051:0.863763708,R01-052:0.853056081,R01-054:0.890185037,R01-055:0.721171698,R01-056:0.646278311,R01-057:0.819531018,R01-060:0.755168662,R01-061:0.831325301,R01-062:0.621616202,R01-063:0.887817849,R01-064:0.503693754,R01-065:0.900957261,R01-066:0.863084304,R01-067:0.793478908,R01-068:0.706467662,R01-069:0.652887756,R01-070:0.156561781,R01-071:0.794301598,R01-072:0.71873941,R01-073:0.656626506,R01-074:0.686797136,R01-075:0.769153952,R01-076:0.658746901,R01-077:0.515673556,R01-078:0.805609871,R01-079:0.768960982,R01-080:0.465984655,R01-082:0.764202063,R01-083:0.420652174,R01-084:0.679731288,R01-085:0.768992248,R01-086:0.493431042,R01-087:0.488001239,R01-088:0.593974567,R01-089:0.9
33253651,R01-090:0.891955114,R01-091:0.726296959,R01-092:0.557369092,R01-093:0.827921054,R01-094:0.809129332,R01-095:0.713630679,R01-096:0.728150443,R01-097:0.445709849,R01-099:0.786909219,R01-101:0.826549971,R01-103:0.818544249,R01-105:0.800283429,R01-107:0.77209806,R01-109:0.526077667,R01-110:0.497560976,R01-111:0.511410894,R01-112:0.907062065,R01-113:0.44661508,R01-114:0.902224058,R01-115:0.78721174,R01-116:0.561519405,R01-117:0.570513745,R01-118:0.594700407,R01-119:0.61825917,R01-121:0.839111393,R01-122:0.519057377,R01-124:0.594308036,R01-125:0.734829593,R01-126:0.426915017,R01-127:0.191945712,R01-128:0.781319407,R01-129:0.538877476,R01-131:0.544844598,R01-132:0.557804821,R01-133:0.491422557,R01-134:0.431908166,R01-135:0.554446119,R01-136:0.407775136,R01-137:0.248216534,R01-138:0.835014493,R01-139:0.680349407,R01-140:0.858731552,R01-141:0.081384615,R01-142:0.703421009,R01-144:0.657289694,R01-145:0.787378659,R01-146:0.850732088"],
Review comment (Member) on lines +84 to +85:
The format for tables is described here.
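
Before restructuring, the flat `ID:DSC` string can be split into pairs programmatically. The sketch below only illustrates that extraction step. It does not reproduce the MHub table format referenced above, and `parse_dsc_pairs` and the ID pattern are assumptions based on the IDs visible in the diff.

```python
import re

# Matches entries such as "AMC-001:0.023977216" or "R01-001:0.738330143".
PAIR = re.compile(r"\b([A-Z]+\d*-\d+):(\d+(?:\.\d+)?)")


def parse_dsc_pairs(text):
    """Extract {case_id: dsc} pairs from a comma-separated 'ID:value' string."""
    return {case_id: float(score) for case_id, score in PAIR.findall(text)}


# First entries copied from the table string in the diff.
sample = "AMC-001:0.023977216,AMC-005:0.84385232,R01-001:0.738330143"
scores = parse_dsc_pairs(sample)
for case_id, dsc in scores.items():
    print(case_id, dsc)
```

The resulting dictionary can then be written out in whatever row/column layout the table schema requires.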

"limitations": "The model might produce minor false positives but this could be easilily removed by post-processing such as constrain the tumor segmentation only in lung slices"
},
"training": {
"title": "Training data",
"text": "Training data was from 377 data in the TCIA NSCLC-Radiomics data, references: Aerts, H. J. W. L., Wee, L., Rios Velazquez, E., Leijenaar, R. T. H., Parmar, C., Grossmann, P., Carvalho, S., Bussink, J., Monshouwer, R., Haibe-Kains, B., Rietveld, D., Hoebers, F., Rietbergen, M. M., Leemans, C. R., Dekker, A., Quackenbush, J., Gillies, R. J., Lambin, P. (2014). Data From NSCLC-Radiomics (version 4) [Data set]. The Cancer Imaging Archive."
Review comment (Member):
Please add references using the references: [] key, the format is described here.


},
"analyses": {
"title": "Quantitative Analyses",
"text": "DSC was used to compute the accuracy of the model"
},
Review comment (Member) on lines +93 to +96:
Extend.
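
For reference when extending this section: the Dice similarity coefficient (DSC) for a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). A minimal sketch on flattened binary masks follows. `dice_coefficient` is a hypothetical helper, and returning 1.0 when both masks are empty is a convention chosen here, not something the PR states.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# One overlapping voxel, two predicted, one reference: 2*1 / (2+1).
print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```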

"limitations": {
"title": "Limitations",
"text": "The model might produce minor false positives but this could be easilily removed by post-processing such as constrain the tumor segmentation only in lung slices"
}
}
}
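
Note on the post-processing mentioned under limitations: restricting the segmentation to lung slices could look roughly like the sketch below. It assumes a separate lung mask with the same shape is available, which this PR does not provide. `keep_lung_slices` is a hypothetical helper operating on plain nested lists indexed as `[slice][row][column]`.

```python
def keep_lung_slices(seg, lung_mask):
    """Zero out every axial slice of seg whose lung_mask slice is empty."""
    if len(seg) != len(lung_mask):
        raise ValueError("seg and lung_mask must have the same slice count")
    cleaned = []
    for seg_slice, lung_slice in zip(seg, lung_mask):
        has_lung = any(any(row) for row in lung_slice)
        if has_lung:
            # Slice contains lung: keep the segmentation unchanged (copied).
            cleaned.append([list(row) for row in seg_slice])
        else:
            # No lung on this slice: drop any predicted tumor voxels.
            cleaned.append([[0] * len(row) for row in seg_slice])
    return cleaned


seg = [[[1, 0], [0, 0]], [[0, 1], [1, 0]]]
lung = [[[0, 0], [0, 0]], [[1, 1], [0, 0]]]
cleaned = keep_lung_slices(seg, lung)
print(cleaned)
```

Here the first slice has no lung voxels, so its single predicted voxel is removed, while the second slice is kept as is.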
4 changes: 4 additions & 0 deletions models/msk_smit_lung_gtv/src/README.md
@@ -0,0 +1,4 @@
## References
[1] Jiang, Jue, and Harini Veeraraghavan. "Self-supervised pretraining in the wild imparts image acquisition robustness to medical image transformers: an application to lung cancer segmentation." In Medical Imaging with Deep Learning. 2024.

[2] Jiang, Jue, Neelam Tyagi, Kathryn Tringale, Christopher Crane, and Harini Veeraraghavan. "Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)." In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 556-566. Cham: Springer Nature Switzerland, 2022.
49 changes: 49 additions & 0 deletions models/msk_smit_lung_gtv/src/bash_run_SMIT_Segmentation.sh
@@ -0,0 +1,49 @@
#!/bin/bash
#
#
# Input arguments:
# $1 data_dir
# $2 save_folder
# $3 load_weight_name
# $4 input_nifti

source ./conda-pack/bin/activate

#Use SMIT
use_smit=1 #Use SMIT not SMIT+

#Data folder; it must contain a 'data.json' file
data_dir="$1"

#Segmentation output folder
save_folder="$2"

#Some configurations for the model, no need to change
#Trained weight
load_weight_name="$3"

input_nifti="$4"

a_min=-500
a_max=500
space_x=1.5
space_y=1.5
space_z=2.0
out_channels=2

# Quote variables so paths containing spaces are passed as single arguments
python utils/gen_data_json.py "$input_nifti"

python run_segmentation.py \
    --roi_x 128 \
    --roi_y 128 \
    --roi_z 128 \
    --space_x "$space_x" \
    --space_y "$space_y" \
    --space_z "$space_z" \
    --data_dir "$data_dir" \
    --out_channels "$out_channels" \
    --load_weight_name "$load_weight_name" \
    --save_folder "$save_folder" \
    --a_min="$a_min" \
    --a_max="$a_max" \
    --use_smit "$use_smit"