Unnecessary warning when using next(iter(SubjectsLoader)) #1261

lorinczszabolcs · 2025-01-17T15:37:53Z

Is there an existing issue for this?

I have searched the existing issues

Related issues, but not completely the same: #1247 #1250

Bug summary

I have some custom code where I iterate through my dataloader with next(iter(loader)), and it shows a warning message, which I believe is unnecessary, as I have changed my code to use a tio.SubjectsLoader.

Code for reproduction

import torchio as tio
from torch.utils.data import Dataset, DataLoader
import numpy as np
import tempfile
import nibabel as nib

# Temporary directory to simulate input files
temp_dir = tempfile.TemporaryDirectory()

# Create example 3D image and label NIfTI files
def create_nifti(data, filename):
    affine = np.eye(4)  # Identity matrix for affine
    nifti_img = nib.Nifti1Image(data, affine)
    filepath = f"{temp_dir.name}/{filename}"
    nib.save(nifti_img, filepath)
    return filepath

# Simulated 3D image and label data
image_data = np.random.rand(10, 10, 10).astype(np.float32)
label_data = np.random.randint(0, 2, (10, 10, 10)).astype(np.uint8)

# Save them as NIfTI files
image_path = create_nifti(image_data, "image1.nii.gz")
label_path = create_nifti(label_data, "label1.nii.gz")

# Create a list of subjects with file paths
subject_list = [
    {
        'image': image_path,
        'label': label_path,
    }
]

# Define a transform pipeline
transform = tio.Compose([
    tio.ToCanonical(),  # Standardize orientation
    tio.RescaleIntensity((0, 1)),  # Normalize intensity to 0-1
])

# Custom Dataset that implements __getitem__
class CustomTorchDataset(tio.SubjectsDataset):
    def __init__(self, subject_list, transform=None):
        self.subject_list = subject_list
        self.transform = transform

    def __len__(self):
        return len(self.subject_list)

    def __getitem__(self, idx):
        subject_dict = self.subject_list[idx]
        subject = tio.Subject(
            image=tio.ScalarImage(subject_dict['image']),
            label=tio.LabelMap(subject_dict['label']),
        )
        if self.transform:
            subject = self.transform(subject)
        return {
            'image': subject['image'].data,
            'label': subject['label'].data,
        }

# Initialize the custom dataset
dataset = CustomTorchDataset(subject_list, transform=transform)

# Create a DataLoader
loader = tio.SubjectsLoader(dataset, batch_size=1, shuffle=False)

# Fetch a batch
batch = next(iter(loader))

# Print the batch keys and their shapes
for key, value in batch.items():
    print(f"{key}: {value.shape}")

# Cleanup temporary files
temp_dir.cleanup()

Actual outcome

image: torch.Size([1, 1, 10, 10, 10])
label: torch.Size([1, 1, 10, 10, 10])

Error messages

/usr/local/lib/python3.11/dist-packages/torchio/data/image.py:248: UserWarning: Using TorchIO images without a torchio.SubjectsLoader in PyTorch >= 2.3 might have unexpected consequences, e.g., the collated batches will be instances of torchio.Subject with 5D images. Replace your PyTorch DataLoader with a torchio.SubjectsLoader so that the collated batch becomes a dictionary, as expected. See https://github.com/fepegar/torchio/issues/1179 for more context about this issue.
  warnings.warn(message, stacklevel=1)

Expected outcome

I expect to not receive such a warning, given that I am using a SubjectsLoader and SubjectsDataset.

System info

Platform:   Linux-5.15.0-122-generic-x86_64-with-glibc2.35
TorchIO:    0.20.1
PyTorch:    2.4.1
SimpleITK:  2.4.1 (ITK 5.4)
NumPy:      1.26.4
Python:     3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0]


fepegar · 2025-01-27T21:50:22Z

Thanks for reporting, @lorinczszabolcs. Unfortunately I won't be able to pick this up any time soon. Feel free to give it a try.

nicoloesch · 2025-01-30T05:56:40Z

Hi @lorinczszabolcs,

I am pretty sure the warning is caused by how your subjects are instantiated. Generally speaking, a tio.SubjectsDataset should be instantiated with a list of tio.Subject instances rather than the list of dicts you provide with

# Create a list of subjects with file paths
subject_list = [
    {
        'image': image_path,
        'label': label_path,
    }
]

The expected usage is outlined in the official documentation.

The warning itself is triggered by the call batch = next(iter(loader)). Calling iter() on the tio.SubjectsLoader creates either a torch.utils.data.dataloader._SingleProcessDataLoaderIter or a torch.utils.data.dataloader._MultiProcessingDataLoaderIter, depending on num_workers. This is just how PyTorch distinguishes between loading data in a single process and loading it in parallel (the distinction does not matter here, as both make torchio.utils.in_torch_loader return True in your case). Because you create the tio.Subject dynamically in __getitem__ of your dataset, the call stack already contains the _SingleProcessDataLoaderIter/_MultiProcessingDataLoaderIter by the time next() calls CustomTorchDataset.__getitem__. The check that runs when any tio.Image is instantiated then sees the loader iterator on the stack and emits the warning. This is expected behaviour: it informs you of a potential mismatch with what PyTorch >= 2.3 expects.
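To make the stack argument concrete, here is a minimal sketch that uses only the standard library. It is not TorchIO's or PyTorch's actual code: called_from_class, FakeLoader, FakeLoaderIter, LazyDataset and EagerDataset are all hypothetical names, and the real check in torchio.utils.in_torch_loader may be implemented differently. It only illustrates why a check made inside __getitem__ sees the loader iterator on the stack, while the same check made before the loader exists does not.

```python
import inspect


def called_from_class(class_names):
    """Return True if any frame on the call stack has a `self` of a named class."""
    for frame_info in inspect.stack(0):
        instance = frame_info.frame.f_locals.get("self")
        if instance is not None and type(instance).__name__ in class_names:
            return True
    return False


class FakeLoaderIter:
    """Hypothetical stand-in for PyTorch's private loader iterator classes."""

    def __init__(self, dataset):
        self.dataset = dataset
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.dataset):
            raise StopIteration
        # The dataset's __getitem__ runs while this frame is on the stack.
        item = self.dataset[self.index]
        self.index += 1
        return item


class FakeLoader:
    """Hypothetical loader: iter() hands out a fresh FakeLoaderIter."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __iter__(self):
        return FakeLoaderIter(self.dataset)


class LazyDataset:
    """Runs the check inside __getitem__, like building images per item."""

    def __len__(self):
        return 1

    def __getitem__(self, index):
        return called_from_class({"FakeLoaderIter"})


class EagerDataset:
    """Runs the check once, up front, before any loader iterator exists."""

    def __init__(self):
        self.items = [called_from_class({"FakeLoaderIter"})]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]


lazy_via_next = next(iter(FakeLoader(LazyDataset())))
lazy_via_for = [flag for flag in FakeLoader(LazyDataset())]
eager_via_next = next(iter(FakeLoader(EagerDataset())))
print(lazy_via_next, lazy_via_for, eager_via_next)
```

In this sketch the lazy dataset is flagged whether it is consumed with next(iter(...)) or with a plain for loop, since both go through the same iterator. The eager dataset is not flagged, because its check ran before any iterator was on the stack.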

If you instead create your tio.Subject instances first (their images can be given file paths, which are then loaded accordingly), put them in a list, and pass that list to tio.SubjectsDataset on instantiation rather than building the subjects dynamically in __getitem__, the warning vanishes.

See the following MWE:

import torch
import torchio as tio

def generate_subject(tensor_size: tuple[int, ...]) -> tio.Subject:
    subject = tio.Subject(
        image=tio.ScalarImage(tensor=torch.randint(0, 2**8, tensor_size)),
        label=tio.LabelMap(tensor=torch.randint(0, 2, tensor_size)),
    )
    return subject

def debug():
    tensor_size = (1, 240, 240, 155)
    num_subjects = 2

    transform = tio.Compose([
        tio.ToCanonical(),  # Standardize orientation
        tio.RescaleIntensity((0, 1)),  # Normalize intensity to 0-1
    ])


    dataset = tio.SubjectsDataset([generate_subject(tensor_size=tensor_size) for _ in range(num_subjects)], transform=transform)
    loader = tio.SubjectsLoader(dataset, batch_size=1)
    for batch in loader:
        print(batch)
        break


if __name__ == '__main__':
    debug()

Hope this helps and we can close the bug.

Edit: Code formatting for better readability

lorinczszabolcs · 2025-02-01T16:54:37Z

Thanks for investigating!

Yes, the issue can be closed. I somehow missed the part that we need to instantiate with a list of Subjects. Thanks for the heads-up!

lorinczszabolcs added the bug label Jan 17, 2025

lorinczszabolcs mentioned this issue Jan 17, 2025 in: Unnecessary warning when using TorchIO inside Dataset __getitem__ without returning Subject objects #1247

lorinczszabolcs closed this as completed Feb 1, 2025
