Unnecessary warning when using next(iter(SubjectsLoader)) #1261

Closed
1 task done
lorinczszabolcs opened this issue Jan 17, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@lorinczszabolcs

lorinczszabolcs commented Jan 17, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Related issues, but not completely the same: #1247 #1250

Bug summary

I have some custom code where I iterate through my dataloader with next(iter(loader)), and it shows a warning message, which I believe is unnecessary, as I have changed my code to use a tio.SubjectsLoader.

Code for reproduction

import torchio as tio
from torch.utils.data import Dataset, DataLoader
import numpy as np
import tempfile
import nibabel as nib

# Temporary directory to simulate input files
temp_dir = tempfile.TemporaryDirectory()

# Create example 3D image and label NIfTI files
def create_nifti(data, filename):
    affine = np.eye(4)  # Identity matrix for affine
    nifti_img = nib.Nifti1Image(data, affine)
    filepath = f"{temp_dir.name}/{filename}"
    nib.save(nifti_img, filepath)
    return filepath

# Simulated 3D image and label data
image_data = np.random.rand(10, 10, 10).astype(np.float32)
label_data = np.random.randint(0, 2, (10, 10, 10)).astype(np.uint8)

# Save them as NIfTI files
image_path = create_nifti(image_data, "image1.nii.gz")
label_path = create_nifti(label_data, "label1.nii.gz")

# Create a list of subjects with file paths
subject_list = [
    {
        'image': image_path,
        'label': label_path,
    }
]

# Define a transform pipeline
transform = tio.Compose([
    tio.ToCanonical(),  # Standardize orientation
    tio.RescaleIntensity((0, 1)),  # Normalize intensity to 0-1
])

# Custom Dataset that implements __getitem__
class CustomTorchDataset(tio.SubjectsDataset):
    def __init__(self, subject_list, transform=None):
        self.subject_list = subject_list
        self.transform = transform

    def __len__(self):
        return len(self.subject_list)

    def __getitem__(self, idx):
        subject_dict = self.subject_list[idx]
        subject = tio.Subject(
            image=tio.ScalarImage(subject_dict['image']),
            label=tio.LabelMap(subject_dict['label']),
        )
        if self.transform:
            subject = self.transform(subject)
        return {
            'image': subject['image'].data,
            'label': subject['label'].data,
        }

# Initialize the custom dataset
dataset = CustomTorchDataset(subject_list, transform=transform)

# Create a DataLoader
loader = tio.SubjectsLoader(dataset, batch_size=1, shuffle=False)

# Fetch a batch
batch = next(iter(loader))

# Print the batch keys and their shapes
for key, value in batch.items():
    print(f"{key}: {value.shape}")

# Cleanup temporary files
temp_dir.cleanup()

Actual outcome

image: torch.Size([1, 1, 10, 10, 10])
label: torch.Size([1, 1, 10, 10, 10])

Error messages

/usr/local/lib/python3.11/dist-packages/torchio/data/image.py:248: UserWarning: Using TorchIO images without a torchio.SubjectsLoader in PyTorch >= 2.3 might have unexpected consequences, e.g., the collated batches will be instances of torchio.Subject with 5D images. Replace your PyTorch DataLoader with a torchio.SubjectsLoader so that the collated batch becomes a dictionary, as expected. See https://github.com/fepegar/torchio/issues/1179 for more context about this issue.
  warnings.warn(message, stacklevel=1)

Expected outcome

I expect to not receive such a warning, given that I am using a SubjectsLoader and SubjectsDataset.

System info

Platform:   Linux-5.15.0-122-generic-x86_64-with-glibc2.35
TorchIO:    0.20.1
PyTorch:    2.4.1
SimpleITK:  2.4.1 (ITK 5.4)
NumPy:      1.26.4
Python:     3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0]
@fepegar
Member

fepegar commented Jan 27, 2025

Thanks for reporting, @lorinczszabolcs. Unfortunately I won't be able to pick this up any time soon. Feel free to give it a try.

@nicoloesch
Contributor

nicoloesch commented Jan 30, 2025

Hi @lorinczszabolcs,

I am pretty sure the warning arises from the unexpected way your subjects are instantiated. Generally speaking, a tio.SubjectsDataset should be instantiated with a list of tio.Subject instances rather than the list of dicts you provide:

# Create a list of subjects with file paths
subject_list = [
    {
        'image': image_path,
        'label': label_path,
    }
]

The expected usage is outlined in the official documentation.

The warning is triggered by the call batch = next(iter(loader)). Internally, iterating over the tio.SubjectsLoader creates either a torch.utils.data.dataloader._SingleProcessDataLoaderIter or a torch.utils.data.dataloader._MultiProcessingDataLoaderIter, depending on num_workers. This is how PyTorch distinguishes between loading data in a single process and in parallel. The distinction does not matter in your case, as torchio.utils.in_torch_loader returns True for both. Because you create each tio.Subject dynamically in the __getitem__ of your dataset, the call stack already contains the _SingleProcessDataLoaderIter/_MultiProcessingDataLoaderIter at that point: next(iter(loader)) has created the iterator under the hood, and the iterator then calls your CustomTorchDataset.__getitem__. The check performed when any tio.Image is instantiated then flags this and emits the warning. This is expected behaviour and informs you of a potential mismatch with what PyTorch >= 2.3 expects.
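To illustrate the mechanism described above, here is a minimal sketch of a stack-based check in plain Python. It is not TorchIO's actual code: called_from_class, FakeLoaderIter, LazyDataset and EagerDataset are hypothetical names made up for this example, and the real torchio.utils.in_torch_loader may be implemented differently. The sketch only shows why an object created inside __getitem__ is seen as "inside a loader iterator", while an object created up front is not.

```python
import inspect


def called_from_class(class_names: set[str]) -> bool:
    """Return True if any frame on the call stack is a method of a named class.

    Hypothetical helper: walks the stack and looks at each frame's `self`.
    """
    frame = inspect.currentframe()
    while frame is not None:
        owner = frame.f_locals.get("self")
        if owner is not None and type(owner).__name__ in class_names:
            return True
        frame = frame.f_back
    return False


class FakeLoaderIter:
    """Stand-in for PyTorch's internal loader iterator classes (assumption)."""

    def __init__(self, dataset):
        self.dataset = dataset

    def fetch(self, index):
        # The dataset's __getitem__ runs while this method is on the stack
        return self.dataset[index]


class LazyDataset:
    """Creates its item inside __getitem__, like the code in the report."""

    def __getitem__(self, index):
        return called_from_class({"FakeLoaderIter"})


class EagerDataset:
    """Creates its items up front, before any iterator exists."""

    def __init__(self):
        self.items = [called_from_class({"FakeLoaderIter"})]

    def __getitem__(self, index):
        return self.items[index]


lazy_flag = FakeLoaderIter(LazyDataset()).fetch(0)    # True: iterator is on the stack
eager_flag = FakeLoaderIter(EagerDataset()).fetch(0)  # False: item was built earlier
```

In this toy setup the lazy dataset is flagged and the eager one is not, which mirrors the difference between building subjects in __getitem__ and passing ready-made subjects to the dataset.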

If you create your tio.Subject instances first (you can give them paths, and the images will then be loaded accordingly), put them in a list, and pass that list to the tio.SubjectsDataset upon instantiation instead of creating them dynamically in __getitem__, the warning vanishes.

See the following MWE:

import torch
import torchio as tio

def generate_subject(tensor_size: tuple[int, ...]) -> tio.Subject:
    subject = tio.Subject(
        image=tio.ScalarImage(tensor=torch.randint(0, 2**8, tensor_size)),
        label=tio.LabelMap(tensor=torch.randint(0, 2, tensor_size)),
    )
    return subject

def debug():
    tensor_size = (1, 240, 240, 155)
    num_subjects = 2

    transform = tio.Compose([
        tio.ToCanonical(),  # Standardize orientation
        tio.RescaleIntensity((0, 1)),  # Normalize intensity to 0-1
    ])

    # Build the subjects up front, outside the loader
    subjects = [generate_subject(tensor_size=tensor_size) for _ in range(num_subjects)]
    dataset = tio.SubjectsDataset(subjects, transform=transform)
    loader = tio.SubjectsLoader(dataset, batch_size=1)
    for batch in loader:
        print(batch)
        break


if __name__ == '__main__':
    debug()

Hope this helps and we can close the bug.

Edit: Code formatting for better readability

@lorinczszabolcs
Author

Thanks for investigating!

Yes, the issue can be closed. I somehow missed the part that we need to instantiate with a list of Subjects. Thanks for the heads-up!
