Description
I am looking to leverage torch.nn.parallel.DistributedDataParallel, per the documentation you have written, to integrate dual 3090s into a workflow. I am using the automatic repo, and after trying several approaches to update the code below to match what is in the torch wiki, I have been unable to switch the current CUDA device to follow the methodology outlined in your documentation and in Stack Overflow examples. Do you have any recommendations on what I can read or experiment with to test further? I know Meta has been releasing some wonderful tools, which I have been using to support the Stable Diffusion project, so I hope this is in your purview. If it is not, feel free to ignore.
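For context, here is the minimal per-process device selection pattern I have been trying to adapt from the DDP docs. This is only a sketch: torchrun supplies LOCAL_RANK, and the Linear module is a stand-in, not the Stable Diffusion model from the automatic repo.

import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun --nproc_per_node=2 spawns one process per GPU and sets LOCAL_RANK.
    local_rank = int(os.environ["LOCAL_RANK"])
    # Bind this process to its GPU before creating the process group or any
    # tensors, so torch.cuda.current_device() reports the intended card.
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")

    model = torch.nn.Linear(16, 16).to(local_rank)  # placeholder module
    ddp_model = DDP(model, device_ids=[local_rank])

    x = torch.randn(8, 16, device=local_rank)
    ddp_model(x).sum().backward()  # gradients are all-reduced across both GPUs

    dist.destroy_process_group()


if __name__ == "__main__":
    main()

Launched with torchrun --nproc_per_node=2 script.py, each of the two processes stays pinned to its own 3090.

Here is the allocator code I have been modifying: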
from typing import Union

import torch
from torch.types import Device
# _get_device_index is the private helper the upstream torch.cuda.memory module uses.
from torch.cuda._utils import _get_device_index


def caching_allocator_alloc(size, device: Union[Device, int] = None, stream=None):
r"""Performs a memory allocation using the CUDA memory allocator.
Memory is allocated for a given device and a stream, this
function is intended to be used for interoperability with other
frameworks. Allocated memory is released through
:func:`~torch.cuda.caching_allocator_delete`.
Args:
size (int): number of bytes to be allocated.
device (torch.device or int, optional): selected device. If it is
``None`` the default CUDA device is used.
stream (torch.cuda.Stream or int, optional): selected stream. If is ``None`` then
the default stream for the selected device is used.
.. note::
See :ref:`cuda-memory-management` for more details about GPU memory
management.
"""
    if device is None:
        device = torch.cuda.current_device()
    # Normalize the requested device (torch.device or int) to an integer index;
    # passing a literal 0 here would pin every allocation to the first GPU.
    device = _get_device_index(device)
    if stream is None:
        stream = torch.cuda.current_stream(device)
    if isinstance(stream, torch.cuda.streams.Stream):
        stream = stream.cuda_stream
    if not isinstance(stream, int):
        raise TypeError('Invalid type for stream argument, must be '
                        '`torch.cuda.Stream` or `int` representing a pointer '
                        'to an existing stream')
    with torch.cuda.device(device):
        return torch._C._cuda_cudaCachingAllocator_raw_alloc(size, stream)
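For reference, this is the call pattern I would expect given that docstring, as a sketch only; the 1 MiB size and the cuda:1 index are arbitrary choices for a dual-GPU box:

import torch

device = torch.device("cuda:1")  # the second 3090
# Ask the caching allocator for 1 MiB on that device; the returned value is a
# raw pointer suitable for handing to another framework.
ptr = torch.cuda.caching_allocator_alloc(1024 * 1024, device=device)
try:
    pass  # interoperate with the other framework here
finally:
    # Return the memory to the caching allocator when the other side is done.
    torch.cuda.caching_allocator_delete(ptr)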