Skip to content

Optimizing PyTorch Model Training by Wrapping Memory Mapped Tensors on an Nvidia GPU with TensorDict.

Notifications You must be signed in to change notification settings

Kuenoz/pytorch_training_optimization_using_tensordict_memory_mapping

Repository files navigation

PyTorch Training Optimization Using TensorDict Memory Mapping 🚀

GitHub repo size GitHub stars GitHub issues

Description

Welcome to the PyTorch Training Optimization Using TensorDict Memory Mapping repository! This project focuses on enhancing the training efficiency of PyTorch models by utilizing memory-mapped tensors on Nvidia GPUs with TensorDict. By leveraging advanced memory management techniques, this repository aims to provide a streamlined approach to model training, making it faster and more resource-efficient.

Table of Contents

Features

  • Optimized Memory Usage: Efficiently manage memory with memory-mapped tensors.
  • GPU Acceleration: Leverage the power of Nvidia GPUs for faster computations.
  • Easy Integration: Seamlessly integrate with existing PyTorch workflows.
  • Enhanced Performance: Reduce training time and improve model performance.
  • Documentation: Comprehensive guides and examples to get you started.

Getting Started

To begin using this repository, you need to set up your environment. Follow the steps below to get started quickly.

Prerequisites

Ensure you have the following installed:

  • Python 3.6 or later
  • PyTorch 1.8 or later
  • Nvidia GPU with CUDA support
  • TensorDict

Installation

Clone the repository to your local machine:

git clone https://github.com/Kuenoz/pytorch_training_optimization_using_tensordict_memory_mapping.git
cd pytorch_training_optimization_using_tensordict_memory_mapping

Install the required packages:

pip install -r requirements.txt

Usage

After setting up your environment, you can start using the features of this repository. Below is a simple example to get you started.

Basic Example

import torch
from tensordict import TensorDict

# Create a memory-mapped tensor
data = torch.randn(1000, 1000, dtype=torch.float32)
mapped_tensor = data.storage().new_shared(data.numel()).view(data.size())

# Create a TensorDict
tensor_dict = TensorDict({"input": mapped_tensor}, batch_size=[1000])

# Use the tensor in your model
# model = YourModel()
# output = model(tensor_dict["input"])

For more detailed examples and advanced usage, please refer to the documentation.

Contributing

We welcome contributions to this project! If you would like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/YourFeature).
  3. Make your changes and commit them (git commit -m 'Add new feature').
  4. Push to the branch (git push origin feature/YourFeature).
  5. Open a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For questions or feedback, please reach out to the project maintainer:

Releases

For the latest updates and versions, please visit our Releases section. Download the necessary files and execute them to start optimizing your PyTorch training.

Acknowledgments

We would like to thank the PyTorch community for their contributions and support. Special thanks to the developers of TensorDict for their amazing work on memory management.

Conclusion

This repository provides a powerful tool for optimizing PyTorch model training using memory-mapped tensors on Nvidia GPUs. With easy integration and efficient memory management, it aims to enhance your machine learning projects significantly. Explore the features, contribute, and help us improve this tool further!

For more information, please check our Releases section.