spartow/distributed-ml-training

Distributed Machine Learning Training System

Web-based distributed machine learning training system with real-time visualization and multi-GPU support

Features

  • Web-based Training Interface

    • Real-time training progress monitoring
    • Interactive performance metrics visualization
    • Export capabilities for charts and data
    • System information display
  • Model Support

    • MLP (Multi-Layer Perceptron)
    • CNN (Convolutional Neural Network)
  • Training Methods

    • Single GPU training
    • Distributed training across multiple GPUs
  • Dataset Support

    • Synthetic dataset generation for testing
    • Extensible data loader system
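
To illustrate what "distributed training across multiple GPUs" usually means in a data-parallel setup, here is a minimal sketch of how a dataset can be split so that each GPU process sees a different shard. The function name `shard_indices` and its parameters are hypothetical and not taken from this repository. The padding-and-striding scheme is similar in spirit to PyTorch's `DistributedSampler` (without shuffling), and the code in `distributed/` may work differently.

```python
import math
from typing import List


def shard_indices(num_samples: int, world_size: int, rank: int) -> List[int]:
    """Return the sample indices one worker (rank) processes in an epoch.

    Hypothetical helper for illustration only.
    """
    if num_samples <= 0 or world_size <= 0:
        raise ValueError("num_samples and world_size must be positive")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")

    # Round up so every rank receives the same number of samples.
    per_rank = math.ceil(num_samples / world_size)
    total = per_rank * world_size

    # Pad by wrapping around to the start of the dataset.
    indices = [i % num_samples for i in range(total)]

    # Strided slice: rank 0 takes 0, W, 2W, ...; rank 1 takes 1, W+1, ...
    return indices[rank:total:world_size]


if __name__ == "__main__":
    for r in range(4):
        print(r, shard_indices(num_samples=10, world_size=4, rank=r))
```

Equal shard sizes matter because data-parallel workers synchronize gradients at every step; a worker with fewer batches would leave the others waiting. The cost is that a few samples are seen twice per epoch when the dataset size is not divisible by the number of workers.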

Project Structure

.
├── benchmarks/          # Performance benchmarking tools
├── datasets/            # Dataset loaders and utilities
├── distributed/         # Distributed training implementation
├── models/              # Neural network model definitions
├── results/             # Training results and exports
├── scripts/             # Training scripts
└── templates/           # Web UI HTML templates
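
As a rough idea of how an extensible loader system with a synthetic dataset could be organized, the sketch below uses a simple registry pattern in pure Python. All names here (`DATASETS`, `register_dataset`, `make_synthetic`, `load_dataset`) are assumptions made for illustration; the actual code in `datasets/` may be structured differently and most likely returns PyTorch tensors rather than Python lists.

```python
import random
from typing import Callable, Dict, List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, class label)

# Hypothetical registry mapping a dataset name to a factory function.
DATASETS: Dict[str, Callable[..., List[Sample]]] = {}


def register_dataset(name: str):
    """Decorator that adds a dataset factory to the registry under `name`."""
    def decorator(factory):
        DATASETS[name] = factory
        return factory
    return decorator


@register_dataset("synthetic")
def make_synthetic(num_samples: int = 100, num_features: int = 8,
                   num_classes: int = 3, seed: int = 0) -> List[Sample]:
    """Generate a reproducible toy classification dataset."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        label = rng.randrange(num_classes)
        # Center each class at its label value so the classes are separable.
        features = [rng.gauss(float(label), 1.0) for _ in range(num_features)]
        data.append((features, label))
    return data


def load_dataset(name: str, **kwargs) -> List[Sample]:
    """Look up a registered dataset by name and build it."""
    if name not in DATASETS:
        raise KeyError(f"unknown dataset {name!r}; available: {sorted(DATASETS)}")
    return DATASETS[name](**kwargs)
```

With this layout, adding a new dataset means writing one decorated function; nothing that calls `load_dataset` has to change.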

Installation

  1. Clone the repository:
     git clone https://github.com/spartow/distributed-ml-training.git
     cd distributed-ml-training
  2. Install dependencies:
     pip install -r requirements.txt

Usage

  1. Start the web interface:
     python web_ui.py
  2. Open your browser and navigate to http://localhost:5000

  3. Configure your training parameters:

    • Select model architecture (MLP/CNN)
    • Choose dataset
    • Set training hyperparameters
    • Select training method
  4. Monitor training progress and export results as needed
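
If the page does not load, it can help to confirm that the server is actually listening before looking elsewhere. The following is a small sketch using only the Python standard library; the helper name `web_ui_is_up` is hypothetical, and the default URL is the one given in the steps above.

```python
import urllib.error
import urllib.request


def web_ui_is_up(url: str = "http://localhost:5000", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server responds at `url`, False otherwise."""
    # Bypass any configured HTTP proxy, since the server runs locally.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    print("web UI reachable:", web_ui_is_up())
```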

Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • CUDA (optional, for GPU support)
  • Flask (for web interface)
  • Additional dependencies are listed in requirements.txt
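
A quick way to verify these requirements before launching is a short environment check. The sketch below is not part of the repository; `check_environment` is a hypothetical helper that relies only on the standard library plus, when present, `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util
import sys
from typing import Dict


def check_environment() -> Dict[str, bool]:
    """Report which of the listed requirements are satisfied."""
    report = {
        "python>=3.8": sys.version_info >= (3, 8),
        "torch installed": importlib.util.find_spec("torch") is not None,
        "flask installed": importlib.util.find_spec("flask") is not None,
        "torch>=2.0": False,
        "cuda available": False,  # optional; CPU-only runs remain possible
    }
    if report["torch installed"]:
        import torch

        # Version strings look like "2.1.0+cu118"; compare the major part.
        major = int(torch.__version__.split(".")[0])
        report["torch>=2.0"] = major >= 2
        report["cuda available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name:18s} {'OK' if ok else 'missing'}")
```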

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request
