A modern web interface for running Microsoft's BitNet models efficiently on CPU. This project provides a user-friendly way to download, manage, and run inference with 1-bit quantized language models.
## Features

### Easy Model Management
- One-click downloads from Hugging Face
- Direct model uploads (GGUF format)
- Real-time download progress tracking
- Popular models quick access
### Efficient Inference
- CPU-optimized inference
- Support for 1-bit quantized models
- Conversation mode
- Adjustable parameters (temperature, max tokens)
### Modern UI/UX
- Clean, responsive interface
- Dark/Light theme support
- Real-time status updates
- System logs viewer
## Getting Started

### Prerequisites
- Python 3.8 or higher
- pip package manager
- CPU with AVX2 support (recommended)
### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/mindscope-world/bitnet-inference.git
   cd bitnet-inference
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the application:

   ```bash
   python app.py
   ```
The web interface will be available at `http://localhost:8000`.
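Once the server is up, requests can be sent to it programmatically as well as through the UI. The sketch below builds a JSON inference request with the standard library; the `/generate` route and payload field names are assumptions for illustration — the actual routes are defined in `app.py` and may differ.

```python
import json
import urllib.request

# Hypothetical payload and endpoint -- field names and the /generate route
# are assumptions, not the project's documented API.
payload = json.dumps({
    "prompt": "What is 1-bit quantization?",
    "temperature": 0.7,
    "max_tokens": 256,
}).encode()

req = urllib.request.Request(
    "http://localhost:8000/generate",  # assumed route
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, the request would be sent with:
#   urllib.request.urlopen(req).read()
print(req.full_url, req.method)
```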
### Downloading a Model

1. Navigate to the "Download Model" tab
2. Enter a model name or Hugging Face path (e.g., `microsoft/BitNet-b1.58-2B-4T`)
3. Click "Download Model"
4. Monitor the download progress in real time
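The real-time progress reporting above boils down to streaming the file in chunks and emitting percent-complete updates. This is a minimal, self-contained sketch of that pattern (not the project's actual implementation; the function name, `report` callback, and chunk size are arbitrary):

```python
import io

def stream_with_progress(src, dest, total_size, report, chunk_size=8192):
    """Copy `src` to `dest` in chunks, reporting percent complete after each."""
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dest.write(chunk)
        done += len(chunk)
        report(round(100 * done / total_size))

# Simulated download: 32 KiB of in-memory data standing in for a GGUF file.
data = b"\0" * 32768
progress = []
stream_with_progress(io.BytesIO(data), io.BytesIO(), len(data), progress.append)
print(progress)  # e.g. [25, 50, 75, 100]
```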
### Running Inference

1. Ensure a model is loaded
2. Enter your prompt in the text area
3. Adjust generation parameters if needed:
   - Temperature (0.1 - 1.5)
   - Max Tokens (10 - 2048)
   - Conversation Mode (on/off)
4. Click "Generate"
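A backend typically clamps user-supplied settings to the ranges the UI exposes. The helper below illustrates that for the documented ranges; the function name and behavior are assumptions for illustration, not part of the project's API:

```python
def clamp_params(temperature, max_tokens):
    """Clamp generation settings to the UI's documented ranges.

    Illustrative helper only -- name and behavior are assumptions,
    not the project's actual validation code.
    """
    temperature = min(max(temperature, 0.1), 1.5)   # Temperature: 0.1 - 1.5
    max_tokens = min(max(int(max_tokens), 10), 2048)  # Max Tokens: 10 - 2048
    return temperature, max_tokens

print(clamp_params(2.0, 5000))  # out-of-range values are pulled back in
print(clamp_params(0.7, 256))   # in-range values pass through unchanged
```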
## Supported Models

The application supports various BitNet models, including:
- BitNet-b1.58-2B-4T
- bitnet_b1_58-large
- bitnet_b1_58-3B
## Project Structure

```
bitnet-inference/
├── app/
│   ├── static/
│   │   ├── imgs/
│   │   ├── css/
│   │   └── js/
│   ├── templates/
│   └── models/
├── app.py
├── setup_env.py
├── simple_model_server.py
└── requirements.txt
```
## Technical Features

- FastAPI Backend: Handles model management and inference requests
- Async Downloads: Non-blocking model downloads with progress tracking
- Fallback System: Automatic switching between optimized and standard inference
- Theme System: Dynamic theme switching with system preference detection
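The fallback system described above amounts to attempting the optimized path first and degrading gracefully if it fails. A minimal sketch of that pattern, with plain callables standing in for the two inference backends (the names here are illustrative, not the project's real API):

```python
def generate_with_fallback(prompt, optimized, standard):
    """Try the optimized backend first; fall back to the standard one.

    `optimized` and `standard` are any callables taking a prompt --
    stand-ins for the two inference paths, not the project's real API.
    """
    try:
        return optimized(prompt)
    except RuntimeError:
        # Optimized path unavailable (e.g. missing AVX2 build): degrade gracefully.
        return standard(prompt)

def broken_backend(prompt):
    """Simulates an optimized backend that is not available on this machine."""
    raise RuntimeError("optimized kernel not available")

print(generate_with_fallback("hello", broken_backend, lambda p: f"standard: {p}"))
```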
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
## Contributors

- @mindscope-world - Project Lead & Main Developer
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Microsoft BitNet - For the original BitNet implementation
- FastAPI - For the excellent web framework
- Hugging Face - For model hosting and transformers library
## Support

For support, please open an issue in the GitHub repository or contact @mindscope-world.
## Roadmap

- Add batch processing support
- Implement model fine-tuning interface
- Add more visualization options
- Support for custom quantization
- API documentation interface
- Docker deployment support
Made with ❤️ by @mindscope-world