BitNet Inference Web UI 🧠

A modern web interface for running Microsoft's BitNet models efficiently on CPU. This project provides a user-friendly way to download, manage, and run inference with 1-bit quantized language models.

[Screenshot: BitNet Inference UI]

🌟 Features

  • Easy Model Management

    • One-click downloads from Hugging Face
    • Direct model uploads (GGUF format)
    • Real-time download progress tracking
    • Quick access to popular models
  • Efficient Inference

    • CPU-optimized inference
    • Support for 1-bit quantized models
    • Conversation mode
    • Adjustable parameters (temperature, max tokens)
  • Modern UI/UX

    • Clean, responsive interface
    • Dark/Light theme support
    • Real-time status updates
    • System logs viewer
  • Technical Features

    • FastAPI backend
    • Async model downloads
    • Automatic fallback mechanisms
    • Progress monitoring system

🚀 Getting Started

Prerequisites

  • Python 3.8 or higher
  • pip package manager
  • CPU with AVX2 support (recommended)
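
To check whether your CPU exposes AVX2 before installing, here is a quick Linux-only sketch that parses /proc/cpuinfo (on macOS, `sysctl -a | grep -i avx2` gives the same answer):

```python
# Quick AVX2 check (Linux only: reads the CPU flag list from /proc/cpuinfo).
from pathlib import Path

flags = Path("/proc/cpuinfo").read_text()
print("AVX2 supported" if "avx2" in flags else "AVX2 not detected")
```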

Installation

  1. Clone the repository:
git clone https://github.com/mindscope-world/bitnet-inference.git
cd bitnet-inference
  2. Install dependencies:
pip install -r requirements.txt
  3. Run the application:
python app.py

The web interface will be available at http://localhost:8000
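
Once the server is running, a quick way to confirm it is reachable (assumes the `requests` package is installed):

```python
import requests

# The root path serves the web UI; a 200 status means the server is up.
resp = requests.get("http://localhost:8000")
print(resp.status_code)
```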

💻 Usage

Downloading Models

  1. Navigate to the "Download Model" tab
  2. Enter a model name or HuggingFace path (e.g., microsoft/BitNet-b1.58-2B-4T)
  3. Click "Download Model"
  4. Monitor the download progress in real-time
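
If you prefer to fetch weights outside the UI, the same models can be pulled with `huggingface_hub` directly. A minimal sketch using the model path from step 2 (the local directory is an assumption based on the project layout; adjust to taste):

```python
# Download a full model snapshot from Hugging Face into a local directory.
from huggingface_hub import snapshot_download

path = snapshot_download(
    repo_id="microsoft/BitNet-b1.58-2B-4T",          # model path from step 2
    local_dir="app/models/BitNet-b1.58-2B-4T",       # assumed layout; see Architecture below
)
print(f"Model files saved to {path}")
```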

Running Inference

  1. Ensure a model is loaded
  2. Enter your prompt in the text area
  3. Adjust generation parameters if needed:
    • Temperature (0.1 - 1.5)
    • Max Tokens (10 - 2048)
    • Conversation Mode (on/off)
  4. Click "Generate"
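
The same parameters can be sent to the backend programmatically. A hedged sketch with Python's `requests`; the `/generate` endpoint name and payload fields here are assumptions mirroring the UI controls above, not a documented API:

```python
import requests

# Hypothetical endpoint and field names, mirroring the UI controls above.
payload = {
    "prompt": "Explain 1-bit quantization in one sentence.",
    "temperature": 0.7,    # 0.1 - 1.5
    "max_tokens": 256,     # 10 - 2048
    "conversation": False,
}
resp = requests.post("http://localhost:8000/generate", json=payload)
print(resp.json())
```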

Model Compatibility

The application supports various BitNet models, including:

  • BitNet-b1.58-2B-4T
  • bitnet_b1_58-large
  • bitnet_b1_58-3B

🛠️ Technical Details

Architecture

bitnet-inference/
├── app/
│   ├── static/
│   │   ├── imgs/
│   │   ├── css/
│   │   └── js/
│   ├── templates/
│   └── models/
├── app.py
├── setup_env.py
├── simple_model_server.py
└── requirements.txt

Key Components

  • FastAPI Backend: Handles model management and inference requests
  • Async Downloads: Non-blocking model downloads with progress tracking (see the sketch after this list)
  • Fallback System: Automatic switching between optimized and standard inference
  • Theme System: Dynamic theme switching with system preference detection
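
As an illustration of the async-download pattern, here is a minimal FastAPI sketch that starts a download in the background and exposes its progress. It is illustrative only, not the project's actual code; the endpoint paths and the simulated download loop are assumptions:

```python
# Minimal sketch of non-blocking downloads with progress tracking.
# Illustrative only; endpoint paths and field names are assumptions.
import asyncio

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
progress = {}  # model name -> fraction complete (0.0 to 1.0)

async def download_model(name: str) -> None:
    # Stand-in for streaming file chunks from a remote host.
    for step in range(1, 101):
        await asyncio.sleep(0.1)
        progress[name] = step / 100

@app.post("/download/{name}")
async def start_download(name: str, background_tasks: BackgroundTasks):
    progress[name] = 0.0
    background_tasks.add_task(download_model, name)
    return {"status": "started", "model": name}

@app.get("/progress/{name}")
async def get_progress(name: str):
    return {"model": name, "progress": progress.get(name, 0.0)}
```

Because the download runs as a background task, the POST returns immediately and the UI can poll the progress endpoint without blocking other requests.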

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Contributors

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support

For support, please open an issue in the GitHub repository or contact @mindscope-world.

🔮 Future Plans

  • Add batch processing support
  • Implement model fine-tuning interface
  • Add more visualization options
  • Support for custom quantization
  • API documentation interface
  • Docker deployment support

Made with ❤️ by @mindscope-world
