🌟 这是 VisioFirm 的中文化和优化版本
基于原项目进行全面中文化,为中文用户提供更好的使用体验
本项目是对 VisioFirm 的中文化和二次开发版本,保持了原有的所有功能特性,同时针对中文用户进行了全面优化。
- 基于: OschAI/VisioFirm
- 许可证: Apache License 2.0(与原项目相同)
- 维护状态: 积极维护和更新
- 🖥️ 完整的中文界面: 所有按钮、菜单、对话框都已中文化
- 📝 中文提示信息: 错误消息、状态提示、帮助文本全部中文化
- 🎯 本地化交互: 符合中文用户使用习惯的界面设计
- 📚 中文文档: 完整的中文使用说明和API文档
- 🎨 界面优化: 针对中文文本长度优化了布局
- 🔧 功能增强: 保持原有功能的同时提升用户体验
- 📱 响应式设计: 更好的移动端中文显示效果
- ✅ 图像标注: 支持手动和AI辅助标注
- ✅ 标注类型:
- ✅ 轴对齐边界框 (Axis-aligned Bounding Boxes)
- ✅ 定向边界框 (Oriented Bounding Boxes)
- ✅ 多边形分割 (Polygon Segmentation)
- ✅ AI预标注:
- ✅ YOLOv10 目标检测
- ✅ SAM2 图像分割
- ✅ Grounding DINO 零样本检测
- ✅ 交互式编辑: 画布编辑、撤销/重做、缩放功能
- ✅ 快捷键支持: 完整的键盘快捷键系统
- ✅ 项目创建和管理: 创建、编辑、删除项目
- ✅ 多项目管理: 支持多个项目同时存在
- ✅ 分类管理: 支持自定义多个类别
- ✅ 图像管理: 批量上传、导入、删除图像
- ✅ 数据导出:
- ✅ YOLO格式 (.txt)
- ✅ COCO格式 (.json)
- ✅ 带掩码的图像导出
- ✅ 模型训练: 完整的YOLO模型训练工作流
- ✅ 训练配置: 自定义训练参数(epochs, batch size等)
- ✅ 训练监控: 实时训练进度和指标可视化
- ✅ 训练历史: 保存和管理训练任务历史
- ✅ 模型导出: 训练完成后导出模型文件
- ✅ 数据集CRUD: 创建、读取、更新、删除数据集
- ✅ 数据集分析: 数据集统计信息和分析
- ✅ 数据集下载: 从URL下载公开数据集
- ✅ 数据集导入: 从本地文件导入数据集
- ✅ 数据集关联: 将数据集关联到项目
- ✅ 用户注册和登录: 完整的用户认证系统
- ✅ 用户资料管理: 编辑用户信息
- ✅ 密码重置: 密码找回功能
- ✅ 会话管理: Flask-Login会话管理
- ✅ 完整中文化: 界面、提示、文档全面中文化
- ✅ 代码规范: CSS/JS文件分离,前后端分离架构
- ✅ API优化: RESTful API设计,JSON响应格式
- ✅ 性能优化: 多线程上传、内存优化、缓存机制
- ✅ 错误处理: 完善的错误处理和日志系统
- ❌ 视频标注: 视频帧标注和时间序列标注
- ❌ 分类任务: 图像分类标签(仅支持检测和分割)
- ❌ 关键点标注: 人体姿态、关键点检测标注
- ❌ 3D标注: 3D目标检测和分割标注
- ❌ 多用户协作: 团队协作和任务分配
- ❌ 权限管理: 角色权限和访问控制
- ❌ 标注审核: 标注质量审核工作流
- ❌ 评论系统: 标注讨论和反馈
- ❌ 更多ML框架: mmdetection、detectron2集成
- ❌ 自定义模型: 用户自定义AI模型集成
- ❌ 插件系统: 可扩展的插件架构
- ❌ API认证: Token-based API认证
- ❌ 移动端APP: iOS/Android原生应用
- ❌ 云端部署: Docker容器化部署方案
- ❌ Web实时协作: WebSocket实时同步
- ❌ 自动化流水线: CI/CD集成
- ❌ 批量处理: 批量标注和处理工具
- ❌ 数据增强预览: 训练前数据增强效果预览
- ❌ 模型版本管理: 模型版本控制和回滚
- 🚧 视频标注支持: 视频帧标注和时间序列标注
- 🚧 多用户协作: 团队协作和任务分配
- 🚧 分类任务支持: 图像分类标注功能
- 🚧 云端部署方案: Docker和云平台部署指南
- 🚧 更多ML框架: mmdetection、detectron2集成
- 🚧 API认证: Token-based API认证系统
- 🚧 移动端适配: 响应式设计优化和移动端支持
- 🚧 插件系统: 可扩展的插件架构
- 🚧 Web实时协作: WebSocket实时同步
- 🚧 自定义模型集成: 用户自定义AI模型支持
- 🚧 企业级功能: 权限管理、审计日志、数据备份
- 🚧 自动化流水线: CI/CD集成和自动化工具
| 功能模块 | 完成度 | 状态 |
|---|---|---|
| 核心标注功能 | 95% | ✅ 基本完成 |
| 项目管理 | 100% | ✅ 完整 |
| 训练模块 | 90% | ✅ 可用 |
| 数据集管理 | 85% | ✅ 基本完成 |
| 用户系统 | 90% | ✅ 基本完成 |
| 视频标注 | 0% | ❌ 未开始 |
| 多用户协作 | 0% | ❌ 未开始 |
| 分类任务 | 0% | ❌ 未开始 |
| 云端部署 | 0% | ❌ 未开始 |
总体完成度: 约 75% - 核心功能完整,高级功能待开发
欢迎贡献以下功能:
- 🎯 高优先级: 视频标注、多用户协作
- 📊 中优先级: 分类任务、更多ML框架集成
- 🔧 低优先级: 插件系统、移动端APP
详见 CONTRIBUTING.md 了解如何参与贡献。
VisioFirm/
├── README.md # 项目主要说明
├── CHANGELOG.md # 更新日志
├── LICENSE # 许可证文件
├── setup.py # 安装配置
├── requirements.txt # 依赖包列表
├── run.py # 启动脚本
├── visiofirm/ # 主要源代码
│ ├── models/ # 数据模型
│ ├── routes/ # 路由模块
│ ├── static/ # 静态资源
│ ├── templates/ # HTML模板
│ └── utils/ # 工具函数
├── tests/ # 测试文件
├── docs/ # 项目文档
├── examples/ # 示例文件
└── scripts/ # 脚本工具
VisioFirm 可以通过 pip 从 GitHub 安装(PyPI 即将推出!)
需要 Python 3.10+。
pip install -U visiofirm开发环境安装(从克隆的仓库):
git clone https://github.com/xiongfazhan/VisioFirm-Enhanced.git
cd VisioFirm-Enhanced
pip install -e .使用单个命令启动 VisioFirm—它会自动启动本地 Web 服务器并在浏览器中打开。
visiofirm- 创建新项目并上传图像
- 定义类别(例如 "car", "person")
- 对于易于检测的对象运行 AI 预标注(选择模型:YOLO、Grounding DINO)
- 在交互式编辑器中完善标签
- 导出标注的数据集
VisioFirm 应用使用缓存目录在本地存储设置。
Important
A new release has just dropped. VisioFirm 0.2.0 brings enhancements including bug fixes for image import, improved frontend loading, Cloud/SSH support for downloading images and saving annotations, SAM2 worker offloading for better performance, optimized SAM2-Auto annotation for faster computing, and thread optimizations for image uploading. This version builds on the stable GroundingDINO dependency from 0.1.4.
- Cloud/SSH Integration: download images from cloud storage or SSH servers and save annotations remotely (using local absolute paths).
- Enhanced Image Handling: Fixed bugs in image import, faster frontend loading, and multi-threaded uploading for efficiency.
- SAM2 Optimizations: Worker-based offloading and improved auto-annotation for rapid, high-performance segmentation in the browser. Though you may experience a first timelaps for the first label generation the subsequent annotators are instant.
Note
If you prefer the HF transformers-based library (pre-0.2.0), install from the main branch via pip install visiofirm==0.1.0.
VisioFirm is an open-source, AI-powered image annotation tool designed to accelerate labeling for computer vision tasks like object detection, oriented bounding boxes (OBB), and segmentation. Built for speed and simplicity, it leverages state-of-the-art models for semi-automated pre-annotations, allowing you to focus on refining rather than starting from scratch. Whether you're preparing datasets for YOLO, SAM, or custom models, VisioFirm streamlines your workflow with a intuitive web interface and powerful backend.
Perfect for researchers, data scientists, and ML engineers handling large image datasets—get high-quality annotations in minutes, not hours!
Unlike other annotation tool, this one is majoraly focused on CV tasks annotation detection (normal and oriented bounding box) and segmentation.
- AI-Driven Pre-Annotation: Automatically detect and segment objects using YOLOv10, SAM2, and Grounding DINO—saving up to 80% of manual effort.
- Multi-Task Support: Handles bounding boxes, oriented bounding boxes, and polygon segmentation in one tool.
- array-Based Editing: Interactive canvas for precise adjustments, with real-time SAM-powered segmentation in the browser.
- Offline-Friendly: Models download automatically (or pre-fetch for offline use), with SQLite backend for local projects.
- Extensible & Open-Source: Customize with your own ultralytics models or integrate into pipelines—contributions welcome!
- SAM2-base webgpu: Insta-drawing of annotations via SAM2 with worker offloading and auto-annotation for faster computing!
- Semi-Automated Labeling: Kickstart annotations with AI models like YOLO for detection, SAM for segmentation, and Grounding DINO for zero-shot object grounding.
- Flexible Annotation Types:
- Axis-aligned bounding boxes for standard detection.
- Oriented bounding boxes for rotated objects (e.g., aerial imagery).
- Polygon segmentation for precise boundaries.
- Interactive Frontend: Draw, edit, and refine labels on a responsive canvas. Click-to-segment with browser-based SAM for instant masks.
- Project Management: Create, manage, and export projects with SQLite database storage. Support for multiple classes and images.
- Export Formats: Seamless exports to YOLO, COCO, or custom formats for training.
- Performance Optimizations: Cluster overlapping detections, simplify contours, and filter by confidence for clean datasets.
- Cloud/SSH Integration: Seamlessly download images from cloud storage or SSH servers and save annotations remotely.
- Enhanced Image Handling: Fixed bugs in image import, faster frontend loading, and multi-threaded uploading for efficiency.
- SAM2 Optimizations: Worker-based offloading and improved auto-annotation for rapid, high-performance segmentation in the browser.
- Cross-Platform: Runs locally on Linux, macOS, or Windows via Python— no cloud dependency.
- ✅ Image Annotation: Manual and AI-assisted annotation
- ✅ Annotation Types:
- ✅ Axis-aligned Bounding Boxes
- ✅ Oriented Bounding Boxes
- ✅ Polygon Segmentation
- ✅ AI Pre-annotation:
- ✅ YOLOv10 Object Detection
- ✅ SAM2 Image Segmentation
- ✅ Grounding DINO Zero-shot Detection
- ✅ Interactive Editing: Canvas editing, undo/redo, zoom functionality
- ✅ Keyboard Shortcuts: Complete keyboard shortcut system
- ✅ Project CRUD: Create, edit, delete projects
- ✅ Multi-project Support: Multiple projects simultaneously
- ✅ Class Management: Custom multiple classes
- ✅ Image Management: Batch upload, import, delete images
- ✅ Data Export:
- ✅ YOLO format (.txt)
- ✅ COCO format (.json)
- ✅ Images with masks export
- ✅ Model Training: Complete YOLO model training workflow
- ✅ Training Configuration: Custom training parameters (epochs, batch size, etc.)
- ✅ Training Monitoring: Real-time training progress and metrics visualization
- ✅ Training History: Save and manage training task history
- ✅ Model Export: Export model files after training completion
- ✅ Dataset CRUD: Create, read, update, delete datasets
- ✅ Dataset Analysis: Dataset statistics and analysis
- ✅ Dataset Download: Download public datasets from URLs
- ✅ Dataset Import: Import datasets from local files
- ✅ Dataset Linking: Link datasets to projects
- ✅ User Registration & Login: Complete user authentication system
- ✅ User Profile Management: Edit user information
- ✅ Password Reset: Password recovery functionality
- ✅ Session Management: Flask-Login session management
- ✅ Full Chinese Localization: Interface, prompts, documentation fully localized
- ✅ Code Standards: CSS/JS file separation, frontend-backend separation architecture
- ✅ API Optimization: RESTful API design, JSON response format
- ✅ Performance Optimization: Multi-threaded upload, memory optimization, caching mechanism
- ✅ Error Handling: Comprehensive error handling and logging system
- ❌ Video Annotation: Video frame annotation and time-series annotation
- ❌ Classification Tasks: Image classification labels (only detection and segmentation supported)
- ❌ Keypoint Annotation: Human pose, keypoint detection annotation
- ❌ 3D Annotation: 3D object detection and segmentation annotation
- ❌ Multi-user Collaboration: Team collaboration and task assignment
- ❌ Permission Management: Role permissions and access control
- ❌ Annotation Review: Annotation quality review workflow
- ❌ Comment System: Annotation discussion and feedback
- ❌ More ML Frameworks: mmdetection, detectron2 integration
- ❌ Custom Models: User-defined AI model integration
- ❌ Plugin System: Extensible plugin architecture
- ❌ API Authentication: Token-based API authentication
- ❌ Mobile Apps: iOS/Android native applications
- ❌ Cloud Deployment: Docker containerization deployment solution
- ❌ Web Real-time Collaboration: WebSocket real-time synchronization
- ❌ Automation Pipeline: CI/CD integration
- ❌ Batch Processing: Batch annotation and processing tools
- ❌ Data Augmentation Preview: Data augmentation effect preview before training
- ❌ Model Version Management: Model version control and rollback
- 🚧 Video Annotation Support: Video frame annotation and time-series annotation
- 🚧 Multi-user Collaboration: Team collaboration and task assignment
- 🚧 Classification Task Support: Image classification annotation functionality
- 🚧 Cloud Deployment Solutions: Docker and cloud platform deployment guides
- 🚧 More ML Frameworks: mmdetection, detectron2 integration
- 🚧 API Authentication: Token-based API authentication system
- 🚧 Mobile Adaptation: Responsive design optimization and mobile support
- 🚧 Plugin System: Extensible plugin architecture
- 🚧 Web Real-time Collaboration: WebSocket real-time synchronization
- 🚧 Custom Model Integration: User-defined AI model support
- 🚧 Enterprise Features: Permission management, audit logs, data backup
- 🚧 Automation Pipeline: CI/CD integration and automation tools
| Feature Module | Completion | Status |
|---|---|---|
| Core Annotation | 95% | ✅ Mostly Complete |
| Project Management | 100% | ✅ Complete |
| Training Module | 90% | ✅ Usable |
| Dataset Management | 85% | ✅ Mostly Complete |
| User System | 90% | ✅ Mostly Complete |
| Video Annotation | 0% | ❌ Not Started |
| Multi-user Collaboration | 0% | ❌ Not Started |
| Classification Tasks | 0% | ❌ Not Started |
| Cloud Deployment | 0% | ❌ Not Started |
Overall Completion: Approximately 75% - Core features complete, advanced features pending
Contributions are welcome for the following features:
- 🎯 High Priority: Video annotation, multi-user collaboration
- 📊 Medium Priority: Classification tasks, more ML framework integration
- 🔧 Low Priority: Plugin system, mobile apps
See CONTRIBUTING.md for details on how to contribute.
VisioFirm is easy to install via pip from GitHub (PyPI coming soon!).
It was tested with Python 3.10+.
pip install -U visiofirmFor development or editable install (from a cloned repo):
git clone https://github.com/OschAI/VisioFirm.git
cd VisioFirm
pip install -e .Launch VisioFirm with a single command—it auto-starts a local web server and opens in your browser.
visiofirm- Create a new project and upload images.
- Define classes (e.g., "car", "person").
- For easy-to-detect object run AI pre-annotation (select model: YOLO, Grounding DINO).
- Refine labels in the interactive editor.
- Export your annotated dataset.
The VisioFirm app uses cache directories to store settings locally.
VisioFirm uses advanced models for initial labels:
- YOLOv10: Fast detection.
- SAM2: Precise segmentation.
- Grounding DINO: Zero-shot detection via text prompts.
Models auto-download on first run (stored in current dir or cache). For offline prep:
The web interface (Flask + JS) supports hotkeys, undo/redo, and zoom. Edit static/js/sam.js for browser SAM tweaks.
From the dashboard, export to JSON, TXT (YOLO format), or images with masks.
- Issues: Report bugs or request features here.
- Discord: Coming soon—star the repo for updates!
- Roadmap: Multi-user support, video annotation, custom model integration.
Apache 2.0 - See LICENSE for details.
Built by Safouane El Ghazouali for the research community. Star the repo if it helps your workflow! 🚀
@misc{ghazouali2025visiofirm,
title={VisioFirm: Cross-Platform AI-assisted Annotation Tool for Computer Vision},
author={Safouane El Ghazouali and Umberto Michelucci},
year={2025},
eprint={2509.04180},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
SOON:
- Documentation website
- Discord community
- Paper - detailing the implementation and AI preannotation pipeline
- Classification
Futur:
- Support for video annotation
- Support for more ML frameworks (such as mmdetection and detectron2)


