ViNET is an offline, real-time object detection and depth estimation iOS app designed to empower visually impaired users with instant, multi-modal feedback. By combining lightweight CoreML models with a minimalist interface, ViNET delivers object type, position, and precise distance information—via speech and haptic cues—without ever requiring a network connection.
- **Real-Time Object Detection**: detects 80 object categories (e.g., pedestrian, vehicle, obstacle) entirely on-device with YOLOv8s.
- **Accurate Depth Estimation**: estimates meter-level distances using MiDaS_Small with CoreML optimizations.
- **Multi-Modal Feedback**: speech cues such as "A table is 75.4 centimeters to the right," alongside haptics.
- **Offline & Privacy-First**: no internet needed; all video frames and model weights stay on the user's device.
- **Battery-Friendly**: maintains ~30 FPS preview while keeping CPU usage below 70%.
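To make the pipeline concrete, here is a minimal Swift sketch of how on-device detection and spoken feedback can fit together with Vision and AVFoundation. It assumes the Xcode-generated `YOLOv8s` model class; the `DetectionPipeline` type and its confidence threshold are illustrative, not taken from the app's sources.

```swift
import Vision
import CoreML
import AVFoundation

// Illustrative sketch, not the app's actual implementation.
final class DetectionPipeline {
    private let synthesizer = AVSpeechSynthesizer()
    private var request: VNCoreMLRequest?

    init() throws {
        let config = MLModelConfiguration()
        config.computeUnits = .all  // prefer the Neural Engine when available
        // `YOLOv8s` is the class Xcode auto-generates from YOLOv8s.mlmodel.
        let model = try VNCoreMLModel(for: YOLOv8s(configuration: config).model)
        request = VNCoreMLRequest(model: model) { [weak self] request, _ in
            guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
            for object in results where object.confidence > 0.5 {
                let label = object.labels.first?.identifier ?? "object"
                self?.speak("A \(label) is ahead.")
            }
        }
    }

    // Feed each camera frame (CVPixelBuffer) through the CoreML request.
    func process(_ pixelBuffer: CVPixelBuffer) {
        guard let request = request else { return }
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
        try? handler.perform([request])
    }

    private func speak(_ text: String) {
        synthesizer.speak(AVSpeechUtterance(string: text))
    }
}
```

Running everything inside the Vision completion handler keeps the whole loop on-device, which is what makes the offline, privacy-first behavior possible.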
1. Clone this repository: `git clone https://github.com/xddlzx/ViNET.git`
2. Open `ViNET.xcodeproj` in Xcode (14.0+).
3. Connect an iPhone (iOS 15.0+) or select a compatible simulator.
4. Build & run the app: Product → Run (⌘R).
5. Grant camera access when prompted.
6. Tap the Start button to begin live object detection and listen for spoken cues.
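The camera-access prompt only appears if the app's `Info.plist` declares a usage description. A sketch of the required entry (the description text here is illustrative):

```xml
<!-- Info.plist: required for the camera permission prompt. -->
<key>NSCameraUsageDescription</key>
<string>ViNET uses the camera to detect objects and estimate their distance.</string>
```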
```text
ViNET/
├── Models/
│   ├── YOLOv8s.mlmodel
│   └── MiDaS_Small.mlmodel
├── Sources/
│   ├── CameraSession.swift
│   ├── ObjectDetector.swift
│   ├── DepthEstimator.swift
│   └── VoiceReporter.swift
├── Resources/
│   ├── LaunchScreen.storyboard
│   └── Assets.xcassets
└── ViNET.xcodeproj
```
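As one example of how these components might work, here is a hedged sketch of reading a distance from the MiDaS_Small output in the spirit of `DepthEstimator.swift`. MiDaS predicts *relative inverse* depth, so the `scale` and `shift` used to map its output to meters are assumptions that would need calibration; the function name and signature are hypothetical.

```swift
import CoreML

// Illustrative only; not the app's actual DepthEstimator code.
struct DepthReading {
    let meters: Float
}

func estimateCenterDepth(from output: MLMultiArray,
                         scale: Float = 1.0,   // hypothetical calibration factor
                         shift: Float = 0.0) -> DepthReading? {
    // MiDaS_Small emits a 2-D inverse-depth map; sample the center pixel
    // as a simple proxy for the distance of whatever is straight ahead.
    guard output.shape.count >= 2 else { return nil }
    let height = output.shape[output.shape.count - 2].intValue
    let width = output.shape[output.shape.count - 1].intValue
    let centerIndex = (height / 2) * width + (width / 2)
    let inverseDepth = output[centerIndex].floatValue
    guard inverseDepth > 0 else { return nil }
    // Inverse depth -> meters under the assumed linear calibration.
    return DepthReading(meters: scale / inverseDepth + shift)
}
```

A production estimator would likely average a window of pixels around each detected bounding box rather than a single center sample.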
- Fork the repo and create a new branch for your feature/bugfix.
- Implement your changes and write unit tests where applicable.
- Open a Pull Request against main and describe your changes.
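For contributors unsure what a unit test here might look like, a minimal XCTest sketch follows; the `distancePhrase` helper and its expected output are hypothetical, shown only to illustrate the testing style.

```swift
import XCTest

// Illustrative test; `distancePhrase` is a hypothetical formatting helper.
final class VoiceReporterTests: XCTestCase {
    func testDistancePhraseRoundsToOneDecimal() {
        let phrase = distancePhrase(label: "table", centimeters: 75.42, side: "right")
        XCTAssertEqual(phrase, "A table is 75.4 centimeters to the right.")
    }
}
```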
This project is licensed under the MIT License. See LICENSE for details.