Detect and tag objects in video, and examine how running object detection in parallel on multiple image patches can enable detection of smaller objects in the overall image without reducing the input resolution.
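The patch-based idea above can be sketched as a tiling step: split each frame into overlapping fixed-size patches, run the detector on every patch at full resolution, and map patch-local boxes back to frame coordinates. The function names, patch size, and overlap below are illustrative assumptions, not part of any specific library.

```python
import numpy as np

def tile_image(image, patch_size=640, overlap=0.2):
    """Split an image into overlapping square patches.

    Returns (patch, (x0, y0)) pairs; the offsets let patch-local
    detections be mapped back to full-image coordinates. Overlap
    reduces the chance of an object being cut at a patch boundary.
    """
    h, w = image.shape[:2]
    stride = max(1, int(patch_size * (1 - overlap)))
    ys = list(range(0, max(h - patch_size, 0) + 1, stride))
    xs = list(range(0, max(w - patch_size, 0) + 1, stride))
    # ensure the bottom and right edges of the frame are covered
    if ys[-1] + patch_size < h:
        ys.append(h - patch_size)
    if xs[-1] + patch_size < w:
        xs.append(w - patch_size)
    return [(image[y0:y0 + patch_size, x0:x0 + patch_size], (x0, y0))
            for y0 in ys for x0 in xs]

def to_global(box, offset):
    """Map a patch-local box (x1, y1, x2, y2) to frame coordinates."""
    x0, y0 = offset
    x1, y1, x2, y2 = box
    return (x1 + x0, y1 + y0, x2 + x0, y2 + y0)
```

Because each patch is processed independently, the per-patch detector calls can run in parallel (e.g. batched on the GPU) before the results are merged.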
The Berkeley Deep Drive (BDD) dataset is one of the largest and most diverse video datasets for autonomous vehicles.
- The dataset contains 100,000 video clips collected from more than 50,000 rides covering New York, the San Francisco Bay Area, and other regions.
- The dataset contains diverse scene types such as city streets, residential areas, and highways.
- Furthermore, the videos were recorded in diverse weather conditions at different times of the day.
Object detection and segmentation are among the most challenging problems in computer vision: the goal is to identify all target objects in an image and determine their categories and positions. Numerous approaches have been proposed, mainly drawing on computer vision and deep learning methods. In this project, we aim to build a model that detects and segments multiple objects in a moving video, covering tasks such as image tagging, lane detection, drivable area segmentation, road object detection, semantic segmentation, instance segmentation, multi-object detection tracking, multi-object segmentation tracking, domain adaptation, and imitation learning.
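When detections from multiple patches (or multiple frames) are combined, duplicate boxes on the same object must be merged. A standard way to do this is non-maximum suppression (NMS); the plain-Python sketch below is illustrative, assuming boxes in `(x1, y1, x2, y2)` format with per-box confidence scores, and is not tied to any particular framework.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression.

    Keeps the highest-scoring box, drops any remaining box whose IoU
    with an already-kept box exceeds `thresh`; returns kept indices.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep
```

In practice one would use the framework's built-in version (e.g. `torchvision.ops.nms`), but the logic is the same.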
TensorFlow, PyTorch, Keras
FastAPI, Heroku, Streamlit, Google Cloud (hosting, compute, and APIs)
- Project Technical Report
- Project Presentation with desired outcomes
- Summaries of 3 research papers