A collection of core computer vision tasks implemented as part of a university project. Covers demosaicing, HDR imaging, face recognition, object detection, writer identification, and 3D box estimation using Kinect data.

ArchitNK/computer_vision

CoreVisionTasks

This repository contains my implementation of a multi-part computer vision course project completed during my university studies. The tasks and datasets were provided by the instructors, while all implementation work was done independently unless otherwise noted.

The project spans a range of classical and modern computer vision techniques, from 3D estimation using depth data to face recognition and object detection.

Figure 1: Box estimation result
Figure 2: Enhanced HDR image result
Figure 3: Face recognition result
Figure 4: Object detection result (balloon)

🧩 Project Tasks (in Course Order)

1. 3D Box Estimation from Kinect Data

  • Goal: Estimate real-world box dimensions using Kinect amplitude and depth images.
  • Key Features:
    • RANSAC-based plane detection (floor and box top)
    • Morphological filtering and mask creation
    • Largest connected component extraction
    • Corner detection and computation of box length, width, and height
  • Tools: numpy, scipy, matplotlib, OpenCV, Jupyter Notebook
  • Implementation
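
As a rough illustration of the RANSAC plane detection step, here is a minimal NumPy sketch. It assumes the depth image has already been converted into an `(N, 3)` array of 3D points. The function name `fit_plane_ransac` and all parameter defaults are hypothetical and are not taken from the project code.

```python
import numpy as np


def fit_plane_ransac(points, n_iters=200, threshold=0.01, seed=0):
    """Fit a plane normal . x + d = 0 to an (N, 3) point array with RANSAC.

    Returns (normal, d, inlier_mask). Names and defaults are illustrative
    assumptions, not values from the project.
    """
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(n_iters):
        # Sample three distinct points and build the plane through them.
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        length = np.linalg.norm(normal)
        if length < 1e-12:
            continue  # degenerate (collinear) sample, try again
        normal = normal / length
        d = -normal @ p0
        # Unsigned point-to-plane distance for every point.
        dist = np.abs(points @ normal + d)
        mask = dist < threshold
        # Keep the model supported by the most inliers so far.
        if mask.sum() > best_mask.sum():
            best_mask, best_model = mask, (normal, d)
    if best_model is None:
        raise ValueError("no non-degenerate sample found")
    return best_model[0], best_model[1], best_mask
```

One plausible way to use this for two planes is to fit the floor first, remove its inliers, and run the function again on the remaining points to find the box top. The box height would then be the distance between the two roughly parallel planes.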

2. Image Demosaicing & HDR Imaging

  • Goal: Convert raw sensor data into enhanced RGB images using demosaicing and HDR techniques.
  • Key Features:
    • Bayer pattern analysis and sensor linearity validation
    • Demosaicing (basic and advanced)
    • Gamma correction and gray-world white balancing
    • HDR fusion using exposure stacking and iCAM06 tone mapping
  • Tools: rawpy, numpy, matplotlib, Python
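
The gray-world white balancing and gamma correction steps can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the input is a demosaiced, linear RGB array of shape `(H, W, 3)` scaled to roughly `[0, 1]`, and the gamma value of 2.2 is a common default rather than a value confirmed by the project.

```python
import numpy as np


def gray_world_white_balance(rgb):
    """Scale each channel so that every channel mean equals the global mean."""
    rgb = np.asarray(rgb, dtype=np.float64)
    channel_means = rgb.reshape(-1, 3).mean(axis=0)
    # Gain per channel; the small floor avoids division by zero.
    gains = channel_means.mean() / np.maximum(channel_means, 1e-12)
    return rgb * gains


def gamma_correct(rgb, gamma=2.2):
    """Apply a simple power-law gamma to linear values clipped to [0, 1]."""
    return np.clip(rgb, 0.0, 1.0) ** (1.0 / gamma)
```

The order matters here: white balancing is applied to linear data, and gamma correction comes last because it is a display-oriented, nonlinear step.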

3. Writer Retrieval using VLAD Encoding

  • Goal: Identify writers from handwritten historical document images.
  • Dataset: ICDAR17 Historical Writer Identification (WI) Dataset
  • Key Features:
    • Codebook generation with MiniBatchKMeans
    • VLAD encoding and power normalization
    • Exemplar SVM-based classification
    • PCA whitening and Multi-VLAD (bonus experiments)
  • Tools: scikit-learn, numpy, OpenCV
  • Implementation
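
To make the VLAD encoding and power normalization steps concrete, here is a minimal NumPy sketch. It assumes local descriptors of shape `(N, D)` and a codebook of shape `(K, D)`, for example the `cluster_centers_` of a fitted `MiniBatchKMeans`. The function name `vlad_encode` and the exponent of 0.5 are illustrative assumptions, not details from the project.

```python
import numpy as np


def vlad_encode(descriptors, centers, power=0.5):
    """Encode local descriptors as one L2-normalized VLAD vector of size K * D."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    centers = np.asarray(centers, dtype=np.float64)
    # Hard-assign every descriptor to its nearest codebook center.
    sq_dist = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = sq_dist.argmin(axis=1)
    # Sum the residuals (descriptor minus center) per cluster.
    vlad = np.zeros_like(centers)
    for k in range(len(centers)):
        assigned = descriptors[nearest == k]
        if len(assigned):
            vlad[k] = (assigned - centers[k]).sum(axis=0)
    vlad = vlad.ravel()
    # Power normalization dampens bursty components, then L2-normalize.
    vlad = np.sign(vlad) * np.abs(vlad) ** power
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad
```

The resulting vectors can be compared with a dot product, since they are unit length, or passed on to a classifier.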

4. Face Recognition with Open-Set Evaluation

  • Goal: Build a face recognition system for known and unknown individuals.
  • Key Features:
    • Face detection, tracking, and alignment
    • Open-set recognition using SVM classifiers
  • Implementation
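
One common way to turn closed-set classifier scores into an open-set decision is to reject a face as unknown when no class scores high enough. The sketch below shows that idea only; it is not necessarily how the project does it. It assumes a score matrix with one row per face and one column per known identity (for instance, decision values from one-vs-rest SVMs), and the threshold of 0.0 is a placeholder that would need tuning on validation data.

```python
import numpy as np


def open_set_predict(scores, labels, threshold=0.0, unknown="unknown"):
    """Return the best label per row, or `unknown` if its score is too low."""
    scores = np.atleast_2d(np.asarray(scores, dtype=np.float64))
    best = scores.argmax(axis=1)
    best_score = scores[np.arange(len(scores)), best]
    return [
        labels[i] if s >= threshold else unknown
        for i, s in zip(best, best_score)
    ]
```

For example, with `labels = ["alice", "bob"]`, a row of scores where both values are negative would be reported as `"unknown"` under the default threshold.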

5. Object Detection with Selective Search

  • Goal: Implement an object detection pipeline based on region proposals.
  • Key Features:
    • Selective Search for candidate region generation
    • Modular pipeline structure
    • Visualization and analysis of proposal quality
  • Implementation
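
Proposal quality is often summarized by how many ground-truth boxes are covered by at least one proposal at a given intersection-over-union (IoU) threshold. The following sketch shows that measurement in plain Python. The `(x1, y1, x2, y2)` box format, the helper names, and the 0.5 threshold are assumptions made for illustration, not details from the project.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def proposal_recall(gt_boxes, proposals, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by at least one proposal."""
    if not gt_boxes:
        return 0.0
    hits = sum(
        any(iou(gt, p) >= iou_threshold for p in proposals) for gt in gt_boxes
    )
    return hits / len(gt_boxes)
```

Plotting this recall against the number of proposals kept per image is one simple way to compare Selective Search settings.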

🛠️ Technologies & Libraries

  • Languages & Environments: Python 3.8 or later, Jupyter Notebook
  • Core Libraries: NumPy, SciPy, OpenCV, scikit-learn, rawpy, matplotlib
