faliqadlan/Automated-Sorting-of-Field-Collected-Images

Automated Sorting of Field-Collected Images

Project Overview

This project provides a Python script, packaged as a Jupyter Notebook, that organizes hundreds of field photographs automatically. It sorts image files into folders, one per data collection point, by matching each image's timestamp (from EXIF metadata) against time windows defined in an Excel logsheet.

The script addresses a common problem in field research and data collection: matching photographic evidence to its corresponding sensor data by hand is tedious and error-prone. Automating the match saves time and reduces the risk of filing a photo under the wrong data point.


Features

  • Automated Sorting: Automatically reads image metadata and sorts files into appropriately named folders.
  • Time-Based Matching: Uses a precise time window (from the start of one measurement to the start of the next) to associate images with data points.
  • Data Anomaly Detection: Identifies and flags data points where recorded times are inconsistent, preventing incorrect file associations.
  • Error Handling: Isolates any images that cannot be matched into a separate other folder for easy manual review, ensuring no data is lost.
  • Reproducible Workflow: As a Jupyter Notebook, the entire process is documented, transparent, and easily reproducible.
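
The README does not spell out what counts as an "inconsistent" recorded time, so the following is only a minimal sketch under two assumptions: a row is suspicious if its timestamp is missing, if it repeats an earlier timestamp (which would produce an empty time window), or if it is earlier than the row before it in logsheet order. The function name find_time_anomalies and the titles L1.3 and L1.4 are hypothetical. The notebook works on a pandas DataFrame, while this sketch uses plain tuples to keep the logic visible.

```python
from datetime import datetime


def find_time_anomalies(points):
    """Flag suspicious rows.

    points: list of (title, start) tuples in logsheet order, where start is a
    datetime or None. Returns a list of (title, reason) tuples.
    The three checks below are assumptions, not the notebook's documented rules.
    """
    anomalies = []
    seen = {}        # timestamp -> first title that used it
    previous = None  # last row that had a usable timestamp
    for title, start in points:
        if start is None:
            anomalies.append((title, "missing timestamp"))
            continue
        if start in seen:
            anomalies.append((title, f"same timestamp as {seen[start]}"))
        else:
            seen[start] = title
        if previous is not None and start < previous[1]:
            anomalies.append((title, f"earlier than preceding row {previous[0]}"))
        previous = (title, start)
    return anomalies


if __name__ == "__main__":
    rows = [
        ("L1.1", datetime(2024, 5, 1, 9, 0)),
        ("L1.2", None),
        ("L1.3", datetime(2024, 5, 1, 9, 0)),
        ("L1.4", datetime(2024, 5, 1, 8, 50)),
    ]
    for title, reason in find_time_anomalies(rows):
        print(f"{title}: {reason}")
```

Flagged rows can then be reviewed in the logsheet before any images are copied.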

How It Works

The script follows a logical sequence to process and sort the data:

  1. Load Data: It begins by loading the measurement data from the Data Tersortis + logsheet gedong songo.xlsx file into a pandas DataFrame.
  2. Chronological Sort: The data is sorted by the Date Created column to establish a correct timeline, which is crucial for the next step.
  3. Define Time Windows: For each data point (row), a time interval is calculated. The window starts at that row's Date Created time and ends at the Date Created time of the next row. This defines a precise period during which any photos taken should belong to that data point.
  4. Extract Image Timestamps: The script iterates through all image files in the source folder and uses the Pillow library to extract the original creation timestamp from the EXIF metadata of each photo.
  5. Match and Sort: It then compares each image's timestamp to the calculated time windows. If an image's timestamp falls within a data point's window, the script copies that image into a new folder named after the data point's Title.
  6. Isolate Unmatched Files: Any images whose timestamps do not fall into any of the defined windows are copied to a special other folder for manual inspection.
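
The steps above can be sketched roughly as follows. This is an illustration of the logic, not the notebook's actual code: the notebook uses pandas for steps 1 to 3, while this sketch uses only the standard library for the windowing and matching so the idea is easy to follow. All function names are hypothetical. Two details are assumptions: windows are treated as half-open (start included, end excluded), and the last data point, which has no following row, is given a fixed-length window through the hypothetical last_window parameter. Whether the notebook reads the DateTimeOriginal tag or the plain DateTime tag is also an assumption.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"  # EXIF stores timestamps as "YYYY:MM:DD HH:MM:SS"
EXIF_IFD = 0x8769                  # pointer to the Exif sub-IFD
DATETIME_ORIGINAL = 36867          # tag: when the photo was taken
DATETIME = 306                     # tag: fallback stored in the base IFD


def parse_exif_datetime(raw):
    """Turn an EXIF timestamp string into a datetime, or None if unusable."""
    try:
        return datetime.strptime(raw.strip(), EXIF_FORMAT)
    except (AttributeError, TypeError, ValueError):
        return None


def read_taken_at(path):
    """Step 4: read the capture time of one image with Pillow."""
    from PIL import Image  # imported here so the pure functions run without Pillow

    with Image.open(path) as img:
        exif = img.getexif()
        raw = exif.get_ifd(EXIF_IFD).get(DATETIME_ORIGINAL) or exif.get(DATETIME)
    return parse_exif_datetime(raw)


def build_windows(points, last_window=timedelta(minutes=30)):
    """Steps 2 and 3: sort (title, start) pairs and derive (start, end, title)."""
    ordered = sorted(points, key=lambda p: p[1])
    windows = []
    for i, (title, start) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = ordered[i + 1][1]     # next row's start closes this window
        else:
            end = start + last_window   # assumption: no next row to close it
        windows.append((start, end, title))
    return windows


def folder_for(taken_at, windows):
    """Steps 5 and 6: return the matching title, or "other" if none matches."""
    if taken_at is None:
        return "other"
    starts = [w[0] for w in windows]
    i = bisect_right(starts, taken_at) - 1  # last window starting at or before taken_at
    if i >= 0 and windows[i][0] <= taken_at < windows[i][1]:
        return windows[i][2]
    return "other"


if __name__ == "__main__":
    windows = build_windows([
        ("L1.2", datetime(2024, 5, 1, 9, 10)),
        ("L1.1", datetime(2024, 5, 1, 9, 0)),
    ])
    print(folder_for(parse_exif_datetime("2024:05:01 09:04:30"), windows))  # L1.1
    print(folder_for(datetime(2024, 5, 1, 8, 59), windows))                 # other
```

In the actual workflow, the result of folder_for would name the destination folder into which the image is copied. Copying rather than moving leaves the source folder untouched.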

Setup and Usage

1. File Structure

Before running the script, ensure your files are organized as follows:

your-project-folder/
│
├── dokumentasi-adlan.ipynb         # The Jupyter Notebook file
│
├── Data Tersortis + logsheet gedong songo.xlsx  # Your Excel data file
│
└── dokumentasi-adlan/              # Folder containing ALL the images to be sorted
    ├── image_001.jpg
    ├── image_002.jpg
    ├── image_003.jpg
    └── ...

2. Requirements

This script requires Python 3 and the following libraries. You can install them using pip:

pip install pandas openpyxl Pillow

3. How to Use

  1. Place the Excel file and the dokumentasi-adlan folder (containing your images) in the same directory as the Jupyter Notebook.
  2. Open and run the dokumentasi-adlan.ipynb notebook.
  3. You can run all cells sequentially. The script will create a new directory called sorted_documentation containing the results.
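
Before running the notebook, it can help to confirm that the files are where it expects them. The following is a small optional sketch, not part of the notebook: the function name check_layout is hypothetical, and the set of accepted image extensions is an assumption.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg"}  # assumption: the photos are JPEG files


def check_layout(project_dir):
    """Return a list of problems found in the expected file structure."""
    root = Path(project_dir)
    notebook = root / "dokumentasi-adlan.ipynb"
    logsheet = root / "Data Tersortis + logsheet gedong songo.xlsx"
    images = root / "dokumentasi-adlan"

    problems = []
    for required in (notebook, logsheet):
        if not required.is_file():
            problems.append(f"missing file: {required.name}")
    if not images.is_dir():
        problems.append(f"missing folder: {images.name}")
    elif not any(p.suffix.lower() in IMAGE_SUFFIXES for p in images.iterdir()):
        problems.append(f"no images found in: {images.name}")
    return problems


if __name__ == "__main__":
    issues = check_layout(".")
    print("Layout looks fine." if not issues else "\n".join(issues))
```

An empty list means the notebook, the Excel file, and the image folder are all in place.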

Outcome

After the script finishes, a new folder named sorted_documentation will be created with the following structure:

your-project-folder/
│
└── sorted_documentation/
    │
    ├── L1.1/                   # Folder for Title 'L1.1'
    │   ├── image_001.jpg
    │   └── ...
    │
    ├── L1.2/                   # Folder for Title 'L1.2'
    │   └── ...
    │
    └── other/                  # Folder for all unmatched images
        ├── image_unmatched.jpg
        └── ...

This provides a clean, organized dataset that is ready for further analysis, reporting, or archiving.
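
To review the result quickly, the number of images in each output folder can be counted. This is an optional sketch using only the standard library. The function name summarize_sorted is hypothetical, and it assumes the output folder is named sorted_documentation as described above.

```python
from pathlib import Path


def summarize_sorted(output_dir):
    """Map each subfolder name (e.g. 'L1.1', 'other') to its file count."""
    root = Path(output_dir)
    return {
        folder.name: sum(1 for f in folder.iterdir() if f.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


if __name__ == "__main__":
    out = Path("sorted_documentation")
    if out.is_dir():
        for name, count in summarize_sorted(out).items():
            print(f"{name}: {count} file(s)")
```

A large count under other suggests that the logsheet times or the camera clock should be checked.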
