Variational-Augmentation-for-Enhancing-Historical-Document-Image-Binarization

Official code implementation of Variational Augmentation for Enhancing Historical Document Image Binarization
Accepted at: ICVGIP 2022

Abstract

Historical Document Image Binarization is a well-known segmentation problem in image processing. Despite their ubiquity, traditional thresholding algorithms have achieved limited success on severely degraded document images. With the advent of deep learning, several segmentation models were proposed that made significant progress in the field but were limited by the unavailability of large training datasets. To mitigate this problem, we propose a novel two-stage framework: the first stage is a generator that produces degraded samples using variational inference, and the second is a CNN-based binarization network that trains on the generated data. We evaluated our framework on a range of DIBCO datasets, where it achieved competitive results against previous state-of-the-art methods.

Overview

Approach

Deep learning-based methods need large training datasets, which are not readily available in the domain of historical documents. To tackle this problem, we propose a two-stage framework:

Aug-Net: A VAE-GAN augmentation module, built on BicycleGAN, that generates synthetic training samples.
Bin-Net: A U-Net-based segmentation module for the binarization task, trained on the synthetic samples generated by Aug-Net.
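
As a rough illustration of the variational part of Aug-Net, the sketch below shows the two generic ingredients of a VAE-style encoder: sampling a latent code with the reparameterization trick, and the KL divergence that pulls the encoder's diagonal Gaussian towards a standard normal. This is not the code used in this repository (Aug-Net follows the BicycleGAN implementation); the function names `sample_latent` and `kl_to_standard_normal` are hypothetical, and plain Python lists stand in for tensors.

```python
import math
import random


def sample_latent(mu, logvar, rng=random):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


if __name__ == "__main__":
    # Illustrative 8-dimensional latent code (the size is an assumption).
    mu, logvar = [0.0] * 8, [0.0] * 8
    print(sample_latent(mu, logvar))
    print(kl_to_standard_normal(mu, logvar))
```

Drawing several latent codes for the same clean patch is what would let a generator of this kind produce several differently degraded versions of one input.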

Results

The following are some samples obtained from Aug-Net.

Predictions on DIBCO 2014, 2016 and 2018 samples:

Prerequisites

Python 3.7+
PyTorch 1.9+
Albumentations
fastai

Dataset Download

You can download the training images of DIBCO from here. Extract patches using datamaker.py.
You can download the testing data from here.
You can also download the training patches directly from here. (recommended)
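
If you extract patches yourself, the tiling step can be sketched as follows. This is only an illustration of the idea, not the logic of datamaker.py: the function name `patch_boxes`, the 256-pixel patch size, and the stride are all assumptions. Boxes are returned as (left, top, right, bottom), the order that Pillow's `Image.crop` expects.

```python
def patch_boxes(height, width, patch=256, stride=256):
    """Return (left, top, right, bottom) boxes that tile an image.

    Patches that would run past the bottom or right edge are shifted back
    inside the image, so every box has the full patch size. Images smaller
    than one patch yield no boxes.
    """
    if height < patch or width < patch:
        return []

    def starts(length):
        positions = list(range(0, length - patch + 1, stride))
        if positions[-1] != length - patch:
            positions.append(length - patch)  # final patch flush with the edge
        return positions

    return [(l, t, l + patch, t + patch) for t in starts(height) for l in starts(width)]


if __name__ == "__main__":
    for box in patch_boxes(300, 256):
        print(box)
```

The same boxes should be applied to a degraded image and its ground truth so that the resulting patches stay aligned.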

Directory Structure

- training_datasets
- - train
- - - - bw_patches
- - - - gt_patches
- - - - cl_patches
- - val
- - - - bw_patches
- - - - gt_patches
- - - - cl_patches

- testing_datasets
- - <DIBCO_YEAR>
- - - - bw_patches
- - - - gt_patches
- - - - cl_patches
- - - - results

- Restoration
- - code
- - - - all relevant files here (this repo)
- - weights
- - - - pretrained/saved weights here
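
The layout above can be created in one go with a small helper like the one below. It is a convenience sketch, not part of the repository: `make_layout` is a hypothetical name, and naming the test folders "2014", "2016" and "2018" is an assumption about what `<DIBCO_YEAR>` should look like.

```python
from pathlib import Path

PATCH_DIRS = ("bw_patches", "gt_patches", "cl_patches")


def make_layout(root, dibco_years=("2014", "2016", "2018")):
    """Create the expected folder tree under root and return the created paths."""
    root = Path(root)
    targets = []
    for split in ("train", "val"):
        targets += [root / "training_datasets" / split / d for d in PATCH_DIRS]
    for year in dibco_years:
        # Each test year also gets a results folder for the predictions.
        targets += [root / "testing_datasets" / year / d for d in PATCH_DIRS + ("results",)]
    targets += [root / "Restoration" / "code", root / "Restoration" / "weights"]
    for path in targets:
        path.mkdir(parents=True, exist_ok=True)
    return targets


if __name__ == "__main__":
    for path in make_layout("."):
        print(path)
```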

Train Instructions

The Augmentation Network (Aug-Net) is based on BicycleGAN. Train the model according to the instructions specified in their official repository using the patches extracted from the training data. Copy the checkpoints folder into synthetic/.
Create a subdirectory evaluation/ to store intermediate results while the model is training.
Run train.py to train the Binarization Network (Bin-Net).

Inference

Change path to the directory containing the test images.
Specify path to weight files.
Run infer.py.
For evaluation, specify the paths to the outputs and the ground truth images in eval.py and run it.
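
For orientation, DIBCO evaluations usually report F-measure, pseudo-F-measure, PSNR and DRD. The sketch below shows how the two simplest of these, F-measure and PSNR, can be computed for a pair of binary images. It is an illustration under stated assumptions, not the code in eval.py or metrics.py: pixels are given as flat sequences of 0/1 values, 1 is assumed to mark text, and the function names are hypothetical.

```python
import math


def f_measure(pred, gt):
    """F-measure (in percent) over flat sequences of 0/1 pixels, where 1 marks text."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


def psnr(pred, gt):
    """PSNR in dB for 0/1 pixels (peak value 1); infinite when the images match."""
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt)
    return math.inf if mse == 0 else 10.0 * math.log10(1.0 / mse)


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    ground_truth = [1, 0, 1, 0]
    print(f_measure(prediction, ground_truth), psnr(prediction, ground_truth))
```

If your saved images use 0 for text and 255 for background, convert them to this 0/1 convention first.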

Citation

If you find our paper or code useful, consider citing us:

@misc{https://doi.org/10.48550/arxiv.2211.06581,
  doi = {10.48550/ARXIV.2211.06581},
  url = {https://arxiv.org/abs/2211.06581},
  author = {Dey, Avirup and Das, Nibaran and Nasipuri, Mita},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, I.4.6},
  title = {Variational Augmentation for Enhancing Historical Document Image Binarization}
}

Acknowledgements

Our work is partly based on BicycleGAN and we made extensive use of their code. We would like to thank the authors for their contribution.

To-Do

Inference instructions
Add environment.yml
Add weight files
Add sample images
