TrOCR Detector

Submitted to blnk Egypt

TrOCR is a Transformer-based OCR model. This repo implements TrOCR from scratch using TensorFlow and serves it through Django.

The application is divided into two main apps:

TrOCR_Django_App: The main application. It contains the main webpage interface and communicates with Deep_Learning_App.
Deep_Learning_App: Handles all deep learning implementation and scripts, for instance building the encoder, the decoder, and the TrOCR model, plus predict.py, train.py, etc.

Getting Started ...

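Setting Up Environment: Create and activate a virtual environment. The activate path below is the Windows (Git Bash) layout; on Linux/macOS the script is at ocr_detector_venv/bin/activate.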

python -m venv ocr_detector_venv
source ocr_detector_venv/Scripts/activate

Install the required dependencies (note that the file in the repo is spelled requirments.txt):

pip install -r requirments.txt

Edit the network/data configuration. If you prefer to use Vim:

vim Deep_Learning_App/src/config.py

or, using the Nano editor:

nano Deep_Learning_App/src/config.py

Starting Django Server: Run the command below. The development server starts at http://127.0.0.1:8000/

python manage.py runserver

Deep Learning App

The implementation consists of the following components:

Data Loader: This module: a. Loads the dataset images and their corresponding text. b. Tokenizes the words using a BERT Arabic tokenizer.
Preprocessing: The preprocessing steps are: a. Extracting and cropping the text from the image: this is done with the OpenCV library by thresholding the image and finding the contours of the text to extract the text box.

b. Resizing the cropped image to (88,200,3).
c. Normalization: dividing pixel values by 255.0.
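The cropping and normalization steps above can be sketched without any dependencies. This is a simplified illustration, not the repo's code: the repo uses OpenCV thresholding and contour detection, whereas this sketch assumes dark text on a light background, uses a fixed threshold, and takes the bounding box of all dark pixels instead of contours. The function names and the threshold value are hypothetical, and resizing is omitted because it needs an image library.

```python
def text_bounding_box(gray, threshold=128):
    """Return (top, bottom, left, right) of pixels darker than threshold.

    gray is a list of rows of 0-255 intensities. Assumes dark text on a
    light background; returns None when no pixel is below the threshold.
    """
    rows = [r for r, row in enumerate(gray) if any(p < threshold for p in row)]
    cols = [c for c in range(len(gray[0])) if any(row[c] < threshold for row in gray)]
    if not rows or not cols:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_and_normalize(gray, threshold=128):
    """Crop to the text bounding box, then scale intensities to [0, 1]."""
    box = text_bounding_box(gray, threshold)
    if box is None:
        cropped = gray  # nothing detected: keep the full image
    else:
        top, bottom, left, right = box
        cropped = [row[left:right + 1] for row in gray[top:bottom + 1]]
    return [[p / 255.0 for p in row] for row in cropped]


# A 4x5 white image with a dark 2x2 "text" blob.
image = [
    [255, 255, 255, 255, 255],
    [255, 0, 51, 255, 255],
    [255, 0, 0, 255, 255],
    [255, 255, 255, 255, 255],
]
print(text_bounding_box(image))   # (1, 2, 1, 2)
print(crop_and_normalize(image))  # [[0.0, 0.2], [0.0, 0.0]]
```

Normalizing after cropping keeps the threshold comparison in the original 0-255 range, which matches the order of the steps listed above.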

TrOCR Model: The encoder/decoder architecture is written in TensorFlow as well-documented, object-oriented classes that use inheritance. Encoder/decoder configuration parameters can be edited in Deep_Learning_App/src/config.py

Run Deep Learning Scripts

You can train, evaluate, or predict without using the Django apps. Edit the configuration if needed, then train the model using:

cd TrOCR_Project
nano Deep_Learning_App/src/config.py
python Deep_Learning_App/src/train.py

Or to evaluate:

cd TrOCR_Project
nano Deep_Learning_App/src/config.py
python Deep_Learning_App/src/evaluate.py

Or to predict:

cd TrOCR_Project
nano Deep_Learning_App/src/config.py
python Deep_Learning_App/src/predict.py --image_path absolute/path/to/image.jpg

To convert the TensorFlow model to an ONNX model:

cd TrOCR_Project
python Deep_Learning_App/src/to_onnx.py --model_path absolute/path/to/model.h5 --output_path absolute/path/to/output/directory

If you don't specify a model/output path, the script uses the model path given in Deep_Learning_App/src/config.py, so you can just run:

cd TrOCR_Project
python Deep_Learning_App/src/to_onnx.py
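One way such a fallback could be wired up is sketched below. This is not the repo's to_onnx.py: the DEFAULT_MODEL_PATH and DEFAULT_OUTPUT_PATH constants are hypothetical stand-ins for whatever config.py actually defines, and their values are made up for illustration.

```python
import argparse

# Hypothetical stand-ins for values that would live in config.py.
DEFAULT_MODEL_PATH = "models/trocr.h5"
DEFAULT_OUTPUT_PATH = "models/exported"


def parse_args(argv=None):
    """Parse CLI flags, falling back to the config defaults when omitted."""
    parser = argparse.ArgumentParser(description="Convert a saved model.")
    parser.add_argument("--model_path", default=None,
                        help="path to the saved model (default: from config)")
    parser.add_argument("--output_path", default=None,
                        help="output directory (default: from config)")
    args = parser.parse_args(argv)
    # None means the flag was not given, so use the configured value.
    if args.model_path is None:
        args.model_path = DEFAULT_MODEL_PATH
    if args.output_path is None:
        args.output_path = DEFAULT_OUTPUT_PATH
    return args


print(parse_args([]).model_path)                         # models/trocr.h5
print(parse_args(["--model_path", "my.h5"]).model_path)  # my.h5
```

Keeping the defaults in one config module means the training, evaluation, and conversion scripts all agree on where the model lives unless a flag overrides it.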

To convert the TensorFlow model to a TensorRT (TRT) model:

cd TrOCR_Project
python Deep_Learning_App/src/to_TRT.py --model_path absolute/path/to/model.h5 --output_path absolute/path/to/output/directory

If you don't specify a model/output path, the script uses the model path given in Deep_Learning_App/src/config.py, so you can just run:

cd TrOCR_Project
python Deep_Learning_App/src/to_TRT.py

Loss/Accuracy Masked Functions: Transformers require sequences to be padded to a unified length, so predictions at padding positions should not count toward the loss or accuracy. Masked loss and accuracy functions were therefore created to exclude them.
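The masking idea can be illustrated in plain Python. This is a sketch of the technique rather than the repo's TensorFlow implementation. It assumes that token id 0 is the padding id and that the loss is cross-entropy over per-position probability distributions; both are assumptions, and the function names are hypothetical.

```python
import math

PAD_ID = 0  # assumed padding token id


def masked_loss(labels, probs):
    """Mean cross-entropy over non-padding positions only.

    labels: list of target token ids for one sequence.
    probs:  list of per-position probability lists over the vocabulary.
    """
    losses = [-math.log(p[y]) for y, p in zip(labels, probs) if y != PAD_ID]
    return sum(losses) / len(losses) if losses else 0.0


def masked_accuracy(labels, probs):
    """Fraction of non-padding positions whose argmax matches the label."""
    hits = [max(range(len(p)), key=p.__getitem__) == y
            for y, p in zip(labels, probs) if y != PAD_ID]
    return sum(hits) / len(hits) if hits else 0.0


# Two real tokens followed by two padding positions.
labels = [2, 1, 0, 0]
probs = [
    [0.1, 0.1, 0.8],    # correct: argmax is 2
    [0.2, 0.3, 0.5],    # wrong: argmax is 2, label is 1
    [0.9, 0.05, 0.05],  # padding, ignored
    [0.1, 0.8, 0.1],    # padding, ignored
]
print(masked_accuracy(labels, probs))  # 0.5
print(round(masked_loss(labels, probs), 4))
```

Without the mask, the two padding positions would be averaged in and the reported accuracy would depend on how much padding each batch happens to contain.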
Learning Rate Scheduler: A custom learning rate scheduler was implemented according to the formula in the original Transformer paper: lrate = d_model^(-0.5) * min(step^(-0.5), step * warmup_steps^(-1.5))
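The original Transformer schedule raises the learning rate linearly during warm-up and then decays it with the inverse square root of the step number. A plain-Python sketch follows. The defaults d_model=512 and warmup_steps=4000 are the base-model values from the paper, not necessarily the values in this repo's config.py, and the function name is hypothetical.

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    """lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)."""
    step = max(step, 1)  # avoid division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# The rate peaks at step == warmup_steps and decays afterwards.
for step in (1, 1000, 4000, 16000):
    print(step, f"{transformer_lr(step):.6f}")
```

In TensorFlow the same expression would typically live in the `__call__` method of a `tf.keras.optimizers.schedules.LearningRateSchedule` subclass, which is then passed to the optimizer as its learning rate.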

Django App

The main interface exposes the following pages:

Show PreProcessed App

This page lets you see how images are preprocessed (before normalization).

Show Predict

First, upload the image:

Once the image is uploaded, run the prediction:

Configure Model

Allows you to change parameters set in the config.py file:

Once you change the configuration:

See Training Logs

Starting Tensorboard:

Logs Sample:

Note

To make the work reproducible, a small sample of the dataset images is included in the repo. For full replication, please add the full dataset to: TrOCR_Project/media/dataset

Repository: abdallah1097/TrOCR_Project
