GitHub - Muazzeem/reusable-doc-ai-package: A Reusable Doc AI Python Package

Instructions

Installing

Install the package with pip:

pip install reusable-code-for-document-ai

Basic Usage

NOTE: Only image files and PDF files are supported.

from reusable_code.process_document import CustomGoogleDocAIProcessor

# Create the processor using the class imported above.
processor = CustomGoogleDocAIProcessor(
    location="your_location",
    processor_name="your_processor_name",
    processor_options=your_processor_options,
)

# Process one page of the document.
document = processor.process_document("path_to_your_document", extract_page_number)

path_to_your_document: the path to your PDF or image file.

extract_page_number: the number of the page you want to extract.
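Because only image and PDF files are supported, it can help to check the file type before calling process_document. The following is a minimal sketch, not part of the package: the helper name guess_mime_type and the exact set of extensions are assumptions chosen for illustration, since the README does not list which image formats are accepted.

```python
from pathlib import Path

# Assumption: this extension list is illustrative only; the package's
# actual list of accepted image formats is not documented in the README.
SUPPORTED_EXTENSIONS = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def guess_mime_type(path):
    """Return a MIME type for a supported PDF/image path, or None otherwise.

    Hypothetical helper: it only inspects the file extension
    (case-insensitively) and does not open the file.
    """
    return SUPPORTED_EXTENSIONS.get(Path(path).suffix.lower())
```

A caller could skip or reject any path for which guess_mime_type returns None before handing it to the processor.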

Processor Options

NOTE: Processor options are optional. They provide additional configuration for the Document OCR processor.

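# For more information: https://cloud.google.com/document-ai/docs/enterprise-document-ocr
from google.cloud import documentai

# Pass this object as processor_options when creating the processor.
your_processor_options = documentai.ProcessOptions(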
    ocr_config=documentai.OcrConfig(
        enable_native_pdf_parsing=True,
        enable_image_quality_scores=True,
        enable_symbol=True,
        # OCR Add Ons https://cloud.google.com/document-ai/docs/ocr-add-ons
        premium_features=documentai.OcrConfig.PremiumFeatures(
            compute_style_info=True,
            enable_math_ocr=False,  # Enable to use Math OCR Model
            enable_selection_mark_detection=True,
        ),
    )
)

Follow this pattern to set the processor name

your_processor_name = projects/<PROJECT_ID>/locations/us/processors/<processor-id>/processorVersions/<version-id>
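The pattern above can be assembled from its parts rather than typed by hand. This is a minimal sketch under stated assumptions: build_processor_name is a hypothetical helper, not part of the package, and the default location of "us" simply mirrors the pattern shown above.

```python
import re

# Matches the resource-name pattern shown in this README; every segment
# must be non-empty and must not contain a slash.
_PROCESSOR_NAME_RE = re.compile(
    r"^projects/[^/]+/locations/[^/]+/processors/[^/]+/processorVersions/[^/]+$"
)


def build_processor_name(project_id, processor_id, version_id, location="us"):
    """Build a processor version name string (hypothetical helper).

    Raises ValueError if any part is empty or contains a slash.
    """
    name = (
        f"projects/{project_id}/locations/{location}"
        f"/processors/{processor_id}/processorVersions/{version_id}"
    )
    if not _PROCESSOR_NAME_RE.match(name):
        raise ValueError(f"Malformed processor name: {name!r}")
    return name
```

The returned string could then be passed as processor_name when creating the processor.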

Make sure to replace your_location, your_processor_name, your_processor_options, and path_to_your_document with appropriate values.

