Script to read scanned PDFs

Uses Tesseract OCR to extract text from scanned PDFs, that is, PDFs whose pages are images rather than machine-generated text.

Requirements (Ubuntu 2x.04)

The script needs uv, poppler-utils (used by pdf2image) and tesseract-ocr installed.
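As a quick sanity check before running the scanner, the requirements can be verified from Python. This is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it assumes the packages expose the executables `uv`, `pdftoppm` (from poppler-utils) and `tesseract` (from tesseract-ocr) on `PATH`.

```python
import shutil

# Executable name -> package that is assumed to provide it.
REQUIRED_TOOLS = {
    "uv": "uv",
    "pdftoppm": "poppler-utils",
    "tesseract": "tesseract-ocr",
}


def missing_tools(which=shutil.which):
    """Return the package names whose executable cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [pkg for exe, pkg in REQUIRED_TOOLS.items() if which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing requirements:", ", ".join(missing))
    else:
        print("All required tools found.")
```

If anything is reported missing, install it before running the scanner; pdf2image will otherwise fail when it tries to call poppler.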

Run it with uv, passing the folder that contains the PDF documents:

uv run scanner <folder with pdf docs>
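For orientation, the kind of pipeline such a scanner runs can be sketched as follows. This is an illustration under stated assumptions, not the code in src/ocr_pdf_scanner: the function names `find_pdfs`, `ocr_pdf` and `scan_folder` are hypothetical, the use of the `pytesseract` wrapper is an assumption (the README only names pdf2image and Tesseract), and writing a `.txt` file next to each PDF is an assumed output format.

```python
from pathlib import Path


def find_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def ocr_pdf(pdf_path, dpi=300, lang="eng"):
    """Render each page to an image and run Tesseract on it."""
    # Third-party imports are local so find_pdfs works without them.
    # pytesseract is an assumption; the README does not name the wrapper.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # one PIL image per page
    return "\n\n".join(
        pytesseract.image_to_string(page, lang=lang) for page in pages
    )


def scan_folder(folder):
    """OCR every PDF in `folder` and write the text next to it (assumed output)."""
    for pdf in find_pdfs(folder):
        pdf.with_suffix(".txt").write_text(ocr_pdf(pdf), encoding="utf-8")
```

Rendering at a higher `dpi` tends to improve recognition on small print at the cost of speed; 300 is a starting value chosen for this sketch, not a setting taken from the project.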
