SmartOCR – extremely clean AI-powered results, no matter the layout!

What is SmartOCR?

Imagine if you could use an AI to understand and render your document for you. Well, that's what SmartOCR is. SmartOCR is an OCR tool powered by a visual language model. It extracts the text from a page and renders it into ASCII – no matter how complex the output is.

Smart in all senses

SmartOCR isn't just smart because it is AI-powered. It was designed to do the OCR in small batches and then join the results together (this behavior can be tweaked in the settings). This means that while it is powerful, it can also handle very long, 400+ page documents. It also was designed with multithreading in mind, so it'll always attempt to stay as responsive as possible.

Sounds great! How do I run it?

First, download LmStudio.
Your next step is to download the language model. Due to how it is designed, a vision-enabled model is MANDATORY. At the time of my writing, the most powerful language model is Gemma 3 QAT. The 12B parameter model, which is reasonable enough in most cases, will take around 6-7 GB RAM. Download it here, clicking on the button "Use in LMStudio."

When you are done, open the console and run the program with python SmartOCR.py. Install any necessary dependencies.
Enjoy!

Known limitations

Please be aware that this program does not replicate the original document layout or extract any images. Those features are intended in the future, but are not guaranteed.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
core types		core types
README.md		README.md
smartocr.py		smartocr.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SmartOCR – extremely clean AI-powered results, no matter the layout!

What is SmartOCR?

Smart in all senses

Sounds great! How do I run it?

Known limitations

About

Releases

Packages

Contributors 2

Languages

NullMagic2/SmartOCR

Folders and files

Latest commit

History

Repository files navigation

SmartOCR – extremely clean AI-powered results, no matter the layout!

What is SmartOCR?

Smart in all senses

Sounds great! How do I run it?

Known limitations

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages