rooftopcellist / ocr-python-ai

Simple project for extracting text from images (OCR).

MIT License
Latest commit: Support DICOM images (1a00b56), Christian M. Adams, 2 years ago

AI Image to Text Extractor

This project uses Optical Character Recognition (OCR) to extract text from images.

Setup

  • Install Tesseract on your machine. For instructions, see: https://github.com/tesseract-ocr/tessdoc#installation
For Fedora, I needed to follow this guide: https://blog.mdda.net/oss/2016/08/10/tesseract-and-python-on-fedora and run the following:

sudo dnf install tesseract-devel
pip install tesserocr

  • Create a virtual environment (optional, but recommended):
python3 -m venv venv
source venv/bin/activate

  • Install the required Python dependencies:
pip install -r requirements.txt
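
As a quick sanity check after setup, the sketch below looks for the tesseract command on your PATH and reports its version. It is not part of this repository; tesseract_version is a hypothetical helper, and it assumes that installing Tesseract also installed its command-line binary.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if not found.

    Hypothetical helper for illustration; not part of this project.
    """
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some Tesseract builds print the version banner to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output.strip() else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version if version else "tesseract was not found on PATH")
```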

Usage

To run the script, simply execute the following from a terminal:

python main.py path_to_image1 path_to_image2

This prints the extracted text to the console by default. If you add the --to-file flag, the text for each image is written to a separate file in a directory called output.

python main.py path_to_image1 path_to_image2 --to-file
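
The command-line behavior described above could be wired up roughly as follows. This is a minimal sketch, not the actual contents of main.py: the function names and the output/<image name>.txt naming scheme are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser that accepts one or more image paths and --to-file."""
    parser = argparse.ArgumentParser(description="Extract text from images (OCR).")
    parser.add_argument("images", nargs="+", help="paths to the input images")
    parser.add_argument(
        "--to-file",
        action="store_true",
        help="write each result to a file in the output directory",
    )
    return parser


def output_path_for(image_path, output_dir="output"):
    """Assumed naming scheme: output/<image file name without extension>.txt."""
    return Path(output_dir) / (Path(image_path).stem + ".txt")


def emit(image_path, text, to_file, output_dir="output"):
    """Print the text, or write it to a per-image file when to_file is set."""
    if not to_file:
        print(text)
        return None
    target = output_path_for(image_path, output_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

With this layout, the OCR step only has to return a string per image; where that string ends up is decided in one place.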

The script expects a file named input.png in the same directory. To use a different image, replace 'input.png' in main.py with the path to your image.

The extracted text will be printed on the console.

This tool accepts multiple image formats since OpenCV's cv2.imread() function supports a variety of image formats including .bmp, .jpg, .jpeg, .png, .tif, .tiff, etc.
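
A small pre-check along these lines can catch unsupported or unreadable files before OCR runs. It is a sketch only: the helper names and the extension set (copied from the list above) are assumptions, not code from this project. It relies on cv2.imread() returning None, rather than raising, when it cannot read a file.

```python
from pathlib import Path

# Extensions named in this README; OpenCV itself supports more than these.
SUPPORTED_EXTENSIONS = {".bmp", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def has_supported_extension(path):
    """Case-insensitive check of the file extension against the set above."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def load_image(path):
    """Load an image with OpenCV, failing loudly if it cannot be read."""
    import cv2  # imported lazily so the extension check works without OpenCV

    image = cv2.imread(str(path))
    if image is None:
        # cv2.imread signals failure by returning None instead of raising.
        raise ValueError(f"could not read image: {path}")
    return image
```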

Keep in mind that Tesseract performs best when the image is of high quality and the text is horizontal. For harder cases involving rotation, skew, other languages, or noisy backgrounds, you may need additional image preprocessing or a different OCR tool.
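
One common preprocessing step for noisy backgrounds is binarization: turning a grayscale image into pure black and white before OCR. The sketch below implements Otsu's threshold selection in plain Python on a flat list of 0-255 pixel values, purely to show the idea. It is not something this project does, and the function names are illustrative. In practice OpenCV's cv2.threshold with the cv2.THRESH_OTSU flag would be the usual choice.

```python
def otsu_threshold(pixels):
    """Pick the threshold (0-255) that maximizes between-class variance.

    `pixels` is a flat sequence of integer grayscale values in 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is in the lower class; nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [255 if p > threshold else 0 for p in pixels]
```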

Additional Info

For more information on how to use this tool and how it works, see the following documentation: