Fix TesseractError eng.traineddata Please make sure the TESSDATA_PREFIX environment variable

When you use Tesseract (through pytesseract) to recognize text in an image in Python, you may get this error:

pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

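In this tutorial, we will show you how to fix it.

The simplest way is to pass Tesseract's --tessdata-dir option through the config argument of pytesseract.image_to_string(). In the example, the option is stored in a variable named tessdata_dir_config.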

For example:

from PIL import Image
import pytesseract

# Point Tesseract at the folder that contains eng.traineddata
tessdata_dir_config = '--tessdata-dir "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata"'
img_path = 'screenshot.png'
text = pytesseract.image_to_string(Image.open(img_path), config=tessdata_dir_config)

print(text)

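C:\Program Files (x86)\Tesseract-OCR\tessdata is the tessdata directory of the Tesseract-OCR installation, that is, the folder that contains eng.traineddata. The backslashes are doubled in the code only because it is a Python string. If Tesseract is installed somewhere else on your computer, use that path instead.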
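
Because the error is about a language file that cannot be opened, it can help to confirm that the file is really in the directory before calling Tesseract. The following is a minimal sketch, not part of pytesseract: build_tessdata_config is a hypothetical helper name, and it assumes the language file is named <lang>.traineddata, as in the error message above.

```python
from pathlib import Path


def build_tessdata_config(tessdata_dir, lang="eng"):
    """Return a --tessdata-dir config string for pytesseract.

    Hypothetical helper: raises FileNotFoundError if <lang>.traineddata
    is not present in tessdata_dir.
    """
    tessdata_dir = Path(tessdata_dir)
    traineddata = tessdata_dir / f"{lang}.traineddata"
    if not traineddata.is_file():
        raise FileNotFoundError(f"{traineddata} not found")
    # Quote the path so that spaces (e.g. "Program Files") are kept together
    return f'--tessdata-dir "{tessdata_dir}"'
```

The returned string can then be passed as config=build_tessdata_config(r"C:\Program Files (x86)\Tesseract-OCR\tessdata") to pytesseract.image_to_string(). If the helper raises FileNotFoundError, the path is wrong or the language data is not installed there.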

Once --tessdata-dir points to the correct directory, the error should no longer appear.
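
The error message itself suggests another option: setting the TESSDATA_PREFIX environment variable. Below is a rough sketch of doing that from Python before pytesseract is called. set_tessdata_prefix is a hypothetical helper name. Which folder the variable should name can depend on your Tesseract version; the message above asks for the "tessdata" directory itself, so that is what is assumed here.

```python
import os


def set_tessdata_prefix(tessdata_dir):
    """Set TESSDATA_PREFIX for this process (hypothetical helper).

    Assumption: the variable should name the tessdata folder itself,
    as the error message requests. Older Tesseract versions may expect
    the parent folder instead.
    """
    os.environ["TESSDATA_PREFIX"] = str(tessdata_dir)
    return os.environ["TESSDATA_PREFIX"]


# Example usage (path taken from this tutorial, adjust to your installation):
# set_tessdata_prefix(r"C:\Program Files (x86)\Tesseract-OCR\tessdata")
# text = pytesseract.image_to_string(Image.open("screenshot.png"))
```

This only affects the current Python process and the programs it starts. To make the setting permanent, you would set the variable in the Windows environment settings instead.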
