image_to_string on CentOS is returning TypeError
So, I have this Python file which uses PyTesseract to get the text from some positions on the image, as follows:
import PIL.Image as Image
import os
import pytesseract
from PIL import ImageFile
from dotenv import load_dotenv
from os.path import join, dirname
import traceback

dotenv_path = join(dirname(__file__), '.env')
load_dotenv(dotenv_path)
PYTHONIOENCODING = 'UTF-8'

def read(pic_name, ocr_data):
    read_data = {}
    ImageFile.LOAD_TRUNCATED_IMAGES = True
    # ID's front image
    infile = os.getenv('PICS_FOLDER') + '/' + pic_name + '.jpg'
    # image containing the zone where name is located
    text_area_img = os.getenv('ID_RESULTS_FOLDER') + '/name.jpg'
    img = Image.open(infile)
    # get dimensions to be used while cropping
    width, height = img.size
    # ocr_data contains all the text elements to search for along with their coordinates in the image
    for data in ocr_data:
        coords = data["coordinates"]
        print(coords[0] * width)
        print(coords[1] * height)
        print(coords[2] * width)
        print(coords[3] * height)
        cropping_coords = (coords[0] * width, coords[1] * height, coords[2] * width, coords[3] * height)
        # The readable area contains the text element
        readable_area = img.crop(cropping_coords)
        # The readable area is saved for later reference
        readable_area.save(text_area_img)
        try:
            txt = pytesseract.image_to_string(readable_area, lang='ara', config='--psm 6')
        except Exception as e:
            traceback.print_stack()
            txt = "An exception occurred: " + str(e)
        read_data[data["read_text"]] = txt
        with open(os.getenv('ID_RESULTS_FOLDER') + '/name.txt', 'w', encoding='utf-8') as f:
            print(txt, file=f)
    return read_data
Now, this code works fine on my Windows machine, but when I deployed it on CentOS 7, after installing Python itself and all the needed Python libraries, it gives
TypeError exception: An exception occurred: expected str, bytes or os.PathLike object, not NoneType
along with this stack trace (the same exception happens twice because there are two text elements I am searching for):
Traceback (most recent call last):
  File "/var/www/digital-identity/front.py", line 39, in read
    txt = pytesseract.image_to_string(readable_area, lang='ara', config='--psm 6')
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 345, in image_to_string
    }[output_type]()
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 344, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 253, in run_and_get_output
    run_tesseract(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 223, in run_tesseract
    proc = subprocess.Popen(cmd_args, **subprocess_args())
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1278, in _execute_child
    executable = os.fsencode(executable)
  File "/usr/lib64/python3.6/os.py", line 800, in fsencode
    filename = fspath(filename) # Does type-checking of `filename`.
TypeError: expected str, bytes or os.PathLike object, not NoneType
Traceback (most recent call last):
  File "/var/www/digital-identity/front.py", line 39, in read
    txt = pytesseract.image_to_string(readable_area, lang='ara', config='--psm 6')
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 345, in image_to_string
    }[output_type]()
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 344, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 253, in run_and_get_output
    run_tesseract(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 223, in run_tesseract
    proc = subprocess.Popen(cmd_args, **subprocess_args())
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1278, in _execute_child
    executable = os.fsencode(executable)
  File "/usr/lib64/python3.6/os.py", line 800, in fsencode
    filename = fspath(filename) # Does type-checking of `filename`.
TypeError: expected str, bytes or os.PathLike object, not NoneType
I have checked that the files are being created, but I don't know what is causing the problem or how to solve it, since it is probably an issue with the library.
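Reading the trace from the bottom up, the TypeError is raised inside subprocess.Popen while it encodes the executable name, which suggests the first element of the command list handed to Popen was None. The sketch below reproduces that failure mode on its own, without pytesseract; the helper name launch_error is hypothetical and only for illustration.

```python
import subprocess


def launch_error(executable):
    """Try to start `executable --version`.

    Returns the exception type name if the launch fails, or None on success.
    """
    try:
        proc = subprocess.Popen(
            [executable, "--version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (TypeError, OSError) as exc:
        return type(exc).__name__
    proc.wait()
    return None


# Passing None as the program name fails before any process is started.
print(launch_error(None))  # prints "TypeError"
```

A binary that is simply missing from PATH would surface as an OSError subclass instead, so a TypeError here points at the command name being None rather than at Tesseract not being installed.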
Issue Analytics
- State:
- Created 4 years ago
- Comments:6
Top GitHub Comments
‘pytesseract.pytesseract.tesseract_cmd’ is just a module variable. Thank you for sharing the workaround. I don’t know why this variable changed for your pytesseract installation - it should be set to ‘tesseract’ by default. Closing this issue as fixed.
Alright, I checked something:
pytesseract.pytesseract.tesseract_cmd
was None, so I explicitly set it to tesseract, as such:
pytesseract.pytesseract.tesseract_cmd = 'tesseract'
And it worked, so I guess this variable was just None for some reason, but now it's fixed.
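A slightly more defensive variant of the workaround is sketched below. It is not part of the thread: the helper resolve_tesseract_cmd is a hypothetical name, and it assumes the tesseract binary is reachable through PATH. It falls back to 'tesseract' when the configured value is empty and fails early with a readable error if nothing resolves.

```python
import shutil


def resolve_tesseract_cmd(configured, fallback="tesseract"):
    """Return a full path to the tesseract binary.

    `configured` stands for the current value of
    pytesseract.pytesseract.tesseract_cmd (None in this report).
    Raises FileNotFoundError if the command cannot be found on PATH.
    """
    candidate = configured or fallback
    resolved = shutil.which(candidate)
    if resolved is None:
        raise FileNotFoundError(
            f"{candidate!r} not found on PATH; is Tesseract installed?"
        )
    return resolved


# Intended use (requires pytesseract, so it is left commented out):
# import pytesseract
# pytesseract.pytesseract.tesseract_cmd = resolve_tesseract_cmd(
#     pytesseract.pytesseract.tesseract_cmd
# )
```

Running this once at start-up, before the first call to image_to_string, would turn a None value into either a working path or an explicit error message.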