Same evaluation result with Calamari 1.0 and Calamari 2.1.1 on an Arabic database

Hello @ChWick! When I used version 1.0 of Calamari, the evaluation command gave me this result:

Got mean normalized label error rate of 100.00% (87988 errs, 87991 total chars, 88020 sync errs)
GT       PRED     COUNT    PERCENT   
{ال هحور هب يجنملا لمعلا يف دهتجم هرخآلا بلاط كلذكو} {}              1      0.06%
{لب . ةقيوازتل حفصتلا هتياغ نوكت الأ اذه انباتك يف رظانلل يغبني دقو} {}              1      0.07%
{جزتمت ،لبجلا ةيدوأ يفو ،لبجلا فورج ىلع دباعملا رشتنت .ةعئارلا ءاقرزلا ءامسلاب} {}              1      0.09%
{نأ نودقتعي مهف ،رخآ يأر مهل اهءانيأ نأ ريغ ،يذوبلا وابيشت دبعم} {}              1      0.07%
{اديدحتو ،لعفلاب ةعبس زونك نم اهمسا تذخأ زونك ةعبسلا ةدلبلا هذه} {}              1      0.07%
{ةيبرعلا ىلإ اهتمجرت نكمي ىاليف لاثمت يه} {}              1      0.04%
{ةيقافتا فدهت ، رحصتلا نم ةبرتلاو يضارألا ةيامح ىلإ ةفاضإلابو} {}              1      0.07%
{ةمدقتملا نادلبلا مزتلتو . رقفلا ةبراحم ىلإ ًاضيأ رحصتلا ةحفاكم} {}              1      0.07%
{نع ةرثأتملا نادلبلا دوهج معدب ، ةيقافتإلا هذه بجومب ، ومنلا} {}              1      0.07%
{نواعتلا راطإ يف ةيفاكلا ةينقتلاو ةيلاملا ةدعاسملا اهحنم قيرط} {}              1      0.07%
The remaining but hidden errors make up 99.32%

I get approximately the same result with the latest version; the only thing that varies is the average sentence confidence (Average sentence confidence: 99.94%). I’m using this command for prediction:

!calamari-predict  --checkpoint '/directionof best model obtained after pretraining of arabic model 3/best.ckpt'  --output_dir '/directiontooutputdir' --data.images '*.png'

I’m using this command for evaluation:

!calamari-eval --gt.texts *.gt.txt

But the result is unexpected:

Evaluation: 100% 1424/1424 [00:00<00:00, 7077.97it/s]
Evaluation result
=================
Got mean normalized label error rate of 100.00% (88180 errs, 88183 total chars, 88212 sync errs)
GT       PRED     COUNT    PERCENT   
{هئامثالث نم رثكأ اوهويج لبج يف دباعملا ددع ناك ، هراهدزا تارتف رثكأ يف} {}              1      0.08%
{. ةيبيللا ةيبعشلا ريباعتلاب} {}              1      0.03%
{ةميدق ةيباتك صوصن ،فسألا عم انيدل تسيلو . نينسلا تائمب داليملا لبق اهتنكس} {}              1      0.08%
{يتاللاو ندملا يف نلمع يتاللا تايفيرلا نيب حضاو فالتخا دوجو نع} {}              1      0.07%
{ريغي نا دب ال ـضفحلاب نامزلا بعالنو نايسنلا تافآ نأو اليوط ادما} {}              1      0.07%
{بتك يف نودملا نيعملا اذه يف وه ثراحلا نب رضنلا'' هملعت يذلا صصقلا نيعم نوكي نأ دب الو} {}              1      0.10%
{كلاذ لاثماو ''ةعلقلا رادب ةتعم مه ،رجه ونب نارهش لويق ،ام هونبو ،ام هوخا ،ام حرش'' :ةصن اذهو ،ناويخ دجسم يف رجح} {}              1      0.13%
{و مسرلا نفل ةرودو ةينيصلا ةمجرتلل تارود ةماقإو ةينيصلا} {}              1      0.06%
{.هيعامتجإلا لئاضفلا نم راثيإو هفعو ةنامأو ءافوو قدص ىف عمتجملل} {}              1      0.07%
{. هيداصتقالا ةيمنتلا هجاوت يتلا تايدحتلاو ةئشانلا تالكشملا ةهجاوم يف مهلامعأ ريوطتل ةيملعلا ةيمنتلا ةيرظن} {}              1      0.12%
The remaining but hidden errors make up 99.19%
INFO     2021-06-03 10:59:45,068 calamari_ocr.ocr.dataset.datar: Resolving input files

For the prediction I’m using the "test" part of my database. How can I resolve this problem? Thanks so much for your continued help.

Issue Analytics

Created 2 years ago
Comments: 8 (4 by maintainers)

Top GitHub Comments

ChWick commented, Jun 5, 2021

Sorry, you must additionally specify: --pred File

ChWick commented, Jun 4, 2021

You specified a custom output directory for calamari-predict (--output_dir '/directiontooutputdir'), so by default calamari-eval does not find any prediction files (.pred.txt). You must either specify the prediction texts manually when running calamari-eval (use --pred.texts /directiontooutputdir/*.pred.txt) or omit the --output_dir parameter completely.
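
Before re-running the evaluation, it can help to confirm that every ground-truth file really has a prediction file in the directory passed to --pred.texts. The snippet below is a minimal sketch for that check and is not part of Calamari: the helper name missing_predictions is hypothetical, and it assumes that the prediction for name.gt.txt is stored as name.pred.txt, which matches the extensions mentioned above but may differ in your setup.

```python
from pathlib import Path


def missing_predictions(gt_dir, pred_dir):
    """Return the names of *.gt.txt files with no matching *.pred.txt.

    Hypothetical helper (not a Calamari API). Assumes the prediction for
    'name.gt.txt' is written as 'name.pred.txt' inside pred_dir.
    """
    gt_dir, pred_dir = Path(gt_dir), Path(pred_dir)
    missing = []
    for gt_file in sorted(gt_dir.glob("*.gt.txt")):
        # Strip the '.gt.txt' suffix to recover the shared base name.
        base = gt_file.name[: -len(".gt.txt")]
        if not (pred_dir / f"{base}.pred.txt").is_file():
            missing.append(gt_file.name)
    return missing


# Example call, using the placeholder output directory from the issue:
# print(missing_predictions(".", "/directiontooutputdir"))
```

If the returned list contains every ground-truth file when pred_dir is the directory with the ground truth, but is empty when pred_dir is the custom output directory, the predictions exist and calamari-eval only needs to be pointed at them.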
