regression: ocrd_mets.remove_physical_page broken
See https://github.com/hnesk/browse-ocrd/actions/runs/3573770934/jobs/6008222016
It seems that the (new?) implementation is broken:
  File "/home/runner/work/browse-ocrd/browse-ocrd/ocrd_browser/model/document.py", line 383, in delete_page
    self.workspace.mets.remove_physical_page(page_id)
  File "/opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/ocrd_models/ocrd_mets.py", line 689, in remove_physical_page
    mets_div[0].getparent().remove(mets_div[0])
IndexError: list index out of range
Unfortunately, I cannot pinpoint / dissect, because apparently, @MehmedGIT has effectively erased the history of ocrd_mets.py.
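For illustration: the IndexError in the traceback above is raised when mets_div[0] is evaluated on an empty lookup result. Below is a minimal sketch of a removal that reports a missing page instead of indexing blindly. It is not the ocrd_models implementation: it uses the standard library's xml.etree.ElementTree on a simplified, namespace-free tree, and remove_page_div is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def remove_page_div(root, page_id):
    """Remove the <div> whose ID equals page_id; return True if one was removed.

    Hypothetical helper on a simplified tree (real METS is namespaced).
    """
    # ElementTree has no getparent(), so look for the parent that holds the match.
    for parent in root.iter():
        for child in list(parent):
            if child.tag == "div" and child.get("ID") == page_id:
                parent.remove(child)
                return True
    # Nothing matched: say so rather than indexing into an empty result.
    return False


root = ET.fromstring(
    '<structMap><div TYPE="physSequence">'
    '<div ID="P1"/><div ID="P2"/>'
    '</div></structMap>'
)
removed_first = remove_page_div(root, "P1")   # page exists
removed_again = remove_page_div(root, "P1")   # page is already gone
remaining = [d.get("ID") for d in root.find("div")]
```

Whether a missing page should be ignored, logged, or raised as a descriptive error is a design choice for the library; the sketch only shows that the empty case can be detected before indexing.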
Issue Analytics
- State:
- Created 10 months ago
- Comments:7 (2 by maintainers)
Top GitHub Comments
I could fix the problems with OCRD_METS_CACHING and browse-ocrd’s test suite except the test_reorder test. The problem there is that browse-ocrd modifies the underlying XML but the OcrdMets caching does not know about it. We either need to extend the OcrdMets API to offer the functionality (i.e. reordering of pages) or at least a way to let OcrdMets know that it should invalidate and re-fill the cache. @MehmedGIT @hnesk
Yes, but look again: it does not in fact break the cache validity at all. And that’s the problem really: what is actually tested as a result, self.page_ids, delegates to OcrdMets.physical_pages, and thus to your cache. So while self.reorder does reorder the pages in the actual element tree, the cached version still contains the old order (because it has not been invalidated yet).
No, I did mean the cached state in memory. Let’s say we finally get to have some form of error recovery (e.g. catching anything below Processor.process_page). Now if the processor crashes on one page, but already made its METS action prior to that, and then recovery has the program revert to some dummy or copy behaviour and continue with the next page, then clearly it should also invalidate the cache, so that whatever is now in the tree is also in the cache.
Depends. In the processing server, we should be able to have the METS only in memory throughout the whole workflow. But in a standard CLI run, as soon as a processor is finished, it needs to serialise to disk. With page-wise processing it obviously depends on just how that is implemented: with the current METS splitting, we are in the latter case, whereas with the page-parallel API we are in memory-only territory.
Yes. But see above (“continue with next page…”)
Definitely. But error handling could always try to undo the last METS action as part of recovery. (So error handling would involve making “backups” available for rollback, as would a MetsServer naturally.)
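To make the stale-cache problem and the rollback idea from the comments above concrete, here is a minimal sketch. It is not the OcrdMets API: CachedPages and its invalidate, snapshot and rollback methods are hypothetical names, and the tree is a simplified, namespace-free stand-in for a METS document, built with the standard library's xml.etree.ElementTree.

```python
import copy
import xml.etree.ElementTree as ET


class CachedPages:
    """Hypothetical wrapper that caches the page order read from an XML tree."""

    def __init__(self, root):
        self.root = root
        self._cache = None    # filled lazily by page_ids
        self._backup = None   # deep copy of the tree, used for rollback

    @property
    def page_ids(self):
        # Read the tree only when the cache is empty.
        if self._cache is None:
            self._cache = [div.get("ID") for div in self.root.find("seq")]
        return self._cache

    def invalidate(self):
        # Drop the cache so the next access re-reads the tree.
        self._cache = None

    def snapshot(self):
        # Keep a copy of the tree before a risky modification.
        self._backup = copy.deepcopy(self.root)

    def rollback(self):
        # Restore the last snapshot and make sure the cache follows the tree.
        if self._backup is not None:
            self.root = self._backup
            self._backup = None
            self.invalidate()


doc = CachedPages(ET.fromstring('<mets><seq><div ID="P1"/><div ID="P2"/></seq></mets>'))
before = list(doc.page_ids)          # fills the cache
doc.snapshot()

# Reorder the pages directly in the tree, bypassing the wrapper.
seq = doc.root.find("seq")
children = list(seq)
for child in children:
    seq.remove(child)
for child in reversed(children):
    seq.append(child)

stale = list(doc.page_ids)           # still the old order: cache was not told
doc.invalidate()
fresh = list(doc.page_ids)           # now reflects the reordered tree
doc.rollback()
restored = list(doc.page_ids)        # snapshot order again
```

The same two options discussed in the thread show up here: either the wrapper offers the modifying operation itself (so it can update its own cache), or callers that edit the tree directly must call something like invalidate afterwards.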