Improving the usability of EasyOCR
Hi @rkcosmos
Let me begin by applauding you and your team for building out such a versatile and excellent library. It is definitely worthy of the praise I see online and of my own praise I give this library as a user. That being said, I feel like there are multiple shortcomings that, if addressed, would push this library to new potential.
The following are things I would like to see addressed. I will be pointing out areas I believe to be deficient in a library of this caliber, but it is not my intention to demean or disregard the efforts made by the team and the community thus far. These are simply my opinions and any of them are open to being discussed.
- Documentation is lacking in several key areas, especially regarding how parameters affect detection in methods like `detect`. The library presents itself as “ready-to-go, out-of-the-box” and the current documentation may be sufficient for less-adept users. However, someone like me will spend considerable amounts of time tweaking parameters to find the optimal outcome for my problem, and I should be able to discern what they do from reading the docstrings. Not to mention, the documentation hosted on your website is separate from the code. Using ReadTheDocs, documentation can be updated with code commits automatically, reducing overhead.
- Type hints are virtually non-existent. These are extremely useful for static type checkers and for users to determine what types of values a parameter expects.
- Readability of the library code is probably the biggest issue the library suffers from. The number 1 “rule” of Python’s Zen is “Beautiful is better than ugly”. If contributors cannot make heads or tails of what a function is doing because it lacks comments (!!!) or has an ambiguous or complex implementation, they’ll be less likely to contribute. Inconsistent naming is another sticking point. Some functions are `camelCase` and others are proper `snake_case`.
- Tests don’t exist. The library appears to be largely tested through usage, which isn’t proper. I can see these are starting to be added through 9bd8be0, but they are being added in a way that emphasizes my previous point about readability.
- Everything is manual. The library is not automated in any real way for performing tasks such as running tests, producing releases, etc. These can be easily performed through tools like `tox`, which make contributors’ lives much easier.
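To make the points about docstrings, type hints, naming, and tests concrete, here is a minimal sketch of the style I have in mind. Everything in it is hypothetical and for illustration only: `filter_detections`, the `Box` and `Detection` aliases, and the `min_confidence` parameter are my own inventions and are not part of EasyOCR's API.

```python
from typing import List, Tuple

# Hypothetical type aliases for illustration; not EasyOCR's actual types.
Box = Tuple[int, int, int, int]     # (x_min, y_min, x_max, y_max)
Detection = Tuple[Box, str, float]  # (bounding box, text, confidence)


def filter_detections(
    detections: List[Detection], min_confidence: float = 0.5
) -> List[Detection]:
    """Keep only detections at or above a confidence threshold.

    Args:
        detections: Detections as ``(box, text, confidence)`` tuples.
        min_confidence: Lowest confidence to keep, in the range [0, 1].
            Raising it discards more uncertain results.

    Returns:
        The detections whose confidence is at least ``min_confidence``,
        in their original order.

    Raises:
        ValueError: If ``min_confidence`` is outside [0, 1].
    """
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    return [d for d in detections if d[2] >= min_confidence]


def test_filter_detections_drops_low_confidence() -> None:
    """A small pytest-style test that documents the expected behavior."""
    detections = [((0, 0, 10, 10), "hello", 0.9), ((5, 5, 20, 20), "wor1d", 0.2)]
    kept = filter_detections(detections, min_confidence=0.5)
    assert [text for _, text, _ in kept] == ["hello"]
```

The docstring states what the parameter does to the result, the annotations let a static type checker catch misuse, the name follows `snake_case`, and the test can be collected by a runner such as pytest and automated through `tox`.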
This list is non-exhaustive. There are plenty of other things I think are deficient; however, they are not worth mentioning at this time. Others may also have their own reservations and are free to add to the comments on this issue. But, for right now, I’d like to make a proposal.
I am willing to donate much of my time to making this library live up to what I believe could be its maximum potential. I am willing to address every point I’ve put in this list plus help address what other users may put in the comments if the suggestions are worthwhile. That being said, I need some things from you.
- Serious commitment. The severe lack of descriptions of functionality in this library (comments) will absolutely affect my ability to understand what is going on. I will need help understanding certain areas of your code to improve it and describe it.
- Continuous feedback on my approaches to certain things. For example, if I am grouping modules in a directory in a way you don’t agree with, let me know.
- Documenting a library like this is going to be painful and I don’t have nearly the skill level in AI that you do. Your input in documenting methods and classes will be required in some instances.
- And finally, time. This will be a massive project that I will only be able to work on when I have the time. This may take a couple of months to properly build.
If you agree to my undertaking this project, my changes will be breaking. It may be best to reserve v2.x.x for this, but that is a detail that can be left for a later date.
If you’d like to see an example of what I am proposing, I built a library called python-step-series for a motor controller family and ported all of its documentation from a separate website onto ReadTheDocs.
Links: python-step-series GitHub, python-step-series documentation (note there also exists a Japanese translation of the documentation. Choose ‘ja’ in the lower left menu)
Please let me know your thoughts or concerns. Jules
@ystoll
Great point. Hopefully, with our work being open source, @rkcosmos will have ample opportunity to provide feedback or suggestions for any concerns he may have (or directly contribute if he so wishes).
To help mitigate the above, why don’t we discuss how the library should be structured and our roles before we start writing code? This will provide the community with further opportunity to join in if anybody so wishes and gives us a game plan before we start. I can start a Discussion in this repo to keep everything “centralized” and easy to access since this is better continued there.
I’ll post a link to the discussion shortly.
Warm Regards, Jules
Hi @JulianOrteil
I totally agree with you on the several aspects that you mentioned in your issue (see #823). I have started working on this on my side to make the code more readable (systematically renaming variables to snake_case, introducing docstring templates). I think it would be great if we could find a way to team up to improve this code, which is very useful in many respects. More precisely, in priority, I would like to:
I don’t have a pure developer background, so on certain aspects of software engineering you might be more skilled than I am. On the other hand, I might be more at ease on the AI side than you are, so it seems that our profiles are somewhat complementary.
Let me know if you are interested in collaborating with me.
Yannick