How to recognize blanks (spaces), and recognize English and Chinese in one model
First of all, your code is great. I trained with the SynthText90k dataset and achieved very good performance on English words.
I have several questions and hope you can give me a hand. Thank you very much for your time.
- How do I recognize a blank (space) within a sentence? For example, I want to recognize “I love python”, where there is a blank between "I" and "love". How should I handle this? Should I just add a blank to the alphabet, like this, and prepare the training data accordingly?
  alphabet = """0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ """
- Can one model recognize both English and Chinese? If so, how? Should I just make the alphabet contain all English and Chinese characters, like this?
  alphabet = """0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ是不我一有大在人了中到資..."""
- What if I want to recognize very long sentences? Do you think it would be better to train with very long sentences, or can I just train with short ones? Your current model only supports text lengths below 26, so I would have to modify the network to train with long sentences.
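The alphabet idea in the second question can be sketched as a label converter. This is a minimal illustration, not the repository's own code: the names `ALPHABET`, `CTC_BLANK`, `encode`, and `decode_greedy` are hypothetical, the Chinese characters are only the few shown in the question (the rest stays elided), and reserving index 0 for the CTC blank is an assumption. Note that the CTC blank is a separate class, not the same thing as a literal space character.

```python
import string

# Hypothetical sample alphabet: digits, ASCII letters, and the handful of
# Chinese characters quoted in the question. A real mixed-language alphabet
# would list every character that occurs in the training labels.
ALPHABET = (
    string.digits
    + string.ascii_lowercase
    + string.ascii_uppercase
    + "是不我一有大在人了中到資"
)

# Assumption: index 0 is reserved for the CTC blank, so characters start at 1.
CTC_BLANK = 0
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
INDEX_TO_CHAR = {i: ch for ch, i in CHAR_TO_INDEX.items()}

# The output layer needs one class per character plus one for the blank.
NUM_CLASSES = len(ALPHABET) + 1


def encode(text):
    """Map a label string to a list of class indices."""
    return [CHAR_TO_INDEX[ch] for ch in text]


def decode_greedy(indices):
    """Collapse repeated indices, then drop CTC blanks."""
    chars = []
    previous = CTC_BLANK
    for idx in indices:
        if idx != CTC_BLANK and idx != previous:
            chars.append(INDEX_TO_CHAR[idx])
        previous = idx
    return "".join(chars)
```

With a mapping like this, English and Chinese characters are simply different classes of the same output layer; the only change to the network is the size of that layer (`NUM_CLASSES`).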
Hi,
Recognizing a sentence and segmenting a sentence are two different things. Simply adding a blank to the labels is not recommended.
Yes, just make the alphabet contain all the English and Chinese characters, as you suggest.
Calculate the sequence length T at the output of the last LSTM. The wider you resize the input image, the longer the text you can train with. Each location (time step) can output at most one character.
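The relationship between image width, T, and label length can be sketched as below. This is a rough illustration under stated assumptions: `approx_time_steps` assumes a backbone whose output width is `img_width // 4 + 1`, which is consistent with the limit of 26 mentioned in the question for a width of 100 but should be checked against the actual network. The rule in `min_ctc_steps` is a general property of CTC: a label needs at least one time step per character plus one blank between each pair of identical adjacent characters. All function names are hypothetical.

```python
def min_ctc_steps(text):
    """Smallest T for which CTC can align `text`.

    One step per character, plus one extra step (a blank) between
    every pair of identical adjacent characters.
    """
    repeats = sum(1 for a, b in zip(text, text[1:]) if a == b)
    return len(text) + repeats


def approx_time_steps(img_width, stride=4):
    """Assumed T for a given input width (check against the real model)."""
    return img_width // stride + 1


def min_image_width(text, stride=4):
    """Smallest width whose assumed T is enough for `text`."""
    return (min_ctc_steps(text) - 1) * stride


if __name__ == "__main__":
    label = "hello"
    print(min_ctc_steps(label), approx_time_steps(100), min_image_width(label))
```

In practice the image is usually resized wider than this minimum, since real text rarely occupies exactly one time step per character.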
@ducbluee Hi, I used only a very limited amount of synthetic data to train the model, so it does not work well on real-world images. You can follow the way I handle blanks.