
Error when encoding cpu_texts with custom dataset


Hi Holmeyoung, I get this error when running train.py with a custom dataset (screenshot: Annotation 2019-07-22 062747).

I tried text = b''.join(text), and it turned into another problem (screenshot: Annotation 2019-07-22 063226).

My question is: what is the proper type of cpu_texts, a tuple of str or a tuple of bytes? I think my custom lmdb dataset might be the problem, because cpu_images, cpu_texts = data returns a tuple of bytes.
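For illustration, here is a minimal sketch of checking what the loader returns and normalising it to str. It is not the repository's code: the helper name to_str_labels is hypothetical, and it assumes the labels were written to lmdb as UTF-8 bytes. Note that b''.join(text) concatenates every label in the batch into a single bytes object, so the per-sample boundaries are lost, which is likely why that attempt led to a different error.

```python
from typing import Iterable, List, Union


def to_str_labels(labels: Iterable[Union[str, bytes]],
                  encoding: str = "utf-8") -> List[str]:
    """Return every label as str, decoding any bytes entries.

    Hypothetical helper; assumes bytes labels are encoded with `encoding`.
    """
    return [
        label.decode(encoding) if isinstance(label, bytes) else label
        for label in labels
    ]


# Stand-in for the `cpu_texts` half of `cpu_images, cpu_texts = data`.
cpu_texts = (b"hello", "world", "日本語".encode("utf-8"))

# Inspect the element types before deciding how to encode them.
print({type(label).__name__ for label in cpu_texts})

# One str per sample, so label lengths stay aligned with the images.
print(to_str_labels(cpu_texts))
```

Decoding each element separately keeps one label per image, which a per-sample length list depends on.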

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
Holmeyoung commented, Jul 23, 2019

Hi, you can refer to #17 for detail.

1 reaction
Holmeyoung commented, Jul 23, 2019

Hi,

  1. You should use my create_dataset.py to create the lmdb. To make Chinese and Japanese work, I store the image and label in binary mode. If you create a normal lmdb but my code treats it as binary, there will be an error.
  2. In model/crnn.py, the sequence length T fed to the RNN layer is 26, which means the maximum label length is 26. If you want it to be 36 or larger, you should change the image resize width.
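The link between resize width and T in point 2 can be sketched as follows. This is an assumption-laden illustration, not the maintainer's code: it assumes model/crnn.py follows the layer stack of the common CRNN reference implementation (two 2x2 poolings with stride 2, two poolings with width kernel 2, stride 1, padding 1, and a final 2x2 convolution without padding). The function names are hypothetical; check the actual layers before relying on the numbers.

```python
def crnn_sequence_length(img_w: int) -> int:
    """Width (T) of the feature map fed to the RNN, under the assumed layers.

    3x3 convolutions with padding 1 keep the width, so only these count.
    """
    w = img_w
    w = w // 2   # pooling 2x2, stride 2
    w = w // 2   # pooling 2x2, stride 2
    w = w + 1    # pooling, width kernel 2, stride 1, padding 1
    w = w + 1    # pooling, width kernel 2, stride 1, padding 1
    w = w - 1    # final 2x2 convolution, stride 1, no padding
    return w


def min_width_for(target_t: int) -> int:
    """Smallest resize width whose sequence length reaches target_t."""
    w = 4
    while crnn_sequence_length(w) < target_t:
        w += 1
    return w


print(crnn_sequence_length(100))  # 26
print(min_width_for(36))          # 140
```

Under these assumptions a resize width of 100 gives the T of 26 mentioned above, and labels of up to 36 characters would need a width of at least 140.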

