Non-ASCII DST header is not working: 'utf-8' codec can't decode byte 0xe4 in position 3: invalid continuation byte

See this file: https://andreymal.org/files/emb/дизайн 1.DST

It was created using the Russian version of Tajima DG/ML by Pulse 14.

pyembroidery can’t read this file:

>>> pyembroidery.read_dst('дизайн 1.DST')
  File "pyembroidery/DstReader.py", line 58, in dst_read_header
    header_string = header.decode('utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 3: invalid continuation byte

The header contains the non-ASCII string “дизайн 1”, which is encoded in the Windows ANSI code page (Windows-1251 for Russian) rather than UTF-8:

>>> print(header.decode('windows-1251'))
'LA:дизайн 1        \rST:   3294\rCO:  0\r+X:  298\r …
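The error message can be reproduced without the original file. The sketch below builds a hypothetical header prefix (only the `LA:` label field, not the full 512-byte DST header) and shows why UTF-8 decoding stops at position 3: the byte 0xe4 is "д" in Windows-1251, but in UTF-8 it starts a three-byte sequence, and the next byte (0xe8, "и") is not a valid continuation byte.

```python
# Hypothetical header prefix: the "LA:" label field as a Windows-1251
# tool would write it. This is an illustration, not the real file.
header = "LA:дизайн 1".encode("windows-1251")

# Byte at index 3 is the first letter of the label, "д" -> 0xe4.
first_label_byte = header[3]

try:
    header.decode("utf8")
    error = None
except UnicodeDecodeError as exc:
    # exc.start is the offset of the offending byte,
    # exc.reason is the text after the colon in the message.
    error = exc

if error is not None:
    print(hex(first_label_byte), error.start, error.reason)
```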

Could you provide a way to set a custom header encoding for the read_dst function?
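One possible shape for such an option is sketched below. It is an assumption for illustration only: `decode_dst_header` and its `encoding` and `fallback` parameters are hypothetical names, not part of pyembroidery's API, and this is not necessarily how the issue was eventually resolved. The idea is to try the caller's encoding first and fall back to a single-byte code page instead of raising.

```python
def decode_dst_header(header: bytes,
                      encoding: str = "utf8",
                      fallback: str = "windows-1251") -> str:
    """Decode raw DST header bytes (hypothetical helper).

    Tries `encoding` first. If that fails, decodes with `fallback`,
    replacing any byte the fallback code page does not define so that
    reading the design itself never fails because of the label text.
    """
    try:
        return header.decode(encoding)
    except UnicodeDecodeError:
        return header.decode(fallback, errors="replace")


# Example with a header fragment like the one quoted in the issue.
raw = "LA:дизайн 1        \rST:   3294\r".encode("windows-1251")
text = decode_dst_header(raw)
label = text.split("\r")[0][3:].strip()
print(label)
```

The fallback default here is chosen only because the file in question came from a Russian Windows installation. A real option would probably need to let the caller pick the code page, since the header itself does not record which one was used.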

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (7 by maintainers)

Top GitHub Comments

1 reaction
andreymal commented, Oct 24, 2019

I am not sure which embroidery machines would accept this and which would throw a fit.

Here is a photo from a Tajima TFMX-C1501 embroidery machine. The label is obviously displayed in the wrong encoding, but the file itself seems to work.

0 reactions
tatarize commented, Jan 28, 2020

Fixed the minor remaining issue. #88.
