NBT does not use UTF-8, it's MUTF-8.

NBT uses MUTF-8, not UTF-8. Valid game-generated files will result in UnicodeDecodeErrors when using Twoolie’s NBT. Minimal reproduction file with an embedded MUTF-8 NULL: encoded.dat.gz
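
To illustrate the incompatibility described above, here is a minimal sketch using only the Python 3 standard library. The two byte strings are hand-built examples, not the contents of encoded.dat.gz, and `utf8_error` is a hypothetical helper name chosen for this illustration.

```python
# MUTF-8 (Java "modified UTF-8") differs from UTF-8 in two ways:
# U+0000 is written as C0 80, and characters above U+FFFF are written
# as a surrogate pair, each half encoded as its own 3-byte sequence.
EMBEDDED_NULL = b"a\xc0\x80b"                # "a", U+0000, "b"
SUPPLEMENTARY = b"\xed\xa0\xbd\xed\xb8\x80"  # U+1F600 as a surrogate pair


def utf8_error(raw):
    """Return the UnicodeDecodeError a strict UTF-8 decode raises, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc
    return None


for sample in (EMBEDDED_NULL, SUPPLEMENTARY):
    print(sample, "->", utf8_error(sample))
```

Both samples are valid MUTF-8, yet a strict UTF-8 decode rejects each of them, which is the failure mode a `read.decode("utf-8")` call would hit on such strings.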

I’d normally send you a PR to use my MUTF-8 encoder, but being dependency-free seems to be a project goal. There’s a pure-python version in there you can just copy.

@1dt

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 10

Top GitHub Comments

3 reactions
TkTech commented, Jan 22, 2021

Since this library wants to be dependency-free and Python 2.7 compatible (which mutf8 is not), here is a patch instead of a PR. Anyone who stumbles on this with an unreadable file can apply it, as long as they are on Python 3:

diff --git a/nbt/nbt.py b/nbt/nbt.py
index 947a65e..8f633bd 100644
--- a/nbt/nbt.py
+++ b/nbt/nbt.py
@@ -4,12 +4,13 @@ Handle the NBT (Named Binary Tag) data format
 For more information about the NBT format:
 https://minecraft.gamepedia.com/NBT_format
 """
-
 from struct import Struct, error as StructError
 from gzip import GzipFile
 from collections import MutableMapping, MutableSequence, Sequence
 import sys

+import mutf8
+
 _PY3 = sys.version_info >= (3,)
 if _PY3:
     unicode = str
@@ -353,10 +354,10 @@ class TAG_String(TAG, Sequence):
         read = buffer.read(length.value)
         if len(read) != length.value:
             raise StructError()
-        self.value = read.decode("utf-8")
+        self.value = mutf8.decode_modified_utf8(read)

     def _render_buffer(self, buffer):
-        save_val = self.value.encode("utf-8")
+        save_val = mutf8.encode_modified_utf8(self.value)
         length = TAG_Short(len(save_val))
         length._render_buffer(buffer)
         buffer.write(save_val)
diff --git a/setup.py b/setup.py
index e6a7cd5..4338408 100755
--- a/setup.py
+++ b/setup.py
@@ -13,6 +13,7 @@ setup(
   license          = open("LICENSE.txt").read(),
   long_description = open("README.txt").read(),
   packages         = ['nbt'],
+  install_requires = ['mutf8'],
   classifiers      = [
         "Development Status :: 5 - Production/Stable",
         "Intended Audience :: Developers",
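
For readers who cannot add the dependency, the same two conversions can be sketched with the Python 3 standard library alone. This is an assumption-laden illustration, not the mutf8 package's implementation: `decode_mutf8` and `encode_mutf8` are hypothetical names, and the decoder is lenient (it also accepts a raw 0x00 byte and 4-byte UTF-8 sequences, which strict MUTF-8 forbids).

```python
import struct


def decode_mutf8(raw):
    """Lenient MUTF-8 decode (Python 3 only)."""
    # C0 is never a valid lead byte otherwise, so C0 80 can only mean U+0000.
    text = raw.replace(b"\xc0\x80", b"\x00").decode("utf-8", "surrogatepass")
    # Round-trip through UTF-16 so surrogate halves merge into one code point.
    units = text.encode("utf-16-le", "surrogatepass")
    return units.decode("utf-16-le", "surrogatepass")


def encode_mutf8(text):
    """MUTF-8 encode (Python 3 only)."""
    # Split the text into UTF-16 code units (supplementary chars become pairs).
    units = text.encode("utf-16-le", "surrogatepass")
    codes = struct.unpack("<%dH" % (len(units) // 2), units)
    # Encode every code unit, surrogates included, as its own UTF-8 sequence.
    raw = "".join(map(chr, codes)).encode("utf-8", "surrogatepass")
    # A 0x00 byte in UTF-8 output can only be U+0000; MUTF-8 writes it as C0 80.
    return raw.replace(b"\x00", b"\xc0\x80")
```

In the patch above, these would stand in for `mutf8.decode_modified_utf8(read)` and `mutf8.encode_modified_utf8(self.value)`. Treat it as a starting point to test against real files rather than a drop-in replacement.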
0 reactions
Netherwhal commented, Oct 14, 2022

Still getting the same issue though:

UnicodeDecodeError: 'mutf-8' codec can't decode byte 0xed in position 630: 6-byte codepoint started, but input too short to finish.
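
The message points at a 0xED byte, which is the lead byte both for ordinary three-byte characters in U+D000..U+D7FF and for each half of a MUTF-8 surrogate pair. As a debugging aid, a small helper can show which case a given offset falls into. This is a sketch only: `classify_ed` is a hypothetical name, and it does not establish what actually caused the error in the file above.

```python
def classify_ed(raw, pos):
    """Describe the sequence that starts with the 0xED byte at raw[pos]."""
    if raw[pos] != 0xED:
        raise ValueError("byte at pos is not 0xED")
    if pos + 1 >= len(raw):
        return "truncated"
    second = raw[pos + 1]
    if 0x80 <= second <= 0x9F:
        # ED 80..9F xx is a normal 3-byte character, U+D000..U+D7FF.
        return "ordinary"
    if 0xA0 <= second <= 0xAF:
        # High surrogate: MUTF-8 expects a low surrogate to follow,
        # so six bytes are needed in total.
        return "high-surrogate" if pos + 6 <= len(raw) else "truncated-pair"
    if 0xB0 <= second <= 0xBF:
        return "low-surrogate"
    return "invalid"
```

For the error above, the call would be `classify_ed(string_bytes, 630)`, where `string_bytes` is the raw payload of the failing TAG_String.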
