Question: Which encoding is used when saving German umlauts to JSON?

Overview

Hello,

I’m trying to save data packages, resources and table schemas as JSON using the built-in .to_json() function. The problem is that my metadata contains German umlauts (Ä, Ö, Ü, ß) and exponents (e.g. m², m³).

When I open the resulting JSON file in PyCharm, the editor tries to read it as UTF-8. This produces unrecognised characters and a warning from PyCharm, because the file appears to be encoded in ISO 8859-1 (screenshot omitted).

This is what the file should look like (screenshot omitted).

Other editors (like Windows Notepad or Notepad++) recognise the encoding correctly.
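
A quick way to check which of the two encodings a file actually uses is to try decoding its raw bytes as UTF-8. The sketch below is only an illustration using the characters from this issue; the helper name guess_encoding is hypothetical, and the ISO 8859-1 result is a fallback guess, since that encoding accepts any byte sequence.

```python
def guess_encoding(data: bytes) -> str:
    """Return 'utf-8' if the bytes decode as UTF-8, otherwise 'iso-8859-1'."""
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # ISO 8859-1 maps every byte to a character, so decoding never fails.
        # Treat it as a fallback guess, not as proof of the real encoding.
        return "iso-8859-1"


text = "öäü ÖÄÜ ß m² m³"

# The same text produces different bytes depending on the encoding.
utf8_bytes = text.encode("utf-8")          # two bytes per non-ASCII character
latin1_bytes = text.encode("iso-8859-1")   # one byte per character

print(guess_encoding(utf8_bytes))
print(guess_encoding(latin1_bytes))
```

To apply this to the generated file, read it in binary mode (open(path, "rb")) and pass the bytes to the helper.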

My question is: when does frictionless use UTF-8, and when does it use other encodings? Why does it not always save in UTF-8 and escape Unicode characters, given that the JSON specification (RFC 7159, Section 8.1) names UTF-8 as the default encoding?

Thanks in advance and keep up the good work!


Python code to reproduce this issue:

import frictionless

pack = frictionless.Package()
pack.name = "name-of-package"
pack.description = "öäü ÖÄÜ ß m² m³ and some other text"

pack.to_json("datapackage_test.json")
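
For comparison, here is a minimal sketch of the two behaviours the question asks about, using only Python's standard json module rather than the frictionless API. The descriptor dict and the helper names are hypothetical stand-ins for the package above. The thread does not state the root cause; a common one, assumed here, is a file opened without an explicit encoding, so that the platform default (for example cp1252 on Windows) is used.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the package metadata from the reproduction above.
descriptor = {
    "name": "name-of-package",
    "description": "öäü ÖÄÜ ß m² m³ and some other text",
}


def write_json_utf8(data, path):
    """Write JSON as UTF-8, keeping non-ASCII characters as-is."""
    # The explicit encoding avoids depending on the platform default.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


def write_json_escaped(data, path):
    """Write JSON with non-ASCII characters escaped as \\uXXXX sequences."""
    # The output is pure ASCII, so it reads the same under UTF-8 or ISO 8859-1.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=True, indent=2)


tmpdir = tempfile.mkdtemp()
utf8_path = os.path.join(tmpdir, "datapackage_utf8.json")
escaped_path = os.path.join(tmpdir, "datapackage_escaped.json")

write_json_utf8(descriptor, utf8_path)
write_json_escaped(descriptor, escaped_path)

# Read the raw bytes back to inspect what was actually written.
with open(utf8_path, "rb") as f:
    utf8_raw = f.read()
with open(escaped_path, "rb") as f:
    escaped_raw = f.read()
```

Both files parse back to the same data. The escaped variant is the more defensive choice when the reader's encoding is unknown, at the cost of being harder to read in an editor.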

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

roll commented, Nov 23, 2020 (4 reactions)

Great! Thanks for the quick analysis.

I’m going to fix it this week. If you’re interested, feel free to open a PR adding a (failing -> fixed) test.

hoffch commented, Nov 30, 2020 (1 reaction)

Thanks for fixing this issue - works like a charm 👍 And BTW: your release rate is impressive - keep going!
