Suggest: let json.dumps keep non-ASCII characters when sending a JSON request

Summary

When I send a JSON request, json.dumps does not keep non-ASCII characters; it escapes them as \uXXXX sequences.

Reproduction Steps

cat <<MM | python3 - &
from flask import Flask,request
app = Flask('')

@app.route('/', methods=['POST', 'GET'])
def test():
    return(request.data)

app.run()
MM

cat <<MM | python3 -
import time
import requests as r
time.sleep(1)
print(r.post('http://localhost:5000/', json={'bar':'程序员'}).text)
MM

Expected Result

I expect the request body to keep the non-ASCII characters: {"bar": "程序员"}

Actual Result

{"bar": "\u7a0b\u5e8f\u5458"}

Suggest

Modify models.py (vim models.py +458) to keep the non-ASCII characters. For example:

body = complexjson.dumps(json, ensure_ascii=False)

>>> json.dumps('程序员', ensure_ascii=False)
'"程序员"'
>>> json.dumps('程序员')
'"\\u7a0b\\u5e8f\\u5458"'
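
As a side-by-side illustration of the two settings, here is a minimal sketch using only the standard library json module. The variable names are illustrative and not taken from requests. It compares the text each setting produces and checks that both decode to the same object.

```python
import json

payload = {"bar": "程序员"}

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(payload)
# Suggested behaviour: characters are kept as-is in the JSON text.
literal = json.dumps(payload, ensure_ascii=False)

# Both texts are valid JSON and decode to the same Python object.
same_value = json.loads(escaped) == json.loads(literal) == payload

# The escaped form is pure ASCII; the literal form needs UTF-8 bytes on the wire.
escaped_bytes = escaped.encode("utf-8")
literal_bytes = literal.encode("utf-8")

print(escaped)
print(literal)
print(same_value, len(escaped_bytes), len(literal_bytes))
```

The difference is in representation and size on the wire, not in the value a conforming JSON decoder recovers.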

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.1"
  },
  "platform": {
    "release": "15.6.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.18.1"
  },
  "system_ssl": {
    "version": "100020cf"
  },
  "urllib3": {
    "version": "1.21.1"
  },
  "using_pyopenssl": false
}

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 11 (8 by maintainers)

Top GitHub Comments

1 reaction
nateprewitt commented, Apr 21, 2018

Hi @ahui132, I think you may be suggesting we set ensure_ascii to False, not True. If you look at the method signature in both Python 2 and Python 3, you’ll see it’s already set to True by default.

It looks like we are doing something a bit off here. When you use dumps with the json module, the ensure_ascii flag escapes all unicode strings. When we call encode('utf-8'), we are escaping the escape strings, which means we send \\u7a0b\\u5e8f\\u5458 instead of \u7a0b\u5e8f\u5458. I’m pretty sure that’s not what we want, since the receiving json decoder will think the string is literally \u7a0b\u5e8f\u5458, not 程序员, without building in a custom decoder.

So setting ensure_ascii to False means we won’t double escape and do actually send what we’re expecting. I’m not sure how this change would affect existing users, or how we’ve had this functionality around so long without it being called out. I’ve run out of investigation time today but if anyone wants to build some tests around this, that would be a great start.
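
The tests asked for above could start from a sketch like the following. It does not import requests: serialize_like_requests is a hypothetical stand-in that assumes the body is produced by dumps followed by encode('utf-8'), as the comment describes. In this sketch the encoded bytes keep a single backslash per escape and decode back to the original string, so whether any double escaping happens in requests itself would still need to be checked against the real code path.

```python
import json


def serialize_like_requests(obj, ensure_ascii=True):
    """Hypothetical stand-in: dump to str, then encode to UTF-8 bytes."""
    text = json.dumps(obj, ensure_ascii=ensure_ascii)
    return text.encode("utf-8")


def test_default_escaping_round_trips():
    wire = serialize_like_requests({"bar": "程序员"})
    # One backslash per escaped character means encode() added no extra escaping.
    assert wire.count(b"\\") == 3
    assert json.loads(wire.decode("utf-8")) == {"bar": "程序员"}


def test_literal_round_trips():
    wire = serialize_like_requests({"bar": "程序员"}, ensure_ascii=False)
    # With ensure_ascii=False there are no escapes at all, only UTF-8 bytes.
    assert b"\\" not in wire
    assert json.loads(wire.decode("utf-8")) == {"bar": "程序员"}


if __name__ == "__main__":
    test_default_escaping_round_trips()
    test_literal_round_trips()
    print("ok")
```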

0 reactions
sigmavirus24 commented, Apr 24, 2018

This project is not adding new keywords that are related to serialization. If you need this specific handling, then the first way that is explicit is the only right way to do it.

It’s not too complex, as json is API sugar for the 98% use-case. The last 2% simply will not get API sugar around that. That’s the design credo of this project.
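
The comment that "the first way that is explicit" refers to is not shown here. One plausible reading, offered as an assumption, is to serialize the body yourself and pass it through the data parameter instead of json. A minimal sketch follows; build_json_request is a hypothetical helper name, and the URL in the usage comment is the one from the reproduction steps.

```python
import json


def build_json_request(obj):
    """Hypothetical helper: return (body, headers) for a JSON POST that
    keeps non-ASCII characters as literal UTF-8 instead of \\uXXXX escapes."""
    body = json.dumps(obj, ensure_ascii=False).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return body, headers


body, headers = build_json_request({"bar": "程序员"})
print(body.decode("utf-8"))

# Usage with requests (assumes the library is installed and a server is
# listening on localhost:5000, as in the reproduction steps):
#   import requests
#   requests.post("http://localhost:5000/", data=body, headers=headers)
```

Because the body is passed as bytes through data, the Content-Type header has to be set by hand; the json parameter would otherwise set it.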
