Special characters are not escaped in generated code
The following schema is valid, and works in a compliant draft-07 implementation:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "x",
"definitions": {
"(+[{*#$%^&~`';:,.<> \"?!\\n}])": {
"definitions": {
"array": { "type": "array" },
"object": { "type": "object" }
},
"anyOf": [
{ "$ref": "#/definitions/(+[{*#$%^&~`';:,.<> \"?!\\n}])/definitions/array" },
{ "$ref": "#/definitions/(+[{*#$%^&~`';:,.<> \"?!\\n}])/definitions/object" }
]
}
},
"properties": {
"x": { "$ref": "#/definitions/(+[{*#$%^&~`';:,.<> \"?!\\n}])" }
}
}
import fastjsonschema, json
x = fastjsonschema.compile(json.load(open("./nested.schema.json", "r")))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/lib/python3.7/site-packages/fastjsonschema/__init__.py", line 103, in validate
return compile(definition, handlers, formats)(data)
File "/lib/python3.7/site-packages/fastjsonschema/__init__.py", line 167, in compile
exec(code_generator.func_code, global_state)
File "<string>", line 10
validate_x__definitions_(+[{*_$_^&~`';_,_<> ?!\n}])(data__x)
Some characters are properly removed, including \" and #, but many of those that remain are not allowed in a Python name.
Simply replacing them with _ is not a good solution, because then properties named #, % and _ all end up with the same generated name.
Using names like _open_paren to escape ( and so on is somewhat OK, until a user makes a property named _open_paren; then the names ( and _open_paren are wrongly considered the same.
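To make the two collisions concrete, here is a small standalone sketch. It does not use fastjsonschema at all; `replace_with_underscore`, `replace_with_names` and the `NAMED` table are hypothetical helpers written only to illustrate the two naive strategies described above.

```python
import re

# Hypothetical strategy 1: replace every character that is not valid in a
# Python identifier with an underscore.
def replace_with_underscore(name):
    return re.sub(r"[^0-9a-zA-Z_]", "_", name)

# Hypothetical strategy 2: spell out special characters with fixed names.
NAMED = {"(": "_open_paren", ")": "_close_paren"}

def replace_with_names(name):
    return "".join(NAMED.get(ch, ch) for ch in name)

# "#", "%" and "_" all collapse to the same identifier fragment.
print(replace_with_underscore("#") == replace_with_underscore("%"))  # True
print(replace_with_underscore("#") == replace_with_underscore("_"))  # True

# "(" and a property literally named "_open_paren" become indistinguishable.
print(replace_with_names("(") == replace_with_names("_open_paren"))  # True
```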
A solution I have seen before (in Brat code gen) is to give every special character a unique name, and to translate the two underscores (__) that surround some user-supplied names into _dunder_ (for double-underscore). The important part is that user input is always expanded in a way that can never overlap with the expansion of a different input.
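A minimal sketch of one escaping scheme with that non-overlap property follows. It is not the Brat scheme and not what fastjsonschema does; `escape_name` and `unescape_name` are hypothetical helpers. The assumption is that ASCII letters and digits pass through unchanged, and every other character, including the underscore itself, is written as `_<hex code point>_`. Because a literal underscore never survives unescaped, the mapping can be reversed, so two different names cannot produce the same output.

```python
import string

# Characters that are copied through unchanged (assumption: ASCII only).
_SAFE = set(string.ascii_letters + string.digits)

def escape_name(name):
    """Map an arbitrary string to a fragment made of [0-9a-zA-Z_] only."""
    parts = []
    for ch in name:
        if ch in _SAFE:
            parts.append(ch)
        else:
            # The underscore is escaped too, which keeps the scheme reversible.
            parts.append("_{:x}_".format(ord(ch)))
    return "".join(parts)

def unescape_name(escaped):
    """Inverse of escape_name; its existence shows the mapping is injective."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] == "_":
            end = escaped.index("_", i + 1)
            out.append(chr(int(escaped[i + 1:end], 16)))
            i = end + 1
        else:
            out.append(escaped[i])
            i += 1
    return "".join(out)

print(escape_name("("))     # _28_
print(escape_name("_28_"))  # _5f_28_5f_
```

The fragment is only safe as the tail of an identifier (for example after a `validate_` prefix), since it may start with a digit.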
I cannot find where the JSON Schema specification says that any character other than / is allowed in an identifier / key name, but I am sure they are allowed. NUL bytes are also allowed, except I don’t know of any Python JSON parser that handles them properly, so I don’t really care about those.
Also, double-escaping the characters doesn’t help (one backslash alone is invalid to the JSON parser):
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "x",
"definitions": {
"\\(p\\)": {}
},
"properties": {
"x": { "$ref": "#/definitions/\\(p\\)" }
}
}
x = fastjsonschema.compile(json.load(open("./nested.schema.json", "r")))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/cat/Sync/projects/git/pyrat/.venv/lib/python3.7/site-packages/fastjsonschema/__init__.py", line 167, in compile
exec(code_generator.func_code, global_state)
File "<string>", line 10
validate_x__definitions_\(p\)(data__x)
^
SyntaxError: unexpected character after line continuation character
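The failure can be reproduced without fastjsonschema by compiling the generated line on its own. This is only a sketch of the check; `is_valid_python` is a hypothetical helper, and the line is copied from the traceback above.

```python
# The line from the traceback, as a raw string so the backslashes are literal.
generated_line = r"validate_x__definitions_\(p\)(data__x)"

def is_valid_python(source):
    """Return True if the source compiles; nothing is executed."""
    try:
        compile(source, "<string>", "exec")
    except SyntaxError:
        return False
    return True

print(is_valid_python(generated_line))                   # False
print(r"validate_x__definitions_\(p\)".isidentifier())  # False
```

A check like `str.isidentifier()` on each generated function name would catch this class of problem before `exec` is reached.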
Issue Analytics
- Created 4 years ago
- Comments: 5 (3 by maintainers)

Thanks for the report. I have an idea for how to solve it: by making a hash of the original name.
For now, I fixed it in v2.14.2 by simply removing invalid characters, which is the approach that has also been used for property names for a long time. Let me know if this is not enough for now.
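As a rough illustration of the hashing idea, the sketch below derives an identifier-safe suffix from a name. It is not the library's implementation; `hashed_suffix`, the choice of SHA-256 and the 16-character truncation are all assumptions made for the example.

```python
import hashlib
import re

def hashed_suffix(name):
    """Build an identifier-safe fragment: a readable hint plus a hash."""
    # Keep a short readable hint; this part alone may collide between names.
    readable = re.sub(r"[^0-9a-zA-Z]", "", name)[:20]
    # The digest of the full original name keeps different names apart.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:16]
    return "{}_{}".format(readable, digest)

print("validate_x__definitions_" + hashed_suffix("\\(p\\)"))
```

Unlike plain removal of invalid characters, names such as # and % no longer map to the same function name. Truncating the digest keeps names short at the cost of a very small collision probability, and the generated names are less readable in tracebacks.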