Impossible to represent empty lists/objects
Currently it is impossible to represent empty lists/objects in NestedText.
Relatedly, the language reference says the following about an empty document: “An empty document corresponds to an empty value of unknown type.”
The reason for this is that list/object items can only be represented by the presence of lines in a certain format, and the ‘absence’ of lines (e.g. blank lines) is not something that can be interpreted as a particular type.
The difference with YAML here is that inline collections (a.k.a. ‘flow style’) are not supported. The justification for this is that it allows the simple statement that all values are interpreted as strings (just like how ‘null’ is always treated as a string).
Given the proposal to add multi-line keys (https://github.com/KenKundert/nestedtext/issues/23) based on a desire to make NestedText ‘completely general’, I’m wondering if it’s been considered to add a way to express empty containers? This could be done by allowing ‘flow-style’-like syntax but only permitting its use for empty containers, and requiring they be placed on their own line:
foo:
    []
bar:
    {}
baz:
This has the following nice properties:
- Already valid YAML syntax
- Backwards compatible change (the meaning of all previously valid syntax remains unchanged)
- Maintains the property of every line type being identifiable without context of other lines
- Could potentially provide a way to disambiguate the meaning of an empty file (make it an empty string, given the new way to represent empty collections)
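To illustrate the claim that every line type stays identifiable without looking at other lines, here is a minimal Python sketch. It is not the NestedText implementation: classify_line is a hypothetical helper, and its rules are a simplified assumption about the line types (comments, list items, string items, dict items) plus the two proposed empty-container lines.

```python
def classify_line(line: str) -> str:
    """Return a line type by looking only at this one line (simplified sketch)."""
    text = line.strip()
    if not text:
        return "blank"
    if text.startswith("#"):
        return "comment"
    if text == "[]":
        return "empty list"   # proposed: must be alone on its line
    if text == "{}":
        return "empty dict"   # proposed: must be alone on its line
    if text == "-" or text.startswith("- "):
        return "list item"
    if text == ">" or text.startswith("> "):
        return "string item"
    if text.endswith(":") or ": " in text:
        return "dict item"
    return "unrecognized"


# The example document from the proposal (indentation assumed).
document = """foo:
    []
bar:
    {}
baz:"""

for line in document.splitlines():
    print(f"{classify_line(line):12} | {line}")
```

In this sketch a line such as ‘> []’ is still classified as a string item, so the literal strings remain expressible.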
Problems:
- What to do with a file containing only [] or {}?
  - I was going to propose that this should provide a way for a file to represent an empty collection, but actually this would be backwards incompatible as it’s currently interpreted as a string.
  - Note, however, that this wouldn’t prevent having a file corresponding to those strings, since ‘> []’ or ‘> {}’ could be used.
Related discussion about removal of flow-style in strictyaml: https://hitchdev.com/strictyaml/why/flow-style-removed/ (see Counterarguments).
Regarding the flow-style syntax: This is actually growing on me. I think @KenKundert is right that allowing [] and {} would invite people to put values in them. I initially regarded this as a deal-breaker, but on further thought I think there are actually some good arguments in favor of adding a single-line list/dict syntax:
The lack of flow-style was one of the primary complaints we saw when we first introduced the spec. I think there are two compelling forms of this complaint:
Our response to these complaints was that you can always parse list/dict values from string values (and in fact I often do exactly this). This response is kind-of a cop-out, though, since clearly the purpose of nestedtext is to encode the structure of the data.
This would create a parallel to strings, which already have single- and multi-line forms.
It would become possible to specify empty lists/dicts. I don’t know if this is really much of a benefit, for the reasons discussed above, but it is something. It would at least make some schemas a bit simpler.
I should also be clear about the exact syntax that I have in mind:
List values and dict key/value pairs would be separated by commas, and would not be allowed to contain any of the following characters: ‘,[]{}’. The prohibition on ‘[]’ and ‘{}’ would apply to both dicts and lists, to leave open the possibility of supporting nested flow-style data structures. I’m not sure if the item separator should be ‘,’ or ‘,␣’. A bare ‘,’ is what I would initially expect, but splitting on ‘,␣’ and allowing values that contain a ‘,’ not followed by ‘␣’ would be more in line with how nestedtext parses dictionary keys and list items. My instinct would be to simply split on ‘,’ though.
List:
Dictionary:
Only single-line flow-style data structures would be supported. There’s no need to support multi-line flow-style, since nestedtext already has (nicer) syntax for that. This also greatly simplifies parsing and maintains the property that each line type can be identified just by looking at its first character.
Not ok:
Nested flow-style data structures would not be allowed. This is a restriction that could probably be lifted in the future, but for now would make the implementation simpler and would not affect many use-cases.
Not ok:
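The rules described so far (a single line, no nesting, items split on a separator, all values kept as strings) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the syntax that was eventually adopted: parse_flow is a hypothetical helper, stripping whitespace around items is an assumption, and splitting dict pairs on the first ‘:’ is an assumption since the proposal does not specify it.

```python
BRACKETS = set("[]{}")


def _reject_brackets(parts):
    # Nested flow-style values are not allowed, so no item may contain brackets.
    for part in parts:
        if BRACKETS & set(part):
            raise ValueError(f"brackets are not allowed inside an item: {part!r}")


def parse_flow(line: str, sep: str = ","):
    """Parse a single-line flow-style list or dict into strings (sketch only)."""
    text = line.strip()
    if text.startswith("[") and text.endswith("]"):
        body = text[1:-1]
        if not body:
            return []                      # the empty list
        items = [item.strip() for item in body.split(sep)]
        _reject_brackets(items)
        return items
    if text.startswith("{") and text.endswith("}"):
        body = text[1:-1]
        if not body:
            return {}                      # the empty dict
        result = {}
        for pair in body.split(sep):
            key, colon, value = pair.partition(":")   # assumed key/value split
            if not colon:
                raise ValueError(f"missing ':' in dict entry: {pair!r}")
            result[key.strip()] = value.strip()
        _reject_brackets(list(result) + list(result.values()))
        return result
    # Anything else, including a bracket left open at end of line, is rejected.
    raise ValueError("not a single-line flow-style value")


print(parse_flow("[a, b, c]"))             # split on ','
print(parse_flow("{a: 1, b: 2}"))
print(parse_flow("[a,b, c]", sep=", "))    # split on ',␣' instead
```

Passing sep=", " shows the alternative separator: a comma not followed by a space then stays inside the item.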
I’m not sure how to handle trailing commas, e.g. ‘[a, b, c,]’. Most formats (e.g. Python, TOML) ignore them, but the point of this is to make it easy to add/remove lines from multi-line data structures, which wouldn’t apply here. These same formats also require quotes or brackets or something to identify a value, and so a trailing comma is clearly distinct from a comma with a value after it. But in nestedtext a trailing comma could be reasonably interpreted as a comma followed by an empty string. My instinct would be to go with that interpretation, but I’m not sure.
There are also arguments against this syntax, although I haven’t thought of any yet that I find very compelling:
It does provide two ways to do things, which goes against the “there should be one—and preferably only one—obvious way to do it” philosophy that I generally subscribe to. However, I think it would be pretty obvious to use the flow-style for short data structures and the multiline style for longer ones. You could also argue that the lack of flow style is non-obvious, because most people expect these kinds of structured data formats to have it.
The strictyaml thread linked above argues that flow-style hampers readability, but I don’t really buy that. Ultimately the author of a file is responsible for keeping it readable, and I think the flow-style syntax would provide a tool that could be used to help with that (even if it could also be misused). The requirement for flow-style data structures to fit on a single line also significantly limits the scope for abuse.
The strictyaml thread also mentions that braces can complicate templating. I don’t disagree with this, but I don’t think it’s a major concern. It shouldn’t often be necessary to template nestedtext files, and in any case you can always tell a templating engine to use different brace characters.
This issue has been addressed in v2.0.