
Don't silently fix invalid JSON with duplicate keys

See original GitHub issue

Checklist

  • I’ve searched for similar feature requests.

What enhancement would you like to see?

When a server returns invalid JSON, the HTTPie formatter should either leave it untouched or fail loudly. As it stands, if the JSON contains duplicate keys, HTTPie quietly drops all but one of the duplicates during formatting.
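This silent collapsing matches the default behavior of Python's standard `json` module, which keeps only the last value seen for a repeated key. The snippet below is a minimal reproduction of that default, not HTTPie's actual code path:

```python
import json

# Invalid JSON with the same key repeated three times, as in the
# "dps" object from the example further down.
raw = '{"dps": {"1630064726": 5.0, "1630064726": 3.0, "1630064726": 6.0}}'

# json.loads keeps only the last value for a duplicate key, so
# re-serializing for display silently discards the earlier ones.
print(json.dumps(json.loads(raw)))  # {"dps": {"1630064726": 6.0}}
```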

What problem does it solve?

When debugging broken services, it’s important that I can trust HTTPie to faithfully show me what the server returned, so that I can focus on figuring out what’s going wrong with my services rather than spending that time second-guessing HTTPie.

From my point of view, it’s helpful for HTTPie to format and color text. I’d rather it didn’t sort keys by default, but I know I can reconfigure that. It is a problem, however, if HTTPie removes anything other than insignificant whitespace. If HTTPie is to be a general-purpose tool, it must not mask problems.

Provide any additional information, screenshots, or code examples below:

The example below is my motivation for this feature request. The server is returning invalid JSON with duplicate keys in the “dps” object, but I could not tell it was doing this, because HTTPie had silently “corrected” the output during formatting. If I turn off formatting and only colorize the output, I see the true response.

annettewilson@ANNEWILS02M:~/scratch/undercity$ http GET 'http://opentsdb.skymetrics.prod.skyscanner.local:4242/api/query' m=='zimsum:prod.undercity.search_cycle.meter.count_delta{}{dataCenter=literal_or(EU_WEST_1),endpoint=literal_or(search),from_bot=literal_or(false)}' start==1630064720 end==1630064727
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 232
Content-Type: application/json;charset=utf-8
Date: Wed, 01 Sep 2021 09:07:45 GMT
Vary: Accept-Encoding
X-OpenTSDB-Query-Complexity: 318

[
    {
        "aggregateTags": [
            "ip",
            "traffic_source"
        ],
        "dps": {
            "1630064726": 6.0
        },
        "metric": "prod.undercity.search_cycle.meter.count_delta",
        "tags": {
            "dataCenter": "EU_WEST_1",
            "endpoint": "search",
            "from_bot": "false"
        }
    }
]


annettewilson@ANNEWILS02M:~/scratch/undercity$ http GET 'http://opentsdb.skymetrics.prod.skyscanner.local:4242/api/query' m=='zimsum:prod.undercity.search_cycle.meter.count_delta{}{dataCenter=literal_or(EU_WEST_1),endpoint=literal_or(search),from_bot=literal_or(false)}' start==1630064720 end==1630064727 --pretty colors
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
Date: Wed, 01 Sep 2021 09:07:59 GMT
Vary: Accept-Encoding
X-OpenTSDB-Query-Complexity: 318
Content-Length: 232
Connection: keep-alive

[{"metric":"prod.undercity.search_cycle.meter.count_delta","tags":{"endpoint":"search","dataCenter":"EU_WEST_1","from_bot":"false"},"aggregateTags":["ip","traffic_source"],"dps":{"1630064726":5.0,"1630064726":3.0,"1630064726":6.0}}]

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
jakubroztocil commented, Sep 1, 2021

Interesting that JSON doesn’t prohibit duplicate keys. It makes sense, though: it’s valid at the language-syntax level, and semantically it can be explained as following JavaScript object-notation behavior, where the latest occurrence overwrites any earlier ones.

In any case, we shouldn’t alter the data for display. object_pairs_hook=multidict.MultiDict might be a good start. Cc @BoboTiG
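One way to stop altering the data would be to pass an `object_pairs_hook` to the JSON loader, since that hook receives every (key, value) pair, duplicates included. The duplicate-detecting hook below is an illustrative sketch of that mechanism, not HTTPie’s actual implementation:

```python
import json

def fail_on_duplicates(pairs):
    # json.loads calls this hook with every (key, value) pair of an
    # object, duplicates included, so we can fail loudly instead of
    # silently collapsing them.
    seen = set()
    for key, _ in pairs:
        if key in seen:
            raise ValueError(f"duplicate key in JSON object: {key!r}")
        seen.add(key)
    return dict(pairs)

try:
    json.loads('{"1630064726": 5.0, "1630064726": 6.0}',
               object_pairs_hook=fail_on_duplicates)
except ValueError as exc:
    print(exc)  # duplicate key in JSON object: '1630064726'
```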

1 reaction
jakubroztocil commented, Sep 1, 2021

Thanks for the thorough description. I agree that the current behavior is not acceptable for the formatted output use case. It’s the default behavior of the underlying library we use.

We should customize the JSON loader/serializer we use for formatting to allow repeated keys. Ideally, we’d also extend the JSON lexer to mark repeated keys as errors, but that might be too big of a project for such a rare scenario.
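A sketch of what such a customized loader/serializer pair could look like is below. It keeps objects as raw pair lists so every duplicate survives the round trip; the function names are hypothetical, and a real fix might instead use `multidict.MultiDict` or similar. One caveat of this naive approach: an empty object and an empty array both parse to `[]`.

```python
import json

def keep_pairs(pairs):
    # Return the raw (key, value) list unchanged so no duplicate is lost.
    return pairs

def dump_pairs(obj):
    # Minimal compact re-serializer that emits every pair verbatim.
    # Objects are lists of tuples (from keep_pairs); arrays are plain lists.
    if isinstance(obj, list) and obj and isinstance(obj[0], tuple):
        return "{" + ",".join(
            f"{json.dumps(k)}:{dump_pairs(v)}" for k, v in obj) + "}"
    if isinstance(obj, list):
        return "[" + ",".join(dump_pairs(v) for v in obj) + "]"
    return json.dumps(obj)

raw = '[{"dps":{"1630064726":5.0,"1630064726":3.0,"1630064726":6.0}}]'
data = json.loads(raw, object_pairs_hook=keep_pairs)
print(dump_pairs(data))  # round-trips with all three duplicates intact
```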
