A model is invalid according to model.json but valid according to defs.json

Description

Writing tests, I discovered a model configuration that triggered buggy behavior of the model.json schema. The model.json schema claims the model is invalid, yet the defs.json one accepts it as valid.

Expected Behavior

Either both schemas should find the model invalid, or neither should. So model.json should not raise a validation error (it's not a realistic example, but it seems to me to comply with what is in the schemas).

Actual Behavior

The model is valid according to defs.json, but checking the validity using model.json raises:

jsonschema.exceptions.ValidationError: 'histosys' was expected

Failed validating 'const' in schema[0]['properties']['type']:
    {'const': 'histosys'}

On instance['type']:
    'normfactor'

Steps to Reproduce

Package versions (pyhf is 0.5.0 from pip with xmlio and tensorflow, but it is not needed to reproduce the issue):

>>> json.__version__
'2.0.9'
>>> jsonschema.__version__
'3.2.0'
>>> requests.__version__
'2.25.1'

Example:

import json
import jsonschema
import requests

j = ('{"channels": ['
          '{"name": "test_channel_1", '
          ' "samples": ['
            '{"name": "test_sample", '
              '"data": [0.5, 3.33, 666.666], '
              '"modifiers": ['
                '{"name": "lalala", '
                  '"type": "normfactor"}]}]}, '
          '{"name": "test_channel_2", '
          '"samples": ['
            '{"name": "test_sample_2", '
            '"data": [0, 0, 1], '
            '"modifiers": []}]}], '
      '"parameters": ['
          '{"name": "p1"}, '
          '{"name": "parameter of minor interest", '
            '"factors": [0.0072992700729927005]}, '
          '{"name": "very curious parameter", '
          '"fixed": true}]}')

d = json.loads(j)

jsonschema.validate(
    instance = d,
    schema = requests.get(
        'https://scikit-hep.org/pyhf/schemas/1.0.0/defs.json').json()) # works
jsonschema.validate(
    instance = d,
    schema = requests.get(
        'https://scikit-hep.org/pyhf/schemas/1.0.0/model.json').json()) # raises

Checklist

  • Run git fetch to get the most up to date version of master
  • Searched through existing Issues to confirm this is not a duplicate issue
  • Filled out the Description, Expected Behavior, Actual Behavior, and Steps to Reproduce sections above or have edited/removed them in a way that fully describes the issue

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

lhenkelm commented, Mar 8, 2021

Thanks a lot for the fast and comprehensive answers! I’ll go and write some schemas then 😃

kratsg commented, Mar 8, 2021

@lhenkelm there are a couple of problems at play here. Fundamentally, the issue is that your jsonschema implementation isn’t quite correct, as it will not resolve references for you on the fly. See https://python-jsonschema.readthedocs.io/en/stable/references/ for background. You can see julian/jsonschema#274 for more details, and in particular my gist, which shows how you can write a ref resolver to resolve this correctly.

Of course, this is pretty annoying to do (as I realized when I was fleshing out the JSON schema) and support for ref resolver has still waned a bit even to this day… This is why we provide an (undocumented, sorry!) validation utility that wraps jsonschema for you and handles all of this annoying boilerplate code.
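
To make "resolving a reference" a bit more concrete, here is a minimal standard-library sketch of following a same-document `$ref` such as `#/definitions/modifier/normfactor`. It is only an illustration: `resolve_local_ref` and `toy_schema` are made-up names, the toy schema is not the real defs.json, and this is not how jsonschema or pyhf implement resolution. References that point into another file (as model.json does with defs.json) additionally require loading that other document, which is the part a ref resolver takes care of.

```python
def resolve_local_ref(schema, ref):
    """Follow a same-document JSON Pointer reference like '#/definitions/x'.

    Hypothetical helper for illustration only; it does not handle
    references into other files or URLs.
    """
    if not ref.startswith("#"):
        raise ValueError(f"only same-document references are handled: {ref!r}")
    node = schema
    pointer = ref[1:]
    if pointer == "":
        return node  # '#' alone refers to the whole document
    for token in pointer.split("/")[1:]:
        # Undo JSON Pointer escaping: '~1' means '/', '~0' means '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


# Toy schema (an assumption, NOT the actual pyhf defs.json).
toy_schema = {
    "definitions": {
        "modifier": {
            "normfactor": {
                "type": "object",
                "required": ["name", "type", "data"],
            }
        }
    },
    "anyOf": [{"$ref": "#/definitions/modifier/normfactor"}],
}

target = resolve_local_ref(toy_schema, toy_schema["anyOf"][0]["$ref"])
print(target["required"])
```

A validator that skips this step, or resolves the reference against the wrong document, ends up checking the instance against a different subschema than the author intended.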

Here’s your example, rewritten to use that utility, which gives the right error for your spec:

$ cat issue1360.py 
import json
import pyhf

j = """{
  "channels": [
    {
      "name": "test_channel_1",
      "samples": [
        {
          "name": "test_sample",
          "data": [
            0.5,
            3.33,
            666.666
          ],
          "modifiers": [
            {
              "name": "lalala",
              "type": "normfactor"
            }
          ]
        }
      ]
    },
    {
      "name": "test_channel_2",
      "samples": [
        {
          "name": "test_sample_2",
          "data": [
            0,
            0,
            1
          ],
          "modifiers": []
        }
      ]
    }
  ],
  "parameters": [
    {
      "name": "p1"
    },
    {
      "name": "parameter of minor interest",
      "factors": [
        0.0072992700729927005
      ]
    },
    {
      "name": "very curious parameter",
      "fixed": true
    }
  ]
}"""

d = json.loads(j)

pyhf.utils.validate(d, 'defs.json')
pyhf.utils.validate(d, 'model.json')

which complains with

$ python issue1360.py 
Traceback (most recent call last):
  File "/Users/kratsg/.pyenv/versions/pyhf-dev/lib/python3.8/site-packages/pyhf/utils.py", line 49, in validate
    return validator.validate(spec)
  File "/Users/kratsg/.pyenv/versions/pyhf-dev/lib/python3.8/site-packages/jsonschema/validators.py", line 353, in validate
    raise error
jsonschema.exceptions.ValidationError: {'name': 'lalala', 'type': 'normfactor'} is not valid under any of the given schemas

Failed validating 'anyOf' in schema['properties']['channels']['items']['properties']['samples']['items']['properties']['modifiers']['items']:
    {'anyOf': [{'$ref': '#/definitions/modifier/histosys'},
               {'$ref': '#/definitions/modifier/lumi'},
               {'$ref': '#/definitions/modifier/normfactor'},
               {'$ref': '#/definitions/modifier/normsys'},
               {'$ref': '#/definitions/modifier/shapefactor'},
               {'$ref': '#/definitions/modifier/shapesys'},
               {'$ref': '#/definitions/modifier/staterror'}]}

On instance['channels'][0]['samples'][0]['modifiers'][0]:
    {'name': 'lalala', 'type': 'normfactor'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "issue1360.py", line 60, in <module>
    pyhf.utils.validate(d, 'model.json')
  File "/Users/kratsg/.pyenv/versions/pyhf-dev/lib/python3.8/site-packages/pyhf/utils.py", line 51, in validate
    raise InvalidSpecification(err, schema_name)
pyhf.exceptions.InvalidSpecification: {'name': 'lalala', 'type': 'normfactor'} is not valid under any of the given schemas.
	Path: channels[0].samples[0].modifiers[0]
	Instance: {'name': 'lalala', 'type': 'normfactor'} Schema: model.json

and this is because you neglected to put data: null in for the normfactor. We require that all modifiers have a data parameter, even if they’re not used. The reason is to keep the structure as highly consistent as possible.

In your case, you will want

{
    "name": "lalala",
    "type": "normfactor",
    "data": null
}
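
If a spec contains many such modifiers, the missing key can also be filled in programmatically before validating. The following is a rough sketch under stated assumptions: `add_missing_modifier_data` is a hypothetical helper, not part of pyhf, and it only makes sense for modifiers that do not use their data (like the normfactor above). It does not validate anything itself.

```python
import copy
import json


def add_missing_modifier_data(spec):
    """Return a copy of `spec` in which every modifier has a "data" key.

    Hypothetical helper, not part of pyhf. Modifiers that lack "data"
    get None, which json.dumps serializes as null.
    """
    fixed = copy.deepcopy(spec)  # leave the caller's spec untouched
    for channel in fixed.get("channels", []):
        for sample in channel.get("samples", []):
            for modifier in sample.get("modifiers", []):
                modifier.setdefault("data", None)
    return fixed


# Trimmed-down version of the spec from the issue.
spec = {
    "channels": [
        {
            "name": "test_channel_1",
            "samples": [
                {
                    "name": "test_sample",
                    "data": [0.5, 3.33, 666.666],
                    "modifiers": [{"name": "lalala", "type": "normfactor"}],
                }
            ],
        }
    ]
}

fixed = add_missing_modifier_data(spec)
print(json.dumps(fixed["channels"][0]["samples"][0]["modifiers"][0]))
```

Modifiers that already carry a "data" entry are left as they are, so the helper is safe to run on a spec that is partly correct.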