Allow undefined units?
Currently, pyam raises an error when initializing from a pandas.DataFrame containing nan, or when reading from xlsx/csv with empty cells, in any index/coordinate dimension. This makes sense for model, scenario, variable and region, but there could be timeseries without a natural unit (e.g., population or number of power plants if not given in an order of magnitude, or shares and probabilities if given on the range [0, 1]).
The reason this was introduced in the first place is that pandas has issues with filtering and other operations when nan values are present.
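To make the pandas behaviour concrete, here is a minimal sketch using plain pandas (not pyam itself). The column names and values are made up for illustration, and this shows only two common pitfalls; it is not necessarily the exact problem that motivated the check in pyam.

```python
import numpy as np
import pandas as pd

# Illustrative data: one row has no unit (nan).
df = pd.DataFrame(
    {
        "variable": ["Population", "Primary Energy"],
        "unit": [np.nan, "EJ/yr"],
        "value": [7.8, 500.0],
    }
)

# nan never compares equal to anything, including itself,
# so an equality filter cannot select the row without a unit.
matches = df[df["unit"] == np.nan]

# groupby drops nan keys by default, so the unitless row
# silently disappears from the aggregate.
grouped = df.groupby("unit")["value"].sum()

# The row is only reachable through an explicit isna() mask.
unitless = df[df["unit"].isna()]
```

Here `matches` is empty and `grouped` contains only the `EJ/yr` entry, even though the data has two rows.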
One solution could be to allow variables without units by replacing nan with '' (an empty string). That way, filtering and other operations should behave as expected, and writing to csv/xlsx should again generate empty cells (same as the original when reading from a file).
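A rough sketch of that idea with plain pandas, assuming a wide-format csv with a `unit` column (the data below is invented, and this is not how pyam implements it):

```python
import io

import pandas as pd

# Illustrative csv content: the second row has an empty unit cell.
csv_text = (
    "model,scenario,region,variable,unit,2020,2030\n"
    "model_a,scen_a,World,Primary Energy,EJ/yr,500.0,550.0\n"
    "model_a,scen_a,World,Share of Renewables,,0.2,0.35\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# pandas parses the empty cell as nan; replace it with an empty string.
df["unit"] = df["unit"].fillna("")

# Filtering on the empty string now behaves like any other unit.
no_unit = df[df["unit"] == ""]

# Writing back to csv produces an empty cell again.
out = df.to_csv(index=False)
```

After the replacement, `no_unit` holds the "Share of Renewables" row, and the csv text in `out` has an empty unit field for that row, as in the input.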
Issue Analytics
- Created 3 years ago
- Comments:6 (4 by maintainers)

I’m not sure I agree with any of these; I would say all of them are either ‘dimensionless’ or some derived unit based on mole. The point here being that I can’t see any case in which you’d want to encourage not having a unit.
Irrespective of that, reading empty or missing cells as an empty string for units sounds like a good solution to me.
Of the examples given, the typical pint unit is ‘dimensionless’. Specifying that explicitly can be helpful, particularly if you ever want to add unit aware operations (e.g. https://github.com/openscm/scmdata/blob/master/notebooks/ops.ipynb https://scmdata.readthedocs.io/en/latest/scmdata.ops.html) in future.
I’m trying to think now, and I can’t think of any timeseries which would actually have no unit but maybe my imagination is too limited.
You could simply assume dimensionless if nothing else is provided?
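For what it's worth, a small sketch of what that default could look like in plain pandas. The helper name `fill_missing_units` and the `unit` column name are hypothetical, not part of pyam:

```python
import pandas as pd


def fill_missing_units(df, default="dimensionless"):
    """Return a copy of df where missing or blank units are set to `default`.

    Hypothetical helper for illustration; assumes a column named "unit".
    """
    out = df.copy()
    unit = out["unit"]
    # Treat nan/None and empty or whitespace-only strings as "no unit given".
    missing = unit.isna() | (unit.astype(str).str.strip() == "")
    out.loc[missing, "unit"] = default
    return out


# Illustrative data with a None unit, a blank unit and a regular unit.
data = pd.DataFrame(
    {
        "variable": ["Population", "Share of Renewables", "Primary Energy"],
        "unit": [None, "", "EJ/yr"],
    }
)
filled = fill_missing_units(data)
```

Passing `default=""` instead would give the empty-string variant discussed earlier in the thread, so the choice between the two stays a one-argument decision.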