Should we make a harmonize module?
We are at the moment beginning to pick up more use of `aneris` to do harmonization at various stages of analysis. When I first wrote it, I had a specific use case/project in mind, but that is now expanding across a couple of different areas. And it was “pre pyam”.
So, a question to our community here: would it make sense to operationally move `aneris` inside of `pyam`? I would envision this to effectively wrap the current `aneris` functionality and return a `pyam.IamDataFrame` of results (or inplace). Currently, `aneris` uses a decision-tree approach by default, but we could allow this to be overridden by a single method in this interface to keep things simple as well.
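To make the "single method" override described above concrete, here is a minimal sketch using plain pandas. It does not use the aneris or pyam APIs: the `harmonize` function, its `method` argument, and the toy numbers are all hypothetical, and the two adjustments shown (a constant ratio and a constant offset) merely stand in for whatever methods the real interface would expose.

```python
import pandas as pd


def harmonize(model: pd.Series, historical: pd.Series,
              base_year: int, method: str = "ratio") -> pd.Series:
    """Hypothetical helper: make `model` match `historical` in `base_year`."""
    hist_value = historical.loc[base_year]
    model_value = model.loc[base_year]
    if method == "ratio":
        # Scale the whole trajectory by the base-year ratio.
        return model * (hist_value / model_value)
    if method == "offset":
        # Shift the whole trajectory by the base-year difference.
        return model + (hist_value - model_value)
    raise ValueError(f"unknown method: {method!r}")


# Toy data (made up for illustration), indexed by year.
model = pd.Series({2015: 10.0, 2020: 12.0, 2030: 8.0})
historical = pd.Series({2010: 9.0, 2015: 11.0})

ratio_result = harmonize(model, historical, base_year=2015, method="ratio")
offset_result = harmonize(model, historical, base_year=2015, method="offset")
```

Both results equal the historical value in 2015; they differ in how the base-year mismatch is carried into later years, which is the kind of choice a decision tree would otherwise make per trajectory.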
I am wondering what others think about this. I know that harmonization is a very common operation and probably implemented all over the place. Would it make sense to have one ‘canonical’ interface for it? Would you use it if there was one? Or should we just keep it separate and in its own repo?
I will admit bias: I think it would be nice to have it be part of the community toolbox to also help with maintenance efforts. I do not want it to die due to neglect =)
Would love any and all input, but especially from @danielhuppmann @coroa @gaurav-ganti @l-welder @znicholls @Rlamboll @jkikstra
Hi all, apologies for the delayed response.
From a user perspective, I’m very happy to see this discussion going on. Indeed, as @gidden mentions, while we now use aneris for emissions, harmonization is a common operation that can (and should?) happen much more widely (i.e. for many more variables) in many assessments across the community; one could perhaps think about harmonizing carbon prices or solar capacities.
With this in mind, it would be really great to see the core of aneris being repurposed for creating a `pyam.IamDataFrame.harmonize()` function (the first option argued for by @coroa). Indeed, we could pass a method as an argument. Maybe even write a ‘generalised’ tree function, but I have no feeling for whether such a thing is possible or makes sense (it will be very much ‘expert judgement’ for setting the parameters; @gidden might know better here?).
(EDIT: I did not say hi on any of the available channels yet. So here it is: HI. I joined @gidden's work-team two months ago and have become a regular user of aneris and pyam since. I typically have many ideas for improvements and am not too shy to propose PRs, if time permits. Happy to work together with all of you.)
I don’t have a strong preference for either solution and would make the choice dependent on the envisioned interface:
If we want to have an interface like:
then I would urge including the aneris modules directly in the pyam repository, rather than carrying aneris along as an explicit dependency.
If we are happy enough with an update to the aneris `HarmonizationDriver` to allow something like:
then pyam does not need the aneris dependency and aneris can live happily alongside pyam.
In either case, I think it makes sense to think about making region mappings a bit more explicit in pyam (i.e. provide a simple interface to the `region_mapping.csv` delivered with it) in conjunction with an adaptation of aneris.