Contributor guide?

So far I’ve been developing xESMF on my own. There are sporadic PRs (#23, #27), but I am actually not sure how to best handle them. Given the increasing community interest, it would be useful to talk about how people can contribute to xESMF.

The “Contributing to xarray” guide is a good reference on the software engineering side (style, testing, documentation, bug fixes…). Here I am thinking more about the science, usability, and algorithm sides that are specific to xESMF.

There are several types of contributions I can think of:

1. Contribution to examples and tutorials

This is the easiest one and is highly welcome. I am very interested in how people regrid all kinds of data in different Earth science disciplines (environmental, atmospheric, oceanic, land, remote sensing…). I often see xESMF being successfully used to deal with grid meshes that I haven’t seen before (e.g. the “tri-polar grid” #14).

An example can be just a Jupyter notebook, focusing on one or more of the following aspects:

  • A specific scientific application (e.g. converting CMIP5 data from multiple model grids to a common grid for comparison; see pangeo-data/pangeo#309)
  • The type of grid mesh (e.g. WRF’s Lambert conformal projection, MITgcm’s lat-lon-cap, or even the Yin-Yang grid that most people have little experience with. Grid meshes are fun!)
  • The choice of algorithm (e.g. while bilinear and conservative are the most common, the nearest neighbor method is actually great for categorical data such as a land type index)
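
To make the last point concrete, here is a minimal sketch of why nearest neighbor suits categorical data. It uses plain Python and a 1-D grid as a stand-in for the 2-D bilinear case; it does not call xESMF, and the land-type codes and helper names (`linear_interp`, `nearest`) are hypothetical.

```python
from bisect import bisect_left


def linear_interp(x_src, y_src, x):
    """Piecewise-linear interpolation (a 1-D stand-in for bilinear)."""
    # Index of the right-hand neighbor, clamped to a valid interval.
    i = min(max(bisect_left(x_src, x), 1), len(x_src) - 1)
    x0, x1 = x_src[i - 1], x_src[i]
    w = (x - x0) / (x1 - x0)
    return (1 - w) * y_src[i - 1] + w * y_src[i]


def nearest(x_src, y_src, x):
    """Value at the closest source point (ties go to the lower index)."""
    i = min(range(len(x_src)), key=lambda k: abs(x_src[k] - x))
    return y_src[i]


# Hypothetical land-type codes on a coarse grid: 1 = water, 5 = forest.
x_src = [0.0, 1.0, 2.0, 3.0]
land_type = [1, 1, 5, 5]
x_dst = [0.25, 1.25, 1.75, 2.75]

linear_result = [linear_interp(x_src, land_type, x) for x in x_dst]
nearest_result = [nearest(x_src, land_type, x) for x in x_dst]

print(linear_result)   # contains 2.0 and 4.0, which are not valid codes
print(nearest_result)  # only codes that exist in the source data
```

Linear weights blend the codes 1 and 5 into intermediate values that correspond to no land type, while nearest neighbor can only return values already present in the source.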

Guideline for tutorials/examples:

  • Full reproducibility is required. NCL has a wonderful page of regridding examples, but I have trouble running most scripts due to missing data. For the xESMF docs, I only use data from xr.tutorial.load_dataset() or data computed on the fly. Small data used in an example (say less than 20 MB?) can be submitted and added to an xESMF-data repo, just like the xarray-data repo. For large data, a stable link must be provided. The Pangeo platform on GCP/AWS seems like a good place for hosting large data.
  • A brief introduction to the scientific problem and why regridding is needed would be very useful.
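
As an illustration of “data computed on the fly”, the sketch below builds a small global grid and an analytic test field using only the standard library, so anyone can reproduce it without downloading files. The function names (`make_latlon_grid`, `smooth_wave`) and the particular field are assumptions for this example; a real notebook would more likely use numpy and xarray.

```python
import math


def make_latlon_grid(dlat, dlon):
    """Cell-center coordinates (degrees) of a global regular lat-lon grid."""
    nlat, nlon = round(180 / dlat), round(360 / dlon)
    lat = [-90 + dlat * (j + 0.5) for j in range(nlat)]
    lon = [dlon * (i + 0.5) for i in range(nlon)]
    return lat, lon


def smooth_wave(lat, lon):
    """Analytic field 2 + cos(lat)^2 * cos(2*lon), as nested lists [lat][lon]."""
    return [
        [
            2 + math.cos(math.radians(la)) ** 2 * math.cos(2 * math.radians(lo))
            for lo in lon
        ]
        for la in lat
    ]


# A 4 x 5 degree grid: 45 latitudes by 72 longitudes.
lat, lon = make_latlon_grid(4.0, 5.0)
field = smooth_wave(lat, lon)
print(len(field), len(field[0]))
```

Because the field is a closed-form function of the coordinates, every run produces the same values, and the exact answer is known on any destination grid, which makes regridding errors easy to measure.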

2. Contribution to standalone, small utilities

Many issues on GitHub belong to “small utilities” (e.g. #15, #16). They do not have a large impact on the core regridding, but are crucial for usability and user experience. Developing those small utilities is much easier than hacking the regridding core, and they often do not require ESMF/ESMPy knowledge. It is also much easier for me to handle dependencies.

General principles:

  • The functionality should be closely related to regridding. An example is computing cell area, which is useful for checking mass conservation before/after conservative regridding. Code that constructs a particular grid mesh (unless it is an extremely common one, like regular lat-lon) belongs in examples, not utilities.
  • Avoid complicated data structures. Stick to xarray.Dataset and numpy whenever possible. Compatibility with pure numpy arrays is encouraged.
  • Minimize dependency on other functions, especially other “small utilities”. xESMF is still young and the code base is in flux.
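
The cell-area utility mentioned above could look roughly like the following sketch for a regular lat-lon grid on a sphere. It is written in plain Python to stay dependency-free; the names `cell_area` and `area_weighted_total` and the radius constant are assumptions for illustration, not existing xESMF functions.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumed mean Earth radius in meters


def cell_area(lat_bounds, lon_bounds, radius=EARTH_RADIUS_M):
    """Areas (m^2) of regular lat-lon cells, given cell bounds in degrees.

    Uses the spherical formula R^2 * dlon * (sin(lat_north) - sin(lat_south))
    and returns nested lists indexed as [lat][lon].
    """
    dlon = [math.radians(e - w) for w, e in zip(lon_bounds[:-1], lon_bounds[1:])]
    dsin = [
        math.sin(math.radians(n)) - math.sin(math.radians(s))
        for s, n in zip(lat_bounds[:-1], lat_bounds[1:])
    ]
    return [[radius**2 * ds * dl for dl in dlon] for ds in dsin]


def area_weighted_total(field, area):
    """Sum of field * area over all cells (the 'mass' of the field)."""
    return sum(
        f * a for f_row, a_row in zip(field, area) for f, a in zip(f_row, a_row)
    )


# Global grid with 10-degree latitude and 20-degree longitude cells.
lat_b = [-90 + 10 * j for j in range(19)]
lon_b = [20 * i for i in range(19)]
area = cell_area(lat_b, lon_b)

total_area = sum(map(sum, area))
sphere_area = 4 * math.pi * EARTH_RADIUS_M**2
print(total_area / sphere_area)  # should be very close to 1
```

Comparing `area_weighted_total` of a field on the source grid with the same quantity on the destination grid is one simple way to check mass conservation after conservative regridding.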

3. Contribution to core functionalities

I very much welcome hard-core xarray/dask/ESMF/Pangeo developers to tackle some of the most challenging problems. For example:

  • Out-of-core, parallel (#3), and even distributed (pangeo-data/pangeo#334, #26) regridding
  • Unstructured grid (#18)

For those big questions, it is better to discuss them on GitHub before starting serious coding.

When & Where to start

I am still planning a significant refactor of the code base to better support critical features, notably dask support (#3), accepting Dataset (#5), and retrieving weights (#11). (It is moving slowly because xESMF is my personal, unfunded side project😐, and I have a lot of other projects on hand. My life would probably be easier if I wrote a GMD/JORS paper on it, so that it could count towards my PhD…) At this stage, hacking the core might not be the best choice, because the internal code (not the user API) is very likely to change. Contributing examples, tutorials, and use cases is the safe bet.

TODO:

  • Add a Contributor Guide to online docs

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 3
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
jhamman commented, Aug 23, 2018

Something that may be of interest on the xESMF roadmap is to move the repo to another namespace. xarray-contrib comes to mind as an obvious option. This may help increase the likelihood of gaining outside contributors and gives the package a more elevated platform. This is really just semantics at this point but something to think about.

1 reaction
shoyer commented, Aug 20, 2018

“For xESMF doc, I only use data from xr.tutorial.load_dataset() or data computed on the fly”

Note that we would be very happy to add more examples to xarray-data to round out our current set (which is quite limited).
