Minor edge case in Cminus leads to nans in output
I encountered a situation where the cuml_x and cuml_y arrays returned by astroML.lumfunc.Cminus contain nans. This might only happen with bad luck.
Code to reproduce:
from astroML.lumfunc import Cminus
x, y, xmax, ymax = [10.02, 10.00], [14.97, 14.99], [10.03, 10.01], [14.98, 15.00]
Cminus(x, y, xmax, ymax)
Output:
(array([0., 0.]), array([0., 0.]), array([ 0., nan]), array([ 0., nan]))
Astrophysical context for the above input: I would like to use the Cminus method to recover the true distributions of distance modulus (DM) and absolute magnitude (Mr), since my sample is limited by a flux limit of r < 25. Letting x = DM and y = Mr, I can calculate xmax and ymax (xmax = r_lim - y and ymax = r_lim - x).
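As a minimal sketch of how the limits in the reproduction snippet could be derived, assuming the apparent magnitude is r = DM + Mr and the cut is r_lim = 25 (the name `r_lim` comes from the text above; the array construction is only illustrative):

```python
import numpy as np

r_lim = 25.0  # flux limit quoted in the issue (r < 25)

# Values from the reproduction snippet: x = DM, y = Mr
x = np.array([10.02, 10.00])
y = np.array([14.97, 14.99])

# Assumption: r = DM + Mr, so each coordinate's limit follows from the other
xmax = r_lim - y  # largest DM at which this Mr is still observable
ymax = r_lim - x  # faintest Mr at which this DM is still observable

print(xmax, ymax)
```

Up to floating-point rounding, this gives the xmax and ymax lists passed to Cminus in the snippet.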
This example is a peculiar case where x[0] > xmax[1]. This means Nx[1] = np.sum(x[:1] < xmax[1]) is 0, which causes cuml_x = np.cumprod(1. + 1. / Nx) to evaluate to nan (see source). I think this only happens when, after sorting, np.sum(x[:j] < xmax[j]) == 0 for some j, usually a very small j. I saw this while using simulated LSST data. I'm not familiar enough with Cminus to come up with a fix for this edge case, and it seemed to show up most often when using astroML.lumfunc.bootstrap_Cminus.
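The mechanism can be sketched with plain NumPy on the reproduction input. This is not the astroML source: it assumes the routine sorts by y, counts earlier points inside the limit, forces the first factor to 1, takes the cumulative product, and renormalizes by the last element. Under those assumptions the zero count produces inf in the product, and the renormalization turns it into nan:

```python
import numpy as np

x = np.array([10.02, 10.00])
y = np.array([14.97, 14.99])
xmax = np.array([10.03, 10.01])

# Assumed step: sort everything by y before counting
order = np.argsort(y)
x, xmax = x[order], xmax[order]

# Count earlier points that fall inside the limit of point j
N = np.zeros(len(x))
for j in range(1, len(x)):
    N[j] = np.sum(x[:j] < xmax[j])  # 10.02 < 10.01 is False, so N[1] == 0
N[0] = np.inf  # assumed: makes the first factor exactly 1

with np.errstate(divide="ignore", invalid="ignore"):
    cuml = np.cumprod(1.0 + 1.0 / N)  # [1., inf]
    normalized = cuml / cuml[-1]      # [1/inf, inf/inf] = [0., nan]

print(N, cuml, normalized)
```

The `[0., nan]` result matches the last two arrays in the reported output.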
Issue Analytics
- State:
- Created 3 years ago
- Comments: 6 (3 by maintainers)
This case is fascinating! The Cminus method is 50 years old and as far as I know nobody ever reported this edge case. I agree with your assessment that the problem is due to the first point having a larger y than y_max for the second point. An easy and approximately correct fix would be to set Nx (and Ny) to 0.5 whenever they happen to be 0.
PR with the np.inf fix is in #237, while the 0.5 approach is on the branch here if either of you prefers to check it out, too: https://github.com/bsipocz/astroML/tree/cminus_fix_nans_0.5
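For comparison, here is a rough sketch of what the two options might look like on a toy count array. The helper `guarded_cumprod` is hypothetical and is not taken from #237 or the linked branch; it only assumes that each fix replaces zero counts (with 0.5 or with np.inf) before the cumulative product is formed and renormalized:

```python
import numpy as np

def guarded_cumprod(N, fill):
    """Hypothetical helper: replace zero counts with `fill`, then
    form the normalized cumulative product of (1 + 1/N)."""
    N = np.asarray(N, dtype=float).copy()
    N[N == 0] = fill
    cuml = np.cumprod(1.0 + 1.0 / N)
    return cuml / cuml[-1]

# Toy counts: first entry is inf by convention, second hits the edge case
N = np.array([np.inf, 0.0, 2.0])

half = guarded_cumprod(N, 0.5)        # zero count treated as 0.5
inf_fill = guarded_cumprod(N, np.inf)  # zero count contributes a factor of 1

print(half)
print(inf_fill)
```

Both variants stay finite. Filling with np.inf leaves the cumulative distribution flat across the affected point, while 0.5 gives it a large step (a factor of 3), so the choice changes the resulting distribution near the edge case.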