Working with missing data
I'm trying to interpolate data that contains missing values using PyKrige. Is this possible? So far, I have run into the following error:
import numpy as np
from pykrige.ok import OrdinaryKriging

data = np.array([[0.3, 1.2, np.nan],
                 [1.9, 0.6, np.nan],
                 [1.1, 3.2, np.nan],
                 [3.3, 4.4, 1.47],
                 [4.7, 3.8, 1.74]])
gridx = np.arange(0.0, 5.5, 0.5)
gridy = np.arange(0.0, 5.5, 0.5)
OK = OrdinaryKriging(data[:, 0], data[:, 1], data[:, 2], variogram_model='linear', verbose=False)
Traceback (most recent call last):
  File "<ipython-input-40-17311a362b4a>", line 17, in <module>
    OK = OrdinaryKriging(data[:,0],data[:,1],data[:,2],variogram_model='linear',verbose=False)
  File "~/python3.6/site-packages/pykrige/ok.py", line 232, in __init__
    self.variogram_function, nlags, weight)
  File "~/python3.6/site-packages/pykrige/core.py", line 199, in initialize_variogram_model
    variogram_function, weight)
  File "~/python3.6/site-packages/pykrige/core.py", line 286, in calculate_variogram_model
    x0 = [(np.amax(semivariance) - np.amin(semivariance))/(np.amax(lags) - np.amin(lags)),
  File "~/python3.6/site-packages/numpy/core/fromnumeric.py", line 2252, in amax
    out=out, **kwargs)
  File "~/python3.6/site-packages/numpy/core/_methods.py", line 26, in _amax
    return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity
Is there a workaround for this?
Thanks!
Issue Analytics
- State:
- Created 6 years ago
- Reactions:1
- Comments:7 (5 by maintainers)
Sorry for the slow response on this. I think you're running into problems because you're only feeding in two points in this example. Since you don't specify any variogram model parameters, the code tries to calculate the variogram model (linear in your case) automatically from the data. However, with only two points there is only one lag value (one distance pair), so the variogram model can't be calculated. (You can't fit a line to one point.) If you look at OK.variogram_model_parameters, you'll notice that they're NaNs, because no model parameters could be calculated. So when you try to solve the kriging system, the matrix gets filled with NaNs due to the lack of a variogram model, hence the ValueError you're getting. Hope that makes some sense…

As mentioned: filter out the points with nan values, use the remaining points to condition the kriging system, and evaluate at the filtered-out points. In the case of the linked example, xpred should then hold only the positions of interest (where nan values occurred), and y and x should be shrunk to hold only the positions and data where no nan values occur.
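
For anyone landing here later, the two replies above can be sketched roughly as follows. This is only a minimal sketch, not code from PyKrige itself. The helper names split_by_nan, count_pairs and krige_missing are made up for illustration, and it assumes the data is an (n, 3) array of x, y, z rows as in the question. The PyKrige calls are limited to the OrdinaryKriging constructor from the question and its execute method in "points" style.

```python
import numpy as np


def split_by_nan(data):
    """Split an (n, 3) array of x, y, z rows into known and missing parts."""
    data = np.asarray(data, dtype=float)
    known = ~np.isnan(data[:, 2])          # True where z was observed
    return data[known], data[~known, :2]   # (x, y, z) rows, (x, y) rows


def count_pairs(n_points):
    """Number of distinct point pairs available for the variogram fit."""
    return n_points * (n_points - 1) // 2


def krige_missing(data, variogram_model="linear"):
    """Estimate z where it is NaN, conditioning only on rows where z is known.

    Hypothetical helper: it still needs enough known points for the
    automatic variogram fit to succeed.
    """
    from pykrige.ok import OrdinaryKriging  # imported lazily; needs PyKrige

    observed, targets = split_by_nan(data)
    ok = OrdinaryKriging(observed[:, 0], observed[:, 1], observed[:, 2],
                         variogram_model=variogram_model, verbose=False)
    # The "points" style evaluates only at the listed coordinates.
    z_pred, variance = ok.execute("points", targets[:, 0], targets[:, 1])
    return np.asarray(z_pred), np.asarray(variance)


# The data from the question: only two of the five rows have a z value.
question_data = np.array([[0.3, 1.2, np.nan],
                          [1.9, 0.6, np.nan],
                          [1.1, 3.2, np.nan],
                          [3.3, 4.4, 1.47],
                          [4.7, 3.8, 1.74]])
observed, targets = split_by_nan(question_data)
print(len(observed), "known points ->", count_pairs(len(observed)), "pair(s)")
```

Note that the check at the bottom shows why the question's data is still a problem after filtering: two known points give a single distance pair, which is the situation described in the first reply. krige_missing is therefore only expected to help when more observations are available. Supplying the variogram parameters explicitly instead of relying on the automatic fit may be another option, but that is not tested here.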