Timeseries can be tricked by a couple bad first inputs
If timeseries.detectEquilibration is given a series whose first one or two data points are wildly off, the true equilibration time is badly underpredicted. @jchodera asked me to post sample data and a script that regenerates the plot and output below.
In the plot, the top panel is the full timeseries with bars showing where “Equilibration” stops; the bottom panel is the same data from the 2nd sample onwards (series[1:]). Below the image is the code used to create it. The dataset is in the attached zip file. @jchodera, if you only want the series to look at, you only need the imports and the first three lines of the code below.
Full Trajectory -- Equilibration 0, Subsample Rate 1.75, Num Effective 1710
Trajectory w/o initial sample -- Equilibration 20, Subsample Rate 3.75, Num Effective 794
import numpy as np
import matplotlib.pyplot as plt
from pymbar import timeseries

y = np.load('explicit0_timeseries.npy')
[n_equilibration, g_t, n_effective_max] = timeseries.detectEquilibration(y)
[n_short, g_t_short, n_eff_max_short] = timeseries.detectEquilibration(y[1:])
print("Full Trajectory -- Equilibration {0:d}, Subsample Rate {1:3.2f}, Num Effective {2:d}".format(
    n_equilibration, g_t, int(np.floor(n_effective_max))
))
print("Trajectory w/o initial sample -- Equilibration {0:d}, Subsample Rate {1:3.2f}, Num Effective {2:d}".format(
    n_short, g_t_short, int(np.floor(n_eff_max_short))
))

# Top panel: full series. Bottom panel: series without the first sample.
f, (a, b) = plt.subplots(2, 1)
x = np.arange(y.size)
a.plot(x, y, 'k-', label='Timeseries')
b.plot(x[1:], y[1:], '-k')
for p in [a, b]:
    # Save the axis limits so the vertical markers do not rescale the panel
    ylim = p.get_ylim()
    xlim = p.get_xlim()
    p.set_xlabel('Iteration')
    p.set_ylabel(r'$\sum_k u_{k,k,n}$')
    # Blue: equilibration point detected on the full series
    p.vlines(n_equilibration, *ylim,
             colors='b', linewidth=1,
             label='Full Timeseries: Num Samples={}'.format(int(np.floor(n_effective_max))))
    # Red: equilibration point detected on series[1:]
    p.vlines(n_short, *ylim,
             colors='r', linewidth=1,
             label='Timeseries[1:]: Num Samples={}'.format(int(np.floor(n_eff_max_short))))
    p.set_ylim(ylim)
    p.set_xlim(xlim)
a.legend()
f.savefig('bad_series.png', bbox_inches='tight')
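One possible reading of the numbers above, offered as an assumption rather than a confirmed diagnosis: a single huge first sample dominates the variance of the series, so the normalized autocorrelation shrinks toward zero, the estimated statistical inefficiency g collapses toward 1, and the effective sample count (N - t0) / g looks largest at t0 = 0. The sketch below tries to reproduce that effect with NumPy only. The helpers `statistical_inefficiency` and `pick_equilibration` are hypothetical, simplified stand-ins and are not pymbar's implementation, and the data is a synthetic AR(1) series, not the attached dataset.

```python
import numpy as np


def statistical_inefficiency(a):
    """Simplified estimate of g = 1 + 2 * sum_t (1 - t/N) * C(t).

    Hypothetical helper for illustration; NOT pymbar's implementation.
    The sum is truncated at the first non-positive autocorrelation value.
    """
    a = np.asarray(a, dtype=float)
    n = a.size
    da = a - a.mean()
    var = np.mean(da * da)
    if var == 0.0:
        return 1.0
    g = 1.0
    for t in range(1, n - 1):
        c = np.mean(da[:-t] * da[t:]) / var  # normalized autocorrelation at lag t
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return g


def pick_equilibration(a, stride=50):
    """Return (t0, g, n_eff) for the t0 that maximizes (N - t0) / g(t0).

    Hypothetical helper; candidate t0 values are checked every `stride` samples.
    """
    n = len(a)
    best = (0, 1.0, 0.0)
    for t0 in range(0, n - stride, stride):
        g = statistical_inefficiency(a[t0:])
        n_eff = (n - t0) / g
        if n_eff > best[2]:
            best = (t0, g, n_eff)
    return best


# Synthetic correlated series (AR(1) with coefficient 0.9); an assumption, not real data
rng = np.random.default_rng(0)
n = 2000
noise = rng.normal(size=n)
clean = np.empty(n)
clean[0] = noise[0]
for i in range(1, n):
    clean[i] = 0.9 * clean[i - 1] + noise[i]

# Replace only the first sample with a wildly off value
spiked = clean.copy()
spiked[0] = 1.0e4

t0_full, g_full, n_eff_full = pick_equilibration(spiked)
t0_trim, g_trim, n_eff_trim = pick_equilibration(spiked[1:])
print("with first sample:    t0={}, g={:.2f}, n_eff={:.0f}".format(t0_full, g_full, n_eff_full))
print("without first sample: t0={}, g={:.2f}, n_eff={:.0f}".format(t0_trim, g_trim, n_eff_trim))
```

If this reading is right, the run that includes the first sample should report t0 = 0 with g near 1, while the run on `spiked[1:]` should report a much larger g and a much smaller effective sample count, the same qualitative pattern as the two output lines above.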
Issue Analytics
- Created 6 years ago
- Comments: 14 (8 by maintainers)
Top GitHub Comments
Can you post a sample data set that is causing problems?
Thanks! This is super helpful @ocmadin!