# Timeseries can be tricked by a couple bad first inputs

If the timeseries passed to timeseries.detectEquilibration starts with one or two wildly off data points, the true equilibration time is badly underpredicted. @jchodera requested that I post sample data and a script to process it and regenerate the plot and output below.

In the image, the top plot is the full timeseries with bars showing where “Equilibration” stops; the bottom plot is the same set from the 2nd sample onwards (series[1:]). Below the image is the code used to create it. The dataset is in the attached zip file. @jchodera If you only want the series to look at, you only need the imports and the first three lines of the code below.

Full Trajectory               -- Equilibration 0,  Subsample Rate 1.75, Num Effective 1710
Trajectory w/o initial sample -- Equilibration 20, Subsample Rate 3.75, Num Effective 794


import numpy as np
import matplotlib.pyplot as plt
from pymbar import timeseries

# Load the series (file name assumed from the attached zip)
y = np.load('explicit0_timeseries.npy')
[n_equilibration, g_t, n_effective_max] = timeseries.detectEquilibration(y)
[n_short, g_t_short, n_eff_max_short] = timeseries.detectEquilibration(y[1:])

print("Full Trajectory -- Equilibration {0:d}, Subsample Rate {1:3.2f}, Num Effective {2:d}".format(
    n_equilibration, g_t, int(np.floor(n_effective_max))
))
print("Trajectory w/o initial sample -- Equilibration {0:d}, Subsample Rate {1:3.2f}, Num Effective {2:d}".format(
    n_short, g_t_short, int(np.floor(n_eff_max_short))
))

# Top panel: full series; bottom panel: series without the first sample
f, (a, b) = plt.subplots(2, 1)
x = np.arange(y.size)
a.plot(x, y, 'k-', label='Timeseries')
b.plot(x[1:], y[1:], '-k')
for p in [a, b]:
    # Remember the axis limits so the vertical bars do not rescale the plot
    ylim = p.get_ylim()
    xlim = p.get_xlim()
    p.set_xlabel('Iteration')
    p.set_ylabel(r'$\sum_k u_{k,k,n}$')
    p.vlines(n_equilibration, *ylim,
             colors='b', linewidth=1,
             label='Full Timeseries: Num Samples={}'.format(int(np.floor(n_effective_max))))
    p.vlines(n_short, *ylim,
             colors='r', linewidth=1,
             label='Timeseries[1:]: Num Samples={}'.format(int(np.floor(n_eff_max_short))))
    p.set_ylim(ylim)
    p.set_xlim(xlim)
a.legend()
plt.show()


explicit0_timeseries.npy.zip
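
A likely mechanism behind this behaviour can be illustrated without the attached data. What follows is a minimal numpy sketch, not pymbar's actual implementation: `statistical_inefficiency` and `detect_equilibration_sketch` are hypothetical helpers that assume the usual estimate g = 1 + 2 Σ C(t), truncated at the first non-positive lag, and that pick the start index t0 maximizing N_eff = (N - t0) / g. The synthetic AR(1) series and the outlier value of 500 are made up for illustration. A single huge first sample inflates the variance used to normalize C(t), so g collapses toward 1 and t0 = 0 appears to give the most effective samples.

```python
import numpy as np


def statistical_inefficiency(a):
    """Rough g = 1 + 2 * sum_t C(t) for a 1-D series (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    n = a.size
    da = a - a.mean()
    var = np.mean(da ** 2)
    if var == 0.0:
        return 1.0
    g = 1.0
    for t in range(1, n - 1):
        # Autocorrelation at lag t, normalized by the full-series variance
        c = np.mean(da[:n - t] * da[t:]) / var
        if c <= 0.0:
            break  # truncate the sum at the first non-positive lag
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)


def detect_equilibration_sketch(a, max_t0=50):
    """Return (t0, g, n_eff) maximizing n_eff = (N - t0) / g over t0 < max_t0."""
    a = np.asarray(a, dtype=float)
    n = a.size
    best = (0, 1.0, 0.0)
    for t0 in range(min(max_t0, n - 1)):
        g = statistical_inefficiency(a[t0:])
        n_eff = (n - t0) / g
        if n_eff > best[2]:
            best = (t0, g, n_eff)
    return best


# Synthetic correlated series (AR(1), coefficient 0.9) with one bad first sample
rng = np.random.default_rng(0)
n = 3000
y = np.zeros(n)
for i in range(1, n):
    y[i] = 0.9 * y[i - 1] + rng.normal()
y[0] = 500.0  # made-up outlier, far outside the range of the rest of the series

t0_full, g_full, n_eff_full = detect_equilibration_sketch(y)
t0_trim, g_trim, n_eff_trim = detect_equilibration_sketch(y[1:])

print("Full series       -- t0 {0:d}, g {1:.2f}, N_eff {2:.0f}".format(t0_full, g_full, n_eff_full))
print("Series w/o sample -- t0 {0:d}, g {1:.2f}, N_eff {2:.0f}".format(t0_trim, g_trim, n_eff_trim))
```

If this sketch reflects what happens with the real data, dropping or clipping the outliers before calling detectEquilibration should be enough to recover a sensible equilibration time.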

### Issue Analytics

• Created 6 years ago

mrshirts commented, Jul 13, 2018

Can you post a sample data set that is causing problems?

jchodera commented, Feb 8, 2018
