Weird DatetimeIndex + secondary y axis plotting issue

Today I stumbled on a weird issue when plotting two series with a DatetimeIndex on a primary and a secondary y axis.

Code that illustrates the problem:


import matplotlib.pyplot as plt
import pandas as pd

a = pd.Series(
    [16, 13, 11],
    index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-16'], format='%Y-%m-%d')
)
b = pd.Series(
    [23, 27, 25],
    index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-15'], format='%Y-%m-%d')
)

# Combinations of `secondary_y` values to try out.
secondary_ys = [(False, False), (False, True), (True, False), (True, True)]

fig, axes = plt.subplots(figsize=(10, 2), ncols=4, nrows=1)
for (sya, syb), ax in zip(secondary_ys, axes.flat):
    a.plot(ax=ax, style='o-', secondary_y=sya)
    b.plot(ax=ax, style='x-', secondary_y=syb)
    ax.set_title('a:%r - b:%r' % (sya, syb))
  • b is a straightforward time series with successive days
  • a has a one-day jump before the last item
  • the for loop tries out all possible combinations of assigning these two series to the primary or secondary y axis in a plot
  • whenever at least one series is assigned to the secondary y axis, the x axis is completely confused (screenshot from 2017-12-13 not reproduced here)

More in-depth experimentation (and version information) can be found in the notebook at https://gist.github.com/soxofaan/9fdfdeafb8fb555dd8547bc48a27e2f3 (also notice the inconsistent x axis label formatting and orientation).

Issue Analytics

  • State: open
  • Created 6 years ago
  • Reactions: 2
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
biddisco commented, May 1, 2019

Is there any kind of workaround that can be used to disable the labels on either the primary or the secondary plot, so that we can get what we want with the first dataset and then overlay the second without the labels disappearing? (I experimented with different formatters/locators without success.)

1 reaction
soxofaan commented, Dec 16, 2017

Ok, this was quite a rabbit hole to debug, let’s see if I can manage to write down what I found.

The main reason the lines in a graph with a primary and a secondary axis do not cover the same x range is the difference in date-to-number conversion between two converters:

  • pandas.plotting._converter.DatetimeConverter converts a date to the number of days since 0001-01-01, plus one. Example: 2017-12-16 -> 736679
  • pandas.plotting._converter.PeriodConverter converts a date to the number of days since 1970-01-01. Example: 2017-12-16 -> 17516
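
The two example numbers can be reproduced with plain date arithmetic. This is only a minimal sketch using Python's standard library, not the pandas converters themselves, and the variable names are illustrative:

```python
from datetime import date

d = date(2017, 12, 16)

# Proleptic Gregorian ordinal: days since 0001-01-01, plus one.
# This matches the DatetimeConverter example above.
days_since_year_one = d.toordinal()

# Days since the Unix epoch 1970-01-01.
# This matches the PeriodConverter example above.
days_since_epoch = (d - date(1970, 1, 1)).days

# The two scales differ by a constant offset (the ordinal of 1970-01-01).
offset = days_since_year_one - days_since_epoch

print(days_since_year_one, days_since_epoch, offset)
```

The same date therefore sits several hundred thousand units apart on the two scales, which would explain why lines drawn with different converters cannot share one x range.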

A short description of what is happening, starting with the same data as above:

a = pd.Series([16, 13, 11], index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-16'], format='%Y-%m-%d'))
b = pd.Series([23, 27, 25], index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-15'], format='%Y-%m-%d'))

First plot on primary axis:

a.plot(ax=ax, style='o-', secondary_y=False)

In LinePlot._make_plot, the self._is_ts_plot() check returns False https://github.com/pandas-dev/pandas/blob/265e3272b8709a7be274321ee8b505a0e74b8e10/pandas/plotting/_core.py#L947 because (among other reasons), in pandas.plotting._timeseries._use_dynamic_x, the frequency variables freq and ax_freq are both None for the “irregular” series a. https://github.com/pandas-dev/pandas/blob/265e3272b8709a7be274321ee8b505a0e74b8e10/pandas/plotting/_timeseries.py#L211 As a result, a DatetimeConverter is set up as the converter for the x axis.

Second plot on secondary axis:

b.plot(ax=ax, style='o-', secondary_y=True)

For the secondary axis, a twin axes is created. Now, in LinePlot._make_plot, the self._is_ts_plot() check returns True:

  • in pandas.plotting._timeseries._use_dynamic_x we now have freq='D' (and ax_freq = None) because the b series has a clean index of successive days.
  • a bit further there is https://github.com/pandas-dev/pandas/blob/265e3272b8709a7be274321ee8b505a0e74b8e10/pandas/plotting/_timeseries.py#L217 which, if I understand the comment correctly, is meant to cover the problem of mixing “irregular” and “clean” time indexes. However, there is an additional check len(ax.get_lines()) > 0, which evaluates to False: the previous plot was drawn on the primary axes, which cannot be reached from here, so zero lines are counted on the twin axes.
  • the control flow in _use_dynamic_x goes a bit further, but finally returns True

Back in LinePlot._make_plot, the series with DatetimeIndex is now converted to one with PeriodIndex, and as a result a PeriodConverter is associated now with the axes.
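
The different treatment of a and b comes down to whether a regular frequency can be inferred from the index. A rough sketch using the public pd.infer_freq and Series.to_period, on the assumption that they reflect what the internal _use_dynamic_x check and the later conversion see (the internals may differ between pandas versions):

```python
import pandas as pd

a = pd.Series([16, 13, 11], index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-16']))
b = pd.Series([23, 27, 25], index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-15']))

# `a` skips a day, so no regular frequency can be inferred.
freq_a = pd.infer_freq(a.index)
# `b` has consecutive days, so a daily frequency is inferred.
freq_b = pd.infer_freq(b.index)
print(freq_a, freq_b)

# A series with a regular daily index can be converted to one with a
# PeriodIndex, which is the kind of index the PeriodConverter handles.
b_periods = b.to_period(freq_b)
print(type(b_periods.index).__name__)
```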

Summary

While plotting a, the axes are initially set up using the DatetimeConverter mapping (e.g. for the x limits). While plotting b, an incompatible PeriodConverter enters the stage. The end result is confusion in the x axis conversion for the two line plots.

I guess there are several possible solutions (but my understanding of the inner workings of pandas and matplotlib is limited, so I don’t really know the best way forward in terms of not breaking things and keeping things manageable):

  • make sure DatetimeConverter and PeriodConverter are “more compatible”
  • for pandas.plotting._timeseries._use_dynamic_x in a secondary axes context: make it possible to check if there are already lines in the corresponding primary axes, to avoid replacing the DatetimeConverter with PeriodConverter
  • don’t do the conversion to a PeriodIndex series/dataframe in LinePlot._make_plot in the first place

I hope this sheds some light on what is happening here.

(note that I didn’t look into the case where both plots are done on the secondary axes, but I guess it is related)
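
As a possible workaround for the question raised above: pandas plots accept an x_compat=True keyword, which the pandas documentation describes as suppressing the automatic tick resolution adjustment for regular time series. The sketch below assumes that this keeps both series on the plain datetime path, so that no PeriodIndex conversion takes place. It was not part of the original thread and has not been checked against the pandas version from the issue.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

a = pd.Series([16, 13, 11], index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-16']))
b = pd.Series([23, 27, 25], index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-15']))

fig, ax = plt.subplots(figsize=(5, 2))
a.plot(ax=ax, style='o-', x_compat=True)
b.plot(ax=ax, style='x-', secondary_y=True, x_compat=True)

# pandas attaches the twin axes to the primary axes as `right_ax`.
# orig=False returns the x values after conversion to plain numbers.
x_a = ax.lines[0].get_xdata(orig=False)
x_b = ax.right_ax.lines[0].get_xdata(orig=False)

# Both series start on 2017-09-13, so their first x values should coincide
# if the two axes use the same date-to-number conversion.
print(x_a[0], x_b[0])
```

If the two printed values differ, the workaround does not apply to the installed pandas/matplotlib combination.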
