Weird DatetimeIndex + secondary y axis plotting issue
Today I stumbled on a weird issue when plotting two series with a DatetimeIndex and a secondary y axis.
Code that illustrates the problem:
import matplotlib.pyplot as plt
import pandas as pd

# `a` skips one day (2017-09-15) before its last point.
a = pd.Series(
    [16, 13, 11],
    index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-16'], format='%Y-%m-%d')
)
# `b` covers three successive days.
b = pd.Series(
    [23, 27, 25],
    index=pd.to_datetime(['2017-09-13', '2017-09-14', '2017-09-15'], format='%Y-%m-%d')
)
# Combinations of `secondary_y` values to try out.
secondary_ys = [(False, False), (False, True), (True, False), (True, True)]
fig, axes = plt.subplots(figsize=(10, 2), ncols=4, nrows=1)
for (sya, syb), ax in zip(secondary_ys, axes.flat):
    a.plot(ax=ax, style='o-', secondary_y=sya)
    b.plot(ax=ax, style='x-', secondary_y=syb)
    ax.set_title('a:%r - b:%r' % (sya, syb))
- `b` is a straightforward time series with successive days
- `a` has a one day jump in the last item
- the for loop tries out all the possible combinations of assigning these two series to the primary or secondary y axis in a plot
- Whenever one or more series is assigned to the secondary y axis, the x axis is completely confused.
More in-depth experimentation (and version information) can be found in the notebook at https://gist.github.com/soxofaan/9fdfdeafb8fb555dd8547bc48a27e2f3 (also notice the inconsistent x axis label formatting and orientation)
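For reference, the only structural difference between the two indexes is the spacing between consecutive dates. A minimal sketch using just the standard library, with the dates copied from the example above (`day_gaps` is a hypothetical helper written for this illustration, not a pandas function):

```python
from datetime import date

# Dates copied from the example above.
a_dates = [date(2017, 9, 13), date(2017, 9, 14), date(2017, 9, 16)]
b_dates = [date(2017, 9, 13), date(2017, 9, 14), date(2017, 9, 15)]


def day_gaps(dates):
    """Return the number of days between each pair of consecutive dates."""
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]


a_gaps = day_gaps(a_dates)
b_gaps = day_gaps(b_dates)
print("a gaps:", a_gaps)  # [1, 2] -> irregular spacing
print("b gaps:", b_gaps)  # [1, 1] -> regular daily spacing
```

So `b` is evenly spaced by one day, while `a` is not.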
Top GitHub Comments
Is there any kind of workaround that can be used to disable the labels on either the primary or secondary plot, so that we can get what we want with the first dataset and then overlay the second without the labels disappearing? (I experimented with different formatters/locators without success.)
Ok, this was quite a rabbit hole to debug, let’s see if I can manage to write down what I found.
The main reason that the lines in a graph with a primary and a secondary axis do not cover the same x range is the difference in conversion between:
- `pandas.plotting._converter.DatetimeConverter`, which converts a date to the number of days since 0001-01-01 plus one. Example: 2017-12-16 -> 736679
- `pandas.plotting._converter.PeriodConverter`, which converts to the number of days since 1970-01-01. Example: 2017-12-16 -> 17516
A short description of what is happening, starting with the same data as above:
First plot on primary axis:
In `LinePlot._make_plot`, the `self._is_ts_plot()` check returns `False` (https://github.com/pandas-dev/pandas/blob/265e3272b8709a7be274321ee8b505a0e74b8e10/pandas/plotting/_core.py#L947) because (among other reasons) in `pandas.plotting._timeseries._use_dynamic_x` the frequency variables `freq` (and `ax_freq`) are `None` for the (“irregular”) series `a` (https://github.com/pandas-dev/pandas/blob/265e3272b8709a7be274321ee8b505a0e74b8e10/pandas/plotting/_timeseries.py#L211). As a result, a `DatetimeConverter` is set up as converter for the x axis.
Second plot on secondary axis:
For the secondary axis, a twin axes is created. Now, in `LinePlot._make_plot`, the `self._is_ts_plot()` check returns `True`:
- In `pandas.plotting._timeseries._use_dynamic_x` we now have `freq='D'` (and `ax_freq = None`) because the `b` series has a clean successive day index.
- The check `len(ax.get_lines()) > 0` evaluates to `False` because the previous plot was done on the primary axes, which cannot be reached from here, so zero lines are counted.
- `_use_dynamic_x` goes a bit further, but finally returns `True`.
Back in `LinePlot._make_plot`, the series with a `DatetimeIndex` is now converted to one with a `PeriodIndex`, and as a result a `PeriodConverter` is now associated with the axes.
Summary
During the plot of `a`, the axes are initially set up using the `DatetimeConverter` mapping (e.g. for xlim stuff). During the plot of `b`, an incompatible `PeriodConverter` enters the stage. The end result is confusion in the x axis conversion for the two line plots.
I guess there are several possible solutions (but my understanding of the inner workings of pandas and matplotlib is limited, so I don’t really know the best way forward in terms of breaking things and keeping things manageable):
- make `DatetimeConverter` and `PeriodConverter` “more compatible”
- adapt `pandas.plotting._timeseries._use_dynamic_x` in a secondary axes context: make it possible to check if there are already lines in the corresponding primary axes, to avoid replacing the `DatetimeConverter` with `PeriodConverter`
- avoid the conversion to a `PeriodIndex` series/dataframe in `LinePlot._make_plot` in the first place
I hope this sheds some light on what is happening here.
(note that I didn’t look into the case where both plots are done on the secondary axes, but I guess it is related)
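The two numbers quoted in the analysis above can be reproduced with the standard library alone. This is only a sketch of the arithmetic: it does not call the pandas converters themselves, and it assumes they behave as described in the comment (ordinal days for `DatetimeConverter`, days since the Unix epoch for a daily `PeriodConverter`).

```python
from datetime import date

d = date(2017, 12, 16)

# Days since 0001-01-01 plus one: the proleptic Gregorian ordinal.
# This matches the number quoted above for DatetimeConverter.
ordinal_days = d.toordinal()

# Days since 1970-01-01: matches the number quoted above for
# PeriodConverter with a daily frequency.
epoch_days = (d - date(1970, 1, 1)).days

# The two scales differ by a constant offset, so the same date lands at
# very different x positions depending on which converter is active.
offset = ordinal_days - epoch_days

print(ordinal_days)  # 736679
print(epoch_days)    # 17516
print(offset)        # 719163
```

An offset of roughly 719 thousand days between the two scales would be consistent with the lines ending up in completely different x ranges when both converters are used on the same figure.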