CLN: remove duplicated entries in valid_resos in pandas/core/indexes/datetimes.py
See original GitHub issue.

- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Before running anything, it is easy: just check lines 560-570 of pandas/core/indexes/datetimes.py and you will understand the human error some developer made 😃
```python
# Your code here
import pandas as pd
import datetime as dt
```
Inside `Github.csv`, we have:

```
,RxID,msgtype,message,confirmed,SNR,strength,format,icao24,correctedBits,LowConf,freqOffst
2020-08-25 11:00:08.187503455,1,10,4000,,0.0,-80.5,,,,,
2020-08-25 11:00:08.189013753,1,10,4000,,0.0,-78.5,,,,,
2020-08-25 11:00:08.189746310,1,8,,,0.0,-79.0,,,,,
2020-08-25 11:00:08.189986916,1,10,2700,,0.0,-78.0,,,,,
2020-08-25 11:00:08.190476779,1,6,20A93040502C94,1,0.0,-79.5,4.0,A1F014,,,
2020-08-25 11:00:08.190482661,1,9,0000,,0.0,-79.0,,,,,
2020-08-25 11:00:08.190963454,1,6,58EB0000171D17,1,0.0,-76.5,11.0,,,,
2020-08-25 11:00:08.191085134,1,9,7F00,,0.0,-78.5,,,,,
2020-08-25 11:00:08.191087092,1,9,0000,,0.0,-79.5,,,,,
2020-08-25 11:00:08.191123383,1,9,1000,,0.0,-72.0,,,,,
2020-08-25 11:00:08.191139020,1,7,,,0.0,-77.0,,,,,
2020-08-25 11:00:08.191150695,1,7,,,0.0,-76.5,,,,,
2020-08-25 11:00:08.191590978,1,10,0000,,0.0,-70.5,,,,,
2020-08-25 11:00:08.193015479,1,10,0000,,0.0,-83.0,,,,,
2020-08-25 11:00:08.193509041,1,9,2800,,0.0,-78.0,,,,,
2020-08-25 11:00:08.193664650,1,8,,,0.0,-80.5,,,,,
2020-08-25 11:00:08.193992571,1,10,0000,,0.0,-81.5,,,,,
2020-08-25 11:00:08.194459459,1,10,7D00,,0.0,-76.5,,,,,
2020-08-25 11:00:08.194461492,1,10,0001,,0.0,-76.0,,,,,
2020-08-25 11:00:08.195045194,1,10,0000,,0.0,-80.0,,,,,
2020-08-25 11:00:08.195061385,1,8,,,0.0,-80.0,,,,,
2020-08-25 11:00:08.195102628,1,10,0000,,0.0,-75.0,,,,,
```
```python
dfPKT = pd.read_csv(
    'Github.csv',
    dtype={'RxID': int, 'msgtype': int, 'message': str, 'confirmed': object,
           'SNR': float, 'strength': float, 'format': float, 'icao24': str,
           'correctedBits': float, 'LowConf': float, 'freqOffst': float},
    index_col=0,
)
i = pd.DatetimeIndex(dfPKT.index, dayfirst=True)
dfPKT.set_index(i, inplace=True)

# Occupancy constant.
occupancyMode3Ainterrogation = 0.00015675
# Timestamps of messages inside a valid Mode A interrogation.
validModeAint = []
# We start with the first timestamp as a reference.
Next_timestamp = dfPKT.index[0]
# Then we get all the minutes of our study/analysis.
allminutes = dfPKT.index.floor('60S').unique()
# For every minute, we run the analysis.
for minute in allminutes:
    # Since the dataset is very big, work on small 1-minute slices.
    dfAnalysis = dfPKT.loc[(dfPKT.index > minute) &
                           (dfPKT.index < minute + dt.timedelta(seconds=60))]
    for message, index, format, code, msgtype in zip(
            dfAnalysis["message"], dfAnalysis.index, dfAnalysis["format"],
            dfAnalysis["icao24"], dfAnalysis["msgtype"]):
        # Depending on the type of the message, we process it one way or another.
        if msgtype == 10:
            bits = bin(int(message, 16))[2:].zfill(16)  # the 16 bits inside message type 10
            if bits[:2] == '00':  # the first two bits are zero
                # When these two bits are 0, we have to check whether the same
                # dataset has a message linked to it, 8 to 13 microseconds after
                # the current one. HERE is where we get the error: the slice is
                # done with datetime strings, and although my data has
                # microsecond resolution, pandas internally assigns the parsed
                # string a millisecond resolution and the lookup breaks.
                if 2 in dfAnalysis[str(index + dt.timedelta(seconds=0.000008)):
                                   str(index + dt.timedelta(seconds=0.000013))]["msgtype"].to_numpy():
                    validModeAint.append(index)
                    Next_timestamp = index + dt.timedelta(seconds=occupancyMode3Ainterrogation)
```
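The msgtype-10 branch can be checked in isolation; this reproduces the hex-to-bits step on one of the sample payloads from the CSV above (`"4000"`):

```python
# Standalone check of the bit-extraction step used for message type 10:
# parse the hex payload as an integer and left-pad to the full 16-bit width.
message = "4000"
bits = bin(int(message, 16))[2:].zfill(16)
# 0x4000 == 0b0100000000000000, so bits[:2] == "01" and the
# "first two bits are zero" branch is not taken for this payload.
```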
Problem description
```
Traceback (most recent call last):
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\datetimes.py", line 718, in slice_indexer
    return Index.slice_indexer(self, start, end, step, kind=kind)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\base.py", line 4966, in slice_indexer
    start_slice, end_slice = self.slice_locs(start, end, step=step, kind=kind)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\base.py", line 5169, in slice_locs
    start_slice = self.get_slice_bound(start, "left", kind)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\base.py", line 5079, in get_slice_bound
    label = self._maybe_cast_slice_bound(label, side, kind)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\datetimes.py", line 665, in _maybe_cast_slice_bound
    lower, upper = self._parsed_string_to_bounds(reso, parsed)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\datetimes.py", line 536, in _parsed_string_to_bounds
    raise KeyError
KeyError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\OccupancyScripts\OccupancyToKML\Github.py", line 31, in <module>
    if (2 in dfAnalysis[str(index + dt.timedelta(seconds=0.000008)):str(index + dt.timedelta(seconds=0.000013))]["msgtype"].to_numpy()):
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\frame.py", line 2881, in __getitem__
    indexer = convert_to_index_sliceable(self, key)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexing.py", line 2132, in convert_to_index_sliceable
    return idx._convert_slice_indexer(key, kind="getitem")
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\base.py", line 3190, in _convert_slice_indexer
    indexer = self.slice_indexer(start, stop, step, kind=kind)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\datetimes.py", line 728, in slice_indexer
    start_casted = self._maybe_cast_slice_bound(start, "left", kind)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\datetimes.py", line 665, in _maybe_cast_slice_bound
    lower, upper = self._parsed_string_to_bounds(reso, parsed)
  File "C:\Users\bkaguila\dev\anaconda\envs\Occupancies\lib\site-packages\pandas\core\indexes\datetimes.py", line 536, in _parsed_string_to_bounds
    raise KeyError
KeyError

Process finished with exit code 1
```
The current behavior: because the slice bounds are strings, they are parsed and assigned a Resolution (`reso`); for strings with fractional seconds this can be millisecond or nanosecond. That `reso` is then checked in `_parsed_string_to_bounds` in pandas/core/indexes/datetimes.py, where the `valid_resos` set (on line 520) is wrong: it contains duplicated entries, so some valid resolutions are rejected with a bare `KeyError`.
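A minimal illustration of the kind of bug the title describes (a sketch, not the exact pandas source): repeated entries in a set literal collapse silently, so a duplicate can sit where a distinct resolution name was intended.

```python
# Hypothetical stand-in for the valid_resos set around line 520 of
# pandas/core/indexes/datetimes.py (pandas 1.1.x); entries are illustrative.
valid_resos = {
    "year", "month", "quarter", "day",
    "hour", "minute", "second",
    "minute", "second",   # duplicates: the set silently absorbs them
    "microsecond",
}
# The duplicates contribute nothing, and a resolution parsed from a
# fractional-seconds string can fall outside the set, triggering the
# bare `raise KeyError` seen in the traceback.
millisecond_accepted = "millisecond" in valid_resos
```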
Expected Output
Process finished with exit code 0
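Until the fix lands, one way to get that clean exit is to slice with `Timestamp` bounds instead of strings, which bypasses the string-parsing path (`_parsed_string_to_bounds`) entirely; a minimal sketch on synthetic data (index values assumed, not taken from the real capture):

```python
import pandas as pd

# Synthetic microsecond-spaced index standing in for dfAnalysis.index.
idx = pd.date_range("2020-08-25 11:00:08.190476", periods=5, freq="2us")
msgtype = pd.Series([10, 2, 9, 2, 10], index=idx)

# Timestamp bounds: no string parsing, hence no resolution check.
start = idx[0] + pd.Timedelta(microseconds=2)
end = idx[0] + pd.Timedelta(microseconds=6)
window = msgtype.loc[start:end]   # label-based slicing, inclusive on both ends
hit = 2 in window.to_numpy()      # the 8-13 us linked-message check
```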
Output of pd.show_versions()
```
INSTALLED VERSIONS
------------------
commit           : 67a3d4241ab84419856b84fc3ebc9abcbe66c6b3
python           : 3.9.0.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
Version          : 10.0.18362
machine          : AMD64
processor        : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : English_United Kingdom.1252
pandas           : 1.1.4
numpy            : 1.19.4
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.2.4
setuptools       : 50.3.1.post20201107
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : None
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.3.3
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pyxlsb           : None
s3fs             : None
scipy            : None
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : None
```
Issue Analytics
- State: closed
- Created 3 years ago
- Comments: 19 (10 by maintainers)
Top GitHub Comments
They’ve opened #39503, which is clearer than this issue, so closing in favour of that one.
Okay! Sorry about that, I didn’t know.