BUG: UserWarning about lacking infer_datetime_format with pd.to_datetime

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

df['date'] = pd.to_datetime(df['date'], infer_datetime_format=True, utc=True, errors='ignore')

Issue Description

When I run this code, I get the following warning:

“UserWarning: Parsing ‘15/09/1979’ in DD/MM/YYYY format. Provide format or specify infer_datetime_format=True for consistent parsing.”

Which is strange, because I already have set infer_datetime_format=True.

Expected Behavior

You would expect no warning at all, since “infer_datetime_format=True” is already provided as an argument.
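
One way to sidestep the warning, as its own text suggests, is to pass an explicit format instead of relying on inference. The sketch below assumes every value in the column is a day-first DD/MM/YYYY string; the sample data is made up for illustration and is not from the original report.

```python
import pandas as pd

# Hypothetical sample data: day-first strings like the one quoted in the warning.
df = pd.DataFrame({"date": ["15/09/1979", "01/02/2000", "31/05/2001"]})

# With an explicit format there is nothing to infer,
# so every row is parsed with the same day-first rule.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y", utc=True)

print(df["date"].dt.strftime("%Y-%m-%d").tolist())
```

Note that this drops errors='ignore' from the original call, so a value that does not match the format raises an error; passing errors='coerce' turns such values into NaT instead.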

Installed Versions

pymysql : None
html5lib : None
lxml.etree : 4.8.0
xlsxwriter : None
feather : None
blosc : None
sphinx : None
hypothesis : None
pytest : 6.2.5
Cython : None
setuptools : 58.1.0
pip : 22.0.3
dateutil : 2.8.2
pytz : 2021.3
numpy : 1.22.2
pandas : 1.4.1

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 4
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

2 reactions
MarcoGorelli commented, Jun 28, 2022

Reckon this would be clearer?

>>> import pandas as pd
>>> pd.to_datetime(['01/01/2000','31/05/2000','31/05/2001', '01/02/2000'], infer_datetime_format=True)
<stdin>:1: UserWarning: Parsing dates in DD/MM/YYYY format when dayfirst=False (the default) was specified. This may lead to inconsistently-parsed dates! Specify a format to ensure consistent parsing.
DatetimeIndex(['2000-01-01', '2000-05-31', '2001-05-31', '2000-01-02'], dtype='datetime64[ns]', freq=None)

Like this, the warning would only be emitted once and (I think) would be a bit clearer.

“And while I’m at it format should be datetime_format for clarity.”

This’d have to go through a deprecation cycle; not sure it’d be worth it.
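
The "emitted once" behavior proposed in this comment can be imitated with the standard library alone. The following is only a sketch: parse_dates is a hypothetical helper, not pandas code, and its warning text is invented for illustration. It tries the preferred day/month order, falls back to the other order for elements that do not fit, and warns once for the whole input.

```python
import warnings
from datetime import datetime


def parse_dates(values, dayfirst=False):
    """Hypothetical helper (not pandas): parse DD/MM/YYYY or MM/DD/YYYY strings."""
    if dayfirst:
        preferred, fallback = "%d/%m/%Y", "%m/%d/%Y"
    else:
        preferred, fallback = "%m/%d/%Y", "%d/%m/%Y"

    parsed = []
    used_fallback = False
    for value in values:
        try:
            parsed.append(datetime.strptime(value, preferred))
        except ValueError:
            # The preferred order is impossible for this value; try the other one.
            parsed.append(datetime.strptime(value, fallback))
            used_fallback = True

    if used_fallback:
        # One warning for the whole input rather than one per element.
        warnings.warn(
            "Some dates did not match the preferred day/month order; "
            "specify a format to ensure consistent parsing.",
            UserWarning,
            stacklevel=2,
        )
    return parsed


# Record warnings so they can be counted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = parse_dates(["01/01/2000", "31/05/2000", "31/05/2001", "01/02/2000"])

print([d.strftime("%Y-%m-%d") for d in result])
print(len(caught))
```

With the same four strings as in the example above, the two values with day 31 take the fallback path, yet only a single warning is recorded.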

1 reaction
DrNickBailey commented, Jun 28, 2022

In your example, the wording “Parsing '15/01/2006' in DD/MM/YYYY format. Provide format to ensure consistent parsing.” still makes no sense to me as a (British/international) user, as I already thought ‘11/01/2000’ was in DD/MM/YYYY, so the “warning” is confusing.

I’d also wager that dayfirst as an argument name is unclear – what day is first? dayfirst could mean a number of things to the inexperienced pandas user. monthfirst would be more understandable to the international audience. Though datetime_dayfirst. And while I’m at it format should be datetime_format for clarity.

Sorry for my hate of MM-DD. ISO 8601 is a standard for a reason and solves all ambiguity with dates!
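
To make the ambiguity concrete, here is a small standard-library sketch (no pandas involved) showing that a slash-separated date can have two readings, that a day above 12 only fits the day-first reading, and that an ISO 8601 string has a single reading. The example strings are taken from this thread.

```python
from datetime import date, datetime

ambiguous = "01/02/2000"

# The same string yields two different dates depending on the assumed order.
day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()
month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()
print(day_first, month_first)

# A day above 12 can only be read day-first; month-first parsing fails.
try:
    datetime.strptime("15/09/1979", "%m/%d/%Y")
    month_first_ok = True
except ValueError:
    month_first_ok = False
print(month_first_ok)

# An ISO 8601 date string has exactly one reading.
iso = date.fromisoformat("1979-09-15")
print(iso)
```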
