"Error parsing date..."

Describe the bug

I’m attempting to train a model from data in a large JSON array. An example of the relevant slice looks like:

[
  {
    "date": "1941-03-01",
    "foo": "foo",
    "bar": "bar"
  },
  {
    "date": "2001-05-15",
    "foo": "foo",
    "bar": "bar"
  }
]

When I run the following command:

ludwig train --dataset input.json --data_format json -c config.yaml

After a few moments, I get the following output for every element of the array:

Error parsing date: 1941-03-01 00:00:00 with error strptime() argument 1 must be str, not Timestamp Please provide a datetime format that parses it in the preprocessing section of the date feature in the config. The preprocessing fill in value will be used.For more details: https://ludwig-ai.github.io/ludwig-docs/0.5/configuration/features/date_features/

Error parsing date: 2001-05-15 00:00:00 with error strptime() argument 1 must be str, not Timestamp Please provide a datetime format that parses it in the preprocessing section of the date feature in the config. The preprocessing fill in value will be used.For more details: https://ludwig-ai.github.io/ludwig-docs/0.5/configuration/features/date_features/

I’ve tried this with several config options in config.yaml, including each of the following:

input_features:
  - name: date
    type: date
    preprocessing:
      datetime_format: "%Y-%m-%d %H:%M:%S"

input_features:
  - name: date
    type: date
    preprocessing:
      datetime_format: "%Y-%m-%d"

input_features:
  - name: date
    type: date

All 3 result in the same error message.

It seems to me what’s happening is that the date string is being parsed too soon in the process – it’s being turned into a Timestamp object BEFORE being fed into strptime(), hence the error.
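A minimal sketch of that failure mode, using only the standard library. It assumes the value reaching strptime() is already a datetime-like object; a plain datetime stands in here for pandas' Timestamp, which is a datetime subclass. The helper name try_strptime is made up for illustration.

```python
from datetime import datetime


def try_strptime(value, fmt="%Y-%m-%d"):
    """Hypothetical helper: call strptime and report a TypeError instead of raising."""
    try:
        return datetime.strptime(value, fmt)
    except TypeError as exc:
        return f"TypeError: {exc}"


as_string = "1941-03-01"              # what the JSON file contains
already_parsed = datetime(1941, 3, 1)  # what arrives if something parsed it earlier

result_ok = try_strptime(as_string)        # parses normally
result_err = try_strptime(already_parsed)  # strptime rejects non-str input

print(result_ok)
print(result_err)
```

If this matches what is going on, no datetime_format value can help, because the failure is about the argument's type rather than the format string.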

I’m not familiar enough with the codebase to know if this is a bug, an issue with JSON as the input source, or something else, but some guidance would be appreciated!

Expected behavior

The date should parse according to the provided format.

Environment (please complete the following information):

  • OS: MacOS
  • Version 12.3.1
  • Python version 3.10.5
  • Ludwig version 0.5.4

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
justinxzhao commented, Jul 15, 2022

@noahlh Thanks for looking more into the read_json docs.

Setting convert_dates=False seems like a reasonable suggestion. However, there might be other inconsistencies with other data loading libraries, so making sure that date_feature can handle pre-datetime-typed data seems like a good change to make.
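For reference, a small sketch of what convert_dates changes, assuming the read_json in question is pandas.read_json. How Ludwig calls it internally is not shown in this thread, so this only demonstrates the pandas behavior on the sample data from the issue.

```python
import io

import pandas as pd

raw = (
    '[{"date": "1941-03-01", "foo": "foo", "bar": "bar"},'
    ' {"date": "2001-05-15", "foo": "foo", "bar": "bar"}]'
)

# Default: a column literally named "date" is treated as date-like and parsed.
df_default = pd.read_json(io.StringIO(raw))

# With convert_dates=False the values are left as plain strings.
df_raw = pd.read_json(io.StringIO(raw), convert_dates=False)

first_default = df_default["date"].iloc[0]
first_raw = df_raw["date"].iloc[0]

print(type(first_default).__name__, type(first_raw).__name__)
```

The first value comes back as a Timestamp and the second as a str, which would explain why strptime() only complains on the JSON path.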

1 reaction
noahlh commented, Jul 14, 2022

OK, I did some exploring, and as a temporary workaround I’ve successfully patched ludwig/features/date_feature.py as follows:

Changed:

def date_to_list(date_str, datetime_format, preprocessing_parameters):
    try:
        if datetime_format is not None:
            datetime_obj = datetime.strptime(date_str, datetime_format)
        else:
            datetime_obj = parse(date_str)

To:

def date_to_list(date_str, datetime_format, preprocessing_parameters):
    try:
        if isinstance(date_str, datetime): 
            datetime_obj = date_str
        elif datetime_format is not None:
            datetime_obj = datetime.strptime(date_str, datetime_format)
        else:
            datetime_obj = parse(date_str)

I’m not going to submit this as a patch because this doesn’t fix the root cause - but it does get things working. I’d be happy to help submit a proper patch if someone can guide me in the right direction to where this might be happening upstream.
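The guard from the patch can be tried in isolation with a standalone sketch like the one below. The function name to_datetime_obj is hypothetical and this is not Ludwig's code; datetime.fromisoformat is swapped in for the parse(date_str) fallback so the example needs only the standard library.

```python
from datetime import datetime


def to_datetime_obj(value, datetime_format=None):
    """Hypothetical helper mirroring the isinstance guard from the patch."""
    if isinstance(value, datetime):
        # Already parsed upstream (pandas.Timestamp subclasses datetime): use as-is.
        return value
    if datetime_format is not None:
        return datetime.strptime(value, datetime_format)
    # Stand-in for the original parse() fallback; handles ISO 8601 strings only.
    return datetime.fromisoformat(value)


parsed_from_obj = to_datetime_obj(datetime(1941, 3, 1), "%Y-%m-%d")
parsed_from_str = to_datetime_obj("2001-05-15", "%Y-%m-%d")
parsed_no_format = to_datetime_obj("2001-05-15")

print(parsed_from_obj, parsed_from_str, parsed_no_format)
```

Checking the type first means the configured datetime_format is simply ignored for values that are already datetimes, which is the same trade-off the patch makes.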
