[BUG] Fatal parse error analysing document using invoice model - "not recognized as a valid DateTime"

Library name and version

Azure.AI.FormRecognizer 4.0.0-beta.3

Describe the bug

Send a document to the analyser for ‘invoice’ processing. The service responds without error, but the SDK throws an exception while parsing the response.

Expected behavior

The SDK should return the analysed document information, with best efforts at recognising data types. This should be a TRY parse, not fail everything because of one dubious value. The analysis model should be flexible enough to return values just as text if they are ‘date-ish’ or ‘number-ish’.

Actual behavior

SDK throws System.FormatException:

The string 'yyyy-08-21' was not recognized as a valid DateTime. There is an unknown word starting at index '0'.
   at System.DateTimeParse.Parse(ReadOnlySpan`1 s, DateTimeFormatInfo dtfi, DateTimeStyles styles, TimeSpan& offset)
   at System.DateTimeOffset.Parse(String input, IFormatProvider formatProvider, DateTimeStyles styles)
   at Azure.Core.TypeFormatters.ParseDateTimeOffset(String value, String format)
   at Azure.AI.FormRecognizer.DocumentAnalysis.DocumentField.DeserializeDocumentField(JsonElement element)
   at Azure.AI.FormRecognizer.DocumentAnalysis.DocumentField.DeserializeDocumentField(JsonElement element)
   at Azure.AI.FormRecognizer.DocumentAnalysis.DocumentField.DeserializeDocumentField(JsonElement element)
   at Azure.AI.FormRecognizer.DocumentAnalysis.AnalyzedDocument.DeserializeAnalyzedDocument(JsonElement element)
   at Azure.AI.FormRecognizer.DocumentAnalysis.AnalyzeResult.DeserializeAnalyzeResult(JsonElement element)
   at Azure.AI.FormRecognizer.DocumentAnalysis.AnalyzeResultOperation.DeserializeAnalyzeResultOperation(JsonElement element)
   at Azure.AI.FormRecognizer.DocumentAnalysis.DocumentAnalysisRestClient.<GetAnalyzeDocumentResultAsync>d__11.MoveNext()

Reproduction Steps

The document is a financial one, so I won’t provide the source data, but the stack trace should be sufficient to trace the root cause. This is a vanilla call to the client:

try
{
    var apiResponse = await _documentAnalysisClient.StartAnalyzeDocumentFromUriAsync("prebuilt-invoice", uri);

    await apiResponse.WaitForCompletionAsync();

    return apiResponse.Value;
}
catch (Exception e)
{
    log.LogError($"{e.GetType()}\n{e.Message}\n{e.StackTrace}");
}

Environment

.NET SDK (reflecting any global.json):
 Version: 6.0.200
 Commit: 4c30de7899

Runtime Environment:
 OS Name: Windows
 OS Version: 10.0.22000
 OS Platform: Windows
 RID: win10-x64
 Base Path: C:\Program Files\dotnet\sdk\6.0.200\

Host (useful for support):
 Version: 6.0.2
 Commit: 839cdfb0ec

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

kweebtronic commented, Mar 17, 2022

Thanks @kinelski, I am now able to retrieve the document analysis value with the SDK without error.

kinelski commented, Mar 10, 2022

@kweebtronic Apologies for the delayed response. I have discussed this matter with the service team and confirmed it’s a bug. They already have a fix but deployment is expected to take around two weeks, so I’ll get back to you when that happens.

Once the fix is in place, you won’t be able to access the field date value with DocumentField.AsDate as our samples suggest. This only affects “incomplete” dates that can’t be parsed by the SDK such as “yyyy-08-21”. In these cases, you’ll need to access the string representation of the date in DocumentField.Content and parse it in your code if necessary.

If you’re blocked by this bug and need a fix asap, you could use an HTTP policy to intercept the service response and manually remove the dates causing the bug:

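using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using Azure.Core;
using Azure.Core.Pipeline;

// Temporary workaround: rewrites the service response before the SDK deserializes it.
internal class DateFixPolicy : HttpPipelineSynchronousPolicy
{
    public override void OnReceivedResponse(HttpMessage message)
    {
        if (message.Response.ContentStream != null)
        {
            // Read the response body as UTF-8 text.
            byte[] bytes = message.Response.Content.ToArray();
            string content = Encoding.UTF8.GetString(bytes);

            // Replace incomplete dates such as "yyyy-08-21" with null so the SDK does not try to parse them.
            string modifiedContent = Regex.Replace(content, "\"valueDate\":\"yyyy-[0-9]{2}-[0-9]{2}\"", "\"valueDate\":null");

            // Swap the original stream for one containing the patched body.
            message.Response.ContentStream.Dispose();
            message.Response.ContentStream = new MemoryStream(Encoding.UTF8.GetBytes(modifiedContent));
        }

        base.OnReceivedResponse(message);
    }
}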

You need to set it in the client options like this:

var options = new DocumentAnalysisClientOptions();
options.AddPolicy(new DateFixPolicy(), HttpPipelinePosition.PerCall);

var client = new DocumentAnalysisClient(<endpoint>, <credential>, options);
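
Editor's sketch: the advice above to fall back on the text in DocumentField.Content and parse it yourself could look roughly like the helper below. This is a minimal sketch under stated assumptions, not the SDK's own behaviour: InvoiceDateParser and TryParseContent are hypothetical names, the caller is assumed to pass in the raw string taken from DocumentField.Content, and the right culture depends on how dates are written on your invoices.

```csharp
using System;
using System.Globalization;

public static class InvoiceDateParser
{
    // Hypothetical helper: reports failure with a bool instead of throwing
    // when the text is empty or is not a complete, parseable date.
    public static bool TryParseContent(string content, IFormatProvider culture, out DateTimeOffset date)
    {
        date = default;

        if (string.IsNullOrWhiteSpace(content))
        {
            return false;
        }

        // AssumeUniversal treats text without an offset as UTC.
        return DateTimeOffset.TryParse(content, culture, DateTimeStyles.AssumeUniversal, out date);
    }
}
```

A caller would pass the field's content and a culture (for example CultureInfo.InvariantCulture), use the parsed date when the method returns true, and keep the original string when it returns false. An incomplete value such as "yyyy-08-21" falls into the second case rather than raising a FormatException.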