[FormRecognizer] Parse operation IDs with regex

Summary

The implementation of the Operation constructors uses some magic numbers when parsing the operation ID.

Operations to address

  • CopyModelOperation
  • RecognizeContentOperation
  • RecognizeCustomFormsOperation
  • RecognizeReceiptsOperation
  • RecognizeIdDocumentsOperation
  • TrainingOperation
  • AnalyzeDocumentOperation
  • ClassifyDocumentOperation

Internal constructor

https://github.com/Azure/azure-sdk-for-net/blob/e32b2e7f7fab4d7209da788250caa2c145b23c66/sdk/formrecognizer/Azure.AI.FormRecognizer/src/RecognizeCustomFormsOperation.cs#L78-L83

The internal constructor takes a parameter called operationLocation, which is returned from the service and is expected to look like:

{endpoint}/formrecognizer/{version}/custom/models/{modelId}/analyzeresults/{resultId}
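As a rough illustration of parsing that shape with a regex instead of fixed offsets, here is a minimal Python sketch (the SDK itself is C#, where named groups are written `(?<name>...)` rather than `(?P<name>...)`). The function name, the `OperationLocation` type, and the choice to accept any non-slash characters in each segment are assumptions made for the example, not the SDK's actual implementation.

```python
import re
from typing import NamedTuple


class OperationLocation(NamedTuple):
    """Hypothetical container for the parts of an operation location."""
    version: str
    model_id: str
    result_id: str


# Assumption: every path segment is "one or more characters other than '/'".
# The pattern is anchored at the end of the string so the {endpoint} prefix
# can be any host or base path.
_LOCATION_PATTERN = re.compile(
    r"/formrecognizer/(?P<version>[^/]+)"
    r"/custom/models/(?P<model_id>[^/]+)"
    r"/analyzeresults/(?P<result_id>[^/]+)/?$"
)


def parse_operation_location(location: str) -> OperationLocation:
    """Split an operation location into version, model ID, and result ID."""
    match = _LOCATION_PATTERN.search(location)
    if match is None:
        # A malformed location is reported explicitly instead of producing
        # an out-of-range index further down the line.
        raise ValueError(f"Unexpected operation location format: {location!r}")
    return OperationLocation(**match.groupdict())
```

Because the groups are named, the caller never depends on how many segments the endpoint contains or on the position of each segment in a split array.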

Public constructor

https://github.com/Azure/azure-sdk-for-net/blob/e32b2e7f7fab4d7209da788250caa2c145b23c66/sdk/formrecognizer/Azure.AI.FormRecognizer/src/RecognizeCustomFormsOperation.cs#L102-L112

The public constructor, on the other hand, takes a parameter called operationId, which is given by the developer and is expected to be a substring of the aforementioned operationLocation:

{modelId}/analyzeresults/{resultId}
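Since this value comes from the developer rather than from the service, a whole-string match with an explicit error is one way to reject malformed input early. The Python sketch below is only illustrative: `parse_operation_id` is a hypothetical helper, and treating each ID as "any characters except a slash" is an assumption.

```python
import re
from typing import Tuple

# Assumption: both IDs are non-empty and contain no '/'.
_OPERATION_ID_PATTERN = re.compile(
    r"(?P<model_id>[^/]+)/analyzeresults/(?P<result_id>[^/]+)"
)


def parse_operation_id(operation_id: str) -> Tuple[str, str]:
    """Return (model_id, result_id) or raise ValueError on a bad format."""
    # fullmatch requires the entire string to fit the pattern, so leading
    # or trailing extra segments are rejected rather than silently ignored.
    match = _OPERATION_ID_PATTERN.fullmatch(operation_id)
    if match is None:
        raise ValueError(
            "Expected '{modelId}/analyzeresults/{resultId}', "
            f"got {operation_id!r}"
        )
    return match["model_id"], match["result_id"]
```

Passing the full operationLocation to this helper fails with a clear message, which makes the difference between the two constructor inputs visible to the caller.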

Goals

  • Stop using magic numbers and correctly parse the arguments using regex, handling format errors.
  • Make sure we support parsing model IDs that contain the characters "_", "~", and ".", which are allowed when naming a custom model.
  • Include new tests to verify proper behavior.
  • Make sure existing tests still pass reliably.
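The goals above could be exercised with a small table of cases. The Python sketch below is an assumption-heavy illustration: the issue only states that "_", "~", and "." are allowed in model IDs, so the rest of the character set (ASCII letters, digits, and "-") and the `split_operation_id` helper are hypothetical.

```python
import re

# Assumption: model IDs use ASCII letters, digits, '-', '_', '~', and '.';
# result IDs use ASCII letters, digits, and '-'. Only '_', '~', and '.'
# are confirmed by the issue text.
_MODEL_ID = r"[A-Za-z0-9._~-]+"
_RESULT_ID = r"[A-Za-z0-9-]+"
_STRICT_PATTERN = re.compile(rf"({_MODEL_ID})/analyzeresults/({_RESULT_ID})")


def split_operation_id(operation_id):
    """Return (model_id, result_id), or None if the format is invalid."""
    match = _STRICT_PATTERN.fullmatch(operation_id)
    return match.groups() if match else None


# Table of (input, expected result); None marks a format error.
CASES = [
    ("my_model/analyzeresults/1234", ("my_model", "1234")),
    ("my~model/analyzeresults/1234", ("my~model", "1234")),
    ("my.model.v2/analyzeresults/ab-12", ("my.model.v2", "ab-12")),
    ("my model/analyzeresults/1234", None),   # space is not allowed
    ("a/b/analyzeresults/1234", None),        # extra path segment
    ("my_model/analyzeresults/", None),       # missing result ID
]

for operation_id, expected in CASES:
    assert split_operation_id(operation_id) == expected, operation_id
```

Keeping the character classes in named constants makes it easy to widen or narrow them once the service's naming rules are confirmed.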

The following file can be used as a reference: https://github.com/Azure/azure-sdk-for-net/blob/80a63242c93cbbff1ee525248cbd94d89ed72345/sdk/formrecognizer/Azure.AI.FormRecognizer/src/FormField.cs#L90-L91

Related: #10385

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

maririos commented, Dec 31, 2020 (1 reaction)

Changing the parsing might change the errors too, so we need to be careful not to introduce breaking changes here. I will first wait for the outcome of https://github.com/Azure/azure-sdk-for-net/issues/17383 before working on this, to make sure we have the coverage.

kinelski commented, May 12, 2020 (1 reaction)

@maririos I wouldn’t say it’s a high-priority issue, since it just changes the behavior internally. @annelo-msft is keeping track of what issues we want closed for P3.
