[FEATURE REQ] FormRecognizer - Ability to deserialize AnalyzeResult
Library name and version
Azure.AI.FormRecognizer 4.0.0-beta.3
Describe the bug
Currently, when performing the new Read operation, the resulting response can be serialized but cannot be deserialized, because the constructors are marked as internal.
This restriction should be removed so that customers can use the object as they wish. A common pattern is to persist result objects to Storage for later use, and to pass them around in Durable Functions (which serializes and deserializes them to its internal store). Both scenarios currently fail, and we are left, at best, to create proxy objects and manually copy data across models to support our use cases.
It is noted this pattern does NOT seem prevalent in other Azure SDKs, both in the ‘older’ style and the newer Azure.* style.
The Azure SDK design guidelines also do not point to marking constructors as internal. Instead, they suggest that if a class is not meant to be modified, its public properties should be made get-only (see the Model Types section of the guidelines).
Expected behavior
Object should be able to be deserialized in line with other SDK models.
Actual behavior
An exception is thrown due to the internal constructors.
Reproduction Steps
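using System;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.AI.FormRecognizer.DocumentAnalysis;

public static async Task Main(string[] args)
{
    var documentAnalysisClient = new DocumentAnalysisClient(<removed>);

    // Run the prebuilt Read model against a publicly reachable image.
    var operation = await documentAnalysisClient.StartAnalyzeDocumentFromUriAsync("prebuilt-read",
        new Uri("https://images.google.com.au/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png"));
    await operation.WaitForCompletionAsync();
    var result = operation.Value;

    // Serialization succeeds.
    var serializedString = JsonSerializer.Serialize(result);
    // Deserialization throws because AnalyzeResult has only internal constructors.
    var deserializedObject = JsonSerializer.Deserialize<AnalyzeResult>(serializedString);
    Console.WriteLine("Finished!");
}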
Environment
❯ dotnet --info
.NET SDK (reflecting any global.json):
 Version: 6.0.102
 Commit: 02d5242ed7
Runtime Environment:
 OS Name: Windows
 OS Version: 10.0.19044
 OS Platform: Windows
 RID: win10-x64
 Base Path: C:\Program Files\dotnet\sdk\6.0.102\
Host (useful for support):
 Version: 6.0.2
 Commit: 839cdfb0ec
Issue Analytics
- Created 2 years ago
- Reactions:1
- Comments:10 (3 by maintainers)
Top GitHub Comments
I totally understand the concerns of the maintainers, so anyone who uses this workaround should be aware of these concerns!
There’s already an internal deserialization method, which takes a JsonElement as an argument:
internal static AnalyzeResult DeserializeAnalyzeResult(JsonElement element)
In addition, the StartAnalyzeDocumentFromUriAsync() method of DocumentAnalysisClient returns an AnalyzeDocumentOperation, which exposes a WaitForCompletionResponseAsync() method that returns the “raw” content of the Cognitive Services response. You can now bring these together, for example via reflection.
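The code that originally followed this comment is not preserved here, so the following is only a minimal sketch of how the two pieces might be combined, not the commenter's exact workaround. The class name AnalyzeResultJson and both helper methods are hypothetical. The sketch also assumes that the raw response body wraps the result in a top-level "analyzeResult" property and that the internal method name and signature stay as quoted above; since it is internal, it can change or disappear in any release.

```csharp
using System;
using System.Reflection;
using System.Text.Json;
using System.Threading.Tasks;
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

// Hypothetical helper class; not part of Azure.AI.FormRecognizer.
public static class AnalyzeResultJson
{
    // Waits for the operation and returns the raw service JSON,
    // which can be persisted to storage and replayed later.
    public static async Task<string> GetRawJsonAsync(AnalyzeDocumentOperation operation)
    {
        Response response = await operation.WaitForCompletionResponseAsync();
        return response.Content.ToString();
    }

    // Rebuilds an AnalyzeResult from previously stored raw JSON by calling
    // the internal DeserializeAnalyzeResult(JsonElement) method via reflection.
    public static AnalyzeResult FromRawJson(string rawJson)
    {
        using JsonDocument document = JsonDocument.Parse(rawJson);

        // Assumption: the result sits under a top-level "analyzeResult" property.
        // Clone() detaches the element so it stays valid after the document is disposed.
        JsonElement element = document.RootElement.GetProperty("analyzeResult").Clone();

        MethodInfo method = typeof(AnalyzeResult).GetMethod(
                "DeserializeAnalyzeResult",
                BindingFlags.NonPublic | BindingFlags.Static)
            ?? throw new MissingMethodException(
                "AnalyzeResult.DeserializeAnalyzeResult was not found; the internal API may have changed.");

        return (AnalyzeResult)method.Invoke(null, new object[] { element });
    }
}
```

With helpers like these, the JSON returned by GetRawJsonAsync would be stored instead of the output of JsonSerializer.Serialize(result), and FromRawJson would be called when the result is needed again. Because this relies on a non-public method, it is worth guarding with a test that fails loudly when the package is upgraded.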
I’d also like to be able to deserialize AnalyzeOperationResult and/or FormPageCollection from JSON. The reflection is a bit trickier for that, since FormPageCollection is the highest-level public type, but it is perhaps still better to leverage Azure.AI.FormRecognizer than to write a custom set of types for the model in the interim, until this issue is hopefully addressed.
I’m curious what Microsoft’s expectation is here for users that need to store the results and/or process them at a later time? Deserialization from JSON seems a reasonable expectation and the library’s utility is greatly reduced for not being able to do that. For another instance, testing for regressions might require an awful lot of data to be mocked (if you want to avoid hitting the service again, which is slow and costs money). Replaying actual JSON is a much simpler proposition than using the model factory.