
[QUESTION] Why does TextAnalytics SDK not return same results as Language Studio?

See original GitHub issue

Library name and version

Azure.AI.TextAnalytics 5.1.1

Query/Question

Hi there,

I am currently writing an Azure function in C# which uses the TextAnalytics SDK to detect IBAN values. To detect these values I am using the following code:


var client = new TextAnalyticsClient(new Uri(endpoint), new AzureKeyCredential(apiKey));

var options = new RecognizePiiEntitiesOptions();
options.CategoriesFilter.Add(PiiEntityCategory.InternationalBankingAccountNumber);

var piiResponse = await client.RecognizePiiEntitiesAsync(myQueueItem, "en");

PiiEntityCollection piiEntities = piiResponse.Value;

myQueueItem is a string that contains multiple sample IBANs:

NL24 ABNA 4047 6339 76 NL61 INGB 0003 3505 63 NL18RABO0123459876 NL98INGB0003856625 NL98ABNA0416961347 NL98UGBI0771565860 NL98TRIO0254712320 NL98SNSB0908532792 NL97DEUT0265134951 NL97BNPA0227673409 NL97BNGH0285061917 NL97BOFA0266546412

The problem is that not all the IBANs are detected. However, when I test this same string with Language Studio, all the IBANs are detected.

C# SDK Results:

[image: C_Sharp_Results]

Language Studio Results:

[image: Language_Studio_Results]

Environment

.NET SDK (reflecting any global.json):
  Version: 6.0.202
  Commit: f8a55617d2

Runtime Environment:
  OS Name: Windows
  OS Version: 10.0.19042
  OS Platform: Windows
  RID: win10-x64
  Base Path: C:\Program Files\dotnet\sdk\6.0.202\

Host (useful for support):
  Version: 6.0.4
  Commit: be98e88c76

.NET SDKs installed:
  6.0.202 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 6.0.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 6.0.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 6.0.4 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

IDE Version: 17.1.6

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
Ansiem commented, May 10, 2022

Thanks for the help! The problem was indeed the way the content of myQueueItem was handled. I passed the entire myQueueItem instead of the property that I’d wanted to be analyzed.
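
The fix described here is to pass only the property that holds the text rather than the whole queue message. The thread does not show the message format, so the following is a minimal sketch under two assumptions: that the queue item is JSON, and that the text lives in a property hypothetically named "text". The QueueMessageText helper is also illustrative and not part of the SDK.

```csharp
using System;
using System.Text.Json;

public static class QueueMessageText
{
    // Returns the string value of `propertyName` from a JSON queue message.
    // Assumption: the message is a JSON object; the thread does not confirm this.
    public static string Extract(string queueMessage, string propertyName)
    {
        using JsonDocument doc = JsonDocument.Parse(queueMessage);
        if (doc.RootElement.ValueKind == JsonValueKind.Object &&
            doc.RootElement.TryGetProperty(propertyName, out JsonElement value) &&
            value.ValueKind == JsonValueKind.String)
        {
            return value.GetString() ?? string.Empty;
        }
        throw new ArgumentException($"Message has no string property '{propertyName}'.");
    }

    public static void Main()
    {
        // Hypothetical queue message; "id" and "text" are made-up property names.
        string myQueueItem = "{\"id\": 1, \"text\": \"NL18RABO0123459876 NL98INGB0003856625\"}";

        // Only this extracted string would be sent to RecognizePiiEntitiesAsync.
        string document = Extract(myQueueItem, "text");
        Console.WriteLine(document);
    }
}
```

Throwing when the property is missing, instead of falling back to the raw message, keeps the function from silently analyzing the wrong text again.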

1 reaction
maririos commented, May 9, 2022

Great, thank you! So both Language Studio and the SDK are using the same service version. I tried to reproduce your scenario using the SDK, and I get the 12 PII entities that Language Studio is showing.

I’m using the code present in the PII sample:

string document = @"NL24 ABNA 4047 6339 76 NL61 INGB 0003 3505 63 NL18RABO0123459876 NL98INGB0003856625
NL98ABNA0416961347 NL98UGBI0771565860 NL98TRIO0254712320 NL98SNSB0908532792
NL97DEUT0265134951 NL97BNPA0227673409 NL97BNGH0285061917 NL97BOFA0266546412";

try
{
    var options = new RecognizePiiEntitiesOptions();
    options.CategoriesFilter.Add(PiiEntityCategory.InternationalBankingAccountNumber);

    Response<PiiEntityCollection> response = client.RecognizePiiEntities(document, options: options);
    PiiEntityCollection entities = response.Value;

    Console.WriteLine($"Redacted Text: {entities.RedactedText}");
    Console.WriteLine("");
    Console.WriteLine($"Recognized {entities.Count} PII entities:");
    foreach (PiiEntity entity in entities)
    {
        Console.WriteLine($"  Text: {entity.Text}");
        Console.WriteLine($"  Category: {entity.Category}");
        if (!string.IsNullOrEmpty(entity.SubCategory))
            Console.WriteLine($"  SubCategory: {entity.SubCategory}");
        Console.WriteLine($"  Confidence score: {entity.ConfidenceScore}");
        Console.WriteLine("");
    }
}
catch (RequestFailedException exception)
{
    Console.WriteLine($"Error Code: {exception.ErrorCode}");
    Console.WriteLine($"Message: {exception.Message}");
}

Output: [image]

The only difference between our code samples is how the text is passed. Looking at the output, it seems the content in myQueueItem is not properly handling end-of-line/end-of-file characters, as those are the 5 entities that are missing.
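
One way to check the line-ending hypothesis is to log the exact string the function receives with its control characters made visible, then compare it with the text pasted into Language Studio. The sketch below uses only the .NET base library. The InputDiagnostics class and its ShowControlChars method are illustrative names, not part of the SDK.

```csharp
using System;
using System.Text;

public static class InputDiagnostics
{
    // Replaces control characters with visible escapes such as \r, \n, or \u0000.
    public static string ShowControlChars(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            switch (c)
            {
                case '\r': sb.Append("\\r"); break;
                case '\n': sb.Append("\\n"); break;
                case '\t': sb.Append("\\t"); break;
                default:
                    if (char.IsControl(c))
                        sb.Append("\\u").Append(((int)c).ToString("X4"));
                    else
                        sb.Append(c);
                    break;
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Sample input with a Windows line break between two IBANs.
        string sample = "NL18RABO0123459876\r\nNL98INGB0003856625";
        Console.WriteLine(ShowControlChars(sample));
        // prints: NL18RABO0123459876\r\nNL98INGB0003856625
    }
}
```

If the logged value contains more than the expected IBAN text, such as surrounding message fields or unexpected escapes, the difference from the Language Studio input becomes visible before any call to the service.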
