[QUERY] How do we query the TOP N records in Azure.Data.Tables?

Library name and version

Azure.Data.Tables 12.6.1

Query/Question

With Microsoft.WindowsAzure.Storage.Table, we used to do the following:

var query = new TableQuery<TEntity>()
            .Where(partitionFilter)
            .Take(limit);
return await ExecuteQuery(table, query);

How do we achieve the same using Azure.Data.Tables? For now, we fetch all the data and then take the top N in memory, but that becomes problematic as the data grows.

Environment

Microsoft Visual Studio Enterprise 2022 (64-bit) - Current Version 17.2.6

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

3 reactions
fhtino commented, Sep 12, 2022

Hi @christothes, regarding the documentation, you linked a different API. However, the point is the same: “maxPerPage”. In your case it applies to “tables”, in my case to “entities”. In my opinion the documentation is clear and the parameter name “maxPerPage” is correct. This issue is instead about the lack of “TopN” functionality. It would be very useful to have a Query<T>(filter, ..., topN).

My local implementation (a wrapper) that covers my needs is:

public List<T> RunQuery<T>(Expression<Func<T, bool>> filter, int? topN = null) where T : class, ITableEntity, new()
{
    // Currently Azure.Data.Tables does not implement "TopN"
    // https://github.com/Azure/azure-sdk-for-net/issues/30985

    if (filter == null)
    {
        // this generates a query without a "where" condition
        filter = item => true;
    }

    TableClient tableClient = this.GetAzureTableClient<T>();

    // Remember: every underlying REST API call returns at most 1000 items.
    // Passing an explicit maxPerPage is not strictly required,
    // but if topN < 1000, retrieving more items than required would waste resources.
    int? maxPerPages = (topN.HasValue && topN < 1000) ? topN : null;

    var query = tableClient.Query<T>(filter, maxPerPages);

    var outputBuffer = new List<T>();

    foreach (var page in query.AsPages())
    {
        outputBuffer.AddRange(page.Values);

        if (topN.HasValue && outputBuffer.Count >= topN.Value)
            break;
    }

    if (topN.HasValue)
        outputBuffer = outputBuffer.Take(topN.Value).ToList();

    return outputBuffer;
}
3 reactions
fhtino commented, Sep 11, 2022

@christothes If I remember correctly, Take in the old WindowsAzure.Storage.Table queried and returned only the requested number of elements. With the new Azure.Data.Tables, specifying maxPerPage only limits the number of elements returned by each REST API call. For example, if I have a table with 1 million items and I specify 50, enumerating the whole result still returns all the elements, making 20,000 calls to the Azure Table REST API.

If I really want to fetch only the required data, I must do something like this:

            int? maxPerPages = (topN.HasValue && topN < 1000) ? topN : null;
            var result = tableClient.Query<T>(filter, maxPerPages);
            var buffer = new List<T>();
            foreach (var page in result.AsPages())
            {
                buffer.AddRange(page.Values);
                if (topN.HasValue && buffer.Count >= topN.Value)
                    break;
            }
            if (topN.HasValue)
                buffer = buffer.Take(topN.Value).ToList();
            return buffer;

I think the library should implement a “TopN” feature. Many users are not aware of the behavior of the underlying REST API.
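
For anyone who needs the same early-exit behavior with the async API, here is a minimal sketch. It assumes that the AsyncPageable<T> returned by TableClient.QueryAsync<T> can be consumed as an IAsyncEnumerable<T> and that it requests pages lazily, so stopping the enumeration stops further page requests. TakeToListAsync is a hypothetical helper name, not part of Azure.Data.Tables, and MyEntity in the usage comment is a placeholder entity type.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TopNExtensions
{
    // Hypothetical helper (not part of Azure.Data.Tables): collects at most
    // `count` items from an async sequence and stops enumerating as soon as
    // the limit is reached, so a lazily paged source is not read any further.
    public static async Task<List<T>> TakeToListAsync<T>(
        this IAsyncEnumerable<T> source,
        int count,
        CancellationToken cancellationToken = default)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));

        var results = new List<T>();
        if (count == 0) return results; // nothing requested: do not touch the source

        await foreach (T item in source.WithCancellation(cancellationToken).ConfigureAwait(false))
        {
            results.Add(item);
            if (results.Count >= count)
                break; // leaving the loop disposes the enumerator
        }

        return results;
    }
}

// Usage sketch (assumes an existing TableClient named tableClient and an
// entity type MyEntity implementing ITableEntity):
//
//   List<MyEntity> top50 = await tableClient
//       .QueryAsync<MyEntity>(e => e.PartitionKey == "pk", maxPerPage: 50)
//       .TakeToListAsync(50);
```

Setting maxPerPage to the same value as the limit (when it is below 1000) keeps the first request from returning more entities than needed, which is the same idea as in the synchronous wrapper above.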
