How to Run Requests in Parallel

Category

  • Bug

Describe the bug

I’m trying to process a set of items concurrently, usually in chunks. Sometimes it works fine. Other times I get a NullReferenceException or an IndexOutOfRangeException that originates in the BatchClient.

Steps to reproduce

                using (var context = await _myOwnFactory.CreatePnPContextAsync(_credentials, siteUrl))
                {
                    await LoadWebAsync(context);
                    var lists = GetListsToProcess(context.Web);
                    var listCollection = new List<IList>();

                    await foreach (var list in lists.AsAsyncEnumerable().WithCancellation(cancellationToken))
                    {
                        listCollection.Add(list);
                    }

                    await listCollection.Batch(5).AsyncParallelForEach(async batchLists =>
                    {
                        using (var localContext = await context.CloneAsync())
                        {
                            foreach (var list in batchLists)
                            {
                                var jobContext = new LocalProps(localContext, context.Site, context.Web, list);
                                await ProcessListAsync(jobContext);
                            }
                        }
                    });
                }

        private static async Task LoadWebAsync(PnPContext context)
        {
            var batch = context.NewBatch();
            await context.Site.LoadBatchAsync(batch, s => s.Url);
            await context.Web.LoadBatchAsync(batch, w => w.ServerRelativeUrl, w => w.Title);
            var batchResults = await context.ExecuteAsync(batch, false);
        }

        private static IQueryable<IList> GetListsToProcess(IWeb web)
        {
            // The Id (whatever is the Key) is populated automatically; no need to explicitly add it here
            return web.Lists
                .Where(ListPredicate)
                .QueryProperties(
                    l => l.Title,
                    l => l.ContentTypes.QueryProperties(ct => ct.Name, ct => ct.FieldLinks),
                    l => l.Fields.QueryProperties(f => f.InternalName, f => f.Title, f => f.FieldTypeKind));
        }

        private async Task ProcessListAsync(LocalProps jobContext)
        {
            var viewXml = $@"<View Scope='RecursiveAll'>
                    <RowLimit Paged='TRUE'>25</RowLimit>
                    <Query>
                      <Where>
                        <BeginsWith>
                          <FieldRef Name='ContentTypeId' />
                          <Value Type='ContentTypeId'>0x0101</Value>
                        </BeginsWith>
                      </Where>
                    </Query>
                </View>";

            var paging = true;
            string nextPage = null;
            while (paging)
            {
                var output = await jobContext.List.LoadListDataAsStreamAsync(new RenderListDataOptions()
                {
                    ViewXml = viewXml,
                    RenderOptions = RenderListDataOptionsFlags.ListData,
                    Paging = nextPage
                }).ConfigureAwait(false);

                if (output.ContainsKey("NextHref"))
                {
                    nextPage = output["NextHref"].ToStringSafe()[1..];
                }
                else
                {
                    paging = false;
                }
            }

            await jobContext.List.Items.AsRequested().Batch(25).AsyncParallelForEach(async listItems =>
            {
                // Since we can't get the file loaded at the same time as the list items, try to batch load them.
                var files = await GetFilesAsync(jobContext.LocalContext, listItems);
            });
        }

        private static async Task<List<IFile>> GetFilesAsync(PnPContext context, IReadOnlyCollection<IListItem> listItems)
        {
            // Error happens somewhere in this function
            var files = new List<IFile>();
            if (listItems.Count > 0)
            {
                var batch = context.NewBatch();

                foreach (var item in listItems)
                {
                    files.Add(await context.Web.GetFileByServerRelativeUrlBatchAsync(batch, (string)item.Values["FileRef"],
                        f => f.Exists,
                        f => f.Title,
                        f => f.TimeCreated,
                        f => f.TimeLastModified,
                        f => f.Name,
                        f => f.ServerRelativeUrl,
                        f => f.CheckedOutByUser.QueryProperties(u => u.Title, u => u.LoginName)));
                }

                var batchResults = await context.ExecuteAsync(batch, false);
            }

            return files;
        }

Expected behavior

I know I can’t use the same context, but I had hoped a .CloneAsync() would be fine.
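
One thing worth checking in the repro above: `LocalProps` receives `context.Site`, `context.Web`, and a `list` that were all loaded through the original `context`, and `ProcessListAsync` then calls `jobContext.List.LoadListDataAsStreamAsync(...)` on that list. If such objects stay bound to the context that loaded them (an assumption here, not something the issue confirms), the parallel workers would still share the original context's internal state even though each one cloned a context. The sketch below illustrates the idea with hypothetical stand-in types (`FakeContext`, `FakeList`, `IsolationSketch`), none of which are PnP Core SDK APIs: every worker clones the context and re-resolves its items through the clone, so no worker mutates the parent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-ins for illustration only; these are NOT PnP Core SDK types.
public sealed class FakeContext
{
    // A plain Dictionary is not safe for concurrent writes.
    private readonly Dictionary<Guid, string> _batches = new Dictionary<Guid, string>();

    public FakeContext Clone() => new FakeContext();
    public FakeList GetList(int id) => new FakeList(id, this);
    public void RegisterBatch() => _batches.Add(Guid.NewGuid(), "batch");
    public int BatchCount => _batches.Count;
}

public sealed class FakeList
{
    public FakeList(int id, FakeContext owner) { Id = id; Owner = owner; }
    public int Id { get; }
    public FakeContext Owner { get; } // the context whose state this object mutates
    public void LoadData() => Owner.RegisterBatch();
}

public static class IsolationSketch
{
    public static async Task<(int ParentBatches, int WorkerBatches)> RunAsync(int listCount, int chunkSize)
    {
        var parent = new FakeContext();
        var parentLists = Enumerable.Range(0, listCount).Select(id => parent.GetList(id)).ToList();

        // Split the lists into chunks of at most chunkSize items.
        var chunks = parentLists
            .Select((list, index) => (list, index))
            .GroupBy(x => x.index / chunkSize, x => x.list);

        var workers = chunks.Select(chunk => Task.Run(() =>
        {
            var local = parent.Clone(); // one context per worker
            foreach (var parentList in chunk)
            {
                // Re-resolve the list through the clone instead of reusing the
                // parent-bound object, so only `local` is ever mutated here.
                var localList = local.GetList(parentList.Id);
                localList.LoadData();
            }
            return local.BatchCount;
        })).ToList();

        var counts = await Task.WhenAll(workers);
        return (parent.BatchCount, counts.Sum());
    }
}
```

Calling `IsolationSketch.RunAsync(12, 5)` leaves the parent with zero registered batches while the clones together hold twelve. Had the workers called `parentList.LoadData()` instead, every write would have landed in the parent's shared dictionary.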

Environment details (development & target environment)

  • SDK version: 1.1.0
  • OS: Windows 10
  • SDK used in: ASP.NET Core IHostedService
  • Framework: .NET 5.0.5
  • Tooling: Visual Studio 2019

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

DaleyKD commented, May 13, 2021 (1 reaction)

Initial tests show that this has helped. Will close for now. Thank you, @jansenbe. (If something stops working, I’ll reopen if necessary.)

DaleyKD commented, May 11, 2021 (1 reaction)

I just want to put this here for documentation’s sake. I FINALLY got the IndexOutOfRangeException again.

Index was outside the bounds of the array.

System.Private.CoreLib
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at PnP.Core.Services.BatchClient.EnsureBatch(Guid id)
   at PnP.Core.Services.BatchClient.EnsureBatch()
   at PnP.Core.Model.SharePoint.List.<LoadListDataAsStreamAsync>d__158.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
   at MyClass.<ProcessListAsync>d__18.MoveNext() in MyClass.cs:line 201
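
The trace above ends in `Dictionary.Add`, called from `BatchClient.EnsureBatch`. `Dictionary<TKey, TValue>` does not support concurrent writes, and adding from several threads at once can corrupt its internal arrays and surface as exactly this kind of `IndexOutOfRangeException` (or a `NullReferenceException`). As a minimal sketch of the alternative to per-worker isolation, the hypothetical `BatchRegistry` below uses `ConcurrentDictionary` so that registration is safe under parallel callers. The class name and shape are assumptions for illustration; this is not how PnP Core SDK implements `BatchClient`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical registry for illustration; not PnP Core SDK's BatchClient.
public sealed class BatchRegistry
{
    private readonly ConcurrentDictionary<Guid, string> _batches =
        new ConcurrentDictionary<Guid, string>();

    // GetOrAdd may be called from many threads at once. The value factory can run
    // more than once under contention, but only one value is stored per key.
    public string EnsureBatch(Guid id) =>
        _batches.GetOrAdd(id, key => "batch-" + key.ToString("N"));

    public int Count => _batches.Count;
}

public static class RegistryDemo
{
    // Registers workers * perWorker distinct batch ids from parallel threads.
    public static int RegisterConcurrently(int workers, int perWorker)
    {
        var registry = new BatchRegistry();
        Parallel.For(0, workers, _ =>
        {
            for (var i = 0; i < perWorker; i++)
            {
                registry.EnsureBatch(Guid.NewGuid());
            }
        });
        return registry.Count;
    }
}
```

From the caller's side, the code in the issue cannot change the library's internals, so the practical option there is to make sure each parallel worker only touches objects that belong to its own context.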