How To Parallel Requests
Category
- Bug
Describe the bug
I’m trying to iterate over a bunch of items concurrently, usually in chunks. Sometimes it works great. Other times, I get a NullReferenceException or an IndexOutOfRangeException from the BatchClient.
Steps to reproduce
using (var context = await _myOwnFactory.CreatePnPContextAsync(_credentials, siteUrl))
{
    await LoadWebAsync(context);
    var lists = GetListsToProcess(context.Web);
    var listCollection = new List<IList>();
    await foreach (var list in lists.AsAsyncEnumerable().WithCancellation(cancellationToken))
    {
        listCollection.Add(list);
    }
    await listCollection.Batch(5).AsyncParallelForEach(async batchLists =>
    {
        using (var localContext = await context.CloneAsync())
        {
            foreach (var list in batchLists)
            {
                var jobContext = new LocalProps(localContext, context.Site, context.Web, list);
                await ProcessListAsync(jobContext);
            }
        }
    });
}

private static async Task LoadWebAsync(PnPContext context)
{
    var batch = context.NewBatch();
    await context.Site.LoadBatchAsync(batch, s => s.Url);
    await context.Web.LoadBatchAsync(batch, w => w.ServerRelativeUrl, w => w.Title);
    var batchResults = await context.ExecuteAsync(batch, false);
}

private static IQueryable<IList> GetListsToProcess(IWeb web)
{
    // The Id (whatever is the Key) is populated automatically; no need to explicitly add it here
    return web.Lists
        .Where(ListPredicate)
        .QueryProperties(
            l => l.Title,
            l => l.ContentTypes.QueryProperties(ct => ct.Name, ct => ct.FieldLinks),
            l => l.Fields.QueryProperties(f => f.InternalName, f => f.Title, f => f.FieldTypeKind));
}

private async Task ProcessListAsync(LocalProps jobContext)
{
    var viewXml = $@"<View Scope='RecursiveAll'>
        <RowLimit Paged='TRUE'>25</RowLimit>
        <Query>
            <Where>
                <BeginsWith>
                    <FieldRef Name='ContentTypeId' />
                    <Value Type='ContentTypeId'>0x0101</Value>
                </BeginsWith>
            </Where>
        </Query>
    </View>";
    var paging = true;
    string nextPage = null;
    while (paging)
    {
        var output = await jobContext.List.LoadListDataAsStreamAsync(new RenderListDataOptions()
        {
            ViewXml = viewXml,
            RenderOptions = RenderListDataOptionsFlags.ListData,
            Paging = nextPage
        }).ConfigureAwait(false);
        if (output.ContainsKey("NextHref"))
        {
            nextPage = output["NextHref"].ToStringSafe()[1..];
        }
        else
        {
            paging = false;
        }
    }
    await jobContext.List.Items.AsRequested().Batch(25).AsyncParallelForEach(async listItems =>
    {
        // Since we can't get the file loaded at the same time as the list items, try to batch load them.
        var files = await GetFilesAsync(jobContext.LocalContext, listItems);
    });
}

private static async Task<List<IFile>> GetFilesAsync(PnPContext context, IReadOnlyCollection<IListItem> listItems)
{
    // Error happens somewhere in this function
    var files = new List<IFile>();
    if (listItems.Count > 0)
    {
        var batch = context.NewBatch();
        foreach (var item in listItems)
        {
            files.Add(await context.Web.GetFileByServerRelativeUrlBatchAsync(batch, (string)item.Values["FileRef"],
                f => f.Exists,
                f => f.Title,
                f => f.TimeCreated,
                f => f.TimeLastModified,
                f => f.Name,
                f => f.ServerRelativeUrl,
                f => f.CheckedOutByUser.QueryProperties(u => u.Title, u => u.LoginName)));
        }
        var batchResults = await context.ExecuteAsync(batch, false);
    }
    return files;
}
Expected behavior
I know I can’t use the same context concurrently, but I had hoped a clone created with .CloneAsync() would be fine.
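For reference, the general pattern I was aiming for is "one private context per concurrent worker". Below is a minimal sketch of that pattern using only standard .NET types. FakeContext, ProcessAsync, and the document paths are hypothetical stand-ins made up for illustration; the sketch does not call PnP Core SDK, and it makes no claim about whether a context returned by .CloneAsync() is actually isolated from its source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a context whose pending-request list is not thread-safe.
public sealed class FakeContext
{
    private readonly List<string> _pending = new List<string>();

    // Queue a request; only safe when a single worker owns this instance.
    public void Queue(string request) => _pending.Add(request);

    // Pretend to send the queued requests and report how many were sent.
    public async Task<int> ExecuteAsync()
    {
        await Task.Delay(10); // simulate network latency
        int count = _pending.Count;
        _pending.Clear();
        return count;
    }
}

public static class Program
{
    // Splits items into chunks and runs at most maxParallel chunks at a time.
    // Every chunk creates its own FakeContext, so no context is shared between workers.
    public static async Task<int> ProcessAsync(IEnumerable<string> items, int chunkSize, int maxParallel)
    {
        var chunks = items
            .Select((item, index) => (item, index))
            .GroupBy(x => x.index / chunkSize, x => x.item)
            .Select(g => g.ToList())
            .ToList();

        using var gate = new SemaphoreSlim(maxParallel);
        var tasks = chunks.Select(async chunk =>
        {
            await gate.WaitAsync();
            try
            {
                var context = new FakeContext(); // private to this worker
                foreach (var item in chunk)
                {
                    context.Queue(item);
                }
                return await context.ExecuteAsync();
            }
            finally
            {
                gate.Release();
            }
        }).ToList(); // materialize so every task is started exactly once

        int[] results = await Task.WhenAll(tasks);
        return results.Sum();
    }

    public static async Task Main()
    {
        var items = Enumerable.Range(0, 103).Select(i => $"/sites/demo/doc{i}.docx");
        int processed = await ProcessAsync(items, chunkSize: 25, maxParallel: 4);
        Console.WriteLine(processed); // prints 103
    }
}
```

In my code above, the inner AsyncParallelForEach in ProcessListAsync passes the same jobContext.LocalContext to every parallel chunk, so it does not follow this pattern. That might be related to the exceptions, but I have not confirmed it.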
Environment details (development & target environment)
- SDK version: 1.1.0
- OS: Windows 10
- SDK used in: ASP.NET Core IHostedService
- Framework: .NET Core v5.0.5
- Tooling: Visual Studio 2019
Issue Analytics
- State:
- Created 2 years ago
- Comments:10 (10 by maintainers)
Top GitHub Comments
Initial tests show that this has helped. Will close for now. Thank you, @jansenbe. (If something stops working, I’ll reopen if necessary.)
I just want to put this here for documentation’s sake. I FINALLY got the IndexOutOfRangeException again.