Batching and running tasks in parallel

I’ve had a few projects where I had a large number of asynchronous tasks to execute but didn’t want to run them all at once, as this would put too much concurrent load on the system and degrade performance.

Initially I used something similar to the below to execute all tasks at once.


// Run all tasks
await Task.WhenAll(sources.Select(i => serviceProvider.GetService<App>().Run(i)).ToArray());

This starts a service for every item in the list, which works fine when only a small number of items need processing.

I then split the list into chunks and processed each chunk separately in a foreach loop using the above code. This also works fine, but it can be inefficient: if 9 out of 10 tasks in a batch complete quickly, we still have to wait for the final task to finish before we can start the next batch.
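
The chunked approach can be sketched roughly as follows. This is a minimal illustration, not the original project code: `ProcessAsync` is a hypothetical stand-in for `serviceProvider.GetService<App>().Run(i)`, the batch size of 3 is arbitrary, and `Enumerable.Chunk` assumes .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkedRunner
{
    // Hypothetical stand-in for serviceProvider.GetService<App>().Run(source)
    public static async Task ProcessAsync(string source)
    {
        await Task.Delay(10);
    }

    // Runs the work over the sources one batch at a time and returns the number of batches
    public static async Task<int> RunInChunksAsync(
        IEnumerable<string> sources, int batchSize, Func<string, Task> work)
    {
        int batches = 0;
        foreach (string[] chunk in sources.Chunk(batchSize))
        {
            // Every task in the batch must finish before the next batch starts,
            // so a single slow task holds up the whole batch
            await Task.WhenAll(chunk.Select(work));
            batches++;
        }
        return batches;
    }

    public static async Task Main()
    {
        IEnumerable<string> sources = Enumerable.Range(1, 10).Select(i => $"source-{i}");
        int batches = await RunInChunksAsync(sources, 3, ProcessAsync);
        Console.WriteLine($"Processed {batches} batches");
    }
}
```

With 10 sources and a batch size of 3 this runs four batches, and the last batch contains a single item, which shows how capacity sits idle whenever a batch is waiting on its slowest task.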

Happily, Microsoft have something called the Task Parallel Library (TPL), whose Dataflow components are designed for just such workloads. I based the below code on an example I found here.

This works by creating an action block, which processes each item sent to it using the specified options. In this case, the number of tasks executing in parallel is controlled by the MaxDegreeOfParallelism setting.


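// Requires the System.Threading.Tasks.Dataflow namespace (available as a NuGet package of the same name)
static async Task MainAsync()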
{
	// Create service collection
	ServiceCollection serviceCollection = new ServiceCollection();
	ConfigureServices(serviceCollection);

	// Create service provider
	IServiceProvider serviceProvider = serviceCollection.BuildServiceProvider();

	// Get backup sources for client
	List<String> sources = configuration.GetSection("Backup:Sources").GetChildren().Select(x => x.Value).ToList();

	// Create a block with an asynchronous action
	var block = new ActionBlock<string>(
		async x => await serviceProvider.GetService<App>().Run(x),
		new ExecutionDataflowBlockOptions
		{
			MaxDegreeOfParallelism = int.Parse(configuration["Backup:MaxDegreeOfParallelism"])
			//MaxDegreeOfParallelism = Environment.ProcessorCount, // Parallelize on all cores
		});

	// Add items to the block; SendAsync only waits when BoundedCapacity is set
	// and the block is full (it is not set here, so the queue is unbounded)
	foreach (string source in sources)
	{
		await block.SendAsync(source);
	}

	block.Complete();
	await block.Completion;
}
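
By default an ActionBlock's input queue is unbounded, so `SendAsync` only applies back-pressure once `BoundedCapacity` is set. The following is a minimal sketch of setting both options together and measuring the peak concurrency. The `Task.Delay` call is a hypothetical stand-in for `App.Run`, and the values 20, 4 and 8 are arbitrary assumptions.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedBlockDemo
{
    // Returns the highest number of items that were being processed at the same time
    public static async Task<int> RunAsync(int itemCount, int maxParallel, int capacity)
    {
        int running = 0;
        int peak = 0;
        object gate = new object();

        var block = new ActionBlock<int>(
            async item =>
            {
                lock (gate)
                {
                    running++;
                    peak = Math.Max(peak, running);
                }

                await Task.Delay(20); // hypothetical stand-in for the real work

                lock (gate)
                {
                    running--;
                }
            },
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallel, // items processed concurrently
                BoundedCapacity = capacity            // items queued or in progress
            });

        foreach (int item in Enumerable.Range(0, itemCount))
        {
            // Waits asynchronously while the block already holds `capacity` items
            await block.SendAsync(item);
        }

        block.Complete();
        await block.Completion;
        return peak;
    }

    public static async Task Main()
    {
        int peak = await RunAsync(itemCount: 20, maxParallel: 4, capacity: 8);
        Console.WriteLine($"Peak concurrency: {peak}");
    }
}
```

The reported peak should never exceed `maxParallel`. A bounded capacity mainly limits how many pending items are held in memory; it does not by itself make processing faster.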

A full working example can be found in this GitHub project.


2 Comments

Trang Nguyen · 31st July 2021 at 2:37 am

I tried this out expecting it to be a lot faster, but it seems to run a lot slower. Using the traditional approach with WhenAll, it took 30 minutes to process 60K. With the ActionBlock code, it took close to 100 minutes for the same 60K. Any idea why?

    Shinigami · 2nd August 2021 at 9:34 am

    I’m not sure why this would be so much slower for you. Possibly WhenAll() runs more of the work concurrently than your ActionBlock settings allow, so tweaking the BoundedCapacity or MaxDegreeOfParallelism might help.
