I’m in the process of building an Azure Function to replicate some of the functionality found in Postcodes.io. This function will accept one or more postcodes as input and return details of the postcodes in the response.

This lookup retrieves the postcode details from a copy of the ONS Postcode Directory stored as a table in Azure Storage. The table uses the outcode as the PartitionKey and the postcode as the RowKey (it might be faster to partition by postcode, given that I won’t be updating this table).
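As a rough illustration of that key scheme, here is a minimal sketch of how a raw postcode could be split into the two keys. The `PostcodeKeys` helper is hypothetical and not part of the function itself, and it assumes the RowKey is stored upper-cased with no spaces, which may not match how the table was actually loaded. It relies on the fact that the inward code of a UK postcode is always the last three characters.

```csharp
using System;
using System.Linq;

public static class PostcodeKeys
{
    // Hypothetical helper: normalises a UK postcode and splits it into
    // PartitionKey (outcode) and RowKey (full postcode, no spaces).
    public static (string PartitionKey, string RowKey) FromPostcode(string postcode)
    {
        if (string.IsNullOrWhiteSpace(postcode))
            throw new ArgumentException("Postcode is required.", nameof(postcode));

        // Strip whitespace and upper-case, e.g. "sw1a 1aa" -> "SW1A1AA".
        string compact = new string(postcode.Where(c => !char.IsWhiteSpace(c)).ToArray())
            .ToUpperInvariant();

        // A full postcode is 5 to 7 characters once spaces are removed.
        if (compact.Length < 5 || compact.Length > 7)
            throw new ArgumentException("Not a full UK postcode.", nameof(postcode));

        // The inward code is always the last three characters;
        // everything before it is the outcode.
        string outcode = compact.Substring(0, compact.Length - 3);
        return (outcode, compact);
    }
}
```

For example, `PostcodeKeys.FromPostcode("sw1a 1aa")` gives the pair `("SW1A", "SW1A1AA")`.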

Retrieving a single postcode is easy enough using the Retrieve TableOperation. Initially I’d thought that I could use a TableBatchOperation to retrieve multiple rows, the same way I’d previously used it to delete multiple rows. However, it turns out that a batch can contain only a single Retrieve operation, so this wouldn’t work as a solution.

One of the major advantages of Azure Storage tables is that they are massively parallel, allowing up to 500 queries on the same partition simultaneously. To take advantage of this, I decided to use the Task Parallel Library (specifically its Dataflow blocks) to parallelize my individual retrieval requests, as I’ve done previously when batching and running tasks in parallel within a console application.

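using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
// TableOperation and TableResult come from the Azure Storage table SDK,
// e.g. Microsoft.WindowsAzure.Storage.Table (adjust to the SDK in use).
using Microsoft.WindowsAzure.Storage.Table;

public async Task<PostcodeEntity> GetPostcodeEntityAsync(PostcodeMatch postcode)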
{
	// Create a retrieve operation that takes a postcode entity.
	TableOperation retrieveOperation = TableOperation.Retrieve<PostcodeEntity>(postcode.Outcode, postcode.Postcode);

	// Execute the retrieve operation.
	TableResult retrievedResult = await _postcodeTable.ExecuteAsync(retrieveOperation);
	return (PostcodeEntity)retrievedResult.Result;
}

public async Task<List<PostcodeEntity>> GetPostcodeEntityAsync(List<PostcodeMatch> postcodes)
{
	List<PostcodeEntity> postcodeEntities = new List<PostcodeEntity>();

	// Create a transform block to turn each PostcodeMatch into a PostcodeEntity
	var getPostcodeEntityBlock = new TransformBlock<PostcodeMatch, PostcodeEntity>(async postcodeMatch =>
	{
		return await GetPostcodeEntityAsync(postcodeMatch);
	}, new ExecutionDataflowBlockOptions
	{
		MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
	});

	// Create an action block to add each returned PostcodeEntity to the list
	var addPostcodeEntityBlock = new ActionBlock<PostcodeEntity>(postcodeEntity => postcodeEntities.Add(postcodeEntity));

	// Link the transform block to the action block
	getPostcodeEntityBlock.LinkTo(
		addPostcodeEntityBlock,
		new DataflowLinkOptions
		{
			PropagateCompletion = true
		});

	// Post all PostcodeMatch objects to the transform block
	postcodes.ForEach(x => getPostcodeEntityBlock.Post(x));

	// Complete the transform block
	getPostcodeEntityBlock.Complete();

	// Wait for action block to complete
	await addPostcodeEntityBlock.Completion;

	return postcodeEntities;
}

This uses a transform block to fetch each PostcodeEntity via the single-postcode GetPostcodeEntityAsync method. It is linked to an action block, which adds each returned PostcodeEntity to the output list. To use it, I just post all of the initial PostcodeMatch objects to the transform block, mark it complete, and then await completion of the action block.
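One possible variation on this pipeline, sketched below, caps the number of lookups in flight and drops lookups that return nothing. As far as I know, a Retrieve for a row that doesn’t exist comes back with a null Result, so the list above could otherwise end up containing nulls. The `BoundedLookup` class, the `LookupAllAsync` name and the default of 50 are all made up for illustration. The lookup is passed in as a delegate so the sketch runs without a storage account.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedLookup
{
    // Hypothetical helper: runs `lookup` for every key with at most
    // `maxParallel` calls in flight and skips keys whose lookup returns null.
    public static async Task<List<TResult>> LookupAllAsync<TKey, TResult>(
        IEnumerable<TKey> keys,
        Func<TKey, Task<TResult>> lookup,
        int maxParallel = 50) // assumed default, tune for your workload
        where TResult : class
    {
        var results = new List<TResult>();

        // Transform block: one asynchronous lookup per key, bounded parallelism.
        var lookupBlock = new TransformBlock<TKey, TResult>(
            lookup,
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        // Action block: runs one item at a time by default, so the
        // non-thread-safe List<T> is never modified concurrently.
        var collectBlock = new ActionBlock<TResult>(item =>
        {
            if (item != null)
            {
                results.Add(item);
            }
        });

        lookupBlock.LinkTo(collectBlock, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var key in keys)
        {
            lookupBlock.Post(key);
        }

        // Signal that no more keys are coming, then wait for the pipeline to drain.
        lookupBlock.Complete();
        await collectBlock.Completion;

        return results;
    }
}
```

With the method above, the call would look something like `await BoundedLookup.LookupAllAsync(postcodes, GetPostcodeEntityAsync, 50)`. Because the transform block keeps its output in input order by default, the results come back in the same order as the postcodes that were found.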

