Retry Architecture

by Husain   Last Updated January 12, 2018 23:05 PM

This has been on my mind, as I have developed several applications with this feature.

Suppose an application must process incoming requests asynchronously. Take a notification system as an example: other agents submit requests asking it to notify people via email, text, etc. To do so, the application needs to call external systems (an SMTP server in this example). These external systems might be temporarily down, so a retry mechanism is required (up to a certain number of retries).

There are libraries that offer retry support, such as Polly. The idea is that the application retries X times with a delay of D between attempts. The problem is that the request is held in memory for the entire retry process, which is resource inefficient.
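
To make the concern concrete, here is a minimal sketch of an in-memory "retry X times with D delay" loop. It is written in Python for illustration only and is not Polly's API; the names retry, max_attempts, delay_seconds and the injectable sleep parameter are all assumptions made for this example.

```python
import time


def retry(operation, max_attempts, delay_seconds, sleep=time.sleep):
    """Call operation until it succeeds or max_attempts calls have failed.

    Hypothetical helper for illustration; not taken from any library.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:  # a real system would catch only transient errors
            last_error = error
            if attempt < max_attempts:
                # The caller, and the request it holds, stay in memory
                # for the whole wait. This is the cost described above.
                sleep(delay_seconds)
    raise last_error
```

The sleep parameter exists only so the delay can be replaced in tests. The point of the sketch is that the request never leaves the process while it waits, so a long outage ties up one waiting caller per pending request.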

What would be a plausible pattern for this sort of problem? What considerations or platforms should I look into? What did you do when you faced a similar problem?

Every time I have faced this problem, I solved it with a table that contains the tasks to be processed. I process them in batches and update their statuses (NEW, IN_PROGRESS, ERROR). This mechanism works well with a single instance, but once there are multiple instances, locking the table becomes necessary so that no two instances process the same request. I suspect there is a better solution for this problem.
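
The table-based approach could look roughly like the sketch below, which uses Python's sqlite3 module. The table name tasks, its columns, MAX_ATTEMPTS, and the function names are hypothetical. Two further assumptions are mine rather than part of the approach as described: a task is deleted once it succeeds, and a failed task goes back to NEW until it runs out of attempts. The claim step uses a conditional UPDATE instead of a table lock, which is one possible way to keep two instances from taking the same row.

```python
import sqlite3

MAX_ATTEMPTS = 3  # hypothetical retry limit


def create_schema(conn):
    # Hypothetical schema: one row per pending request.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'NEW',"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )


def claim_batch(conn, batch_size):
    """Mark up to batch_size NEW tasks as IN_PROGRESS and return them."""
    rows = conn.execute(
        "SELECT id, payload FROM tasks WHERE status = 'NEW' ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    claimed = []
    for task_id, payload in rows:
        # The status check makes the claim conditional: if another instance
        # already took this row, no row is updated and we skip it.
        cursor = conn.execute(
            "UPDATE tasks SET status = 'IN_PROGRESS' WHERE id = ? AND status = 'NEW'",
            (task_id,),
        )
        if cursor.rowcount == 1:
            claimed.append((task_id, payload))
    conn.commit()
    return claimed


def process_batch(conn, handler, batch_size=10):
    """Run handler on each claimed task and record the outcome."""
    for task_id, payload in claim_batch(conn, batch_size):
        try:
            handler(payload)
        except Exception:
            # Assumption: put the task back to NEW until attempts run out,
            # then park it as ERROR.
            conn.execute(
                "UPDATE tasks SET attempts = attempts + 1,"
                " status = CASE WHEN attempts + 1 >= ? THEN 'ERROR' ELSE 'NEW' END"
                " WHERE id = ?",
                (MAX_ATTEMPTS, task_id),
            )
        else:
            # Assumption: finished tasks are simply removed.
            conn.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
        conn.commit()
```

A scheduler would call process_batch periodically with a handler that sends the notification. Between runs the pending work lives in the table rather than in memory. Whether the conditional UPDATE is sufficient under real concurrency depends on the database and its isolation guarantees, so treat this as a starting point rather than a verified multi-instance design.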


