Introduction to Resiliency
Resiliency is a critical aspect of building robust and fault-tolerant applications. In distributed systems, transient failures are inevitable due to network issues, system overload, or temporary unavailability of external dependencies. A well-designed system must gracefully handle such failures to ensure high availability and reliability. .NET Aspire provides built-in support for resiliency through Polly, a powerful library for handling transient faults and improving system reliability.
Before we start, if you are new to .NET Aspire, I would suggest you check out the previous chapters of this .NET Aspire Quickbook.
- Getting Started with .NET Aspire
- Setting Up Your Development Environment
- Core Concepts of .NET Aspire
- Upgrading a Real-World Microservices-Based Application with Resiliency
What is Polly?
Polly is a .NET library that enables developers to implement resilience strategies such as:
- Retries (fixed, exponential backoff, jittered)
- Circuit breakers
- Timeouts
- Fallback policies
- Bulkhead isolation
It is widely used in .NET applications, including .NET Aspire, to enhance reliability when interacting with external services such as APIs, databases, and microservices.
Why is Polly Important in .NET Aspire?
.NET Aspire relies on Polly to implement resilience policies that handle network failures, API timeouts, and transient errors. By integrating Polly, applications can:
- Improve reliability by handling transient failures effectively.
- Prevent API overload using circuit breakers.
- Ensure availability with fallback responses.
- Avoid synchronized retry bursts with jittered exponential backoff.
Configuring Polly in .NET Aspire
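Polly can be configured within .NET Aspire to automatically retry failed HTTP requests. The examples below use the AddTransientHttpErrorPolicy extension method from the Microsoft.Extensions.Http.Polly package.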
Basic Retry Policy
builder.Services.AddHttpClient("MyService")
    .AddTransientHttpErrorPolicy(policy =>
        policy.WaitAndRetryAsync(3, _ => TimeSpan.FromSeconds(2)));
- Retries 3 times
- Waits 2 seconds between retries
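To make the behaviour of a fixed-delay retry concrete, here is a rough hand-rolled equivalent that uses only the base class library. RetryAsync and FlakyCallAsync are hypothetical names introduced for illustration, and the delay is shortened so the sketch runs quickly. Polly's own implementation does considerably more (it also inspects HTTP status results and supports cancellation and callbacks), so treat this as a sketch of the idea rather than a replacement.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: run the action once, then retry up to retryCount
// more times, waiting a fixed delay between attempts.
static async Task<T> RetryAsync<T>(Func<Task<T>> action, int retryCount, TimeSpan delay)
{
    for (int attempt = 0; ; attempt++)
    {
        try
        {
            return await action();
        }
        catch (HttpRequestException) when (attempt < retryCount)
        {
            // Transient failure and retries remain: wait, then try again.
            await Task.Delay(delay);
        }
    }
}

// Hypothetical flaky call: fails twice, then succeeds on the third attempt.
int calls = 0;
Task<string> FlakyCallAsync()
{
    calls++;
    if (calls < 3) throw new HttpRequestException("simulated transient failure");
    return Task.FromResult("ok");
}

string result = await RetryAsync(() => FlakyCallAsync(), retryCount: 3, delay: TimeSpan.FromMilliseconds(10));
Console.WriteLine($"{result} after {calls} calls"); // prints "ok after 3 calls"
```

Note that 3 retries means up to 4 attempts in total: the original call plus three retries.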
Exponential Backoff Policy
builder.Services.AddHttpClient("MyService")
    .AddTransientHttpErrorPolicy(policy =>
        policy.WaitAndRetryAsync(5, retryAttempt =>
            TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))));
- Retries 5 times
- Waits 2s, 4s, 8s, 16s, and 32s (the delay doubles each time)
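Jittered backoff is listed among Polly's retry options earlier in this chapter but is not shown in the configuration above. One common way to add jitter is to add a small random offset to each exponential delay so that many clients do not retry at the same instant. The sketch below assumes a hypothetical helper named BackoffWithJitter and an arbitrary jitter ceiling of one second.

```csharp
using System;

// Hypothetical helper: exponential delay (2^attempt seconds) plus up to
// maxJitterMs of random jitter so that clients do not retry in lockstep.
static TimeSpan BackoffWithJitter(int retryAttempt, Random random, int maxJitterMs = 1000)
{
    TimeSpan exponential = TimeSpan.FromSeconds(Math.Pow(2, retryAttempt));
    TimeSpan jitter = TimeSpan.FromMilliseconds(random.Next(0, maxJitterMs));
    return exponential + jitter;
}

var random = new Random();
for (int attempt = 1; attempt <= 5; attempt++)
{
    TimeSpan delay = BackoffWithJitter(attempt, random);
    Console.WriteLine($"retry {attempt}: wait {delay.TotalSeconds:F2}s");
}
```

A function of this shape could be passed to WaitAndRetryAsync in place of the plain Math.Pow expression, for example as retryAttempt => BackoffWithJitter(retryAttempt, Random.Shared).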
Circuit Breaker Policy
builder.Services.AddHttpClient("MyService")
    .AddTransientHttpErrorPolicy(policy =>
        policy.CircuitBreakerAsync(3, TimeSpan.FromSeconds(10)));
- Opens the circuit after 3 consecutive failures
- Rejects requests for 10 seconds before allowing another attempt
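The circuit breaker rule described above (3 consecutive failures, then a 10 second break) can be pictured as a small piece of state tracking. The sketch below is a simplified model with hypothetical names (IsOpen, RecordSuccess, RecordFailure); it is not how Polly is implemented, and it leaves out Polly's half-open trial state.

```csharp
using System;

// Settings taken from the description above: open after 3 consecutive
// failures and stay open for 10 seconds.
const int FailuresBeforeBreaking = 3;
TimeSpan breakDuration = TimeSpan.FromSeconds(10);

int consecutiveFailures = 0;
DateTime openUntil = DateTime.MinValue;

// While the circuit is open, callers fail fast instead of calling the service.
bool IsOpen(DateTime now) => now < openUntil;

// Any success resets the consecutive failure count.
void RecordSuccess() => consecutiveFailures = 0;

void RecordFailure(DateTime now)
{
    consecutiveFailures++;
    if (consecutiveFailures >= FailuresBeforeBreaking)
    {
        openUntil = now + breakDuration;
        consecutiveFailures = 0;
    }
}

DateTime start = DateTime.UtcNow;
RecordFailure(start);
RecordSuccess();                                   // resets the count
RecordFailure(start);
RecordFailure(start);
Console.WriteLine(IsOpen(start));                  // False: only 2 consecutive failures
RecordFailure(start);
Console.WriteLine(IsOpen(start.AddSeconds(5)));    // True: circuit is open
Console.WriteLine(IsOpen(start.AddSeconds(11)));   // False: the break has elapsed
```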
Default Error Codes for Retries in .NET Aspire
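By default, .NET Aspire applies a Polly-based retry policy to outgoing HTTP requests through the standard resilience handler registered by the service defaults project. It treats 408, 429, and any status of 500 or above as transient, so requests are retried automatically for codes such as: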
- 408: Request Timeout
- 429: Too Many Requests (Rate limiting)
- 500: Internal Server Error
- 502: Bad Gateway
- 503: Service Unavailable
- 504: Gateway Timeout
How does it work?
When a request to an external service or API fails with one of the above status codes, .NET Aspire retries up to 3 times (by default) to improve resilience and maintain service availability.
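If the default of 3 retries does not suit a service, the retry settings can be adjusted where the handler is registered. The following is a sketch, assuming the retries come from the standard resilience handler (AddStandardResilienceHandler in the Microsoft.Extensions.Http.Resilience package) and that it is registered for all clients through ConfigureHttpClientDefaults, as the Aspire service defaults template does. The values shown are illustrative, not recommendations.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Polly;

var builder = Host.CreateApplicationBuilder(args);

// Assumption: this registration normally lives in the ServiceDefaults project.
builder.Services.ConfigureHttpClientDefaults(http =>
{
    http.AddStandardResilienceHandler(options =>
    {
        // Retry up to 5 times instead of the default 3.
        options.Retry.MaxRetryAttempts = 5;
        // Start at 1 second and grow exponentially, with jitter.
        options.Retry.Delay = TimeSpan.FromSeconds(1);
        options.Retry.BackoffType = DelayBackoffType.Exponential;
        options.Retry.UseJitter = true;
    });
});
```

Keeping these settings in one place means every HttpClient created through the factory picks up the same behaviour.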
Case Study: Handling Payment Gateway Failures in an E-commerce System
Scenario
An e-commerce platform processes online payments through a third-party payment gateway. However, network issues or API downtime can cause transaction failures, leading to a poor user experience.
Solution
By leveraging Polly’s Retry and Circuit Breaker policies, we can:
- Automatically retry transient failures.
- Prevent excessive load on the payment gateway.
- Provide fallback mechanisms to notify users of failures.
Implementation
1. Define Payment Service Interface
public interface IPaymentService
{
    Task<bool> ProcessPaymentAsync(PaymentRequest request);
}
2. Implement Polly in the Payment Service
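using System.Net.Http.Json;
using Polly;

public class PaymentService : IPaymentService
{
    // Shared across instances: typed clients are created per resolution, so a
    // per-instance policy would never accumulate circuit breaker state.
    private static readonly AsyncPolicy ResiliencyPolicy =
        Policy.Handle<HttpRequestException>()
            // Outer policy: retry 3 times, waiting 2s, 4s, then 8s.
            .WaitAndRetryAsync(3, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)))
            .WrapAsync(
                // Inner policy: open the circuit for 30s after 5 consecutive failures.
                Policy.Handle<HttpRequestException>()
                    .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30)));

    private readonly HttpClient _httpClient;

    public PaymentService(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<bool> ProcessPaymentAsync(PaymentRequest request)
    {
        return await ResiliencyPolicy.ExecuteAsync(async () =>
        {
            var response = await _httpClient.PostAsJsonAsync("https://api.paymentgateway.com/pay", request);
            // Throws HttpRequestException on a non-success status, which triggers the policies.
            response.EnsureSuccessStatusCode();
            return true;
        });
    }
}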
3. Register Service in Dependency Injection (DI)
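// PaymentService already wraps its call in the retry and circuit breaker policy,
// so the typed client is registered without an additional policy handler.
builder.Services.AddHttpClient<IPaymentService, PaymentService>();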
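The solution lists a fallback that notifies users of failures, but the implementation above lets the final exception propagate once retries are exhausted. One way to add a fallback at the call site is sketched below using only the base class library. WithFallbackAsync and FailingPaymentAsync are hypothetical names, and FailingPaymentAsync simply stands in for a ProcessPaymentAsync call whose retries have all failed.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: run the action and return a fallback value
// instead of letting a transient failure surface as an exception.
static async Task<T> WithFallbackAsync<T>(Func<Task<T>> action, T fallback)
{
    try
    {
        return await action();
    }
    catch (HttpRequestException ex)
    {
        Console.WriteLine($"Payment attempt failed: {ex.Message}");
        return fallback;
    }
}

// Stand-in for IPaymentService.ProcessPaymentAsync after retries are exhausted.
Task<bool> FailingPaymentAsync() =>
    Task.FromException<bool>(new HttpRequestException("gateway unavailable"));

bool paid = await WithFallbackAsync(() => FailingPaymentAsync(), fallback: false);
Console.WriteLine(paid
    ? "Payment accepted."
    : "We could not process your payment right now. Please try again shortly.");
```

In a real service the catch clause would likely also need to cover Polly's BrokenCircuitException, which is thrown while the circuit is open.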
Conclusion
Resiliency is a fundamental requirement in modern applications, particularly in distributed systems. .NET Aspire, in combination with Polly, provides a powerful toolkit to handle transient failures effectively. By implementing retry, circuit breaker, and fallback strategies, developers can build reliable and fault-tolerant applications.
This chapter has covered key resiliency concepts, configurations, and a real-world case study to demonstrate how to handle failures in an e-commerce system efficiently.