Friday, November 29, 2024

Microservices - Resiliency

In modern software systems, resiliency is a critical aspect that ensures the system remains functional and responsive even in the face of failures or unexpected conditions. Several design patterns can be employed to enhance the resiliency of a system. Below are some key patterns: 

Timeout Pattern

The Timeout pattern sets a maximum time limit on every downstream call so that the caller fails fast. If a call does not complete within the limit, it is aborted. This prevents the system from waiting indefinitely for a response, which can lead to resource exhaustion and degraded performance. However, a timeout alone is not enough: if many calls are in flight to a slow dependency, each one still holds a thread or connection until its timeout expires, so resources can still run out. This is where the Circuit Breaker pattern comes into play.
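
As a rough illustration, the idea can be sketched in Python with the standard library's concurrent.futures module. The names call_with_timeout and slow_service are made up for this example, and the pool size of 4 is an arbitrary assumption. Note that a Python thread cannot be forcibly stopped, so this sketch only stops waiting for the call; a real HTTP or database client would normally take a timeout parameter of its own.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared worker pool for downstream calls (size is an arbitrary assumption).
_executor = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(func, timeout_seconds, *args, **kwargs):
    """Run func in a worker thread and give up waiting after timeout_seconds."""
    future = _executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # Best effort only: cancel() has no effect once the call is running,
        # so the worker thread finishes in the background.
        future.cancel()
        raise TimeoutError(f"call did not complete within {timeout_seconds}s")


def slow_service():
    """Hypothetical downstream call that takes too long."""
    time.sleep(0.3)
    return "late response"
```

Calling call_with_timeout(slow_service, 0.05) raises TimeoutError after roughly 50 ms instead of blocking the caller for the full 300 ms.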

Circuit Breaker Pattern

The Circuit Breaker pattern is often used in conjunction with the Timeout pattern. It involves implementing a circuit breaker component that tracks all outgoing calls from the service. If the component observes that failures exceed a configured threshold, it transitions to an "open" state. In this state, the circuit breaker immediately responds with an error message or a default response, without attempting to make the call. This prevents the system from being overwhelmed by repeated failures and helps maintain overall system health by avoiding cascading failures. After a waiting period, the circuit breaker moves to a "half-open" state, where it allows a few calls to pass through to check if the downstream services have recovered. If the calls succeed, the circuit breaker transitions back to the "closed" state, allowing normal operation to resume. If they fail, it returns to the "open" state and waits again.
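
The three states can be sketched as a small Python class. This is a minimal illustration, not a production implementation: the class and parameter names (CircuitBreaker, failure_threshold, reset_timeout) are assumptions for this example, it counts consecutive failures rather than a failure rate, it lets a single trial call through in the half-open state rather than a few, and it is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before a trial call
        self.clock = clock                          # injectable so tests can control time
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast without touching the downstream service.
                raise CircuitOpenError("circuit is open; call skipped")
            # Waiting period is over: allow one trial call.
            self.state = "half-open"
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed trial call, or too many failures in a row, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _on_success(self):
        self.failures = 0
        self.state = "closed"
```

A caller would wrap each outgoing request, for example breaker.call(fetch_profile, user_id), where fetch_profile stands in for any downstream call, and catch CircuitOpenError to return a default response.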

Retry Pattern

The Retry pattern handles transient failures, such as a brief network glitch or a momentarily overloaded service. It retries a failed operation a limited number of times before giving up. This lets services self-correct and is particularly effective when combined with the Circuit Breaker and Timeout patterns: the timeout bounds each attempt, and the circuit breaker stops retries from piling onto a dependency that is already failing. By retrying operations, the system can recover from temporary issues without manual intervention.
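
A minimal Python sketch of a retry helper follows. The post does not say how long to wait between attempts; exponential backoff with random jitter is a common choice and is used here as an assumption. The function name retry, its parameters, and the choice of which exceptions count as transient are all hypothetical.

```python
import random
import time


def retry(func, max_attempts=3, base_delay=0.1,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func, retrying on the given exceptions up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff (base_delay, 2x, 4x, ...) plus random jitter,
            # so that many clients do not all retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

Only errors listed in retry_on are retried; anything else, such as a validation error, propagates immediately, since repeating the call would not help.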

Bulkhead Pattern

The Bulkhead pattern involves separating services by their criticality and functionality, so that a failure or overload in one partition cannot exhaust the resources of another. High-criticality services are allocated more resources to ensure their availability. This separation makes it easier to manage and segregate execution. Additionally, load balancing can be used to distribute the load across multiple instances of a service. Load shedding can also be employed, where an overloaded instance rejects excess requests early instead of queuing them, and the load balancer routes traffic to less busy instances. This makes retries more likely to succeed because they avoid overloaded instances.
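
One simple way to picture a bulkhead inside a single service is a cap on concurrent calls per dependency. The sketch below uses a semaphore from Python's standard library; the Bulkhead class, the partition names, and the slot counts are assumptions chosen for illustration. Rejecting a call when the partition is full is also a basic form of load shedding.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a partition has no free slots and the call is shed."""


class Bulkhead:
    def __init__(self, name, max_concurrent):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Non-blocking acquire: reject immediately rather than queue the request.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"bulkhead '{self.name}' is full")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


# Hypothetical partitions: the critical dependency gets more capacity.
payments = Bulkhead("payments", max_concurrent=10)
recommendations = Bulkhead("recommendations", max_concurrent=2)
```

With this layout, a slow recommendations service can tie up at most two callers at a time, and payment calls keep their own ten slots regardless.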

Caching Pattern

The Caching pattern involves storing responses for frequently requested data to reduce the load on the system. By caching this data, the system can serve responses faster and reduce the number of calls to downstream services. This not only improves performance but also enhances resiliency by reducing the dependency on external services.
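
A small time-to-live cache in Python shows the idea. The TTLCache class and its get_or_fetch method are hypothetical names for this sketch. Serving an expired entry when the downstream call fails is an extra assumption added here to illustrate the resiliency benefit; it is only appropriate when slightly stale data is acceptable.

```python
import time


class TTLCache:
    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so tests can control time
        self._store = {}     # key -> (value, stored_at)

    def get_or_fetch(self, key, fetch):
        """Return a fresh cached value, or call fetch() and cache its result."""
        entry = self._store.get(key)
        if entry is not None and self.clock() - entry[1] < self.ttl:
            return entry[0]  # fresh hit: no downstream call at all
        try:
            value = fetch()
        except Exception:
            if entry is not None:
                return entry[0]  # downstream failed: fall back to the stale value
            raise                # nothing cached, so the error must surface
        self._store[key] = (value, self.clock())
        return value
```

Here fetch is any zero-argument function that performs the real downstream call, for example a lambda wrapping an HTTP request.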

 
