By: G. Meszaros
Published in: PLoPD2
Category: Reactive and Real-Time Systems, Telecommunications
Summary: Improve the capacity and reliability of real-time reactive systems.
To improve the capacity of a reactive system whose capacity is limited, learn what determines that capacity and what affects it. Optimize the elements that truly limit system capacity. Engineer the system so that these limits can be avoided automatically, or ensure the system can withstand circumstances in which the demands on its resources are exceeded.
In a reactive system where capacity is limited by processing power and work arrives in a stochastic fashion, apply the following. If a small number of request types account for a large part of the processing cost, use Optimize High-Runner Cases. If a large amount of processing capacity must be kept in reserve to handle peak loads, use Shed Load. Further tune the capacity at the expense of latency with Max Headroom. If the necessary increase in capacity exceeds what can be recovered from these techniques, use Share the Load.
In a reactive system with limited processing power, the average processing cost of service requests can be too high. In most systems, 80% or more of the processing power is consumed by 20% of the use cases. Measure or project the high-runner transactions and optimize only those that contribute significantly to the cost.
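As a rough illustration of picking out high-runner transactions, the sketch below ranks request types by aggregate cost and keeps the smallest set that covers a chosen share of the total. The function name, the 0.8 default, and the sample figures are all hypothetical; real numbers would come from measurement or projection.

```python
def high_runners(profile, share=0.8):
    """Return the request types that together cover `share` of total cost.

    `profile` maps request type -> (requests per hour, cost per request).
    """
    totals = {name: count * cost for name, (count, cost) in profile.items()}
    grand_total = sum(totals.values())
    selected, running = [], 0.0
    # Walk request types from most to least expensive in aggregate.
    for name, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        if running >= share * grand_total:
            break
        selected.append(name)
        running += total
    return selected


# Hypothetical measurements: (requests per hour, CPU milliseconds per request).
profile = {
    "basic_call": (9000, 10),
    "call_forward": (500, 20),
    "conference": (50, 40),
    "audit": (10, 100),
}

if __name__ == "__main__":
    print(high_runners(profile))
```

Only the types returned are candidates for optimization; the rest contribute too little to the total cost to repay the effort.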
In a processing-bound reactive system with more requests than it can handle, overloads must not cause it to become unavailable. Large numbers of requests can cause the system to thrash. In extreme situations, the entire system can crash. Use triage to shed some requests so that others can be served properly. Requests that cannot be handled properly should be shed before processing is wasted on them. Any request that will take longer than a specified time-out can be shed without any visible change in system behavior.
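One possible reading of the time-out rule is sketched below: before spending any processing on a request, estimate when it would finish and shed it if that is already past the time-out. The fixed `service_time`, the tuple layout, and the function name are assumptions made for the example, not part of the pattern.

```python
def triage(requests, now, timeout, service_time):
    """Split queued requests into (served, shed) lists of request ids.

    `requests` is a list of (request_id, arrival_time) in arrival order.
    A request is shed when its projected completion would exceed `timeout`
    seconds after it arrived, so no processing is wasted on it.
    """
    served, shed = [], []
    finish = now  # time at which the processor becomes free
    for request_id, arrived in requests:
        projected_delay = (finish + service_time) - arrived
        if projected_delay > timeout:
            shed.append(request_id)
        else:
            served.append(request_id)
            finish += service_time
    return served, shed
```

Because a shed request would have timed out anyway, the requester sees the same behavior it would have seen without shedding.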
In a reactive system where user requests are dependent on one another, which requests should be accepted and which should be rejected to improve system throughput? Categorize requests into new and progress categories. Process all progress requests before new requests.
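A minimal sketch of the two-category rule, assuming the caller can tell whether a request continues work already started. The class and method names are illustrative only.

```python
from collections import deque


class ProgressFirstScheduler:
    """Serve every queued progress request before any new request."""

    def __init__(self):
        self._progress = deque()
        self._new = deque()

    def submit(self, request, in_progress):
        # `in_progress` is True when the request continues work already begun.
        target = self._progress if in_progress else self._new
        target.append(request)

    def next_request(self):
        # New requests are only considered once no progress work is waiting.
        if self._progress:
            return self._progress.popleft()
        if self._new:
            return self._new.popleft()
        return None
```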
In a reactive system that can't satisfy all requests, to maximize the number of customers who get good service, give good service to as many customers as possible and give the remainder poor or no service. Put all new requests in a LIFO queue. Serve the most recently received requests first. Serve stale requests only after all fresh ones have been served.
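The LIFO discipline can be sketched with a plain list used as a stack. The class name is hypothetical; the point is only that the newest request is always taken first, so older requests wait until nothing fresher remains.

```python
class FreshFirstQueue:
    """LIFO queue of new requests: the most recent arrival is served first."""

    def __init__(self):
        self._stack = []

    def put(self, request):
        self._stack.append(request)

    def get(self):
        # Pop from the end, so stale requests are reached only after
        # every fresher request has been served.
        return self._stack.pop() if self._stack else None

    def __len__(self):
        return len(self._stack)
```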
You're using Shed Load and Fresh Work Before Stale. Some users are giving up because their requests are not served immediately. This creates "undo" work the system cannot really afford. When related requests are received, pair them up. If the second request is a cancel, then the work request can be removed before any processing is done.
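The pairing of a cancel with its pending request might look like the sketch below. It assumes each request can be keyed by the user who issued it and that at most one request per user is pending; both are simplifications for the example, and the names are hypothetical.

```python
class PendingWork:
    """Holds unstarted requests so a later cancel can remove them cheaply."""

    def __init__(self):
        # dict keeps insertion order, which lets take() serve newest first.
        self._pending = {}

    def request(self, user, work):
        self._pending[user] = work

    def cancel(self, user):
        """Drop the user's pending request; True if one was removed."""
        if user in self._pending:
            del self._pending[user]
            return True
        # Work already started (or never queued): a real undo is needed.
        return False

    def take(self):
        """Return the most recent (user, work) pair, or None if empty."""
        if not self._pending:
            return None
        return self._pending.popitem()
```

A cancel that finds its partner costs one dictionary lookup, where processing the request and then undoing it would have cost two full transactions.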
In a processing-bound reactive system that can't meet its target capacity even though all means of increasing processing capacity have been exhausted, the cost of processing all the requests at the specified system capacity may exceed the capacity of a single processor. To increase available processing power, shift some of the processing to another processor. Move functions that are clearly partitionable from those left behind.
You're using Shed Load. To shed work at a minimum additional cost, move the detection of new work as close to the periphery of the system as possible. Give this part of the system information about the available processing capacity of the most limiting part of the system.
In a reactive system, one processor needs to be aware of the overload state of another processor. The bottleneck processor tells other processors when it can accept more work by sending them credits. Each of the other processors holds a leaky bucket of these credits. The buckets gradually leak until they are empty. When a processor sends work to other processors, it must track the credits from each.
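The credit scheme can be sketched from the sending processor's side as follows. Spending one credit per unit of work sent, the `capacity` cap, and passing the clock in as `now` are assumptions made to keep the example small and testable; the pattern itself only states that credits are granted by the bottleneck and leak away.

```python
class CreditBucket:
    """Credits granted by a bottleneck processor, held by a sender."""

    def __init__(self, leak_rate, capacity):
        self.leak_rate = leak_rate  # credits lost per second (assumed unit)
        self.capacity = capacity    # assumed upper bound on held credits
        self.credits = 0.0
        self._last = 0.0

    def _leak(self, now):
        # Credits drain over time until the bucket is empty.
        elapsed = now - self._last
        self.credits = max(0.0, self.credits - elapsed * self.leak_rate)
        self._last = now

    def grant(self, amount, now):
        """Called when the bottleneck processor sends more credits."""
        self._leak(now)
        self.credits = min(self.capacity, self.credits + amount)

    def try_send(self, now):
        """Spend one credit if available; False means hold or shed the work."""
        self._leak(now)
        if self.credits >= 1.0:
            self.credits -= 1.0
            return True
        return False
```

If the bottleneck processor stops sending credits, each bucket empties on its own and the senders stop forwarding work without any explicit overload message.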