Capacity Planning
Capacity Planning is measuring the performance of a service and working out the limits of it’s operation. From this information you can estimate if a service will be able to handle the amount of traffic you expect it to receive.
Why is it Important?
To predict in advance if your service will be able to serve all the customers you need it to serve
To be able to plan and justify performance work
To be able to plan for auto scaling
To be able to configure better monitoring and alerting for your service
To understand failure conditions of your service
Tips and Tricks
When testing performance of your application, you should test ALL OF (in order of importance):
Each function of your application in isolation
Many functions at once based upon an estimate of real-world traffic
Randomized testing or “Fuzzing” of different functions
We are capable of mirroring traffic for services which use the Gateway, this may be a good method of testing real traffic against unreleased versions of some services. Please talk to the Architect team if you would like to discuss this possibility!
Testing in Isolation
You should have a way to performance test each function of your application in isolation. That means if your application has multiple different API endpoints, each one should be tested as independently as possible. The purpose of this is to ensure that you have a known baseline of performance. This follows the scientific practice of testing variables in isolation.
Testing based upon Real World Estimates
You should be able to performance test your application in some estimate of a real world scenario. This means you should measure what sort of requests your application will receive and try to mimic those requests in a test environment. As these numbers will change over time, the test should be updated periodically to ensure the testing is still valid.
While testing in isolation would ideally be enough, there are often complex interactions between different parts of an application. This testing intends to detect if any of these complex interactions exist under expected conditions.
Randomized Testing (“Fuzzing”)
Common test scenarios tend to be based upon ideal conditions. Randomized Testing intends to measure if the performance is still consistent (or at least meeting expectations) with less-than-ideal input. This is particularly important for services which receive user traffic, as you cannot always guarantee that your application will receive good input.
In certain scenarios, bad input can cause an applications performance to be significantly worse. Applications involving caches are a common example of this, where bad input can evict important data from a cache and cause the performance of other requests to be degraded.