Going through a course for my AWS Solutions Architect Associate cert. Learned about planning for fault tolerance with EC2 instances.

Apparently the trick is to think of the worst-case failure. For example, say you have three availability zones (AZs), and three instances MUST be running at all times (100% availability).

Think of the worst-case failure, i.e. the AZ with the most EC2 instances fails. Then come up with a setup that still has three instances running despite the failure. That leads to setups like:

In either of these cases, even if the worst-case scenario happens, you’re still meeting your SLA.
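
The worst-case check itself is simple arithmetic, so here is a rough Python sketch of it. The function name `survives_worst_case` and the example layouts (2/2/2, 3/3/0, 1/1/1) are my own illustrations, not something from the course, and it assumes only one AZ fails at a time.

```python
def survives_worst_case(instances_per_az, required):
    """Return True if at least `required` instances remain after the
    AZ holding the most instances fails (single-AZ failure assumed)."""
    if not instances_per_az:
        # No AZs means no instances; only a requirement of zero is met.
        return required <= 0
    # Worst case: lose the AZ with the largest instance count.
    remaining = sum(instances_per_az) - max(instances_per_az)
    return remaining >= required


# Hypothetical layouts across three AZs, for illustration only.
print(survives_worst_case([2, 2, 2], required=3))  # 4 remain after losing an AZ
print(survives_worst_case([3, 3, 0], required=3))  # 3 remain after losing an AZ
print(survives_worst_case([1, 1, 1], required=3))  # only 2 remain
```

The first two layouts pass the check and the third does not, which shows why one instance per AZ is not enough when you need three running at all times.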