In this article, we will look at scalability, high availability, and how load balancers help in designing highly scalable and highly available applications.
Scalability: the ability of an application/system to handle greater loads by adapting. The two types of scalability are Vertical Scalability and Horizontal Scalability.
Vertical Scalability:
- Scaling by increasing the size of the EC2 instance.
- For example, suppose our application runs on a t2.micro EC2 instance. Scaling it vertically means running the app on a larger instance type such as t2.large.
- Vertical scalability is common for non-distributed systems such as databases (RDS, ElastiCache, etc.).
- However, there is always a limit to how much we can vertically scale (in terms of hardware configurations).
- Vertical scaling is generally referred to as SCALE UP / SCALE DOWN.
Horizontal Scalability:
- Scaling by increasing the number of instances/systems running our application.
- Horizontal scaling implies a distributed system.
- Common for web/mobile-based applications.
- Horizontal scaling is generally referred to as SCALE OUT / SCALE IN.
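To make the scale out / scale in terminology concrete, here is a minimal sketch of the arithmetic behind horizontal scaling. The function names and the numbers (requests per second, per-instance capacity) are assumptions made up for illustration; they are not AWS APIs or measured values.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float) -> int:
    """Number of identical instances needed to cover the peak load."""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    # Always keep at least one instance running.
    return max(1, math.ceil(peak_rps / rps_per_instance))


def scaling_action(current: int, desired: int) -> str:
    """Describe the horizontal scaling step from current to desired count."""
    if desired > current:
        return f"scale out by {desired - current}"
    if desired < current:
        return f"scale in by {current - desired}"
    return "no change"


# Hypothetical numbers: each instance handles about 200 requests/second.
desired = instances_needed(peak_rps=900, rps_per_instance=200)
print(desired)                      # 5
print(scaling_action(2, desired))   # scale out by 3
print(scaling_action(5, 1))         # scale in by 4
```

Vertical scaling would instead keep the instance count at one and raise the per-instance capacity, which is where the hardware limit mentioned above comes in.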
High Availability (HA):
- Goes hand in hand with horizontal scaling.
- High availability means running our application/system in at least 2 data centers (in different Availability Zones).
- The goal is to survive the loss of a data center.
- HA can be passive (e.g., RDS Multi-AZ): one instance of the DB is active and the other acts as a standby.
- HA is active in the case of horizontal scaling: all instances are up and running across data centers/Availability Zones.
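The goal of surviving a data center loss can be illustrated with a small sketch. It spreads instances across Availability Zones and counts how many remain if one zone goes down. The helper names are hypothetical, and the zone names are only example values.

```python
from collections import Counter


def spread_across_azs(instance_count: int, azs: list[str]) -> list[str]:
    """Assign each instance to an AZ in turn; returns one AZ name per instance."""
    if not azs:
        raise ValueError("at least one availability zone is required")
    return [azs[i % len(azs)] for i in range(instance_count)]


def survivors_after_az_loss(placement: list[str], lost_az: str) -> int:
    """Count the instances still running when lost_az becomes unavailable."""
    return sum(1 for az in placement if az != lost_az)


# Example zone names; any two or more zones in a region would do.
placement = spread_across_azs(4, ["us-east-1a", "us-east-1b"])
print(Counter(placement))
print(survivors_after_az_loss(placement, "us-east-1a"))  # 2
```

With all four instances in a single zone, the same loss would leave zero survivors, which is why at least two zones are needed.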
Why Use Elastic Load Balancers (ELB):
- Load Balancers are servers that forward internet traffic to multiple servers (EC2 Instances) downstream.
- Load Balancers spread traffic across multiple downstream instances and expose a single point of access (via DNS) to our applications.
- Seamlessly handle failures of downstream instances by running regular health checks against our EC2 Instances.
- Provide SSL Termination (HTTPS) for our application.
- Provide High Availability across multiple AZs.
- Separate public traffic (traffic from users to load balancers) from private traffic (traffic from load balancers to EC2 Instances).
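The idea of spreading traffic across downstream instances while skipping failed ones can be sketched as a toy round-robin balancer. This is only an illustration of one common strategy, not how ELB is implemented internally; the class name and the private IP addresses are made up for the example.

```python
class RoundRobinBalancer:
    """Toy balancer: hands out targets in turn, skipping unhealthy ones."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)  # all targets start out healthy
        self._index = 0

    def mark(self, target, is_healthy):
        """Record the result of a health check for one target."""
        if is_healthy:
            self.healthy.add(target)
        else:
            self.healthy.discard(target)

    def next_target(self):
        """Return the next healthy target, or raise if none is available."""
        for _ in range(len(self.targets)):
            target = self.targets[self._index]
            self._index = (self._index + 1) % len(self.targets)
            if target in self.healthy:
                return target
        raise RuntimeError("no healthy targets registered")


# Hypothetical private IPs of three EC2 instances behind the balancer.
lb = RoundRobinBalancer(["10.0.1.10", "10.0.2.10", "10.0.3.10"])
first_four = [lb.next_target() for _ in range(4)]
print(first_four)  # ['10.0.1.10', '10.0.2.10', '10.0.3.10', '10.0.1.10']

lb.mark("10.0.2.10", False)  # a failed health check takes it out of rotation
after_failure = [lb.next_target() for _ in range(3)]
print(after_failure)  # ['10.0.3.10', '10.0.1.10', '10.0.3.10']
```

Clients only ever talk to `lb`, which plays the role of the single point of access; the individual instances stay hidden behind it.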
Health Checks:
- Health checks are crucial for load balancers. They let the load balancer know whether an EC2 instance it forwards traffic to is available to respond to requests.
- A health check is done on a port and a route (/health is a commonly used route).
- If the response is not 200 (OK), the instance is marked unhealthy and the load balancer stops sending traffic to that instance.
- The frequency of health checks is configurable, generally in the range of 5 seconds to a minute.
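A health check of this kind can be sketched in a few lines. The snippet below is a minimal sketch under stated assumptions: `http_probe` and `healthy_targets` are hypothetical helper names, the check is plain HTTP on port 80 against `/health`, and the repeat interval is left out (a real load balancer would re-run the probe on its configured schedule).

```python
import urllib.error
import urllib.request


def http_probe(host, port=80, path="/health", timeout=2.0):
    """Return the HTTP status code of the health route, or None if unreachable."""
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500 or 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def healthy_targets(targets, probe=http_probe):
    """Keep only the targets whose health route answers with 200."""
    return [target for target in targets if probe(target) == 200]


# Simulated probe results so the example runs without real servers.
fake_status = {"10.0.1.10": 200, "10.0.2.10": 503, "10.0.3.10": None}
print(healthy_targets(list(fake_status), probe=fake_status.get))  # ['10.0.1.10']
```

Passing the probe in as a parameter keeps the decision rule (200 means healthy, anything else means unhealthy) separate from the network call, which makes it easy to try out without real instances.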
Types of Load Balancers:
- Classic Load Balancer (CLB): V1 generation (launched around 2009); supports HTTP, HTTPS, and TCP.
- Network Load Balancer (NLB): V2 generation (launched around 2017); supports TCP, TLS (secure TCP), and UDP.
- Application Load Balancer (ALB): V2 generation (launched around 2016); supports HTTP, HTTPS, and WebSocket.
- Gateway Load Balancer (GWLB): V2 generation (launched around 2020); operates at the network layer (layer 3).
Load Balancers (ELB) can be set up as internal (private network) or external (public network). Security groups need to be set up properly (as below) so that clients can access only the load balancer and not the EC2 Instances directly.
The Load Balancer Security Group has an inbound rule allowing HTTP/HTTPS from anywhere (or from a wide range of IPs), so that clients/consumers can reach the load balancer in front of our application.
The Application Security Group (on the EC2 Instances) has an inbound rule allowing HTTP/HTTPS only from the load balancer in front of those instances, typically by using the load balancer's security group as the source.
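The two rule sets can be written down as data. The sketch below builds rule dictionaries in the shape boto3 expects for the `IpPermissions` argument of `authorize_security_group_ingress`; the helper names and the security group ID are placeholders made up for this example, and nothing here calls AWS.

```python
def lb_ingress_rules(ports=(80, 443), cidr="0.0.0.0/0"):
    """Inbound rules for the load balancer: HTTP/HTTPS from a CIDR range."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr}],
        }
        for port in ports
    ]


def app_ingress_rules(lb_security_group_id, ports=(80, 443)):
    """Inbound rules for the EC2 instances: only the load balancer's group."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            # The source is a security group, not an IP range.
            "UserIdGroupPairs": [{"GroupId": lb_security_group_id}],
        }
        for port in ports
    ]


# Placeholder ID; a real one comes from the load balancer's security group.
LB_SG_ID = "sg-0123456789abcdef0"
lb_rules = lb_ingress_rules()
app_rules = app_ingress_rules(LB_SG_ID)
print(lb_rules[0])
print(app_rules[0])
```

Because the application rules name a security group rather than an IP range, they keep working when the load balancer's IP addresses change, and direct client traffic to the instances is still rejected.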
Troubleshooting Load Balancers:
- 4xx errors are client-induced errors.
- 5xx errors are application-induced errors.
- LB Error 503 means the load balancer is at full capacity or does not have any registered targets yet.
- If the load balancer can't connect to our application, we should verify the security group configuration.
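As a quick reference, the error classes above can be expressed as a small helper. The function name and the hint texts are illustrative only and simply restate the points in the list.

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code to who most likely caused the error."""
    if code == 503:
        return "server error: load balancer at capacity or no registered targets"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    return "not an error"


for status in (200, 404, 500, 503):
    print(status, "->", classify_status(status))
```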
Hope you enjoyed this article!