How Load Balancing Works - Stabilizing Services Through Traffic Distribution


What is Load Balancing

Load balancing is a technology that distributes traffic across multiple servers to improve overall system availability and performance.

Why it’s needed: Essential for handling large volumes of requests that a single server cannot process, and for maintaining service continuity during server failures.

Basic Load Balancer Configuration

```mermaid
flowchart TB
    LB["Load Balancer"] --> S1["Server 1"]
    LB --> S2["Server 2"]
    LB --> S3["Server 3"]
```

Clients access the load balancer’s IP address, and the load balancer distributes requests to appropriate servers.

L4 and L7 Load Balancing

L4 (Transport Layer) Load Balancing

Performs load distribution at the TCP/UDP level.

  • Operation: Routes by IP address and port number
  • Advantages: Fast, low overhead
  • Disadvantages: Cannot inspect HTTP content
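
To make the distinction concrete, here is a minimal sketch of L4-style forwarding in Python: the balancer picks a backend purely by address and splices raw bytes in both directions, never parsing them. The backend addresses and listening port are hypothetical.

```python
import socket
import threading

# Hypothetical backend pool: at L4 the balancer sees only addresses
# and ports, never the meaning of the bytes (HTTP paths, headers...).
BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)]
_next = 0

def pick_backend():
    """Round-robin over (ip, port) pairs."""
    global _next
    backend = BACKENDS[_next % len(BACKENDS)]
    _next += 1
    return backend

def pipe(src, dst):
    """Copy raw bytes one way until either side closes."""
    try:
        while data := src.recv(4096):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        dst.close()

def handle(client):
    upstream = socket.create_connection(pick_backend())
    # Splice bytes in both directions without inspecting them.
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 8000))
listener.listen()
while True:
    conn, _ = listener.accept()
    handle(conn)
```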

L7 (Application Layer) Load Balancing

Performs load distribution at the HTTP level.

  • Operation: Routes by URL, headers, cookies, etc.
  • Advantages: Flexible routing
  • Disadvantages: Higher processing load

For example, an L7 balancer can route requests by path:

| Path | Destination |
|------|-------------|
| /api/* | API server cluster |
| /static/* | Static file servers |
| /admin/* | Admin panel servers |
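
A sketch of how that path-based routing might look in code, using hypothetical pool names; a real L7 balancer parses the path out of the HTTP request before this step.

```python
# Hypothetical upstream pools mirroring the routing table above.
ROUTES = {
    "/api/":    ["api-1:8080", "api-2:8080"],
    "/static/": ["static-1:8080"],
    "/admin/":  ["admin-1:8080"],
}
DEFAULT_POOL = ["web-1:8080"]

def route(path: str) -> list:
    """Pick an upstream pool by the longest matching path prefix."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    return ROUTES[max(matches, key=len)] if matches else DEFAULT_POOL

route("/api/users")   # -> ["api-1:8080", "api-2:8080"]
route("/index.html")  # -> ["web-1:8080"]
```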

Major Load Balancing Algorithms

Round Robin

Distributes requests to each server in sequence.

| Request | Server |
|---------|--------|
| Request 1 | Server A |
| Request 2 | Server B |
| Request 3 | Server C |
| Request 4 | Server A (repeats) |

Best for: When server performance is uniform
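
A minimal round-robin selector that reproduces the table above:

```python
from itertools import cycle

servers = ["Server A", "Server B", "Server C"]
rotation = cycle(servers)  # endless A -> B -> C -> A -> ...

for i in range(1, 5):
    print(f"Request {i} -> {next(rotation)}")
# Request 1 -> Server A ... Request 4 -> Server A (repeats)
```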

Weighted Round Robin

Adjusts distribution ratio based on server performance.

| Server | Weight | Requests |
|--------|--------|----------|
| Server A | 3 | 3 requests |
| Server B | 2 | 2 requests |
| Server C | 1 | 1 request |
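
One naive way to sketch this is to repeat each server according to its weight; real balancers typically interleave picks more smoothly, but the long-run ratio is the same 3:2:1.

```python
from itertools import cycle

weights = {"Server A": 3, "Server B": 2, "Server C": 1}

# Expand each server into the rotation once per unit of weight.
expanded = [server for server, w in weights.items() for _ in range(w)]
rotation = cycle(expanded)  # A, A, A, B, B, C, A, A, A, ...

for i in range(1, 7):
    print(f"Request {i} -> {next(rotation)}")
```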

Least Connections

Routes to the server with the fewest current connections.

| Server | Connections | Status |
|--------|-------------|--------|
| Server A | 10 active | |
| Server B | 5 active | ← Next request |
| Server C | 8 active | |

Best for: When processing times vary
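
A sketch of the selection step. The counts are an assumed snapshot; a real balancer updates them as connections open and close.

```python
# Current in-flight connections per server (assumed snapshot).
connections = {"Server A": 10, "Server B": 5, "Server C": 8}

def pick_least_connections() -> str:
    """Choose the server with the fewest active connections."""
    return min(connections, key=connections.get)

chosen = pick_least_connections()  # "Server B"
connections[chosen] += 1           # the new request now counts against it
```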

IP Hash

Determines the target server from a hash of the client's IP address.

| Client IP | Hash Result | Server |
|-----------|-------------|--------|
| 192.168.1.100 | hash() % 3 = 1 | Server B |
| 192.168.1.101 | hash() % 3 = 0 | Server A |

Best for: When session persistence is needed
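
A sketch of the idea. It deliberately uses a stable hash, because Python's built-in hash() is randomized per process; the concrete server each IP maps to will differ from the (illustrative) table above.

```python
import hashlib

servers = ["Server A", "Server B", "Server C"]

def pick_by_ip(client_ip: str) -> str:
    """The same client IP always maps to the same server, as long as
    the server list itself does not change."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

pick_by_ip("192.168.1.100")  # always returns the same server
```

One caveat: adding or removing a server changes the modulus and remaps most clients, which is why consistent hashing is often used instead.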

Health Checks

Load balancers periodically check server status and automatically exclude failed servers.

Active Health Checks

```mermaid
flowchart LR
    LB["Load Balancer"] -->|GET /health| Server
    Server -->|200 OK| Normal["Normal (include in distribution)"]
    Server -->|5xx/Timeout| Abnormal["Abnormal (exclude)"]
```
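
A minimal active-checker sketch, assuming each server exposes GET /health and a 10-second probe interval (both are assumptions).

```python
import time
import urllib.request

# Hypothetical servers; True means "include in distribution".
servers = {
    "http://10.0.0.1:8080": True,
    "http://10.0.0.2:8080": True,
}

def check(base_url: str) -> bool:
    """One active probe: GET /health, healthy only on HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=2) as resp:
            return resp.status == 200
    except Exception:  # 5xx raises HTTPError; timeouts raise as well
        return False

while True:
    for url in servers:
        servers[url] = check(url)
    time.sleep(10)
```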

Passive Health Checks

Monitors success/failure of actual requests.

| Event | Action |
|-------|--------|
| 5 consecutive failures | Exclude server |
| After 30 seconds | Retry distribution |
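
The same policy, sketched as two small functions; the thresholds come from the table, everything else is an assumption.

```python
import time

FAIL_LIMIT = 5       # consecutive failures before exclusion
RETRY_AFTER = 30.0   # seconds until the server is tried again

failures = {}        # server -> consecutive failure count
excluded_until = {}  # server -> monotonic timestamp

def record_result(server: str, ok: bool) -> None:
    """Update a server's status from the outcome of a real request."""
    if ok:
        failures[server] = 0
        return
    failures[server] = failures.get(server, 0) + 1
    if failures[server] >= FAIL_LIMIT:
        excluded_until[server] = time.monotonic() + RETRY_AFTER

def is_available(server: str) -> bool:
    """Excluded servers re-enter rotation once the retry window passes."""
    return time.monotonic() >= excluded_until.get(server, 0.0)
```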

Session Persistence (Sticky Sessions)

A mechanism to keep sending the same user’s requests to the same server.

Implementation Methods

| Method | Description |
|--------|-------------|
| Cookie | Store the server ID in a cookie |
| IP Address | Pin clients to servers by source IP |
| URL Parameter | Carry the session ID in the URL |
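
As a rough sketch of the cookie method (the cookie name lb_server and the pool names are hypothetical):

```python
import random

servers = ["app-1", "app-2", "app-3"]

def choose_server(cookies: dict) -> tuple:
    """Return (server, needs_cookie). Reuse the pinned server if the
    cookie names one still in the pool; otherwise pin a new one."""
    pinned = cookies.get("lb_server")
    if pinned in servers:
        return pinned, False
    return random.choice(servers), True

choose_server({})                      # e.g. ("app-2", True)
choose_server({"lb_server": "app-2"})  # ("app-2", False)
```

When needs_cookie is true, the balancer would add a Set-Cookie: lb_server=... header to the response so that later requests carry the pin.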

Considerations

Sticky sessions can cause uneven load distribution. When possible, manage session information in external stores (like Redis) and aim for a stateless design.

Major Load Balancing Software

| Name | Features |
|------|----------|
| Nginx | High performance, reverse proxy functionality |
| HAProxy | High availability, rich features |
| Envoy | Cloud-native, service mesh support |

Cloud Services

| Service | Provider |
|---------|----------|
| ALB/NLB | AWS |
| Cloud Load Balancing | Google Cloud |
| Azure Load Balancer | Microsoft Azure |

High Availability Configuration

Redundancy of the load balancer itself is also important.

```mermaid
flowchart TB
    VIP["Virtual IP<br/>(VIP/Floating IP)"]
    VIP --> Active["LB (Active)"]
    VIP --> Standby["LB (Standby)"]
    Active <-->|monitor| Standby
    Active --> Servers["Server cluster"]
```

In an Active-Standby configuration, when the active load balancer fails, the standby automatically takes over.
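
Real deployments usually implement this with VRRP (e.g., keepalived) rather than hand-rolled code, but a conceptual sketch of the standby's monitoring loop might look like this; the heartbeat address and timeout are assumptions.

```python
import socket
import time

ACTIVE_ADDR = ("10.0.0.10", 9999)  # hypothetical heartbeat endpoint
TIMEOUT = 3                        # seconds without a reply => failover

def active_is_alive() -> bool:
    try:
        with socket.create_connection(ACTIVE_ADDR, timeout=TIMEOUT):
            return True
    except OSError:
        return False

while True:
    if not active_is_alive():
        # Placeholder: in practice the standby claims the virtual IP,
        # e.g., keepalived announces the VIP with a gratuitous ARP.
        print("Active LB unreachable: standby taking over the VIP")
        break
    time.sleep(1)
```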

Summary

Load balancing is an essential technology for modern web services. By selecting an appropriate algorithm, configuring health checks, and making the load balancer itself redundant, you can achieve stable service operation.
