How Load Balancing Works - Stabilizing Services Through Traffic Distribution


What is Load Balancing

Load balancing is a technology that distributes traffic across multiple servers to improve overall system availability and performance.

Why it’s needed: Essential for handling large volumes of requests that a single server cannot process, and for maintaining service continuity during server failures.

Basic Load Balancer Configuration

```mermaid
flowchart TB
    LB["Load Balancer"] --> S1["Server 1"]
    LB --> S2["Server 2"]
    LB --> S3["Server 3"]
```

Clients access the load balancer’s IP address, and the load balancer distributes requests to appropriate servers.

L4 and L7 Load Balancing

L4 (Transport Layer) Load Balancing

Performs load distribution at the TCP/UDP level.

  • Operation: Routes by IP address and port number
  • Advantages: Fast, low overhead
  • Disadvantages: Cannot inspect HTTP content
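
To make the distinction concrete, here is a minimal sketch of L4-style forwarding in Python: the balancer picks a backend purely by address and splices raw bytes in both directions, never parsing them. The backend addresses and listening port are hypothetical.

```python
import socket
import threading

# Hypothetical backend pool: at L4 the balancer sees only addresses
# and ports, never the meaning of the bytes (HTTP paths, headers...).
BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)]
_next = 0

def pick_backend():
    """Round-robin over (ip, port) pairs."""
    global _next
    backend = BACKENDS[_next % len(BACKENDS)]
    _next += 1
    return backend

def pipe(src, dst):
    """Copy raw bytes one way until either side closes."""
    try:
        while data := src.recv(4096):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        dst.close()

def handle(client):
    upstream = socket.create_connection(pick_backend())
    # Splice bytes in both directions without inspecting them.
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 8000))
listener.listen()
while True:
    conn, _ = listener.accept()
    handle(conn)
```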

L7 (Application Layer) Load Balancing

Performs load distribution at the HTTP level.

  • Operation: Routes by URL, headers, cookies, etc.
  • Advantages: Flexible routing
  • Disadvantages: Higher processing load

For example, an L7 balancer can route requests by path:

| Path | Destination |
|------|-------------|
| /api/* | API server cluster |
| /static/* | Static file servers |
| /admin/* | Admin panel servers |
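
A sketch of how that path-based routing might look in code, using hypothetical pool names; a real L7 balancer parses the path out of the HTTP request before this step.

```python
# Hypothetical upstream pools mirroring the routing table above.
ROUTES = {
    "/api/":    ["api-1:8080", "api-2:8080"],
    "/static/": ["static-1:8080"],
    "/admin/":  ["admin-1:8080"],
}
DEFAULT_POOL = ["web-1:8080"]

def route(path: str) -> list:
    """Pick an upstream pool by the longest matching path prefix."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    return ROUTES[max(matches, key=len)] if matches else DEFAULT_POOL

route("/api/users")   # -> ["api-1:8080", "api-2:8080"]
route("/index.html")  # -> ["web-1:8080"]
```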

Major Load Balancing Algorithms

Round Robin

Distributes requests to each server in sequence.

| Request | Server |
|---------|--------|
| Request 1 | Server A |
| Request 2 | Server B |
| Request 3 | Server C |
| Request 4 | Server A (repeats) |

Best for: When server performance is uniform
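
A minimal round-robin selector that reproduces the table above:

```python
from itertools import cycle

servers = ["Server A", "Server B", "Server C"]
rotation = cycle(servers)  # endless A -> B -> C -> A -> ...

for i in range(1, 5):
    print(f"Request {i} -> {next(rotation)}")
# Request 1 -> Server A ... Request 4 -> Server A (repeats)
```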

Weighted Round Robin

Adjusts distribution ratio based on server performance.

| Server | Weight | Requests |
|--------|--------|----------|
| Server A | 3 | 3 requests |
| Server B | 2 | 2 requests |
| Server C | 1 | 1 request |
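
One naive way to sketch this is to repeat each server according to its weight; real balancers typically interleave picks more smoothly, but the long-run ratio is the same 3:2:1.

```python
from itertools import cycle

weights = {"Server A": 3, "Server B": 2, "Server C": 1}

# Expand each server into the rotation once per unit of weight.
expanded = [server for server, w in weights.items() for _ in range(w)]
rotation = cycle(expanded)  # A, A, A, B, B, C, A, A, A, ...

for i in range(1, 7):
    print(f"Request {i} -> {next(rotation)}")
```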

Least Connections

Routes to the server with the fewest current connections.

| Server | Connections | Status |
|--------|-------------|--------|
| Server A | 10 active | |
| Server B | 5 active | ← Next request |
| Server C | 8 active | |

Best for: When processing times vary
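
A sketch of the selection step. The counts are an assumed snapshot; a real balancer updates them as connections open and close.

```python
# Current in-flight connections per server (assumed snapshot).
connections = {"Server A": 10, "Server B": 5, "Server C": 8}

def pick_least_connections() -> str:
    """Choose the server with the fewest active connections."""
    return min(connections, key=connections.get)

chosen = pick_least_connections()  # "Server B"
connections[chosen] += 1           # the new request now counts against it
```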

IP Hash

Determines the target server from a hash of the client's IP address.

| Client IP | Hash Result | Server |
|-----------|-------------|--------|
| 192.168.1.100 | hash() % 3 = 1 | Server B |
| 192.168.1.101 | hash() % 3 = 0 | Server A |

Best for: When session persistence is needed
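
A sketch of the idea. It deliberately uses a stable hash, because Python's built-in hash() is randomized per process; the concrete server each IP maps to will differ from the (illustrative) table above.

```python
import hashlib

servers = ["Server A", "Server B", "Server C"]

def pick_by_ip(client_ip: str) -> str:
    """The same client IP always maps to the same server, as long as
    the server list itself does not change."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

pick_by_ip("192.168.1.100")  # always returns the same server
```

One caveat: adding or removing a server changes the modulus and remaps most clients, which is why consistent hashing is often used instead.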

Health Checks

Load balancers periodically check server status and automatically exclude failed servers.

Active Health Checks

```mermaid
flowchart LR
    LB["Load Balancer"] -->|GET /health| Server
    Server -->|200 OK| Normal["Normal (include in distribution)"]
    Server -->|5xx/Timeout| Abnormal["Abnormal (exclude)"]
```
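
A minimal active-checker sketch, assuming each server exposes GET /health and a 10-second probe interval (both are assumptions).

```python
import time
import urllib.request

# Hypothetical servers; True means "include in distribution".
servers = {
    "http://10.0.0.1:8080": True,
    "http://10.0.0.2:8080": True,
}

def check(base_url: str) -> bool:
    """One active probe: GET /health, healthy only on HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=2) as resp:
            return resp.status == 200
    except Exception:  # 5xx raises HTTPError; timeouts raise as well
        return False

while True:
    for url in servers:
        servers[url] = check(url)
    time.sleep(10)
```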

Passive Health Checks

Monitors success/failure of actual requests.

| Event | Action |
|-------|--------|
| 5 consecutive failures | Exclude server |
| After 30 seconds | Retry distribution |
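
The same policy, sketched as two small functions; the thresholds come from the table, everything else is an assumption.

```python
import time

FAIL_LIMIT = 5       # consecutive failures before exclusion
RETRY_AFTER = 30.0   # seconds until the server is tried again

failures = {}        # server -> consecutive failure count
excluded_until = {}  # server -> monotonic timestamp

def record_result(server: str, ok: bool) -> None:
    """Update a server's status from the outcome of a real request."""
    if ok:
        failures[server] = 0
        return
    failures[server] = failures.get(server, 0) + 1
    if failures[server] >= FAIL_LIMIT:
        excluded_until[server] = time.monotonic() + RETRY_AFTER

def is_available(server: str) -> bool:
    """Excluded servers re-enter rotation once the retry window passes."""
    return time.monotonic() >= excluded_until.get(server, 0.0)
```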

Session Persistence (Sticky Sessions)

A mechanism to keep sending the same user’s requests to the same server.

Implementation Methods

| Method | Description |
|--------|-------------|
| Cookie | Store the server ID in a cookie |
| IP Address | Pin clients to servers by source IP |
| URL Parameter | Carry the session ID in the URL |
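
As a rough sketch of the cookie method (the cookie name lb_server and the pool names are hypothetical):

```python
import random

servers = ["app-1", "app-2", "app-3"]

def choose_server(cookies: dict) -> tuple:
    """Return (server, needs_cookie). Reuse the pinned server if the
    cookie names one still in the pool; otherwise pin a new one."""
    pinned = cookies.get("lb_server")
    if pinned in servers:
        return pinned, False
    return random.choice(servers), True

choose_server({})                      # e.g. ("app-2", True)
choose_server({"lb_server": "app-2"})  # ("app-2", False)
```

When needs_cookie is true, the balancer would add a Set-Cookie: lb_server=... header to the response so that later requests carry the pin.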

Considerations

Sticky sessions can cause uneven load distribution. When possible, manage session information in external stores (like Redis) and aim for a stateless design.

Major Load Balancing Software

| Name | Features |
|------|----------|
| Nginx | High performance, reverse proxy functionality |
| HAProxy | High availability, rich features |
| Envoy | Cloud-native, service mesh support |

Cloud Services

| Service | Provider |
|---------|----------|
| ALB/NLB | AWS |
| Cloud Load Balancing | Google Cloud |
| Azure Load Balancer | Microsoft Azure |

High Availability Configuration

Redundancy of the load balancer itself is also important.

```mermaid
flowchart TB
    VIP["Virtual IP<br/>(VIP/Floating IP)"]
    VIP --> Active["LB (Active)"]
    VIP --> Standby["LB (Standby)"]
    Active <-->|monitor| Standby
    Active --> Servers["Server cluster"]
```

In an Active-Standby configuration, when the active load balancer fails, the standby automatically takes over.
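
Real deployments usually implement this with VRRP (e.g., keepalived) rather than hand-rolled code, but a conceptual sketch of the standby's monitoring loop might look like this; the heartbeat address and timeout are assumptions.

```python
import socket
import time

ACTIVE_ADDR = ("10.0.0.10", 9999)  # hypothetical heartbeat endpoint
TIMEOUT = 3                        # seconds without a reply => failover

def active_is_alive() -> bool:
    try:
        with socket.create_connection(ACTIVE_ADDR, timeout=TIMEOUT):
            return True
    except OSError:
        return False

while True:
    if not active_is_alive():
        # Placeholder: in practice the standby claims the virtual IP,
        # e.g., keepalived announces the VIP with a gratuitous ARP.
        print("Active LB unreachable: standby taking over the VIP")
        break
    time.sleep(1)
```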

Summary

Load balancing is an essential technology for modern web services. By selecting an appropriate algorithm, configuring health checks, and making the load balancer itself redundant, you can achieve stable service operation.
