System Design 101: Load Balancing

A hands-on guide to implementing and testing load balancing algorithms using Golang, Nginx, and Docker

Madhan published on

Load balancing is the process of distributing incoming network traffic across multiple servers to improve reliability, scalability, and performance. It prevents any single server from getting overloaded and helps applications handle higher demand efficiently.

Operating level

Load balancing can happen at different levels of the network stack.

Layer 4 (transport level): works at the TCP/UDP level and routes traffic based on IP address and port, e.g. Round Robin and Least Connections.

Layer 7 (application level): looks into HTTP headers, cookies, or request data to make routing decisions, e.g. Geo Routing and Consistent Hashing.

Let's look at how this works in practice by building a simple RESTful API server in Go and an Nginx load balancer to test the Round Robin, Least Connections, Geo-based, and Consistent Hashing algorithms.

Application code

Restful API

package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"sync"
)

func main() {
	id := os.Getenv("SERVER_ID")
	region := os.Getenv("REGION")
	if region == "" {
		region = "global"
	}

	// visits is shared by every handler goroutine, so guard it with a mutex.
	var mu sync.Mutex
	visits := make(map[string]int)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Fall back to "anonymous" when the session_id cookie is missing.
		sessionID := "anonymous"
		if cookie, err := r.Cookie("session_id"); err == nil {
			sessionID = cookie.Value
		}

		mu.Lock()
		visits[sessionID]++
		count := visits[sessionID]
		mu.Unlock()

		response := fmt.Sprintf("Server %s (%s) | Session: %s | Visit: %d\n",
			id, region, sessionID, count)

		fmt.Fprint(w, response)
		log.Printf("[INFO] Session %s → Server %s (visit #%d)", sessionID, id, count)
	})

	log.Printf("Server %s (%s) started on :8080", id, region)
	// ListenAndServe only returns on failure, so exit with a non-zero status.
	if err := http.ListenAndServe(":8080", nil); err != nil {
		log.Fatalf("[FATAL] Server %s failed: %v", id, err)
	}
}

This minimal HTTP server listens on port 8080 and reports which instance handled each request. Each instance is configured through the SERVER_ID and REGION environment variables. When a client sends a request, the server reads the session ID from the session_id cookie (falling back to "anonymous" when it is missing), tracks per-session visit counts in memory, and responds with its ID, region, and visit number. This makes it easy to see how Nginx distributes requests under different load balancing algorithms.

Dockerize the Go app

FROM golang:1.25-alpine
WORKDIR /app
COPY main.go .
RUN go mod init server && go build -o server .
CMD ["./server"]

Nginx configurations

1. Round robin

Round robin sends each request to the next server in sequence, so every server receives an equal share. It is Nginx's default method when no other is specified. Limitation: it does not consider server capacity or current load, so it can overload weaker servers.

upstream backend {
    zone backend_zone 64k;
    server server1:8080;
    server server2:8080;
    server server3:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

2. Least connections

Least connections directs each new request to the server with the fewest active connections, which is useful when request durations vary. Limitations: the balancer has to track active connections per server, and the connection count does not take each server's processing capacity into account.

upstream backend {
    least_conn;
    zone backend_zone 64k;
    server server1:8080;
    server server2:8080;
    server server3:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
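
As a rough illustration of the bookkeeping this requires, here is a minimal Go sketch. The `LeastConn` type and its `Acquire`/`Release` methods are hypothetical names, and ties are broken by list order, which is an assumption made for simplicity rather than Nginx's behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// LeastConn tracks in-flight requests per backend.
type LeastConn struct {
	mu     sync.Mutex
	names  []string
	active map[string]int
}

// NewLeastConn creates a tracker for the given backends (at least one).
func NewLeastConn(names ...string) *LeastConn {
	return &LeastConn{names: names, active: make(map[string]int)}
}

// Acquire picks the backend with the fewest active requests
// (the earliest in the list wins ties) and counts the new request.
func (lc *LeastConn) Acquire() string {
	lc.mu.Lock()
	defer lc.mu.Unlock()
	best := lc.names[0]
	for _, n := range lc.names[1:] {
		if lc.active[n] < lc.active[best] {
			best = n
		}
	}
	lc.active[best]++
	return best
}

// Release marks one request on the named backend as finished.
func (lc *LeastConn) Release(name string) {
	lc.mu.Lock()
	defer lc.mu.Unlock()
	if lc.active[name] > 0 {
		lc.active[name]--
	}
}

func main() {
	lc := NewLeastConn("server1", "server2", "server3")
	a := lc.Acquire() // server1: all idle, first in list
	b := lc.Acquire() // server2: server1 is now busy
	lc.Release(a)     // server1 finishes its request
	c := lc.Acquire() // server1 again: idle and first in list
	fmt.Println(a, b, c)
}
```

Every request needs a matching `Release`, which is the tracking overhead mentioned in the limitations.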

3. Geo-based routing

Since this setup runs locally, all requests come from the same machine, so client locations cannot be distinguished. Instead, a client header (X-Region) is used to send users to a specific region (such as the US or EU servers). In production, the client's IP would be looked up in a GeoIP database (such as MaxMind).

map $http_x_region $backend_upstream {
    default global_backend;
    us us_backend;
    eu eu_backend;
}

upstream us_backend {
    zone us_zone 64k;
    server server1:8080;
}

upstream eu_backend {
    zone eu_zone 64k;
    server server2:8080;
}

upstream global_backend {
    zone global_zone 64k;
    server server3:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://$backend_upstream;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

4. Consistent hashing

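Consistent hashing keeps sessions "sticky": each client key (such as a cookie or header) always maps to the same backend, and adding or removing a server remaps only a fraction of the keys. Limitation: the distribution can be skewed when a few keys account for most of the traffic. In the configuration below, requests that arrive without a session_id cookie all share the same empty hash key, so they land on the same backend.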

upstream backend {
    hash $cookie_session_id consistent;
    zone backend_zone 64k;
    server server1:8080;
    server server2:8080;
    server server3:8080;
}

server {
    listen 80;
    location / {
        # Generate session cookie if missing
        if ($cookie_session_id = "") {
            add_header Set-Cookie "session_id=$request_id; Path=/; HttpOnly";
        }

        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Session-ID $cookie_session_id;
        proxy_set_header Cookie $http_cookie;
    }
}

Docker compose setup

Below is the complete Docker Compose setup to run all the load balancing algorithms.

services:
  server1:
    build: ./server
    environment:
      - SERVER_ID=1
      - REGION=us
    networks:
      - lbnet

  server2:
    build: ./server
    environment:
      - SERVER_ID=2
      - REGION=eu
    networks:
      - lbnet

  server3:
    build: ./server
    environment:
      - SERVER_ID=3
      - REGION=global
    networks:
      - lbnet

  # Round Robin
  nginx-round-robin:
    build: ./nginx
    volumes:
      - ${PWD}/nginx/round_robin.conf:/etc/nginx/conf.d/default.conf:ro
    ports:
      - "8081:80"
    depends_on: [server1, server2, server3]
    networks:
      - lbnet

  # Least Connections
  nginx-least-conn:
    build: ./nginx
    volumes:
      - ${PWD}/nginx/least_conn.conf:/etc/nginx/conf.d/default.conf:ro
    ports:
      - "8082:80"
    depends_on: [server1, server2, server3]
    networks:
      - lbnet

  # Geo-based
  nginx-geo-env:
    build: ./nginx
    volumes:
      - ${PWD}/nginx/geo_env_based.conf:/etc/nginx/conf.d/default.conf:ro
    ports:
      - "8083:80"
    depends_on: [server1, server2, server3]
    networks:
      - lbnet

  # Consistent Hashing
  nginx-consistent-hash:
    build: ./nginx
    volumes:
      - ${PWD}/nginx/consistent_hashing.conf:/etc/nginx/conf.d/default.conf:ro
    ports:
      - "8084:80"
    depends_on: [server1, server2, server3]
    networks:
      - lbnet

networks:
  lbnet:
    driver: bridge

Testing load balancing

To test the behavior of each algorithm, the following simple commands can be used.

1. Round Robin

for i in {1..6}; do curl -s localhost:8081; done

2. Least Connections

seq 1 100 | xargs -P20 -I{} curl -s localhost:8082

3. Geo Routing

curl -s -H 'X-Region: us' localhost:8083
curl -s -H 'X-Region: eu' localhost:8083
curl -s -H 'X-Region: jp' localhost:8083

4. Consistent Hashing

for i in {1..3}; do 
    curl -s --cookie "session_id=alice" localhost:8084
done

Justfile — task runner

To make running, testing, and tearing down the load balancer setup easier, Just, a lightweight command runner, can be used.

load_balancer_up:
    @just compose-down {{lb_dir}}
    @just docker-clean
    @just compose-up {{lb_dir}}
    @echo ""
    @echo "Load balancer is running"

load_balancer_test:
    @echo ""
    @echo "Running load balancer tests..."
    @bash {{lb_dir}}/test.sh

load_balancer_down:
    @just compose-down {{lb_dir}}
    @just docker-clean
    @echo ""
    @echo "Load balancer stopped and cleaned"
* * * *

Originally published on Medium

🌟 🌟 🌟 The source code for this blog post can be found here 🌟🌟🌟

learn-system-design/high-level-design/02_load_balancer at main · madhank93/learn-system-design
