
load-balancing-patterns

When distributing traffic across multiple servers or regions, use this skill to select and configure the appropriate load balancing solution (L4/L7, cloud-managed, self-managed, or Kubernetes ingress) with proper health checks and session management.

stars: 302
forks: 57
updated: December 11, 2025

SKILL.md frontmatter:

name: load-balancing-patterns
description: When distributing traffic across multiple servers or regions, use this skill to select and configure the appropriate load balancing solution (L4/L7, cloud-managed, self-managed, or Kubernetes ingress) with proper health checks and session management.

Load Balancing Patterns

Distribute traffic across infrastructure using the appropriate load balancing approach, from simple round-robin to global multi-region failover.

When to Use This Skill

Use load-balancing-patterns when:

  • Distributing traffic across multiple application servers
  • Implementing high availability and failover
  • Routing traffic based on URLs, headers, or geographic location
  • Managing session persistence across stateless backends
  • Deploying applications to Kubernetes clusters
  • Configuring global traffic management across regions
  • Implementing zero-downtime deployments (blue-green, canary)
  • Selecting between cloud-managed and self-managed load balancers

Core Load Balancing Concepts

Layer 4 vs Layer 7

Layer 4 (L4) - Transport Layer:

  • Routes based on IP address and port (TCP/UDP packets)
  • No application data inspection, lower latency, higher throughput
  • Protocol agnostic, preserves client IP addresses
  • Use for: Database connections, video streaming, gaming, financial transactions, non-HTTP protocols

Layer 7 (L7) - Application Layer:

  • Routes based on HTTP URLs, headers, cookies, request body
  • Full application data visibility, SSL/TLS termination, caching, WAF integration
  • Content-based routing capabilities
  • Use for: Web applications, REST APIs, microservices, GraphQL endpoints, complex routing logic

For detailed comparison including performance benchmarks and hybrid approaches, see references/l4-vs-l7-comparison.md.

Load Balancing Algorithms

Algorithm             Distribution Method        Use Case
Round Robin           Sequential                 Stateless, similar servers
Weighted Round Robin  Capacity-based             Different server specs
Least Connections     Fewest active connections  Long-lived connections
Least Response Time   Fastest server             Performance-sensitive
IP Hash               Client IP-based            Session persistence
Resource-Based        CPU/memory metrics         Varying workloads
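
The first two algorithms above can be sketched in a few lines of Python (server names are placeholders; note that this naive weight expansion sends a weighted server its extra requests back-to-back, whereas production balancers such as NGINX smooth the interleaving):

```python
from itertools import cycle

class RoundRobin:
    """Hand each request to the next server in sequence."""
    def __init__(self, servers):
        self._it = cycle(servers)

    def next_server(self):
        return next(self._it)

class WeightedRoundRobin:
    """Expand each server into the pool according to its weight, so a
    weight-3 server receives 3x the traffic of a weight-1 server."""
    def __init__(self, weighted_servers):
        pool = []
        for server, weight in weighted_servers:
            pool.extend([server] * weight)
        self._it = cycle(pool)

    def next_server(self):
        return next(self._it)

rr = RoundRobin(["a", "b"])
print([rr.next_server() for _ in range(4)])      # ['a', 'b', 'a', 'b']

wrr = WeightedRoundRobin([("big", 3), ("small", 1)])
picks = [wrr.next_server() for _ in range(8)]
print(picks.count("big"), picks.count("small"))  # 6 2
```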

Health Check Types

Shallow (Liveness): Is the process alive?

  • Endpoint: /health/live or /live
  • Returns: 200 if process running
  • Use for: Process monitoring, container health

Deep (Readiness): Can the service handle requests?

  • Endpoint: /health/ready or /ready
  • Validates: Database, cache, external API connectivity
  • Use for: Load balancer routing decisions
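
The liveness/readiness split can be illustrated with a minimal stdlib-only sketch (the dependency probes are stand-ins for real checks such as a database ping, and `cache_ok` is hard-wired to fail here to demonstrate the 503 path):

```python
import http.server
import json
import threading
import urllib.request
import urllib.error

def database_ok():  # stand-in for a real dependency probe (e.g. SELECT 1)
    return True

def cache_ok():     # stand-in for a cache ping; fails to simulate degradation
    return False

class HealthHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health/live":
            # Liveness: the process is up and serving; nothing else checked.
            self._reply(200, {"status": "alive"})
        elif self.path == "/health/ready":
            # Readiness: only report ready if dependencies are reachable.
            checks = {"database": database_ok(), "cache": cache_ok()}
            healthy = all(checks.values())
            self._reply(200 if healthy else 503,
                        {"status": "ready" if healthy else "degraded",
                         "checks": checks})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, code, body):
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_address[1]}"

live = urllib.request.urlopen(f"{base}/health/live")
print("liveness:", live.status)    # 200 -- process is up
ready_code = None
try:
    urllib.request.urlopen(f"{base}/health/ready")
except urllib.error.HTTPError as e:
    ready_code = e.code
print("readiness:", ready_code)    # 503 -- cache check failed
server.shutdown()
```

A load balancer pointed at `/health/ready` would pull this instance out of rotation while leaving the process itself (still live) unrestarted.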

Health Check Hysteresis: Use different thresholds for marking a server up versus down, to prevent flapping

  • Example: 3 failures to mark down, 2 successes to mark up
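
The same hysteresis rule expressed as a small state machine (thresholds match the example above; HAProxy expresses this directly as `check inter 5s fall 3 rise 2` on a server line):

```python
class HealthTracker:
    """Hysteresis: 'fall' consecutive failures mark a server down,
    'rise' consecutive successes mark it up again."""
    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.healthy = True
        self._streak = 0  # consecutive results against the current state

    def record(self, success):
        if self.healthy:
            self._streak = 0 if success else self._streak + 1
            if self._streak >= self.fall:
                self.healthy, self._streak = False, 0
        else:
            self._streak = self._streak + 1 if success else 0
            if self._streak >= self.rise:
                self.healthy, self._streak = True, 0
        return self.healthy

t = HealthTracker(fall=3, rise=2)
print([t.record(ok) for ok in [False, False, False]])  # [True, True, False]
print([t.record(ok) for ok in [True, True]])           # [False, True]
```

A single failed probe never flips state; only a sustained streak does, which is what stops a marginal server from oscillating in and out of rotation.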

For complete health check implementation patterns, see references/health-check-strategies.md.

Cloud Load Balancers

AWS Load Balancing

Application Load Balancer (ALB) - Layer 7:

  • Use for: HTTP/HTTPS applications, microservices, WebSocket
  • Features: Path/host/header routing, AWS WAF integration, Lambda targets
  • Choose when: Content-based routing needed

Network Load Balancer (NLB) - Layer 4:

  • Use for: Ultra-low latency (<1ms), TCP/UDP, static IPs, millions of requests per second
  • Features: Preserves source IP, TLS termination
  • Choose when: Non-HTTP protocols, performance critical

Global Accelerator - Layer 4 Global:

  • Use for: Multi-region applications, global users, DDoS protection
  • Features: Anycast IPs, automatic regional failover

GCP Load Balancing

Application LB (L7): Global HTTPS load balancing, Cloud CDN integration, Cloud Armor (WAF/DDoS)
Network LB (L4): Regional TCP/UDP, pass-through balancing, session affinity
Cloud Load Balancing: Single anycast IP, global distribution, backend buckets

Azure Load Balancing

Application Gateway (L7): WAF integration, URL-based routing, SSL termination, autoscaling
Load Balancer (L4): Basic and Standard SKUs, health probes, HA ports
Traffic Manager (Global): DNS-based routing (priority, weighted, performance, geographic)

For complete cloud provider configurations and Terraform examples, see references/cloud-load-balancers.md.

Self-Managed Load Balancers

NGINX

Best for: General-purpose HTTP/HTTPS load balancing, web application stacks

Capabilities:

  • HTTP reverse proxy with multiple algorithms
  • TCP/UDP stream load balancing
  • SSL/TLS termination
  • Passive health checks (open source), active health checks (NGINX Plus)
  • Cookie-based sticky sessions (NGINX Plus)

Basic configuration:

upstream backend {
    least_conn;
    server backend1.example.com:8080 weight=3;
    server backend2.example.com:8080 weight=2;
    keepalive 32;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

For complete NGINX patterns and advanced configurations, see references/nginx-patterns.md.

HAProxy

Best for: Maximum performance, database load balancing, resource efficiency

Capabilities:

  • Highest raw throughput, lowest memory footprint
  • 10+ load balancing algorithms
  • Sophisticated health checks (HTTP, TCP, Redis, MySQL, etc.)
  • Cookie or IP-based persistence

Basic configuration:

frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check

For complete HAProxy patterns, see references/haproxy-patterns.md.

Envoy

Best for: Microservices, Kubernetes, service mesh integration

Capabilities:

  • Cloud-native design with dynamic configuration (xDS APIs)
  • Circuit breakers, retries, timeouts
  • Advanced health checks (TCP, HTTP, gRPC)
  • Excellent observability
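
Basic configuration (a static v3 sketch — cluster name, backend hostname, and ports are placeholders, and real deployments typically deliver this dynamically over the xDS APIs rather than from a file):

```yaml
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 8080 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: backend_cluster }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: backend_cluster
    type: STRICT_DNS
    lb_policy: LEAST_REQUEST
    load_assignment:
      cluster_name: backend_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: backend1.example.com, port_value: 8080 }
```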

For complete Envoy patterns, see references/envoy-patterns.md.

Traefik

Best for: Docker/Kubernetes environments, dynamic configuration, ease of use

Capabilities:

  • Automatic service discovery
  • Native Kubernetes integration
  • Built-in Let's Encrypt support
  • Middleware system (auth, rate limiting)
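
Basic configuration (an IngressRoute sketch — the apiVersion varies by Traefik release, `traefik.io/v1alpha1` on recent versions, and the host, service, and certResolver names are placeholders):

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: app-route
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`app.example.com`) && PathPrefix(`/api`)
      kind: Rule
      services:
        - name: api-service
          port: 80
  tls:
    certResolver: letsencrypt   # built-in Let's Encrypt integration
```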

For complete Traefik patterns, see references/traefik-patterns.md.

Kubernetes Ingress Controllers

Selection Guide

Controller               Best For              Strengths
NGINX Ingress (F5)       General purpose       Stability, wide adoption, mature features
Traefik                  Dynamic environments  Easy configuration, service discovery
HAProxy Ingress          High performance      Advanced L7 routing, reliability
Envoy (Contour/Gateway)  Service mesh          Rich L7 features, extensibility
Kong                     API-heavy apps        JWT auth, rate limiting, plugins
Cloud Provider           Single-cloud          Native cloud integration

Basic Ingress Example

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/affinity: "cookie"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80

For complete Kubernetes ingress examples and Gateway API patterns, see references/kubernetes-ingress.md.

Session Persistence

Sticky Sessions (Use Sparingly)

Cookie-Based: Load balancer sets cookie to track server affinity

  • Accurate routing, works with NAT/proxies
  • HTTP only, adds cookie overhead

IP Hash: Hash client IP to select backend server

  • No cookie required, works for non-HTTP
  • Poor distribution with NAT/proxies

Drawbacks: Uneven load distribution, session lost on server failure, complicates scaling
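
The IP-hash variant is easy to sketch, and the sketch makes the NAT drawback concrete: every client behind one gateway IP lands on the same backend. A stable hash (md5 here) is used so the mapping survives process restarts, unlike Python's per-process-salted built-in `hash`:

```python
import hashlib

def ip_hash_backend(client_ip, backends):
    """Deterministically map a client IP to one backend."""
    digest = hashlib.md5(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]

backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# Same client IP always selects the same backend:
assert ip_hash_backend("203.0.113.7", backends) == \
       ip_hash_backend("203.0.113.7", backends)
print(ip_hash_backend("203.0.113.7", backends))
```

A further caveat: because the modulus is `len(backends)`, adding or removing a server remaps most clients at once; consistent hashing is the usual remedy.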

Shared Session Store (Recommended)

Architecture: Stateless application servers + centralized session storage (Redis, Memcached)

Benefits:

  • No sticky sessions needed
  • True load balancing
  • Server failures don't lose sessions
  • Horizontal scaling trivial
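
A sketch of the pattern, with a plain dict standing in for Redis (a real deployment would issue the same create/get shape against shared Redis, e.g. `SETEX`/`GET`, so any app server can resolve any session):

```python
import time
import uuid

class SessionStore:
    """Centralized session storage with a TTL. The dict is a stand-in
    for Redis; the interface is what matters for statelessness."""
    def __init__(self, ttl_seconds=1800):
        self._data = {}
        self._ttl = ttl_seconds

    def create(self, user_data):
        session_id = uuid.uuid4().hex
        self._data[session_id] = (user_data, time.monotonic() + self._ttl)
        return session_id

    def get(self, session_id):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        data, expires = entry
        if time.monotonic() > expires:
            del self._data[session_id]  # lazily expire stale sessions
            return None
        return data

# Because lookup only needs the session ID, no sticky routing is required:
store = SessionStore()
sid = store.create({"user": "alice"})
print(store.get(sid))  # {'user': 'alice'}
```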

Client-Side Tokens (Best for APIs)

JWT (JSON Web Tokens): Server generates signed token, client stores and sends with requests

Benefits:

  • Fully stateless servers
  • Perfect load balancing
  • No session storage needed
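
The idea in miniature, using only stdlib HMAC (deliberately simplified: no header segment and a single fixed algorithm — production code should use a vetted JWT library rather than hand-rolling this):

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"server-side-signing-key"  # assumption: shared among app servers only

def b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_token(claims: dict, ttl=3600) -> str:
    """Sign the claims so any server can verify them without shared state."""
    claims = {**claims, "exp": int(time.time()) + ttl}
    body = b64(json.dumps(claims).encode())
    sig = b64(hmac.new(SECRET, body.encode(), hashlib.sha256).digest())
    return f"{body}.{sig}"

def verify_token(token: str):
    body, sig = token.split(".")
    expected = b64(hmac.new(SECRET, body.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # tampered payload or wrong key
    claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    return None if claims["exp"] < time.time() else claims

token = issue_token({"sub": "alice"})
print(verify_token(token)["sub"])  # alice

tampered = token[:-1] + ("A" if token[-1] != "A" else "B")
print(verify_token(tampered))     # None: signature no longer matches
```

Any backend holding the signing key can validate a request, so the balancer is free to route every request independently.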

For complete session management patterns and code examples, see references/session-persistence.md.

Global Load Balancing

GeoDNS Routing

Route users to nearest server based on geographic location:

  • DNS returns different IPs based on client location
  • Reduces latency, supports compliance and regional content
  • Implementation: AWS Route 53, GCP Cloud DNS, Azure Traffic Manager
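
The routing decision GeoDNS makes can be mimicked in a few lines (the region table and hostnames here are hypothetical; real GeoDNS answers at the DNS layer, e.g. Route 53 geolocation records, with a default record as the catch-all):

```python
# Hypothetical mapping of client region to the nearest regional endpoint.
REGION_ENDPOINTS = {
    "EU": "eu-west.app.example.com",
    "NA": "us-east.app.example.com",
    "APAC": "ap-southeast.app.example.com",
}
DEFAULT_ENDPOINT = "us-east.app.example.com"

def resolve(client_region: str) -> str:
    """Return the nearest regional endpoint, with a default fallback
    for clients whose location cannot be determined."""
    return REGION_ENDPOINTS.get(client_region, DEFAULT_ENDPOINT)

print(resolve("EU"))       # eu-west.app.example.com
print(resolve("unknown"))  # us-east.app.example.com (fallback)
```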

Multi-Region Failover

Primary/secondary region configuration:

  • Health checks determine primary region health
  • Automatic DNS failover to secondary
  • Transparent to clients

CDN Integration

Combine load balancing with CDN:

  • GeoDNS routes to closest CDN PoP
  • CDN caches content globally
  • Origin load balancing for cache misses

For complete global load balancing examples with Terraform, see references/global-load-balancing.md.

Decision Frameworks

L4 vs L7 Selection

Choose L4 when:

  • Protocol is TCP/UDP (not HTTP)
  • Ultra-low latency critical (<1ms)
  • High throughput required (millions of requests per second)
  • Client source IP preservation needed

Choose L7 when:

  • Protocol is HTTP/HTTPS
  • Content-based routing needed (URL, headers)
  • SSL termination required
  • WAF integration needed
  • Microservices architecture

Cloud vs Self-Managed

Choose Cloud-Managed when:

  • Single cloud deployment
  • Auto-scaling required
  • Team lacks load balancer expertise
  • Managed service preferred

Choose Self-Managed when:

  • Multi-cloud or hybrid deployment
  • Advanced routing requirements
  • Cost optimization important
  • Full control needed
  • Vendor lock-in avoidance

Self-Managed Selection

  • NGINX: General-purpose, web stacks, HTTP/3 support
  • HAProxy: Maximum performance, database LB, lowest resource usage
  • Envoy: Microservices, service mesh, dynamic configuration
  • Traefik: Docker/Kubernetes, automatic discovery, easy configuration

Configuration Examples

Complete working examples available in examples/ directory:

Cloud Providers:

  • examples/aws/alb-terraform.tf - AWS ALB with path-based routing
  • examples/aws/nlb-terraform.tf - AWS NLB for TCP load balancing

Self-Managed:

  • examples/nginx/http-load-balancing.conf - NGINX HTTP reverse proxy
  • examples/haproxy/http-lb.cfg - HAProxy configuration
  • examples/envoy/basic-lb.yaml - Envoy cluster configuration
  • examples/traefik/kubernetes-ingress.yaml - Traefik IngressRoute

Kubernetes:

  • examples/kubernetes/nginx-ingress.yaml - NGINX Ingress with TLS
  • examples/kubernetes/traefik-ingress.yaml - Traefik IngressRoute
  • examples/kubernetes/gateway-api.yaml - Gateway API configuration

Monitoring and Observability

Key Metrics

Throughput: Requests per second, bytes transferred, connection rate
Latency: Request duration (p50, p95, p99), backend response time, SSL handshake time
Errors: HTTP error rates (4xx, 5xx), backend connection failures, health check failures
Resource Utilization: CPU, memory, active connections, connection queue depth
Health: Healthy/unhealthy backend count, health check success rate

Load Balancer Logs

Enable access logs for request/response details, client IPs, response times, error tracking

  • AWS ALB: Store in S3, analyze with Athena
  • NGINX: Custom log format, ship to centralized logging
  • HAProxy: Syslog integration, structured logging

Troubleshooting

Uneven Load Distribution

Symptoms: One server receives disproportionate traffic
Causes: Sticky sessions with few clients, IP hash with NAT concentration, long-lived connections
Solutions: Switch to least connections, disable sticky sessions, implement connection draining

Health Check Flapping

Symptoms: Servers rapidly transition between healthy/unhealthy
Causes: Health check timeout too short, threshold too low, network instability
Solutions: Increase interval and timeout, implement hysteresis, use deep health checks

Session Loss After Failover

Symptoms: Users logged out when server fails
Causes: Sticky sessions without replication, in-memory sessions
Solutions: Implement shared session store (Redis), use client-side tokens (JWT)

Integration Points

Related Skills:

  • infrastructure-as-code - Deploy load balancers via Terraform/Pulumi
  • kubernetes-operations - Ingress controllers for K8s traffic management
  • network-architecture - Network design and topology for load balancing
  • deploying-applications - Blue-green and canary deployments via load balancers
  • observability - Load balancer metrics, access logs, distributed tracing
  • security-hardening - WAF integration, rate limiting, DDoS protection
  • service-mesh - Envoy as both ingress and service mesh proxy
  • implementing-tls - TLS termination and certificate management

Quick Reference

Selection Matrix

Use Case                 Recommended Solution
HTTP web app (AWS)       ALB
Non-HTTP protocol (AWS)  NLB
Kubernetes HTTP ingress  NGINX Ingress or Traefik
Maximum performance      HAProxy
Service mesh             Envoy
Docker Swarm             Traefik
Multi-cloud portable     NGINX or HAProxy
Global distribution      Cloudflare, AWS Global Accelerator

Algorithm Selection

Traffic Pattern                Algorithm
Stateless, similar servers     Round Robin
Stateless, different capacity  Weighted Round Robin
Long-lived connections         Least Connections
Performance-sensitive          Least Response Time
Session persistence needed     IP Hash or Cookie
Varying server load            Resource-Based

Health Check Configuration

Service Type       Check Type          Interval  Timeout
Web app            HTTP /health        10s       3s
API                HTTP /health/ready  10s       5s
Database           TCP connect         5s        2s
Critical service   HTTP deep check     5s        3s
Background worker  HTTP /live          30s       5s

Summary

Load balancing is essential for distributing traffic, ensuring high availability, and enabling horizontal scaling. Choose L4 for raw performance and non-HTTP protocols, L7 for intelligent content-based routing. Prefer cloud-managed load balancers for simplicity and auto-scaling, self-managed for multi-cloud portability and advanced features. Implement proper health checks with hysteresis, avoid sticky sessions when possible, and monitor key metrics continuously.

For deployment patterns, see examples in examples/aws/, examples/nginx/, examples/kubernetes/, and other provider directories.