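Load Balancing #

Overview #

Load balancing is a core building block of distributed systems. Caddy's reverse_proxy directive supports several load balancing policies, active and passive health checks, and session persistence.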

text
┌─────────────────────────────────────────────────────────────┐
│                    Load balancing architecture              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│                    ┌─────────┐                              │
│                    │ Client  │                              │
│                    └────┬────┘                              │
│                         │                                   │
│                         ▼                                   │
│                    ┌─────────┐                              │
│                    │  Caddy  │                              │
│                    │ balancer│                              │
│                    └────┬────┘                              │
│                         │                                   │
│         ┌───────────────┼───────────────┐                   │
│         │               │               │                   │
│         ▼               ▼               ▼                   │
│    ┌─────────┐     ┌─────────┐     ┌─────────┐              │
│    │ Server1 │     │ Server2 │     │ Server3 │              │
│    └─────────┘     └─────────┘     └─────────┘              │
│                                                             │
└─────────────────────────────────────────────────────────────┘

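Load balancing policies #

Round Robin #

Distributes requests to the backends in order. Caddy's default policy is random, so round_robin has to be set explicitly: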

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy round_robin
    }
}

Request distribution order:

text
Request 1 → Server1
Request 2 → Server2
Request 3 → Server3
Request 4 → Server1
Request 5 → Server2
...
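
The rotation shown above can be reproduced with a counter taken modulo the pool size. This is a minimal sketch of the idea, not Caddy's implementation; the `RoundRobin` class is a hypothetical name used only for illustration.

```python
class RoundRobin:
    """Hypothetical round-robin selector: hands out upstreams in list order."""

    def __init__(self, upstreams):
        if not upstreams:
            raise ValueError("at least one upstream is required")
        self.upstreams = list(upstreams)
        self.counter = 0

    def select(self):
        # The counter wraps around the pool, so request 4 goes back to the first server.
        upstream = self.upstreams[self.counter % len(self.upstreams)]
        self.counter += 1
        return upstream


picker = RoundRobin(["Server1", "Server2", "Server3"])
order = [picker.select() for _ in range(5)]
print(order)
```

A real balancer would also skip upstreams that are currently marked unhealthy, which this sketch leaves out.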

Least Connections #

Sends each request to the server that currently has the fewest active connections:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy least_conn
    }
}

Typical use cases:

  • Requests whose processing times vary widely
  • Keeping server load even
  • Long-lived connections
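
The selection rule can be sketched as "pick the minimum, break ties at random". The function below is an illustrative assumption about how such a policy could be written, not Caddy's code; `least_conn` and the `active_requests` mapping are hypothetical names.

```python
import random


def least_conn(active_requests, rng=random):
    """Pick the upstream with the fewest in-flight requests.

    active_requests maps an upstream address to its current request count.
    Ties are broken at random so that equally idle upstreams share the load.
    """
    if not active_requests:
        raise ValueError("no upstreams available")
    fewest = min(active_requests.values())
    candidates = [u for u, n in active_requests.items() if n == fewest]
    return rng.choice(candidates)


active = {"server1:3000": 12, "server2:3000": 3, "server3:3000": 7}
chosen = least_conn(active)
print(chosen)
```

Because the choice depends on live counters, a slow request on one server automatically steers new requests to the others.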

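IP Hash #

Maps the client IP to a server, so requests from the same IP are always routed to the same server as long as the set of available servers does not change: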

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy ip_hash
    }
}

Typical use cases:

  • Session persistence is required
  • Stateless services that keep a local cache
  • Reducing cache misses
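
A simplified way to picture hash-based routing is to hash the client IP and reduce the digest modulo the pool size. The sketch below assumes SHA-256 and a fixed pool; Caddy's own hash function and its handling of unavailable upstreams may differ, and `ip_hash` here is just an illustrative function name.

```python
import hashlib


def ip_hash(client_ip, upstreams):
    """Map a client IP to one upstream in a deterministic way."""
    if not upstreams:
        raise ValueError("no upstreams available")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and wrap it to the pool size.
    index = int.from_bytes(digest[:8], "big") % len(upstreams)
    return upstreams[index]


pool = ["server1:3000", "server2:3000", "server3:3000"]
first = ip_hash("203.0.113.7", pool)
second = ip_hash("203.0.113.7", pool)
print(first, second)
```

With plain modulo hashing, adding or removing a server changes the mapping for most clients, which is why stickiness only holds while the pool is stable.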

URI Hash #

Maps the request URI to a server, so the same URI is always served by the same server:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy uri_hash
    }
}

Typical use cases:

  • Cache servers
  • Static asset delivery
  • CDN scenarios
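
For caches it matters how many URIs move to a different server when the pool changes. One common technique that keeps this churn low is rendezvous (highest random weight) hashing, sketched below under the assumption of a SHA-256 score per server and URI pair. Whether a given Caddy version uses this scheme internally is not claimed here; `rendezvous_select` is a hypothetical helper.

```python
import hashlib


def rendezvous_select(key, upstreams):
    """Pick the upstream with the highest hash score for this key."""
    if not upstreams:
        raise ValueError("no upstreams available")

    def score(upstream):
        # Each (upstream, key) pair gets its own pseudo-random score.
        return hashlib.sha256(f"{upstream}|{key}".encode("utf-8")).digest()

    return max(upstreams, key=score)


pool = ["server1:3000", "server2:3000", "server3:3000"]
uri = "/static/app.css"
owner = rendezvous_select(uri, pool)

# Remove a server that does NOT own the URI: the owner stays the same.
others = [u for u in pool if u != owner]
pool_without_other = [u for u in pool if u != others[0]]
print(owner, rendezvous_select(uri, pool_without_other))
```

Only the URIs owned by the removed server are reassigned, so the remaining caches keep their contents warm.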

Random #

Picks a backend server at random. This is the policy Caddy uses when lb_policy is not set:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy random
    }
}

Weighted Round Robin #

Distributes requests in proportion to the configured weights:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        # Weights are arguments of the policy, listed in the same order as the upstreams
        lb_policy weighted_round_robin 5 3 2
    }
}

Request distribution ratio: Server1:Server2:Server3 = 5:3:2

Typical use cases:

  • Servers with uneven capacity
  • Gradual (canary-style) rollouts
  • Traffic control
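
To see where the 5:3:2 ratio comes from, the sketch below walks a request counter over the cumulative weights. It is an illustration under simplifying assumptions (all servers healthy, positive total weight), and the `WeightedRoundRobin` class is a hypothetical name rather than Caddy's API.

```python
from collections import Counter


class WeightedRoundRobin:
    """Hypothetical selector: each cycle of sum(weights) requests follows the weights."""

    def __init__(self, upstreams, weights):
        if len(upstreams) != len(weights):
            raise ValueError("one weight per upstream is required")
        if sum(weights) <= 0:
            raise ValueError("total weight must be positive")
        self.upstreams = list(upstreams)
        self.weights = list(weights)
        self.total = sum(weights)
        self.counter = 0

    def select(self):
        slot = self.counter % self.total
        self.counter += 1
        cumulative = 0
        for upstream, weight in zip(self.upstreams, self.weights):
            cumulative += weight
            if slot < cumulative:
                return upstream
        raise AssertionError("unreachable: slot is always below the total weight")


wrr = WeightedRoundRobin(["Server1", "Server2", "Server3"], [5, 3, 2])
counts = Counter(wrr.select() for _ in range(10))
print(counts)
```

This simple variant sends consecutive requests to the same server within a cycle; smoother interleaving is possible but does not change the overall ratio.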

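Weighted Least Connections #

Combines weights with connection counts. Caddy's standard reverse_proxy directive documents no weighted_least_conn policy and no per-upstream weight option, so this behavior would need a third-party selection policy module. Among the built-in policies, the closest option is plain least_conn:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy least_conn
    }
}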

First Available (First) #

Always selects the first available server in the order the servers are listed:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy first
    }
}

Typical use cases:

  • Primary/backup failover
  • Servers with a fixed priority order

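Policy comparison #

| Policy | Characteristic | Best suited for |
|---|---|---|
| round_robin | Even distribution | Servers with similar capacity |
| least_conn | Adapts to current load | Requests whose processing times vary widely |
| ip_hash | Session persistence | Workloads that need sticky sessions |
| uri_hash | Cache friendly | Cache servers |
| random | Simple and fast | Test environments |
| weighted_round_robin | Weight control | Uneven capacity, gradual rollouts |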

Health checks #

Passive health checks #

Passive checks judge a server's health by observing the results of real proxied requests:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
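        # Passive health check settings
        fail_duration 30s        # how long a failure is remembered
        max_fails 3              # failures within fail_duration before the server is marked down
        unhealthy_status 500 502 503 504  # response codes counted as failures
        unhealthy_latency 5s     # responses slower than this count as failures
        unhealthy_request_count 100  # mark the server unhealthy at this many in-flight requests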
    }
}
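
The interaction of fail_duration and max_fails can be pictured as a sliding window of recent failures. The sketch below is a simplified model under that assumption, not Caddy's implementation; `PassiveHealth` is a hypothetical class, and times are plain seconds supplied by the caller so the example stays deterministic.

```python
from collections import deque


class PassiveHealth:
    """Hypothetical failure window for one upstream."""

    def __init__(self, max_fails, fail_duration):
        self.max_fails = max_fails
        self.fail_duration = fail_duration
        self.failures = deque()  # timestamps of remembered failures, oldest first

    def record_failure(self, now):
        self.failures.append(now)

    def is_healthy(self, now):
        # Forget failures that are older than fail_duration.
        while self.failures and now - self.failures[0] >= self.fail_duration:
            self.failures.popleft()
        # The upstream is down once max_fails failures are still remembered.
        return len(self.failures) < self.max_fails


health = PassiveHealth(max_fails=3, fail_duration=30)
health.record_failure(0)
health.record_failure(1)
healthy_after_two = health.is_healthy(1)
health.record_failure(2)
healthy_after_three = health.is_healthy(3)
healthy_after_expiry = health.is_healthy(30)
print(healthy_after_two, healthy_after_three, healthy_after_expiry)
```

In this model a server recovers on its own once enough old failures have aged out of the window, without any probe being sent.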

Active health checks #

Active checks send a health check request to each server at a fixed interval:

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
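        # Active health checks
        health_uri /health
        health_port 3000
        health_interval 10s
        health_timeout 5s
        health_status 200
        health_body "OK"         # regular expression matched against the response body
        
        # Headers sent with each health check request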
        health_headers {
            User-Agent "Caddy-Health-Check"
            X-Health-Check "true"
        }
    }
}
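
How health_status and health_body combine can be sketched as a small predicate over a probe response: the status must match, and the body must match the expected pattern if one is configured. This is an illustrative assumption about the decision logic only (no HTTP request is made), and `probe_passes` is a hypothetical function name.

```python
import re


def probe_passes(status, body, expect_status=200, expect_body=None):
    """Decide whether one health probe response counts as healthy."""
    if status != expect_status:
        return False
    # The body pattern is treated as a regular expression, searched anywhere in the body.
    if expect_body is not None and re.search(expect_body, body) is None:
        return False
    return True


ok = probe_passes(200, "OK", expect_status=200, expect_body="OK")
wrong_status = probe_passes(503, "OK", expect_status=200, expect_body="OK")
wrong_body = probe_passes(200, "degraded", expect_status=200, expect_body="OK")
print(ok, wrong_status, wrong_body)
```

Checking the body as well as the status catches servers that answer 200 while reporting a degraded state.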

Complete health check configuration #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
        # Active checks
        health_uri /health
        health_interval 30s
        health_timeout 10s
        health_status 200
        health_body "healthy"
        
        # Passive checks
        fail_duration 60s
        max_fails 5
        unhealthy_status 500 502 503 504
        unhealthy_latency 10s
        
        # Retries (the Caddyfile options are lb_retries and lb_try_duration)
        lb_retries 3
        lb_try_duration 30s
    }
}

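Health check parameters #

| Parameter | Description | Default |
|---|---|---|
| health_uri | Health check path | / |
| health_port | Port to check | Backend port |
| health_interval | Interval between checks | 30s |
| health_timeout | Timeout for a check | 5s |
| health_status | Expected status code | 2xx |
| health_body | Expected response body (regular expression) | - |
| health_headers | Headers sent with the check request | - |
| health_follow_redirects | Follow redirects | false |
| fail_duration | How long a failure is remembered | 0 |
| max_fails | Failures before a server is marked down | 1 |
| unhealthy_status | Status codes counted as failures | - |
| unhealthy_latency | Latency counted as a failure | - |

Active checks run only when health_uri or health_port is set, and passive checks run only when fail_duration is greater than 0.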

Session persistence #

IP hash session persistence #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy ip_hash
    }
}

Cookie session persistence #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
        # Cookie-based session persistence: lb_policy cookie [<name> [<secret>]]
        # (options for the cookie lifetime depend on the Caddy version)
        lb_policy cookie SERVERID your-secret-key
    }
}
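
One way a cookie can pin a client to a server without exposing the server address is to store a keyed hash of the address in the cookie. The sketch below assumes HMAC-SHA256 over the server address; it illustrates the general idea, and the exact scheme Caddy uses may differ. `cookie_value` and `upstream_for_cookie` are hypothetical helpers.

```python
import hashlib
import hmac


def cookie_value(upstream, secret):
    """Derive the cookie value for an upstream from a shared secret."""
    return hmac.new(secret.encode("utf-8"), upstream.encode("utf-8"), hashlib.sha256).hexdigest()


def upstream_for_cookie(cookie, upstreams, secret):
    """Return the upstream whose keyed hash matches the cookie, or None."""
    for upstream in upstreams:
        expected = cookie_value(upstream, secret)
        # compare_digest avoids leaking information through timing differences.
        if hmac.compare_digest(expected.encode("utf-8"), cookie.encode("utf-8")):
            return upstream
    return None


pool = ["server1:3000", "server2:3000", "server3:3000"]
secret = "your-secret-key"
issued = cookie_value("server2:3000", secret)
pinned = upstream_for_cookie(issued, pool, secret)
forged = upstream_for_cookie("forged-value", pool, secret)
print(pinned, forged)
```

When the cookie matches no server, for example because the pinned server was removed, a balancer would fall back to another policy and issue a new cookie.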

Header session persistence #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
        # Route by the value of a request header: lb_policy header [<field>]
        lb_policy header X-Server-ID
    }
}

Service discovery #

Static configuration #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
    }
}

DNS SRV discovery #

caddyfile
example.com {
    reverse_proxy {
        dynamic srv {
            service app
            proto tcp
            name example.internal    # placeholder domain; the lookup is _app._tcp.<name>
            refresh 10s
        }
    }
}

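DNS A record discovery #

caddyfile
example.com {
    reverse_proxy {
        # The module for A/AAAA lookups is named "a"; it needs a port because
        # A records carry none (3000 matches the other examples on this page)
        dynamic a {
            name backend.example.com
            port 3000
            refresh 10s
        }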
    }
}

Failover #

Basic failover #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
        # Failover settings
        fail_duration 30s
        max_fails 3
        lb_retries 2
        lb_try_duration 10s
    }
}
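
The retry settings bound how long a single request may be retried against other servers. The loop below is a simplified sketch of that behavior: it stops after a fixed number of retries or once a time budget is used up, whichever comes first. The function name, the `send` callback, and the in-order walk through the pool are assumptions for illustration; in Caddy the next server is chosen by the load balancing policy.

```python
import time


def proxy_with_retries(upstreams, send, retries=2, try_duration=10.0, clock=time.monotonic):
    """Try a request on successive upstreams until one succeeds or the budget runs out."""
    deadline = clock() + try_duration
    last_error = None
    for attempt in range(retries + 1):  # the first attempt plus `retries` retries
        if attempt > 0 and clock() >= deadline:
            break
        upstream = upstreams[attempt % len(upstreams)]
        try:
            return send(upstream)
        except ConnectionError as exc:
            last_error = exc
    raise RuntimeError("all attempts failed") from last_error


calls = []


def fake_send(upstream):
    """Stand-in for a real request: server1 is down, the others answer."""
    calls.append(upstream)
    if upstream == "server1:3000":
        raise ConnectionError("connection refused")
    return f"200 from {upstream}"


pool = ["server1:3000", "server2:3000", "server3:3000"]
result = proxy_with_retries(pool, fake_send)
print(result, calls)
```

The client sees a single successful response even though the first server refused the connection.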

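Primary/backup mode #

caddyfile
example.com {
    reverse_proxy {
        # Primary server
        to primary:3000
        
        # Backup server
        to backup:3000
        
        # The first policy always picks the first available server
        lb_policy first
        
        # Switch to the backup as soon as the primary records one failure
        fail_duration 30s
        max_fails 1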
    }
}

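Multi-data-center failover #

caddyfile
example.com {
    reverse_proxy {
        # Primary data center
        to dc1-server1:3000
        to dc1-server2:3000
        
        # Standby data center
        to dc2-server1:3000
        to dc2-server2:3000
        
        # Weights follow the server order; the standby data center still
        # receives 4 of every 14 requests while all servers are healthy
        lb_policy weighted_round_robin 5 5 2 2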
        
        health_uri /health
        health_interval 10s
    }
}

Connection pool management #

Connection pool configuration #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        
        transport http {
            # Connection pool settings
            keepalive 90s
            keepalive_idle_conns 50
            keepalive_idle_conns_per_host 10
            
            # Connection timeouts
            dial_timeout 5s
            tls_timeout 5s
        }
    }
}

Connection limits #

caddyfile
example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        
        transport http {
            # Maximum number of connections per host
            max_conns_per_host 100
            
            # Disable connection reuse
            # keepalive off
        }
    }
}

Monitoring and debugging #

Inspecting backend status #

bash
# List the status of all backends
curl localhost:2019/reverse_proxy/upstreams | jq

# Example output (illustrative values; num_requests counts in-flight requests)
[
    {
        "address": "server1:3000",
        "num_requests": 4,
        "fails": 2
    },
    {
        "address": "server2:3000",
        "num_requests": 3,
        "fails": 1
    }
]
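
The JSON returned by this endpoint is easy to post-process in a script. The sketch below parses a sample payload and lists the backends that have recorded failures. It relies only on the `address` and `fails` fields, and the sample string stands in for a live response, which could be fetched with `urllib.request.urlopen("http://localhost:2019/reverse_proxy/upstreams")` on a machine where the admin API is reachable.

```python
import json


def upstreams_with_failures(payload):
    """Return the addresses of upstreams whose failure counter is above zero."""
    return [item["address"] for item in json.loads(payload) if item.get("fails", 0) > 0]


# Sample payload standing in for the admin API response (illustrative values).
sample = """
[
    {"address": "server1:3000", "fails": 2},
    {"address": "server2:3000", "fails": 0}
]
"""
flagged = upstreams_with_failures(sample)
print(flagged)
```

A check like this can feed an alert when a backend keeps accumulating failures.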

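Adding and removing backends dynamically #

bash
# Add a backend (POST appends to the upstreams array; the routes/0/handle/0
# path is an example and depends on how your Caddyfile is adapted to JSON)
curl -X POST \
    localhost:2019/config/apps/http/servers/srv0/routes/0/handle/0/upstreams \
    -H "Content-Type: application/json" \
    -d '{"dial": "server4:3000"}'

# Remove the backend at index 0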
curl -X DELETE \
    localhost:2019/config/apps/http/servers/srv0/routes/0/handle/0/upstreams/0

Complete examples #

Highly available web service #

caddyfile
example.com {
    reverse_proxy {
        # Backend servers
        to web-server-1:3000
        to web-server-2:3000
        to web-server-3:3000
        to web-server-4:3000
        
        # Load balancing policy (weights follow the server order)
        lb_policy weighted_round_robin 3 3 2 2
        
        # Health checks
        health_uri /health
        health_interval 10s
        health_timeout 5s
        health_status 200
        
        # Failure handling
        fail_duration 60s
        max_fails 3
        lb_retries 2
        lb_try_duration 30s
        
        # Proxy headers: Caddy already passes Host through and sets
        # X-Forwarded-For and X-Forwarded-Proto, so only X-Real-IP is added
        header_up X-Real-IP {remote_host}
        
        # Connection pool
        transport http {
            keepalive 90s
            keepalive_idle_conns 50
            dial_timeout 5s
        }
    }
    
    # Logging
    log {
        output file /var/log/caddy/lb.log
        format json
    }
}

API gateway load balancing #

caddyfile
api.example.com {
    # User service
    handle /users/* {
        reverse_proxy {
            to user-service-1:3001
            to user-service-2:3001
            to user-service-3:3001
            lb_policy least_conn
            health_uri /health
        }
    }
    
    # Order service
    handle /orders/* {
        reverse_proxy {
            to order-service-1:3002
            to order-service-2:3002
            lb_policy least_conn
            health_uri /health
        }
    }
    
    # Product service
    handle /products/* {
        reverse_proxy {
            to product-service-1:3003
            to product-service-2:3003
            lb_policy round_robin
            health_uri /health
        }
    }
    
    # WebSocket service
    handle /ws/* {
        reverse_proxy {
            to ws-service-1:3004
            to ws-service-2:3004
            lb_policy ip_hash
            
            transport http {
                read_timeout 0
                write_timeout 0
            }
        }
    }
}

Blue-green deployment #

caddyfile
example.com {
    # Blue is the current production environment; green has weight 0
    # (zero weights require a Caddy version that supports them)
    reverse_proxy {
        to blue-server-1:3000
        to blue-server-2:3000
        to green-server-1:3000
        to green-server-2:3000
        lb_policy weighted_round_robin 10 10 0 0
    }
}

Switching to the green environment:

bash
# Switch through the admin API. In the JSON config the weights belong to the
# selection policy, not to the individual upstreams; the routes/0/handle/0
# path is an example and depends on how your Caddyfile is adapted to JSON.
curl -X PATCH \
    localhost:2019/config/apps/http/servers/srv0/routes/0/handle/0/load_balancing/selection_policy \
    -H "Content-Type: application/json" \
    -d '{"policy": "weighted_round_robin", "weights": [0, 0, 10, 10]}'

Canary release #

caddyfile
example.com {
    reverse_proxy {
        # Stable version
        to stable-1:3000
        to stable-2:3000
        
        # Canary version
        to canary-1:3000
        to canary-2:3000
        
        # Weights follow the server order: stable 9 and 9, canary 1 and 1
        lb_policy weighted_round_robin 9 9 1 1
        health_uri /health
    }
}
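
The share of traffic that reaches the canary follows directly from the weights. The short calculation below, using the weights 9, 9, 1 and 1 from this example, shows that the two canary servers together receive 10% of requests when all servers are healthy; `traffic_share` is a hypothetical helper written for this illustration.

```python
def traffic_share(weights):
    """Convert a mapping of server name to weight into fractions of total traffic."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {name: weight / total for name, weight in weights.items()}


shares = traffic_share({"stable-1": 9, "stable-2": 9, "canary-1": 1, "canary-2": 1})
canary_share = shares["canary-1"] + shares["canary-2"]
print(f"canary share: {canary_share:.0%}")
```

Raising the canary weights step by step, for example to 3 and 3, increases the share gradually while the stable servers keep carrying most of the load.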

Best practices #

1. Health check configuration #

caddyfile
# Recommended: enable both active and passive checks
reverse_proxy {
    # Active checks
    health_uri /health
    health_interval 10s
    health_timeout 5s
    
    # Passive checks
    fail_duration 30s
    max_fails 3
}

2. Timeout settings #

caddyfile
# Set timeouts to match the workload
reverse_proxy {
    transport http {
        dial_timeout 5s
        read_timeout 30s
        write_timeout 30s
    }
}

3. Connection pool tuning #

caddyfile
# Size the connection pool according to the expected concurrency
reverse_proxy {
    transport http {
        keepalive 90s
        keepalive_idle_conns 100
        keepalive_idle_conns_per_host 20
    }
}

4. Logging #

caddyfile
# Log load-balanced requests
reverse_proxy {
    # ...
}

log {
    output file /var/log/caddy/lb.log
    format json
}

Next steps #

With load balancing configured, continue with Virtual Hosts to learn how to set up multiple sites.

Last updated: 2026-03-28