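Load Balancing #

Overview #

Load balancing is a core building block of distributed systems. Caddy's `reverse_proxy` directive provides it with a choice of selection policies, active and passive health checks, and session persistence.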

text

┌─────────────────────────────────────────────────────────────┐
│                 Load balancing architecture                 │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│                    ┌─────────┐                              │
│                    │ Client  │                              │
│                    └────┬────┘                              │
│                         │                                   │
│                         ▼                                   │
│                    ┌─────────┐                              │
│                    │  Caddy  │                              │
│                    │   LB    │                              │
│                    └────┬────┘                              │
│                         │                                   │
│         ┌───────────────┼───────────────┐                   │
│         │               │               │                   │
│         ▼               ▼               ▼                   │
│    ┌─────────┐     ┌─────────┐     ┌─────────┐              │
│    │ Server1 │     │ Server2 │     │ Server3 │              │
│    └─────────┘     └─────────┘     └─────────┘              │
│                                                             │
└─────────────────────────────────────────────────────────────┘

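Load Balancing Policies #

Round Robin #

Hands requests to each upstream in turn. Round robin is not Caddy's default: when `lb_policy` is omitted, Caddy uses `random`, so the policy has to be set explicitly: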

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy round_robin
    }
}

Resulting request order:

text

Request 1 → Server1
Request 2 → Server2
Request 3 → Server3
Request 4 → Server1
Request 5 → Server2
...
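
The rotation above can be sketched in a few lines of Python. This is only an illustration of the idea, not Caddy's implementation; the function name `round_robin` is made up for this example.

```python
from itertools import cycle


def round_robin(upstreams):
    """Return an iterator that yields the upstreams in a fixed, repeating order."""
    return cycle(upstreams)


picker = round_robin(["Server1", "Server2", "Server3"])
first_five = [next(picker) for _ in range(5)]
print(first_five)  # ['Server1', 'Server2', 'Server3', 'Server1', 'Server2']
```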

Least Connections #

Sends each request to the upstream that currently has the fewest in-flight requests:

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy least_conn
    }
}

Use cases:

- Request processing times vary widely
- Load needs to be evened out across upstreams
- Long-lived connections
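
As a rough sketch of the selection step, assuming the balancer tracks a per-upstream count of in-flight requests (the `active` mapping and the function name are hypothetical, and a real balancer may break ties differently):

```python
def pick_least_conn(active):
    """Return the upstream with the fewest in-flight requests.

    `active` maps an upstream address to its current in-flight request count.
    Ties go to the upstream listed first, which is an arbitrary choice here.
    """
    return min(active, key=active.get)


active = {"server1:3000": 12, "server2:3000": 3, "server3:3000": 7}
choice = pick_least_conn(active)
active[choice] += 1  # the chosen upstream now carries one more request
print(choice)  # server2:3000
```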

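IP Hash #

Chooses the upstream by hashing the client's IP address, so requests from the same IP keep landing on the same upstream for as long as the set of available upstreams stays the same: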

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy ip_hash
    }
}

Use cases:

- Session persistence is required
- Stateless services that keep a local cache
- Reducing cache misses
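
The idea of hash-based stickiness can be illustrated as follows. This is a minimal sketch that uses CRC32 with a modulo, chosen only for brevity; Caddy's actual hash function and its behaviour when upstreams are added or removed may differ.

```python
import zlib


def pick_by_ip(client_ip, upstreams):
    """Map a client IP to an upstream using a deterministic hash."""
    digest = zlib.crc32(client_ip.encode("utf-8"))
    return upstreams[digest % len(upstreams)]


upstreams = ["server1:3000", "server2:3000", "server3:3000"]
# The same IP always maps to the same upstream while the list is unchanged.
print(pick_by_ip("203.0.113.7", upstreams) == pick_by_ip("203.0.113.7", upstreams))
```

Note that plain modulo hashing remaps many clients whenever the upstream list changes, which is why production balancers often use consistent or rendezvous hashing instead.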

URI Hash #

Chooses the upstream by hashing the request URI:

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy uri_hash
    }
}

Use cases:

- Cache servers
- Static asset delivery
- CDN-style deployments

Random #

Picks an upstream at random. This is the policy Caddy uses when `lb_policy` is not set:

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy random
    }
}

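Weighted Round Robin #

Distributes requests in proportion to per-upstream weights. The weights are arguments to `lb_policy weighted_round_robin`, listed in the same order as the upstreams; `to` does not accept a `weight` block:

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        # Weights map to the upstreams above, in order
        lb_policy weighted_round_robin 5 3 2
    }
}

Request ratio: Server1:Server2:Server3 = 5:3:2

Use cases:

- Upstreams with unequal capacity
- Gradual (gray) rollouts
- Traffic shaping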
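
To see how a 5:3:2 ratio plays out, here is a small simulation. It is a sketch of one simple weighted rotation (each upstream repeated `weight` times per cycle); the function name is hypothetical and other implementations interleave the picks differently.

```python
from collections import Counter


def weighted_round_robin(upstreams, weights):
    """Yield upstreams forever, repeating each one `weight` times per cycle."""
    if len(upstreams) != len(weights) or sum(weights) <= 0:
        raise ValueError("need one weight per upstream and a positive total")
    while True:
        for upstream, weight in zip(upstreams, weights):
            for _ in range(weight):
                yield upstream


picker = weighted_round_robin(["Server1", "Server2", "Server3"], [5, 3, 2])
counts = Counter(next(picker) for _ in range(100))
print(counts)  # Server1: 50, Server2: 30, Server3: 20
```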

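Weighted Least Connections #

Combining weights with connection counts is a common policy in other load balancers, but the standard Caddy distribution does not ship a `weighted_least_conn` policy. The nearest built-in options are `least_conn` (connection-aware, unweighted) and `weighted_round_robin` (weighted, not connection-aware):

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        # Connection-aware selection; this policy does not take weights
        lb_policy least_conn
    }
}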

First Available #

Picks the first available upstream, in the order the upstreams are listed:

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy first
    }
}

Use cases:

- Primary/backup failover
- Prioritized upstreams

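Policy Comparison #

| Policy | Characteristic | Typical use |
| --- | --- | --- |
| `round_robin` | Even rotation | Upstreams with similar capacity |
| `least_conn` | Adapts to current load | Request durations vary widely |
| `ip_hash` | Session persistence | Sticky sessions |
| `uri_hash` | Cache friendly | Cache servers |
| `random` | Simple and fast (Caddy's default) | Test environments |
| `weighted_round_robin` | Weight control | Unequal capacity, gradual rollouts |
| `first` | Fixed priority order | Primary/backup failover |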

Health Checks #

Passive Health Checks #

Passive checks judge an upstream's health from the outcome of real proxied requests:

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
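        # Passive health checks
        fail_duration 30s        # how long a failure is remembered
        max_fails 3              # failures within fail_duration before the upstream is marked down
        unhealthy_status 500 502 503 504  # responses with these statuses count as failures
        unhealthy_latency 5s     # responses slower than this count as failures
        unhealthy_request_count 100  # upstream is treated as down above this many concurrent requests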
    }
}
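
The interplay between the failure count and the failure window can be sketched like this. It is a simplified model with made-up names (`PassiveHealth`, `record_failure`), assuming an upstream counts as down once the number of remembered failures reaches the limit.

```python
class PassiveHealth:
    """Track recent failures for one upstream (times are plain seconds)."""

    def __init__(self, max_fails, fail_duration):
        self.max_fails = max_fails
        self.fail_duration = fail_duration
        self.failures = []  # timestamps of remembered failures

    def record_failure(self, now):
        self.failures.append(now)

    def is_healthy(self, now):
        # Forget failures that are older than the failure window.
        self.failures = [t for t in self.failures if now - t < self.fail_duration]
        return len(self.failures) < self.max_fails


health = PassiveHealth(max_fails=3, fail_duration=30.0)
for t in (0.0, 1.0, 2.0):
    health.record_failure(t)
print(health.is_healthy(3.0))   # False: three failures inside the window
print(health.is_healthy(31.5))  # True: only the failure at t=2.0 is still remembered
```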

Active Health Checks #

Active checks send a probe request to each upstream at a fixed interval:

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        
        # Active health checks
        health_uri /health
        health_port 3000
        health_interval 10s
        health_timeout 5s
        health_status 200
        health_body "OK"
        
        # Headers sent with each probe
        health_headers {
            User-Agent "Caddy-Health-Check"
            X-Health-Check "true"
        }
    }
}
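
How a probe response might be judged against the expected status and body can be sketched as below. The function `probe_passes` is hypothetical and assumes the expected status is either an exact code such as `200` or a class such as `2xx`, and that the body expectation is treated as a regular expression.

```python
import re


def probe_passes(status, body, expect_status="200", expect_body=None):
    """Return True if a probe response meets the configured expectations."""
    if expect_status.endswith("xx"):
        # Status class such as "2xx": compare only the first digit.
        status_ok = status // 100 == int(expect_status[0])
    else:
        status_ok = status == int(expect_status)
    body_ok = expect_body is None or re.search(expect_body, body) is not None
    return status_ok and body_ok


print(probe_passes(200, "OK", "200", "OK"))        # True
print(probe_passes(200, "starting", "200", "OK"))  # False: body does not match
print(probe_passes(204, "", "2xx"))                # True: any 2xx status accepted
```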

Combined Health Check Configuration #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000

        # Active checks
        health_uri /health
        health_interval 30s
        health_timeout 10s
        health_status 200
        health_body "healthy"

        # Passive checks
        fail_duration 60s
        max_fails 5
        unhealthy_status 500 502 503 504
        unhealthy_latency 10s

        # Retries (the Caddyfile options are lb_retries and lb_try_duration)
        lb_retries 3
        lb_try_duration 30s
    }
}

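Health Check Parameters #

| Parameter | Description | Default |
| --- | --- | --- |
| `health_uri` | Path (and optional query) used for active checks | none (active checks stay off until `health_uri` or `health_port` is set) |
| `health_port` | Port to probe | the upstream's port |
| `health_interval` | Time between probes | 30s |
| `health_timeout` | Probe timeout | 5s |
| `health_status` | Expected status code or class | 2xx |
| `health_body` | Pattern the response body must match | none |
| `health_headers` | Extra request headers for probes | none |
| `health_follow_redirects` | Follow redirects during probes | off |
| `fail_duration` | How long a failure is remembered | 0 (passive checks disabled) |
| `max_fails` | Failures within `fail_duration` before the upstream is marked down | 1 |
| `unhealthy_status` | Status codes counted as failures | none |
| `unhealthy_latency` | Response time counted as a failure | none |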

Session Persistence #

IP Hash Session Persistence #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
        lb_policy ip_hash
    }
}

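Cookie Session Persistence #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000

        # Cookie-based stickiness: the cookie name and the secret are
        # arguments to the policy (there are no lb_cookie_* options).
        # Cookie lifetime settings vary by Caddy version; check the
        # reverse_proxy documentation for the version you run.
        lb_policy cookie SERVERID your-secret-key
    }
}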
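
Cookie stickiness is commonly built by storing a keyed hash of the chosen upstream in the cookie and matching it on later requests. The sketch below shows that general technique with HMAC-SHA256; the helper names are made up, and the exact cookie format Caddy uses is not asserted here.

```python
import hashlib
import hmac


def cookie_value(upstream, secret):
    """Derive an opaque cookie value from an upstream address and a secret."""
    return hmac.new(secret.encode(), upstream.encode(), hashlib.sha256).hexdigest()


def pick_by_cookie(cookie, upstreams, secret):
    """Return the upstream whose keyed hash matches the cookie, else None."""
    if cookie is None:
        return None
    for upstream in upstreams:
        if hmac.compare_digest(cookie, cookie_value(upstream, secret)):
            return upstream
    return None  # caller falls back to another policy and issues a new cookie


upstreams = ["server1:3000", "server2:3000", "server3:3000"]
issued = cookie_value("server2:3000", "your-secret-key")
print(pick_by_cookie(issued, upstreams, "your-secret-key"))  # server2:3000
```

Because the value is keyed with a secret, clients cannot forge a cookie that pins them to an upstream of their choosing.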

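Header Session Persistence #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000

        # Header-based stickiness: the header name is an argument to the
        # policy (there is no lb_header option)
        lb_policy header X-Server-ID
    }
}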

Service Discovery #

Static Configuration #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000
    }
}

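DNS SRV Discovery #

caddyfile

example.com {
    reverse_proxy {
        dynamic srv {
            service app
            proto tcp
            # Placeholder domain: together with service and proto this
            # looks up _app._tcp.example.com; substitute your own
            name example.com
            refresh 10s
        }
    }
}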

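DNS A Record Discovery #

caddyfile

example.com {
    reverse_proxy {
        # The module for A/AAAA lookups is named "a"
        dynamic a {
            name backend.example.com
            # A records carry no port, so it must be given here
            port 3000
            refresh 10s
        }
    }
}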

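Failover #

Basic Failover #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        to server3:3000

        # Mark an upstream down after 3 failures within 30s
        fail_duration 30s
        max_fails 3
        # Retry a failed request up to 2 times, stopping early after 10s
        lb_retries 2
        lb_try_duration 10s
    }
}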
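
A bounded retry loop of the kind configured above might look like this. It is a deliberately simplified sketch: upstreams are tried in list order, the time limit is ignored, and `send` is a hypothetical callback that returns True on success.

```python
def proxy_with_retries(upstreams, send, retries):
    """Try upstreams until one succeeds or the retry budget is spent.

    Returns (upstream or None, number of attempts made).
    One initial attempt plus at most `retries` further attempts are allowed.
    """
    attempts = 0
    for upstream in upstreams:
        if attempts > retries:
            break
        attempts += 1
        if send(upstream):
            return upstream, attempts
    return None, attempts


down = {"server1:3000"}
result = proxy_with_retries(
    ["server1:3000", "server2:3000", "server3:3000"],
    send=lambda upstream: upstream not in down,
    retries=2,
)
print(result)  # ('server2:3000', 2)
```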

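Primary/Backup Mode #

caddyfile

example.com {
    reverse_proxy {
        # Primary
        to primary:3000

        # Backup
        to backup:3000

        # "first" always prefers the earliest available upstream
        lb_policy first

        # Switch to the backup after a single failure; the primary is
        # considered again once the failure is 30s old
        fail_duration 30s
        max_fails 1
    }
}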
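
The selection rule behind primary/backup mode is small enough to sketch directly. The `healthy` mapping and the function name are assumptions made for this illustration.

```python
def pick_first_available(upstreams, healthy):
    """Return the first upstream, in list order, that is currently healthy."""
    for upstream in upstreams:
        if healthy.get(upstream, False):
            return upstream
    return None  # nothing is available


upstreams = ["primary:3000", "backup:3000"]
print(pick_first_available(upstreams, {"primary:3000": True, "backup:3000": True}))   # primary:3000
print(pick_first_available(upstreams, {"primary:3000": False, "backup:3000": True}))  # backup:3000
```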

Multi-Data-Center Failover #

caddyfile

example.com {
    reverse_proxy {
        # Primary data center
        to dc1-server1:3000
        to dc1-server2:3000

        # Secondary data center
        to dc2-server1:3000
        to dc2-server2:3000

        # Weights follow the upstream order above
        lb_policy weighted_round_robin 5 5 2 2

        health_uri /health
        health_interval 10s
    }
}

Connection Pool Management #

Connection Pool Settings #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        
        transport http {
            # Keep-alive pool
            keepalive 90s
            keepalive_idle_conns 50
            keepalive_idle_conns_per_host 10
            
            # Connection timeouts
            dial_timeout 5s
            tls_timeout 5s
        }
    }
}

Connection Limits #

caddyfile

example.com {
    reverse_proxy {
        to server1:3000
        to server2:3000
        
        transport http {
            # Maximum connections per upstream host
            max_conns_per_host 100
            
            # Disable connection reuse
            # keepalive off
        }
    }
}

Monitoring and Debugging #

Checking Upstream Status #

bash

# List the upstreams known to the admin API
curl localhost:2019/reverse_proxy/upstreams | jq

# Example output (num_requests = requests currently in flight,
# fails = failures currently remembered by passive health checks)
[
    {
        "address": "server1:3000",
        "num_requests": 4,
        "fails": 2
    },
    {
        "address": "server2:3000",
        "num_requests": 1,
        "fails": 0
    }
]
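
For scripted checks, the JSON from the status endpoint can be filtered in a few lines. This sketch assumes each entry carries `address`, `num_requests` and `fails` fields, and the function name is invented for the example; adjust the field names if your Caddy version reports something different.

```python
import json


def upstreams_with_failures(payload):
    """Return the addresses of upstreams whose `fails` counter is non-zero."""
    return [u["address"] for u in json.loads(payload) if u.get("fails", 0) > 0]


sample = """[
  {"address": "server1:3000", "num_requests": 4, "fails": 2},
  {"address": "server2:3000", "num_requests": 1, "fails": 0}
]"""
print(upstreams_with_failures(sample))  # ['server1:3000']
```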

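Adding and Removing Upstreams at Runtime #

bash

# The config path below depends on how your config is laid out
# (Caddyfile-generated configs usually nest reverse_proxy inside a
# subroute handler). Inspect the structure first with:
#   curl localhost:2019/config/ | jq

# Append a new upstream
curl -X POST \
    localhost:2019/config/apps/http/servers/srv0/routes/0/handle/0/upstreams \
    -H "Content-Type: application/json" \
    -d '{"dial": "server4:3000"}'

# Remove the upstream at index 0
curl -X DELETE \
    localhost:2019/config/apps/http/servers/srv0/routes/0/handle/0/upstreams/0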

Complete Examples #

Highly Available Web Service #

caddyfile

example.com {
    reverse_proxy {
        # Upstreams
        to web-server-1:3000
        to web-server-2:3000
        to web-server-3:3000
        to web-server-4:3000

        # Selection policy; weights follow the upstream order above
        lb_policy weighted_round_robin 3 3 2 2

        # Active health checks
        health_uri /health
        health_interval 10s
        health_timeout 5s
        health_status 200

        # Passive health checks and retries
        fail_duration 60s
        max_fails 3
        lb_retries 2
        lb_try_duration 30s

        # Caddy already passes Host through and sets X-Forwarded-For and
        # X-Forwarded-Proto, so only the extra header is added here
        header_up X-Real-IP {remote_host}

        # Connection pool
        transport http {
            keepalive 90s
            keepalive_idle_conns 50
            dial_timeout 5s
        }
    }

    # Access log
    log {
        output file /var/log/caddy/lb.log
        format json
    }
}

API Gateway Load Balancing #

caddyfile

api.example.com {
    # User service
    handle /users/* {
        reverse_proxy {
            to user-service-1:3001
            to user-service-2:3001
            to user-service-3:3001
            lb_policy least_conn
            health_uri /health
        }
    }
    
    # Order service
    handle /orders/* {
        reverse_proxy {
            to order-service-1:3002
            to order-service-2:3002
            lb_policy least_conn
            health_uri /health
        }
    }
    
    # Product service
    handle /products/* {
        reverse_proxy {
            to product-service-1:3003
            to product-service-2:3003
            lb_policy round_robin
            health_uri /health
        }
    }
    
    # WebSocket service
    handle /ws/* {
        reverse_proxy {
            to ws-service-1:3004
            to ws-service-2:3004
            lb_policy ip_hash
            
            transport http {
                read_timeout 0
                write_timeout 0
            }
        }
    }
}

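Blue-Green Deployment #

caddyfile

example.com {
    # Blue is the current production environment
    reverse_proxy {
        to blue-server-1:3000
        to blue-server-2:3000
        to green-server-1:3000
        to green-server-2:3000
        # Weights follow the upstream order above. This assumes your Caddy
        # version accepts a weight of 0 to exclude an upstream; verify
        # against the documentation before relying on it.
        lb_policy weighted_round_robin 10 10 0 0
    }
}

Switching to green:

bash

# Switch through the admin API by replacing the selection policy's weights.
# Upstream objects have no "weight" field; the weights live on the policy.
# The config path depends on your config layout; inspect it first with
#   curl localhost:2019/config/ | jq
curl -X PATCH \
    localhost:2019/config/apps/http/servers/srv0/routes/0/handle/0/load_balancing/selection_policy \
    -H "Content-Type: application/json" \
    -d '{"policy": "weighted_round_robin", "weights": [0, 0, 10, 10]}'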

Canary Release #

caddyfile

example.com {
    reverse_proxy {
        # Stable version
        to stable-1:3000
        to stable-2:3000

        # Canary version
        to canary-1:3000
        to canary-2:3000

        # Weights follow the upstream order above
        lb_policy weighted_round_robin 9 9 1 1
        health_uri /health
    }
}
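
With weights of 9, 9, 1 and 1, the expected canary share is the canary weights divided by the total. A quick check of that arithmetic, with a helper name invented for this example:

```python
from fractions import Fraction


def traffic_share(weights, group):
    """Return the fraction of requests expected to reach the named upstreams."""
    total = sum(weights.values())
    return Fraction(sum(weights[name] for name in group), total)


weights = {"stable-1": 9, "stable-2": 9, "canary-1": 1, "canary-2": 1}
canary_share = traffic_share(weights, ["canary-1", "canary-2"])
print(canary_share, float(canary_share))  # 1/10 0.1
```

So this configuration sends roughly 10% of requests to the canary version, assuming all four upstreams are healthy.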

Best Practices #

1. Health Checks #

caddyfile

# Recommended: enable both active and passive checks
reverse_proxy {
    # Active checks
    health_uri /health
    health_interval 10s
    health_timeout 5s

    # Passive checks
    fail_duration 30s
    max_fails 3
}

2. Timeouts #

caddyfile

# Choose timeouts that match how long your backends normally take
reverse_proxy {
    transport http {
        dial_timeout 5s
        read_timeout 30s
        write_timeout 30s
    }
}

3. Connection Pool Tuning #

caddyfile

# Size the pool according to expected concurrency
reverse_proxy {
    transport http {
        keepalive 90s
        keepalive_idle_conns 100
        keepalive_idle_conns_per_host 20
    }
}

4. Logging #

caddyfile

# Log proxied requests
reverse_proxy {
    # ...
}

log {
    output file /var/log/caddy/lb.log
    format json
}

Next Steps #

Now that you have covered load balancing configuration, continue with virtual hosts to learn how to configure multiple sites.
