Caching Strategies

1. Cache Fundamentals

1.1 How Caching Works

text

┌─────────────────────────────────────────────────────────────┐
│                       Cache Workflow                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Client ──► Request ──► Check Cache ──┬──► HIT ──► Response│
│                                         │                   │
│                                         └──► MISS ──►       │
│                                                    │        │
│                                                    ▼        │
│                                              Backend Server │
│                                                    │        │
│                                                    ▼        │
│                                              Store & Return │
│                                                             │
└─────────────────────────────────────────────────────────────┘

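1.2 Cache Terminology

Term	Description
TTL	Time To Live: how long a cached object is considered fresh
Grace	Period after the TTL expires during which the stale object may still be served
Keep	Period after grace during which the object is retained only for conditional revalidation against the backend
Stale	Past its TTL but still usable
Fresh	Still within its TTL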

1.3 Cache Object Lifecycle

text

┌────────────────────────────────────────────────────────┐
│                 Cache Object Lifecycle                 │
├────────────────────────────────────────────────────────┤
│                                                        │
│  Create ──► TTL ──► Grace ──► Keep ──► Delete          │
│             │         │        │                       │
│             │         │        └── revalidation only   │
│             │         └── may be served stale          │
│             └── served fresh                           │
│                                                        │
└────────────────────────────────────────────────────────┘
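
The lifecycle above can be sketched as a small shell function that classifies an object by its age. This is an illustration only: `cache_state` is a hypothetical helper, not a Varnish command, and it assumes every value is given in whole seconds.

```bash
# cache_state AGE TTL GRACE KEEP  (hypothetical helper, all values in seconds)
# Prints which lifecycle phase an object of the given age is in.
cache_state() (
    age=$1 ttl=$2 grace=$3 keep=$4
    if [ "$age" -le "$ttl" ]; then
        echo fresh      # within TTL: served normally
    elif [ "$age" -lt $(( ttl + grace )) ]; then
        echo stale      # within grace: may still be served
    elif [ "$age" -lt $(( ttl + grace + keep )) ]; then
        echo keep       # within keep: only used for conditional revalidation
    else
        echo expired    # eligible for removal
    fi
)

# Example with TTL 120s, grace 60s, keep 300s
cache_state 30 120 60 300
cache_state 150 120 60 300
cache_state 200 120 60 300
cache_state 500 120 60 300
```

The four calls walk one object through each phase in turn as its age grows.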

2. TTL Configuration

2.1 Default TTL

bash

# Set the default TTL at startup
varnishd -t 120  # default_ttl = 120 seconds

bash

# Change it at runtime
varnishadm param.set default_ttl 300

2.2 Setting the TTL in VCL

vcl

# Each block below is a separate example. VCL runs top to bottom, so if the
# blocks are combined as written, a later assignment overrides an earlier one.
sub vcl_backend_response {
    # Fixed TTL
    set beresp.ttl = 1h;
    
    # By content type
    if (beresp.http.Content-Type ~ "text/html") {
        set beresp.ttl = 5m;
    } elseif (beresp.http.Content-Type ~ "image/") {
        set beresp.ttl = 1d;
    } elseif (beresp.http.Content-Type ~ "application/javascript") {
        set beresp.ttl = 1d;
    }
    
    # By URL
    if (bereq.url ~ "^/static/") {
        set beresp.ttl = 7d;
    } elseif (bereq.url ~ "^/api/") {
        set beresp.ttl = 10s;
    }
    
    # By status code
    if (beresp.status == 200) {
        set beresp.ttl = 1h;
    } elseif (beresp.status == 404) {
        set beresp.ttl = 1m;
    } elseif (beresp.status >= 500) {
        set beresp.ttl = 0s;
    }
}

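2.3 Setting the TTL from HTTP Headers

vcl

# std.duration() requires "import std;" at the top of the VCL file.
sub vcl_backend_response {
    # Cache-Control (s-maxage, max-age) and Expires are parsed by Varnish
    # before this subroutine runs, so beresp.ttl already reflects them.
    
    # Custom TTL header, e.g. "X-Cache-TTL: 5m"; falls back to 1h if it cannot be parsed
    if (beresp.http.X-Cache-TTL) {
        set beresp.ttl = std.duration(beresp.http.X-Cache-TTL, 1h);
        unset beresp.http.X-Cache-TTL;
    }
    
    # Responses marked as not cacheable
    if (beresp.http.Cache-Control ~ "no-cache" ||
        beresp.http.Cache-Control ~ "private" ||
        beresp.http.Pragma ~ "no-cache") {
        # Keep the "do not cache" marker for a while. With a TTL of 0s it
        # expires at once and concurrent requests for this URL queue up.
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
}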

2.4 TTL Time Units

Unit	Meaning	Example
ms	milliseconds	500ms
s	seconds	60s
m	minutes	5m
h	hours	1h
d	days	7d
w	weeks	1w
y	years	1y
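
To make the unit table concrete, here is a minimal sketch of a converter from a duration string to seconds. `to_seconds` is a hypothetical helper written for this page, not a Varnish tool; it assumes a whole number followed by a single-letter unit and treats a year as 365 days.

```bash
# to_seconds DURATION  (hypothetical helper)
# Converts a duration such as "5m" into whole seconds.
# Fractions and "ms" are not handled by this sketch.
to_seconds() (
    value=${1%[smhdwy]}     # numeric part, e.g. "5"
    unit=${1#"$value"}      # trailing unit letter, e.g. "m"
    case $value in
        ''|*[!0-9]*) echo "unsupported duration: $1" >&2; exit 1 ;;
    esac
    case $unit in
        s) factor=1 ;;
        m) factor=60 ;;
        h) factor=3600 ;;
        d) factor=86400 ;;
        w) factor=604800 ;;
        y) factor=31536000 ;;   # 365 days
        *) echo "unsupported duration: $1" >&2; exit 1 ;;
    esac
    echo $(( value * factor ))
)

to_seconds 5m
to_seconds 7d
```

A helper like this is handy when comparing a TTL from VCL with the `Age` header of a response, which is always expressed in seconds.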

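3. Grace Period

3.1 What Grace Does

Grace lets Varnish keep serving a cached object after its TTL has expired. This:

Serves stale content when the backend is down
Returns a response quickly when the backend is slow
Improves availability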

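3.2 Configuring Grace

vcl

sub vcl_backend_response {
    # Set the grace period
    set beresp.grace = 1h;
}

sub vcl_hit {
    # Still fresh: serve it
    if (obj.ttl >= 0s) {
        return (deliver);
    }
    
    # Expired but within grace: serve the stale object
    if (obj.ttl + obj.grace > 0s) {
        return (deliver);
    }
    
    # Past grace: treat it as a miss and fetch a new copy.
    # (restart would only re-run vcl_recv and find the same object.)
    # The valid return actions of vcl_hit differ between Varnish versions;
    # check the built-in VCL of the release you run.
    return (miss);
}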

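3.3 Grace and Background Refresh

vcl

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # Not expired: deliver normally
        return (deliver);
    }
    
    if (obj.ttl + obj.grace > 0s) {
        # Expired but within grace:
        # deliver the stale object; Varnish starts a background fetch for it
        return (deliver);
    }
    
    # Past grace: fetch a new copy (restart would loop back to the same object).
    # Check the built-in VCL of your Varnish version for the valid return actions.
    return (miss);
}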

3.4 Grace Best Practices

vcl

sub vcl_backend_response {
    # Use a different grace period per content type
    if (beresp.http.Content-Type ~ "text/html") {
        set beresp.grace = 5m;
    } else {
        set beresp.grace = 1h;
    }
}

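4. Cache Key Computation

4.1 Default Cache Key

By default the cache key is built from:

URL
Host header (the server IP is used when the request has no Host header)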
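
The effect of keying on both values can be illustrated with a short sketch. `cache_key` is a hypothetical helper, and `sha256sum` (GNU coreutils) merely stands in for the digest; Varnish computes its key internally and the exact bytes will differ.

```bash
# cache_key URL HOST  (hypothetical helper, illustration only)
# Derives one digest from the URL and the Host header together.
cache_key() (
    printf '%s\n%s\n' "$1" "$2" | sha256sum | cut -d' ' -f1
)

# Same path on two sites: two different keys, so two separate cache entries
cache_key /index.html example.com
cache_key /index.html example.org
```

Because the Host header is part of the key, two sites behind the same Varnish instance never share an object even when their paths are identical.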

4.2 Custom Cache Key

vcl

sub vcl_hash {
    # Base: URL
    hash_data(req.url);
    
    # Add the Host header
    if (req.http.Host) {
        hash_data(req.http.Host);
    }
    
    # Add the protocol
    hash_data(req.http.X-Forwarded-Proto);
    
    # By device type
    if (req.http.User-Agent ~ "Mobile") {
        hash_data("mobile");
    } else {
        hash_data("desktop");
    }
    
    # By language. Raw Accept-Language values differ between browsers,
    # so normalize the header in vcl_recv first to avoid fragmenting the cache.
    if (req.http.Accept-Language) {
        hash_data(req.http.Accept-Language);
    }
    
    # By a specific cookie value (this creates one cache entry per user_id)
    if (req.http.Cookie ~ "user_id") {
        hash_data(regsub(req.http.Cookie, ".*user_id=([^;]+).*", "\1"));
    }
    
    return (lookup);
}

4.3 Cache Key Strategies

Scenario 1: Multi-site caching

vcl

sub vcl_hash {
    hash_data(req.url);
    hash_data(req.http.Host);
    return (lookup);
}

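Scenario 2: Separate mobile and desktop caches

vcl

sub vcl_hash {
    hash_data(req.url);
    # return (lookup) skips the built-in vcl_hash, so add Host explicitly
    hash_data(req.http.Host);
    
    if (req.http.User-Agent ~ "(?i)(mobile|android|iphone)") {
        hash_data("mobile");
    } else {
        hash_data("desktop");
    }
    
    return (lookup);
}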

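Scenario 3: A/B testing

vcl

sub vcl_hash {
    hash_data(req.url);
    # return (lookup) skips the built-in vcl_hash, so add Host explicitly
    hash_data(req.http.Host);
    
    if (req.http.Cookie ~ "variant=b") {
        hash_data("variant_b");
    } else {
        hash_data("variant_a");
    }
    
    return (lookup);
}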

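Scenario 4: Per-user caching

vcl

sub vcl_hash {
    hash_data(req.url);
    # return (lookup) skips the built-in vcl_hash, so add Host explicitly
    hash_data(req.http.Host);
    
    # Separate entries for logged-in users only
    if (req.http.Cookie ~ "session_id") {
        hash_data(regsub(req.http.Cookie, ".*session_id=([^;]+).*", "\1"));
    }
    
    return (lookup);
}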

5. Caching Conditions

5.1 Request Conditions

vcl

sub vcl_recv {
    # Cache only GET and HEAD
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    
    # Do not cache requests that carry an Authorization header
    if (req.http.Authorization) {
        return (pass);
    }
    
    # Do not cache requests that carry a specific cookie
    if (req.http.Cookie ~ "user_logged_in") {
        return (pass);
    }
    
    return (hash);
}

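5.2 Response Conditions

vcl

# In each case the "do not cache" marker is kept for 120s. With a TTL of 0s
# the marker expires at once and concurrent requests for the URL queue up.
sub vcl_backend_response {
    # Do not cache responses that set a cookie
    if (beresp.http.Set-Cookie) {
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
    }
    
    # Do not cache specific status codes
    if (beresp.status == 401 ||
        beresp.status == 403 ||
        beresp.status >= 500) {
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
    }
    
    # Do not cache responses with "Vary: *"
    if (beresp.http.Vary == "*") {
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
    }
}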

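5.3 Handling the Vary Header

vcl

sub vcl_backend_response {
    # Vary makes Varnish store separate variants of the same object:
    # "Vary: Accept-Encoding" keeps one variant per encoding.
    
    # Simplify the Vary header
    if (beresp.http.Vary ~ "Cookie") {
        # Varying on Cookie gives almost every user a separate variant
        # and may need special handling
    }
    
    # Remove Vary only where the response really does not vary.
    # Unsetting it for every response would serve one variant to all clients.
    if (bereq.url ~ "^/static/") {
        unset beresp.http.Vary;
    }
}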

6. Cache Invalidation

6.1 The PURGE Method

vcl

acl purge {
    "localhost";
    "192.168.1.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        return (purge);
    }
}

Using PURGE:

bash

# PURGE a specific URL
curl -X PURGE http://varnish:6081/images/logo.png

# PURGE with an explicit Host header
curl -X PURGE -H "Host: example.com" http://varnish:6081/page.html

6.2 The BAN Method

vcl

sub vcl_recv {
    if (req.method == "BAN") {
        # Reuses the "purge" ACL defined in section 6.1
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        
        # BAN by URL pattern
        if (req.http.X-Ban-Url) {
            ban("req.url ~ " + req.http.X-Ban-Url);
        }
        
        # BAN by content type
        if (req.http.X-Ban-Type) {
            ban("obj.http.Content-Type ~ " + req.http.X-Ban-Type);
        }
        
        # No header given: use the request URL as the pattern
        # (a BAN request for "/" therefore matches every URL)
        if (!req.http.X-Ban-Url && !req.http.X-Ban-Type) {
            ban("req.url ~ " + req.url);
        }
        
        return (synth(200, "Banned."));
    }
}

Using BAN:

bash

# BAN all .jpg images
curl -X BAN -H "X-Ban-Url: \.jpg$" http://varnish:6081/

# BAN a specific path
curl -X BAN -H "X-Ban-Url: ^/news/" http://varnish:6081/

# BAN HTML content
curl -X BAN -H "X-Ban-Type: text/html" http://varnish:6081/

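6.3 BAN vs PURGE

Feature	PURGE	BAN
Scope	Single URL	URL pattern
Execution	Removed immediately	Marked invalid, evaluated later
Performance impact	Low	Ban list can accumulate
Use case	Precise invalidation	Bulk invalidation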
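
The difference in scope can be shown with a toy model in which the cache is just a list of URLs. This is a sketch, not how Varnish stores objects: `purge` and `ban` are hypothetical shell functions, `grep -E` only approximates the regular expressions Varnish uses, and the lazy evaluation of real bans is not modelled.

```bash
# A toy cache: one cached URL per line (sample data)
cached_urls='/images/logo.png
/images/banner.jpg
/news/today.html
/news/archive.html'

# purge URL: drop exactly one URL (fixed string, whole-line match)
purge() { printf '%s\n' "$cached_urls" | grep -vxF -- "$1"; }

# ban REGEX: drop every URL that matches the pattern
ban() { printf '%s\n' "$cached_urls" | grep -vE -- "$1"; }

echo "After PURGE /images/logo.png:"
purge /images/logo.png

echo "After BAN ^/news/:"
ban '^/news/'
```

The purge leaves three of the four URLs in place, while the single ban removes everything under /news/.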

6.4 Managing the BAN List

bash

# Show the ban list
varnishadm ban.list

# Sample output
Present bans:
1648123456.789012   C
1648123455.123456   C    req.url ~ ^/images/
1648123454.987654        req.url ~ \.css$

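6.5 Automatic BAN Cleanup

bash

# Tune the ban lurker, the background thread that works through the ban list
# ban_lurker_age: minimum age (seconds) of a ban before the lurker processes it
varnishadm param.set ban_lurker_age 60
# ban_lurker_batch: number of objects examined before the lurker pauses
varnishadm param.set ban_lurker_batch 1000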

7. Caching Strategy Patterns

7.1 Caching Static Assets

vcl

sub vcl_recv {
    # Cache static assets directly ("(\?.*)?" also matches URLs with a query string)
    if (req.url ~ "\.(css|js|png|gif|jpg|jpeg|ico|svg|woff|woff2|ttf|eot)(\?.*)?$") {
        unset req.http.Cookie;
        return (hash);
    }
}

sub vcl_backend_response {
    # Long TTL for static assets
    if (bereq.url ~ "\.(css|js|png|gif|jpg|jpeg|ico|svg|woff|woff2|ttf|eot)(\?.*)?$") {
        set beresp.ttl = 30d;
        unset beresp.http.Set-Cookie;
    }
}

7.2 Caching Dynamic Content

vcl

# std.duration() requires "import std;" at the top of the VCL file.
sub vcl_backend_response {
    # Short TTL for HTML
    if (beresp.http.Content-Type ~ "text/html") {
        set beresp.ttl = 5m;
        set beresp.grace = 1h;
    }
    
    # API responses
    if (bereq.url ~ "^/api/") {
        if (beresp.http.X-Cache-TTL) {
            set beresp.ttl = std.duration(beresp.http.X-Cache-TTL, 10s);
        } else {
            set beresp.ttl = 10s;
        }
    }
}

7.3 Conditional Caching

vcl

sub vcl_recv {
    # Cache specific paths
    if (req.url ~ "^/cacheable/") {
        return (hash);
    }
    
    # Never cache these paths
    if (req.url ~ "^/private/" || 
        req.url ~ "^/admin/" ||
        req.url ~ "^/user/") {
        return (pass);
    }
}

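7.4 Layered Caching

vcl

sub vcl_backend_response {
    # Layer 1: browser cache.
    # max-age must be a whole number of seconds. Appending beresp.ttl
    # directly would produce a fractional value such as "300.000",
    # so a literal value (300 here, as an example) is used instead.
    if (beresp.ttl > 0s) {
        set beresp.http.Cache-Control = "public, max-age=300";
    }
    
    # Layer 2: CDN cache (tagged with Surrogate-Key)
    if (bereq.url ~ "^/articles/") {
        set beresp.http.Surrogate-Key = "articles";
    }
}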

8. Cache Warming

8.1 Warm-up Script

bash

#!/bin/bash
# warm_cache.sh

URLS=(
    "https://example.com/"
    "https://example.com/css/main.css"
    "https://example.com/js/app.js"
    "https://example.com/images/logo.png"
)

for url in "${URLS[@]}"; do
    echo "Warming: $url"
    curl -s -o /dev/null "$url"
done

echo "Cache warming completed."

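8.2 Warming from a Sitemap

bash

#!/bin/bash
# sitemap_warm.sh

SITEMAP_URL="https://example.com/sitemap.xml"

# Extract the URLs from the sitemap (grep -P requires GNU grep) and warm
# them with at most 8 requests in flight, so a large sitemap does not
# start thousands of curl processes at once
curl -s "$SITEMAP_URL" \
    | grep -oP '(?<=<loc>)[^<]+' \
    | xargs -P 8 -I{} sh -c 'echo "Warming: $1"; curl -s -o /dev/null "$1"' _ {}

echo "Sitemap warming completed."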
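
On systems whose `grep` lacks `-P`, the URL extraction step can be sketched with `tr` and `sed` instead. `extract_locs` is a hypothetical helper name, and the sketch assumes each `<loc>` element holds its URL without surrounding whitespace.

```bash
# extract_locs  (hypothetical helper)
# Reads sitemap XML on stdin and prints the content of each <loc> element.
extract_locs() {
    # Put every tag on its own line, then keep the lines that start with "loc>"
    tr '<' '\n' | sed -n 's/^loc>//p'
}

# Sample sitemap written on a single line, as minified sitemaps often are
printf '%s' '<urlset><url><loc>https://example.com/</loc></url><url><loc>https://example.com/about</loc></url></urlset>' \
    | extract_locs
```

Splitting on `<` first means the approach also works when several `<loc>` elements share one line.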

9. Cache Monitoring

9.1 Monitoring the Cache Hit Rate

bash

#!/bin/bash
# cache_hit_rate.sh

HITS=$(varnishstat -1 -f MAIN.cache_hit | awk '{print $2}')
MISSES=$(varnishstat -1 -f MAIN.cache_miss | awk '{print $2}')
TOTAL=$((HITS + MISSES))

if [ $TOTAL -gt 0 ]; then
    RATE=$(echo "scale=2; $HITS * 100 / $TOTAL" | bc)
    echo "Cache Hit Rate: ${RATE}%"
    echo "Hits: $HITS"
    echo "Misses: $MISSES"
else
    echo "No requests yet"
fi
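
The counters read by the script above are cumulative since Varnish started, so the result is a lifetime average. A hedged sketch of computing the rate over an interval from two samples follows; `interval_hit_rate` is a hypothetical helper, and the sample numbers are made up for illustration.

```bash
# interval_hit_rate HITS_BEFORE MISSES_BEFORE HITS_AFTER MISSES_AFTER
# (hypothetical helper) Prints the hit rate, in percent, for the requests
# that arrived between the two samples, or "n/a" if there were none.
interval_hit_rate() {
    awk -v h1="$1" -v m1="$2" -v h2="$3" -v m2="$4" 'BEGIN {
        hits = h2 - h1
        misses = m2 - m1
        total = hits + misses
        if (total <= 0) { print "n/a"; exit }
        printf "%.2f\n", hits * 100 / total
    }'
}

# Made-up samples: 90 new hits and 10 new misses between the two readings
interval_hit_rate 100 100 190 110
```

In practice the four arguments would come from running the two `varnishstat -1 -f` commands shown above twice, some seconds apart.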

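9.2 Cache Statistics

bash

# Number of cached objects
varnishstat -1 -f MAIN.n_object

# Bytes used by the cache (malloc storage "s0"; file storage uses the SMF prefix)
varnishstat -1 -f SMA.s0.g_bytes

# Expired objects
varnishstat -1 -f MAIN.n_expired

# Objects evicted by LRU
varnishstat -1 -f MAIN.n_lru_nuked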

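10. Best Practices

10.1 TTL Recommendations

Content type	Suggested TTL	Grace
Static assets	30 days	1 day
HTML pages	5 minutes	1 hour
API responses	10 seconds	1 minute
Images	7 days	1 day
CSS/JS	7 days	1 day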

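10.2 Caching Strategy Checklist

vcl

# Caching strategy checklist
sub vcl_recv {
    # 1. Cache only GET/HEAD
    # 2. Remove unnecessary cookies
    # 3. Handle PURGE/BAN
    # 4. Cache static assets directly
}

sub vcl_backend_response {
    # 1. Set an appropriate TTL
    # 2. Set a grace period
    # 3. Handle Set-Cookie
    # 4. Handle error responses
}

sub vcl_deliver {
    # 1. Add a cache status header
    # 2. Remove sensitive headers
}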

10.3 Common Problems

Problem 1: Low cache hit rate

vcl

# Check whether cookies are preventing caching
sub vcl_recv {
    # Remove cookies from static asset requests
    if (req.url ~ "\.(css|js|png|gif|jpg)$") {
        unset req.http.Cookie;
    }
}

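Problem 2: Cached content is not refreshed

vcl

# Check the TTL settings
sub vcl_backend_response {
    # Keep the TTL reasonable: cap overly long TTLs so old content
    # ages out sooner (1h is an example limit)
    if (beresp.ttl > 1h) {
        set beresp.ttl = 1h;
    }
}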

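11. Summary

In this chapter we covered:

Cache fundamentals: the TTL, Grace, and Keep concepts
TTL configuration: default TTL, setting it in VCL, HTTP header parsing
Grace: improved availability and background refresh
Cache keys: custom cache keys and strategies for several scenarios
Caching conditions: request conditions and response conditions
Cache invalidation: the PURGE and BAN methods
Caching strategies: static assets, dynamic content, layered caching
Cache monitoring: hit rate and statistics

With caching strategies covered, the next chapter moves on to backend server configuration.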