日志分析 #

一、日志系统概述 #

1.1 Varnish日志架构 #

text
┌─────────────────────────────────────────────────────────┐
│                    Varnish日志架构                       │
├─────────────────────────────────────────────────────────┤
│                                                         │
│   Varnish Worker ──► VSL (Varnish Shared Log)          │
│                           │                             │
│         ┌─────────────────┼─────────────────┐           │
│         │                 │                 │           │
│         ▼                 ▼                 ▼           │
│   varnishlog       varnishncsa        varnishstat       │
│   (详细日志)        (NCSA格式)         (统计)           │
│                                                         │
└─────────────────────────────────────────────────────────┘

1.2 日志工具 #

工具 说明 输出格式
varnishlog 详细二进制日志 VSL格式
varnishncsa NCSA/自定义格式 文本格式
varnishtop 实时统计 排名列表
varnishhist 延迟直方图 图形显示
varnishstat 统计计数器 数值显示

二、varnishlog #

2.1 基本使用 #

bash
# 实时查看日志
varnishlog

# 查看特定请求
varnishlog -g request

# 查看后端日志
varnishlog -g backend

2.2 输出格式 #

text
*   << Request  >> 12345
-   Begin          req 12344 rxreq
-   Timestamp      Start: 1648123456.789012 0.000000 0.000000
-   ReqMethod      GET
-   ReqURL         /index.html
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: example.com
-   VCL_call       RECV
-   VCL_return     hash
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Content-Type: text/html

2.3 过滤选项 #

bash
# 过滤特定URL
varnishlog -q "ReqURL ~ ^/api/"

# 过滤特定状态码
varnishlog -q "RespStatus == 500"

# 过滤特定客户端
varnishlog -q "ReqHeader:Host == example.com"

# 过滤慢请求
varnishlog -q "Timestamp:Process[2] > 1.0"

# 组合过滤
varnishlog -q "ReqMethod == POST && RespStatus >= 400"

2.4 输出选项 #

bash
# 只显示特定标签
varnishlog -i ReqURL -i RespStatus

# 排除特定标签
varnishlog -I "^Req"

# 显示时间戳
varnishlog -T

# 显示原始数据
varnishlog -r

2.5 日志文件操作 #

bash
# 写入文件
varnishlog -w /var/log/varnish/varnish.log

# 从文件读取
varnishlog -r /var/log/varnish/varnish.log

# 追加模式
varnishlog -a -w /var/log/varnish/varnish.log

三、varnishncsa #

3.1 基本使用 #

bash
# NCSA格式日志
varnishncsa

# 输出示例
192.168.1.100 - - [28/Mar/2026:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"

3.2 自定义格式 #

bash
# 自定义格式
varnishncsa -F "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\""

# 包含缓存状态
varnishncsa -F "%h %t \"%r\" %s %b %{X-Cache}o"

# 包含处理时间
varnishncsa -F "%h %t \"%r\" %s %b %D"

3.3 格式说明符 #

符号 说明
%h 客户端IP
%l 远程登录名
%u 远程用户
%t 时间戳
%r 请求行
%s 响应状态码
%b 响应大小
%D 处理时间(微秒)
%T 处理时间(秒)
%{Header}i 请求头
%{Header}o 响应头
%{X-Cache}o 自定义头
%{Varnish:hitmiss}x 命中状态
%{Varnish:side}x 客户端/后端

3.4 高级格式 #

bash
# 完整日志格式
varnishncsa -F '%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-Agent}i" %{Varnish:hitmiss}x %D'

# JSON格式
varnishncsa -F '{"ip":"%h","time":"%t","method":"%m","url":"%U","status":%s,"size":%b,"time_ms":%D,"hitmiss":"%{Varnish:hitmiss}x"}'

3.5 后台运行 #

bash
# 后台运行
varnishncsa -D -w /var/log/varnish/access.log

# 使用PID文件
varnishncsa -D -P /var/run/varnishncsa.pid -w /var/log/varnish/access.log

四、varnishtop #

4.1 基本使用 #

bash
# 实时统计
varnishtop

# 输出示例
list length 25
  12345 ReqURL /index.html
  10234 ReqURL /api/users
   8765 ReqHeader:Host example.com

4.2 统计特定标签 #

bash
# 统计URL
varnishtop -i ReqURL

# 统计请求头
varnishtop -i ReqHeader

# 统计响应状态
varnishtop -i RespStatus

# 统计后端
varnishtop -i Backend

4.3 过滤统计 #

bash
# 只统计GET请求
varnishtop -i ReqURL -q "ReqMethod == GET"

# 只统计特定域名
varnishtop -i ReqURL -q "ReqHeader:Host == example.com"

五、varnishhist #

5.1 基本使用 #

bash
# 延迟直方图
varnishhist

# 输出示例
|  1|  2|  3|  4|  5|  6|  7|  8|  9| 10| 11| 12|
|   |   |   |   |   |   |   |   |   |   |   |   |
|   |   |   |   |###|###|## |   |   |   |   |   |

5.2 参数配置 #

bash
# 自定义桶数量
varnishhist -b 100

# 自定义范围
varnishhist -p 1000

# 只看请求
varnishhist -m ReqURL

5.3 解读直方图 #

text
桶编号:  1   2   3   4   5   6   7   8   9  10  11  12
时间:   1e-6 1e-5 1e-4 1e-3 1e-2 1e-1 1e0 1e1 1e2 1e3 1e4 1e5 (秒)

桶5 (1e-2 = 0.01秒): 大多数请求
桶6 (1e-1 = 0.1秒): 较慢请求
桶7 (1e0 = 1秒): 慢请求

六、日志管理 #

6.1 日志轮转 #

bash
# /etc/logrotate.d/varnish

/var/log/varnish/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 0644 varnish varnish
    sharedscripts
    postrotate
        systemctl reload varnishncsa > /dev/null 2>&1 || true
    endscript
}

6.2 启动日志服务 #

bash
# 启动varnishncsa服务
sudo systemctl start varnishncsa
sudo systemctl enable varnishncsa

# 启动varnishlog服务
sudo systemctl start varnishlog
sudo systemctl enable varnishlog

6.3 日志清理 #

bash
# 清理旧日志
find /var/log/varnish -name "*.log" -mtime +7 -delete

# 压缩旧日志
find /var/log/varnish -name "*.log" -mtime +1 -exec gzip {} \;

七、日志分析 #

7.1 分析脚本 #

bash
#!/bin/bash
# log_analysis.sh

LOG_FILE="/var/log/varnish/access.log"

echo "=== Varnish Log Analysis ==="
echo ""

# 总请求数
TOTAL=$(wc -l < $LOG_FILE)
echo "Total Requests: $TOTAL"

# 状态码分布
echo ""
echo "Status Code Distribution:"
awk '{print $9}' $LOG_FILE | sort | uniq -c | sort -rn

# 热门URL
echo ""
echo "Top 10 URLs:"
awk '{print $7}' $LOG_FILE | sort | uniq -c | sort -rn | head -10

# 热门IP
echo ""
echo "Top 10 IPs:"
awk '{print $1}' $LOG_FILE | sort | uniq -c | sort -rn | head -10

# 平均响应大小
echo ""
echo "Average Response Size:"
awk '{sum+=$10; count++} END {print sum/count " bytes"}' $LOG_FILE

7.2 实时监控 #

bash
#!/bin/bash
# real_time_monitor.sh

echo "=== Real-time Monitor ==="

varnishncsa -F '%h %t "%r" %s %b %{Varnish:hitmiss}x %D' | \
while read line; do
    # 解析日志行
    IP=$(echo $line | awk '{print $1}')
    TIME=$(echo $line | awk '{print $2}')
    URL=$(echo $line | awk '{print $4}')
    STATUS=$(echo $line | awk '{print $6}')
    SIZE=$(echo $line | awk '{print $7}')
    HITMISS=$(echo $line | awk '{print $8}')
    DURATION=$(echo $line | awk '{print $9}')
    
    # 输出关键信息
    echo "$TIME $IP $URL $STATUS ${SIZE}B ${DURATION}us [$HITMISS]"
done

7.3 错误分析 #

bash
#!/bin/bash
# error_analysis.sh

echo "=== Error Analysis ==="

# 5xx错误
echo ""
echo "5xx Errors:"
varnishlog -d -q "RespStatus >= 500" | head -100

# 4xx错误
echo ""
echo "4xx Errors:"
varnishlog -d -q "RespStatus >= 400 && RespStatus < 500" | head -100

# 慢请求
echo ""
echo "Slow Requests (>1s):"
varnishlog -d -q "Timestamp:Process[2] > 1.0" | head -100

八、日志与监控系统集成 #

8.1 ELK集成 #

bash
# JSON格式输出到Logstash
varnishncsa -F '{"@timestamp":"%{%Y-%m-%dT%H:%M:%S%z}t","client_ip":"%h","request":"%r","status":%s,"bytes":%b,"duration":%D,"hitmiss":"%{Varnish:hitmiss}x","user_agent":"%{User-Agent}i"}'

8.2 Fluentd集成 #

xml
# fluentd配置
<source>
  @type tail
  path /var/log/varnish/access.log
  pos_file /var/log/varnish/access.log.pos
  tag varnish.access
  format regexp
  expression /^(?<client_ip>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<request>[^"]+)" (?<status>\d+) (?<bytes>\d+)/
</source>

8.3 Graylog集成 #

bash
# GELF格式输出
varnishncsa -F '{"version":"1.1","host":"varnish","short_message":"%r","full_message":"%h %t %r %s %b","timestamp":%{%s}t,"level":6,"_client_ip":"%h","_status":%s,"_bytes":%b}'

九、日志查询技巧 #

9.1 查找特定请求 #

bash
# 按请求ID查找
varnishlog -q "vxid == 12345"

# 按URL查找
varnishlog -q "ReqURL ~ ^/api/users"

# 按客户端IP查找
varnishlog -q "ReqHeader:X-Forwarded-For ~ 192.168.1.100"

9.2 分析请求链 #

bash
# 查看完整请求链
varnishlog -g request -q "vxid == 12345"

# 查看后端请求
varnishlog -g backend -q "vxid == 12345"

9.3 导出日志 #

bash
# 导出为文本
varnishlog -d > /tmp/varnish.log

# 导出特定时间段
varnishlog -d -b 1648123400 -e 1648123500 > /tmp/varnish.log

十、总结 #

本章我们学习了:

  1. 日志系统概述:架构、工具
  2. varnishlog:详细日志、过滤、文件操作
  3. varnishncsa:NCSA格式、自定义格式
  4. varnishtop:实时统计
  5. varnishhist:延迟直方图
  6. 日志管理:轮转、清理
  7. 日志分析:脚本、实时监控
  8. 系统集成:ELK、Fluentd、Graylog

掌握日志分析后,让我们进入下一章,学习监控告警!

最后更新:2026-03-28