Sentry Alert Configuration
What Are Alerts?
Alerts are one of Sentry's core features: they notify your team as soon as an issue occurs.
text
┌────────────────────────────────────────────┐
│  The value of alerts                       │
├────────────────────────────────────────────┤
│                                            │
│  1. Fast response                          │
│     - Notify as soon as an issue occurs   │
│     - Shorten time to detection            │
│                                            │
│  2. Less noise                             │
│     - Group identical issues together      │
│     - Configurable alert frequency         │
│                                            │
│  3. Multi-channel notifications            │
│     - Email                                │
│     - Slack                                │
│     - PagerDuty                            │
│     - Custom webhooks                      │
│                                            │
│  4. Tiered alerts                          │
│     - Handle each severity differently     │
│     - Escalate when needed                 │
│                                            │
└────────────────────────────────────────────┘
Alert Rules
Rule Types
text
┌────────────────────────────────────────────┐
│  Alert rule types                          │
├────────────────────────────────────────────┤
│                                            │
│  1. Issue alerts                           │
│     - A new issue appears                  │
│     - Issue frequency increases            │
│     - Issue state changes                  │
│                                            │
│  2. Error frequency alerts                 │
│     - Error count exceeds a threshold      │
│     - Error rate rises                     │
│                                            │
│  3. Performance alerts                     │
│     - Response time is too long            │
│     - Crash Free Rate drops                │
│                                            │
│  4. Release alerts                         │
│     - Issues in a new release              │
│     - Deployment failures                  │
│                                            │
└────────────────────────────────────────────┘
Creating an Alert Rule
In the Sentry console:
- Go to Settings → Projects → select a project → Alerts
- Click New Alert Rule
- Choose the rule type
- Configure the conditions and actions
Issue Alert Rules
yaml
# Example issue alert rule
name: "High Priority Issues"
conditions:
  # Condition 1: the issue level is error or fatal
  - type: level
    match: is
    values: [error, fatal]
  # Condition 2: the issue is seen for the first time
  - type: first_seen
    match: is
    value: true
  # Condition 3: at least 10 users are affected
  - type: users_affected
    match: greater_or_equal
    value: 10
actions:
  - type: email
    targets: ["dev-team@example.com"]
  - type: slack
    channel: "#alerts"
Frequency Alert Rules
yaml
# Example frequency alert rule
name: "Error Rate Spike"
conditions:
  # At least 100 errors within 10 minutes
  - type: event_frequency
    comparison: greater_or_equal
    value: 100
    interval: 10m
actions:
  - type: email
    targets: ["oncall@example.com"]
  - type: pagerduty
    service_key: "${PAGERDUTY_SERVICE_KEY}"
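To make the frequency condition above concrete, here is a minimal sketch of a sliding-window counter that fires once at least `value` events fall inside `interval`. The class name `EventFrequencyRule` and its methods are hypothetical and are not part of Sentry; the sketch also assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class EventFrequencyRule:
    """Hypothetical helper: fires when at least `threshold` events fall inside a sliding window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()  # event times (seconds) still inside the window

    def record(self, timestamp: float) -> bool:
        """Record one event and return True if the rule fires at this moment."""
        self._timestamps.append(timestamp)
        # Drop events that have fallen out of the window ending at `timestamp`.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


# Mirror the rule above: 100 events within 10 minutes (600 seconds).
rule = EventFrequencyRule(threshold=100, window_seconds=600)
fired = [rule.record(t) for t in range(100)]  # one event per second
```

Only the hundredth event makes the rule fire here, because all 100 events land inside the same 10-minute window.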
Performance Alert Rules
yaml
# Example performance alert rule
name: "Slow API Response"
conditions:
  # P95 response time exceeds 1 second
  - type: transaction_duration
    comparison: greater_than
    value: 1000  # milliseconds
    percentile: 95
actions:
  - type: slack
    channel: "#performance"
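As a rough illustration of what "P95 exceeds 1 second" means, the sketch below computes a percentile with the nearest-rank method and compares it to the threshold. The function names `percentile` and `is_slow` are made up for this example, and Sentry's own percentile computation may differ.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def is_slow(durations_ms: list[float], threshold_ms: float = 1000, pct: float = 95) -> bool:
    """True when the chosen percentile of the durations is greater than the threshold."""
    return percentile(durations_ms, pct) > threshold_ms


# 95 fast requests and 5 slow ones: the P95 is still a fast request.
mostly_fast = [100.0] * 95 + [2000.0] * 5
# 94 fast requests and 6 slow ones: the P95 is now a slow request.
too_many_slow = [100.0] * 94 + [2000.0] * 6
```

With these samples, `is_slow(mostly_fast)` stays false while `is_slow(too_many_slow)` becomes true, which shows why a percentile is less sensitive to a handful of outliers than a maximum.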
Notification Channels
Email Notifications
yaml
# Email configuration
actions:
  - type: email
    targets:
      - "dev-team@example.com"
      - "oncall@example.com"
    # Email rate limit
    frequency: 5m  # send at most once every 5 minutes
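Slack Integration
Installing the Slack Integration
- Go to Settings → Integrations
- Find Slack and click Add Integration
- Authorize Sentry to access your Slack workspace
- Choose the channel that should receive notifications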
Configuring Slack Alerts
yaml
# Slack alert configuration
actions:
  - type: slack
    channel: "#alerts"
    # Custom message format
    blocks:
      - type: header
        text:
          type: plain_text
          text: "🚨 {{ issue.title }}"
      - type: section
        fields:
          - type: mrkdwn
            text: "*Environment:*\n{{ environment }}"
          - type: mrkdwn
            text: "*Release:*\n{{ release }}"
      - type: actions
        elements:
          - type: button
            text:
              type: plain_text
              text: "View Issue"
            url: "{{ issue.url }}"
PagerDuty Integration
yaml
# PagerDuty configuration
actions:
  - type: pagerduty
    service_key: "${PAGERDUTY_SERVICE_KEY}"
    # Severity mapping
    severity_mapping:
      fatal: critical
      error: error
      warning: warning
Webhook Integration
yaml
# Webhook configuration
actions:
  - type: webhook
    url: "https://api.example.com/sentry-webhook"
    method: POST
    headers:
      Authorization: "Bearer ${WEBHOOK_TOKEN}"
      Content-Type: "application/json"
    body: |
      {
        "event_id": "{{ event_id }}",
        "issue_id": "{{ issue_id }}",
        "title": "{{ issue.title }}",
        "level": "{{ level }}",
        "environment": "{{ environment }}",
        "url": "{{ issue.url }}"
      }
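Webhook Receiver Examples
javascript
// Receive the webhook with Node.js and Express
const express = require("express");
const crypto = require("crypto");

const app = express();

app.post("/sentry-webhook", express.json(), (req, res) => {
  // Verify the signature
  const signature = req.headers["sentry-hook-signature"] || "";
  const expectedSignature = crypto
    .createHmac("sha256", process.env.SENTRY_WEBHOOK_SECRET)
    .update(JSON.stringify(req.body))
    .digest("hex");
  // Compare in constant time; timingSafeEqual throws on unequal lengths, so check the length first
  const received = Buffer.from(signature, "utf8");
  const expected = Buffer.from(expectedSignature, "utf8");
  if (received.length !== expected.length || !crypto.timingSafeEqual(received, expected)) {
    return res.status(401).send("Invalid signature");
  }

  // Handle the alert
  const { action, data } = req.body;
  switch (action) {
    case "event_alert":
      handleEventAlert(data);
      break;
    case "issue_alert":
      handleIssueAlert(data);
      break;
  }
  res.status(200).send("OK");
});

function handleEventAlert(data) {
  console.log("Event alert:", data.event.title);
  // Forward to other systems, create a ticket, and so on
}

function handleIssueAlert(data) {
  console.log("Issue alert:", data);
  // Forward to other systems, create a ticket, and so on
}

app.listen(3000);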
python
# Receive the webhook with Python and Flask
import hashlib
import hmac
import os

from flask import Flask, request, jsonify

app = Flask(__name__)
# Read the shared secret from the environment so the lookup below cannot fail with a KeyError
app.config["SENTRY_WEBHOOK_SECRET"] = os.environ["SENTRY_WEBHOOK_SECRET"]


@app.route("/sentry-webhook", methods=["POST"])
def sentry_webhook():
    # Verify the signature against the raw request body
    signature = request.headers.get("sentry-hook-signature", "")
    expected = hmac.new(
        app.config["SENTRY_WEBHOOK_SECRET"].encode(),
        request.data,
        hashlib.sha256,
    ).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return jsonify({"error": "Invalid signature"}), 401

    # Handle the alert
    data = request.json
    action = data.get("action")
    if action == "event_alert":
        handle_event_alert(data["data"])
    return jsonify({"status": "ok"}), 200


def handle_event_alert(data):
    print(f"Event alert: {data['event']['title']}")


if __name__ == "__main__":
    app.run(port=3000)
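To exercise either receiver locally, it can help to produce a correctly signed test request yourself. The sketch below signs a sample body with HMAC-SHA256, the same scheme the receivers above verify. The secret, the sample payload, and the helper names `sign_payload` and `verify_signature` are placeholders invented for this example.

```python
import hashlib
import hmac
import json


def sign_payload(secret: str, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Constant-time check that `signature` matches the body."""
    return hmac.compare_digest(sign_payload(secret, body), signature)


secret = "test-secret"  # placeholder; use the receiver's real secret when testing
body = json.dumps(
    {"action": "event_alert", "data": {"event": {"title": "Example error"}}}
).encode()
signature = sign_payload(secret, body)
```

Send `body` unchanged as the request payload and pass `signature` in the `sentry-hook-signature` header. The Flask receiver verifies the raw bytes, so any re-serialization of the JSON on the way (different spacing or key order) would change the digest and cause a 401.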
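Alert Strategy
Alert Frequency Control
yaml
# Alert frequency configuration
alert_settings:
  # Minimum interval between alerts for the same issue
  issue_alert_frequency: 5m
  # Maximum number of alerts per hour
  max_alerts_per_hour: 10
  # Quiet hours
  quiet_hours:
    start: "22:00"
    end: "08:00"
    timezone: "Asia/Shanghai"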
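The three settings above interact, so a small sketch may help: a per-issue minimum interval, a rolling hourly cap, and a quiet-hours window that wraps past midnight. `AlertThrottle` and `in_quiet_hours` are hypothetical names, and the sketch assumes the caller has already converted `now` to the configured timezone.

```python
from datetime import datetime, time, timedelta


def in_quiet_hours(now: time, start: time, end: time) -> bool:
    """True if `now` is inside [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


class AlertThrottle:
    """Hypothetical throttle: per-issue minimum interval plus a rolling hourly cap."""

    def __init__(self, min_interval: timedelta, max_per_hour: int) -> None:
        self.min_interval = min_interval
        self.max_per_hour = max_per_hour
        self._last_sent = {}   # issue_id -> time of the last alert for that issue
        self._sent_times = []  # times of all alerts sent during the last hour

    def allow(self, issue_id: str, now: datetime) -> bool:
        """Return True and record the alert if it may be sent at `now`."""
        last = self._last_sent.get(issue_id)
        if last is not None and now - last < self.min_interval:
            return False
        # Keep only alerts from the last hour, then apply the hourly cap.
        one_hour_ago = now - timedelta(hours=1)
        self._sent_times = [t for t in self._sent_times if t > one_hour_ago]
        if len(self._sent_times) >= self.max_per_hour:
            return False
        self._last_sent[issue_id] = now
        self._sent_times.append(now)
        return True
```

A caller would typically check `in_quiet_hours(now.time(), time(22, 0), time(8, 0))` first and only then ask the throttle, so that suppressed alerts do not count against the hourly cap.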
Alert Escalation
yaml
# Alert escalation rules
escalation_policy:
  # Stage 1: notify the development team
  - level: 1
    delay: 0
    actions:
      - type: slack
        channel: "#dev-alerts"
  # Stage 2: if unhandled after 5 minutes, send an email
  - level: 2
    delay: 5m
    actions:
      - type: email
        targets: ["dev-team@example.com"]
  # Stage 3: if unhandled after 15 minutes, page the on-call engineer
  - level: 3
    delay: 15m
    actions:
      - type: pagerduty
        service_key: "${PAGERDUTY_SERVICE_KEY}"
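The escalation policy above amounts to a lookup on elapsed time. A minimal sketch, assuming the alert carries an `acknowledged` flag and a record of which levels were already notified (both are assumptions of this example, as is the `due_levels` helper):

```python
from datetime import timedelta

# Mirrors the three stages of the policy above; the action strings are illustrative labels.
ESCALATION_POLICY = [
    {"level": 1, "delay": timedelta(0), "action": "slack:#dev-alerts"},
    {"level": 2, "delay": timedelta(minutes=5), "action": "email:dev-team@example.com"},
    {"level": 3, "delay": timedelta(minutes=15), "action": "pagerduty"},
]


def due_levels(elapsed: timedelta, acknowledged: bool, already_notified: set[int]) -> list[int]:
    """Return the escalation levels whose delay has passed and that were not notified yet."""
    if acknowledged:
        return []  # someone is handling the alert, so stop escalating
    return [
        step["level"]
        for step in ESCALATION_POLICY
        if elapsed >= step["delay"] and step["level"] not in already_notified
    ]
```

Calling `due_levels` periodically, and adding each returned level to `already_notified` after its action runs, keeps every stage from firing more than once.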
Alert Grouping
yaml
# Alert grouping rules
grouping:
  # Group by project
  by_project: true
  # Group by environment
  by_environment: true
  # Group by error type
  by_error_type: true
  # Merge similar alerts
  merge_similar: true
  merge_window: 1m
Alert Filtering
Ignoring Specific Errors
yaml
# Alert filter rules
filters:
  # Ignore specific error types
  ignore_errors:
    - "NetworkError"
    - "Failed to fetch"
  # Ignore specific environments
  ignore_environments:
    - "development"
    - "testing"
  # Ignore specific users
  ignore_users:
    - "test-user"
    - "bot-*"
  # Ignore specific paths
  ignore_paths:
    - "/health"
    - "/metrics"
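One way to read the filter rules above is as a single predicate over an event. The sketch below assumes a flat event dictionary with `message`, `environment`, `user`, and `path` keys (a simplification invented for this example) and interprets the `bot-*` entry as a shell-style wildcard.

```python
from fnmatch import fnmatchcase

# Same values as the filter rules above.
FILTERS = {
    "ignore_errors": ["NetworkError", "Failed to fetch"],
    "ignore_environments": ["development", "testing"],
    "ignore_users": ["test-user", "bot-*"],
    "ignore_paths": ["/health", "/metrics"],
}


def should_ignore(event: dict) -> bool:
    """Return True if any filter rule matches the (simplified) event."""
    message = event.get("message", "")
    # Error filters match as substrings of the error message.
    if any(pattern in message for pattern in FILTERS["ignore_errors"]):
        return True
    if event.get("environment") in FILTERS["ignore_environments"]:
        return True
    # User filters support shell-style wildcards such as "bot-*".
    user = event.get("user", "")
    if any(fnmatchcase(user, pattern) for pattern in FILTERS["ignore_users"]):
        return True
    return event.get("path") in FILTERS["ignore_paths"]
```

Substring matching for error messages is a deliberate simplification; exact-type matching or regular expressions would be equally valid readings of the configuration.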
Conditional Filtering
yaml
# Conditional alerts
conditions:
  # Alert only in the production environment
  - type: environment
    match: is
    values: [production]
  # Alert only on level error and above
  - type: level
    match: greater_or_equal
    value: error
  # Alert only on issues that affect at least 5 users
  - type: users_affected
    match: greater_or_equal
    value: 5
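The `greater_or_equal` match on a level only makes sense with an ordering of levels. The sketch below uses Sentry's severity levels from lowest to highest and combines the three conditions above with a logical AND; the `issue` dictionary shape and the function names are assumptions of this example.

```python
# Sentry severity levels, lowest to highest.
LEVEL_ORDER = ["debug", "info", "warning", "error", "fatal"]


def level_at_least(level: str, minimum: str) -> bool:
    """True if `level` is as severe as `minimum` or more severe."""
    return LEVEL_ORDER.index(level) >= LEVEL_ORDER.index(minimum)


def matches(issue: dict) -> bool:
    """All three conditions must hold for the alert to fire."""
    return (
        issue["environment"] == "production"
        and level_at_least(issue["level"], "error")
        and issue["users_affected"] >= 5
    )
```

For example, a `fatal` issue in production that affects 5 users matches, while a `warning` with 50 affected users does not.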
Alert Templates
Custom Alert Messages
yaml
# Custom alert templates
templates:
  email:
    subject: "[{{ level }}] {{ issue.title }}"
    body: |
      ## Error details
      **Title**: {{ issue.title }}
      **Level**: {{ level }}
      **Environment**: {{ environment }}
      **Release**: {{ release }}
      **Users affected**: {{ users_affected }}
      **Event count**: {{ event_count }}
      ## Error message
      ```
      {{ error.message }}
      ```
      ## Stack trace
      ```
      {{ error.stacktrace }}
      ```
      [View details]({{ issue.url }})
  slack:
    blocks:
      - type: header
        text:
          type: plain_text
          text: "🚨 {{ level }}: {{ issue.title }}"
      - type: section
        fields:
          - type: mrkdwn
            text: "*Environment:*\n{{ environment }}"
          - type: mrkdwn
            text: "*Release:*\n{{ release }}"
          - type: mrkdwn
            text: "*Users affected:*\n{{ users_affected }}"
          - type: mrkdwn
            text: "*Event count:*\n{{ event_count }}"
      - type: section
        text:
          type: mrkdwn
          text: |
            ```
            {{ error.message }}
            ```
      - type: actions
        elements:
          - type: button
            text:
              type: plain_text
              text: "View details"
            url: "{{ issue.url }}"
          - type: button
            text:
              type: plain_text
              text: "Ignore"
            url: "{{ issue.url }}/ignore"
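The `{{ ... }}` placeholders in the templates above, including dotted names such as `issue.title`, can be filled with a few lines of code. This is only a sketch of the substitution step using a regular expression; the `render` function is hypothetical, and a real template engine would add escaping, loops, and conditionals.

```python
import re

# Matches placeholders such as {{ level }} or {{ issue.title }}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace each placeholder with the matching (possibly nested) context value."""

    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            if not isinstance(value, dict) or key not in value:
                return match.group(0)  # leave unknown placeholders untouched
            value = value[key]
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


subject = render(
    "[{{ level }}] {{ issue.title }}",
    {"level": "error", "issue": {"title": "TypeError in checkout"}},
)
```

Leaving unknown placeholders untouched, rather than replacing them with an empty string, makes a misspelled variable name visible in the delivered message.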
Alerting Best Practices
1. Tiered Alerts
yaml
# Handle each level differently
rules:
  # Fatal errors: notify immediately
  - name: "Fatal Errors"
    conditions:
      - level: fatal
    actions:
      - type: pagerduty
      - type: slack
        channel: "#critical-alerts"
  # Errors: notify during working hours
  - name: "Regular Errors"
    conditions:
      - level: error
    actions:
      - type: slack
        channel: "#errors"
  # Warnings: daily digest
  - name: "Warnings"
    conditions:
      - level: warning
    actions:
      - type: email
        targets: ["dev-team@example.com"]
        frequency: daily
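2. Avoid Alert Fatigue
yaml
# Alert frequency control
settings:
  # Alert at most once every 5 minutes for the same issue
  issue_frequency: 5m
  # At most 20 alerts per hour
  hourly_limit: 20
  # Stay silent once the limit is exceeded
  throttle_mode: silent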
3. Route Alerts by Team
yaml
# Route alerts by project or feature
rules:
  # Payment-related errors -> payment team
  - name: "Payment Errors"
    conditions:
      - tag: feature
        match: is
        value: payment
    actions:
      - type: slack
        channel: "#payment-team"
  # User-related errors -> user team
  - name: "User Errors"
    conditions:
      - tag: feature
        match: is
        value: user
    actions:
      - type: slack
        channel: "#user-team"
4. On-Call Rotation
yaml
# On-call rotation configuration
oncall:
  # Use the PagerDuty on-call schedule
  provider: pagerduty
  schedule_id: "${PAGERDUTY_SCHEDULE_ID}"
# Alert rules
rules:
  - name: "After Hours Critical"
    conditions:
      - level: fatal
    actions:
      - type: pagerduty
        use_oncall: true
5. Alert Aggregation
yaml
# Alert aggregation configuration
aggregation:
  # Aggregate by time window
  time_window: 5m
  # Aggregate by field
  group_by:
    - issue_id
    - environment
  # Send a summary after aggregating
  send_summary: true
  summary_template: |
    ## Alert summary
    In the past {{ time_window }}, {{ count }} alerts were received:
    {{#issues}}
    - {{ title }} ({{ count }} times)
    {{/issues}}
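The aggregation step above can be sketched as counting alerts per `(issue_id, environment)` group within one window and rendering a summary. The alert dictionary shape and the `summarize` helper are assumptions of this example, and the output format is a simplified version of the summary template.

```python
from collections import Counter


def summarize(alerts: list[dict], window_label: str = "5m") -> str:
    """Group the alerts of one time window by issue and environment, then build a summary."""
    counts = Counter((a["issue_id"], a["environment"], a["title"]) for a in alerts)
    lines = [f"In the past {window_label}, {len(alerts)} alerts were received:"]
    # most_common() lists the noisiest groups first.
    for (_issue_id, environment, title), count in counts.most_common():
        lines.append(f"- {title} [{environment}]: {count}")
    return "\n".join(lines)


alerts = [
    {"issue_id": 1, "environment": "production", "title": "Timeout"},
    {"issue_id": 2, "environment": "staging", "title": "KeyError"},
    {"issue_id": 1, "environment": "production", "title": "Timeout"},
]
summary = summarize(alerts)
```

Collecting the alerts for each window (for example, by bucketing on timestamp) is left out here; the sketch only covers the grouping and summary step.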
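Alert Monitoring
Alert Statistics
In the Sentry console:
- Go to Settings → Projects → select a project → Alerts
- Review the alert statistics:
  - Alert volume trend
  - Distribution of alert types
  - Response time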
Alert Health Checks
yaml
# Alert health check
health_check:
  # Check that alerts are delivered
  test_alert:
    enabled: true
    frequency: daily
  # Check that notification channels work
  channel_check:
    enabled: true
    channels:
      - type: slack
        channel: "#alerts"
      - type: email
        targets: ["test@example.com"]
Next Steps
Now that you have covered alert configuration, continue with Source Maps to learn how to restore error stack traces from minified code.
Last updated: 2026-03-29