聚合函数 #
一、聚合函数概述 #
1.1 支持的聚合函数 #
| 函数 | 描述 |
|---|---|
| COUNT | 计数 |
| SUM | 求和 |
| AVG | 平均值 |
| MIN | 最小值 |
| MAX | 最大值 |
1.2 基本语法 #
sql
SELECT aggregate_function(column_name)
FROM table_name
[WHERE condition]
[GROUP BY column_name];
二、COUNT函数 #
2.1 基本计数 #
sql
-- 计算总行数
SELECT COUNT(*) FROM users;
-- 计算特定列非空值数量
SELECT COUNT(email) FROM users;
-- 带条件计数
SELECT COUNT(*) FROM orders
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;
2.2 分组计数 #
sql
-- 按分区键分组计数
SELECT user_id, COUNT(*) as order_count
FROM orders
GROUP BY user_id;
-- 按聚簇列分组计数
SELECT user_id, order_date, COUNT(*) as daily_count
FROM orders
GROUP BY user_id, order_date;
三、SUM函数 #
3.1 基本求和 #
sql
-- 计算总和
SELECT SUM(amount) FROM orders
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;
-- 多列求和
SELECT SUM(amount), SUM(quantity) FROM order_items
WHERE order_id = ?;
3.2 分组求和 #
sql
-- 按用户分组求和
SELECT user_id, SUM(amount) as total_amount
FROM orders
GROUP BY user_id;
-- 按日期分组求和
SELECT user_id, order_date, SUM(amount) as daily_total
FROM orders
GROUP BY user_id, order_date;
四、AVG函数 #
4.1 基本平均值 #
sql
-- 计算平均值
SELECT AVG(age) FROM users;
SELECT AVG(price) FROM products;
-- 带条件平均值
SELECT AVG(amount) FROM orders
WHERE user_id = ? AND order_date >= '2024-01-01';
4.2 分组平均值 #
sql
-- 按产品类别分组平均价格
SELECT category_id, AVG(price) as avg_price
FROM products
GROUP BY category_id;
五、MIN/MAX函数 #
5.1 基本使用 #
sql
-- 最小值和最大值
SELECT MIN(age), MAX(age) FROM users;
SELECT MIN(price), MAX(price) FROM products;
-- 时间范围
SELECT MIN(created_at), MAX(created_at) FROM users;
-- 带条件
SELECT MIN(amount), MAX(amount) FROM orders
WHERE user_id = ?;
5.2 分组使用 #
sql
-- 按用户分组获取订单金额范围
SELECT user_id, MIN(amount), MAX(amount)
FROM orders
GROUP BY user_id;
六、GROUP BY #
6.1 基本分组 #
sql
-- 按分区键分组
SELECT user_id, COUNT(*), SUM(amount)
FROM orders
GROUP BY user_id;
-- 按聚簇列分组
SELECT user_id, order_date, COUNT(*), SUM(amount)
FROM orders
GROUP BY user_id, order_date;
6.2 分组规则 #
text
GROUP BY规则:
必须按主键顺序
├── 分区键必须完整
├── 聚簇列按顺序使用
└── 不能跳过主键列
示例(PRIMARY KEY (user_id, order_date, order_id)):
✓ GROUP BY user_id
✓ GROUP BY user_id, order_date
✓ GROUP BY user_id, order_date, order_id
✗ GROUP BY order_date(缺少分区键)
✗ GROUP BY user_id, order_id(跳过order_date)
6.3 分组聚合组合 #
sql
-- 完整分组聚合
SELECT
user_id,
order_date,
COUNT(*) as order_count,
SUM(amount) as total_amount,
AVG(amount) as avg_amount,
MIN(amount) as min_amount,
MAX(amount) as max_amount
FROM orders
GROUP BY user_id, order_date;
七、组合使用 #
7.1 多聚合函数 #
sql
-- 组合多个聚合函数
SELECT
COUNT(*) as total,
SUM(amount) as total_amount,
AVG(amount) as avg_amount,
MIN(amount) as min_amount,
MAX(amount) as max_amount
FROM orders
WHERE user_id = ?;
7.2 聚合与分组 #
sql
-- 统计报表
SELECT
user_id,
order_date,
COUNT(*) as order_count,
SUM(amount) as total_amount,
AVG(amount) as avg_amount
FROM orders
WHERE user_id = ?
AND order_date >= '2024-01-01'
AND order_date <= '2024-01-31'
GROUP BY user_id, order_date;
八、限制与注意事项 #
8.1 不支持的功能 #
text
不支持的功能:
HAVING子句
├── 不能过滤聚合结果
└── 需要在应用层处理
嵌套聚合
├── 不能嵌套聚合函数
└── 如 AVG(SUM(amount))
自定义聚合函数
├── 不支持UDAF
└── 使用内置函数
复杂分组
├── 不支持ROLLUP
├── 不支持CUBE
└── 不支持GROUPING SETS
8.2 性能考虑 #
text
聚合性能考虑:
分区聚合
├── 单分区聚合性能好
└── 数据量可控
全表聚合
├── 需要扫描所有分区
├── 性能较差
└── 可能超时
建议
├── 限制聚合范围
├── 使用分区键过滤
└── 大数据量考虑预聚合
九、预聚合设计 #
9.1 预聚合表 #
sql
-- 创建预聚合表
CREATE TABLE daily_order_stats (
stat_date DATE,
user_id UUID,
order_count COUNTER,
total_amount COUNTER,
PRIMARY KEY (stat_date, user_id)
);
-- 更新统计数据
UPDATE daily_order_stats
SET order_count = order_count + 1,
total_amount = total_amount + ?
WHERE stat_date = ? AND user_id = ?;
-- 查询统计数据
SELECT * FROM daily_order_stats
WHERE stat_date = '2024-01-15';
9.2 物化视图 #
sql
-- 创建物化视图用于聚合
CREATE MATERIALIZED VIEW orders_by_date AS
SELECT user_id, order_date, order_id, amount
FROM orders
WHERE user_id IS NOT NULL
AND order_date IS NOT NULL
AND order_id IS NOT NULL
PRIMARY KEY (order_date, user_id, order_id);
-- 按日期查询
SELECT user_id, COUNT(*), SUM(amount)
FROM orders_by_date
WHERE order_date = '2024-01-15'
GROUP BY user_id;
十、总结 #
聚合函数要点:
- 基本函数:COUNT、SUM、AVG、MIN、MAX
- GROUP BY:必须按主键顺序分组
- 不支持HAVING:需要在应用层过滤
- 性能考虑:限制聚合范围
- 预聚合:大数据量考虑预聚合表
- 计数器:使用计数器表实现实时统计
下一步,让我们学习排序与分页!
最后更新:2026-03-27