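Since their introduction in MySQL 8.0, window functions have become a core tool for data analysis and business reporting. As domestic Chinese databases such as KingbaseES (人大金仓) strengthen their compatibility with MySQL syntax, developers who are migrating or selecting a database face a key question: which database offers the more capable window function support?
This article compares window functions in KingbaseES and MySQL along three dimensions (syntax design, feature set, and performance), using hands-on code to highlight the core differences and give developers a reference for technology selection.
The frame definition of a window function determines which rows enter each calculation (for example, a running total or a moving average).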
-- MySQL 8.0: running salary total (shorthand frame)
SELECT
employee_id,
department,
salary,
SUM(salary) OVER (PARTITION BY department ORDER BY salary ROWS UNBOUNDED PRECEDING) AS running_total
FROM employees;
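Notes:
ROWS UNBOUNDED PRECEDING: the frame runs from the first row of the partition to the current row.
BETWEEN: states the frame boundaries explicitly (for example, ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW).
-- KingbaseES: running salary total (explicit frame)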
SELECT
employee_id,
department,
salary,
SUM(salary) OVER (
PARTITION BY department
ORDER BY salary
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total
FROM employees;
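The two running-total queries above use different spellings of the same frame. The following is a minimal sketch that checks this equivalence, assuming SQLite (through Python's standard sqlite3 module) as a stand-in engine, since neither MySQL nor KingbaseES is needed to observe standard frame semantics. The sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; table and column names mirror the queries above.
SAMPLE_ROWS = [
    (1, "sales", 3000),
    (2, "sales", 4000),
    (3, "sales", 5000),
    (4, "ops", 2500),
    (5, "ops", 3500),
]

SHORTHAND = """
SELECT employee_id,
       SUM(salary) OVER (PARTITION BY department ORDER BY salary
                         ROWS UNBOUNDED PRECEDING) AS running_total
FROM employees
ORDER BY employee_id
"""

EXPLICIT = """
SELECT employee_id,
       SUM(salary) OVER (PARTITION BY department ORDER BY salary
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM employees
ORDER BY employee_id
"""


def running_totals(sql):
    """Load the sample rows into a fresh in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (employee_id INTEGER, department TEXT, salary INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", SAMPLE_ROWS)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


shorthand_result = running_totals(SHORTHAND)
explicit_result = running_totals(EXPLICIT)
print(shorthand_result == explicit_result)
```

Both spellings produce the same per-department running totals, so rewriting one form into the other changes readability only, not results.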
Notes:
The BETWEEN clause allows more flexible window ranges, with bounds such as CURRENT ROW, 1 PRECEDING, and UNBOUNDED FOLLOWING.
RANK() and DENSE_RANK()
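-- MySQL: ranking function example
SELECT
name,
score,
RANK() OVER (PARTITION BY class ORDER BY score DESC) AS rank_val, -- ties share a rank and the following ranks are skipped
DENSE_RANK() OVER (PARTITION BY class ORDER BY score DESC) AS dense_rank_val -- ties share a rank and no ranks are skipped (DENSE_RANK is a reserved word in MySQL 8.0, so it cannot serve as an unquoted alias)
FROM students;
Example result (assuming Alice and Bob in class A have the same score):
name | score | rank_val | dense_rank_val |
---|---|---|---|
Alice | 90 | 1 | 1 |
Bob | 90 | 1 | 1 |
Charlie | 85 | 3 | 2 |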
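The result table above can be reproduced with a small runnable sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in engine and uses the three hypothetical students from the table; the alias names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, class TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Alice", "A", 90), ("Bob", "A", 90), ("Charlie", "A", 85)],
)

# RANK() leaves a gap after ties; DENSE_RANK() does not.
ranking = conn.execute(
    """
    SELECT name,
           score,
           RANK()       OVER (PARTITION BY class ORDER BY score DESC) AS rank_val,
           DENSE_RANK() OVER (PARTITION BY class ORDER BY score DESC) AS dense_rank_val
    FROM students
    ORDER BY score DESC, name
    """
).fetchall()

for row in ranking:
    print(row)
```

Charlie receives rank 3 from RANK() because two rows precede him, but rank 2 from DENSE_RANK() because only one distinct score precedes his.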
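RANK() and DENSE_RANK()
KingbaseES is fully compatible with MySQL's ranking logic. It also supports Oracle-style ROW_NUMBER(), which numbers the rows of each partition strictly sequentially (MySQL 8.0 provides ROW_NUMBER() as well):
-- KingbaseES: ROW_NUMBER() ordering example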
SELECT
name,
score,
ROW_NUMBER() OVER (
PARTITION BY class
ORDER BY score DESC
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
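) AS row_num -- unique sequence number per row; unlike RANK(), tied scores get different numbers, and the frame clause has no effect on ROW_NUMBER()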
FROM students;
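Notes:
With RANGE instead of ROWS, the frame is defined by the values of the ORDER BY expression rather than by physical row positions, so rows with equal sort values always fall into the same frame.
RANGE therefore avoids the dependence on physical row offsets that ROWS has.
MySQL 8.0 supports both ROWS and RANGE frames, including FOLLOWING bounds; only versions before 8.0, which have no window functions at all, cannot express a sliding window this way:
-- MySQL 8.0: 3-day moving average of sales (not possible before 8.0)
SELECT
date,
sales,
AVG(sales) OVER (ORDER BY date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS avg_sales -- valid in MySQL 8.0: ROWS frames accept both PRECEDING and FOLLOWING bounds
FROM sales_data;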
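To make the difference between ROWS and RANGE frames concrete, here is a minimal sketch of a running total over data with tied sort values. It assumes SQLite (via Python's sqlite3 module) as a stand-in engine; the four salary rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, salary INTEGER)")
# Employees 2 and 3 have the same salary, so they are peers in the ordering.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 200), (4, 300)],
)


def running_totals(frame_mode):
    """Return the sorted running totals for frame_mode 'ROWS' or 'RANGE'."""
    if frame_mode not in ("ROWS", "RANGE"):
        raise ValueError("frame_mode must be 'ROWS' or 'RANGE'")
    sql = f"""
        SELECT SUM(salary) OVER (
                   ORDER BY salary
                   {frame_mode} BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
               ) AS running_total
        FROM employees
    """
    return sorted(row[0] for row in conn.execute(sql))


rows_totals = running_totals("ROWS")
range_totals = running_totals("RANGE")
print(rows_totals)
print(range_totals)
```

With ROWS, the two tied rows get different totals (300 and 500) depending on which one happens to come first. With RANGE, both tied rows see the same frame and both get 500.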
-- KingbaseES: 3-day moving average of sales
SELECT
date,
sales,
AVG(sales) OVER (
ORDER BY date
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
) AS avg_sales
FROM sales_data;
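A frame of ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING covers three rows, which matches three days only when every day has exactly one row. The sketch below contrasts it with a value-based RANGE frame on data with a missing day. It assumes SQLite 3.28 or later (via Python's sqlite3 module) as a stand-in engine, and it uses a hypothetical integer column day_no in place of a date so that the RANGE offset is a plain number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (day_no INTEGER, sales INTEGER)")
# Hypothetical data: day 4 is missing.
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (5, 70)],
)


def moving_average(frame_mode):
    """Centered moving average using a 'ROWS' or 'RANGE' frame."""
    if frame_mode not in ("ROWS", "RANGE"):
        raise ValueError("frame_mode must be 'ROWS' or 'RANGE'")
    sql = f"""
        SELECT day_no,
               AVG(sales) OVER (
                   ORDER BY day_no
                   {frame_mode} BETWEEN 1 PRECEDING AND 1 FOLLOWING
               ) AS avg_sales
        FROM sales_data
        ORDER BY day_no
    """
    return conn.execute(sql).fetchall()


rows_avg = moving_average("ROWS")
range_avg = moving_average("RANGE")
print(rows_avg)
print(range_avg)
```

For day 3, the ROWS frame reaches across the gap and includes day 5, while the RANGE frame only includes days 2 to 4. Which behavior is wanted depends on whether the window is meant to count rows or calendar distance.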
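Notes:
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING: averages the current row, the row before it, and the row after it. This equals a 3-day average only when sales_data holds exactly one row per day.
Window functions over a subquery also work in MySQL 8.0; the query below fails only on versions before 8.0, which lack window functions:
-- MySQL 8.0: window function over a filtered subquery
SELECT
department,
salary,
SUM(salary) OVER (PARTITION BY department) AS dept_total
FROM (
SELECT * FROM employees WHERE salary > 5000 -- filter in the subquery
) AS filtered;
MySQL 8.0 accepts window functions both over a derived table, as here, and inside a subquery's own SELECT list.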
-- KingbaseES: window function over a filtered subquery
SELECT
department,
salary,
SUM(salary) OVER (PARTITION BY department) AS dept_total
FROM (
SELECT * FROM employees WHERE salary > 5000 -- filter in the subquery
) AS filtered;
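SUBSTR() and SUBSTRING()
In MySQL, SUBSTR() and SUBSTRING() are equivalent, but in KingbaseES the behavior differs between MySQL mode and the default mode:
-- MySQL mode (KingbaseES compatibility setting)
SET mysql_substring_compatible = ON;
SELECT SUBSTR('abcd', -1, 1); -- returns 'd' (a negative start counts from the end of the string)
-- Default mode (PostgreSQL-compatible)
SET mysql_substring_compatible = OFF;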
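SELECT SUBSTR('abcd', -1, 1); -- a negative start is not counted from the end here; under PostgreSQL semantics this call returns '' (the two-argument SUBSTR('abcd', -1) returns 'abcd')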
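The two ways of interpreting a negative start position can be sketched in plain Python. The helpers mysql_substr and pg_substr are hypothetical names that emulate the documented MySQL and PostgreSQL rules respectively; they are an illustration, not KingbaseES code, and the actual behavior of each KingbaseES mode should be checked against its documentation.

```python
def mysql_substr(text, pos, length):
    """Emulate MySQL SUBSTR(text, pos, length): a negative pos counts from the end."""
    if pos == 0 or length < 1:
        return ""
    if pos < 0:
        start = len(text) + pos  # zero-based index counted from the end
        if start < 0:
            return ""  # pos points before the start of the string
    else:
        start = pos - 1  # convert one-based pos to a zero-based index
    return text[start:start + length]


def pg_substr(text, start, count):
    """Emulate PostgreSQL substr(text, start, count).

    The window covers one-based positions start .. start + count - 1, and
    positions before 1 are simply outside the string.
    """
    if count < 0:
        raise ValueError("negative substring length not allowed")
    first = max(start, 1) - 1          # zero-based, clipped to the string start
    end = max(start + count - 1, 0)    # zero-based exclusive end
    return text[first:end]


print(repr(mysql_substr("abcd", -1, 1)))
print(repr(pg_substr("abcd", -1, 1)))
```

For positive start positions both helpers agree, so only queries that pass zero or negative positions need review during migration.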
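LAG() and LEAD()
MySQL 8.0 accepts the optional default-value argument of LAG() and LEAD(); only versions before 8.0, which lack window functions, reject this query:
-- MySQL 8.0: LAG() with a default value
SELECT
name,
LAG(score, 1, 0) OVER (ORDER BY class) AS prev_score -- the third argument, 0, is returned when there is no previous row
FROM students;
KingbaseES supports it in full as well:
-- KingbaseES: LAG() with a default value
SELECT
name,
LAG(score, 1, 0) OVER (ORDER BY class) AS prev_score -- default value is 0
FROM students;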
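The effect of the third LAG() argument can be seen in a short runnable sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in engine and three hypothetical students; the rows are ordered by name here so that the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Alice", 90), ("Bob", 80), ("Charlie", 85)],
)

# With a default: the first row has no predecessor, so LAG() returns 0.
with_default = conn.execute(
    """
    SELECT name, LAG(score, 1, 0) OVER (ORDER BY name) AS prev_score
    FROM students
    ORDER BY name
    """
).fetchall()

# Without a default: the first row gets NULL (None in Python).
without_default = conn.execute(
    """
    SELECT name, LAG(score) OVER (ORDER BY name) AS prev_score
    FROM students
    ORDER BY name
    """
).fetchall()

print(with_default)
print(without_default)
```

Supplying the default avoids a separate COALESCE() around the window function when NULL is not a useful value downstream.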
-- Query (MySQL)
SELECT
customer_id,
order_date,
amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date ROWS UNBOUNDED PRECEDING) AS cumulative
FROM orders;
Result: 12.3 s elapsed, 45% CPU utilization.
-- Query (KingbaseES)
SELECT
customer_id,
order_date,
amount,
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative
FROM orders;
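Result: 7.8 s elapsed, 32% CPU utilization.
Conclusion: KingbaseES performs better in complex window function scenarios. Here it needed 7.8 s against 12.3 s for MySQL (about 37% less time) at lower CPU utilization, and the advantage is most pronounced on large data sets.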
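A comparison like this can be repeated with a small timing harness. The sketch below only shows the shape of such a measurement: it assumes SQLite (via Python's sqlite3 module) as a stand-in engine and generates a hypothetical orders table, so its timings say nothing about MySQL or KingbaseES. For a real comparison, the same query would be run through each database's own driver on identical data and hardware.

```python
import sqlite3
import time
from datetime import date, timedelta

QUERY = """
SELECT customer_id,
       order_date,
       amount,
       SUM(amount) OVER (
           PARTITION BY customer_id
           ORDER BY order_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS cumulative
FROM orders
"""


def build_orders(conn, customers=10, orders_per_customer=100):
    """Create a hypothetical orders table: one order of amount 1 per customer per day."""
    conn.execute(
        "CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount INTEGER)"
    )
    start = date(2024, 1, 1)
    rows = [
        (customer, (start + timedelta(days=day)).isoformat(), 1)
        for customer in range(customers)
        for day in range(orders_per_customer)
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


def timed_run(conn, sql):
    """Run the query, fetch all rows, and return (rows, elapsed seconds)."""
    started = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - started


conn = sqlite3.connect(":memory:")
build_orders(conn)
rows, elapsed = timed_run(conn, QUERY)
max_cumulative = max(row[3] for row in rows)
print(f"{len(rows)} rows in {elapsed:.4f} s")
```

Fetching all rows inside the timed section matters, because many drivers stream results lazily and would otherwise report only the time to start the query.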
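Replace the shorthand ROWS UNBOUNDED PRECEDING with the explicit ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
Run SET mysql_substring_compatible=ON; to make SUBSTR() behave consistently.
Use the default-value parameter of LAG()/LEAD().
Use RANGE frames to reduce the dependence on physical row offsets.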