Hive Window Functions: A Summary

Table of Contents

    • Overview
    • Overall Architecture and Flow
    • Example 1
    • Example 2
    • Summary

Overview

A summary of Hive window functions.

Overall Architecture and Flow

1. Basic usage of window functions

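function_name() over()

The over keyword specifies the set of rows the function operates on. It accepts three analytic clauses: a partitioning clause (partition by), an ordering clause (order by), and a window-frame clause (rows).

function_name(column) over(partition by <partition columns> order by <sort columns> rows between <frame start> and <frame end>)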
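To make the effect of each clause concrete, here is a minimal sketch. The inline table sales and its columns dept, seq and amount are hypothetical names invented for illustration; the query assumes a Hive version that supports CTEs and SELECT without FROM (0.13 or later).

```sql
-- Hypothetical inline data: dept = partition column, seq = sort column, amount = value column.
WITH sales AS (
  SELECT 'a' AS dept, 1 AS seq, 10 AS amount UNION ALL
  SELECT 'a', 2, 20 UNION ALL
  SELECT 'a', 3, 30 UNION ALL
  SELECT 'b', 1, 5  UNION ALL
  SELECT 'b', 2, 15
)
SELECT
  dept,
  seq,
  amount,
  -- Empty over(): the window is the whole result set.
  sum(amount) OVER () AS total_all,
  -- partition by only: the window is every row with the same dept.
  sum(amount) OVER (PARTITION BY dept) AS total_dept,
  -- All three clauses: a running total inside each dept, ordered by seq.
  sum(amount) OVER (PARTITION BY dept ORDER BY seq
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_dept
FROM sales
ORDER BY dept, seq;
```

With this data, total_all is 80 on every row, total_dept is 60 for dept a and 20 for dept b, and running_dept goes 10, 30, 60 for dept a and 5, 20 for dept b.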

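The window size can be restricted with rows between ... and ..., as follows:

sum(A) over(partition by B order by C rows between D1 and D2)
avg(A) over(partition by B order by C rows between D1 and D2)
A: the column to be aggregated
B: the partition column
C: the sort column
D1, D2: the start and end boundaries of the row range used in the calculation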
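rows between 2 preceding and current row -- the current row and the two rows before it (up to 3 rows)
rows between unbounded preceding and current row -- the current row and all rows before it
rows between current row and unbounded following -- the current row and all rows after it
rows between 3 preceding and current row -- the current row and the three rows before it (up to 4 rows)
rows between 3 preceding and 1 following -- the three rows before, the current row, and the one row after (up to 5 rows)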
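The frames listed above can be compared side by side. This is a sketch over a hypothetical five-row inline table t (columns seq and amount are made up for the example), with no partition by so that all rows share one window.

```sql
-- Hypothetical data: five rows, amount = 10 * seq.
WITH t AS (
  SELECT 1 AS seq, 10 AS amount UNION ALL
  SELECT 2, 20 UNION ALL
  SELECT 3, 30 UNION ALL
  SELECT 4, 40 UNION ALL
  SELECT 5, 50
)
SELECT
  seq,
  amount,
  -- Current row plus up to two rows before it.
  sum(amount) OVER (ORDER BY seq ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS prev2_to_cur,
  -- Running total: everything from the first row up to the current row.
  sum(amount) OVER (ORDER BY seq ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS run_total,
  -- Remaining total: the current row and everything after it.
  sum(amount) OVER (ORDER BY seq ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS rest_total,
  -- Up to three rows before, the current row, and one row after.
  sum(amount) OVER (ORDER BY seq ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING) AS prev3_to_next1
FROM t
ORDER BY seq;

-- seq  amount  prev2_to_cur  run_total  rest_total  prev3_to_next1
--   1      10            10         10         150              30
--   2      20            30         30         140              60
--   3      30            60         60         120             100
--   4      40            90        100          90             150
--   5      50           120        150          50             140
```

Near the edges of the partition a frame simply contains fewer rows, which is why the row counts in the list are upper bounds.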
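Note: when order by is present but the window clause is omitted, the frame defaults to range between unbounded preceding and current row (unbounded above). Because this default is range rather than rows, rows that tie on the order by column are all included in the current row's frame. When both order by and the window clause are omitted, the frame is the entire partition.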
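The default frame matters most when the sort column has ties. The Hive documentation describes the default with order by as a range frame, which treats tied rows as peers, whereas an explicit rows frame counts physical rows. The sketch below uses a hypothetical inline table t with two rows sharing seq = 2 to show the difference.

```sql
-- Hypothetical data: seq = 2 appears twice, so those two rows are peers.
WITH t AS (
  SELECT 1 AS seq, 10 AS amount UNION ALL
  SELECT 2, 20 UNION ALL
  SELECT 2, 30 UNION ALL
  SELECT 3, 40
)
SELECT
  seq,
  amount,
  -- No window clause: default frame (range, unbounded preceding to current row).
  sum(amount) OVER (ORDER BY seq) AS default_frame,
  -- The same frame written out explicitly.
  sum(amount) OVER (ORDER BY seq
                    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_frame,
  -- Physical-row frame: the tied rows get different running totals.
  sum(amount) OVER (ORDER BY seq
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_frame,
  -- Neither order by nor window clause: the frame is the whole partition.
  sum(amount) OVER () AS no_order
FROM t
ORDER BY seq, amount;
```

Here default_frame and range_frame both come out as 10, 60, 60, 100, since the two seq = 2 rows each see the other. rows_frame gives the two tied rows different values, and which row gets which depends on the order in which Hive happens to process the ties, so an explicit rows frame is safest when the sort column is unique. no_order is 100 on every row.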
