Window Functions in SQL
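A window function applies an aggregate or ranking function over a particular window (a set of rows). The OVER clause is used together with the window function to define that window. The OVER clause does two things:
- Partitions the rows into groups (using the PARTITION BY clause).
- Orders the rows within each partition (using the ORDER BY clause).
Note: If no partitioning is specified, ORDER BY orders all rows of the table as a single window.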
Basic syntax:
SELECT column_name1,
       window_function(column_name2)
       OVER ([PARTITION BY column_name1] [ORDER BY column_name3]) AS new_column
FROM table_name;
window_function = any aggregate or ranking function
column_name1 = column to be selected (in this template, also the column the rows are partitioned by)
column_name2 = column to which the window function is applied
column_name3 = column by which the rows within each partition are ordered
new_column = name of the new column
table_name = name of the table
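As a rough illustration of the template above, the following sketch runs it through Python's built-in `sqlite3` module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The `sales` table and its columns are hypothetical and exist only for this example. It compares a window defined with PARTITION BY against one without it.

```python
import sqlite3

# Hypothetical table, made up only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", 1, 10), ("East", 2, 20), ("West", 1, 5), ("West", 2, 15)],
)

# With PARTITION BY: the running total restarts for each region.
partitioned = conn.execute(
    """
    SELECT region, day,
           SUM(amount) OVER (PARTITION BY region ORDER BY day) AS running_total
    FROM sales
    ORDER BY region, day
    """
).fetchall()

# Without PARTITION BY: the whole table is treated as one window.
unpartitioned = conn.execute(
    """
    SELECT region, day,
           SUM(amount) OVER (ORDER BY region, day) AS running_total
    FROM sales
    ORDER BY region, day
    """
).fetchall()

print(partitioned)    # totals restart at the first West row
print(unpartitioned)  # totals keep accumulating across regions
conn.close()
```

In the first result the West rows start again from their own amounts, while in the second they continue from the East total, which is the behavior described in the note about omitting the partition.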
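Aggregate window functions: Aggregate functions such as SUM(), COUNT(), AVG(), MAX(), and MIN(), when applied over a particular window (a set of rows), are called aggregate window functions.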
Consider the following employee table:

| Name | Age | Department | Salary |
| --- | --- | --- | --- |
| Ramesh | 20 | Finance | 50,000 |
| Deep | 25 | Sales | 30,000 |
| Suresh | 22 | Finance | 50,000 |
| Ram | 28 | Finance | 20,000 |
| Pradeep | 22 | Sales | 20,000 |

Example: Find the average salary of employees in each department, and order the employees within each department by age.
SELECT Name, Age, Department, Salary,
       AVG(Salary) OVER (PARTITION BY Department ORDER BY Age) AS Avg_Salary
FROM employee;
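The output of the above query will be:

| Name | Age | Department | Salary | Avg_Salary |
| --- | --- | --- | --- | --- |
| Ramesh | 20 | Finance | 50,000 | 50,000 |
| Suresh | 22 | Finance | 50,000 | 50,000 |
| Ram | 28 | Finance | 20,000 | 40,000 |
| Pradeep | 22 | Sales | 20,000 | 20,000 |
| Deep | 25 | Sales | 30,000 | 25,000 |

As the example shows, the employees within each department are ordered by age, and the average salary is computed per department and shown in the Avg_Salary column. Because the OVER clause contains ORDER BY and no explicit frame, the default window frame ends at the current row, so Avg_Salary is a running average: only the last row of each department shows the average over the whole department (40,000 for Finance, 25,000 for Sales). To show the department-wide average on every row, omit ORDER BY from the OVER clause and sort the result in the outer query instead.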
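The effect of ORDER BY inside OVER on an aggregate can be checked with a small sketch. It loads the employee table into an in-memory SQLite database through Python's `sqlite3` module (assuming SQLite 3.25 or newer; the romanized names and the `running_avg` and `dept_avg` aliases are choices made for this illustration) and computes AVG both ways. With ORDER BY and no explicit frame, the standard default frame ends at the current row, so the first column is a running average; without ORDER BY, the window is the whole partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (Name TEXT, Age INTEGER, Department TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("Ramesh", 20, "Finance", 50000),
        ("Deep", 25, "Sales", 30000),
        ("Suresh", 22, "Finance", 50000),
        ("Ram", 28, "Finance", 20000),
        ("Pradeep", 22, "Sales", 20000),
    ],
)

rows = conn.execute(
    """
    SELECT Name, Department,
           -- ORDER BY present: average over rows up to the current one
           AVG(Salary) OVER (PARTITION BY Department ORDER BY Age) AS running_avg,
           -- ORDER BY absent: average over the whole department
           AVG(Salary) OVER (PARTITION BY Department) AS dept_avg
    FROM employee
    ORDER BY Department, Age
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

The two columns agree only on the last row of each department, where the running window has grown to cover the whole partition.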
Ranking window functions: The ranking functions are RANK(), DENSE_RANK(), and ROW_NUMBER().
- RANK(): Assigns a rank to every row within each partition. The first row gets rank 1, and rows with the same value in the ORDER BY column get the same rank. After a tie, the following rank values are skipped: if two rows share rank 1, the next row gets rank 3.
- DENSE_RANK(): Also assigns a rank to each row within a partition, starting at 1 and giving tied rows the same rank. Unlike RANK(), it skips no rank values: the rank after a tie is the next consecutive integer.
- ROW_NUMBER(): Assigns consecutive integers to the rows within a partition. No two rows in a partition can have the same row number.
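Note: ORDER BY must be specified in the OVER clause when using ranking window functions, since the ranks are defined by that ordering.
Example: Calculate the row number, rank, and dense rank of employees in the employee table according to salary within each department.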
SELECT ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS emp_row_no,
       Name, Department, Salary,
       RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS emp_rank,
       DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS emp_dense_rank
FROM employee;
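The output of the above query will be:

| emp_row_no | Name | Department | Salary | emp_rank | emp_dense_rank |
| --- | --- | --- | --- | --- | --- |
| 1 | Suresh | Finance | 50,000 | 1 | 1 |
| 2 | Ramesh | Finance | 50,000 | 1 | 1 |
| 3 | Ram | Finance | 20,000 | 3 | 2 |
| 1 | Deep | Sales | 30,000 | 1 | 1 |
| 2 | Pradeep | Sales | 20,000 | 2 | 2 |

As the definition of ROW_NUMBER() states, the row numbers are consecutive integers within each partition. Because Suresh and Ramesh earn the same salary, the query does not determine which of them receives row number 1. The table also shows the difference between rank and dense rank: dense rank leaves no gaps between rank values, whereas rank leaves a gap after a tie.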
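The three ranking functions can be compared side by side with a short sketch, again using Python's `sqlite3` module with an in-memory copy of the employee table (assuming SQLite 3.25 or newer). One deliberate change from the query above is an assumption of this example: ROW_NUMBER() gets `Name` as an extra ORDER BY column so that the tie between the two 50,000 salaries is broken the same way on every run. The tiebreaker is not added to RANK() or DENSE_RANK(), since it would remove the tie they are meant to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (Name TEXT, Age INTEGER, Department TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("Ramesh", 20, "Finance", 50000),
        ("Deep", 25, "Sales", 30000),
        ("Suresh", 22, "Finance", 50000),
        ("Ram", 28, "Finance", 20000),
        ("Pradeep", 22, "Sales", 20000),
    ],
)

rows = conn.execute(
    """
    SELECT Name, Department, Salary,
           -- Name is a tiebreaker added for this sketch only
           ROW_NUMBER() OVER (PARTITION BY Department
                              ORDER BY Salary DESC, Name) AS emp_row_no,
           RANK() OVER (PARTITION BY Department
                        ORDER BY Salary DESC) AS emp_rank,
           DENSE_RANK() OVER (PARTITION BY Department
                              ORDER BY Salary DESC) AS emp_dense_rank
    FROM employee
    ORDER BY Department, emp_row_no
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

With the tiebreaker, Ramesh sorts before Suresh, so the row numbers of those two employees are swapped relative to the table above; the rank and dense rank values are unaffected.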