PostgreSQL的window函數(shù)整理

簡(jiǎn)言之,聚合函數(shù)返回的各個(gè)分組的結(jié)果,窗口函數(shù)則為每一行返回結(jié)果,示例如下: 1.建示例表,初始化數(shù)據(jù) DROP TABLE IF EXISTS empsalary; CREATE TABLE empsalary( depname varchar, empno bigint, salary int, enroll_date date ); INSERT INTO empsalary V

簡(jiǎn)言之,聚合函數(shù)返回的各個(gè)分組的結(jié)果,窗口函數(shù)則為每一行返回結(jié)果,示例如下:
1.建示例表,初始化數(shù)據(jù)

DROP TABLE IF EXISTS empsalary;
CREATE TABLE empsalary(
? depname varchar,
? empno bigint,
? salary int,
? enroll_date date
);
INSERT INTO empsalary VALUES(‘develop’,10, 5200, ‘2007/08/01’);
INSERT INTO empsalary VALUES(‘sales’, 1, 5000, ‘2006/10/01’);
INSERT INTO empsalary VALUES(‘personnel’, 5, 3500, ‘2007/12/10’);
INSERT INTO empsalary VALUES(‘sales’, 4, 4800, ‘2007/08/08’);
INSERT INTO empsalary VALUES(‘sales’, 6, 5500, ‘2007/01/02’);
INSERT INTO empsalary VALUES(‘personnel’, 2, 3900, ‘2006/12/23’);
INSERT INTO empsalary VALUES(‘develop’, 7, 4200, ‘2008/01/01’);
INSERT INTO empsalary VALUES(‘develop’, 9, 4500, ‘2008/01/01’);
INSERT INTO empsalary VALUES(‘sales’, 3, 4800, ‘2007/08/01’);
INSERT INTO empsalary VALUES(‘develop’, 8, 6000, ‘2006/10/01’);
INSERT INTO empsalary VALUES(‘develop’, 11, 5200, ‘2007/08/15’);

postgres=# select * from empsalary ;
? depname? | empno | salary | enroll_date
———–+——-+——–+————-
develop?? |??? 10 |?? 5200 | 2007-08-01
sales???? |???? 1 |?? 5000 | 2006-10-01
personnel |???? 5 |?? 3500 | 2007-12-10
sales???? |???? 4 |?? 4800 | 2007-08-08
sales???? |???? 6 |?? 5500 | 2007-01-02
personnel |???? 2 |?? 3900 | 2006-12-23
develop?? |???? 7 |?? 4200 | 2008-01-01
develop?? |???? 9 |?? 4500 | 2008-01-01
sales???? |???? 3 |?? 4800 | 2007-08-01
develop?? |???? 8 |?? 6000 | 2006-10-01
develop?? |??? 11 |?? 5200 | 2007-08-15
(11 rows)2.統(tǒng)計(jì)示例
a.統(tǒng)計(jì)各部門(mén)的總薪水,平均薪水和部門(mén)的詳細(xì)情況
postgres=# select sum(salary) OVER (PARTITION BY depname),avg(salary) OVER (PARTITION BY depname),* from empsalary;
? sum? |????????? avg????????? |? depname? | empno | salary | enroll_date
——-+———————–+———–+——-+——–+————-
25100 | 5020.0000000000000000 | develop?? |??? 10 |?? 5200 | 2007-08-01
25100 | 5020.0000000000000000 | develop?? |???? 7 |?? 4200 | 2008-01-01
25100 | 5020.0000000000000000 | develop?? |???? 9 |?? 4500 | 2008-01-01
25100 | 5020.0000000000000000 | develop?? |???? 8 |?? 6000 | 2006-10-01
25100 | 5020.0000000000000000 | develop?? |??? 11 |?? 5200 | 2007-08-15
? 7400 | 3700.0000000000000000 | personnel |???? 2 |?? 3900 | 2006-12-23
? 7400 | 3700.0000000000000000 | personnel |???? 5 |?? 3500 | 2007-12-10
20100 | 5025.0000000000000000 | sales???? |???? 3 |?? 4800 | 2007-08-01
20100 | 5025.0000000000000000 | sales???? |???? 1 |?? 5000 | 2006-10-01
20100 | 5025.0000000000000000 | sales???? |???? 4 |?? 4800 | 2007-08-08
20100 | 5025.0000000000000000 | sales???? |???? 6 |?? 5500 | 2007-01-02
(11 rows)b.統(tǒng)計(jì)人員在所在部門(mén)的薪水排名情況
postgres=# select rank() OVER (PARTITION BY depname ORDER BY salary),* from empsalary;
rank |? depname? | empno | salary | enroll_date
——+———–+——-+——–+————-
??? 1 | develop?? |???? 7 |?? 4200 | 2008-01-01
??? 2 | develop?? |???? 9 |?? 4500 | 2008-01-01
??? 3 | develop?? |??? 10 |?? 5200 | 2007-08-01
??? 3 | develop?? |??? 11 |?? 5200 | 2007-08-15
??? 5 | develop?? |???? 8 |?? 6000 | 2006-10-01
??? 1 | personnel |???? 5 |?? 3500 | 2007-12-10
??? 2 | personnel |???? 2 |?? 3900 | 2006-12-23
??? 1 | sales???? |???? 4 |?? 4800 | 2007-08-08
??? 1 | sales???? |???? 3 |?? 4800 | 2007-08-01
??? 3 | sales???? |???? 1 |?? 5000 | 2006-10-01
??? 4 | sales???? |???? 6 |?? 5500 | 2007-01-02
(11 rows)3.一個(gè)有趣的例子 注意使用order by,結(jié)果會(huì)兩樣
create table foo(a int,b int) ;
insert into foo values (1,1);
insert into foo values (1,1);
insert into foo values (2,1);
insert into foo values (4,1);
insert into foo values (2,1);
insert into foo values (4,1);
insert into foo values (5,1);
insert into foo values (11,3);
insert into foo values (12,3);
insert into foo values (22,3);
insert into foo values (16,3);
insert into foo values (16,3);
insert into foo values (16,3);

postgres=# select sum(a) over (partition by b), a, b from foo;
sum | a? | b
—–+—-+—
? 19 |? 1 | 1
? 19 |? 1 | 1
? 19 |? 2 | 1
? 19 |? 4 | 1
? 19 |? 2 | 1
? 19 |? 4 | 1
? 19 |? 5 | 1
? 93 | 11 | 3
? 93 | 12 | 3
? 93 | 22 | 3
? 93 | 16 | 3
? 93 | 16 | 3
? 93 | 16 | 3
(13 rows)

postgres=# select sum(a) over (partition by b order by a), a, b from foo;
sum | a? | b
—–+—-+—
?? 2 |? 1 | 1
?? 2 |? 1 | 1
?? 6 |? 2 | 1
?? 6 |? 2 | 1
? 14 |? 4 | 1
? 14 |? 4 | 1
? 19 |? 5 | 1
? 11 | 11 | 3
? 23 | 12 | 3
? 71 | 16 | 3
? 71 | 16 | 3
? 71 | 16 | 3
? 93 | 22 | 3
(13 rows)

postgres=# select a, b, sum(a) over (partition by b order by a ROWS
postgres(# BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) from foo;
a? | b | sum
—-+—+—–
? 1 | 1 |? 19
? 1 | 1 |? 19
? 2 | 1 |? 19
? 2 | 1 |? 19
? 4 | 1 |? 19
? 4 | 1 |? 19
? 5 | 1 |? 19
11 | 3 |? 93
12 | 3 |? 93
16 | 3 |? 93
16 | 3 |? 93
16 | 3 |? 93
22 | 3 |? 93
(13 rows)官網(wǎng)中的解釋是: By default, if ORDER BY is supplied then the frame consists of all rows from the start of the partition up through the current row, plus any following rows that are equal to the current row according to the ORDER BY clause. When ORDER BY is omitted the default frame consists of all rows in the partition.
默認(rèn)情況下,帶了order by 參數(shù)會(huì)從分組的起始值開(kāi)始一直疊加,直到當(dāng)前值為止,當(dāng)忽略order by 參數(shù)則會(huì)計(jì)算分組中所有值的和。

4.其他的窗口函數(shù)
row_number(): 從當(dāng)前開(kāi)始,不間斷,如1,2,3,4,5,6
rank() :從當(dāng)前開(kāi)始,會(huì)間斷,如1,2,2,4,5,6
dense_rank():從當(dāng)前開(kāi)始不間斷,但會(huì)重復(fù),如1,2,2,3,4,5
percent_rank():從當(dāng)前開(kāi)始,計(jì)算在分組中的比例,如0,0.25,0.25,0.75,1,0,1 從0-1不斷地循環(huán)
cume_dist():當(dāng)前行的排序除以分組的數(shù)量,如分組有4行,則值為0.25,0.5,0.75,1
ntile(num_buckets Integer):從1到當(dāng)前值,除以分組的的數(shù)量,盡可能使分布平均
lag(value any [, offset integer [, default any ]]):偏移量函數(shù),取滯后值,如lag(column_name,2,0)表示字段偏移量為2,沒(méi)有則用default值代替,這里是0,不寫(xiě)默認(rèn)是null
lead(value any [, offset integer [, default any ]]):偏移量函數(shù),取提前值,類上 first_value(value any):返回窗口框架中的第一個(gè)值
last_value(value any):返回窗口框架中的最后一個(gè)值
nth_value(value any, nth integer):返回窗口框架中的指定值,如nth_value(salary,2),則表示返回字段salary的第二個(gè)窗口函數(shù)值

5.其他窗口函數(shù)示例
postgres=# select row_number() over (partition by depname order by salary desc),* from empsalary;
row_number |? depname? | empno | salary | enroll_date
————+———–+——-+——–+————-
????????? 1 | develop?? |???? 8 |?? 6000 | 2006-10-01
????????? 2 | develop?? |??? 10 |?? 5200 | 2007-08-01
????????? 3 | develop?? |??? 11 |?? 5200 | 2007-08-15
????????? 4 | develop?? |???? 9 |?? 4500 | 2008-01-01
????????? 5 | develop?? |???? 7 |?? 4200 | 2008-01-01
????????? 1 | personnel |???? 2 |?? 3900 | 2006-12-23
????????? 2 | personnel |???? 5 |?? 3500 | 2007-12-10
????????? 1 | sales???? |???? 6 |?? 5500 | 2007-01-02
????????? 2 | sales???? |???? 1 |?? 5000 | 2006-10-01
????????? 3 | sales???? |???? 3 |?? 4800 | 2007-08-01
????????? 4 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select rank() over(partition by depname order by salary desc),* from empsalary;
rank |? depname? | empno | salary | enroll_date
——+———–+——-+——–+————-
??? 1 | develop?? |???? 8 |?? 6000 | 2006-10-01
??? 2 | develop?? |??? 10 |?? 5200 | 2007-08-01
??? 2 | develop?? |??? 11 |?? 5200 | 2007-08-15
??? 4 | develop?? |???? 9 |?? 4500 | 2008-01-01
??? 5 | develop?? |???? 7 |?? 4200 | 2008-01-01
??? 1 | personnel |???? 2 |?? 3900 | 2006-12-23
??? 2 | personnel |???? 5 |?? 3500 | 2007-12-10
??? 1 | sales???? |???? 6 |?? 5500 | 2007-01-02
??? 2 | sales???? |???? 1 |?? 5000 | 2006-10-01
??? 3 | sales???? |???? 3 |?? 4800 | 2007-08-01
??? 3 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select dense_rank() over(partition by depname order by salary desc),* from empsalary;
dense_rank |? depname? | empno | salary | enroll_date
————+———–+——-+——–+————-
????????? 1 | develop?? |???? 8 |?? 6000 | 2006-10-01
????????? 2 | develop?? |??? 10 |?? 5200 | 2007-08-01
????????? 2 | develop?? |??? 11 |?? 5200 | 2007-08-15
????????? 3 | develop?? |???? 9 |?? 4500 | 2008-01-01
????????? 4 | develop?? |???? 7 |?? 4200 | 2008-01-01
????????? 1 | personnel |???? 2 |?? 3900 | 2006-12-23
????????? 2 | personnel |???? 5 |?? 3500 | 2007-12-10
????????? 1 | sales???? |???? 6 |?? 5500 | 2007-01-02
????????? 2 | sales???? |???? 1 |?? 5000 | 2006-10-01
????????? 3 | sales???? |???? 3 |?? 4800 | 2007-08-01
????????? 3 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select percent_rank() over(partition by depname order by salary desc),* from empsalary;
?? percent_rank??? |? depname? | empno | salary | enroll_date
——————-+———–+——-+——–+————-
???????????????? 0 | develop?? |???? 8 |?? 6000 | 2006-10-01
????????????? 0.25 | develop?? |??? 10 |?? 5200 | 2007-08-01
????????????? 0.25 | develop?? |??? 11 |?? 5200 | 2007-08-15
????????????? 0.75 | develop?? |???? 9 |?? 4500 | 2008-01-01
???????????????? 1 | develop?? |???? 7 |?? 4200 | 2008-01-01
???????????????? 0 | personnel |???? 2 |?? 3900 | 2006-12-23
???????????????? 1 | personnel |???? 5 |?? 3500 | 2007-12-10
???????????????? 0 | sales???? |???? 6 |?? 5500 | 2007-01-02
0.333333333333333 | sales???? |???? 1 |?? 5000 | 2006-10-01
0.666666666666667 | sales???? |???? 3 |?? 4800 | 2007-08-01
0.666666666666667 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select cume_dist()over(partition by depname order by salary desc),* from empsalary;
cume_dist |? depname? | empno | salary | enroll_date
———–+———–+——-+——–+————-
?????? 0.2 | develop?? |???? 8 |?? 6000 | 2006-10-01
?????? 0.6 | develop?? |??? 10 |?? 5200 | 2007-08-01
?????? 0.6 | develop?? |??? 11 |?? 5200 | 2007-08-15
?????? 0.8 | develop?? |???? 9 |?? 4500 | 2008-01-01
???????? 1 | develop?? |???? 7 |?? 4200 | 2008-01-01
?????? 0.5 | personnel |???? 2 |?? 3900 | 2006-12-23
???????? 1 | personnel |???? 5 |?? 3500 | 2007-12-10
????? 0.25 | sales???? |???? 6 |?? 5500 | 2007-01-02
?????? 0.5 | sales???? |???? 1 |?? 5000 | 2006-10-01
???????? 1 | sales???? |???? 3 |?? 4800 | 2007-08-01
???????? 1 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select ntile(3)over(partition by depname order by salary desc),* from empsalary;
ntile |? depname? | empno | salary | enroll_date
——-+———–+——-+——–+————-
???? 1 | develop?? |???? 8 |?? 6000 | 2006-10-01
???? 1 | develop?? |??? 10 |?? 5200 | 2007-08-01
???? 2 | develop?? |??? 11 |?? 5200 | 2007-08-15
???? 2 | develop?? |???? 9 |?? 4500 | 2008-01-01
???? 3 | develop?? |???? 7 |?? 4200 | 2008-01-01
???? 1 | personnel |???? 2 |?? 3900 | 2006-12-23
???? 2 | personnel |???? 5 |?? 3500 | 2007-12-10
???? 1 | sales???? |???? 6 |?? 5500 | 2007-01-02
???? 1 | sales???? |???? 1 |?? 5000 | 2006-10-01
???? 2 | sales???? |???? 3 |?? 4800 | 2007-08-01
???? 3 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select lag(salary,2,null)over(partition by depname order by salary desc),* from empsalary;
lag? |? depname? | empno | salary | enroll_date
——+———–+——-+——–+————-
????? | develop?? |???? 8 |?? 6000 | 2006-10-01
????? | develop?? |??? 10 |?? 5200 | 2007-08-01
6000 | develop?? |??? 11 |?? 5200 | 2007-08-15
5200 | develop?? |???? 9 |?? 4500 | 2008-01-01
5200 | develop?? |???? 7 |?? 4200 | 2008-01-01
????? | personnel |???? 2 |?? 3900 | 2006-12-23
????? | personnel |???? 5 |?? 3500 | 2007-12-10
????? | sales???? |???? 6 |?? 5500 | 2007-01-02
????? | sales???? |???? 1 |?? 5000 | 2006-10-01
5500 | sales???? |???? 3 |?? 4800 | 2007-08-01
5000 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select first_value(salary)over(partition by depname order by salary desc),* from empsalary;
first_value |? depname? | empno | salary | enroll_date
————-+———–+——-+——–+————-
??????? 6000 | develop?? |???? 8 |?? 6000 | 2006-10-01
??????? 6000 | develop?? |??? 10 |?? 5200 | 2007-08-01
??????? 6000 | develop?? |??? 11 |?? 5200 | 2007-08-15
??????? 6000 | develop?? |???? 9 |?? 4500 | 2008-01-01
??????? 6000 | develop?? |???? 7 |?? 4200 | 2008-01-01
??????? 3900 | personnel |???? 2 |?? 3900 | 2006-12-23
??????? 3900 | personnel |???? 5 |?? 3500 | 2007-12-10
??????? 5500 | sales???? |???? 6 |?? 5500 | 2007-01-02
??????? 5500 | sales???? |???? 1 |?? 5000 | 2006-10-01
??????? 5500 | sales???? |???? 3 |?? 4800 | 2007-08-01
??????? 5500 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select last_value(salary)over(partition by depname order by salary desc),* from empsalary;
last_value |? depname? | empno | salary | enroll_date
————+———–+——-+——–+————-
?????? 6000 | develop?? |???? 8 |?? 6000 | 2006-10-01
?????? 5200 | develop?? |??? 10 |?? 5200 | 2007-08-01
?????? 5200 | develop?? |??? 11 |?? 5200 | 2007-08-15
?????? 4500 | develop?? |???? 9 |?? 4500 | 2008-01-01
?????? 4200 | develop?? |???? 7 |?? 4200 | 2008-01-01
?????? 3900 | personnel |???? 2 |?? 3900 | 2006-12-23
?????? 3500 | personnel |???? 5 |?? 3500 | 2007-12-10
?????? 5500 | sales???? |???? 6 |?? 5500 | 2007-01-02
?????? 5000 | sales???? |???? 1 |?? 5000 | 2006-10-01
?????? 4800 | sales???? |???? 3 |?? 4800 | 2007-08-01
?????? 4800 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)

postgres=# select nth_value(salary,2)over(partition by depname order by salary desc),* from empsalary;
nth_value |? depname? | empno | salary | enroll_date
———–+———–+——-+——–+————-
?????????? | develop?? |???? 8 |?? 6000 | 2006-10-01
????? 5200 | develop?? |??? 10 |?? 5200 | 2007-08-01
????? 5200 | develop?? |??? 11 |?? 5200 | 2007-08-15
????? 5200 | develop?? |???? 9 |?? 4500 | 2008-01-01
????? 5200 | develop?? |???? 7 |?? 4200 | 2008-01-01
?????????? | personnel |???? 2 |?? 3900 | 2006-12-23
????? 3500 | personnel |???? 5 |?? 3500 | 2007-12-10
?????????? | sales???? |???? 6 |?? 5500 | 2007-01-02
????? 5000 | sales???? |???? 1 |?? 5000 | 2006-10-01
????? 5000 | sales???? |???? 3 |?? 4800 | 2007-08-01
????? 5000 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)當(dāng)一個(gè)查詢涉及多個(gè)窗口函數(shù)的時(shí)候,可以用別名的辦法來(lái)使用,更簡(jiǎn)單:
postgres=# select sum(salary)over w,avg(salary) over w,* from empsalary window w as (partition by depname order by salary desc);
? sum? |????????? avg????????? |? depname? | empno | salary | enroll_date
——-+———————–+———–+——-+——–+————-
? 6000 | 6000.0000000000000000 | develop?? |???? 8 |?? 6000 | 2006-10-01
16400 | 5466.6666666666666667 | develop?? |??? 10 |?? 5200 | 2007-08-01
16400 | 5466.6666666666666667 | develop?? |??? 11 |?? 5200 | 2007-08-15
20900 | 5225.0000000000000000 | develop?? |???? 9 |?? 4500 | 2008-01-01
25100 | 5020.0000000000000000 | develop?? |???? 7 |?? 4200 | 2008-01-01
? 3900 | 3900.0000000000000000 | personnel |???? 2 |?? 3900 | 2006-12-23
? 7400 | 3700.0000000000000000 | personnel |???? 5 |?? 3500 | 2007-12-10
? 5500 | 5500.0000000000000000 | sales???? |???? 6 |?? 5500 | 2007-01-02
10500 | 5250.0000000000000000 | sales???? |???? 1 |?? 5000 | 2006-10-01
20100 | 5025.0000000000000000 | sales???? |???? 3 |?? 4800 | 2007-08-01
20100 | 5025.0000000000000000 | sales???? |???? 4 |?? 4800 | 2007-08-08
(11 rows)這個(gè)寫(xiě)法和下面的是一樣的,不過(guò)更簡(jiǎn)單
SELECT sum(salary) OVER (PARTITION BY depname ORDER BY salary DESC), avg(salary) OVER (PARTITION BY depname ORDER BY salary DESC),* FROM empsalary;?

? 版權(quán)聲明
THE END
喜歡就支持一下吧
點(diǎn)贊14 分享