PostgreSQL and Window Functions: differences between row_number(), rank(), and dense_rank()

The windows functions bring in SQL the ability to compute values on windows. If the word window isn't evident to you, think about partitions or groups of rows. With these functions, you can, for example, get the value of the previous row, the cumulative sum, an average or median on the window, etc.
A convenient usage is to be able to number the rows. There are 3 functions with slightly different behaviors.
The definitions in the documentation:
- row_number(): number of the current row within its partition, counting from 1
- rank(): rank of the current row with gaps; same as row_number of its first peer.
- dense_rank(): rank of the current row without gaps; this function counts peer groups.
Let's try to understand with an example. Let's take a table with movies.
select
"movie_title",
"country",
"title_year",
"imdb_score"
from "movies";
movie_title | country | title_year | imdb_score |
---|---|---|---|
Avatar | USA | 2009 | 7.9 |
Pirates of the Caribbean: At World's End | USA | 2007 | 7.1 |
Spectre | UK | 2015 | 6.8 |
The Dark Knight Rises | USA | 2012 | 8.5 |
Star Wars: Episode VII - The Force Awakens | NULL | NULL | 7.1 |
John Carter | USA | 2012 | 6.6 |
Spider-Man 3 | USA | 2007 | 6.2 |
Tangled | USA | 2010 | 7.8 |
Avengers: Age of Ultron | NULL | NULL | 7.5 |
Harry Potter and the Half-Blood Prince | UK | 2009 | 7.5 |
Batman v Superman: Dawn of Justice | USA | 2016 | 6.8 |
Let's try the three functions by calculating the position of each movie related to their IMDB score (OVER(ORDER BY "imdb_score" DESC)
).
select
row_number() OVER(ORDER BY "imdb_score" DESC) AS "row_number",
rank() OVER(ORDER BY "imdb_score" DESC) AS "rank",
dense_rank() OVER(ORDER BY "imdb_score" DESC) AS "dense_rank",
"movie_title",
"country",
"title_year",
"imdb_score"
from "movies";
row_number | rank | dense_rank | movie_title | country | title_year | imdb_score |
---|---|---|---|---|---|---|
1 | 1 | 1 | The Dark Knight Rises | USA | 2012 | 8.5 |
2 | 2 | 2 | Avatar | USA | 2009 | 7.9 |
3 | 3 | 3 | Tangled | USA | 2010 | 7.8 |
4 | 4 | 4 | Avengers: Age of Ultron | NULL | NULL | 7.5 |
5 | 4 | 4 | Harry Potter and the Half-Blood Prince | UK | 2009 | 7.5 |
6 | 6 | 5 | Pirates of the Caribbean: At World's End | USA | 2007 | 7.1 |
7 | 6 | 5 | Star Wars: Episode VII - The Force Awakens | NULL | NULL | 7.1 |
8 | 8 | 6 | Batman v Superman: Dawn of Justice | USA | 2016 | 6.8 |
9 | 8 | 6 | Spectre | UK | 2015 | 6.8 |
10 | 10 | 7 | John Carter | USA | 2012 | 6.6 |
11 | 11 | 8 | Spider-Man 3 | USA | 2007 | 6.2 |
We can see that differences appear when films have identical notes. The three functions differ in the following way:
- row_number(): row number, no equality possible
- rank(): row number, but share the number when equality, then resume with a gap
- dense_rank(): row number, but share the number when equality, then resumes at the next number (no gap)
Note that NULL values do not influence.
Similarly, partitioning (PARTITION BY) does not change the behavior of these functions.