Making use of sorted output
B-tree indexes are not only useful to find rows. They are also useful to feed sorted data to the next stage in the process:
test=# EXPLAIN SELECT * FROM t_test ORDER BY id DESC LIMIT 10;
QUERY PLAN
---------------------------------------------------------------
Limit (cost=0.43..0.74 rows=10 width=9)
-> Index Scan Backward using idx_id on t_test
(cost=0.43..125505.43 rows=4000000 width=9)
(2 rows)
In this case, the index already returns data in the right sort order and therefore there is no need to sort the entire set of data. Reading the last 10 rows of the index will be enough to answer this query. Practically, it means that it is possible to find the top N rows of a table in a fraction of a millisecond.
However, ORDER BY is not the only operation requiring sorted output. The min and max functions are also all about sorted output so an index can be used to speed up those two operations as well. Here is an example:
test=# explain SELECT min(id), max(id) FROM t_test;
QUERY PLAN
----------------------------------------------------------------
Result (cost=0.93..0.94 rows=1 width=8)
InitPlan 1 (returns $0)
-> Limit (cost=0.43..0.46 rows=1 width=4)
-> Index Only Scan using idx_id on t_test
(cost=0.43..135505.43 rows=4000000 width=4)
Index Cond: (id IS NOT NULL)
InitPlan 2 (returns $1)
-> Limit (cost=0.43..0.46 rows=1 width=4)
-> Index Only Scan Backward using idx_id on t_test t_test_1
(cost=0.43..135505.43 rows=4000000 width=4)
Index Cond: (id IS NOT NULL)
(9 rows)
In PostgreSQL, an index (or a B-tree, to be more precise) can be read in normal order or backward. The thing now: a B-tree can be seen as a sorted list. So naturally the lowest value is at the beginning and the highest value is at the end. Therefore, min and max are perfect candidates for a speed up.
In SQL, many operations rely on sorted input; therefore, understanding those operations is essential because there are serious implications on the indexing side.