Using more than one index at a time
Up to now, you have seen that one index at a time has been used. However, in many real-world situations, this is, by far, not sufficient. There are cases demanding more logic in the database.
PostgreSQL allows the use of multiple indexes in a single query. Of course, this makes sense if many columns are queried at the same time. However, that's not always the case. It can also happen that a single index is used multiple times to process the very same column. Here is an example:
test=# explain SELECT * FROM t_test WHERE id = 30 OR id = 50;
QUERY PLAN
-----------------------------------------------------------
Bitmap Heap Scan on t_test (cost=8.88..16.85 rows=2 width=9)
Recheck Cond: ((id = 30) OR (id = 50))
-> BitmapOr (cost=8.88..8.88 rows=2 width=0)
-> Bitmap Index Scan on idx_idv (cost=0.00..4.44 rows=1 width=0)
Index Cond: (id = 30)
-> Bitmap Index Scan on idx_id (cost=0.00..4.44 rows=1 width=0)
Index Cond: (id = 50)
(7 rows)
The point here is that the id column is needed twice. First the query looks for 30 and then for 50. As you can see, PostgreSQL will go for a so -called bitmap scan.
The idea behind a bitmap scan is that PostgreSQL will scan the first index, collecting a list of blocks containing the data. Then the next index will be scanned to again compile a list of blocks. This works for as many indexes as desired. In the case of OR, those lists will then be unified, leaving us with a large lists of blocks containing the data. Using this list, the table will be scanned to retrieve those blocks. The trouble now is that PostgreSQL has retrieved a lot more data than needed. In our case, the query will look for two rows; however, a couple of blocks might have been returned by the bitmap scan. Therefore, the executor will do a so called recheck to filter out those rows, which do not satisfy our conditions.
Bitmap scans will also work for AND conditions or a mixture of AND and OR. However, if PostgreSQL sees an AND condition it does not necessarily force itself into a bitmap scan. Let's suppose that we got a query looking for everybody living in Austria and a person with a certain ID. It really makes no sense to use two indexes here because after searching for the ID there is really not much data left. Scanning both indexes would be ways more expensive because there are 8 million people (including me) living in Austria, and reading so many rows to find just one person is pretty pointless from a performance standpoint. The good news is that the PostgreSQL optimizer will make all those decisions for you by comparing the costs of different options and potential indexes, so there is no need to worry.