Using bitmap scans effectively
The question naturally arising now is: when is a bitmap scan most beneficial and when is it chosen by the optimizer? From my point of view, there are really two use cases:
- Avoiding using the same block over and over again
- Combining relatively bad conditions
The first case it quite common. Suppose you are looking for everybody who speaks a certain language. For the sake of the example, we can assume that 10% of all people speak the required language. Scanning the index would mean that a block in the table has to be scanned all over again as many skilled speakers might be stored in the same block. By applying a bitmap scan, it is ensured that a specific block is only used once, which of course leads to better performance.
The second common use case is to use relatively weak criteria together. Let's suppose we are looking for everybody between 20 and 30 years of age owning a yellow shirt. Now, maybe 15% of all people are between 20 and 30 and maybe 15% of all people actually own a yellow shirt. Scanning a table sequentially is expensive, and so PostgreSQL might decide to choose two indexes because the final result might consist of just 1% of the data. Scanning both indexes might be cheaper than reading all of the data.