The pooling layer
The pooling layer reduces the spatial dimensions (height and width) of the activation tensors in a CNN, but not their depth. It is a non-parametric way of doing this, meaning that the pooling layer has no weights to learn. Basically, the following is what you gain from using pooling:
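- A cheap way of summarizing spatially related information in an input tensor
- With less spatial information to process, the layers that follow need less computation and memory
- You get some translation invariance in your network, since a small shift in the input often leaves the pooled output unchanged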
However, one of the big advantages of pooling, that it has no parameters to learn, is also its biggest disadvantage, because pooling can end up throwing important information away. As a result, pooling is starting to be used less frequently in CNNs.
The diagram here shows the most common type of pooling, the max-pooling layer. It slides a window over the input, like a normal convolution, and at each location outputs the largest value found inside the window.
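To make the sliding-window behaviour concrete, here is a minimal pure-Python sketch of max pooling on a single 2-D feature map. The function name max_pool_2d and the sample values are made up for illustration, and the sketch assumes a square window with no padding; it is not how TensorFlow implements the operation internally.

```python
def max_pool_2d(image, pool_size=2, stride=2):
    """Max-pool a 2-D list of numbers with a square window and no padding."""
    height, width = len(image), len(image[0])
    # Number of window positions that fit fully inside the input.
    out_h = (height - pool_size) // stride + 1
    out_w = (width - pool_size) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Collect every value covered by the window at this position.
            window = [image[r][c]
                      for r in range(top, top + pool_size)
                      for c in range(left, left + pool_size)]
            row.append(max(window))
        output.append(row)
    return output


# Hypothetical 4x4 feature map, pooled with a 2x2 window and stride 2.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [7, 1, 3, 8],
]
print(max_pool_2d(feature_map))  # [[6, 2], [7, 9]]
```

Each 2x2 block of the input collapses to its largest value, so the 4x4 map becomes 2x2. Note that the other three values in every window are discarded, which is the information loss mentioned earlier.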
In TensorFlow 1.x, we can define a pooling layer with the tf.layers API like this (in TensorFlow 2.x, the equivalent layer is tf.keras.layers.MaxPool2D):
tf.layers.max_pooling2d(inputs=some_input_layer, pool_size=[2, 2], strides=2)
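As a rough check on what the call above does to tensor shapes, the following sketch computes the output shape of a pooling layer. The helper pooled_shape is hypothetical, not part of TensorFlow, and it assumes the defaults of tf.layers.max_pooling2d: 'valid' padding (no padding) and channels-last (batch, height, width, channels) layout.

```python
def pooled_shape(input_shape, pool_size=(2, 2), strides=(2, 2)):
    """Shape after pooling a channels-last tensor with 'valid' padding."""
    batch, height, width, channels = input_shape
    # Only windows that fit entirely inside the input are used.
    out_h = (height - pool_size[0]) // strides[0] + 1
    out_w = (width - pool_size[1]) // strides[1] + 1
    # Pooling acts on each channel separately, so depth is unchanged.
    return (batch, out_h, out_w, channels)


print(pooled_shape((1, 28, 28, 32)))  # (1, 14, 14, 32)
print(pooled_shape((1, 7, 7, 64)))    # (1, 3, 3, 64)
```

With a 2x2 window and stride 2, height and width are halved while the depth stays the same. When the input size is odd, as in the second call, the last row and column are dropped under 'valid' padding.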