Calculating the number of operations
Now we’re interested in calculating the computational cost of a particular convolution layer. This step is important if you would like to design efficient network structures, for example when speed is critical, as on mobile devices. Another reason is to estimate how many multipliers are needed to implement a particular layer in hardware. The convolutional layers in modern CNN architectures can be responsible for up to 90% of all computation in the model!
These are the factors that impact the number of MACs (multiply-accumulate operations):
- Convolution kernel size (F)
- Number of filters (M)
- Height and width of the input feature map (H, W)
- Input batch size (B)
- Input depth size (channels) (C)
- Convolution layer stride (S)
- Convolution padding (P)
The number of MACs can then be calculated as:
#MAC = [F * F * C * ((H + 2*P - F)/S + 1) * ((W + 2*P - F)/S + 1) * M] * B

Here the two parenthesized terms are simply the output height and width of the layer.
For example, let's consider a conv layer with input 224 x 224 x 3, batch size 1, kernel 3x3, 64 filters, stride 1, and pad 1:
#MAC = 3 * 3 * 3 * ((224 + 2*1 - 3)/1 + 1) * ((224 + 2*1 - 3)/1 + 1) * 64 * 1 = 3 * 3 * 3 * 224 * 224 * 64 = 86,704,128
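As a sanity check, the formula is easy to encode in a few lines of Python. The helper name and signature below are just illustrative, not part of any library:

```python
def conv_macs(H, W, C, F, M, S=1, P=0, B=1):
    """MAC count for a conv layer (biases excluded)."""
    out_h = (H + 2 * P - F) // S + 1  # output feature map height
    out_w = (W + 2 * P - F) // S + 1  # output feature map width
    # Each output element needs F*F*C multiply-accumulates, for M filters and B inputs.
    return F * F * C * out_h * out_w * M * B

# Example from the text: 224x224x3 input, 3x3 kernel, 64 filters, stride 1, pad 1
print(conv_macs(H=224, W=224, C=3, F=3, M=64, S=1, P=1))  # 86704128
```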
For comparison, a fully connected (dense) layer on a flattened input requires the following number of operations:

#MAC = [H * W * C * N] * B

where N is the number of output neurons.
Let's reuse the same example but now with a dense layer of 64 neurons:
#MAC = [224 * 224 * 3 * 64] * 1 = 9,633,792
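The dense-layer count can be verified the same way; again, the helper below is just a sketch for this article, not an established API:

```python
def dense_macs(n_inputs, n_neurons, B=1):
    """MAC count for a fully connected layer on a flattened input (biases excluded)."""
    # One multiply-accumulate per (input, neuron) pair, repeated over the batch.
    return n_inputs * n_neurons * B

# Same 224x224x3 input, flattened, feeding 64 neurons
print(dense_macs(224 * 224 * 3, 64))  # 9633792
```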
(We have excluded biases from all operation counts above; their contribution is negligible.)