Padding and Stride in Convolutions
In my previous article we saw an introduction to convolutions. Please go through it before reading this article for a better understanding.
https://saichandra1199.medium.com/convolutions-in-machine-learning-64dd44c88135
Hope you got some idea about convolutions. Now we will move on to the details of what padding and stride are and why they are used.
In convolutions, we get a shrunken version of the input image after convolving it with a filter (or kernel) of weights, so the image has fewer pixels after each convolution. By repeating this over multiple layers, we may end up with very few pixels, which may not tell us much about the input image when making a prediction. If we want to keep the image size the same as the input while performing convolutions, we add padding to the input to prevent it from shrinking.
With padding we not only preserve the input shape across multiple convolutions, but also give more weight to the information at the edges of the image. Without padding, the filter passes over the center pixels many times but touches the border pixels only a few times. Padding evens this out, which helps in better prediction. In the above image you can see it is padded with one layer of 0's on all sides, so we get 2 more rows and 2 more columns. For example, if the image is 5×5 it becomes 7×7 after padding.
So with the right amount of padding, a convolution gives an output with the same shape as the input. Let's represent it in a mathematical formula for better understanding and memorizing.
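To make the 5×5 to 7×7 example concrete, here is a small sketch using NumPy's `np.pad`. The image values are made up purely for illustration. Only the shapes matter here.

```python
import numpy as np

# A made-up 5x5 "image" with values 1..25 (illustrative data only).
image = np.arange(1, 26).reshape(5, 5)

# Add one layer of zeros on every side (p = 1).
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(image.shape)   # (5, 5)
print(padded.shape)  # (7, 7)
print(padded)        # the original values sit inside a border of zeros
```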
n = height (and width) of the input image in pixels
p = padding (number of layers of pixels added on each side)
f = filter size
With these, an (n×n) image with padding p convolved with an (f×f) filter gives → (n+2p-f+1) × (n+2p-f+1)
Considering an example to justify the above formula: n=3, p=1, f=3
n+2p-f+1 = 3+2-3+1 → 3. So we preserved the input shape. One more important thing is that we mostly use odd-sized filters like 3, 5, … because the padding needed to preserve the shape, p = (f-1)/2, is then a whole number, so the same padding can be applied on every side. An odd-sized filter also has a central pixel, which keeps it symmetric around its center, and that is a good property.
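The formula is easy to check in code. Below is a minimal sketch. The function names `conv_output_size` and `same_padding` are my own, chosen just for this illustration, and a square image and square filter are assumed.

```python
def conv_output_size(n, f, p=0):
    """Output height/width for an n x n image, f x f filter, padding p (stride 1)."""
    return n + 2 * p - f + 1


def same_padding(f):
    """Padding that keeps the output the same size as the input (odd f only)."""
    if f % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd filter size")
    return (f - 1) // 2


print(conv_output_size(n=3, f=3, p=1))                # 3, the example from the text
print(conv_output_size(n=5, f=3, p=0))                # 3, no padding so the image shrinks
print(conv_output_size(n=7, f=5, p=same_padding(5)))  # 7, shape preserved
```

With an even filter size such as 4, (f-1)/2 is not a whole number, which is why the sketch refuses it.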
Coming to stride, it is nothing but the step size the filter takes over an image. If the stride is 2, the filter moves 2 pixels to the right (or down) after every convolution step. This means fewer convolution steps and results in a reduced dimension of the output image. It is clearly shown in the below strided convolution figure.
If the stride is increased, the output size is reduced, but the stride should be kept less than or equal to the filter size so that all pixels are covered. So overall, stride is for reducing the features or size of an image, and padding is for increasing or maintaining the size of an input image. The formula used to calculate the size of the output image with a different stride is:
Keeping n, f, p the same as in the padding case, let's denote the stride by s
output image size → ((n+2p-f)/s + 1) × ((n+2p-f)/s + 1), where (n+2p-f)/s is rounded down when it is not a whole number
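To see padding and stride working together, here is a rough, unoptimized sketch of a convolution in NumPy. It assumes a square image and a square filter, and, as is usual in deep learning, it slides the filter without flipping it. The names `strided_output_size` and `conv2d` are just illustrative choices, not from any library.

```python
import numpy as np


def strided_output_size(n, f, p=0, s=1):
    """floor((n + 2p - f) / s) + 1 for an n x n image and f x f filter."""
    return (n + 2 * p - f) // s + 1


def conv2d(image, kernel, p=0, s=1):
    """Naive convolution of a square image with a square kernel."""
    padded = np.pad(image, p, mode="constant")  # p layers of zeros on each side
    f = kernel.shape[0]
    out = (padded.shape[0] - f) // s + 1
    result = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            # The window starts s pixels further along for every output step.
            window = padded[i * s:i * s + f, j * s:j * s + f]
            result[i, j] = np.sum(window * kernel)
    return result


image = np.ones((7, 7))
kernel = np.ones((3, 3))

print(conv2d(image, kernel, p=0, s=2).shape)  # (3, 3)
print(strided_output_size(7, 3, p=0, s=2))    # 3
print(conv2d(image, kernel, p=1, s=1).shape)  # (7, 7)
```

The shapes produced by the loop match the formula, including the rounding down when (n+2p-f) is not divisible by s.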
Convolutions can also be done with multiple filter channels on the same image. The per-channel results are then added to derive a single output feature map. That's all for padding and stride.
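As a small footnote to the multi-channel remark above, here is a sketch of how the per-channel results can be added into one feature map. It assumes the image is stored as (height, width, channels), the filter has one f×f slice per channel, and there is no padding and a stride of 1. The function name `conv2d_multichannel` is hypothetical.

```python
import numpy as np


def conv2d_multichannel(image, kernel):
    """Convolve each channel with its own filter slice and sum the results."""
    h, w, channels = image.shape
    f = kernel.shape[0]
    out_h, out_w = h - f + 1, w - f + 1
    result = np.zeros((out_h, out_w))
    for ch in range(channels):
        for i in range(out_h):
            for j in range(out_w):
                window = image[i:i + f, j:j + f, ch]
                # Accumulate this channel's contribution into the single output map.
                result[i, j] += np.sum(window * kernel[:, :, ch])
    return result


rgb_image = np.ones((5, 5, 3))   # made-up 3-channel image
rgb_kernel = np.ones((3, 3, 3))  # one 3x3 slice per channel

feature_map = conv2d_multichannel(rgb_image, rgb_kernel)
print(feature_map.shape)  # (3, 3)
```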
Thank you for reading my article!