Convolutions in Machine Learning

Sai Chandra Nerella
2 min read · Jun 27, 2021


Before going into convolutions, let's first see how the need for them arises. For a machine learning model to make predictions, it has to learn the edges, shapes and other features present in the images we feed it during training. To do that, it has to look at every pixel of the image and derive approximate weights from those pixel values. Convolution helps make this happen. FYI, in digital imaging a pixel (or picture element) is the smallest item of information in an image. Pixels are arranged in a 2-dimensional grid and are usually drawn as squares. Let's go on to convolutions!!

A convolution is the application of a filter to an input that results in an activation. Repeated application of the same filter across an input results in a map of activations called a feature map, indicating the locations and strength of a detected feature in the input, such as an image. In practice, a filter (or kernel) is a small grid of weights that is slid over the image and applied at every pixel position.

Fig a: Convolution 3D

From fig a, the kernel is applied to the input image to generate an output feature map that captures input features such as edges and patterns, while reducing the dimensions to make further computation easier. For example, if the input image is 4*4 and the kernel is 2*2, the output is 3*3 (assuming the kernel moves one pixel at a time and no padding is added). So we get a reduced-dimension version of the input that still covers the crucial features of the image.
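To make the 4*4 to 3*3 example concrete, here is a minimal sketch in plain Python. The function name `convolve2d` and the sample numbers are made up for illustration, and it assumes the kernel moves one pixel at a time with no padding. Like most ML libraries, it slides the kernel without flipping it.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` one pixel at a time (no padding)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(ih - kh + 1):          # every valid top-left row
        row = []
        for c in range(iw - kw + 1):      # every valid top-left column
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Illustrative 4*4 "image" and 2*2 kernel (values are arbitrary).
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
kernel = [
    [1, 1],
    [1, 1],
]

feature_map = convolve2d(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 3 3
```

The kernel fits in 3 positions along each side of the 4*4 image (4 - 2 + 1 = 3), which is where the 3*3 output comes from.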

Kernels are designed according to the output required. Detecting particular edges or patterns, such as horizontal or vertical ones, requires a matching kernel to be convolved with the image. Each value in the output is the dot product of the kernel and the patch of the input image it currently covers: multiply the overlapping values element by element and add them up. This is shown in the figure below.

Fig b: Math in Convolutions
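The multiply-and-sum idea can also be tried out on a tiny edge-detection example. This is only a sketch: the helper `apply_kernel`, the 4*4 image and the two 2*2 kernels are all hypothetical values picked so the numbers are easy to follow, not kernels taken from the figure.

```python
def apply_kernel(image, kernel):
    """Dot product of `kernel` with every patch of `image` (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


# Dark left half, bright right half: one vertical edge down the middle.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Illustrative kernels: one responds to left-to-right changes,
# the other to top-to-bottom changes.
vertical_kernel = [[-1, 1],
                   [-1, 1]]
horizontal_kernel = [[-1, -1],
                     [1, 1]]

vertical = apply_kernel(image, vertical_kernel)
horizontal = apply_kernel(image, horizontal_kernel)

print(vertical)    # [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
print(horizontal)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The vertical kernel lights up only in the middle column, where the brightness jumps from 0 to 9, while the horizontal kernel finds nothing because no row differs from the one below it.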

That's some info about convolutions. Woohoo!!! You learned something new today. One more thing: the output size can be changed with a couple of additional techniques. If you want the output to have the same dimensions as the input, padding is used. To decrease the dimensions even further, we use a higher stride value. FYI, stride is a parameter of the filter that sets how many pixels it moves at each step over the image.
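As a quick preview, the effect of padding and stride on the output size can be sketched with the standard formula floor((n + 2p - k) / s) + 1, where n is the input size, k the kernel size, p the padding and s the stride. The function name `output_size` is just an illustrative helper, and it assumes square inputs and kernels with the same padding on every side.

```python
def output_size(n, k, padding=0, stride=1):
    """Side length of the feature map for an n*n input and a k*k kernel."""
    return (n + 2 * padding - k) // stride + 1


print(output_size(4, 2))             # 3: the 4*4 image with a 2*2 kernel
print(output_size(4, 3, padding=1))  # 4: padding keeps the output the same size as the input
print(output_size(4, 2, stride=2))   # 2: a higher stride shrinks the output further
```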

Let’s look at these techniques in the coming article. #Stay Tuned!!!
