In this article, we will look at grayscale images, colour images and the process of convolution.
A grayscale image is one in which the image is represented using only shades of grey. The intensity of each pixel is denoted by a value from 0 to 255, i.e., from black to white, stored as an 8-bit integer. It uses only one channel.
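To make this concrete, here is a minimal sketch of a grayscale image as a 2D grid of 8-bit values. The 3x4 image and the name `gray` are made up for illustration, and plain Python lists are used so that it runs without any library; real code would usually use an array library instead.

```python
# A hypothetical 3x4 grayscale image: one intensity value per pixel.
# 0 is black, 255 is white, values in between are shades of grey.
gray = [
    [0,   64, 128, 255],
    [32,  96, 160, 224],
    [255, 128, 64,   0],
]

height = len(gray)
width = len(gray[0])

# Every value must fit in an 8-bit unsigned integer.
assert all(0 <= p <= 255 for row in gray for p in row)

darkest = min(min(row) for row in gray)
brightest = max(max(row) for row in gray)
print(f"{height}x{width} image, 1 channel, range {darkest}..{brightest}")
```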
Colour images are constructed by combining red, green and blue (RGB) in variable proportions; these three colours are called the primary colours. Each pixel of a colour image contains three channels: the R channel, the G channel and the B channel, each having its own intensity value ranging from 0 to 255.
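As a rough sketch of the same idea for colour, each pixel can be stored as an (R, G, B) triple. The 2x2 image and the helper `split_channels` below are hypothetical names chosen for this example, not part of any library.

```python
# A hypothetical 2x2 colour image: each pixel is an (R, G, B) triple,
# and each component is an 8-bit intensity from 0 to 255.
rgb = [
    [(255, 0, 0), (0, 255, 0)],      # red, green
    [(0, 0, 255), (255, 255, 255)],  # blue, white
]

def split_channels(image):
    """Return the R, G and B channels as three separate 2D grids."""
    return tuple(
        [[pixel[c] for pixel in row] for row in image]
        for c in range(3)
    )

r, g, b = split_channels(rgb)
print("R channel:", r)
print("G channel:", g)
print("B channel:", b)
```

Each of the three grids has the same height and width as the image, so a colour image can be seen as three grayscale images stacked together.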
What is a convolution?
Convolution is the process of multiplying each pixel in a region of the image by the corresponding value of the filter and then adding all of the products to get a single result. Repeating this for every position of the filter gives the output image representation.
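A single multiply-and-add step can be sketched as follows. The 3x3 patch and the filter values are assumptions picked for illustration (the filter is a commonly used vertical-edge filter), and `convolve_patch` is a hypothetical helper name. As in the description above, the filter is applied without flipping it, which is the convention most deep learning libraries follow.

```python
def convolve_patch(patch, kernel):
    """Multiply matching entries of two equal-sized grids and sum the products."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )

# Hypothetical 3x3 region of an image.
patch = [
    [3, 0, 1],
    [1, 5, 8],
    [2, 7, 2],
]
# Assumed vertical-edge filter: left column positive, right column negative.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

result = convolve_patch(patch, kernel)
print(result)  # (3 - 1) + (1 - 8) + (2 - 2) = -5
```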
Now let us look at an example of convolution.
We pass a 6x6 input through a filter (here we are using a vertical filter) and get a 4x4 output.
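The output size follows from how many positions the filter can occupy. For an n x n input, an f x f filter and a step size (stride) of s with no padding, the output side length is (n - f) / s + 1, rounded down. A 6x6 input giving a 4x4 output at a stride of 1 implies a 3x3 filter, which is what the sketch below assumes; `output_size` is a hypothetical helper name.

```python
def output_size(n, f, stride=1):
    """Side length of the output for an n x n input and f x f filter, no padding."""
    if f > n:
        raise ValueError("filter must not be larger than the input")
    return (n - f) // stride + 1

print(output_size(6, 3))  # 6x6 input, assumed 3x3 filter, stride 1 -> 4
```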
Now let us look at how each of the entries in the output is obtained.
We place the filter on top of the input, starting at the top left corner. At each position we perform the convolution step (multiply the corresponding entries and add the products together); the result is the corresponding output entry. Here we take the stride value as 1. That is, the filter moves 1 column to the right after each calculation, and when it reaches the end of a row it returns to the left edge and moves 1 row down. This process goes on till the filter reaches the bottom right corner.
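The sliding procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the exact numbers of the example in this article: the 6x6 input (bright left half, dark right half) and the 3x3 vertical filter values are made up, and `convolve2d` is a hypothetical function name.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the output grid."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    # The top-left corner of the window moves left to right, then row by row.
    for top in range(0, n_rows - k_rows + 1, stride):
        out_row = []
        for left in range(0, n_cols - k_cols + 1, stride):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[top + i][left + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output

# Hypothetical 6x6 input: bright left half, dark right half.
image = [[10, 10, 10, 0, 0, 0] for _ in range(6)]
# Assumed 3x3 vertical filter.
vertical_filter = [[1, 0, -1] for _ in range(3)]

result = convolve2d(image, vertical_filter)
for row in result:
    print(row)  # each row is [0, 30, 30, 0]
```

The large values in the middle columns of the output mark the place where the input changes from bright to dark, which is why a filter like this one is used to pick out vertical edges.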
The convolution operation: the part of the input that is convolved with the filter in each step is highlighted.
The 1st output entry:
The 2nd output entry:
The 3rd output entry:
The 4th output entry:
By performing similar calculations:
The 5th output entry = 1
The 6th output entry = 5
The 7th output entry = 5
The 8th output entry = -4
The 9th output entry = 1
The 10th output entry = 5
The 11th output entry = 5
The 12th output entry = -4
The 13th output entry = 1
The 14th output entry = 5
The 15th output entry = 5
The 16th output entry = -4
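The output we obtained here is 4x4, while the input we gave was 6x6. The output is smaller than the input, and pixels near the border of the input take part in fewer calculations than pixels near the centre. Hence we can say that some information loss occurs here.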
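One way to see where the loss comes from is to count how many filter positions cover each input pixel. The sketch below assumes a 6x6 input, a 3x3 filter and a stride of 1; `coverage` is a hypothetical helper name.

```python
def coverage(n, f, stride=1):
    """Count how many filter positions cover each pixel of an n x n input."""
    counts = [[0] * n for _ in range(n)]
    for top in range(0, n - f + 1, stride):
        for left in range(0, n - f + 1, stride):
            for i in range(f):
                for j in range(f):
                    counts[top + i][left + j] += 1
    return counts

counts = coverage(6, 3)
for row in counts:
    print(row)
```

Under these assumptions a corner pixel is used in only 1 calculation, while a pixel near the centre is used in 9, so the borders of the image have much less influence on the output.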
To prevent this loss of information, we use the padding technique, which will be discussed in upcoming posts.
Thank you for reading. If you liked this post please consider sharing.
Support Ashwin Sharma P by becoming a sponsor. Any amount is appreciated!