Convolutional Neural Network (CNN) in Machine Learning – GeeksforGeeks

In this article, we are going to discuss convolutional neural network(CNN) in machine learning in detail. 

Convolutional Neural Network(CNN) :

A Convolutional Neural Network (CNN) is a type of deep learning algorithm that is particularly well-suited for image recognition and processing tasks. It is made up of multiple layers, including convolutional layers, pooling layers, and fully connected layers.

The convolutional layers are the key component of a CNN, where filters are applied to the input image to extract features such as edges, textures, and shapes. The output of the convolutional layers is then passed through pooling layers, which are used to down-sample the feature maps, reducing the spatial dimensions while retaining the most important information. The output of the pooling layers is then passed through one or more fully connected layers, which are used to make a prediction or classify the image.

CNNs are trained using a large dataset of labeled images, where the network learns to recognize patterns and features that are associated with specific objects or classes. Once trained, a CNN can be used to classify new images, or extract features for use in other applications such as object detection or image segmentation.

CNNs have achieved state-of-the-art performance on a wide range of image recognition tasks, including object classification, object detection, and image segmentation. They are widely used in computer vision, image processing, and other related fields, and have been applied to a wide range of applications, including self-driving cars, medical imaging, and security systems.

  • A convolutional neural network, or CNN, is a deep learning neural network sketched for processing structured arrays of data such as portrayals.
  • CNN are very satisfactory at picking up on design in the input image, such as lines, gradients, circles, or even eyes and faces.
  • This characteristic that makes convolutional neural network so robust for computer vision.
  • CNN can run directly on a underdone image and do not need any preprocessing.
  • A convolutional neural network is a feed forward neural network, seldom with up to 20.
  • The strength of a convolutional neural network comes from a particular kind of layer called the convolutional layer.
  • CNN contains many convolutional layers assembled on top of each other, each one competent of recognizing more sophisticated shapes.
  • With three or four convolutional layers it is viable to recognize handwritten digits and with 25 layers it is possible to differentiate human faces.
  • The agenda for this sphere is to activate machines to view the world as humans do, perceive it in a alike fashion and even use the knowledge for a multitude of duty such as image and video recognition, image inspection and classification, media recreation, recommendation systems, natural language processing, etc.

Convolutional Neural Network Design :

  • The construction of a convolutional neural network is a multi-layered feed-forward neural network, made by assembling many unseen layers on top of each other in a particular order.
  • It is the sequential design that give permission to CNN to learn hierarchical attributes.
  • In CNN, some of them followed by grouping layers and hidden layers are typically convolutional layers followed by activation layers.
  • The pre-processing needed in a ConvNet is kindred to that of the related pattern of neurons in the human brain and was motivated by the organization of the Visual Cortex.

Different Types of CNN Models:

  1. LeNet
  2. AlexNet
  3. ResNet
  4. GoogleNet
  5. MobileNet
  6. VGG

1.LeNet

  • LeNet is a pioneering convolutional neural network (CNN) architecture developed by Yann LeCun and his colleagues in the late 1990s. It was specifically designed for handwritten digit recognition, and was one of the first successful CNNs for image recognition.
  • LeNet consists of several layers of convolutional and pooling layers, followed by fully connected layers. The architecture includes two sets of convolutional and pooling layers, each followed by a subsampling layer, and then three fully connected layers.
  • The first convolutional layer uses a kernel of size 5×5 and applies 6 filters to the input image. The output of this layer is then passed through a pooling layer that reduces the spatial dimensions of the feature maps. The second convolutional layer uses a kernel of size 5×5 and applies 16 filters to the output of the first pooling layer. This is followed by another pooling layer and a subsampling layer.
  • The output of the subsampling layer is then passed through three fully connected layers, with 120, 84, and 10 neurons respectively. The last fully connected layer is used for classification, and produces a probability distribution over the 10 digits (0-9).
  • LeNet was trained on the MNIST dataset, which consists of 70,000 images of handwritten digits, and was able to achieve high recognition accuracy. The LeNet architecture, although relatively simple compared to current architectures, served as a foundation for many later CNNs, and it’s considered as a classic and simple architecture for image recognition tasks.

2.AlexNet

  • AlexNet is a convolutional neural network (CNN) architecture that was developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012. It was the first CNN to win the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a major image recognition competition, and it helped to establish CNNs as a powerful tool for image recognition.
  • AlexNet consists of several layers of convolutional and pooling layers, followed by fully connected layers. The architecture includes five convolutional layers, three pooling layers, and three fully connected layers.
  • The first two convolutional layers use a kernel of size 11×11 and apply 96 filters to the input image. The third and fourth convolutional layers use a kernel of size 5×5 and apply 256 filters. The fifth convolutional layer uses a kernel of size 3×3 and applies 384 filters. The output of these convolutional layers is then passed through max-pooling layers that reduce the spatial dimensions of the feature maps.
  • The output of the pooling layers is then passed through three fully connected layers, with 4096, 4096, and 1000 neurons respectively. The last fully connected layer is used for classification, and produces a probability distribution over the 1000 ImageNet classes.
  • AlexNet was trained on the ImageNet dataset, which consists of 1.2 million images with 1000 classes, and was able to achieve high recognition accuracy. The AlexNet architecture was the first to show that CNNs could significantly outperform traditional machine learning methods in image recognition tasks, and was an important step in the development of deeper architectures like VGGNet, GoogleNet, and ResNet.

Applications of CNN:

  • Decoding Facial Recognition
  • Understanding Climate
  • Collecting Historic and Environmental Elements

Case Study of CNN for Diabetic retinopathy :

  • Diabetic retinopathy also known as diabetic eye disease, is a medical state in which destruction occurs to the retina due to diabetes mellitus, It is a major cause of blindness in advance countries.
  • Diabetic retinopathy influence up to 80 percent of those who have had diabetes for 20 years or more.
  • The overlong a person has diabetes, the higher his or her chances of growing diabetic retinopathy.
  • It is also the main cause of blindness in people of age group 20-64.
  • Diabetic retinopathy is the outcome of destruction to the small blood vessels and neurons of the retina.

 

My Personal Notes

arrow_drop_up