What is Dropout? Understanding Dropout in Neural Networks
What is dropout in deep neural networks?
Dropout refers to units, or nodes, that are intentionally and randomly dropped from a neural network during training to reduce overfitting and improve how well the trained model generalizes to new data.
A neural network is software attempting to emulate the actions of the human brain. The human brain contains billions of neurons that fire electrical and chemical signals to each other to coordinate thoughts and life functions. A neural network uses a software equivalent of these neurons, called units. Each unit receives signals from other units and then computes an output that it passes on to other units, or nodes, in the network.
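To make the idea of a unit concrete, here is a minimal sketch of a single unit. It assumes a weighted sum followed by a sigmoid activation, which is one common choice among several; the function name `unit_output` and the sample numbers are hypothetical and chosen only for illustration.

```python
import math

def unit_output(inputs, weights, bias):
    """Combine incoming signals into one output value.

    Assumption: weighted sum followed by a sigmoid activation.
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical signals arriving from three other units.
signals = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
out = unit_output(signals, weights, bias=0.1)
print(out)  # a value strictly between 0 and 1
```

The value `out` is what this unit would pass on to the units in the next layer.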
Why do we need dropout?
The challenge for software-based neural networks is that their many units can learn to depend too heavily on one another, a problem known as co-adaptation, and end up memorizing noise in the training data instead of learning patterns that generalize. To counter this, a network randomly and temporarily ignores a fraction of its units, along with their incoming and outgoing connections, at each training step. The term for this temporary neuron node elimination is dropout.
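In practice, dropout is usually implemented by randomly zeroing a fraction of unit outputs during training and leaving all units active at inference time. The following is a minimal sketch of the common "inverted dropout" variant, not a definitive implementation; the function name `dropout`, the parameter `rate` and the sample activations are assumptions made for illustration.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout over a list of unit outputs.

    rate is the probability that any given unit is dropped.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time every unit stays active and nothing is rescaled.
        return list(activations)
    keep = 1.0 - rate
    # Kept outputs are scaled by 1/keep so the expected sum is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)               # seeded so runs are repeatable
hidden = [0.2, 1.5, 0.7, 0.9]        # hypothetical hidden-unit outputs
train_out = dropout(hidden, rate=0.5, rng=rng)
infer_out = dropout(hidden, rate=0.5, training=False)
```

Because a fresh random mask is drawn on every call, a different subset of units is silenced at each training step, while `infer_out` is always identical to the input.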
Dropout layers
Like the neurons of the human brain, the units of a neural network process myriad inputs and fire off myriad outputs at any given time. A unit's output is often an intermediate result that is passed to another unit for further processing, long before an end output or conclusion results. Some of this intermediate output ends up as noise and doesn't contribute to the final output.
When data scientists apply dropout to a neural network, they don't pick individual units by hand. Instead, they choose a dropout rate, which is the probability that any given unit is dropped, and the units to drop are then selected at random. Dropout is applied to the different layers of a neural network as follows:
- Input layer. This is the top-most layer of artificial intelligence (AI) and machine learning where the initial raw data is being ingested. Dropout can be applied to this layer of visible data, usually at a low rate, so that a random subset of the input values is ignored on each training pass.
- Intermediate or hidden layers. These are the layers of processing after data ingestion. These layers are hidden because we can’t exactly see what they do. The layers, which could be one or many, process data and then pass along intermediate, but not final, results that they send to other neurons for additional processing. Dropout is most often applied here: a random share of hidden units is switched off on each training pass so that the network doesn't rely too heavily on any one of them.
- Output layer. This is the final, visible processing output from all neuron units. Dropout is not used on this layer.
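The layer-by-layer treatment above can be sketched as a tiny forward pass. This is an illustrative sketch only: the rates of 0.2 for the input layer and 0.5 for the hidden layer are commonly used starting points rather than values from this article, and the helper names, weights and network shape are hypothetical.

```python
import random

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; rescale the survivors."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def dense_relu(values, weights, biases):
    """Fully connected layer with ReLU; one weight row per output unit."""
    return [max(0.0, sum(v * w for v, w in zip(values, row)) + b)
            for row, b in zip(weights, biases)]

def forward(x, params, training, rng, input_rate=0.2, hidden_rate=0.5):
    (w1, b1), (w2, b2) = params
    if training:
        x = dropout(x, input_rate, rng)    # input layer: low rate (assumed)
    h = dense_relu(x, w1, b1)
    if training:
        h = dropout(h, hidden_rate, rng)   # hidden layer: higher rate (assumed)
    # Output layer: plain linear units, no dropout applied.
    return [sum(v * w for v, w in zip(h, row)) + b for row, b in zip(w2, b2)]

# Hypothetical network: 2 inputs -> 3 hidden units -> 1 output.
params = (
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.0, 0.0]),
    ([[1.0, -1.0, 0.5]], [0.1]),
)
rng = random.Random(1)
train_y = forward([1.0, 2.0], params, training=True, rng=rng)
infer_y = forward([1.0, 2.0], params, training=False, rng=rng)
```

With `training=False`, no units are dropped, so `infer_y` is the same on every call, while `train_y` depends on which units the random masks happened to silence.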
These images show the different layers of a neural network before and after dropout has been applied.
Examples and uses of dropout
An organization that’s monitoring sound transmissions from space is looking for repetitious, patterned signals because they might be possible signs of life. The raw signals are fed into a neural network to perform an analysis. Up front, data scientists exclude all incoming sound signals that aren’t repetitive or patterned. During training, they also randomly exclude a percentage of intermediate, hidden layer units so the network doesn’t come to rely on any single unit.
Here’s another real-world example that shows how dropout works: A biochemical company wants to design a new molecular structure that will enable it to produce a revolutionary form of plastic. The company already knows the individual elements that will comprise the molecule. What it doesn’t know is the correct formulation of these elements.
To save time and processing, the company develops a neural network that can evaluate troves of worldwide research, but that will only ingest and process research that directly refers to the molecule and its identified elements. Any other information is automatically excluded as irrelevant and is dropped out. By excluding irrelevant data up front, this biochemical company’s AI model reduces the risk of a phenomenon known as overfitting. Overfitting occurs when an AI model fits the noise in its training data so closely that its predictions don’t hold up on new data.