13.2 Fully Connected Neural Networks
The top panel in the figure below shows a common graphical representation of the single layer model above. It unravels the individual operations performed by such a model, depicting them from left to right. A visual representation like this, of a model consisting of neural network units, is often referred to as a neural network architecture or just an architecture. Here the bias and input of each single layer unit composing the model are shown as a sequence of dots on the far left of the diagram. This layer, consisting of the input to the system, is ‘visible’ to us, since this is where we inject the input of our training data (which we ourselves can ‘see’). It is often referred to as the first or input layer of a network.

The linear combination of input leading to each unit is shown by edges connecting the input (shown as dots) to an open circle, with the nonlinear activation then shown as a larger blue circle. In the middle of this depiction, where the blue circles representing all $U_1$ activations align, is the hidden layer of this network model. This layer is called ‘hidden’ because it contains internally processed versions of our input that we do not ‘see’ at the outset of learning. While the name ‘hidden’ is not entirely accurate, since we can visualize the internal state of these units if we so desire, it is a commonly used convention.

The output of these $U_1$ units is then collected in a linear combination, once again visualized by edges connecting each unit to a final summation shown as an open circle. This is the final output of the model, and is often called the final layer or visible layer of the network since we can always see what the model outputs, regardless of how the parameters of the model are set.
Finally, in comparing the graphical and algebraic representations of a single layer model, notice that reading the graph from left to right corresponds to starting inside the single layer units $f^{(1)}_j$ and working outwards to the final linear combination.
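To make the left-to-right reading concrete, the following is a minimal sketch of the computation the diagram depicts, written in plain Python. It assumes the single layer model is a linear combination of $U_1$ units, each a nonlinear activation applied to a linear combination of the input. The function name `single_layer_model`, the weight layout (`W1` holding one list of $N+1$ weights per hidden unit, `w2` holding $U_1+1$ final weights), the choice of `tanh` as the activation, and the numerical values are all illustrative assumptions, not taken from the text.

```python
import math

def single_layer_model(x, W1, w2, activation=math.tanh):
    """Evaluate a single hidden layer model on one input point.

    x  : list of N input values
    W1 : list of U_1 weight lists, each of length N + 1 (bias weight first)
    w2 : list of U_1 + 1 final weights (bias weight first)
    activation : assumed nonlinearity (tanh here, purely as an example)
    """
    # Input layer: the bias (a constant 1) together with the input values.
    x_with_bias = [1.0] + list(x)

    # Hidden layer: each unit takes a linear combination of the input
    # (the edges into an open circle) and applies the activation
    # (the larger circle).
    hidden = []
    for unit_weights in W1:
        linear_combo = sum(w * v for w, v in zip(unit_weights, x_with_bias))
        hidden.append(activation(linear_combo))

    # Final layer: a linear combination of the bias and the hidden outputs.
    hidden_with_bias = [1.0] + hidden
    return sum(w * h for w, h in zip(w2, hidden_with_bias))

# Hypothetical example with N = 2 inputs and U_1 = 3 hidden units.
W1 = [[0.1, 0.5, -0.3],
      [0.0, -1.0, 2.0],
      [0.7, 0.2, 0.4]]
w2 = [0.5, 1.0, -2.0, 0.3]
x = [0.5, -1.0]
output = single_layer_model(x, W1, w2)
```

Reading the function body from top to bottom mirrors reading the graph from left to right: the input layer comes first, then the hidden units, then the final linear combination.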