Forward propagation

End-to-end, input-to-output transformation in a neural network:

\[\begin{equation*}
  \hat{Y}
  = \hat{f}(X)
  = A^{[\ell]} \in \mathbb{R}^{p^{\scriptstyle [\ell]} \times m}
\end{equation*}
\]

with

\[\begin{equation*}
  A^{[i]} =
  \begin{cases}
    X \in \mathbb{R}^{n \times m}
    & i = 0
    \\
    \boxed{g^{[i]}(W^{[i]} A^{[i - 1]} + \vec{b}^{[i]})}
    \in \mathbb{R}^{p^{\scriptstyle [i]} \times m}
    & i > 0
  \end{cases}
\end{equation*}
\]

and

\[\begin{equation*}
  p^{[0]} = n
\end{equation*}
\]

where

\(X_{n \times m}\) is the input matrix, with one row per input feature (\(n\) features) and one column per example (\(m\) examples)
\(g^{[i]}\) is the activation function used in the \(i\)-th layer
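\(W^{[i]} \in \mathbb{R}^{p^{[i]} \times p^{[i - 1]}}\) is the weight matrix of the \(i\)-th layer (one matrix in the weight tensor)
\(\vec{b}^{[i]} \in \mathbb{R}^{p^{[i]}}\) is the bias vector of the \(i\)-th layer (one vector in the bias tensor), added to every column of \(W^{[i]} A^{[i - 1]}\)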
\(p^{[i]}\) is the number of neurons in the \(i\)-th layer
\(\ell\) is the number of layers in the network.
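
The recursion above can be sketched in a few lines of dependency-free Python. This is a minimal illustration rather than a reference implementation: the names `matmul`, `relu`, `identity` and `forward` are hypothetical, matrices are plain lists of rows, and the choice of ReLU for the hidden layer and the identity for the output layer is an assumption made only for the example. In practice an array library would normally replace the hand-written matrix product.

```python
def matmul(W, A):
    """Multiply a (p x q) matrix by a (q x m) matrix, both stored as lists of rows."""
    q = len(A)
    assert all(len(row) == q for row in W), "inner dimensions must agree"
    m = len(A[0])
    return [[sum(W[r][k] * A[k][c] for k in range(q)) for c in range(m)]
            for r in range(len(W))]


def relu(z):
    """Element-wise ReLU activation (assumed here for the hidden layer)."""
    return max(0.0, z)


def identity(z):
    """Identity activation (assumed here for the output layer)."""
    return z


def forward(X, weights, biases, activations):
    """Compute A^[l] from A^[0] = X via A^[i] = g^[i](W^[i] A^[i-1] + b^[i])."""
    A = X  # A^[0], shape n x m
    for W, b, g in zip(weights, biases, activations):
        Z = matmul(W, A)  # shape p^[i] x m
        # Add the bias b[r] to every column of row r, then apply g element-wise.
        A = [[g(z + b_r) for z in row] for row, b_r in zip(Z, b)]
    return A


# Toy network: n = 2 features, m = 3 examples, layer sizes p = [2, 3, 1].
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
weights = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]],  # W^[1], shape 3 x 2
    [[1.0, 1.0, 1.0]],                      # W^[2], shape 1 x 3
]
biases = [[0.0, 0.0, 1.0], [0.5]]           # b^[1], b^[2]
activations = [relu, identity]              # g^[1], g^[2]

Y_hat = forward(X, weights, biases, activations)
print(Y_hat)  # [[5.5, 7.5, 9.5]]
```

The result has one row per output neuron and one column per example, that is, shape \(p^{[\ell]} \times m\).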

Dimensionality check: Base case
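
A sketch of the base case (\(i = 0\)), assuming the columns of \(X\) index the \(m\) examples so that the input layer has \(p^{[0]} = n\) neurons:

\[\begin{equation*}
  A^{[0]} = X \in \mathbb{R}^{n \times m},
  \quad
  p^{[0]} = n
  \implies
  A^{[0]} \in \mathbb{R}^{p^{\scriptstyle [0]} \times m}
\end{equation*}
\]
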
Dimensionality check: Recursive case
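
A sketch of the recursive case (\(i > 0\)), assuming \(A^{[i - 1]} \in \mathbb{R}^{p^{[i - 1]} \times m}\), that \(\vec{b}^{[i]}\) is added to each of the \(m\) columns, and that \(g^{[i]}\) preserves the shape of its argument (as element-wise activations do):

\[\begin{equation*}
  \begin{aligned}
    W^{[i]} A^{[i - 1]}
    &\in \mathbb{R}^{p^{\scriptstyle [i]} \times p^{\scriptstyle [i - 1]}}
      \cdot \mathbb{R}^{p^{\scriptstyle [i - 1]} \times m}
      \subseteq \mathbb{R}^{p^{\scriptstyle [i]} \times m}
    \\
    W^{[i]} A^{[i - 1]} + \vec{b}^{[i]}
    &\in \mathbb{R}^{p^{\scriptstyle [i]} \times m}
    \\
    A^{[i]} = g^{[i]}(W^{[i]} A^{[i - 1]} + \vec{b}^{[i]})
    &\in \mathbb{R}^{p^{\scriptstyle [i]} \times m}
  \end{aligned}
\end{equation*}
\]

By induction on \(i\), the output \(A^{[\ell]}\) has shape \(p^{[\ell]} \times m\).
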
Synonyms