End-to-end, input-to-output transformation in a neural network:
\[\begin{equation*} \hat{Y} = \hat{f}(X) = A^{[\ell]} \in \mathbb{R}^{p^{\scriptstyle [\ell]} \times m} \end{equation*} \]
with
\[\begin{equation*} A^{[i]} = \begin{cases} X \in \mathbb{R}^{n \times m} & i = 0 \\ \boxed{g^{[i]}({W^{[i]}} A^{[i - 1]} + \vec{b}^{[i]})} \in \mathbb{R}^{p^{\scriptstyle [i]} \times m} & i > 0 \end{cases} \end{equation*} \]
and
\[\begin{equation*} p^{[0]} = n \end{equation*} \]
where