General linear model

The linear regression model (14) is linear in both the features \(\boldsymbol x\) and the weights \(\boldsymbol w\), since its predictions are given by the inner product

\[ \widehat y = \boldsymbol x^\top \boldsymbol w = \sum\limits_{j=0}^d x_jw_j. \]

Of course, the target variable \(y\) can depend nonlinearly on the predictors \(\boldsymbol x\). We can easily sacrifice the linearity in \(\boldsymbol x\) and consider the general linear model

\[\widehat y = \sum\limits_{j=0}^M \phi_j(\boldsymbol x)w_j. \tag{22}\]

The functions \(\phi_j(\boldsymbol x)\) are called basis functions.
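To make (22) concrete, here is a minimal sketch of such a prediction in NumPy; the particular basis functions and weights below are hypothetical, chosen only for illustration.

```python
import numpy as np

def predict(x, w, basis_funcs):
    # General linear model: y_hat = sum_j phi_j(x) * w_j
    phi = np.array([phi_j(x) for phi_j in basis_funcs])
    return phi @ w

# Hypothetical basis: phi_0(x) = 1 (bias), phi_1(x) = x, phi_2(x) = x**2
basis_funcs = [lambda x: 1.0, lambda x: x, lambda x: x**2]
w = np.array([0.5, -1.0, 2.0])
print(predict(3.0, w, basis_funcs))  # 0.5 - 3.0 + 18.0 = 15.5
```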

Note

As before, the bias term is included in (22) by setting \(\phi_0(\boldsymbol x) = 1\).

Note that

  • if \(M=d\) and \(\phi_j(\boldsymbol x) = x_j\), then (22) turns into multiple linear regression;

  • if \(d=1\) and \(\phi_j(x) = x^j\), then (22) becomes polynomial regression (see the sketch below).
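For example, polynomial regression can be fitted by ordinary least squares on the design matrix with entries \(\Phi_{ij} = x_i^j\). This is only a sketch: the synthetic data and the degree \(M = 3\) below are assumptions, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(x.size)  # noisy synthetic target

M = 3                                        # polynomial degree (assumed)
Phi = np.vander(x, M + 1, increasing=True)   # columns x^0, x^1, ..., x^M
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # least-squares weights
y_hat = Phi @ w                              # fitted values
```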

Popular choices of \(\phi_j(\boldsymbol x)\) are

  • \(\phi_j(x) = \exp\big(-\frac{(x-\mu_j)^2}{2s^2}\big)\) (Gaussian basis functions);

  • \(\phi_j(x) = \sigma\big(\frac{x-\mu_j}s\big)\) (sigmoidal basis functions).

Sigmoid function

\[ \sigma(x) = \frac 1{1 + e^{-x}} \]
(Figures: plots of the Gaussian and sigmoidal basis functions.)
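A minimal sketch of these two families in NumPy; the centers \(\mu_j\) and the common scale \(s\) below are arbitrary choices, not values from the source.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gaussian_basis(x, mu, s):
    # phi(x) = exp(-(x - mu)^2 / (2 s^2))
    return np.exp(-(x - mu) ** 2 / (2 * s**2))

def sigmoidal_basis(x, mu, s):
    # phi(x) = sigma((x - mu) / s)
    return sigmoid((x - mu) / s)

mus = np.linspace(-1, 1, 5)  # centers mu_j (assumed)
s = 0.3                      # common scale s (assumed)
x = np.linspace(-1.5, 1.5, 100)
Phi_gauss = np.column_stack([gaussian_basis(x, mu, s) for mu in mus])
Phi_sigm = np.column_stack([sigmoidal_basis(x, mu, s) for mu in mus])
```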

TODO

A lot of things…