
Section 6.1: Dual Representations

Equation 6.2, the regularized sum-of-squares error function for the linear regression model:


$$J(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N}\{ \mathbf{w}^{T} \phi \left( \mathbf{x}_{n}\right) -t_{n}\}^{2} + \frac{\lambda}{2}\mathbf{w}^{T} \mathbf{w}$$

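To go from this equation to the next, we set the gradient of $J(\mathbf{w})$ with respect to $\mathbf{w}$ equal to zero, which gives $\mathbf{w}=\mathbf{\Phi}^{T}\mathbf{a}$, where $\mathbf{\Phi}$ is the design matrix whose $n$th row is $\phi(\mathbf{x}_{n})^{T}$ and $a_{n} = -\frac{1}{\lambda}\{\mathbf{w}^{T}\phi(\mathbf{x}_{n}) - t_{n}\}$. We then substitute $\mathbf{\Phi}^{T}\mathbf{a}$ for $\mathbf{w}$ in $J(\mathbf{w})$.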

Equation 6.5


$$J(\mathbf{a}) = \frac{1}{2}\mathbf{a}^T\mathbf{\Phi}\mathbf{\Phi}^T\mathbf{\Phi}\mathbf{\Phi}^T\mathbf{a} - \mathbf{a}^T\mathbf{\Phi}\mathbf{\Phi}^T\mathbf{t} + \frac{1}{2}\mathbf{t}^T\mathbf{t} + \frac{\lambda}{2}\mathbf{a}^T\mathbf{\Phi}\mathbf{\Phi}^T\mathbf{a}$$

We can define the Gram matrix, an $N \times N$ symmetric matrix, as


$$\mathbf{K} = \mathbf{\Phi}\mathbf{\Phi}^T$$

with elements


$$K_{nm} = \phi(\mathbf{x}_{n})^T\phi(\mathbf{x}_{m}) = k(\mathbf{x}_{n}, \mathbf{x}_{m})$$
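As a small illustration of the definition, the sketch below builds a Gram matrix from an explicit feature map. The polynomial feature map `poly_features`, the helper `gram_matrix`, and the sample inputs are assumptions chosen for the example; they do not come from the notes.

```python
import numpy as np

def poly_features(x, degree=3):
    """Hypothetical feature map: phi(x) = (1, x, x^2, ..., x^degree)."""
    x = np.asarray(x, dtype=float)
    # Row n of the result is phi(x_n)^T, so this is the design matrix Phi.
    return np.vander(x, degree + 1, increasing=True)

def gram_matrix(Phi):
    """K = Phi Phi^T, so K[n, m] = phi(x_n)^T phi(x_m)."""
    return Phi @ Phi.T

# Made-up one-dimensional training inputs (N = 4).
x_train = np.array([-1.0, 0.0, 0.5, 2.0])
Phi = poly_features(x_train)   # shape (N, M) = (4, 4)
K = gram_matrix(Phi)           # shape (N, N) = (4, 4)

# Each entry is the inner product of two feature vectors.
k_12 = Phi[1] @ Phi[2]
```

Note that the size of $\mathbf{K}$ depends only on the number of training points $N$, not on the number of basis functions.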

Thus we can express $J$ entirely in terms of the Gram matrix, and hence of the kernel function:


$$J(\mathbf{a}) = \frac{1}{2}\mathbf{a}^T\mathbf{K}\mathbf{K}\mathbf{a} - \mathbf{a}^T\mathbf{K}\mathbf{t} + \frac{1}{2}\mathbf{t}^T\mathbf{t} + \frac{\lambda}{2}\mathbf{a}^T\mathbf{K}\mathbf{a}$$
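The substitution can be checked numerically: for any vector $\mathbf{a}$, the dual form of $J$ should agree with the original error function evaluated at $\mathbf{w} = \mathbf{\Phi}^T\mathbf{a}$. The following is a minimal sketch on random data; the function names `J_primal` and `J_dual`, the sizes, and the value of $\lambda$ are assumptions made for illustration.

```python
import numpy as np

def J_primal(w, Phi, t, lam):
    """Regularized sum-of-squares error as a function of w."""
    r = Phi @ w - t                      # residuals w^T phi(x_n) - t_n
    return 0.5 * r @ r + 0.5 * lam * w @ w

def J_dual(a, K, t, lam):
    """The same error written in terms of a and the Gram matrix K."""
    Ka = K @ a                           # K is symmetric, so a^T K K a = (Ka)^T (Ka)
    return 0.5 * Ka @ Ka - a @ K @ t + 0.5 * t @ t + 0.5 * lam * a @ Ka

# Random illustrative data: N points, M basis functions.
rng = np.random.default_rng(0)
N, M, lam = 6, 3, 0.1
Phi = rng.normal(size=(N, M))
t = rng.normal(size=N)
a = rng.normal(size=N)                   # arbitrary a, not the minimizer

K = Phi @ Phi.T
w = Phi.T @ a                            # the substitution w = Phi^T a

j_w = J_primal(w, Phi, t, lam)
j_a = J_dual(a, K, t, lam)
```

The two values agree up to floating-point rounding, which is what the algebraic substitution predicts.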

Solutions for w and a

If we set the gradient of $J(\mathbf{w})$ with respect to $\mathbf{w}$ to zero, we can solve for $\mathbf{w}$ to get


$$\mathbf{w} = -\frac{1}{\lambda} \sum_{n=1}^{N}\{\mathbf{w}^{T} \phi \left( \mathbf{x}_{n}\right) -t_{n}\} \phi(\mathbf{x}_{n})=  \sum_{n=1}^{N}\{a_{n}\phi(\mathbf{x}_{n})\}= \mathbf{\Phi}^T\mathbf{a}$$

Now consider the expression for $J$ in terms of $\mathbf{a}$. Setting its gradient with respect to $\mathbf{a}$ to zero and solving gives


$$\mathbf{a} = (\mathbf{K} + \lambda\mathbf{I}_{N})^{-1}\mathbf{t}$$
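The intermediate step, written out as a sketch using the symmetry of $\mathbf{K}$:

$$\nabla_{\mathbf{a}} J(\mathbf{a}) = \mathbf{K}\mathbf{K}\mathbf{a} - \mathbf{K}\mathbf{t} + \lambda\mathbf{K}\mathbf{a} = \mathbf{K}\{(\mathbf{K} + \lambda\mathbf{I}_{N})\mathbf{a} - \mathbf{t}\} = \mathbf{0}$$

This holds whenever $(\mathbf{K} + \lambda\mathbf{I}_{N})\mathbf{a} = \mathbf{t}$, which yields the solution above. For $\lambda > 0$ the matrix $\mathbf{K} + \lambda\mathbf{I}_{N}$ is always invertible, because $\mathbf{K}$ is positive semidefinite. If $\mathbf{K}$ itself is singular, other vectors $\mathbf{a}$ also satisfy the gradient condition, but they all give the same $\mathbf{w} = \mathbf{\Phi}^T\mathbf{a}$.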

Linear regression in terms of the kernel

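Substitute the solution for $\mathbf{a}$ back into the linear regression model, writing $\mathbf{k}(\mathbf{x})$ for the vector with elements $k_{n}(\mathbf{x}) = k(\mathbf{x}_{n}, \mathbf{x})$: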


$$y(\mathbf{x}) = \mathbf{w}^T\phi(\mathbf{x}) = \mathbf{a}^{T}\mathbf{\Phi}\phi(\mathbf{x}) = \mathbf{k}(\mathbf{x})^{T}(\mathbf{K} + \lambda\mathbf{I}_{N})^{-1}\mathbf{t}$$
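The sketch below fits the dual solution on random data and checks that its predictions match those of the usual primal solution $\mathbf{w} = (\mathbf{\Phi}^T\mathbf{\Phi} + \lambda\mathbf{I}_{M})^{-1}\mathbf{\Phi}^T\mathbf{t}$. The helper names (`linear_kernel`, `fit_dual`, `predict_dual`), the data, and the choice of treating the rows of `Phi` as the feature vectors are assumptions for illustration only.

```python
import numpy as np

def linear_kernel(A, B):
    """k(x, x') = phi(x)^T phi(x'), with feature vectors stored as rows."""
    return A @ B.T

def fit_dual(K, t, lam):
    """Solve (K + lam * I_N) a = t for the dual coefficients a."""
    N = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(N), t)

def predict_dual(k_x, a):
    """y(x) = k(x)^T a; k_x holds one row k(x)^T per query point."""
    return k_x @ a

# Random illustrative data: N training points, M basis functions.
rng = np.random.default_rng(1)
N, M, lam = 8, 3, 0.5
Phi = rng.normal(size=(N, M))        # training features, row n is phi(x_n)^T
t = rng.normal(size=N)               # training targets
Phi_new = rng.normal(size=(2, M))    # features of two new inputs

# Dual route: only kernel evaluations are needed.
K = linear_kernel(Phi, Phi)
a = fit_dual(K, t, lam)
k_new = linear_kernel(Phi_new, Phi)  # shape (2, N)
y_dual = predict_dual(k_new, a)

# Primal route for comparison: solve directly for w.
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M), Phi.T @ t)
y_primal = Phi_new @ w
```

The dual route solves an $N \times N$ system and the primal route an $M \times M$ one, so the dual form is mainly attractive when the feature space is large or is only available implicitly through a kernel function.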

Classes/BMTRY790/KernelMethods/001_Dual_Representations (last edited 2008-01-29 15:52:09 by strasbu)