Section 6.1: Dual Representations
Equation 6.2, the linear regression model:

To go from this equation to the next, we set the gradient of latex($J(\mathbf{w})$) equal to 0, solve for w, resulting in latex($\mathbf{w}=\mathbf{\Phi}^{T}\mathbf{a}$). and then we substitute a in for w.
Equation 6.5

We can define the Gram matrix, an NxN matrix, as

with elements

Thus we can express J in terms of the kernel function

Solutions for w and a
If we take the gradient of J in terms of w, we can solve for w to get

Now consider the expression for J in terms of a and take the gradient

Linear regression in terms of the kernel
Substitute the solution for a back into the linear regression model

