\n",
"Details: Rewriting the cost more conveniently
\n",
"\n",
"We can rewrite the cost more tersly and conveniently as\n",
" \n",
"$$\\begin{align}\n",
" C_2 &= \\sum^N_{n = 1} \\big[y_n - (w_1x_n + w_0)\\big]^2\\\\\n",
" &= \\sum^N_{n = 1} \\big[\\mathbf{y}_n - \\sum^2_{j = 1}\\mathbf{X}_{nj}\\mathbf{w}_j\\big]^2\\\\\n",
" &= \\sum^N_{n = 1} \\big[\\mathbf{y}_n - \\left(\\mathbf{X}\\mathbf{w}\\right)_n\\big]^2\\\\\n",
" &= \\big(\\mathbf{y} - \\mathbf{X}\\mathbf{w}\\big)^\\top \\big(\\mathbf{y} - \\mathbf{X}\\mathbf{w}\\big)\n",
"\\end{align}$$\n",
" \n",
"
\n",
" \n",
"\n",
"Details: Vector derivatives
\n",
"\n",
"Here we show a more detailed derivation of the equality\n",
" \n",
"$$\\begin{align}\\frac{\\partial C_2}{\\partial \\mathbf{w}} = -2\\mathbf{X}^\\top\\big(\\mathbf{y} - \\mathbf{X}\\mathbf{w}\\big).\\end{align}$$\n",
" \n",
"We can write\n",
"\n",
"$$\\begin{align}\n",
"\\bigg(\\frac{\\partial C_2}{\\partial \\mathbf{w}}\\bigg)_i &= \\frac{\\partial C_2}{\\partial \\mathbf{w}_i} = \\frac{\\partial}{\\partial \\mathbf{w}_i} \\bigg[\\big(\\mathbf{y} - \\mathbf{X}\\mathbf{w}\\big)^\\top \\big(\\mathbf{y} - \\mathbf{X}\\mathbf{w}\\big)\\bigg] \\\\\n",
"&= \\frac{\\partial}{\\partial \\mathbf{w}_i} \\sum_n \\bigg[\\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big) \\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big)\\bigg]\\\\\n",
"&= 2\\sum_n \\bigg[\\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big) \\frac{\\partial}{\\partial \\mathbf{w}_i} \\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big)\\bigg]\\\\\n",
"&= -2\\sum_n \\bigg[\\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big) \\big(\\sum_j\\mathbf{X}_{nj} \\frac{\\partial \\mathbf{w}_j}{\\partial \\mathbf{w}_i}\\big)\\bigg]\\\\\n",
"&= -2\\sum_n \\bigg[\\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big) \\big(\\sum_j\\mathbf{X}_{nj} \\delta_{ij}\\big)\\bigg]\\\\\n",
"&= -2\\sum_n \\bigg[\\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big)\\mathbf{X}_{ni}\\bigg]\\\\\n",
"&= -2\\sum_n \\bigg[\\mathbf{X}^\\top_{in}\\big(\\mathbf{y}_n - \\sum_j\\mathbf{X}_{nj}\\mathbf{w}_j\\big)\\bigg]\\\\\n",
"&= -2 \\left[\\mathbf{X}^\\top \\big(\\mathbf{y} - \\mathbf{X}\\mathbf{w}\\big)\\right]_i\\\\\n",
"\\end{align}$$\n",
"\n",
"\n",
"
\n",
" \n",
"