10/02/2026 12:01 PM - Lecture

⬅️ [27/01/2026 12:00 PM - Lecture](<./27_01_2026 12_00 PM - Lecture.md>) | ⬆️ [ECE 567](<./README.md>) | [12/02/2026 11:56 AM - Lecture](<./12_02_2026 11_56 AM - Lecture.md>) ➡️

lecture-07-2.pdf
lecture-08.pdf

Starting with some of lecture 9:
567-lecture-09.pdf
General idea: approximate the value function as a weighted combination of a feature vector, then apply the usual Bellman machinery.
Projecting back and forth between the high-dimensional value space and the low-dimensional weight space means we only get that the Bellman error is orthogonal to the features, not that the Bellman error is 0.
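In symbols (standard projected-Bellman notation; the labels $\Pi$, $T_\pi$, and $D$ are mine, not necessarily the lecture's):
$$
\tilde{V} = \Pi T_\pi \tilde{V}, \qquad \Pi = \Phi^T (\Phi D \Phi^T)^{-1} \Phi D,
$$
where $T_\pi$ is the Bellman operator for the policy and $D$ is the diagonal matrix of the steady-state distribution. The fixed point satisfies $\Phi D (T_\pi \tilde{V} - \tilde{V}) = 0$: the Bellman error is orthogonal to every feature in the $D$-weighted inner product, but need not be zero.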

So generally we want
$$
\|\Phi^T w - V\|
$$
to be small, where $\tilde{V} = \Phi^T w$ is the linear approximation of the value function.
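A quick numerical sketch of the orthogonality point (toy random $\Phi$ and $V$, my own example, not from the lecture): fitting $w$ by least squares drives the error $\Phi^T w - V$ orthogonal to the features without making it zero.

```python
import numpy as np

# Toy example (not from the lecture): 5 states, 2 features.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(2, 5))   # d x n feature matrix, matching the Phi^T w convention
V = rng.normal(size=5)          # "true" value function, one entry per state

# Weights minimizing ||Phi^T w - V||
w, *_ = np.linalg.lstsq(Phi.T, V, rcond=None)

error = Phi.T @ w - V           # error in the high-dimensional value space
print(Phi @ error)              # orthogonal to each feature: ~[0, 0]
print(np.linalg.norm(error))    # but not zero in general
```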

The orthogonality property holds on-policy (the data used to learn the approximation comes from the policy, so the steady-state distribution is the policy's).
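A minimal on-policy sketch (hypothetical two-state chain and features, my example, not from the lecture): semi-gradient TD(0) samples states from the policy's own chain, so the updates are implicitly weighted by that policy's steady-state distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-state chain under a fixed policy
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])      # on-policy transition probabilities
r = np.array([0.0, 1.0])       # reward received on leaving each state
gamma = 0.9
Phi = np.array([[1.0, 0.5],
                [0.0, 1.0]])   # column s = feature vector of state s, so V~ = Phi^T w

w = np.zeros(2)
alpha = 0.05
s = 0
for _ in range(20000):
    s_next = rng.choice(2, p=P[s])
    td_error = r[s] + gamma * Phi[:, s_next] @ w - Phi[:, s] @ w
    w += alpha * td_error * Phi[:, s]   # semi-gradient TD(0) update
    s = s_next

print(Phi.T @ w)   # learned state values; state 1 should be valued higher
```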

567-lecture-10.pdf


⬅️ [27/01/2026 12:00 PM - Lecture](<./27_01_2026 12_00 PM - Lecture.md>) | ⬆️ [ECE 567](<./README.md>) | [12/02/2026 11:56 AM - Lecture](<./12_02_2026 11_56 AM - Lecture.md>) ➡️