
Logistics

⬅️ [Past Exam](<./Past Exam.md>) | ⬆️ [Studying](<./README.md>) | [General](<./General.md>) ➡️

Logistics:

  1. It is a closed-book exam. You can bring a one-page cheat sheet and use both sides.
  2. You can bring a calculator (without WiFi). Do not use your cell phone as a calculator during the exam. You will be asked to put your cell phone and any other devices with Internet access in your backpack and leave your backpack at the front.
  3. Scratch paper will be distributed at the beginning of the exam. Do not use your own scratch paper during the exam.
  4. Please bring your U-M ID card to the exam.
  5. You will be seated according to a seating chart that will be available at the exam.

Topics:

  1. Markov chains and stationary distributions
  2. Value iteration and policy iteration algorithms
  3. Q-learning and SARSA updates
  4. Definition of a contraction mapping and its application to proving convergence
  5. Q-learning update with linear function approximation
  6. Double estimator and the DDQN update
  7. Score function definition and policy gradient theorem
  8. REINFORCE and variance reduction
  9. Natural policy gradient and its connection to soft policy iteration
  10. Performance difference lemma
  11. UCB algorithm for multi-armed bandits
  12. MCTS for AlphaZero
  13. Potential-based reward shaping
  14. DPO and DPO loss
  15. Zeroth-order optimization
  16. Contrastive learning
  17. Relative value function and Blackwell optimality
  18. Duality and KKT conditions
