📚 ZHANGWP
Explorer
Notes
Book Reading
Reinforcement Learning: An Introduction
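Chap 2 - Multi-armed Bandits
Chap 3 - Finite Markov Decision Processes
Chap 4 - Dynamic Programming
Chap 5 - Monte Carlo Methods
Chap 6 - Temporal-Difference Learning
Chap 7 - n-step Bootstrapping
Chap 8 - Planning and Learning
Chap 9 - On-Policy Prediction with Approximation
Chap 10 - On-Policy Control with Approximation
Chap 11 - Off-Policy Methods with Approximation
Chap 12 - Eligibility Traces
Chap 13 - Policy Gradient Methods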
Paper Reading
[2018-12-26]MCTS Introduction
[2020-07-06]Model-based RL with uncertainty
[2020-07-26]Background and Decision-time Planning
[2022-03-25]RL and Language Models
[2022-10-14]Factored Adaption for Non-stationary RL
[2022-11-18]RL with Causal Reasoning
[2023-05-24]Diffusion Models and RL
[2023-06-30]AdaPlanner & LLM Weights
[2023-10-29]Hallucination in LMM
Other
Donate
Friend Link
Statement
Share
Diary
Blog History
Game Log
My Postgraduate Admission
My Student Life
Running Log
NKU SMS Exams
Notes
2 items under this folder.
Apr 18, 2024
Paper Reading
Apr 18, 2024
Book Reading