📚 ZHANGWP
Diary
5 items under this folder.
May 16, 2024
Running Log
May 16, 2024
My Student Life
May 16, 2024
My Postgraduate Admission
May 16, 2024
Game Log
May 16, 2024
Blog History