Introduction to Multi-Agent Reinforcement Learning (7): AC for CDec-POMDP, a Learning Algorithm for Large-Scale Planning

xiaoxiao, 2022-07-02

Nguyen, Duc Thien, Kumar, Akshat, & Lau, Hoong Chuin. (2017). Policy Gradient With Value Function Approximation For Collective Multiagent Planning.

For the full text, see: https://zhuanlan.zhihu.com/p/66571753
