Introduction to Multi-Agent Reinforcement Learning (7): AC for CDec-POMDP, a Learning Algorithm for Large-Scale Planning

    xiaoxiao2022-07-02  109

    Thien, Nguyen & Kumar, Akshat & Lau, Hoong. (2017). Policy Gradient With Value Function Approximation For Collective Multiagent Planning.

    For the full content, see: https://zhuanlan.zhihu.com/p/66571753
