Thien, Nguyen & Kumar, Akshat & Lau, Hoong. (2017). Policy Gradient With Value Function Approximation For Collective Multiagent Planning.
For details, see: https://zhuanlan.zhihu.com/p/66571753