Conference paper: Asymptotically Exact Error Characterization of Offline Policy Evaluation with Misspecified Linear Models
Conference paper: Finite-Time Convergence and Sample Complexity of Multi-Agent Actor-Critic Reinforcement Learning with Average Reward