Automated AI For Decision Optimization with Reinforcement Learning
Abstract
Reinforcement learning (RL) has many practical real-world applications. However, RL solutions are highly sensitive to the choice of RL algorithm and its internal hyperparameters, and making these choices requires manual effort from experts. This limits the widespread applicability of RL solutions in practice. In this lab session, we introduce AutoDO, an automated system for solving sequential decision-making problems end to end. Our system automatically selects the best RL algorithm and its hyperparameters using a search strategy based on limited discrepancy search coupled with Bayesian optimization. It supports both online and offline RL, as well as automated Markov Decision Process (MDP) models based on mathematical programming. Data scientists with little or no experience in RL will be able to formulate and solve sequential decision-making problems using AutoDO. They will focus on preparing the inputs, namely a system environment for online RL, or a data set annotated with a knowledge data structure for offline RL and automated MDP models, and will gain experience in using our system to automate generation of the solution pipeline for the optimal decision policy. Prerequisites will be shared in advance; the main ones are basic familiarity with Python and a free subscription, obtained ahead of the session, to a public-facing API service.
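
To give a feel for the limited discrepancy search mentioned in the abstract, the following is a minimal Python sketch of the general technique, not AutoDO's implementation. It assumes a hypothetical search space in which each stage lists its options in heuristic preference order, and it replaces policy training and evaluation with a toy scoring function. All names and values (STAGES, toy_score, the listed algorithms and hyperparameters) are illustrative assumptions. The Bayesian optimization component is omitted.

```python
# Hypothetical search space (not taken from AutoDO): each stage lists its
# options in heuristic preference order, so index 0 is the heuristic's pick.
STAGES = [
    ("algorithm", ["ppo", "dqn", "a2c"]),
    ("learning_rate", [3e-4, 1e-3, 1e-4]),
    ("gamma", [0.99, 0.95, 0.9]),
]


def toy_score(config):
    """Toy stand-in for training and evaluating a policy; higher is better."""
    bonus = {"ppo": 0.0, "dqn": 0.5, "a2c": 0.1}[config["algorithm"]]
    lr_penalty = abs(config["learning_rate"] - 1e-3) * 100
    gamma_penalty = abs(config["gamma"] - 0.99)
    return bonus - lr_penalty - gamma_penalty


def paths_with_discrepancies(stages, k):
    """Yield every config that departs from the heuristic at exactly k stages."""

    def recurse(i, remaining, partial):
        if i == len(stages):
            if remaining == 0:
                yield dict(partial)  # copy, since partial is mutated in place
            return
        name, options = stages[i]
        # Follow the heuristic at this stage (no discrepancy spent).
        partial[name] = options[0]
        yield from recurse(i + 1, remaining, partial)
        # Depart from the heuristic at this stage (spend one discrepancy).
        if remaining > 0:
            for option in options[1:]:
                partial[name] = option
                yield from recurse(i + 1, remaining - 1, partial)
        del partial[name]

    yield from recurse(0, k, {})


def limited_discrepancy_search(stages, score, max_discrepancies):
    """Try configs in order of increasing discrepancy count; keep the best."""
    best_config, best_score, evaluated = None, float("-inf"), 0
    for k in range(max_discrepancies + 1):
        for config in paths_with_discrepancies(stages, k):
            evaluated += 1
            value = score(config)
            if value > best_score:
                best_config, best_score = config, value
    return best_config, best_score, evaluated


if __name__ == "__main__":
    config, value, evaluated = limited_discrepancy_search(STAGES, toy_score, 1)
    print(config, value, evaluated)
```

With a budget of one discrepancy, the sketch evaluates 7 of the 27 possible configurations: the heuristic's own choice plus every single-stage departure from it. Raising the budget widens the search at a higher evaluation cost. In a fuller system, continuous hyperparameters would more plausibly be tuned by Bayesian optimization than enumerated as done here.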