id: cord-163946-a4vtc7rp
author: Awasthi, Raghav
title: VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning
date: 2020-09-14
pages:
extension: .txt
mime: text/plain
words: 4475
sentences: 251
flesch: 54
summary: We approach this problem by proposing a novel pipeline, VacSIM, that dovetails the Actor-Critic using Kronecker-Factored Trust Region (ACKTR) model into a Contextual Bandits approach for optimizing the distribution of COVID-19 vaccine. We evaluate this framework against a naive allocation approach that distributes vaccine in proportion to the incidence of COVID-19 cases in five different States across India, and demonstrate up to 100,000 additional lives potentially saved and a five-fold increase in the efficacy of limiting the spread over a period of 30 days through the VacSIM approach. In this paper, we introduce VacSIM, a novel feed-forward reinforcement learning approach for learning effective policy combined with near real-time optimization of vaccine distribution, and demonstrate its potential benefit if applied to five States across India. Contextual Bandits choose an action based on the current context and receive a corresponding reward, and hence are more relevant to real-world environments such as the vaccine distribution problem addressed in this work.
cache: ./cache/cord-163946-a4vtc7rp.txt
txt: ./txt/cord-163946-a4vtc7rp.txt
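
The naive baseline named in the summary, allocating vaccine in proportion to case incidence, can be illustrated with a minimal sketch. This is not the authors' code: the function name `allocate_proportional`, the region labels, and the case counts are hypothetical, and the largest-remainder rounding step is an assumption added only so that whole doses sum to the available supply.

```python
def allocate_proportional(cases, doses):
    """Split `doses` across regions in proportion to reported case counts.

    `cases` maps a region label to its case count (hypothetical inputs).
    """
    total = sum(cases.values())
    if total <= 0:
        raise ValueError("total case count must be positive")
    # Exact (fractional) share per region, then round each share down.
    exact = {region: doses * n / total for region, n in cases.items()}
    alloc = {region: int(share) for region, share in exact.items()}
    # Assumed tie-breaking rule: hand out the doses lost to rounding,
    # largest fractional remainder first.
    leftover = doses - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda r: exact[r] - alloc[r], reverse=True)
    for region in by_remainder[:leftover]:
        alloc[region] += 1
    return alloc


# Placeholder regions and counts, not data from the paper.
example = allocate_proportional({"A": 500, "B": 300, "C": 200}, doses=1000)
```

A learned policy such as the one the summary describes would replace this fixed rule with an allocation chosen from the current context; the sketch only fixes what the comparison baseline computes.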