"Online Dynamic Programming"
We consider the problem of repeatedly solving a variant of the same dynamic programming problem in successive trials. An instance of the type of problems we consider is to find a good binary search tree in a changing environment. At the beginning of each trial, the learner probabilistically chooses a tree with the n keys. The learner is then told the frequencies of the keys and is charged by the average search cost for the chosen tree. The problem is online because the frequencies can change between trials. The goal is to develop algorithms with the property that their total average search cost (loss) in all trials is close to the total loss of the best tree chosen in hindsight for all trials. The challenge, of course, is that the algorithm has to deal with exponential number of trees. We develop a general methodology for tackling such problems for arbitrary dynamic programming algorithms with min-sum recurrence relations. Our framework allows us to extend online learning algorithms like Expanded Hedge and Component Hedge to a significantly wider class of combinatorial objects than was possible before.
Holakou Rahmanian obtained his PhD in Computer Science at UC Santa Cruz in September 2018. He was advised by SVN Vishwanathan and Manfred Warmuth. His research is mostly focused on online learning, multi-armed bandits, learning structured concepts and combinatorial objects. In the past summers, he worked at Google, Microsoft, and eBay as intern. After his PhD Graduation, he worked at Kyushu University in Japan as a Visiting Researcher. Prior to UCSC, he obtained his Bachelors of Science in Mathematics and also Computer Science from Baha’i Institute for Higher Education (BIHE) in Iran. Holakou is joining Microsoft as an Applied Scientist in December 2018.
Friday, November 16 at 2:30pm to 3:30pm
McAdams Hall, 119