Approximating Optimal Dudo Play with Fixed-Strategy Iteration Counterfactual Regret Minimization
Steven C. Hnath: Class of 2012
Neller and Hnath '12 were the first to compute approximately optimal strategy for the full 2-player game of Dudo, a 15th c. Inca bluffing dice game that remains popular today. After reducing over 290 quintillion information sets to less than 22 million abstract information sets using imperfect recall of actions, the standard Counterfactual Regret Minimization (CFR) algorithm still proved impractical for Dudo, with the number of recursive visits to the same information sets increasing exponentially with the depth of the game graph. By holding strategies fixed across each training iteration, we developed a new algorithm that transforms CFR training iterations from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time.
Neller, Todd W. and Steven Hnath. “Approximating Optimal Dudo Play with Fixed-Strategy Iteration Counterfactual Regret Minimization.” Advances in Computer Games, 13th International Conference, (LNCS 7168, ACG 2011). Eds. H. Jaap van den Herik and Aske Plaat. Tilburg, The Netherlands, November 20-22, 2011. Revised Selected Papers, Springer, 2012. 169-182.