Title

Approximating Optimal Dudo Play with Fixed-Strategy Iteration Counterfactual Regret Minimization

Roles

Steven C. Hnath: Class of 2012

Document Type

Article

Publication Date

2012

Department

Computer Science

Abstract

Neller and Hnath '12 were the first to compute approximately optimal strategy for the full 2-player game of Dudo, a 15th c. Inca bluffing dice game that remains popular today. After reducing over 290 quintillion information sets to less than 22 million abstract information sets using imperfect recall of actions, the standard Counterfactual Regret Minimization (CFR) algorithm still proved impractical for Dudo, with the number of recursive visits to the same information sets increasing exponentially with the depth of the game graph. By holding strategies fixed across each training iteration, we developed a new algorithm that transforms CFR training iterations from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time.