“Pick Stones” Game – an Artificial Intelligence Toy Problem
What’s the “Pick Stones” Game?
The “Pick Stones-” or “Nim Game”, or more exactly the “Marienbad Game” can be seen as an AI toy problem, that students at the Departments of Software Engineering and Medicine- and Bioinformatics at the University of Applied Sciences Upper Austria have to solve in the second semester. The game rules are as follows: there are two players and N piles of stones, each pile has M stones. One player starts with removing x stones from one pile. He must remove at least one stone, but can remove as many stones as he wants (from 1 stone to the complete pile) – but only from one pile. The player to remove the last stone loses the game.
Example game with 4 piles and 3 stones each
After turn 7 player 2 loses, as he has to pick the last stone – from pile 2 – with turn 8 off the field.
The excercise is to implement an algorithm, that takes the best move each turn, therefore acts rationally. Well known approaches to such problems are the Min-Max and the more efficient Alpha-Beta-Pruning algorithm. I’ve done a proof of concept implementation in Python using the Alpha-Beta-Pruning approach, which can be downloaded at the bottom of the post.
The interesting thing about this problem is, that – like for many of such problems – the initial configuration (in this case: the configuration of the piles) alone defines, if the starting player will win for sure – if he plays rationally, or lose for sure – if his opponent plays rationally.
Python source, md5: 152a846eabf83cbf8f9cd901dda356a7, sha1: 7a37b6477e9e1f149411924b15412ad8458b88c1.
# Alpha Beta Pruning Implementation for the "Nim"-, "Pick Stones"- or "Marienbad"-Game in # the course Advanced Algorithms and Data Structures. # Department of Software Engineering, Department of Medicine- and Bioinformatics # University of Applied Sciences Upper Austria # The rules for this version of the game are that the player to pick the last stone loses. # This implementation proves that the only thing that influences which player will win # -- assuming that both player play rationally -- is the initial configuration of the piles. # # 2012-05-06 # Rainhard Findling # Department of Mobile Computing # University of Applied Sciences Upper Austria # # ================================================================================= # numbers near 10000 mean that max will win # near 0 mean min will win MAX_VALUE = 10000; MIN_VALUE = 0; # START CONFIGURATION isStartPlayerMax = True # index i = nr of stones on this pile, nr at index i = how much such piles we have # e.g. [1,0,0,0,2] means we have 1 empty pile, and 2 piles with 4 stones each pile=[0,0,0,0,5] # ================================================================================= def isLose(pile): if pile == 1: for nr in pile[2:]: if nr > 0: return False # print 'encountered lose situation:', pile return True return False def getMoves(pile): moves =  amountOfPilesToChange = 0 for i in range(1,len(pile)): if pile[i] > 0: #there is still such a pile amountOfPilesToChange = amountOfPilesToChange + pile[i] for r in range(i-1, -1, -1): moves.append([i,r+1]) #if we only have 1 pile to change left #we are not allowed to remove all stones from a pile, except it is pile nr 1 if amountOfPilesToChange == 1: for move in moves: if move != 1 and move == move: #print 'amountOfPilesToChange == 1, removing', move, 'pile', pile moves.remove(move) return moves def applyMove(pile, move): #raise new nr newNr = move - move pile[move] = pile[move] - 1 pile[newNr] = pile[newNr] + 1 return pile def undoMove(pile, move): # print 'undoing move, pile=', pile, 'move=',move newNr = move - move pile[move] = pile[move] + 1 pile[newNr] = pile[newNr] - 1 return pile def maxPlayer(depth, alpha, beta, pile): if isLose(pile): # print 'max lost' return MIN_VALUE+depth #get possible moves moves = getMoves(pile) for move in moves: # apply move # print 'max:', pile, ', move:', move pile = applyMove(pile, move) # print 'new:', pile alpha = max(alpha, minPlayer(depth+1, alpha, beta, pile)) if alpha >= beta: pile = undoMove(pile, move) return alpha # reverse move pile = undoMove(pile, move) return alpha def minPlayer(depth, alpha, beta, pile): if isLose(pile): # print 'min lost' return MAX_VALUE-depth #get possible moves moves = getMoves(pile) for move in moves: # apply move # print 'min:', pile, ', move:', move pile = applyMove(pile, move) # print 'new:', pile beta = min(beta, maxPlayer(depth+1, alpha, beta, pile)) if beta <= alpha: pile = undoMove(pile, move) return beta # reverse move pile = undoMove(pile, move) return beta def main(pile, isStartPlayerMax): print 'initial pile:', pile # the while loop calling a alpha-beta-pruning on each iteration is a pretty # cheap proof-of-concept implementation, consumes a lot of runtime # adapt if should be used for bigger piles while not isLose(pile): minStrength = MAX_VALUE maxStrength = MIN_VALUE curBestMove = [-1,-1] depth = 0; moves = getMoves(pile) beststrength = -1 for move in moves: #do move pile = applyMove(pile, move) if isStartPlayerMax: # start with maxplayer strength = minPlayer(depth+1, maxStrength, minStrength, pile) if strength > maxStrength: curBestMove = move maxStrength = strength beststrength = strength else: strength = maxPlayer(depth+1, maxStrength, minStrength, pile) if strength < minStrength: curBestMove = move minStrength = strength beststrength = strength #undo move pile = undoMove(pile, move) print pile, 'ismax=' + str(isStartPlayerMax) + ', applying best move ', curBestMove, 'with utility', beststrength pile = applyMove(pile, curBestMove) isStartPlayerMax = not isStartPlayerMax print pile, 'ismax=' + str(isStartPlayerMax), 'lost' #let players play against each other main(pile, isStartPlayerMax)