# “Pick Stones” Game – an Artificial Intelligence Toy Problem

What’s the “Pick Stones” Game?

The “Pick Stones-” or “Nim Game”, or more exactly the “Marienbad Game” can be seen as an AI toy problem, that students at the Departments of Software Engineering and Medicine- and Bioinformatics at the University of Applied Sciences Upper Austria have to solve in the second semester. The game rules are as follows: there are two players and N piles of stones, each pile has M stones. One player starts with removing x stones from one pile. He must remove at least one stone, but can remove as many stones as he wants (from 1 stone to the complete pile) – but only from one pile. The player to remove the last stone loses the game.

Example game with 4 piles and 3 stones each

After turn 7 player 2 loses, as he has to pick the last stone – from pile 2 – with turn 8 off the field.

The Task

The excercise is to implement an algorithm, that takes the best move each turn, therefore acts rationally. Well known approaches to such problems are the Min-Max and the more efficient Alpha-Beta-Pruning algorithm. I’ve done a proof of concept implementation in Python using the Alpha-Beta-Pruning approach, which can be downloaded at the bottom of the post.

The interesting thing about this problem is, that – like for many of such problems – the initial configuration (in this case: the configuration of the piles) alone defines, if the starting player will win for sure – if he plays rationally, or lose for sure – if his opponent plays rationally.

Download
Python source, md5: 152a846eabf83cbf8f9cd901dda356a7, sha1: 7a37b6477e9e1f149411924b15412ad8458b88c1.

```# Alpha Beta Pruning Implementation for the "Nim"-, "Pick Stones"- or "Marienbad"-Game in
# the course Advanced Algorithms and Data Structures.
# Department of Software Engineering, Department of Medicine- and Bioinformatics
# University of Applied Sciences Upper Austria
# The rules for this version of the game are that the player to pick the last stone loses.
# This implementation proves that the only thing that influences which player will win
# -- assuming that both player play rationally -- is the initial configuration of the piles.
#
# 2012-05-06
# Rainhard Findling
# Department of Mobile Computing
# University of Applied Sciences Upper Austria
#
# =================================================================================

# numbers near 10000 mean that max will win
# near 0 mean min will win
MAX_VALUE = 10000;
MIN_VALUE = 0;

# START CONFIGURATION
isStartPlayerMax = True
# index i = nr of stones on this pile, nr at index i = how much such piles we have
# e.g. [1,0,0,0,2] means we have 1 empty pile, and 2 piles with 4 stones each
pile=[0,0,0,0,5]

# =================================================================================

def isLose(pile):
if pile[1] == 1:
for nr in pile[2:]:
if nr > 0:
return False
#        print 'encountered lose situation:', pile
return True
return False

def getMoves(pile):
moves = []
amountOfPilesToChange = 0
for i in range(1,len(pile)):
if pile[i] > 0:
#there is still such a pile
amountOfPilesToChange = amountOfPilesToChange + pile[i]
for r in range(i-1, -1, -1):
moves.append([i,r+1])
#if we only have 1 pile to change left
#we are not allowed to remove all stones from a pile, except it is pile nr 1
if amountOfPilesToChange == 1:
for move in moves:
if move[0] != 1 and move[0] == move[1]:
#print 'amountOfPilesToChange == 1, removing', move, 'pile', pile
moves.remove(move)
return moves

def applyMove(pile, move):
#raise new nr
newNr = move[0] - move[1]
pile[move[0]] = pile[move[0]] - 1
pile[newNr] = pile[newNr] + 1
return pile

def undoMove(pile, move):
#    print 'undoing move, pile=', pile, 'move=',move
newNr = move[0] - move[1]
pile[move[0]] = pile[move[0]] + 1
pile[newNr] = pile[newNr] - 1
return pile

def maxPlayer(depth, alpha, beta, pile):
if isLose(pile):
#        print 'max lost'
return MIN_VALUE+depth
#get possible moves
moves = getMoves(pile)
for move in moves:
# apply move
#        print 'max:', pile, ', move:', move
pile = applyMove(pile, move)
#        print 'new:', pile
alpha = max(alpha, minPlayer(depth+1, alpha, beta, pile))
if alpha >= beta:
pile = undoMove(pile, move)
return alpha
# reverse move
pile = undoMove(pile, move)
return alpha

def minPlayer(depth, alpha, beta, pile):
if isLose(pile):
#        print 'min lost'
return MAX_VALUE-depth
#get possible moves
moves = getMoves(pile)
for move in moves:
# apply move
#        print 'min:', pile, ', move:', move
pile = applyMove(pile, move)
#        print 'new:', pile
beta = min(beta, maxPlayer(depth+1, alpha, beta, pile))
if beta <= alpha:
pile = undoMove(pile, move)
return beta
# reverse move
pile = undoMove(pile, move)
return beta

def main(pile, isStartPlayerMax):
print 'initial pile:', pile
# the while loop calling a alpha-beta-pruning on each iteration is a pretty
# cheap proof-of-concept implementation, consumes a lot of runtime
# adapt if should be used for bigger piles
while not isLose(pile):
minStrength = MAX_VALUE
maxStrength = MIN_VALUE
curBestMove = [-1,-1]
depth = 0;
moves = getMoves(pile)
beststrength = -1
for move in moves:
#do move
pile = applyMove(pile, move)
if isStartPlayerMax:
# start with maxplayer
strength = minPlayer(depth+1, maxStrength, minStrength, pile)
if strength > maxStrength:
curBestMove = move
maxStrength = strength
beststrength = strength
else:
strength = maxPlayer(depth+1, maxStrength, minStrength, pile)
if strength < minStrength:
curBestMove = move
minStrength = strength
beststrength = strength
#undo move
pile = undoMove(pile, move)
print pile, 'ismax=' + str(isStartPlayerMax) + ', applying best move ', curBestMove, 'with utility', beststrength
pile = applyMove(pile, curBestMove)
isStartPlayerMax = not isStartPlayerMax
print pile, 'ismax=' + str(isStartPlayerMax), 'lost'

#let players play against each other
main(pile, isStartPlayerMax)

```