## Einstein Riddle/Zebra Puzzle in Prolog

In this post we solve another logic puzzle deriving rules from the puzzle and stating them in Prolog. And this time it is a rather famous puzzle one: the Zebra puzzle (also called Einstein riddle/puzzle).

## The puzzle

The setting is as follows: there is a group of 5 people. Each person lives in exactly one house, has exactly one nationality and exactly one pet, drinks exactly one drink and smokes exactly one brand of cigarettes. Different goals are possible: a famous one is “who drinks water and who owns the zebra”. But one could instead also define the goal to derive for all houses and people: who lives is which house, has which nationality and which pet, drinks which drink and smokes which brand of cigarettes – given the fact, that each of these attributes is unique among the people in our group. This is the question we are going to answer. The following hints are provided (from Wikipedia):

- There are five houses.
- The Englishman lives in the red house.
- The Spaniard owns the dog.
- Coffee is drunk in the green house.
- The Ukrainian drinks tea.
- The green house is immediately to the right of the ivory house.
- The Old Gold smoker owns snails.
- Kools are smoked in the yellow house.
- Milk is drunk in the middle house.
- The Norwegian lives in the first house.
- The man who smokes Chesterfields lives in the house next to the man with the fox.
- Kools are smoked in the house next to the house where the horse is kept.
- The Lucky Strike smoker drinks orange juice.
- The Japanese smokes Parliaments.
- The Norwegian lives next to the blue house.

## Puzzle solution in Prolog

We now need to write the knowledge from above as Prolog rules and ask for a solution fulfilling these rules:

% #################################################################### % Zebra puzzle solver. A house is a list with 6 attributes: % 1. Color % 2. Nationality % 3. Drink % 4. Cigarettes % 5. Pet % 6. The housenumber (fixed by hand to avoid redundant solutions) % % Steps to obtain the solution: % % 1) Read definition to prolog: % ['zebra_puzzle.pl']. % % 2) Ask for solutions: % solve(H1,H2,H3,H4,H5). % % Rainhard Findling % 09/2015 % #################################################################### all_members([H],L2) :- member(H,L2). all_members([H|T],L2) :- member(H,L2), all_members(T, L2). % #################################################################### % RULES % #################################################################### % rule 1 (5 houses) is implicit for our implementation rule2(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [red,england,_,_,_,_]. rule3(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [_,spain,_,_,dog,_]. rule4(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [green,_,coffee,_,_,_]. rule5(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [_,ukraine,tea,_,_,_]. rule6(H1,H2,H3,H4,H5) :- member(HL,[H1,H2,H3,H4,H5]), member(HR,[H1,H2,H3,H4,H5]), HL = [white,_,_,_,_,NrL], HR = [green,_,_,_,_,NrR], NrR-NrL =:= 1. rule7(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [_,_,_,altemgold,snails,_]. rule8(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [yellow,_,_,kools,_,_]. rule9(_,_,H3,_,_) :- H3 = [_,_,milk,_,_,3]. rule10(H1,_,_,_,_) :- H1 = [_,norway,_,_,_,1]. rule11(H1,H2,H3,H4,H5) :- member(HL,[H1,H2,H3,H4,H5]), member(HR,[H1,H2,H3,H4,H5]), HL = [_,_,_,chesterfield,_,NrL], HR = [_,_,_,_,fox,NrR], abs(NrL-NrR,1). rule12(H1,H2,H3,H4,H5) :- member(HL,[H1,H2,H3,H4,H5]), member(HR,[H1,H2,H3,H4,H5]), HL = [_,_,_,kools,_,NrL], HR = [_,_,_,_,horse,NrR], abs(NrL-NrR,1). rule13(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [_,_,orange,luckystrike,_,_]. rule14(H1,H2,H3,H4,H5) :- member(H, [H1,H2,H3,H4,H5]), H = [_,japan,_,parliament,_,_]. rule15(H1,H2,H3,H4,H5) :- member(HL,[H1,H2,H3,H4,H5]), member(HR,[H1,H2,H3,H4,H5]), HL = [_,norway,_,_,_,NrL], HR = [blue,_,_,_,_,NrR], abs(NrL-NrR,1). % #################################################################### % solve the puzzle % #################################################################### solve(H1,H2,H3,H4,H5) :- % bind houses to position to avoid multiple solutions with switched variables H1 = [H1_col,H1_nat,H1_drink,H1_cig,H1_pet,1], H2 = [H2_col,H2_nat,H2_drink,H2_cig,H2_pet,2], H3 = [H3_col,H3_nat,H3_drink,H3_cig,H3_pet,3], H4 = [H4_col,H4_nat,H4_drink,H4_cig,H4_pet,4], H5 = [H5_col,H5_nat,H5_drink,H5_cig,H5_pet,5], % the rules rule2(H1,H2,H3,H4,H5), rule3(H1,H2,H3,H4,H5), rule4(H1,H2,H3,H4,H5), rule5(H1,H2,H3,H4,H5), rule6(H1,H2,H3,H4,H5), rule7(H1,H2,H3,H4,H5), rule8(H1,H2,H3,H4,H5), rule9(H1,H2,H3,H4,H5), rule10(H1,H2,H3,H4,H5), rule12(H1,H2,H3,H4,H5), rule13(H1,H2,H3,H4,H5), rule14(H1,H2,H3,H4,H5), rule15(H1,H2,H3,H4,H5), % for all variable ensure that all values exist all_members([white, green, red, blue, yellow], [H1_col,H2_col,H3_col,H4_col,H5_col]), all_members([spain, japan, england, ukraine, norway], [H1_nat,H2_nat,H3_nat,H4_nat,H5_nat]), all_members([orange, coffee, milk, tea, water], [H1_drink,H2_drink,H3_drink,H4_drink,H5_drink]), all_members([luckystrike, parliament, altemgold, chesterfield, kools], [H1_cig,H2_cig,H3_cig,H4_cig,H5_cig]), all_members([dog, zebra, snails, horse, fox], [H1_pet,H2_pet,H3_pet,H4_pet,H5_pet]).

If executed, two possible solutions are printed:

['zebra_puzzle.pl']. solve(H1,H2,H3,H4,H5). H1 = [yellow, norway, water, kools, zebra, 1], H2 = [blue, ukraine, tea, chesterfield, horse, 2], H3 = [red, england, milk, altemgold, snails, 3], H4 = [white, spain, orange, luckystrike, dog, 4], H5 = [green, japan, coffee, parliament, fox, 5] ; H1 = [yellow, norway, water, kools, fox, 1], H2 = [blue, ukraine, tea, chesterfield, horse, 2], H3 = [red, england, milk, altemgold, snails, 3], H4 = [white, spain, orange, luckystrike, dog, 4], H5 = [green, japan, coffee, parliament, zebra, 5] ;

That’s it, puzzle solved! 😉

## Hidoku Solver in Python: branch cutting, intelligent successor generation and some simplifications

In this post we are implementing a Hidoku solver (Hidoku is yet another fine number puzzle) that uses a depth first search, branch cutting, limited (intelligent) successor generation and some automatic simplification. Usually, a Hidoku is a quadratic board consisting of n x n fields – but rectangular or other forms would be possible as well. With each Hidoku, some fields are already pre-filled with numbers at the beginning.

The game goal is to fill in all other numbers so that an ascending number queue is built: each number has to be adjacent to it’s successor, with adjacent meaning in an adjacent horizontal, vertical or diagonal field. Those adjacent fields are what we call a field neighborhood. The first and last number (e.g. with a 10×10 board they will be 1 and 100) are usually amongst those pre-filled into the board – for the player knowing where the queue starts and ends. See e.g. here or here to try out some online Hidoku examples for yourself.

Within our solver we use a similar puzzle representation which we read from a text file. E.g. the sample from above would look like this:

__,__,93,__,__,__,___,__,__,57 __,__,__,22,__,53, 54,40,__,__ __,__,18,23,52,__,100,__,__,__ __,__,__,__,__,99, 33,__,__,__ __,__,__,__,__,25,___,__,__,__ __,__,__,__,__,__,___,__,__,29 14,__, 7,__,__,__,___,__,__,__ __,__,__,65,__,__, 46,72,77,__ __, 9,__,__,__,__,___,__,80,76 1,__,__,11,__,68, 82,__,__,__

## Solving a Hidoku

With our solver we want to obtain all possible solutions to a Hidoku (although it should ideally only have 1). Therefore we utilize a depth first search with branch cutting, intelligent successor generation and some automatic simplification. Besides the simple game board representation (positions with numbers) we also use some additional variables for speeding up the search (see Board.__init__() for details):

- self.nparray: the standard n x m numpy array representation of the game board, with the numbers present on the board written to the y/x position of the array.
- self.nrs_yx_dict: dictionary of numbers filled to the board and their corresponding tuple of y/x positions. This allows fast lookup of the y/x position a number is filled to.
- self.nrs_left_to_place: list of numbers that still need to be placed on the board. Allows for slightly faster lookup of still missing numbers.
- self.nrs_placement_possibilities: list of possible placements as numbers and their corresponding y/x positions (representing positions this numbers could be filled to, given the current state of the board). This list enables much faster access to where numbers could get placed on the board. It is created initially (for the first node in the search tree) and for all successor nodes in the search tree possibilities that became impossible just get removed.
- self.field_connection_status: lists how many connections each field in the board currently has (0 to 2). A connection exists if the number preceding or succeeding to the number in the field is present in the field neighborhood (with the neighborhood including all adjacent fields: vertical, horizontal and diagonal). These connection stats enable quick checks if the board is already invalid (more on that later). The connection stats get handed over to preceding nodes in the search tree by updating the neighborhood and the field itself the number was filled to.

We now look closer at how some of these are created and later used inside branch cutting.

## Determining those fields that a number could be filled to

With the first node in the search tree, self.nrs_placement_possibilities gets created and filled. Succeeding nodes obtain a copy of it with those possibilities removed that became invalid by placing preceding number to the board (by doing the distance checks described below again).

Given the numbers pre-filled to the board, all other numbers can be filled to those fields only that lie within the allowed distance of numbers already filled to the board. An example: if nr. 2 if filled to the top left field (position 0/0), nr. 2 can only be placed within vertical, horizontal and diagonal distance 1 of field 0/0, nr. 3 within distance 2 etc. Hence, this boils down to a simple (but pretty effective in reducing computation complexity) series of distance checks, in which each number X gets checked against each number Y already filled to the board. These checks are only performed if `abs(X-D) < max(board_height, board_width) - 2`

(otherwise the numbers can be placed on the board without any restrictions caused by each other, as the board is too small for such). The core of such a distance check is rather simple: nr. X can be placed to a certain y/x position on the board, if the horizontal and the vertical distance to the position of nr. Y is at max. abs(X-Y).

## Branch cutting and its requirements

In our search tree we need to detect dead ends early (boards that cannot yield a valid solution anymore) in order to exclude them from further search, hence speeding up the complete search.

With keeping track of possible placements of numbers to fields on the board using self.nrs_placement_possibilities, we can do 2 types of checks:

- for each number left to be placed on the board: is there still at least one field left that number can be placed on?
- for each field still empty: is there at least one number left that can be placed on that field?

If at least one is not fulfilled: the board is invalid an can be excluded from further search.

We also use field connection stats (self.field_connection_status) within branch cutting. Fields initially have 0 connections. If the field is filled with a number, for each of the preceding and succeeding number present in its neighborhood (the adjacend fields, maximum 8), the number of connections goes up by one – so the maximum amount of connections per field is 2. Using these field connection stats we can check:

- if the field is filled: does it have its preceding and succeeding numbers placed in its neighborhood, or enough free fields in the neighborhood for them to be placed there in the future? If not the board is invalid.
- if the field is not filled: does it allow to be a connection for any pair of numbers in its neighborhood? Can it still be reached (entered and exited) from the outside? E.g. an empty field with fully filled neighborhood can only be reached if there can a number be placed on that field that has its preceding and succeeding number already filled to the neighborhood (or a free field in the neighborhood in its place). This essentially boils down to neighborhood number permutation check: if we don’t find a valid (still possible) permutation of numbers in the neighborhood for which we can fill a still available and valid number to that field, the board is invalid.

## Successor generation

For the solved board the order of how fields got filling does not matter. Therefore, within our depth first search, we do not care for the search path, but only for the obtained board state (we do not care how a board could have evolved to it’s current state).

The question is: how to generate the required subset of successors (as this will influence the future search complexity)? We only generate successors from filling a single field on the board (with all numbers that can be filled to it) – as we can generate all possible board states from this, without searching multiple search paths (which all lead to the same results). As field to be filled we chose the one with the least possible amount N of numbers that can be filled to this field – and generate one successor per number possible. If there is a number that can be placed only on an amount A of fields that is smaller than N, we instead fill this one number to all those fields and generate one successor per filled field (as we only obtain A instead of N successor nodes then). This way we minimize the amount of branches and keep the overall branching factor low – something a rational human player would strive for as well. Note that we do not miss any possible solutions as a result of this limited successor generation. As a side effect, this also results in basic board simplification: if there is a field that only one number can be filled to, it will be filled next – and generate a single successor only, which is usually called simplification. The same is true if we have a number that can only be filled to a single field.

## Implementation

# Hidoku solver implementation using depth first search, branch cutting, intelligent successor generation and a little simplification. # Rainhard Findling # 2015/04 # # possible improvements: # - use a path finding algorithm, e.g. A* + manhattan distance instead of using max(y_dist,x_dist) to determine possible minimum distances. # import numpy import itertools from copy import copy # STATICS EMPTY = -1 def main(): """gets executed if in __main__ (see bottom of file).""" # parameters board_file = '../boards/10x10_2.txt' # load board, initialize it, print some statistics board = Board(read_board(board_file)) board.initialize() print board print 'nrs left', board.nrs_left_to_place print 'nrs_placement_possibilities', len(board.nrs_placement_possibilities) tmp = board.placements_splitted_per_nr() print 'placements per nr', map(lambda t:(t,tmp[t][0]), tmp.keys()), '\n' board.is_valid() # depth first search list_open = [board] solutions = [] nodes_opened = 0 nr_of_successors = [] while(len(list_open) > 0): cur = list_open.pop() nodes_opened = nodes_opened + 1 print 'looking at', cur print 'list_open', len(list_open) successors = cur.successors() if(len(successors)==0): print 'all successors invalid, reverting.' for s in successors: nr_of_successors = nr_of_successors + [len(successors)] if s.is_solved(): solutions = solutions + [s] #break # enable to quit search after the first solution, disable to search the complete search space else: list_open = list_open + [s] else: continue break print 'done.' if len(solutions) == 0: print 'no solutions found.' else: print 'solution paths:', map(lambda s:s.solution_path(), solutions) print 'solutions:', solutions print 'solutions', len(solutions) print 'nodes_opened', nodes_opened print 'nr_of_successors', sum(nr_of_successors) print 'avg branching (not including reduction by simplification-drive-successor-generation)', sum(nr_of_successors)/float(len(nr_of_successors)) class Board: def __init__(self, nparray = None): self.nparray = nparray # the gameboard as board self.nrs_yx_dict = None # the gameboard as dict of nrs and corresponding y,x positions self.nr_range = None # range of numbers to be contained in board self.nrs_left_to_place = None # numbers left to be placed on the board self.nrs_placement_possibilities = None # list of possible placement possibilities self.field_connection_status = None # how many connections fields have left [0,2] self.parent = None def __str__(self): """str representation of nparray board.""" sep = ' ' top1 = '===' + ''.join(['====' for _ in range(len(self.nparray[0]))]) top2 = '===' + ''.join(['==' for _ in range(len(self.nparray[0]))]) s = '\n' + top1 + sep + top2 + '\n' for row_nr in range(len(self.nparray)): row = self.nparray[row_nr,:] s = s + '| ' + ' '.join(map(lambda x:'___' if x == EMPTY else '{:3.0f}'.format(x), row)) + ' |' fillings = self.field_connection_status[row_nr,:] s = s + sep + '| ' + ' '.join(map(lambda x:'x' if x == 0 else '.' if x == 1 else ' ', fillings)) + ' |\n' s = s + top1 + sep + top2 + '\n' return s def __copy__(self): other = Board() other.nparray = copy(self.nparray) other.nrs_yx_dict = copy(self.nrs_yx_dict) other.nr_range = self.nr_range # no need to copy other.nrs_left_to_place = copy(self.nrs_left_to_place) other.nrs_placement_possibilities = copy(self.nrs_placement_possibilities) other.field_connection_status = copy(self.field_connection_status) return other def __repr__(self): return self.__str__() def __eq__(self, other): return (self.nparray == other.nparray).all() def __ne__(self, other): return not self.__eq__(other) def initialize(self): """initialize the board. meant to be called on first board in search tree only.""" self.nr_range = range(1, len(self.nparray) * len(self.nparray[0])+1) self.nrs_left_to_place = self.generate_nrs_left_to_place() self.nrs_yx_dict = self.generate_nrs_yx_dict() self.nrs_placement_possibilities = self.generate_nrs_placement_possibilities() self.field_connection_status = self.generate_all_field_connection_status() def generate_nrs_yx_dict(self): """generate yx position index of nrs.""" nrs_yx_dict = {} for y in range(len(self.nparray)): for x in range(len(self.nparray[0])): if not self.nparray[y,x] == EMPTY: nrs_yx_dict[self.nparray[y,x]] = (y,x) return nrs_yx_dict def generate_nrs_placement_possibilities(self): """1 huge list containing tuples about which nrs are placeable on which fields.""" nr_placements = [] for nr in self.nrs_left_to_place: for y in range(len(self.nparray)): for x in range(len(self.nparray[0])): if(self.can_nr_be_placed_on_field(nr,y,x)): nr_placements = nr_placements + [(nr,y,x)] return nr_placements def generate_all_field_connection_status(self): """for all filled fields: generate field connection status.""" field_connection_status = numpy.ones([len(self.nparray), len(self.nparray[0])], int) * 2 for y in range(len(self.nparray)): for x in range(len(self.nparray[0])): field_connection_status[y,x] = self.generate_field_connection_status(y,x) return field_connection_status def generate_field_connection_status(self,y,x): """generate field connection status for the given nr on y and x pos.""" neighborhood = self.get_neighborhood_nrs(y,x) nr = self.nparray[y,x] dist = map(lambda n:abs(nr-n), neighborhood) field_connection_status = 2 - dist.count(1) if nr == self.nr_range[0] or nr == self.nr_range[-1]: field_connection_status = field_connection_status - 1 return field_connection_status def fill_nr_to_field(self, nr, y, x): """fill given nr to given field.""" if not nr in self.nrs_left_to_place: raise Exception('cannot place a nr not left to place.') self.nrs_left_to_place.remove(nr) if self.nparray[y,x] != EMPTY: raise Exception('cannot place a nr on a non empty field.') self.nparray[y,x] = nr self.nrs_yx_dict[nr] = (y,x) self.field_connection_status[y,x] = self.generate_field_connection_status(y,x) for (iy,ix) in self.get_neighborhood_pos(y,x): self.field_connection_status[iy,ix] = self.generate_field_connection_status(iy,ix) self.nrs_placement_possibilities = filter(lambda p:self.can_nr_be_placed_on_field(*p), self.nrs_placement_possibilities) def placements_splitted_per_nr(self): """split placement possibilities per nr.""" splitted = {} for nr in self.nrs_left_to_place: for_nr = filter(lambda p:p[0]==nr, self.nrs_placement_possibilities) splitted[nr] = (len(for_nr), for_nr) return splitted def can_nr_be_placed_on_field(self,nr,y,x): """check if given nr is possible on y,x pos in field.""" # only possible if empty if not self.nparray[y,x] == EMPTY: return False # only possible if nr still available if not nr in self.nrs_left_to_place: return False # check that nr is within range of other nrs on board return self.check_distance_to_other_nrs_ok(nr,y,x) def check_distance_to_other_nrs_ok(self,nr,y,x): """check if given nr on the given y,x is within range of other nrs on the board.""" # -2 because -1 is max possible distance, therefore no cutoff effect d_max = max(len(self.nparray), len(self.nparray[0])) - 2 for nr2 in range(nr - d_max, nr + d_max + 1): if nr == nr2 or not nr2 in self.nrs_yx_dict: continue pos2 = self.nrs_yx_dict[nr2] # ensure that nr and nr2 are within range d = self.distance((y,x),pos2) if d > abs(nr-nr2): # the distance between the nrs is bigger than the difference between nrs - impossible to place here #print 'dist too big (nr, nr2, y, x, dist):', nr, nr2, y, x, abs(nr-nr2) return False return True def generate_nrs_left_to_place(self): """generate list of nrs still to placed on the field.""" nrs = [] for nr in self.nr_range: if not nr in self.nparray: nrs = nrs + [nr] return nrs def get_neighborhood_pos(self,y,x): """tuples of (y,x) pos of neighborhood.""" yx = [] for iy in range(max(y-1,0), min(y+2, len(self.nparray))): for ix in range(max(x-1,0), min(x+2, len(self.nparray[0]))): if not (y == iy and x == ix): yx = yx + [(iy,ix)] return yx def get_neighborhood_nrs(self,y,x): """get all fields that surround the specified field.""" n = [] for (iy,ix) in self.get_neighborhood_pos(y,x): n = n + [self.nparray[iy,ix]] return n def neigbors_not_fully_connected(self, y, x): """get neighbors to the given pos that are not yet fully connected.""" neighborhood = self.get_neighborhood_nrs(y,x) not_fully_connected = [] for n in neighborhood: if n == EMPTY: not_fully_connected = not_fully_connected + [n] else: (ny,nx) = self.nrs_yx_dict[n] if self.field_connection_status[ny,nx] > 0: not_fully_connected = not_fully_connected + [n] return not_fully_connected def check_empty_field_neighbor_permutations_ok(self, y, x): """check if the field can be reached from it's neighbors permutations.""" # attention: this does not handle first and last nr if they are not initially placed on the board if(self.nparray[y,x] != EMPTY): raise Exception('trying to check a non empty field with the exhausting empty-field check...') not_fully_connected = self.neigbors_not_fully_connected(y, x) amount_empty = not_fully_connected.count(-1) if(amount_empty >= 2): # we still have empty entrance and exit, too much effort to check return True if(amount_empty == 1): # if only 1 possibility: plugin + redo check possible_fillings1 = set(reduce(lambda a,b:a if b == EMPTY else a + [b-1,b+1], not_fully_connected, [])) possible_fillings2 = filter(lambda x:x in self.nrs_left_to_place, possible_fillings1) if(len(possible_fillings2) > 1): # more than 1 possibility, still valid return True if(len(possible_fillings2)==1): # only this one possibility left! plugin, recheck self.fill_nr_to_field(possible_fillings2[0], y, x) print 'only one possibility left for field (y,x,nr)', y, x, possible_fillings2[0] return self.is_valid() else: if len(not_fully_connected) < 2: # impossible to connect if we have only 1 connectable field #print 'invalid, as permutations invalid', y, x return False permutations = list(itertools.combinations(not_fully_connected, 2)) diff2 = map(lambda t:(abs(t[0]-t[1]), (t[0]+t[1])/2), permutations) # permutations where diff2 is 2 are valid if the nr between them is still placeable. if there is only one: only possibility for that field. if there is none board is invalid diff2_valid = filter(lambda t:t[0] == 2, diff2) # check which nrs are still available to place diff2_available = filter(lambda t:t[1] in self.nrs_left_to_place, diff2_valid) if(len(diff2_available) > 1): # more than 1 possibility, still valid return True if(len(diff2_available) == 1): # only this one possibility left! plugin, recheck print self print 'only one possibility left for field (y,x,nr)', y, x, diff2_available[0][1] self.fill_nr_to_field(diff2_available[0][1], y, x) return self.is_valid() #print 'invalid, as permutations invalid', y, x return False def check_filled_field_valid_connections(self,y,x): """check that the given field can still be reached by means of predecessing and successing nr.""" nr = self.nparray[y,x] if(nr == EMPTY): raise Exception('trying to check an empty field with too inexact filled field check...') neighborhood = self.get_neighborhood_nrs(y,x) nr_frees = neighborhood.count(EMPTY) #print 'frees', nr_frees connected_to_me = [abs(n-nr) for n in neighborhood].count(1) # empty do not count as the will not result in 1 distance as nr 0 does not exist #print 'connected_to_me', connected_to_me need = 2 if(nr == self.nr_range[0] or nr == self.nr_range[-1]): need = 1 if(nr_frees + connected_to_me >= need): return True #print 'invalid, as too less connections filled field (nr, y, x)', nr, y, x return False def successors(self): """generate successors of.""" # nr with least placement possibilities placements_splitted_per_nr = self.placements_splitted_per_nr() p_amount = map(lambda t:placements_splitted_per_nr[t][0], placements_splitted_per_nr.keys()) #print 'p_amount', p_amount i_min = p_amount.index(min(p_amount)) #print 'i_min', i_min nr = placements_splitted_per_nr.keys()[i_min] print 'placing nr', nr # placement possibilities for nr # (nr,y,x) nr_possibilities = filter(lambda t:t[0] == nr, self.nrs_placement_possibilities) print 'possibilities to place (nr, len(p), p)', nr, len(nr_possibilities), nr_possibilities # sanity check: would there be a field that has only 1 nr to be filled with? yx_amount = {} for p in self.nrs_placement_possibilities: key = (p[1],p[2]) if not p in yx_amount: yx_amount[key] = [p] else: yx_amount[key] = yx_amount[key] + [p] yx_ones = filter(lambda t:yx_amount[t] == 1, yx_amount.keys()) if(len(nr_possibilities) > 1 and len(yx_ones) > 1): raise Exception('there is a field to fill with one nr instead. implement generating that successor therefore.') successors = [] for p in nr_possibilities: suc = copy(self) suc.parent = self suc.fill_nr_to_field(*p) if suc.is_valid(): successors = successors + [suc] return successors def solution_path(self): """get solution path up to this point.""" l = [self] cur = self while(not cur.parent is None): l = [cur.parent] + l cur = cur.parent return l def distance_nrs(self, nr1, nr2): """get distance between nr1 and nr2 on board or None if at least one of the nrs is not present.""" if not nr1 in self.nrs_yx_dict or not nr2 in self.nrs_yx_dict: return None pos1 = self.nrs_yx_dict[nr1] pos2 = self.nrs_yx_dict[nr2] return self.distance(pos1, pos2) def distance(self, pos1, pos2): """distance between pos1 and pos2.""" return max(abs(pos1[0]-pos2[0]), abs(pos1[1]-pos2[1])) def is_valid(self): """check if board is still valid.""" #check if each nr can be placed for nr in self.nrs_left_to_place: if len(filter(lambda t:t[0] == nr, self.nrs_placement_possibilities)) == 0: #print 'invalid as nr', nr, 'cannot be placed.' return False # check that each field if filled is connected or connectable and if not set has permutations of neighbors that allow it to be filled for y in range(len(self.nparray)): for x in range(len(self.nparray[0])): if self.nparray[y,x] == EMPTY: if not self.check_empty_field_neighbor_permutations_ok(y,x): return False else: if not self.check_filled_field_valid_connections(y,x): return False return True def is_solved(self): """checks if the field is solved.""" if len(self.nrs_left_to_place) == 0: # sanity checks if EMPTY in self.nparray: raise Exception('no numbers left to place, but still empty fields?') return True return False def read_board(filepath): """read board from a textfile.""" f = open(filepath, 'r') rows = [] w = -1 # same width for all lines for line in f.read().splitlines(): values = line.split(',') if w == -1: w = len(values) elif w != len(values): raise Exception('nr. of columns differ amongst rows.') values = map(lambda x:EMPTY if '_' in x else int(x), values) rows = rows = rows + [values] f.close() return numpy.array(rows) if __name__ == "__main__": main()

When calling the solver on the sample board from above, it finds and prints the single possible solution (as well as the path how to get there, which we don’t show here for space reasons):

[...] solutions: [ =========================================== ======================= | 91 92 93 20 21 36 37 38 39 57 | | x x x x x x x x x x | | 90 94 19 22 35 53 54 40 56 58 | | x x x x x x x x x x | | 89 95 18 23 52 34 100 55 41 59 | | x x x x x x x x x x | | 88 17 96 51 24 99 33 61 60 42 | | x x x x x x x x x x | | 16 87 50 97 98 25 62 32 43 30 | | x x x x x x x x x x | | 15 6 86 49 48 63 26 44 31 29 | | x x x x x x x x x x | | 14 5 7 85 64 47 45 27 28 78 | | x x x x x x x x x x | | 4 13 8 65 84 70 46 72 77 79 | | x x x x x x x x x x | | 3 9 12 66 69 83 71 73 80 76 | | x x x x x x x x x x | | 1 2 10 11 67 68 82 81 74 75 | | x x x x x x x x x x | =========================================== ======================= ] solutions 1 nodes_opened 1220 nr_of_successors 1646 avg branching (not including reduction by simplification-driven successor generation) 1.3491803

## Branching factor and search speed

Limited successor generation and branch cutting greatly influence the branching factor, therefore the overall runtime. With enabling both, you can see that the branching factor is around 1.35, and that the complete search space can successfully be searched. For me, disabling both resulted in the solver not even being able to find the solution in feasible time – but just try and see for yourself! 😉

## Draught board puzzle / checkerboard puzzle solver in Python

The checkerboard puzzle or draught board puzzle (also called Krazee Checkerboard Puzzle, Banzee Island checkerboard puzzle, Zebas puzzle, etc.) is a mutilated chessboard problem, which further is a tiling puzzle/dissection puzzle. Hence, the core problem is similar to the one of solving the well known Tangram, which some of you might be familiar with. The checkerboard puzzle we are looking at in this post is a 8×8 chess board imitation with black and white fields. It is split into 12 stones (left figure). The puzzle goal is to obtain at least one valid solution (right figure) – which is done by our solver. Although we are discussion this specific 8×8 board, the solver is designed to work with any board size (even with unequal width and height).

## Checkerboard puzzle representation

The stones the board consists of are stated in a text file, which the solver loads at the start. In the text file, stones are represented as an 2D array:

- B represents a black field
- W represents a white field
- _ represents an “empty” field (the stone does not cover that field)
- === represents separator between stones

B,_,_ W,B,W _,_,B ============ W,B,W,B _,W,_,_ ============ _,W,B,_ W,B,W,B _,W,B,_ ============ B,W,_,_ _,B,W,B ============ B,W,B,W W,_,_,_ ============ W,B,W,B B,_,_,_ ============ B,W _,B _,W ============ W,B,W B,W,_ W,_,_ ============ B,W,B,W _,B,_,_ ============ W,B,_,_ _,W,B,W ============ B,W,B,W _,B,W,_ ============ B,_,_ W,B,W _,_,B ============

Within the solver, stones are represented as an 2D array as well:

- -1 represents a black field
- 1 represent a white field
- 0 represents an empty (not covered) field

As each stone can be rotated and flipped (“mirrored”), that stone can be put to the board in one of at maximum 8 versions (which we call “rotations” from now on). When loading stones, the solver generates and stores all different rotations for each stone for speedup.

The gameboard is represented as an 2D array as well:

- -1 is a black field, hence to fill with the black part of a stone
- 1 is a white white, hence needs a white part of a stone

This representation gives us the advantage of putting stones to the board fast (and checking, if that is possible in the first place). To check if a stone can be put to a specific xy-position on the board, we use a pointwise array sum. Putting a black part of a stone to a black field on the board results in a sum of -2, putting a white one to a white field results in 2. Instead, putting a black one to a white field or a white one to a black field results in a sum of 0. If a stone does not cover a certain field, the sum will be -1 or 1 (depending on if the field on the board is black or white). Hence, when checking if a stone can be placed on the board, -2 and 2 as well as -1 and 1 are valid sums, while 0 is an invalid sum.

We also have to account for other stones already placed on the board. When we put a stone to the board we change newly covered black fields to -10 and newly covered white fields to 10. This way we ensure that putting a stone ontop another one results in different sums. We consequently have to extend valid sums to also include -10 and 10, and invalid sums -11, 11, -9, 9 (would be the result of trying to put a stone ontop another).

## Solving the checkerboard puzzle

As we want to obtain all possible solutions our solver is designed as depth first search with branch cutting and limited successor generation. If a solution is found it gets appended to a solution file – and the search goes on. The search terminates if we have checked all possible states. We further do some inter-state caching of data for speedup.

### Successor generation

We do not generate all possible successors, but only those of placing a single stones to the field, in all rotations and to all possible positions. This avoids arriving at same successor states by putting stone A and B in different order, hence reduces branching. Consequently, we have to decide on the one stone to put to the board. Imagine that for one stone there is only one placement possibility left on the board. A human player would always put that stone to the board immediately – which is a good idea, as it further reduces branching. Our search does the same by always putting the stone to the board with least placement possibilities left.

### Branch cutting

We do branch cutting using a validity check. A failed validity check indicates that a board cannot yield a valid solution any more, hence it gets excluded from further search (which means none of it successors get generated or looked at, which cuts the complete branch). This check is the main cause for decreasing the branching factor – hence it’s OK if a noteworthy part of computations is done for it (anyway the implementation is done in a way that the validity check itself is quick). The validity check ensures:

- Each not yet placed stone must have at least one position on the board left where it could be placed
- Each field must have at least one stone that could still cover that field (based on stone placements still possible with the board)

### Caching for speedup

Naturally, each board contains information about:

- Which stones have been placed where so far (as array for faster access to the board, as well as list of placed stones, their rotation and position)
- Which stones are left to be placed

Additionally, each board also contains:

- For all stones left to be placed on the board: where they could possibly be placed
- For all yet empty fields on the board: all stones left to be placed that could possibly fill this field

For the latter two, all posibilities are generated at the game start. Afterwards, it is only necessary to update them, depending on which stones are put where to the board.

## Implementation

The implementation source code is available in this repository: https://github.com/pirius/draught-board-puzzle-aka-checkerboard-puzzle-solver

When called on our sample board, the solver finds multiple solutions for our 8×8 board, e.g. the following one:

=================================== | 7B 11W 11B 11W 2B 3W 3B 5W | | 7W 11B 2W 2B 2W 3B 5W 5B | | 7B 7W 2B 10W 3B 3W 5B 5W | | 4W 7B 6W 10B 1W 1B 1W 5B | | 4B 6W 6B 10W 10B 1W 1B 8W | | 4W 4B 6W 10B 0W 0B 1W 8B | | 9B 4W 6B 0W 0B 0W 0B 8W | | 9W 9B 9W 9B 0W 0B 8W 8B | ===================================

### Findings

It’s not easy to estimate the branching factor for this checkerboard puzzle. Depending on which stone is chosen to be the one to place on the board, the initial branching can range up to above 300 possibilities (8×8 field and 2×3 stone). Which each successive stone, the branching factor gets smaller, as less possibilities to place stones remain. For an 8×8 board, without immediately excluding successors eliminated by the branch cutting validity check, the measured average approximation of the branching factor is in the range of 6.5. This includes all stones placed to the board within search as well as invalid successor states, which would be removed by validity checks immediately. If excluding successors failing validity checks, the measured average approximation of the branching factor is reduced to about 1.2 – which is a huge gain (hence, the validity check is really worth its time).

## Magic number board puzzle solver in Python

The magic number board puzzle is yet another equation puzzle (similar to the first equation puzzle or Jodici) – but has a notable larger solution space.

The magic number board is a quadratic board, consisting of 5×5 = 25 fields, which are organized as 5 horizontal rows and 5 vertical columns. Each field should contain a number in the range [5,29], which each number being used exactly once in the board. Each row, column and both diagonals should sum to 85 exactly. Some numbers are already given at start – the task is to fill in all remaining fields while sticking to the rules.

## Solving the magic number board puzzle

The solution space is notably larger than with the equation puzzle or Jodici. Given 8 initially filled fields we have 17 yet empty fields, for which we have 17! solutions (3.5*10^14 solutions or ~48bit entropy). Pure brute force would of course find the solution, but would require an arguable amount of time to do so with typical PCs or Laptops. Therefore, we are speeding up the process by using best first search and – more important – incorporating the following optimizations.

### Successor generation (branching factor reduction)

When adding a number to the board, we don’t need to generate all successors (e.g. place every possible number in every still empty field). We know that we can still reach all possible solutions in our search tree, if we just fill in all possible numbers to a single field (generating multiple successors having this field filled, but all other previously empty field still being empty). Doing so causes us to skip different orders of filling numbers to fields (e.g. first filling 1 to field A, then 2 to field B, or first filling 2 to field B, then 1 to field A – which would result in the same state). Consequently, this reduces the branching factor and therefore the runtime. Given this optimization idea, the question is: how to determine the one field we’re going to fill numbers into? Which his brings us to:

### Order of fields to be filled (heuristic search direction)

A human player would try to fill numbers to those groups that already contain many numbers. Why? Because usually, there are least possibilities of numbers to be filled into these groups (which again causes a lower branching factor, therefore less complexity to deal with for the human player). Hence, we always fill a field in the group with the least fields still empty (excluding completely filled groups of course). An example of using such an order: imagine we have a group which already contains 4 numbers. There is (at max) one possibility for the fifth number, as the five numbers have to sum to 85. A human player would immediately fill in that number – a behaviour we should also strive for in our approach.

### Validity check – detecting dead ends early (branch cutting)

The third thought is about evaluating if a board is already invalid, which correlates to branch cutting in search. As we fill numbers into the board it’s very important to find out as soon as possible, if the board cannot yield a valid solution any more (therefore all possible successors can be excluded from search, which cuts parts of search tree, and which saves us lots of resources). An example for such using such checks for groups containing 4 numbers: with the the fifth number all numbers of that group have to sum to 85. In case the number required to be filled into the field yet empty for the sum to be 85 is already used elsewhere, the current board cannot yield a valid solution any more – therefore the board can be excluded from search. We don’t only check groups already containing 4 numbers, but also groups containing 3 numbers. For these groups we try to fill in all permutations of numbers still available and check, if with at least one combination the group sums to 85. If this is the case, the group is counted as valid. If no combination allows for a sum of 85, the board is already invalid and we exclude it from search. Note: using permutations of numbers still available, conceptually it would be possible to check for groups with any amount of already filled in numbers (e.g. 2, 1 or even empty groups). The drawback is that the amount of permutations that need to be checked increases exponentially with the amount of fields of the group still being empty. Consequently, the gain of recognizing invalid boards earlier is easily eaten up by the additional complexity of generating and checking permutations.

## Python implementation

# Magic number square solver implementation. The magic number square is a 5x5 number board in which each field must contain a nr [5,25]. Some numbers are given at the start, all others have to be found. In the end, each nr must be used once and all rows, columns and both diagonals have to sum to 85. # Rainhard Findling # 2014/11 # # Core concepts: # # Branch cutting: we look at each generated board: for each row (incl. row, col, diagonal) the remaining numbers must contain at least one permutation with which the row sums up exactly to 85, otherwise the board can only create invalid solutions and is skipped in future search. # # Heuristic (search direction): we prefer boards which are more filled over boards not so filled. We futher prefer rows with more filled cells over rows with less filled cells. In fact, we use a monotinic metric: 1 row with N filled cells is always worth more than [MAX] rows which have N-1 cell filled less. Therefore, we always try to fill the row with already has the most filled cells (except of completely filled rows). # # Successor generation: of a board only those successors get created that fill a number to the row with most already filled cells (excluding completely filled rows). This is intuitive, but it seems it would not be necessary as we do heuristic sorting whcih would prefer "more filled" boards always. But it's also required for another reason: it is important in combination with branch cutting (it's one solution to the problem that would emerge otherwise). For a board with a row where only 1 cell is not filled yet, that cell could remain unfilled (even if we have a heuristic that would enforce this cell to be filled next) - simply because filling the cell makes the row valid, but there are no valid successors of the resulting board possible anymore. This would lead to the irrational behavior of trying this cell and discarding all successors, then trying the other candidates from the list which still have this one cell free, therefore leaving this cell free (for getting a still high bonus for having one nearly filled row) and trying to fill cells elsewhere. The approach of only filling the row with already highest filling status - except completely filled rows - ensures that there are no states that allow for "leaving out because resulting successors are invalid" scenarios. # import itertools import copy import numpy global_sum = 85 global_nrs_range = range(5, 30) global_branchcutting_min_amount_of_filled_cells_to_perform_permutation_check = 3 global_heuristics_row_fill_stat_hist_weights = [12**x for x in range(6)] # why? because with 5*5 we have 12 rows to validate in total. therefore, if we want to favor boards that have more filled or nearly filled rows, our base b for hist**b has to be 12 for the metric order to be monotone. why 6 weights if we have 5*5? because we weight the '0-fills' too. _ = '__' # placeholder for empty fields class Cell(): """representation of a cells in the gameboard.""" def __init__(self, nr=_): self.nr = nr #if(self.nr != _): #self.available_nrs = [] #else: #self.available_nrs = global_nrs_range def __str__(self): return '%2s' % str(self.nr) def str_verbose(self): """str with more information about this cell than __str__.""" return self.__str__() + ' (' + ','.join([ str(x) for x in self.available_nrs]) + ')' def __repr__(self): return self.__str__() def __deepcopy(self, memo): return Cell(self.nr) class Board(): """Representation of a gameboard holding all numbers + helper infos for solving it.""" def __init__(self, gameboard, parent, available_nrs=None): self.gameboard = gameboard self.parent = parent if(available_nrs != None): self.available_nrs = available_nrs else: self.available_nrs = self.generate_available_nrs_from_gameboard() def generate_available_nrs_from_gameboard(self): """derive still available numbers from gameboard.""" return [x for x in global_nrs_range if not x in map(lambda x:x.nr, self.all_cells_as_vector())] def all_cells_as_vector(self): return reduce(lambda x,y:x+y, self.gameboard) def __str__(self): return self.str_gameboard(self.gameboard) def str_gameboard(self, gameboard): """string representation of a gameboard = board.""" s = '\n================\n' for row in gameboard: s += '|' + ' '.join([x.__str__() for x in row]) + '|\n' s += '================\n' return s def __repr__(self): return self.__str__() def __deepcopy__(self, memo): # we do not/cannot deepcopy parent! return Board(copy.deepcopy(self.gameboard), self.parent, copy.deepcopy(self.available_nrs)) def __eq__(self, other): if other == None: return False for self_row, other_row in zip(self.gameboard, other.gameboard): for self_cell, other_cell in zip(self_row, other_row): if not self_cell.nr == other_cell.nr: return False return True def __cmp__(self, other): # __cmp__ should return a negative integer if self < other, zero if self == other, and positive if self > other if self.heuristic() < other.heuristic(): return -1 if self.heuristic() == other.heuristic(): return 0 else: return 1 def copy_transposed(self): """get a transposed copy of the gameboard: rows become columns, columns become rows.""" return [[row[nr] for row in self.gameboard] for nr in range(len(self.gameboard))] def diagonal_topleft_bottomright(self): """get the topleft to bottomright diagonal 'row'""" return [self.gameboard[nr][nr] for nr in range(len(self.gameboard))] def diagonal_topright_bottomleft(self): """get the topright to bottomleft diagonal 'row'""" return [self.gameboard[len(self.gameboard)-nr-1][nr] for nr in range(len(self.gameboard))] def get_all_groups(self): """returns all groups (rows, cols, diagonals) that have to fulfill certain criteria.""" return self.gameboard + self.copy_transposed() + [self.diagonal_topleft_bottomright()] + [self.diagonal_topright_bottomleft()] def is_valid(self): """checks if this gameboard is valid / can still be solved in a valid way.""" for row in self.get_all_groups(): free = len(filter(lambda cell:cell.nr == _, row)) #print 'free', free if(len(self.gameboard) - free < global_branchcutting_min_amount_of_filled_cells_to_perform_permutation_check): # too many unfilled cells, we don't check if permutations can still make this row valid #print 'skipping row with', str(free), 'free cells during checks' continue if(free == 0): # is already filled. sanity check: is it's filled in valid way? if(reduce(lambda x,y:x+y.nr, row, 0) != global_sum): return False #raise Exception('Completely filled row does not sum to ' + str(global_sum) + ': ' + str(row)) continue cur_sum = reduce(lambda x,y:x + (y.nr if y.nr != _ else 0), row, 0) # sum of nrs currently in this row # check that from nrs still available it's still possible to get this row to the target sum sum_remainder = global_sum - cur_sum # does any combination of available nrs still fulfill this criteria? permutations = itertools.permutations(self.available_nrs, free) if not any([sum(x) == sum_remainder for x in permutations]): return False return True def is_filled(self): """checks if all fields are filled. valid + filled therefore means solved.""" for row in self.gameboard: for cell in row: if cell.nr == _: return False return True def generate_successors(self): """generate all possible successor states of this board.""" successors = [] counts = [len(filter(lambda cell:cell.nr != _, row)) for row in self.get_all_groups()] row_nr = counts.index(max(filter(lambda x:x<len(self.gameboard),set(counts)))) #for row_nr in range(len(self.gameboard)): row = self.get_all_groups()[row_nr] for cell_nr in range(len(row)): cell = row[cell_nr] if cell.nr == _: # this cell can be set as next step for nr in self.available_nrs: # generate clone suc = copy.deepcopy(self) # modify to show new state suc.available_nrs.remove(nr) suc.get_all_groups()[row_nr][cell_nr].nr = nr suc.parent = self if suc.is_valid(): successors.append(suc) return successors def fill_stat_hist(self): """histogram of how row fill states: how much rows exist with have 0, 1, 2 etc. filled cells.""" counts = [len(filter(lambda cell:cell.nr != _, row)) for row in self.get_all_groups()] #fill_stat_hist = [len(list(group)) for key,group in itertools.groupby(counts)] fill_stat_hist = [len(filter(lambda x:x==i, counts)) for i in range(len(self.gameboard)+1)] return fill_stat_hist def fill_stat_hist_weighted(self): """the same hist weighted by how much we think it's worth to have more of 'such' row fill states.""" return map(lambda x:x[0]*x[1], zip(self.fill_stat_hist(), global_heuristics_row_fill_stat_hist_weights)) def heuristic(self): """heuristic of how good this field is. bigger means better.""" return sum(self.fill_stat_hist_weighted()) def path(self): """evolution chain from beginning to this state - for getting detailled insight.""" return [self] if self.parent == None else self.parent.path() + [self] def generate_board(board): return Board([[Cell(x) for x in row] for row in board], None) # generate board b = generate_board([ [ _, _, _, _,15], [ _, _,25,17, _], [ _,26, _, 5, _], [ _,18, _, _,11], [ _, _,21, _, _]]) print b # best first search nodes = [b] closed = [] solutions = [] while(len(nodes) > 0): # sort: bigger heuristic = best = last in list nodes.sort() node = nodes.pop() closed.append(node) # print 'current board:', node #print 'nodes remaining in list:', len(nodes) #print 'fill_stat_hist:', str(node.fill_stat_hist()) #print 'fill_stat_hist weighted:', str(node.fill_stat_hist_weighted()) # print 'solutions:', len(solutions) # produce some kiddies successors = node.generate_successors() suc_finished = filter(lambda suc:suc.is_filled(), successors) if(len(suc_finished) > 0): #print 'solution(s):', map(lambda x:x.path(), suc_finished) solutions += suc_finished print 'solutions:', len(solutions) #break suc_unfinished = filter(lambda suc:not suc.is_filled(), successors) suc_to_add = filter(lambda suc:not suc in nodes and not suc in closed, suc_unfinished) nodes += suc_to_add print map(lambda x:x.path(), solutions) print 'found solutions:', len(solutions) print 'finished.'

When running the solver on the magic number board example from above, we get presented the single possible, valid solution (along with the steps it took to get there, which are not shown here for layouting reasons ):

================ |__ __ __ __ 15| |__ __ 25 17 __| |__ 26 __ 5 __| |__ 18 __ __ 11| |__ __ 21 __ __| ================ solutions: 1 [[ ================ |__ __ __ __ 15| |__ __ 25 17 __| |__ 26 __ 5 __| |__ 18 __ __ 11| |__ __ 21 __ __| ================ , [...] , ================ |28 24 6 12 15| |13 7 25 17 23| | 8 26 19 5 27| |20 18 14 22 11| |16 10 21 29 9| ================ ]] found solutions: 1 finished.