SSH Proxy: Server and Client Side of Using an SSH User Without Shell as Proxy Server

April 30, 2016 Leave a comment

Accessing services on an internal network over a proxy reachable from the Internet.

Imagine you want one of your machines to become a proxy for external users to be able to access local resources or the internet as if they were on that machine. You could run a dedicated proxy server for that, but if the machine provides SSH and you want an easy solution, you can use SSH as well – without risking any shell-related issues (like this user also having access to the local file system).

Why would you want stuff like that in the first place? That’s what proxy servers (and most frequently VPNs) are made for: to access country-, company-, or university-internal resources from outside of that network just as you would from the inside.

Server side

The server side is rather easy. Let’s assume we call that user “sshproxy”. The important thing is to not give your proxy user a shell, which we do here during creation of the user:

sudo useradd -m -d /home/sshproxy -s /bin/false sshproxy # create user and home directory, disable shell

The user’s password is disabled by default (you can check that there’s a ! in /etc/shadow for the sshproxy user). If you don’t want to use passwords but private-public keys (which I would recommend): in /home/sshproxy create the .ssh/ folder and the .ssh/authorized_keys file and ensure they’re readable for the sshproxy user. There you need to add the ssh public keys of people that should be allowed to use the ssh proxy. The cool thing about it is that different real life users can make use of the same sshproxy user: you just need to add to manage the keys of real life users for the sshproxy user. Further, you likely don’t want the sshproxy user to be able to change the authorized keys, therefore make the file read only for that user (e.g. root owned + writeable by owner only). If you have an AllowUsers section in your /etc/ssh/sshd_config, you need to add the sshproxy user there and restart ssh. That’s it on the server side.

User side

If you’re one of the users that need to generate their ssh keypair, you can have a look at:

https://help.ubuntu.com/community/SSH/OpenSSH/Keys#Generating_RSA_Keys
(very easy to understand)

https://stribika.github.io/2015/01/04/secure-secure-shell.html
(details that are good to know if you want to have more secure keys = not so easy to crack)

If you use RSA then use at least 2048 bit key length (4096 can’t hurt…), e.g. with ssh-keygen with the “-b 4096” parameter. If you’re on a Windows machine: you can do the same using e.g. Putty. Don’t forget to use a good password to protect your private key file (usually the “id_rsa” file) and never disclose it – servers only need your public key (usually “id_rsa.pub”).

Using the proxy via Linux shell

Everything’s built in, just use the following command.

ssh -D 8080 -N sshproxy@YOUR-IP-OR-URL -p YOUR-PORT

This command opens the local port 8080 (on the client machine) for proxy tasks. YOUR-PORT-URL and YOUR-PORT correspond to the SSH server running on the server machine. If everything worked fine you won’t get any response in the shell – just leave the terminal open. Make sure your private key file is in ~/.ssh/id_rsa or provide it explicitly with “-i”. If you can’t access the server try removing the “-N”. Then you should see that the server logs you in and out immediately (this means everything works fine from the SSH, proxy and tunnel side – if you add -N again, you should be fine therefore). You can check your local opened ports with nmap (“nmap localhost”). If port 8080 is not in the list before login and opened after login your tunnel works.

Using the proxy via Putty

  • address: YOUR-IP-OR-URL (where SSH server is running)
  • port: YOUR-PORT (where SSH server is running)
  • user: sshproxy
  • Enable the check box “Don’t start a shell or command at all” in “Connection-SSH”
  • Specify your private key in “Auth”
  • Tunnels: add a “dynamic tunnel” on “local port 8080”, leave destination open (should state “D8080” in Putty after you apply)

Sending data to your local port

As you now have a tunnel on port 8080 on your local machine open, you need to send request to this port, instead of your standard gateway. E.g. for your browser you can achieve that configuring the browser to use a proxy (e.g. Firefox: FoxyProxy AddOn: set address “localhost” and port “8080”).

Btw: you can do other cool things with that proxy as well, such as reverse port tunnelling, circumventing firewalls etc. But you should ensure that this is allowed in the company/country you’re in before you get yourself into troubles.

Repair PDFLaTeX generated pdf using GhostScript (Adobe Acrobat Reader error 131)

February 28, 2016 Leave a comment

I tend to do presentations using LaTeX and Beamer, while working on Linux and using TeXLive as LaTeX distribution – which all work fine. But sometimes I need to share these PDFLaTeX compiled presentations with people using Windows and Adobe Acrobat Reader as their pdf viewer. The feedback I usually get back: your pdf is broken, error 131. And frankly, that seems to be true.

Opening PDFLaTeX generated pdf files with Windows Adobe Acrobat Reader results in error 131 – more precisely in the displayed error message “There was a problem reading this document (131)”. Other pdf viewers don’t complain about the pdf, just Adobe Acrobat Reader. A quick solution to repair the pdf file is by using GhostScript:

gs -dSAFER -dBATCH -dNOPAUSE  -sDEVICE=pdfwrite -sOutputFile=output.pdf input.pdf</code></pre>

Voilà, the pdf file can be opened using Adobe Acrobat Reader – although, the source of the problem still exists of course (generating a pdf file not adhering the standard in the first place).

Git security: enabe fsckobjects in ~/.gitconfig:

February 3, 2016 Leave a comment

In order to prevent possible tampering with code in git repositories you work with (e.g. malicious manipulation of objects during clone, fetch, push…), check if these lines exist in your ~/.gitconfig and add them, if they don’t:

[transfer]
fsckobjects = true
[fetch]
fsckobjects = true
[receive]
fsckObjects = true

These enable git checking transferred objects for their integrity using their computed hashes.

Original idea from here: https://groups.google.com/forum/#!topic/binary-transparency/f-BI4o8HZW0
(and the corresponding bug on Debian here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813157)

Solving Logic Puzzles in Prolog: Puzzle 3 of 3

December 27, 2015 Leave a comment

In this post we solve another small logic puzzle in Prolog, similar to the previous two logic puzzles (these are available here and here).

The puzzle

There are four researchers: Ainsley, Madeline, Sophie and Theodore. The goal is to find out their sports competition discipline, birth year and research interests (while knowing that each of the mentioned attributes is different amongst them). In order so solve the puzzle, a couple of hints is provided from which the solution can be derived:

  1. The competition for Theodore or Ainsley is javelin.
  2. The researcher interested in hurricane research has relay as sports subject.
  3. Sophie is not the researcher who has relay as sports subject.
  4. Either the researcher who has longjump as sports subject or Theodore was born in 1933 – the other one was born in 1921.
  5. The researcher with relay as sports subject was born before Theodore.
  6. The researcher interested in wildfires was born after the researcher doing longjumps for sports.
  7. Neither Sophie nor Ainsley are researching wildfires.
  8. The researcher interested in earthquakes has either hurdle or javelin as sports interest.
  9. The researcher interested in earthquakes was born in 1933.
  10. Madeline was not born in 1921.

Solving the puzzle in Prolog

In our Prolog implementation we ask for a solution that a) ensures that all sports subjects, research interests and birth years exist once and b) fulfills the rules mentioned above:

% ####################################################################
% Logic puzzle solver.
%
% Read definition to prolog:
%   ['puzzle.pl'].
%
% 2) Ask for solutions:
%   solve(Ainsley, Madeline, Sophie, Theodore).
%
% Rainhard Findling
% 10/2015
% ####################################################################
all_members([H],L2) :- member(H,L2).
all_members([H|T],L2) :- member(H,L2), all_members(T, L2).

and([H]) :- H.
and([H|T]) :- H, and(T).
or([H]) :- H,!.
or([H|_]) :- H,!.
or([_|T]) :- or(T).

solve(Ainsley, Madeline, Sophie, Theodore) :-
    All = [Ainsley, Madeline, Sophie, Theodore],
    Ainsley = [Ainsley_Year, Ainsley_Competition, Ainsley_Disaster],
    Madeline = [Madeline_Year, Madeline_Competition, Madeline_Disaster],
    Sophie = [Sophie_Year, Sophie_Competition, Sophie_Disaster],
    Theodore = [Theodore_Year, Theodore_Competition, Theodore_Disaster],
    % we know what all variable must be
    all_members([1920, 1921, 1933, 1973], [Ainsley_Year, Madeline_Year, Sophie_Year, Theodore_Year]),
    all_members([hurdle, relay, javelin, longjump], [Ainsley_Competition, Madeline_Competition, Sophie_Competition, Theodore_Competition]),
    all_members([earthquakes, hurricanes, tornados, wildfires], [Ainsley_Disaster, Madeline_Disaster, Sophie_Disaster, Theodore_Disaster]),
    % c1
    or([Ainsley_Competition = javelin, Theodore_Competition = javelin]),
    % c2
    member([_, relay, hurricanes], All),
    % c3
    not(Sophie_Competition = relay),
    % c4
    member([C4_Year, longjump, _], All),
    or([and([Theodore_Year = 1921, C4_Year = 1933]), and([Theodore_Year = 1933, C4_Year = 1921])]),
    % c5
    member([C5_Year, relay, _], All),
    Theodore_Year > C5_Year,
    % c6
    member([C6_1_Year, longjump, _], All),
    member([C6_2_Year, _, wildfires], All),
    C6_1_Year < C6_2_Year,
    % c7
    not(Sophie_Disaster = wildfires),
    not(Ainsley_Disaster = wildfires),
    % c8
    member([_, C8_competition, earthquakes], All),
    not(C8_competition = hurdle),
    not(C8_competition = javelin),
    % c9
    member([1933, _, earthquakes], All),
    % c10
    not(Madeline_Year = 1921).

Using this implementation, if we ask for a solution, the single possible solution is presented:

['puzzle.prolog'].
% puzzle.prolog compiled 0.00 sec, 13 clauses
true.

?- solve(Ainsley, Madeline, Sophie, Theodore).
Ainsley = [1920, relay, hurricanes],
Madeline = [1973, hurdle, wildfires],
Sophie = [1933, longjump, earthquakes],
Theodore = [1921, javelin, tornados] ;

Solving Logic Puzzles in Prolog: Puzzle 2 of 3

November 29, 2015 1 comment

This post is about solving the 2nd of 3 small logic puzzles in Prolog. The previous post is available here.

The puzzle

Eilen, Ada, Verena and Jenny participated in a painting competition. Find out who painted which subject and who took which place in the competition, using the hints provided. Hints:

    1. Eilen painted a constable and was not last in the competition
    2. Jenny took the third place
    3. The person painting the Monet took the first place
    4. Ada beat the person painting a talyor, and the person painting a Van Gogh beat Vera

Solving the puzzle in Prolog

We just translate the hints into rules that ensure that a) places 1-4 and all mentioned painting subjects exist in our solution and b) the mentioned hints are fulfilled:

% ####################################################################
% Logic puzzle solver.
%
% Read definition to prolog:
% ['puzzle.pl'].
%
% 2) Ask for solutions:
% solve(Eilen, Ada, Vera, Jenny).
%
% Rainhard Findling
% 10/2015
% ####################################################################
all_members([H],L2) :- member(H,L2).
all_members([H|T],L2) :- member(H,L2), all_members(T, L2).

solve(Eilen, Ada, Vera, Jenny) :-
% 4 members
All = [Eilen, Ada, Vera, Jenny],
Eilen = [Eilen_place, Eilen_subject],
Ada = [Ada_place, Ada_subject],
Vera = [Vera_place, Vera_subject],
Jenny = [Jenny_place, Jenny_subject],
% we know max place is 4th
all_members([1,2,3,4], [Eilen_place, Ada_place, Vera_place, Jenny_place]),
% we know existing paintings
all_members([constable,taylor,fangogh,monet], [Eilen_subject, Ada_subject, Vera_subject, Jenny_subject]),
% #3 1st was monet
member([1, monet], All),
% #2 Jenny got third
member([3, Jenny_subject], All),
% #1 Eilen painted a constable and was not last
member([Eilen_place, constable], All),
4 > Eilen_place,
% #4 ada beat the taylor painter...
member([Ada_place, Ada_subject], All),
member([H2_taylor_place, taylor], All),
Ada_place < H2_taylor_place, 
% #4 ...and the van gogh painter beat vera member([H4_fangogh_place, fangogh], All), member([Vera_place, Vera_subject], All), Vera_place > H4_fangogh_place.

If we ask for a solution, the single valid solution is presented:

?- solve(Eilen, Ada, Vera, Jenny).
Eilen = [2, constable],
Ada = [1, monet],
Vera = [4, taylor],
Jenny = [3, fangogh] ;

And again, that’s it – problem solved!

Solving Logic Puzzles in Prolog: Puzzle 1 of 3

October 31, 2015 6 comments

Recently I played around with Prolog again to solve 3 small logic puzzles. This post is about solving the first puzzle. This first puzzle consists of 2 individual puzzles that follow the exact same internal structure, so can be solved the exact same way of stating rules in Prolog (just with different content).

The small puzzle: problem and question

There are 4 students: Carrie, Erma, Ora und Tracy. Each has one scholarship and one major subject they study. The goal is to find out which student has which scholarship and studies which subject (with all scholarships and majors being different from each other) from the hints provided. The available scholarships are: 25000, 30000, 35000 and 40000 USD. The available majors are: Astronomy, English, Philosophy, Physics. The following hints are given to solve the problem:

  1. The student who studies Astronomy gets a smaller scholarship than Ora.
  2. Ora is either the one who studies English or the one who studies Philosophy.
  3. Erna has a 10000 USD bigger scholarship than Carrie.
  4. Tracy has a bigger scholarship than the student that studies English.

The small puzzle: the Prolog solution

One way of solving this puzzle in Prolog to ask for a solution (which in our example consists of the 4 students and their associated attributes) that fulfills the rules that we can derive from the 4 hints from above. This is actually pretty short and easy in Prolog syntax:

% ####################################################################
% Logic puzzle solver.
%
% Read definition to prolog:
%   ['puzzle.pl'].
%
% 2) Ask for solutions:
%   solve(Carrie,Erma,Ora,Tracy).
%
% Rainhard Findling
% 10/2015
% ####################################################################

all_members([H],L2) :- member(H,L2).
all_members([H|T],L2) :- member(H,L2), all_members(T, L2).

or([H]) :- H,!.
or([H|_]) :- H,!.
or([_|T]) :- or(T).

solve(Carrie,Erma,Ora,Tracy) :-
% all students
Carrie = [Carrie_scholarship, Carrie_major],
Erma = [Erma_scholarship, Erma_major],
Ora = [Ora_scholarship, Ora_major],
Tracy = [Tracy_scholarship, Tracy_major],
% grouping
All = [Carrie,Erma,Ora,Tracy],
% ensure all values exist once
all_members([25, 30, 35, 40], [Carrie_scholarship, Erma_scholarship, Ora_scholarship, Tracy_scholarship]),
all_members([astronomy, english, philosophy, physics], [Carrie_major, Erma_major, Ora_major, Tracy_major]),
% clue 1
member([C1_scholarship,astronomy], All),
Ora_scholarship > C1_scholarship,
% clue 2
or([Ora_major = english, Ora_major = philosophy]),
% clue 3
member([C3_scholarship, physics], All),
C3_scholarship - Carrie_scholarship =:= 5,
% clue 4
Erma_scholarship - Carrie_scholarship =:= 10,
% clue 5
member([C5_scholarship, english], All),
Tracy_scholarship > C5_scholarship.

If we ask for the solution, the single possible solution is presented:

?- solve(Carrie,Erma,Ora,Tracy).
Carrie = [25, english],
Erma = [35, astronomy],
Ora = [40, philosophy],
Tracy = [30, physics] ;

The slightly bigger puzzle: problem and question

There are 5 car renting contracts. Each has a duration, a pickup location, a car brand and a customer associated to it. The goal is to find out for all contracts, what the customer, duration, car brand and pickup location is (again given the fact that each attribute is unique amongst the contracts). The pickup locations are: Brownfield, Durham, Iowa Falls, Los Altos and Redding. The car brands are: Dodge, Fiat, Hyundai, Jeep and Nissan. The contract duration are 2, 3, 4, 5 and 6 days. And the customer names are Freda, Opal, Penny, Sarah and Vicky. The following hints are given to solve the problem:

  1. The contracts of Vicky, the one with pickup location in Los Altos, the one with pickup location in Durham and the one with the Fiat are all different contracts.
  2. The contract with the Jeep is not in Iowa Falls.
  3. The contract of Vicky and the one with the Nissan are picked up in either Los Altos or Redding.
  4. Penny’s contract is not for 6 days.
  5. The contract in Iowa Falls is for 5 days.
  6. The contract with the Durham is 3 days longer than the contract of Opal.
  7. Of the contract with Nissan and the 2 day contract, one is picked up in Redding and the other is Freda’s contract.
  8. The contract with the Jeep is not for 6 days.
  9. The contract of Opal is 1 day longer than the one with the Hyundai.

The slightly bigger puzzle: the Prolog solution

Again, we can define a solution to fulfill the stated rules:

% ####################################################################
% Logic puzzle solver.
%
% Read definition to prolog:
%   ['puzzle.pl'].
%
% 2) Ask for solutions:
%   solve(Freda, Opal, Penny, Sarah, Vicky).
%
% Rainhard Findling
% 10/2015
% ####################################################################

all_members([H],L2) :- member(H,L2).
all_members([H|T],L2) :- member(H,L2), all_members(T, L2).

all_not([H]) :- not(H).
all_not([H|T]) :- not(H), all_not(T).

all_not_members([H],L2) :- not(member(H,L2)).
all_not_members([H|T],L2) :- not(member(H,L2)), all_not_members(T, L2).

and([H]) :- H.
and([H|T]) :- H, and(T).
or([H]) :- H,!.
or([H|_]) :- H,!.
or([_|T]) :- or(T).

solve(Freda, Opal, Penny, Sarah, Vicky) :-
    % all runners
    Freda = [Freda_car, Freda_location, Freda_days],
    Opal = [Opal_car, Opal_location, Opal_days],
    Penny = [Penny_car, Penny_location, Penny_days],
    Sarah = [Sarah_car, Sarah_location, Sarah_days],
    Vicky = [Vicky_car, Vicky_location, Vicky_days],
    % grouping
    All = [Freda, Opal, Penny, Sarah, Vicky],
    % ensure all values exist once
    all_members([dodge, fiat, hyundai, jeep, nissan], [Freda_car, Opal_car, Penny_car, Sarah_car, Vicky_car]),
    all_members([brownfield, durham, iowafalls, losaltos, redding], [Freda_location, Opal_location, Penny_location, Sarah_location, Vicky_location]),
    all_members([2,3,4,5,6], [Freda_days, Opal_days, Penny_days, Sarah_days, Vicky_days]),
    % clues from easy (fast) to hard (slow)
    % clue 5
    member([_,iowafalls,5], All),
    % clue 9
    member([hyundai,_,C9_days], All),
    Opal_days - C9_days =:= 1,
    % clue 6
    member([_,durham,C6_days], All),
    C6_days - Opal_days =:= 3,
    % clue 4
    not(Penny_days = 6),
    % clue 2
    member([C2_car,iowafalls,_], All),
    not(C2_car = jeep),
    % clue 8
    member([C8_car,_,6], All),
    not(C8_car = jeep),
    % clue 3
    member([nissan,C3_location,_], All),
    or([and([C3_location = losaltos, Vicky_location = redding]), and([C3_location = redding, Vicky_location = losaltos])]),
    % clue 7
    member([nissan, C7_1_location,_], All),
    member([_,C7_2_location, 2], All),
    or([and([Freda_location = C7_1_location, C7_2_location = redding]), and([Freda_location = C7_2_location, C7_1_location = redding])]),
    % clue 1
    member([C1_1_car,losaltos,_], All),
    member([C1_2_car,durham,_], All),
    member([fiat,C1_3_location,_], All),
    all_not_members([C1_1_car, C1_2_car, Vicky_car],[fiat]), %car
    all_not_members([C1_3_location, Vicky_location],[losaltos, durham]). %location

… and ask for it, which again presents the single solution:

?- solve(Freda, Opal, Penny, Sarah, Vicky).
Freda = [nissan, losaltos, 4],
Opal = [jeep, brownfield, 3],
Penny = [fiat, iowafalls, 5],
Sarah = [dodge, durham, 6],
Vicky = [hyundai, redding, 2] ;

Hidoku Solver in Python: branch cutting, intelligent successor generation and some simplifications

In this post we are implementing a Hidoku solver (Hidoku is yet another fine number puzzle) that uses a depth first search, branch cutting, limited (intelligent) successor generation and some automatic simplification. Usually, a Hidoku is a quadratic board consisting of n x n fields – but rectangular or other forms would be possible as well. With each Hidoku, some fields are already pre-filled with numbers at the beginning.

Hidoku Sample

Hidoku Sample

The game goal is to fill in all other numbers so that an ascending number queue is built: each number has to be adjacent to it’s successor, with adjacent meaning in an adjacent horizontal, vertical or diagonal field. Those adjacent fields are what we call a field neighborhood. The first and last number (e.g. with a 10×10 board they will be 1 and 100) are usually amongst those pre-filled into the board – for the player knowing where the queue starts and ends. See e.g. here or here to try out some online Hidoku examples for yourself.

Within our solver we use a similar puzzle representation which we read from a text file. E.g. the sample from above would look like this:

__,__,93,__,__,__,___,__,__,57
__,__,__,22,__,53, 54,40,__,__
__,__,18,23,52,__,100,__,__,__
__,__,__,__,__,99, 33,__,__,__
__,__,__,__,__,25,___,__,__,__
__,__,__,__,__,__,___,__,__,29
14,__, 7,__,__,__,___,__,__,__
__,__,__,65,__,__, 46,72,77,__
__, 9,__,__,__,__,___,__,80,76
 1,__,__,11,__,68, 82,__,__,__

Solving a Hidoku

With our solver we want to obtain all possible solutions to a Hidoku (although it should ideally only have 1). Therefore we utilize a depth first search with branch cutting, intelligent successor generation and some automatic simplification. Besides the simple game board representation (positions with numbers) we also use some additional variables for speeding up the search (see Board.__init__() for details):

  • self.nparray: the standard n x m numpy array representation of the game board, with the numbers present on the board written to the y/x position of the array.
  • self.nrs_yx_dict: dictionary of numbers filled to the board and their corresponding tuple of y/x positions. This allows fast lookup of the y/x position a number is filled to.
  • self.nrs_left_to_place: list of numbers that still need to be placed on the board. Allows for slightly faster lookup of still missing numbers.
  • self.nrs_placement_possibilities: list of possible placements as numbers and their corresponding y/x positions (representing positions this numbers could be filled to, given the current state of the board). This list enables much faster access to where numbers could get placed on the board. It is created initially (for the first node in the search tree) and for all successor nodes in the search tree possibilities that became impossible just get removed.
  • self.field_connection_status: lists how many connections each field in the board currently has (0 to 2). A connection exists if the number preceding or succeeding to the number in the field is present in the field neighborhood (with the neighborhood including all adjacent fields: vertical, horizontal and diagonal). These connection stats enable quick checks if the board is already invalid (more on that later). The connection stats get handed over to preceding nodes in the search tree by updating the neighborhood and the field itself the number was filled to.

We now look closer at how some of these are created and later used inside branch cutting.

Determining those fields that a number could be filled to

With the first node in the search tree, self.nrs_placement_possibilities gets created and filled. Succeeding nodes obtain a copy of it with those possibilities removed that became invalid by placing preceding number to the board (by doing the distance checks described below again).

Given the numbers pre-filled to the board, all other numbers can be filled to those fields only that lie within the allowed distance of numbers already filled to the board. An example: if nr. 2 if filled to the top left field (position 0/0), nr. 2 can only be placed within vertical, horizontal and diagonal distance 1 of field 0/0, nr. 3 within distance 2 etc. Hence, this boils down to a simple (but pretty effective in reducing computation complexity) series of distance checks, in which each number X gets checked against each number Y already filled to the board. These checks are only performed if abs(X-D) < max(board_height, board_width) - 2 (otherwise the numbers can be placed on the board without any restrictions caused by each other, as the board is too small for such). The core of such a distance check is rather simple: nr. X can be placed to a certain y/x position on the board, if the horizontal and the vertical distance to the position of nr. Y is at max. abs(X-Y).

Branch cutting and its requirements

In our search tree we need to detect dead ends early (boards that cannot yield a valid solution anymore) in order to exclude them from further search, hence speeding up the complete search.

With keeping track of possible placements of numbers to fields on the board using self.nrs_placement_possibilities, we can do 2 types of checks:

  • for each number left to be placed on the board: is there still at least one field left that number can be placed on?
  • for each field still empty: is there at least one number left that can be placed on that field?

If at least one is not fulfilled: the board is invalid an can be excluded from further search.

We also use field connection stats (self.field_connection_status) within branch cutting. Fields initially have 0 connections. If the field is filled with a number, for each of the preceding and succeeding number present in its neighborhood (the adjacend fields, maximum 8), the number of connections goes up by one – so the maximum amount of connections per field is 2. Using these field connection stats we can check:

  • if the field is filled: does it have its preceding and succeeding numbers placed in its neighborhood, or enough free fields in the neighborhood for them to be placed there in the future? If not the board is invalid.
  • if the field is not filled: does it allow to be a connection for any pair of numbers in its neighborhood? Can it still be reached (entered and exited) from the outside? E.g. an empty field with fully filled neighborhood can only be reached if there can a number be placed on that field that has its preceding and succeeding number already filled to the neighborhood (or a free field in the neighborhood in its place). This essentially boils down to neighborhood number permutation check: if we don’t find a valid (still possible) permutation of numbers in the neighborhood for which we can fill a still available and valid number to that field, the board is invalid.

Successor generation

For the solved board the order of how fields got filling does not matter. Therefore, within our depth first search, we do not care for the search path, but only for the obtained board state (we do not care how a board could have evolved to it’s current state).

The question is: how to generate the required subset of successors (as this will influence the future search complexity)? We only generate successors from filling a single field on the board (with all numbers that can be filled to it) – as we can generate all possible board states from this, without searching multiple search paths (which all lead to the same results). As field to be filled we chose the one with the least possible amount N of numbers that can be filled to this field – and generate one successor per number possible. If there is a number that can be placed only on an amount A of fields that is smaller than N, we instead fill this one number to all those fields and generate one successor per filled field (as we only obtain A instead of N successor nodes then). This way we minimize the amount of branches and keep the overall branching factor low – something a rational human player would strive for as well. Note that we do not miss any possible solutions as a result of this limited successor generation. As a side effect, this also results in basic board simplification: if there is a field that only one number can be filled to, it will be filled next – and generate a single successor only, which is usually called simplification. The same is true if we have a number that can only be filled to a single field.

Implementation

# Hidoku solver implementation using depth first search, branch cutting, intelligent successor generation and a little simplification.
# Rainhard Findling
# 2015/04
#
# possible improvements:
# - use a path finding algorithm, e.g. A* + manhattan distance instead of using max(y_dist,x_dist) to determine possible minimum distances.
#
import numpy
import itertools
from copy import copy

# STATICS
EMPTY = -1

def main():
    """gets executed if in __main__ (see bottom of file)."""

    # parameters
    board_file = '../boards/10x10_2.txt'

    # load board, initialize it, print some statistics
    board = Board(read_board(board_file))
    board.initialize()
    print board
    print 'nrs left', board.nrs_left_to_place
    print 'nrs_placement_possibilities', len(board.nrs_placement_possibilities)
    tmp = board.placements_splitted_per_nr()
    print 'placements per nr', map(lambda t:(t,tmp[t][0]), tmp.keys()), '\n'
    board.is_valid()
       
    # depth first search
    list_open = [board]
    solutions = []
    nodes_opened = 0
    nr_of_successors = []
    while(len(list_open) > 0):
        cur = list_open.pop()
        nodes_opened = nodes_opened + 1
        print 'looking at', cur
        print 'list_open', len(list_open)
        successors = cur.successors()
        if(len(successors)==0):
            print 'all successors invalid, reverting.'
        for s in successors:
            nr_of_successors = nr_of_successors + [len(successors)]
            if s.is_solved():
                solutions = solutions + [s]
                #break # enable to quit search after the first solution, disable to search the complete search space
            else:
                list_open = list_open + [s]
        else:
            continue
        break
    print 'done.'
    if len(solutions) == 0:
        print 'no solutions found.'
    else:
        print 'solution paths:', map(lambda s:s.solution_path(), solutions)
        print 'solutions:', solutions
        print 'solutions', len(solutions)
        print 'nodes_opened', nodes_opened
        print 'nr_of_successors', sum(nr_of_successors)
        print 'avg branching (not including reduction by simplification-drive-successor-generation)', sum(nr_of_successors)/float(len(nr_of_successors))
    
class Board:
    
    def __init__(self, nparray = None):
        self.nparray = nparray # the gameboard as board
        self.nrs_yx_dict = None # the gameboard as dict of nrs and corresponding y,x positions
        self.nr_range = None # range of numbers to be contained in board
        self.nrs_left_to_place = None # numbers left to be placed on the board
        self.nrs_placement_possibilities = None # list of possible placement possibilities
        self.field_connection_status = None # how many connections fields have left [0,2]
        self.parent = None
    
    def __str__(self):
        """str representation of nparray board."""
        sep = '   '
        top1 = '===' + ''.join(['====' for _ in range(len(self.nparray[0]))])
        top2 = '==='  + ''.join(['=='  for _ in range(len(self.nparray[0]))])
        s = '\n' + top1 + sep + top2 + '\n'
        for row_nr in range(len(self.nparray)):
            row = self.nparray[row_nr,:]
            s = s + '| ' + ' '.join(map(lambda x:'___' if x == EMPTY
                                        else '{:3.0f}'.format(x), row)) + ' |'
            fillings = self.field_connection_status[row_nr,:]
            s = s + sep + '| ' + ' '.join(map(lambda x:'x' if x == 0 else
                                              '.' if x == 1 else
                                              ' ', fillings)) + ' |\n'
        s = s + top1 + sep + top2 + '\n'
        return s
    
    def __copy__(self):
         other = Board()
         other.nparray = copy(self.nparray)
         other.nrs_yx_dict = copy(self.nrs_yx_dict)
         other.nr_range = self.nr_range # no need to copy
         other.nrs_left_to_place = copy(self.nrs_left_to_place)
         other.nrs_placement_possibilities = copy(self.nrs_placement_possibilities)
         other.field_connection_status = copy(self.field_connection_status)
         return other
     
    def __repr__(self):
        return self.__str__()

    def __eq__(self, other):
        return (self.nparray == other.nparray).all()
        
    def __ne__(self, other):
        return not self.__eq__(other)
    
    def initialize(self):
        """initialize the board. meant to be called on first board in search tree only."""
        self.nr_range = range(1, len(self.nparray) * len(self.nparray[0])+1)
        self.nrs_left_to_place = self.generate_nrs_left_to_place()
        self.nrs_yx_dict = self.generate_nrs_yx_dict()
        self.nrs_placement_possibilities = self.generate_nrs_placement_possibilities()
        self.field_connection_status = self.generate_all_field_connection_status()
        
    def generate_nrs_yx_dict(self):
        """generate yx position index of nrs."""
        nrs_yx_dict = {}
        for y in range(len(self.nparray)):
            for x in range(len(self.nparray[0])):
                if not self.nparray[y,x] == EMPTY:
                    nrs_yx_dict[self.nparray[y,x]] = (y,x)
        return nrs_yx_dict
        
    def generate_nrs_placement_possibilities(self):
        """1 huge list containing tuples about which nrs are placeable on which fields."""
        nr_placements = []
        for nr in self.nrs_left_to_place:
            for y in range(len(self.nparray)):
                for x in range(len(self.nparray[0])):
                   if(self.can_nr_be_placed_on_field(nr,y,x)):
                       nr_placements = nr_placements + [(nr,y,x)]
        return nr_placements
    
    def generate_all_field_connection_status(self):
        """for all filled fields: generate field connection status."""
        field_connection_status = numpy.ones([len(self.nparray), len(self.nparray[0])], int) * 2
        for y in range(len(self.nparray)):
            for x in range(len(self.nparray[0])):
                field_connection_status[y,x] = self.generate_field_connection_status(y,x)
        return field_connection_status
    
    def generate_field_connection_status(self,y,x):
        """generate field connection status for the given nr on y and x pos."""
        neighborhood = self.get_neighborhood_nrs(y,x)
        nr = self.nparray[y,x]
        dist = map(lambda n:abs(nr-n), neighborhood)
        field_connection_status = 2 - dist.count(1)
        if nr == self.nr_range[0] or nr == self.nr_range[-1]:
            field_connection_status = field_connection_status - 1
        return field_connection_status
    
    def fill_nr_to_field(self, nr, y, x):
        """fill given nr to given field."""
        if not nr in self.nrs_left_to_place:
            raise Exception('cannot place a nr not left to place.')
        self.nrs_left_to_place.remove(nr)
        if self.nparray[y,x] != EMPTY:
            raise Exception('cannot place a nr on a non empty field.')
        self.nparray[y,x] = nr
        self.nrs_yx_dict[nr] = (y,x)
        self.field_connection_status[y,x] = self.generate_field_connection_status(y,x)
        for (iy,ix) in self.get_neighborhood_pos(y,x):
            self.field_connection_status[iy,ix] = self.generate_field_connection_status(iy,ix)
        self.nrs_placement_possibilities = filter(lambda p:self.can_nr_be_placed_on_field(*p), self.nrs_placement_possibilities)
    
    def placements_splitted_per_nr(self):
        """split placement possibilities per nr."""
        splitted = {}
        for nr in self.nrs_left_to_place:
            for_nr = filter(lambda p:p[0]==nr, self.nrs_placement_possibilities)
            splitted[nr] = (len(for_nr), for_nr)
        return splitted
       
    def can_nr_be_placed_on_field(self,nr,y,x):
        """check if given nr is possible on y,x pos in field."""
        # only possible if empty
        if not self.nparray[y,x] == EMPTY:
            return False
        # only possible if nr still available
        if not nr in self.nrs_left_to_place:
            return False
        # check that nr is within range of other nrs on board
        return self.check_distance_to_other_nrs_ok(nr,y,x)

    def check_distance_to_other_nrs_ok(self,nr,y,x):
        """check if given nr on the given y,x is within range of other nrs on the board."""
        # -2 because -1 is max possible distance, therefore no cutoff effect
        d_max = max(len(self.nparray), len(self.nparray[0])) - 2 
        for nr2 in range(nr - d_max, nr + d_max + 1):
            if nr == nr2 or not nr2 in self.nrs_yx_dict:
                continue
            pos2 = self.nrs_yx_dict[nr2]
            # ensure that nr and nr2 are within range
            d = self.distance((y,x),pos2)
            if d > abs(nr-nr2):
                # the distance between the nrs is bigger than the difference between nrs - impossible to place here
                #print 'dist too big (nr, nr2, y, x, dist):', nr, nr2, y, x, abs(nr-nr2)
                return False
        return True

    def generate_nrs_left_to_place(self):
        """generate list of nrs still to placed on the field."""
        nrs = []
        for nr in self.nr_range:
            if not nr in self.nparray:
                nrs = nrs + [nr]
        return nrs
    
    def get_neighborhood_pos(self,y,x):
        """tuples of (y,x) pos of neighborhood."""
        yx = []
        for iy in range(max(y-1,0), min(y+2, len(self.nparray))):
            for ix in range(max(x-1,0), min(x+2, len(self.nparray[0]))):
                if not (y == iy and x == ix):
                    yx = yx + [(iy,ix)]
        return yx
    
    def get_neighborhood_nrs(self,y,x):
        """get all fields that surround the specified field."""
        n = []
        for (iy,ix) in self.get_neighborhood_pos(y,x):
            n = n + [self.nparray[iy,ix]]
        return n
    
    def neigbors_not_fully_connected(self, y, x):
        """get neighbors to the given pos that are not yet fully connected."""
        neighborhood = self.get_neighborhood_nrs(y,x)
        not_fully_connected = []
        for n in neighborhood:
            if n == EMPTY:
                not_fully_connected = not_fully_connected +  [n]
            else:
                (ny,nx) = self.nrs_yx_dict[n]
                if self.field_connection_status[ny,nx] > 0:
                    not_fully_connected = not_fully_connected + [n]
        return not_fully_connected
    
    def check_empty_field_neighbor_permutations_ok(self, y, x):
        """check if the field can be reached from it's neighbors permutations."""
        # attention: this does not handle first and last nr if they are not initially placed on the board
        if(self.nparray[y,x] != EMPTY):
            raise Exception('trying to check a non empty field with the exhausting empty-field check...')
        not_fully_connected = self.neigbors_not_fully_connected(y, x)
        amount_empty = not_fully_connected.count(-1)
        if(amount_empty >= 2):
            # we still have empty entrance and exit, too much effort to check
            return True
        if(amount_empty == 1):
            # if only 1 possibility: plugin + redo check
            possible_fillings1 = set(reduce(lambda a,b:a if b == EMPTY else a + [b-1,b+1], not_fully_connected, []))
            possible_fillings2 = filter(lambda x:x in self.nrs_left_to_place, possible_fillings1)
            if(len(possible_fillings2) > 1):
                # more than 1 possibility, still valid
                return True
            if(len(possible_fillings2)==1):
                # only this one possibility left! plugin, recheck
                self.fill_nr_to_field(possible_fillings2[0], y, x)
                print 'only one possibility left for field (y,x,nr)', y, x, possible_fillings2[0]
                return self.is_valid()
        else:
            if len(not_fully_connected) < 2:
                # impossible to connect if we have only 1 connectable field
                #print 'invalid, as permutations invalid', y, x                
                return False
            permutations = list(itertools.combinations(not_fully_connected, 2))
            diff2 = map(lambda t:(abs(t[0]-t[1]), (t[0]+t[1])/2), permutations)
            # permutations where diff2 is 2 are valid if the nr between them is still placeable. if there is only one: only possibility for that field. if there is none board is invalid
            diff2_valid = filter(lambda t:t[0] == 2, diff2)
            # check which nrs are still available to place
            diff2_available = filter(lambda t:t[1] in self.nrs_left_to_place, diff2_valid)
            if(len(diff2_available) > 1):
                # more than 1 possibility, still valid
                return True
            if(len(diff2_available) == 1):
                # only this one possibility left! plugin, recheck
                print self 
                print 'only one possibility left for field (y,x,nr)', y, x, diff2_available[0][1]
                self.fill_nr_to_field(diff2_available[0][1], y, x)
                return self.is_valid()
        #print 'invalid, as permutations invalid', y, x
        return False
    
    def check_filled_field_valid_connections(self,y,x):
        """check that the given field can still be reached by means of predecessing and successing nr."""
        nr = self.nparray[y,x]
        if(nr == EMPTY):
            raise Exception('trying to check an empty field with too inexact filled field check...')
        neighborhood = self.get_neighborhood_nrs(y,x)
        nr_frees = neighborhood.count(EMPTY)
        #print 'frees', nr_frees
        connected_to_me = [abs(n-nr) for n in neighborhood].count(1) # empty do not count as the will not result in 1 distance as nr 0 does not exist
        #print 'connected_to_me', connected_to_me
        need = 2
        if(nr == self.nr_range[0] or nr == self.nr_range[-1]):
            need = 1
        if(nr_frees + connected_to_me >= need):
            return True
        #print 'invalid, as too less connections filled field (nr, y, x)', nr, y, x
        return False
    
    def successors(self):
        """generate successors of."""
        # nr with least placement possibilities
        placements_splitted_per_nr = self.placements_splitted_per_nr()
        p_amount = map(lambda t:placements_splitted_per_nr[t][0], placements_splitted_per_nr.keys())
        #print 'p_amount', p_amount
        i_min = p_amount.index(min(p_amount))
        #print 'i_min', i_min
        nr = placements_splitted_per_nr.keys()[i_min]
        print 'placing nr', nr
        
        # placement possibilities for nr
        # (nr,y,x)
        nr_possibilities = filter(lambda t:t[0] == nr, self.nrs_placement_possibilities)
        print 'possibilities to place (nr, len(p), p)', nr, len(nr_possibilities), nr_possibilities
        
        # sanity check: would there be a field that has only 1 nr to be filled with?
        yx_amount = {}
        for p in self.nrs_placement_possibilities:
            key = (p[1],p[2])
            if not p in yx_amount:
                yx_amount[key] = [p]
            else:
                yx_amount[key] = yx_amount[key] + [p]
        yx_ones = filter(lambda t:yx_amount[t] == 1, yx_amount.keys())
        if(len(nr_possibilities) > 1 and len(yx_ones) > 1):
            raise Exception('there is a field to fill with one nr instead. implement generating that successor therefore.')
                    
        successors = []
        for p in nr_possibilities:
            suc = copy(self)
            suc.parent = self
            suc.fill_nr_to_field(*p)
            if suc.is_valid():
                successors = successors + [suc] 
        return successors
    
    def solution_path(self):
        """get solution path up to this point."""
        l = [self]
        cur = self
        while(not cur.parent is None):
            l = [cur.parent] + l
            cur = cur.parent
        return l
          
    def distance_nrs(self, nr1, nr2):
        """get distance between nr1 and nr2 on board or None if at least one of the nrs is not present."""
        if not nr1 in self.nrs_yx_dict or not nr2 in self.nrs_yx_dict:
            return None
        pos1 = self.nrs_yx_dict[nr1]
        pos2 = self.nrs_yx_dict[nr2]
        return self.distance(pos1, pos2)
    
    def distance(self, pos1, pos2):
        """distance between pos1 and pos2."""
        return max(abs(pos1[0]-pos2[0]), abs(pos1[1]-pos2[1]))
               
    def is_valid(self):
        """check if board is still valid."""
        #check if each nr can be placed
        for nr in self.nrs_left_to_place:
            if len(filter(lambda t:t[0] == nr, self.nrs_placement_possibilities)) == 0:
                #print 'invalid as nr', nr, 'cannot be placed.'
                return False
        # check that each field if filled is connected or connectable and if not set has permutations of neighbors that allow it to be filled
        for y in range(len(self.nparray)):
            for x in range(len(self.nparray[0])):
                if self.nparray[y,x] == EMPTY:
                    if not self.check_empty_field_neighbor_permutations_ok(y,x):
                        return False
                else:
                    if not self.check_filled_field_valid_connections(y,x):
                        return False
        return True
    
    def is_solved(self):
        """checks if the field is solved."""
        if len(self.nrs_left_to_place) == 0:
            # sanity checks
            if EMPTY in self.nparray:
                raise Exception('no numbers left to place, but still empty fields?')
            return True
        return False
            
def read_board(filepath):
    """read board from a textfile."""
    f = open(filepath, 'r')
    rows = []
    w = -1 # same width for all lines
    for line in f.read().splitlines():
        values = line.split(',')
        if w == -1:
            w = len(values)
        elif w != len(values):
            raise Exception('nr. of columns differ amongst rows.')
        values = map(lambda x:EMPTY if '_' in x else int(x), values)
        rows = rows = rows + [values]
    f.close()
    return numpy.array(rows)
    
if __name__ == "__main__":
    main()

When calling the solver on the sample board from above, it finds and prints the single possible solution (as well as the path how to get there, which we don’t show here for space reasons):

[...]
solutions: [
===========================================   =======================
|  91  92  93  20  21  36  37  38  39  57 |   | x x x x x x x x x x |
|  90  94  19  22  35  53  54  40  56  58 |   | x x x x x x x x x x |
|  89  95  18  23  52  34 100  55  41  59 |   | x x x x x x x x x x |
|  88  17  96  51  24  99  33  61  60  42 |   | x x x x x x x x x x |
|  16  87  50  97  98  25  62  32  43  30 |   | x x x x x x x x x x |
|  15   6  86  49  48  63  26  44  31  29 |   | x x x x x x x x x x |
|  14   5   7  85  64  47  45  27  28  78 |   | x x x x x x x x x x |
|   4  13   8  65  84  70  46  72  77  79 |   | x x x x x x x x x x |
|   3   9  12  66  69  83  71  73  80  76 |   | x x x x x x x x x x |
|   1   2  10  11  67  68  82  81  74  75 |   | x x x x x x x x x x |
===========================================   =======================
]
solutions 1
nodes_opened 1220
nr_of_successors 1646
avg branching (not including reduction by simplification-driven successor generation) 1.3491803

Branching factor and search speed

Limited successor generation and branch cutting greatly influence the branching factor, therefore the overall runtime. With enabling both, you can see that the branching factor is around 1.35, and that the complete search space can successfully be searched. For me, disabling both resulted in the solver not even being able to find the solution in feasible time – but just try and see for yourself! 😉