Posts Tagged ‘image’

Batch rename pictures of multiple recording devices to show date and time in filenames

August 17, 2014

Similar to sorting images into folders named after their date, we’re going to look at automatically renaming pictures so that their names are based on the timestamps at which they were taken. This enables sorting pictures chronologically in pretty much all software that’s out there (as alphabetical sorting then implies chronological sorting). Note that the same is possible using exiftool only, but with a (from our point of view) slightly more complicated syntax (look e.g. here for renaming files after EXIF data, or here for updating EXIF data from filenames).

The problem

Imagine a group event with multiple people taking pictures using multiple cams. Usually, naming schemes of cameras are different, such as

IMG_1234.JPG
DCM_5678.jpg
...

If people later share their images, they could either put them all into the same folder or separate them into folders; we’re going to focus on the first case here. If these images get sorted alphabetically, they are not automatically sorted by their timestamps, because the naming schemes differ. In file explorers this can often be solved by sorting by image timestamp. However, this does not work in all cases: e.g. image viewers/galleries would also need to support sorting by timestamp for switching to the “next” picture. Panoramas created from multiple images also frequently do not contain timestamps any more, so sorting by timestamp puts panoramas at the top/bottom of the list.

A uniform naming scheme that contains timestamps in names

For those reasons we tend to automatically rename all images to fit a uniform naming scheme such as

yyyy_mm_dd-hh_mm_ss_number_person.jpg

  • date and time are first in filenames – this automatically causes chronological sorting when sorting alphabetically (pretty much every program can sort files alphabetically)
  • number is the original sequential number of the image (contained in the new name to not lose any information)
  • person is an identifier of who and/or with which device the picture was taken
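
A minimal Python sketch of this scheme, for illustration only: the helper name uniform_name and the sample shots are our assumptions, not part of the workflow described here. It builds a filename from a timestamp, the original image number and a person identifier, and shows that plain alphabetical sorting then gives chronological order.

```python
from datetime import datetime


def uniform_name(taken, number, person):
    """Build yyyy_mm_dd-hh_mm_ss_number_person.jpg (hypothetical helper)."""
    stamp = taken.strftime("%Y_%m_%d-%H_%M_%S")
    return "{}_{}_{}.jpg".format(stamp, number, person)


# Made-up sample data: two cameras with different original naming schemes.
shots = [
    (datetime(2014, 8, 17, 12, 30, 5), "5678", "AnnaCam"),
    (datetime(2014, 8, 17, 9, 15, 0), "1234", "FsCam"),
]

# Alphabetical sorting of the new names is chronological sorting.
names = sorted(uniform_name(*shot) for shot in shots)
print(names)
```

Because every field of the timestamp is zero-padded and ordered from year down to second, string comparison and time comparison agree.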

Automatically renaming pictures after their EXIF timestamp

First we need to make sure that the EXIF timestamp DateTimeOriginal is set correctly for all images. In case all of these timestamps are shifted (typically because the camera date and time were configured incorrectly), we can shift the corresponding EXIF entry using exiftool (adapt the offset):

exiftool -DateTimeOriginal+="00:00:00 00:00:00" -overwrite_original -ext FILENAMEEXTENSION -r DIRECTORY

Once the timestamps are correct, we can rename the images after the date and time they were taken:

old_fileending="JPG"                                   # original postfix of your files (file name extension)
old_praefix="IMG_"                                     # original prefix of your files
newname_postfix="_FsCam"                               # your new, personal name postfix
oldname_regex="s/^$old_praefix//;s/\.$old_fileending\$//"  # sed expression of how to modify the old filename before adding it to the new filename (here: strip old pre- and postfix)
match_wildcards="$old_praefix*.$old_fileending"        # wildcard pattern of the files to process

# note: this loop splits on whitespace, so paths must not contain spaces
for filepath_old in `find . -iname "$match_wildcards"`
do
    filename_old=`basename "$filepath_old"`
    filename_new=$(exiftool -S -DateTimeOriginal "$filepath_old" | awk 'BEGIN {FS=" "};{print $2" "$3}' | sed "s/:/_/g;s/ /-/g")_$(echo "$filename_old" | sed "$oldname_regex")$newname_postfix.jpg
    # -n makes this a test run: remove it to actually rename the files
    rename -n -v "s/$filename_old/$filename_new/" "$filepath_old"
done
  • match_wildcards: defines which files you want to match
  • oldname_regex: defines how old filenames should be modified and preserved in new filenames
  • newname_postfix: identifier added to filenames (to identify person and device)
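
To make the string handling of the loop easier to follow, here is a rough Python sketch of the same steps without calling exiftool. The function new_name and its default arguments are our assumptions; the input line is assumed to look like the output of exiftool -S -DateTimeOriginal, i.e. the tag name followed by date and time.

```python
import re


def new_name(exif_line, old_name, old_prefix="IMG_", old_ext="JPG", postfix="_FsCam"):
    """Build the new filename from an exiftool output line and the old name."""
    # exif_line is assumed to look like "DateTimeOriginal: 2014:08:17 12:30:05"
    _tag, date, time = exif_line.split()
    # same transformation as the sed call: ':' becomes '_', the space becomes '-'
    stamp = (date + " " + time).replace(":", "_").replace(" ", "-")
    # strip the old prefix and extension so only the original number remains
    core = re.sub("^" + re.escape(old_prefix), "", old_name)
    core = re.sub(re.escape("." + old_ext) + "$", "", core)
    return stamp + "_" + core + postfix + ".jpg"


print(new_name("DateTimeOriginal: 2014:08:17 12:30:05", "IMG_1234.JPG"))
```

The sketch only computes names; the actual renaming is still left to rename (or a similar tool) after a test run.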

Finally, some words before starting to take pictures:

  • Ensure clock synchronization across all devices taking pictures (cameras, mobile phones, …) before starting: it just makes things easier. Otherwise you may experience “lag” effects (when looking through the pictures, it seems like some person was always a bit ahead or behind). In case you already took pictures with multiple devices whose clocks differ: shift the EXIF image timestamps for specific devices as shown above.
  • If you want to sort videos too: pay attention to the time zones used. Cameras usually write the current device time (the time shown on the device) to the picture, independent of the configured time zone. In contrast, many mp4 recorders (like mobile phones) write UTC timestamps instead of the time shown on the device. If you use both types of timestamps without correcting for this, “lags” might occur again.
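
Both corrections boil down to adding a fixed offset to a timestamp. A small Python sketch of that arithmetic, assuming EXIF-style timestamp strings; the function name shift_timestamp, the UTC+2 offset and the 90 second clock error are made-up examples:

```python
from datetime import datetime, timedelta

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def shift_timestamp(stamp, hours=0, minutes=0, seconds=0):
    """Return the EXIF-style timestamp shifted by the given offset."""
    taken = datetime.strptime(stamp, EXIF_FORMAT)
    shifted = taken + timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return shifted.strftime(EXIF_FORMAT)


# A video stored with a UTC timestamp, recorded on a device showing UTC+2:
print(shift_timestamp("2014:08:17 10:30:05", hours=2))
# A camera whose clock was 90 seconds fast:
print(shift_timestamp("2014:08:17 12:31:35", seconds=-90))
```

Using timedelta takes care of rollovers across minutes, hours and days, which is easy to get wrong when editing timestamps by hand.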

Updating EXIF timestamps from filenames

In case you need to update timestamps after renaming images to the uniform naming scheme from above, e.g. for images that do not contain EXIF data at all (like panoramas) and/or in case of wrong names (possibly caused by shifted EXIF timestamps), the following snippet could help.

# part 1: extract timestamp from filename and write it to EXIF
file_wildcard="*_FsCam.jpg"
for filepath in `find . -iname "$file_wildcard"`
do
  # extract timestamp from filename (it's the filename's prefix)
  # might look oldschool, but it's easy to adapt
  bname=$(basename $filepath)
  fdate=$(echo $bname | awk 'BEGIN {FS="-"};{print $1}')
  ftime=$(echo $bname | awk 'BEGIN {FS="-"};{print $2}')
  y=$(echo $fdate | awk 'BEGIN {FS="_"};{print $1}')
  m=$(echo $fdate | awk 'BEGIN {FS="_"};{print $2}')
  d=$(echo $fdate | awk 'BEGIN {FS="_"};{print $3}')
  h=$(echo $ftime | awk 'BEGIN {FS="_"};{print $1}')
  min=$(echo $ftime | awk 'BEGIN {FS="_"};{print $2}')
  s=$(echo $ftime | awk 'BEGIN {FS="_"};{print $3}')
  # write extracted timestamp to file as EXIF timestamp. first do a test run, then uncomment last line and run again
  echo "setting DateTimeOriginal" $y:$m:$d $h:$min:$s "for" $filepath
#   exiftool -DateTimeOriginal="$y:$m:$d $h:$min:$s" -overwrite_original $filepath
done

# part 2: shift EXIF timestamps (adapt offset)
for filepath in `find . -iname "$file_wildcard"`
do
    exiftool -DateTimeOriginal+="00:00:00 00:00:00" -overwrite_original $filepath
done

# part 3: replace the old name prefix timestamp with new (and now correct) one extracted from EXIF (first do test run, then remove -n from rename)
for filepath_old in `find . -iname "$file_wildcard"`
do
    filename_old=`basename $filepath_old`
    # remove the old prefix timestamp, keep other info
    filename_same=${filename_old:19:${#filename_old}}
    # create new filename from exif data and remainder of old filename
    filename_new=$(exiftool -S -DateTimeOriginal $filepath_old | awk 'BEGIN {FS=" "};{print $2" "$3}' | sed "s/:/_/g;s/ /-/g")$(echo $filename_same)
    rename -nv "s/`echo $filename_old`/`echo $filename_new`/" $filepath_old
done
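
The awk chain in part 1 and the substring in part 3 both rely on the timestamp prefix being exactly 19 characters long. A compact Python sketch of the same parsing, assuming names that follow the uniform scheme; split_uniform_name is a hypothetical helper name:

```python
from datetime import datetime

STAMP_LENGTH = 19  # length of "yyyy_mm_dd-hh_mm_ss"


def split_uniform_name(filename):
    """Return (EXIF-style timestamp, remainder of the filename)."""
    stamp, rest = filename[:STAMP_LENGTH], filename[STAMP_LENGTH:]
    # strptime raises ValueError if the prefix is not a valid timestamp
    taken = datetime.strptime(stamp, "%Y_%m_%d-%H_%M_%S")
    return taken.strftime("%Y:%m:%d %H:%M:%S"), rest


print(split_uniform_name("2014_08_17-12_30_05_1234_FsCam.jpg"))
```

Parsing via strptime also validates the prefix, so files that do not follow the scheme are reported instead of silently getting a broken timestamp.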

convertconditional: convert an image if it fulfills certain conditions

August 28, 2013

Recently I needed a script to batch convert only those images amongst a large number of images that fulfil a certain criterion, namely being exactly of a stated size. The script is based on ImageMagick’s convert and accepts arbitrary convert parameters. I personally use this script to automatically reduce the size and quality of photos taken with a specific camera in order to reduce the disk space they occupy.

The script

#! /usr/bin/python
#
# convertconditional: Convert an image if it fulfills certain conditions, e.g. is of a certain size. Requires ImageMagick's convert.
# Rainhard Findling 2013/08
#
# example to convert all *JPG within the current directory which are of a certain size to a reduced size and quality:
# find . -iname "*JPG" -exec ./convertconditional {} -filtersize 3888x2592 -convertparams "-resize 3500x3000 +repage -quality 80" \;
#
import argparse # check http://docs.python.org/2/howto/argparse.html for help
import shlex
import subprocess
import sys
# specify arguments
parser = argparse.ArgumentParser()
parser.add_argument('inputfile', help='path to the file that will be converted.')
parser.add_argument('-convertparams', required=True, help='parameters handed to the convert command used internally, e.g. resize, repage, reduce quality etc. Example: "-resize 300x200 +repage -quality 92"')
parser.add_argument('-filtersize', help='only convert if original image is of this size, stated as WIDTHxHEIGHT, e.g. 3500x3200.')
parser.add_argument('-o', '--outputfile', help='path to where the converted image will be stored. if not specified, the original file will be overwritten.')
parser.add_argument('-v','--verbose',help='print verbose output.',action='store_true')
args = parser.parse_args()
# check for correct arguments
if not args.outputfile:
    args.outputfile = args.inputfile
if args.filtersize:
    filter_x = int(args.filtersize.split('x')[0])
    filter_y = int(args.filtersize.split('x')[1])
if args.verbose:
    print('inputfile=' + args.inputfile)
    print('outputfile=' + args.outputfile)
    if args.filtersize:
        print('resizing only ' + str(filter_x) + 'x' + str(filter_y) + ' images.')
    print('convertparams=' + args.convertparams)
# get size of image (arguments are passed as a list, so names with spaces need no extra quoting)
imagesize = subprocess.check_output(['identify', '-format', '%wx%h', args.inputfile],
                                    stderr=subprocess.STDOUT).decode()
imagesize_x = int(imagesize.split('x')[0])
imagesize_y = int(imagesize.split('x')[1])
# condition: filter for images of certain size
if args.filtersize:
    if args.verbose:
        print('size of ' + args.inputfile + ' is ' + str(imagesize_x) + "x" + str(imagesize_y))
    # check filter criteria
    if not imagesize_x == filter_x or not imagesize_y == filter_y:
        print('leaving out ' + args.inputfile + ' as it is of size ' + str(imagesize_x) + "x" + str(imagesize_y) + " (required: " + args.filtersize + ")")
        sys.exit(0)
# passed all conditions: convert image
print('converting ' + args.inputfile + ' (size: ' + str(imagesize_x) + "x" + str(imagesize_y) + ')')
# convert image
command = ['convert', args.inputfile] + shlex.split(args.convertparams) + [args.outputfile]
if args.verbose:
    print('command: ' + ' '.join(command))
subprocess.check_output(command, stderr=subprocess.STDOUT)

The script is written in Python, so all you need to do is save it (e.g. in a file called “convertconditional”) and make it executable:

chmod +x convertconditional

Then you can either call it by stating its path (e.g. “./convertconditional [parameters]”), or add it to your system’s PATH to call it from everywhere.

Script execution

In order to convert input.jpg to output.jpg, you can try

convertconditional input.jpg -filtersize 3888x2592 -convertparams "-resize 3500x3000 +repage -quality 85" -o output.jpg
  • -filtersize optional filter: only convert the image if it is exactly of the stated size
  • -convertparams states parameters which should be handed to ImageMagick’s convert
  • -o states where to store the converted image (original image gets overwritten if omitted)
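
The size condition itself is easy to isolate. A minimal sketch, assuming sizes are given as WIDTHxHEIGHT strings as in the -filtersize parameter; parse_size and passes_filter are illustrative names and not functions of the script:

```python
def parse_size(text):
    """Turn 'WIDTHxHEIGHT' (e.g. '3888x2592') into an (int, int) tuple."""
    width, height = text.lower().split("x")
    return int(width), int(height)


def passes_filter(image_size, filtersize=None):
    """Without a filter every image passes; otherwise the size must match exactly."""
    if filtersize is None:
        return True
    return parse_size(image_size) == parse_size(filtersize)


print(passes_filter("3888x2592", "3888x2592"))  # same size: convert
print(passes_filter("2592x3888", "3888x2592"))  # portrait orientation: skip
```

Note that a picture rotated to portrait orientation does not match a landscape filter, so such files would need a second run with swapped dimensions.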

In case you want to conditionally convert multiple files (as in my use case), you can combine convertconditional with find and overwrite the original files:

find . -iname "*JPG" -exec convertconditional {} -filtersize 3888x2592 -convertparams "-resize 3800x +repage -quality 85" \;

Batch panorama stitching with review using Hugin

June 7, 2013
Panorama of Mount Batur, Bali, Indonesia.

Stitching images to a panorama may take its time, which might be frustrating in case you need to create a whole lot of panoramas. Hugin can save you a lot of time here. Basics of Hugin in a nutshell: it is a panorama tool that provides both a command line interface and a UI, and processes panoramas in two phases. Initially, you create a Hugin project (.pto file) which holds links to several images. Then you first send your photos to the “assistant queue”, which performs a preliminary stitching that you can review and correct if necessary. Second, you send your images + rough stitching info to the “stitching queue”, which does the actual high quality stitching for you. Hugin further provides a batch processor which basically holds a list of Hugin projects: this is what we’re going to make use of.

Processing

To semi-automatically stitch all your panoramas at once, including a review of preliminary stitched panoramas, you can do the following:

  1. Move all photos that should be part of the same panorama to a separate folder — for each panorama you should have a separate folder then. This is the only step you actually have to do by hand completely. We assume you create the folders with leading zeros:
    for i in {001..100}; do mkdir $i; done
    
  2. Assuming that all these folders are located inside the same parent folder and you are in this parent folder, use Hugin’s “pto_gen” command to automatically generate the Hugin projects (.pto-files, make sure to adjust the image extension so that it fits your needs):
    for d in `ls`; do pto_gen $d/*.jpg; done
    

    If you happen to have multiple such folders, each containing multiple panorama folders, you can generate all panorama files at once using the following command instead (assuming all these folders have been named “pano”):

    for d in `find . -name "pano"`
    do
        for p in `ls $d`
            do pto_gen $d/$p/*jpg
        done
    done
    
  3. Add all these projects to the Hugin Batch Processor assistant queue:
    find . -name "*pto" -exec PTBatcherGUI -a {} \;
    
  4. Let the assistant queue create your preliminary panoramas
  5. Optionally review and correct each panorama using Hugin itself:
    find . -name "*pto" -exec hugin {} \;
    

    Sometimes it can be helpful to just review a bunch of panoramas at once instead:

    for d in `ls -d * | egrep "00[0-9]{1}"` # for panorama 000-009, adapt for your use
    do
        hugin $d/*.pto
    done
    
  6. Add the projects to the Hugin Batch Processor stitching queue:
    find . -name "*pto" -exec PTBatcherGUI {} \;
    
  7. Let the stitching queue create all panoramas.

The following snippet converts panoramas generated as tifs into jpgs and moves them back to their original location (amongst other pictures) using convert from ImageMagick:

find . -name "*tif" -exec rename "s/ //g" {} \; # remove tif filename whitespaces added by Hugin

for t in `find . -name "*tif"`
do
    tif_path=`dirname $t`
    new_name=`basename $t | sed "s/\.tif$/.jpg/"` # only replace the file extension
    new_path=$tif_path/../../$new_name # we want to have panoramas amongst other pictures
    convert $t $new_path
done
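
The path handling of this loop can be sketched in Python as follows. The function jpg_target and the sample path are assumptions for illustration; the sketch only computes the target path and leaves the conversion itself to convert.

```python
from pathlib import PurePosixPath


def jpg_target(tif_path):
    """Target path of the jpg: two folders above the folder holding the tif."""
    tif = PurePosixPath(tif_path)
    # parents[0] is the tif's own folder, parents[2] is two levels above it;
    # an IndexError is raised if the path is not nested deeply enough
    return tif.parents[2] / tif.with_suffix(".jpg").name


print(jpg_target("trip/pano/001/IMG_0001-IMG_0003.tif"))
```

Since with_suffix only touches the extension, a name that happens to contain “tif” elsewhere is left intact.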

Finally, if you’re pleased with your panoramas you can delete all tifs generated on the way:

find . -name "*tif" -delete

If you’d like to keep the original pictures used to create panoramas, but would like to share all other pictures anyway, here’s the command to copy all files but omit all “pano” folders inside:

DEST="/your/destination/folder/"
for f in `find . -type f`
do
    if [ ! `echo $f | grep "/pano/"` ]
    then
        # is not a pano folder and not a file inside a pano folder
        cp --verbose --parents "$f" "$DEST"
    fi
done
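
The filter condition of this loop, sketched in Python for clarity: a file gets copied only if none of its parent folders is named “pano”. The function name is_shareable and the sample paths are made up.

```python
from pathlib import PurePosixPath


def is_shareable(path):
    """True if the file does not live inside a folder named 'pano'."""
    folders = PurePosixPath(path).parts[:-1]  # all components except the filename
    return "pano" not in folders


paths = ["./day1/IMG_1.jpg", "./day1/pano/001/IMG_2.jpg", "./day1/panorama.jpg"]
print([p for p in paths if is_shareable(p)])
```

Comparing whole path components keeps files such as panorama.jpg, just like the "/pano/" pattern used with grep does.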

Installing Hugin on Ubuntu 12.04

When installing Hugin from the Ubuntu repositories in Ubuntu 12.04, unfortunately pto_gen is missing (seems to be fixed for 14.04 and newer). Therefore install Hugin from the Hugin repository as stated in their Ubuntu howto:

sudo add-apt-repository ppa:hugin/hugin-builds
sudo apt-get update
sudo apt-get install hugin enblend panini

Image classification using SVMs in R

February 24, 2013

Recently I did some Support Vector Machine (SVM) tests in R (a statistical language with functional parts for rapid prototyping and data analysis, somewhat similar to Matlab, but open source ;)) for my current face recognition projects. To get my SVMs up and running in R, using image data as in- and output, I wrote a small demo script for classifying images. As test data I used 2 classes of images (lines from top left to bottom right and lines from bottom left to top right), with 10 samples each, like these:

[sample images from both classes]
The complete image set is available here.

For SVM classification, a simple split into train and test sets is used here; for more sophisticated problems, n-fold cross validation for finding good parameter settings is recommended instead. For everybody who has not yet worked with SVMs, I’d recommend reading something about how to start with “good” SVM classification, like the pretty short and easy to read “A Practical Guide to Support Vector Classification” by Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin (the LIBSVM inventors).

Update: added parallel processing using parallel and mclapply for loading image data (for purpose of demonstration only, loading 10 images in parallel does not make a big difference ;)).

print('starting svm demo...')

library('png')
library('e1071')
library('parallel')

# load img data
folder<-'.'
file_list <- dir(folder, pattern="png")
data <- mclapply(file_list, readPNG, mc.cores=2)
# extract subject id + img nr from names
subject_ids <- lapply(file_list, function(file_name) as.numeric(unlist(strsplit(file_name, "_"))[1]))
# rename subject id's to c1 and c2 for more clear displaying of results
subject_ids[subject_ids==0]='c1'
subject_ids[subject_ids!='c1']='c2'
img_ids <- lapply(file_list, function(file_name) as.numeric(unlist(strsplit(unlist(strsplit(file_name, "_"))[2], "\\."))[1]))

# specify which data should be used as test and train by the img nrs
train_test_border <- 7
# split data into train and test, and bring into array form to feed to svm
train_in <- t(array(unlist(data[img_ids < train_test_border]), dim=c(length(unlist(data[1])),sum(img_ids < train_test_border))))
train_out <- unlist(subject_ids[img_ids < train_test_border])
test_in <- t(array(unlist(data[img_ids >= train_test_border]), dim=c(length(unlist(data[1])),sum(img_ids >= train_test_border))))
test_out <- unlist(subject_ids[img_ids >= train_test_border])

# train svm - try out different kernels + settings here
svm_model <- svm(train_in, train_out, type='C', kernel='linear')

# evaluate svm
p <- predict(svm_model, train_in)
print(p)
print(table(p, train_out))
p <- predict(svm_model, test_in)
print(p)
print(table(p, test_out))

print('svm demo done!')
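
The train/test split of the script depends only on the file names. A small Python sketch of that logic, assuming names of the form subject_imagenumber.png as parsed by the R code; the helper names and the image numbering from 1 to 10 are assumptions.

```python
def parse_name(file_name):
    """Split e.g. '0_3.png' into (subject id, image number)."""
    subject, rest = file_name.split("_", 1)
    return int(subject), int(rest.split(".")[0])


def split_train_test(file_names, border=7):
    """Images numbered below the border are used for training, the rest for testing."""
    train, test = [], []
    for name in file_names:
        _subject, img_nr = parse_name(name)
        (train if img_nr < border else test).append(name)
    return train, test


# Assumed file set: 2 subjects (classes) with images numbered 1 to 10 each.
files = ["{}_{}.png".format(s, i) for s in (0, 1) for i in range(1, 11)]
train, test = split_train_test(files)
print(len(train), len(test))
```

Splitting by image number keeps both classes represented in the train set as well as in the test set.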