A chronological documentation test project, nothing serious, really!

15 May 2007 Shell script for removing duplicate files

The following shell script finds duplicate (2 or more identical) files and outputs a new shell script containing commented-out rm statements for deleting them.
You then have to edit the file to select which files to keep – the script can’t safely do it automatically!

OUTF=rem-duplicates.sh;
echo "#! /bin/sh" > $OUTF;
find "$@" -type f -print0 |
  xargs -0 -n1 md5sum |
    sort --key=1,32 | uniq -w 32 -d --all-repeated=separate |
    sed -r 's/^[0-9a-f]*( )*//;s/([^a-zA-Z0-9./_-])/\\\1/g;s/(.+)/#rm \1/' >> $OUTF;
chmod a+x $OUTF; ls -l $OUTF

Example output (rem-duplicates.sh)

#! /bin/sh
#rm ./gdc2001/113-1303_IMG.JPG
#rm ./reppulilta/gdc2001/113-1303_IMG.JPG

#rm ./lissabon/01-01-2001/108-0883_IMG.JPG
#rm ./kuvat\ reppulilta/lissabon/01-01-2001/108-0883_IMG.JPG

#rm ./gdc2001/113-1328_IMG.JPG
#rm ./kuvat\ reppulilta/gdc2001/113-1328_IMG.JPG
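
Once you have decided which copies to delete, the matching lines can be uncommented in bulk rather than by hand. The following is only a sketch, assuming a POSIX awk; uncomment_under is a hypothetical helper that is not part of the original script. It takes a path prefix exactly as it appears in the generated file (backslash escapes included) and prints the script with the leading # removed from the matching #rm lines.

```shell
# Hypothetical helper, not part of the original script.
# $1: path prefix as written in the generated file, e.g. './kuvat\ reppulilta/'
# $2: the generated script (rem-duplicates.sh)
# Prints the script to stdout with the matching "#rm" lines uncommented.
uncomment_under() {
  # The prefix is passed through the environment so that awk does not
  # reinterpret the backslashes, as it would with -v.
  PREFIX="#rm $1" awk '
    index($0, ENVIRON["PREFIX"]) == 1 { sub(/^#/, "") }
    { print }
  ' "$2"
}
```

For example, uncomment_under './kuvat\ reppulilta/' rem-duplicates.sh > rem-selected.sh (the output name is arbitrary). Check that every group still keeps at least one file before running the result.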

Explanation

  1. write output script header
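  2. list all files recursively under the directories given as arguments (GNU find uses the current directory if none are given)
  3. pass the file names NUL-separated (-print0 / xargs -0) so that spaces and other special characters are handled safely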
  4. calculate MD5 sums
  5. find duplicate sums
  6. strip off MD5 sums and leave only file names
  7. escape strange characters from the filenames
  8. write out commented-out delete commands
  9. make the output script executable and ls -l it
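
For a dry run, the steps up to and including the stripping of the MD5 sums can be sketched as a small function that only lists the duplicate groups. This assumes GNU find, xargs, md5sum, sort, uniq and sed; list_duplicates is a hypothetical name. Unlike the script above, it hands md5sum the files in batches and does not escape the names.

```shell
# Hypothetical helper: print groups of files with identical MD5 sums,
# one path per line, groups separated by a blank line.
# Arguments: directories (or files) to search, as for find.
list_duplicates() {
  find "$@" -type f -print0 |
    xargs -0 -r md5sum |                      # -r: do nothing on empty input
    LC_ALL=C sort |                           # identical sums become adjacent
    uniq -w 32 --all-repeated=separate |      # compare only the 32 hex digits
    sed -r 's/^[0-9a-f]{32} [ *]//'           # strip the sum, keep the name
}
```

Calling list_duplicates . prints nothing when there are no duplicates.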


09 Apr 2007 md5sum of files/pictures

To guard against corrupted files, you can use the md5sum command.

Windows

md5sum -b *.JPG > checksum.md5

Linux
Then copy this md5 file to the right directory in Linux and check that the files are identical using MD5Sums, a graphical Windows program.
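
Where GNU md5sum is available, the same check can also be done on the command line with md5sum -c. A minimal sketch, in which write_checksums and verify_checksums are hypothetical helper names and checksum.md5 is the file name used above:

```shell
# Hypothetical helpers, assuming GNU md5sum.

# $1: directory; records the sums of all *.JPG files in checksum.md5
write_checksums() {
  ( cd "$1" && md5sum -b -- *.JPG > checksum.md5 )
}

# $1: directory; succeeds only if every file listed in checksum.md5
# is present and still has the recorded sum
verify_checksums() {
  ( cd "$1" && md5sum -c checksum.md5 > /dev/null 2>&1 )
}
```

Run write_checksums on the source directory, copy the pictures together with checksum.md5, then run verify_checksums on the copy; a non-zero exit status means at least one file differs or is missing.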


14 Mar 2007 File integrity

A script using bash and md5sum to keep track of file integrity.

# Change the separator to allow for filenames containing spaces
# (the default is " \t\n", which confuses the for loop)
IFS=$'\n'
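# No escaping of spaces is needed: with IFS set to newline, the list is
# split per line only, and the variables are quoted where they are used
FOLDERS=$(find "/Volumes/disk 1/Pictures/Photos" -type d)
for FOLDER in $FOLDERS; do
# mind you, this will only work with absolute pathnames
if [ -d "$FOLDER" ]; then
  echo "$0: INFO: Processing $FOLDER"
  cd "$FOLDER"
  # match only names that end in .jpg (case-insensitive)
  for FILE in $(ls -1 | grep -i '\.jpg$'); do
    echo "$0: INFO: Checking $FILE"
    djpeg -outfile /dev/null "$FILE"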
    if [ $? -ne 0 ]; then
      echo "$0: ERROR: $FOLDER/$FILE is unreadable as JPEG"
    fi
  done
  if [ -e MD5SUMS ]; then
    # redirect stdout first, then stderr, so that both are silenced
    md5sum -b -c MD5SUMS > /dev/null 2>&1
    if [ $? -eq 1 ]; then
      echo "$0: ERROR: in $FOLDER:"
      md5sum -c MD5SUMS | grep FAILED 2>&1
    fi
  else
    echo "$0: WARNING: no MD5SUMS in $FOLDER, creating..."
    md5sum -b *.* > MD5SUMS
    # The obvious bit, in retrospect
    chown username:groupname MD5SUMS
  fi
fi
done
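
One detail worth isolating is the order of the two redirections used to silence md5sum -c. Redirections are processed left to right, so 2>&1 > /dev/null leaves stderr attached to the original stdout, while > /dev/null 2>&1 silences both streams. A small demonstration, where noisy is a hypothetical stand-in for a command that writes to both:

```shell
# noisy is a hypothetical stand-in for a command that writes to both streams.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

# stderr is pointed at the current stdout first, then stdout alone goes
# to /dev/null, so the stderr text is still captured.
leaky=$(noisy 2>&1 > /dev/null)

# stdout goes to /dev/null first, then stderr follows it: nothing is captured.
silent=$(noisy > /dev/null 2>&1)

echo "leaky=[$leaky] silent=[$silent]"
# prints: leaky=[to stderr] silent=[]
```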


13 Mar 2007 Script to download pictures from camera and rename them etc

1. download photos from the camera and sort them into folders by date
2. remove possible duplicates in case I did not erase the camera's images since the last download
3. convert RAW/NEF images to a usable format

All this in one single click!

#!/bin/bash
# Change this to where to store Photos
target=/home/multimedia/Images
camera="USB PTP Class Camera"
date=$(date --iso-8601)
mkdir -p $target/$date/tmp
cd $target/$date/tmp
# Get all photos from camera
gphoto2 --quiet --camera "$camera" --port usb: -P
# Do not replace photos that were already uploaded this same day
cp -u $target/$date/tmp/* $target/$date
rm -rf $target/$date/tmp
cd $target/$date
# auto-rotate using exif info
exifautotran *.JPG
# If photos were not erased from camera since last upload, remove duplicates
for i in *.{JPG,NEF}; do
for f in $(find $target -name $i ! -samefile $target/$date/$i); do
if md5sum $f | sed -e "s, .*/, ," | md5sum --check; then
rm -f $i;
fi
done
done
# decode RAW images if not already done ?
# for i in *.NEF; do if [ ! -e $(basename $i .NEF).ppm ]; then dcraw -w $i; fi; done
# Show them!
gimv -d $target/$date
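
The duplicate test in the loop feeds the checksum of an older copy to md5sum --check against the same-named file in today's folder. As an alternative sketch (not the method used above), two files can be compared byte by byte with cmp -s, which avoids rewriting the checksum line; is_duplicate is a hypothetical helper name.

```shell
# Hypothetical helper, an alternative to the md5sum pipeline above.
# $1, $2: paths; succeeds only if both are regular files with identical content
is_duplicate() {
  [ -f "$1" ] && [ -f "$2" ] && cmp -s -- "$1" "$2"
}
```

Inside the inner loop this would read: if is_duplicate "$f" "$i"; then rm -f "$i"; fi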
