Shellscripts: Using ANSI colors or finding duplicate files in style

… or “ASCII stupid questions, get a stupid ANSI”….

I’ve been writing Evil Shell Scripts of Doom[tm] lately, and got curious about highlighting parts of the output in different colors.
Since both Terminal.app and the xterm in the X11 that Apple ships support ANSI colors, it made sense to me.

There are excellent How to Change a Title of an Xterm and Bash Prompts HOWTOs out there, to which I’ll refer my gentle reader, and thus I will just paste some example code.

The bits that directly relate to ANSI color output are all the echo -e lines. The basic logic is that echo -ne "\e[1;35m" turns ON purple color, and echo -ne "\e[0;m" turns it off (\e is the escape character, which bash's echo -e expands). Of course, other numeric values will generate different colors, and colors will look different on different terminals. Please refer to the HOWTOs above.
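To make the on/off logic concrete, here is a minimal sketch. The colorize function is a hypothetical helper of my own naming, not part of the script below, and it uses printf with \033 (the octal form of the escape character) instead of echo -e, on the assumption that printf behaves more consistently across shells.

```shell
# colorize is a hypothetical helper, not part of the original script.
# Usage: colorize SGR_CODE TEXT...    e.g. colorize '1;35' hello
colorize() {
    code=$1
    shift
    # \033 is the ESC character; "[<code>m" switches the color on,
    # and "[0m" resets all attributes again.
    printf '\033[%sm%s\033[0m\n' "$code" "$*"
}

colorize '1;35' "bold purple"
colorize '1;31' "bold red"

# Walk through the eight basic foreground colors, codes 30 to 37.
for n in 30 31 32 33 34 35 36 37; do
    colorize "$n" "color code $n"
done
```

How each code actually looks depends on the terminal and its palette, so treat the names in the strings above as approximate.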

Please make sure that you BOTH understand what this does, and understand that YOU are responsible for running this. If it breaks, you get to keep both pieces, and not cry on my shoulder. The following is copyright 2005 Stany, and is licensed under… *shrug* the BSD 2-clause license, in the hope that this will be useful to others.

(This is just some internal code that in reality just externalizes a function out of a much larger script; however, it might be of use to others without revealing what exactly I was working on. Note that it will leave an md5sum.out file behind, and will generate a script called “killfile” to remove duplicates if any were detected. This script doesn’t do many sanity checks, as they were the job of the larger script, and will barf if subdirectories are present and matched.)

#!/bin/bash
# Checks for duplicates.  Takes one argument (optional) of the file ending,
# eg: checkdupe.sh pdf
# $Id: checkdupe.sh,v 1.2 2005/09/06 06:56:11 stany Exp stany $
MD5SUM=/opt/gnu/bin/md5sum
BASENAME=`basename "$0"`

checkdupe()
{
for ii in *$1 ; do
        # Checksum of the current file (first field of the md5sum output)
        FILESUM=`$MD5SUM "$ii" | awk '{print $1}'`
        # Has this checksum been recorded already?
        NUMFILES=`grep $FILESUM md5sum.out | wc -l`
        if [ $NUMFILES -gt 0 ]; then
                OTHERFILE=`grep $FILESUM md5sum.out | awk '{print $2}'`
                echo -e "\e[1;31m`date` $BASENAME: \e[1;36m$ii \e[1;32mis a duplicate of \e[1;35m$OTHERFILE \e[0;m"
                echo "rm $ii" >> killfile
        else
                $MD5SUM "$ii" >> md5sum.out
        fi
done
}

if [ -e killfile ] ; then
        rm killfile
fi

if [ -e md5sum.out ] ; then
        rm md5sum.out
fi
        touch md5sum.out

checkdupe $1

if [ -e killfile ] ; then
        echo -e "\e[1;31m`date` $BASENAME: Duplicates found.  Run killfile\e[0;m"
        echo -ne "\e[1;35m"
        cat killfile
        echo -ne "\e[0;m"

fi


Sample run:

stany@gilva:~[02:58 AM]$ mkdir test
stany@gilva:~[03:04 AM]$ cd test/
stany@gilva:~/test[03:04 AM]$ echo a >foo 
stany@gilva:~/test[03:04 AM]$ echo a >bar 
stany@gilva:~/test[03:04 AM]$ echo b > baz
stany@gilva:~/test[03:04 AM]$ ../checkdupe.sh 
Tue Sep  6 03:04:45 EDT 2005 checkdupe.sh: foo is a duplicate of bar 
Tue Sep  6 03:04:45 EDT 2005 checkdupe.sh: Duplicates found.  Run killfile
rm foo
stany@gilva:~/nature/test[03:04 AM]$ ls
bar             baz             foo             killfile        md5sum.out
stany@gilva:~/nature/test[03:04 AM]$ 

Oh, and checkdupe() can be trivially fixed so that it would take any shell-expandable expression as an argument to the script, and work with that. I don’t even consider it an exercise 😛
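For the impatient, one possible shape of that fix is sketched below. It is not the original code: checkdupe_args and SUMCMD are names made up for illustration, the fallback to cksum is my assumption for machines without md5sum on the PATH, and the checksum list is written tab-separated rather than in md5sum's own format. It also skips anything that is not a regular file, which sidesteps the subdirectory problem.

```shell
# Sketch only: checkdupe_args is a hypothetical variant of checkdupe().
# Assumption: md5sum is on PATH; otherwise fall back to POSIX cksum.
if command -v md5sum >/dev/null 2>&1; then SUMCMD=md5sum; else SUMCMD=cksum; fi

checkdupe_args() {
    : > md5sum.out              # start with an empty checksum list
    rm -f killfile
    for f in "$@"; do
        [ -f "$f" ] || continue # skip directories and unmatched globs
        # Do not checksum our own output files.
        case $f in md5sum.out|killfile) continue ;; esac
        sum=$($SUMCMD < "$f" | awk '{print $1}')
        # Look up the first file already recorded with this checksum.
        other=$(awk -F'\t' -v s="$sum" '$1 == s { print $2; exit }' md5sum.out)
        if [ -n "$other" ]; then
            printf '\033[1;36m%s \033[1;32mis a duplicate of \033[1;35m%s\033[0m\n' "$f" "$other"
            echo "rm $f" >> killfile
        else
            printf '%s\t%s\n' "$sum" "$f" >> md5sum.out
        fi
    done
}
```

Ending the script with checkdupe_args "$@" would then allow calls such as checkdupe.sh *.pdf or checkdupe.sh a* b*.txt, with the shell doing the expansion. File names containing whitespace will still produce a broken killfile, so look at it before running it.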