MD5 Checksums
More information about md5deep can be found at http://md5deep.sourceforge.net
What are checksums?
A checksum, or “hash”, is a short string of characters calculated from a file's contents. The hash stays the same for as long as the contents stay the same, and in practice any change to the file, however small, produces a different hash. This makes checksums useful in long-term preservation as a way to detect the degradation of files. This tutorial details how to generate and use checksums with md5deep.
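To make the idea concrete, here is a minimal sketch in Python. It uses the standard hashlib module rather than md5deep itself, and the helper name md5_hex is made up for this illustration. It shows that identical contents give identical hashes, while a one-character change gives a different one.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a byte string as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


# Hypothetical sample contents, standing in for the bytes of a file.
original = md5_hex(b"hello world\n")
unchanged = md5_hex(b"hello world\n")
altered = md5_hex(b"hello world!\n")

print(original == unchanged)  # True: same contents, same hash
print(original == altered)    # False: one added character, different hash
```

Comparing a stored hash with a freshly computed one is the basic check that the rest of this tutorial performs with md5deep.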
How do I install md5deep?
The installation procedure varies by operating system. Consult the md5deep manual for your particular OS.
Which OS is best?
Many digital preservationists prefer to use open source Linux operating systems such as Mint or Ubuntu since they are cleaner environments for digital forensics.
Md5deep is installed. Now what?
The checksum process works through Unix commands in Terminal (Terminal is a pre-installed program on the Mac OS X and Linux operating systems). To generate a checksum for a file, first navigate to its directory by typing cd [directory name]
Once in the correct directory, typing “md5deep” followed by a filename will generate a hash. Typing an asterisk in place of the filename designates all files in that folder. In this example, the hash is the string of characters starting with ff229...
dams@dams-iMac:~/Desktop$ md5deep test.txt
ff22941336956098ae9a564289d1bf1b /home/dams/Desktop/test.txt
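For readers curious about what happens when a single file is hashed, the following Python sketch reads a file in chunks and feeds each chunk to an MD5 digest. It is an illustration built on the standard hashlib module, not md5deep's actual implementation. The function name md5_of_file, the chunk size, and the sample file are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "test.txt")
    with open(sample, "wb") as handle:
        handle.write(b"sample contents\n")
    # Print the hash followed by the path, similar to the output shown above.
    print(md5_of_file(sample), sample)
```

Reading in chunks matters for preservation work, where individual audio, video, or image files can be very large.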
The real power of md5deep comes with adding letters, or “flags”, to the command line. These flags perform different operations such as matching, recursively generating hashes, and estimating the time needed to generate large sets of hashes. For the DAMS's purposes, only a few flags (-r, -x, and -m) are used frequently.
Writing checksums to a file
Before discussing the flags, it’s important to know how to write a command’s output to a file. Users can either designate an existing file or let Terminal create the file for them.
There are three files on my Desktop I would like to generate hashes for:
By typing "md5deep *" I generated these three hashes.
dams@dams-iMac:~/Desktop$ md5deep *
7a803c643432ea1443e3dbd5ee14db26 /home/dams/Desktop/pogocat.gif
ff22941336956098ae9a564289d1bf1b /home/dams/Desktop/test.txt
8741172ab4d318c620f21d4c17f213ac /home/dams/Desktop/beyonce.jpeg
To write the hashes to a file, type md5deep * >> [filename]. The >> operator creates the file if it does not already exist and appends to the end of it if it does.
This command does not display any output in Terminal unless there are errors. The new file should then appear in your folder.
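The append behavior can be sketched in Python, where opening a file in mode "a" plays roughly the same role as >> in the shell. This is only an illustration under stated assumptions: the helper names md5_of_file and append_hashes, the file names, and the "hash, two spaces, path" line layout are choices made for this example, and the sketch deliberately skips the list file itself so it never records its own hash.

```python
import hashlib
import os
import tempfile


def md5_of_file(path):
    """Return the MD5 hash of a (small) file read in one go."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def append_hashes(folder, manifest_path):
    """Append one 'hash  path' line per regular file in folder; return the count."""
    manifest_abs = os.path.abspath(manifest_path)
    written = 0
    # Mode "a" creates the file if needed and otherwise appends, like >>.
    with open(manifest_path, "a", encoding="utf-8") as manifest:
        for name in sorted(os.listdir(folder)):
            path = os.path.abspath(os.path.join(folder, name))
            # Skip sub-directories and the list file itself.
            if not os.path.isfile(path) or path == manifest_abs:
                continue
            manifest.write(f"{md5_of_file(path)}  {path}\n")
            written += 1
    return written


# Demonstration with two throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as folder:
    for name in ("one.txt", "two.txt"):
        with open(os.path.join(folder, name), "w", encoding="utf-8") as handle:
            handle.write(name)
    manifest_file = os.path.join(folder, "known_hashes.csv")
    print(append_hashes(folder, manifest_file), "hashes written")
    with open(manifest_file, encoding="utf-8") as handle:
        print(handle.read(), end="")
```

Because the file is appended to rather than overwritten, running the same command twice records every hash twice, so it is worth starting a fresh list when a clean record is needed.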
Recursive Mode
The -r flag allows md5deep to run hashes on the contents of sub-directories, including any directories within that sub-directory. For example, on my desktop there's a directory called “checksum_test.” Within that directory there are four more sub-directories.
Simply typing md5deep checksum_test will not return any hashes. Instead, it will say:
dams@dams-iMac:~/Desktop$ md5deep checksum_test
/home/dams/Desktop/checksum_test: Is a directory
In order to give md5deep the permission to go into a folder (and that folder’s folders), the recursive flag needs to be added:
dams@dams-iMac:~/Desktop$ md5deep -r checksum_test
e73f683a4d79ed3067b7dfb1dd65cea5 /home/dams/Desktop/checksum_test/folder_3/6.3a_Counter_Problems.png
bf6ed70e53234ee33be89487cf05e4a4 /home/dams/Desktop/checksum_test/folder_1/6.1_Includes.png
e020e204de0d6988cd5e352c41d589d9 /home/dams/Desktop/checksum_test/folder_2/6.3b_Counter_Fix.png
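The idea behind recursive mode can be sketched in Python with os.walk, which visits a folder and every folder beneath it. This is an illustration of the concept only, not how md5deep is written. The helper names md5_of_file and hash_tree and the sample folder layout are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def md5_of_file(path):
    """Return the MD5 hash of a (small) file read in one go."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def hash_tree(root):
    """Map every file under root, at any depth, to its MD5 hash."""
    hashes = {}
    for current_dir, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(current_dir, name)
            hashes[path] = md5_of_file(path)
    return hashes


# Demonstration: a folder containing two sub-folders with one file each.
with tempfile.TemporaryDirectory() as root:
    for sub, name in (("folder_1", "first.txt"), ("folder_2", "second.txt")):
        os.mkdir(os.path.join(root, sub))
        with open(os.path.join(root, sub, name), "w", encoding="utf-8") as handle:
            handle.write(name)
    for path, digest in sorted(hash_tree(root).items()):
        print(digest, path)
```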
Matching
Md5deep also allows users to generate hashes and compare them to a pre-established list of hashes. There are two types of matching: positive and negative. Positive matching shows filenames with hashes that DO match, and negative matching shows filenames with hashes that DO NOT match. This is where you can see if the hash (and, most importantly, the file) has changed.
The syntax for positive matching is:
md5deep -m [known_hashes_file] [file_to_check]
The syntax for negative matching is:
md5deep -x [known_hashes_file] [file_to_check]
Remember that an asterisk signifies all files in that directory.
So let's say someone added something to the “test.txt” file after I wrote its hash to the “known_hashes.csv” file. When running a positive match, “test.txt” will not appear in the list of matches:
dams@dams-iMac:~/Desktop$ md5deep -m known_hashes.csv *
/home/dams/Desktop/checksum_test: Is a directory
/home/dams/Desktop/beyonce.jpeg
/home/dams/Desktop/pogocat.gif
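Conversely, if a negative match is run, “test.txt” will appear because its hash is no longer in the list of known hashes. Notice the second entry, which ends in a tilde. A file named this way is typically a backup copy of the previous version that some text editors save automatically.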
dams@dams-iMac:~/Desktop$ md5deep -x known_hashes.csv *
/home/dams/Desktop/checksum_test: Is a directory
/home/dams/Desktop/test.txt
/home/dams/Desktop/test.txt~
If you wish to show the hashes next to the file names, simply capitalize the flag:
dams@dams-iMac:~/Desktop$ md5deep -X known_hashes.csv *
/home/dams/Desktop/checksum_test: Is a directory
c2dff9abb840fa6a95b115b9eb25dbf2 /home/dams/Desktop/test.txt
d1de632acc13af5231c10f022a3c27c9 /home/dams/Desktop/test.txt~
dams@dams-iMac:~/Desktop$ md5deep -M known_hashes.csv *
/home/dams/Desktop/checksum_test: Is a directory
8741172ab4d318c620f21d4c17f213ac /home/dams/Desktop/beyonce.jpeg
7a803c643432ea1443e3dbd5ee14db26 /home/dams/Desktop/pogocat.gif
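The logic of positive and negative matching can be sketched in Python as a comparison between freshly computed hashes and a set of known hashes. This is a simplified illustration, not md5deep's own code. It assumes the known-hashes file has one entry per line with the hash as the first field, and the helper names (md5_of_file, load_known_hashes, split_matches) and file names are made up for the example.

```python
import hashlib
import os
import tempfile


def md5_of_file(path):
    """Return the MD5 hash of a (small) file read in one go."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def load_known_hashes(manifest_path):
    """Collect the first field (assumed to be the hash) of each non-empty line."""
    known = set()
    with open(manifest_path, encoding="utf-8") as manifest:
        for line in manifest:
            fields = line.split()
            if fields:
                known.add(fields[0].lower())
    return known


def split_matches(paths, known):
    """Return (matching, not_matching) lists of paths, compared by hash only."""
    matching, not_matching = [], []
    for path in paths:
        if md5_of_file(path) in known:
            matching.append(path)      # what a positive match would list
        else:
            not_matching.append(path)  # what a negative match would list
    return matching, not_matching


# Demonstration: record two hashes, then alter one of the files.
with tempfile.TemporaryDirectory() as folder:
    paths = [os.path.join(folder, name) for name in ("kept.txt", "edited.txt")]
    for path in paths:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("original contents\n")
    known_file = os.path.join(folder, "known_hashes.csv")
    with open(known_file, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(f"{md5_of_file(path)}  {path}\n")
    with open(paths[1], "a", encoding="utf-8") as handle:
        handle.write("a line added later\n")
    matching, not_matching = split_matches(paths, load_known_hashes(known_file))
    print("match:", [os.path.basename(p) for p in matching])
    print("no match:", [os.path.basename(p) for p in not_matching])
```

Note that this sketch compares hashes only and ignores file names, so a renamed but otherwise unchanged file would still count as a match.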
Matching can also be done recursively. Just be sure to put the recursive flag first.
dams@dams-iMac:~/Desktop$ md5deep -rm known_hashes.csv *