Tuesday, December 10, 2013

NSA Snoop and Error Proof Your Archives

An Englishman's Home is His Castle:

In 1763, William Pitt (the Elder) made a famous declaration of this right. "The poorest man may in his cottage bid defiance to all the force of the Crown.  It may be frail, its roof may shake, the wind may blow through it. The rain may enter. The storms may enter.  But the king of England may not enter.  All his forces dare not cross the threshold of the ruined tenement."

There is many a ruined castle in Europe.  You need to defend your castle if you want it to keep its value!

You should also consider the problem of data corruption, since a single bit error in an encrypted archive can render the whole thing unreadable.

Also remember this: Always tar a directory, never tar a bunch of files. 

It is extremely annoying when you get an archive from someone, untar it and end up with a million files scattered throughout your home directory...
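A quick way to check whether an archive follows this rule before unpacking it is to list it and look at the top-level names. The sketch below assumes a helper called top_level_entries (a made-up name, not a standard tool) that reads a tar listing on stdin:

```shell
#!/bin/bash
# top_level_entries (hypothetical helper): print the distinct top-level
# names found in a tar listing read from stdin.
# An archive made from a single directory prints exactly one name.
top_level_entries() {
    # Strip a leading "./", drop empty lines, keep the first path component
    sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}
```

For example, tar -tjf test.tar.bz2 | top_level_entries should print the single line test if the archive was made from one directory. More than one line means the files will land directly in whatever directory you untar it in.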

Encrypting an Archive with GPG

An easy way to secure your data against unwarranted NSA and GCHQ snoops is with gpg. Always use gpg before uploading archives to a public file server such as Dropbox, Copy or Evernote.

Using pipes, you can string the tape archiver tar, bzip2 and gpg together like this:
$ tar -cj test | gpg -c >test.tar.bz2.gpg

(or use gpg -er username for key based encryption)

and to reverse it:

$ gpg -d < test.tar.bz2.gpg | tar -xj

The key configuration of GPG is shown in another post. 


Forward Error Correction

A good way to protect your data against corruption is with a Reed-Solomon forward error correcting code, the kind of code that is used on CD-ROMs.  There is a little-known utility that does it, called rsbep, by Guido Fiala, and you can get the source using this link: http://www.filewatcher.com/_/?q=rsbep

Then build it the usual way with ./configure; make; sudo make install.  It is part of the BSD and Debian distributions.

Using pipes, you can string the tape archiver tar, bzip2, rsbep and gpg together like this:

$ tar -cj test | gpg -c | rsbep >test.tar.bz2.gpg.rs

(or use gpg -er username for key based encryption)

and to reverse it:

$ rsbep -d < test.tar.bz2.gpg.rs | gpg -d | tar -xj

The Reed-Solomon code will protect your archive against error bursts and will help to ensure that you can read the data back from a failing archive system many years later.


Parity Bits

There is also a utility called par2 (and the handy GUI PyPar2) which adds Reed-Solomon parity bits in a series of separate files.  These are in the par2cmdline and pypar2 packages.  This can be used to protect any files, but when you copy things around, you have to remember to pass the parity data along too.
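As a rough sketch of how the par2 commands fit together, assuming the par2cmdline package is installed: the function names protect and check_and_repair below are made up for illustration, while par2create, par2verify and par2repair are the commands that par2cmdline provides.

```shell
#!/bin/bash
# protect FILE (hypothetical helper): write FILE.par2 plus one recovery
# volume next to FILE. -n1 asks for a single recovery file, -q keeps it quiet.
protect() {
    par2create -q -n1 "$1"
}

# check_and_repair FILE (hypothetical helper): verify FILE against
# FILE.par2 and only attempt a repair if the verification fails.
check_and_repair() {
    par2verify -q "$1.par2" || par2repair -q "$1.par2"
}
```

Run both from the directory that holds the file, and keep the .par2 files together with the file they protect.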

Preventive Measures Using Par2

In order to ensure that I get into and stay in the habit of using gpg, I created a couple of scripts, so I can simply do:

$ targpg directory
$ untargpg directory

and the scripts will handle the messy details.

I tested the error recovery by corrupting the archive with hexedit - it works like magic!
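If you do not have hexedit handy, something like the following can simulate a bad byte. It is only a sketch: corrupt_byte is a made-up helper name, and it assumes od, dd and the bash printf builtin are available. Try it on a copy of the archive, not the original.

```shell
#!/bin/bash
# corrupt_byte FILE OFFSET (hypothetical helper): invert all bits of the
# single byte at the 0-based OFFSET, leaving the rest of FILE untouched.
corrupt_byte() {
    local file=$1 offset=$2 old new
    # Read the current byte as an unsigned decimal number
    old=$(od -An -tu1 -j "$offset" -N1 "$file" | tr -d ' ')
    # Nothing was read: the offset is past the end of the file
    [ -n "$old" ] || return 1
    new=$(( old ^ 0xFF ))
    # Write the inverted byte back in place; conv=notrunc stops dd
    # from truncating the file after the byte it writes
    printf "\\$(printf '%03o' "$new")" |
        dd of="$file" bs=1 seek="$offset" count=1 conv=notrunc 2>/dev/null
}
```

After corrupt_byte test.tar.bz2.gpg 1000 the checksum of the file should no longer match, which gives the repair step something to do. Running the same command a second time inverts the byte back again.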

Make an archive script like this called /usr/local/bin/targpg:
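#! /bin/bash
# Usage: targpg directory
echo "Make an encrypted archive of directory $1"
# Compress and encrypt with a passphrase
tar -cj "$1" | gpg -c > "$1.tar.bz2.gpg"
# Record a checksum so that corruption can be detected later
md5sum "$1.tar.bz2.gpg" > "$1.tar.bz2.gpg.md5"
# Create the par2 recovery data in a single recovery file
par2create -n1 "$1.tar.bz2.gpg"
ls -al "$1"*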

as well as untargpg:
#! /bin/bash
# Usage: untargpg directory
echo "Untar a GPG encrypted archive $1"

# md5sum -c exits non-zero on a mismatch or a missing .md5 file
if ! md5sum -c "$1.tar.bz2.gpg.md5"
then
  echo "MD5 error - Attempt a repair"
  par2repair "$1.tar.bz2.gpg.par2"
fi

# Try to decrypt and untar regardless
# because the md5 and par2 files may be missing
gpg -d < "$1.tar.bz2.gpg" | tar -xj
ls -al "$1"*

and then one day when your disk goes south, par2repair may save the day.

La voila!
