Friday, February 12, 2016

Mirror, Mirror on the Wall...

A Private Linux Mirror

Debian/Ubuntu mirroring is also described down at the bottom - it is super simple.

A Private Fedora Mirror

If you need to replicate Fedora based machines, then you need to set up your own rpm file mirror.  This allows you to automate the whole install with Kickstart off your own server on a LAN and you can then freeze your server at arbitrary points to facilitate a production run of identical machines.

The installation server can be an old laptop PC with a huge USB disk.  Reformat the disk to ext4 with gparted, since the file system must support UNIX permissions and links.  The file server doesn't have to be very fast.  To do an install, you only need this server machine, a big switch and a bunch of target machines.  The targets do a netboot, using DHCP and a web server such as lighttpd, and then install themselves with Kickstart (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Installation_Guide/ch-redhat-config-kickstart.html).

To make your own mirror server, you should set up an account with Fedora, so that you can get access to their servers and allow them to transparently redirect your machines to your own server if necessary.

Once you have opened an account, you can set up an rsync script to download and save only what you need.  The secret to success with rsync is the EXCLUDES file.  In there, list patterns of directory names and files to avoid.  For example, if you only want Fedora 22, exclude releases 4, 5, 6... 21 and 23, plus a few other junk things that will show up once you try it.

More details here: https://fedoraproject.org/wiki/Infrastructure/Mirroring#Mirroring

FAS Account

Open an account here: https://admin.fedoraproject.org/mirrormanager - Without an account, you won't be able to download anything with rsync.

Create a new Site, for example mirrors, and set the password to something secure.  Specify the Organization URL if you have one, be sure to select the Private checkbox and then save the site.

Now create a Host with a FQDN of fedora.example.com.  Set the Country code to US, again make sure the Private checkbox is enabled and save the Host.  Once saved, add a new Site-local Netblock.  Go to http://whatismyip.com to find your public IP address and make a Netblock from it with a /32 suffix, for example 1.2.3.4/32.

Once this is saved, still under the Host setting, add a new Category.  This tells the Mirror Manager what categories of software this host carries, for example Fedora Linux.  Also add a URL serving the content, such as http://fedora.example.com/mirrors, which you need to use in your Lighttpd or Apache web server setup.

Note that if you don't want to do the FAS thing, you can do the same as below with any other Fedora mirror close by at a university or Telco that you trust, but you'll have to research the excludes list and rsync group carefully.

For a full public mirror list, click on something in the matrix here:
https://admin.fedoraproject.org/mirrormanager


e.g.:
https://admin.fedoraproject.org/mirrormanager/mirrors/Fedora/22/aarch64


and

http://ftp-stud.hs-esslingen.de/pub/fedora/linux/releases/22/

Excludes

The most important thing is the excludes file and nobody ever tells you what needs to be in there, which prompted me to write this article.  If this file is not good, then rsync will download everything since the abacus was invented and your disk drive is bound to fill up.  The file excludes.list below will exclude everything but Fedora 22:

*/alt*
*/archive*
*/epel*
*/fedora-secondary*
*/atomic*
*/core*
*/development*
*/extras*
*/test*
*/Docker*
*/Cloud*
*/releases/4/*
*/releases/5/*
*/releases/6/*
*/epel-release-latest-5*
*/epel-release-latest-6*
*/epel-release-latest-7*
*/releases/7/*
*/releases/8/*
*/releases/9/*
*/releases/10/*
*/releases/11/*
*/releases/12/*
*/releases/13/*
*/releases/14/*
*/releases/15/*
*/releases/16/*
*/releases/17/*
*/releases/18/*
*/releases/19/*
*/releases/20/*
*/releases/21/*
*/releases/23/*

The only release not in there is 22.  As you can see, I also don't want Cloud and Docker schtuff, but I do want Arm, i386 and X64 - you may want to tweak it some more.
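The numbered release lines are repetitive, so they can be generated instead of typed.  Here is a minimal sketch, assuming the releases on the server run from 4 to 23 as in the list above.  The function name release_excludes is made up for illustration:

```shell
# Print one exclude pattern per Fedora release, skipping the one to keep.
# Usage: release_excludes KEEP FIRST LAST   (hypothetical helper name)
release_excludes() {
    keep="$1"
    n="$2"
    last="$3"
    while [ "$n" -le "$last" ]; do
        if [ "$n" -ne "$keep" ]; then
            printf '*/releases/%s/*\n' "$n"
        fi
        n=$((n + 1))
    done
}

# Keep release 22, exclude 4 to 21 and 23.
release_excludes 22 4 23
```

Append the output to the file with release_excludes 22 4 23 >> excludes.list and adjust the range when newer releases show up on the server.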

You can see all the directories you may need to exclude by trolling up and down the tree of the mirror server here: https://dl.fedoraproject.org/pub/fedora/linux/releases/

Rsync vs Wget

The mirroring is recommended to be done with rsync, which will download a group on the server called fedora-enchilada (which plays on 'the whole enchilada' - everything since the abacus).

I have in the distant past made mirrors using wget, but rsync is more efficient and easier to control.  Rsync will honour the excludes list and will not traverse outside its designated directory, but wget will invariably start to walk across to other directories at the same level, thereby downloading more files than it was supposed to, so you have to keep an eye on it and quit it when you think that it is done with what you want (or set it to one directory deeper than what you actually want - like the example below and then let it be).

If you want to use wget, here is an example:
$ wget --continue --recursive --no-parent --no-clobber \
    http://ftp-stud.hs-esslingen.de/pub/fedora/linux/releases/22/Live/x86_64

The above wget script will also download other directories under the Live subdirectory, not only the x86_64 one, so you have to watch it.  Wget also has an exclude directive that doesn't work.  Despite these issues, it does work and can be used to download a mirror that doesn't support rsync, or for which you don't have a download account.
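One possible reason for the wandering is the missing trailing slash.  The wget manual notes that --no-parent treats the last path component as a file name unless the URL ends in a slash, so in the example above Live/ would be the directory that wget stays within.  Here is a minimal sketch that normalises the URL first.  The helper name dir_url is invented for illustration and this has not been tested against that particular mirror:

```shell
# Make sure a directory URL ends in a slash, so that wget --no-parent
# treats the last path component as a directory and not as a file.
dir_url() {
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

url="$(dir_url 'http://ftp-stud.hs-esslingen.de/pub/fedora/linux/releases/22/Live/x86_64')"

# Print the command instead of running it, so it can be checked first.
echo wget --continue --recursive --no-parent --no-clobber "$url"
```

If the printed command looks right, paste it into the terminal and still keep an eye on it for the first few minutes.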

Mirror Script

For testing the rsync script below, I made a mirror directory tree in my home directory, ~/mirrors.

In production, this tree needs to be in the web server root, which is usually /var/www, so that it can serve the files to Kickstart and dnf.


The rsync script called mirror.sh looks like this:

#! /bin/bash
# Mirror Fedora 22 only - at least, that is the idea
# See the excludes.list file - rsync will download everything, except for the patterns in this file
# See the mirroring wiki for details: https://fedoraproject.org/wiki/Infrastructure/Mirroring#Mirroring

export EXCLUDES="excludes.list"

rsync -vaH --exclude-from="${EXCLUDES}" \
 --numeric-ids --delete --delete-after --delay-updates \
 rsync://dl.fedoraproject.org/fedora-enchilada ~/mirrors


Make this file executable with chmod 755 mirror.sh, run it and see what happens.

The first thing rsync does is to download the files list and build a directory tree in ~/mirrors.  While rsync runs, view this growing tree and make sure that it only includes what you want and that unwanted directories remain empty.
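One way to keep an eye on the growing tree is to list the biggest directories every now and then.  This is only a sketch, assuming the ~/mirrors test location used here and a du that supports the -d depth option (GNU and BSD du do).  The name tree_sizes is made up:

```shell
# Show the 20 largest directories (in kilobytes), two levels deep,
# below the given mirror directory.  Largest first.
tree_sizes() {
    du -k -d 2 "$1" | sort -rn | head -n 20
}

# Only look at the test mirror if it already exists.
if [ -d "$HOME/mirrors" ]; then
    tree_sizes "$HOME/mirrors"
fi
```

Run it in a second terminal while rsync is busy.  A directory that should have stayed empty will float to the top of the list quickly.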

If there is a growing pile of files that you don't want, press Ctrl-C to quit the script, add a pattern to the excludes.list file, delete the junk and try again.  Don't leave the machine alone until you are sure that you get only what you want and no more, or you may end up with a terabyte of useless files.
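A cheaper way to test a changed excludes.list is an rsync dry run.  With --dry-run, rsync reports what it would transfer or delete, but writes nothing to disk.  The sketch below only prints the command from mirror.sh with that option added.  The function name mirror_cmd is made up, and rsync still has to fetch the file list from the server, so a dry run is not instant:

```shell
# Print the rsync command from mirror.sh, with --dry-run added on request.
mirror_cmd() {
    # $1 = "dry" for a dry run; anything else prints the real command
    dry=""
    if [ "$1" = "dry" ]; then
        dry="--dry-run"
    fi
    echo rsync -vaH $dry --exclude-from=excludes.list \
        --numeric-ids --delete --delete-after --delay-updates \
        rsync://dl.fedoraproject.org/fedora-enchilada "$HOME/mirrors"
}

mirror_cmd dry
```

Paste the printed line into the terminal, page through the output and only run the real script once the list contains nothing but what you want.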



A Private Ubuntu Mirror

On Ubuntu mirrors, all the files are stored in a single directory called pool.  In there, you find all versions of everything.  The releases are controlled through a system of Master Record Index files that list everything about every file in a release.  These index files are zipped up and kept in a directory tree using the release names like Trusty or whatever.  These files will keep your CM manager as happy as a piggy in a mud bath.

The problem with this system is that you cannot replicate an Ubuntu mirror with rsync, unless you copy everything since Adam invented the Abacus, which is about 800 GB.  To get only the files belonging to a specific release, you need a utility that can parse the index files.  This utility is called apt-mirror and it will download about 100 GB of executables for the Trusty release.

Apt-mirror does exactly what is written on the tin.  It Just Works (TM).



The easiest way to run apt-mirror is to make a server with the same release as what you want to mirror and install the package apt-mirror.  You then only need to change one single line in /etc/apt/mirror.list to point to the place where you want to keep all the files (the directory must exist) and run apt-mirror.  That is all there is to it.
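The line to change is the base_path setting.  Here is a minimal sketch of doing that edit with sed, assuming the file has a (possibly commented out) set base_path line, as the stock mirror.list does.  The function name and the /srv/apt-mirror path are made-up examples, and the demo works on a scratch file, not on /etc/apt/mirror.list:

```shell
# Point apt-mirror's base_path at a new directory.
# $1 = mirror.list file, $2 = directory to keep the mirror in.
# A backup of the original is left next to it as FILE.bak.
# The directory name must not contain '|' or '&' (sed would misread them).
set_base_path() {
    sed -i.bak "s|^#* *set base_path .*|set base_path    $2|" "$1"
}

# Demo on a scratch file that mimics the stock setting.
demo="$(mktemp)"
printf '# set base_path    /var/spool/apt-mirror\n' > "$demo"
set_base_path "$demo" /srv/apt-mirror
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On the real file you need to be root, and the target directory has to exist before apt-mirror runs.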

The most important thing with mirroring is to avoid using a Seagate USB disk that shuts itself down every once in a while.

What I eventually did was to uncomment the lines in mirror.list one by one and save the files on USB sticks.  The first line for main needs about 60 GB, which fits on a modern 64 GB schtick. The following lines of updates and security fixes require about 30 GB storage, about 90 GB total for the compiled code and goodness knows how much for the C-code.

Downloading 100 GB will take about 2 days on a typical 4 Mbps home fibre net, vs weeks on a typical overloaded corporate network.
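The arithmetic behind that estimate is simple enough for the shell.  This is a rough sketch that uses decimal units (1 GB = 8000 megabits), ignores protocol overhead and assumes the line runs flat out the whole time:

```shell
# Estimate the download time in whole hours.
# $1 = size in GB, $2 = line speed in Mbps.
download_hours() {
    echo $(( $1 * 8000 / $2 / 3600 ))
}

download_hours 100 4   # prints 55
```

55 hours is a bit more than 2 days, so in practice it is safer to plan for 3.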

Note that you can interrupt and restart apt-mirror.  It will figure out what happened and carry on where it left off without a complaint.

Apt-mirror is pretty robust.  I managed to fill up my USB thingies to the max on an Ubuntu VM, then copied them together with rsync -a to a larger SD card, made a raw device file for VirtualBox so it could access the SD card, mounted the SD card in the same path and carried on with apt-mirror and finally copied the SD card to my Fedora mirror server.  All done with nary a hiccup.


GPG - InRelease Clearsigned file isn't valid, got 'NODATA'

This error drove me up the wall.  The InRelease file seems to be somehow corrupted by apt-mirror. The solution is to delete the InRelease file from the mirror server.
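A mirror can hold more than one InRelease file, so find is handy here.  This is a sketch only.  The function name is made up, and the path to pass in depends on where your mirror.list keeps the files:

```shell
# Delete every file named InRelease below the given mirror directory
# and print the name of each file that gets removed.
remove_inrelease() {
    find "$1" -type f -name InRelease -print -delete
}
```

For example, remove_inrelease /var/spool/apt-mirror/mirror, assuming apt-mirror's default base_path was left unchanged.  The Release and Release.gpg files are left alone, and apt should then use those.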

Another one of life's little mysteries...

Seagate USB Disks

Those infernal Seagate USB disks need something like this command to stay awake:
$ sudo sdparm --clear STANDBY -6 /dev/sdb -S

In addition, if you are running a Virtualbox virtual machine, do not use a USB3 port (yellow) for an external disk.

La voila!

Happy mirroring...

Herman
