Medium- and large-sized FreeBSD sites need a faster, easier way of installing new systems and preparing them for production use. Hand-installing each machine using the standard sysinstall(8) tool is tedious. Using current tools and technology, system installation can be quick and painless. Either by floppy or network boot, an automated installation tool can download and extract a full system installation with little or no human intervention.
With the advent of cheap PCs, rack-mount cases, and affordable high-speed networking, developing a large pool of processors is an economical alternative to purchasing a few multi-processor, multi-gigabyte machines to operate a large Internet web site. These cheap, independently operating processors facilitate complex, fault-tolerant software systems. As each machine runs its own instance of the operating system, however, maintaining each individual system is quite a task.
When machines are bought by the rack, installing each one using sysinstall(8) is a tedious and time-consuming task. Thankfully, systems already exist to automate system installations. This paper explores the components available to automate system installation, including floppy-based automatic installers and network boot, to reduce the physical effort necessary to bring new systems into production.
When installing large numbers of systems, spending as little time as possible on each unit is critical. In many cases, the machines do not have keyboards or displays. The best installation method for this environment requires no human intervention beyond turning on or resetting the machine before and after the installation. As most PCs ship with the floppy drive first in the boot sequence, simply inserting a floppy and pressing the reset button is an excellent method. However, floppy disks have an annoying tendency to fail, so another method is preferable. Most server-class systems based on Intel motherboards also offer network boot as a final fallback if the hard disk cannot be booted, which is always the case for brand-new systems.
The floppy disk has served as the de facto standard of PC booting since the inception of the platform. Nearly every machine has a floppy drive and knows how to boot from it.
The floppy is also the PC's Achilles heel. The fundamental technology of floppy disks has changed little over the last 20 years. The 1440 kilobyte 3.5 inch floppy disk is an industry standard and is unlikely to change, even in the day of 75 gigabyte hard disks. Floppies are extremely unreliable: they fail frequently, and usually when least desired. Floppy disks have an average life of around 24 hours (from personal experience), and will usually fail when used in multiple machines, such as for installation disks.
It is high time that an alternative remote booting method was designed for PC hardware. One of the few universal technologies in PC BIOS implementations is support for network boot systems to hook the BIOS boot sequence. All that is needed is a properly designed network card and option ROM that can intercept the boot sequence.
A floppy-based install disk is handy for installing to machines without network boot capability. Building images by hand is somewhat of a black art, so keep the voodoo dolls handy while building them. The procedure is detailed in the Makefiles for the FreeBSD release build and in the PicoBSD single-floppy build scripts.
Building disk images requires use of the vn(4) device; make sure pseudo-device vn is declared in the kernel or the vn.ko kernel module is loaded.
Because of the limited space on floppies, the system files are stored in a compressed filesystem image, called mfsroot, which loader(8) pre-loads before booting the kernel. The kernel then mounts the image as the root filesystem. Alternatively, the mfsroot image can be inserted directly into the kernel binary. Sizing the mfsroot image is difficult, as it must be large enough to accommodate the necessary files but small enough to compress to fit on the floppy along with a kernel and loader(8). A good starting size is 2400 kilobytes. Empty space on the image compresses very well, so erring on the large side costs little.
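The sizing trade-off can be explored before any real image exists. The following is a minimal Python sketch, not part of the release tooling: it compresses an all-zero buffer standing in for an empty 2400 kilobyte mfsroot and checks the result against the floppy capacity. The function names and the kernel and loader sizes passed in are hypothetical, and filesystem overhead on the floppy is ignored.

```python
import gzip

FLOPPY_KB = 1440    # capacity of a 3.5 inch floppy
MFSROOT_KB = 2400   # starting image size suggested in the text


def compressed_kb(image: bytes) -> float:
    """Return the gzip-compressed size of an image in kilobytes."""
    return len(gzip.compress(image, compresslevel=9)) / 1024


def fits_on_floppy(image: bytes, kernel_kb: float, loader_kb: float) -> bool:
    """Check compressed image + kernel + loader against the floppy capacity.

    kernel_kb and loader_kb are the already-compressed sizes of those files;
    the values used by callers here are placeholders, not measurements.
    """
    return compressed_kb(image) + kernel_kb + loader_kb <= FLOPPY_KB


# An all-zero buffer stands in for a freshly created, still empty mfsroot.
empty_image = bytes(MFSROOT_KB * 1024)
print(f"empty {MFSROOT_KB} KB image compresses to "
      f"{compressed_kb(empty_image):.1f} KB")
```

Running the same check against the populated image after each change gives early warning when the floppy budget is about to be exceeded.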
The file src/release/scripts/doFS.sh in the source tree contains the gritty details of building the actual image file. The process involves tricking disklabel(8) into labeling a zeroed file attached through the vn(4) device, then running newfs(8) on the result. This must be done twice--once for the mfsroot and again for the floppy itself (unless the files are copied directly to a real floppy disk which is subsequently imaged using dd(1)). Note that the floppy itself must be labeled with the bootblocks so it will boot. The method in doFS.sh and PicoBSD adds the bootblocks.
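The sequence described above can be written down as a dry-run plan. The Python sketch below only prints commands; it does not run them. The exact arguments are an assumption modeled on common FreeBSD 4.x usage of dd(1), vnconfig(8), disklabel(8), and newfs(8), and doFS.sh remains the authority. The function name, image names, and vn unit are hypothetical.

```python
def image_build_plan(image, size_kb, unit="vn0", label="auto", bootable=False):
    """Return the commands, in order, for creating one empty filesystem image.

    Assumed flags (check against doFS.sh before use):
      disklabel -r -w  write a label directly to the device
      disklabel -B     additionally install bootblocks (floppy image only)
    """
    label_cmd = ["disklabel", "-r", "-w"]
    if bootable:
        label_cmd.insert(1, "-B")
    return [
        # 1. Create a zeroed file of the requested size.
        ["dd", "if=/dev/zero", f"of={image}", "bs=1k", f"count={size_kb}"],
        # 2. Attach the file to a vn(4) unit so it looks like a disk.
        ["vnconfig", "-s", "labels", "-c", unit, image],
        # 3. Label the pseudo-disk.
        label_cmd + [unit, label],
        # 4. Create the filesystem; mounting and copying files would follow.
        ["newfs", f"/dev/{unit}c"],
    ]


# The process runs twice: once for the mfsroot, once for the floppy itself.
for image, size_kb, label, bootable in [
    ("mfsroot", 2400, "auto", False),
    ("floppy.img", 1440, "fd1440", True),
]:
    for argv in image_build_plan(image, size_kb, label=label, bootable=bootable):
        print(" ".join(argv))
```

Keeping the plan as data makes it easy to review the command order before anything touches a real device.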
Once the mfsroot image is constructed, copy anything and everything needed to get the system running, from /etc/rc on down. Since this is a new environment, a skeleton /etc/rc will do fine. The kernel and /boot are located outside the mfsroot image.
Once the mfsroot is populated, configure and build the smallest kernel required. Remove everything that isn't absolutely necessary. Note that some options that seem unnecessary are in fact required, such as options INET. Some trial and error may be needed; consult the BOOTMFS kernel config file in the release build directory and various PicoBSD kernels for hints. Once an appropriate kernel is built, gzip it and place it on the floppy.
Finally, install loader(8) and related files in /boot. loader(8) itself is quite bloated when Forth support is compiled in, but using the NOFORTH compile option in sys/boot/i386/Makefile will shrink it significantly. Use kgzip(8), not gzip(1), to compress the loader and place it on the floppy. Then rewrite loader.rc to load only the kernel and mfsroot image, unless enough space exists on the floppy to copy all of the .4th files and /boot/loader.conf over.
Needless to say, generating a floppy from scratch is a long and painful process, fraught with danger and frustration. Shoehorning the necessary files into the mfsroot image along with a bloated kernel and loader(8) is a difficult task. Placing the mfsroot image directly in the kernel binary can remove the need for loader(8), but space is still at a premium.
Intel's Preboot Execution Environment (PXE) is a fast and easy way to create netboot-aware systems. PXE provides hardware-abstracted drivers and a DOS-style environment for loading boot programs over a network.
PXE, at the core, is an updated version of the BOOTP & TFTP ROM loaders for earlier cards. Instead of BOOTP, PXE supports DHCP for host address configuration. A built-in TFTP client fetches the boot program from a specified server and runs it. The boot program can then take whatever action is required to boot the system, including making calls to the PXE TFTP client to fetch additional files.
Thanks to the hard work of John Baldwin and Paul Saab, a version of the FreeBSD loader(8) runs in the PXE environment, fetching the kernel and modules either over TFTP or NFS. All of the features of loader(8) are available, including the Forth interpreter. This allows for great flexibility in developing netboot solutions for FreeBSD systems. pxeboot(8) appears in FreeBSD 4.1-RELEASE and later (although it was originally committed to 4-STABLE some time before release).
PXE is found on Intel's EtherExpress Pro/100 series PCI cards and onboard Ethernet adapters, and on 3Com 3c905c PCI Ethernet adapters. Most Intel EtherExpress cards require a firmware upgrade to work around bugs in the PXE firmware; this is available from Intel's support site for the EtherExpress PCI cards or by downloading the PXE SDK. There is not much information available for the 3Com equipment other than that PXE is present.
In a pxeboot(8)-enabled netboot, the system loads several files in stages, and it can be confusing what is pulled from where. The following steps lead to a successful FreeBSD boot over a network:
1. The PXE-enabled machine is started; PXE queries for a DHCP lease and receives one with netboot information, such as the boot file, the server to fetch it from, and the path to the "root" filesystem.
2. PXE TFTPs the boot file from the specified server, which defaults to the DHCP server if not given. In this case the boot file would be pxeboot(8).
3. PXE executes pxeboot(8).
4. pxeboot(8) starts up, relocates itself, and jumps into the loader component. loader(8) mounts the NFS filesystem specified by the root-path option and fetches /boot/loader.rc from the "root" filesystem using NFS. The file must be located on the server specified by next-server, or on the DHCP server if next-server is not given. (loader(8) can also use TFTP to fetch files; this is a compile-time option.)
5. loader(8) follows the instructions in loader.rc, including loading additional script files from the next-server, the kernel binary, root filesystem image, and modules.
6. Once loading is complete, the kernel is started and proceeds as in a normal from-disk boot, with the root filesystem image mounted as the root partition once the kernel goes multi-user.
A large source of confusion is which files are fetched from which server. For simplicity, everything (DHCP, TFTP, and NFS) should come from one server. If this is not possible, the DHCP server can be split from the TFTP and NFS server.
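The server selection rule can be stated compactly. The Python sketch below is an illustration of the behavior described above, not code from pxeboot(8): the Lease type and fetch_plan function are hypothetical names, and the addresses are placeholders. It assumes the default NFS build of the loader.

```python
from typing import List, NamedTuple, Optional, Tuple


class Lease(NamedTuple):
    """The netboot-relevant parts of a DHCP lease (hypothetical structure)."""
    dhcp_server: str
    filename: str                      # boot file fetched by PXE over TFTP
    root_path: str                     # virtual root mounted by the loader
    next_server: Optional[str] = None  # optional separate boot server


def fetch_plan(lease: Lease) -> List[Tuple[str, str, str]]:
    """Return (protocol, server, path) for the first two fetches of a netboot.

    Both the TFTP and the NFS fetch go to next-server when it is set and
    fall back to the DHCP server otherwise.
    """
    boot_server = lease.next_server or lease.dhcp_server
    return [
        ("TFTP", boot_server, lease.filename),
        ("NFS", boot_server, lease.root_path.rstrip("/") + "/boot/loader.rc"),
    ]


single = Lease("10.0.0.1", "pxeboot", "/usr/netboot")
split = Lease("10.0.0.1", "pxeboot", "/usr/netboot", next_server="10.0.0.2")
for lease in (single, split):
    for step in fetch_plan(lease):
        print(step)
```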
As pxeboot(8) is a modified form of loader(8), the same configuration items apply as to an unmodified loader. The boot files are pulled from a "virtual root" on another machine, which allows the boot configuration to differ from that of the NFS server itself. pxeboot(8) also supports using TFTP to fetch files; this is a compile-time option controlled in src/sys/boot/i386/loader/Makefile. No additional options are required for pxeboot(8), as the server and virtual root filesystem are retrieved from DHCP.
To serve PXE clients, the server simply needs to provide the client with the boot filename and next-server if applicable. For ISC DHCPD, the following lines configure the PXE client:
DHCP Server Configuration Directives

filename          Tells the client what file to pull via TFTP.
next-server       Tells the client where to pull the filename from. If not specified, defaults to the DHCP server.
option root-path  Virtual root NFS partition for loader to fetch the boot files from.
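For sites with many clients, generating the per-host declarations is less error-prone than typing them. The Python sketch below renders an ISC DHCPD host declaration using the filename, next-server, and option root-path statements. The function name, host name, MAC address, IP addresses, and paths are all placeholders chosen for illustration.

```python
def render_pxe_host(name, mac, address, filename, root_path, next_server=None):
    """Render one ISC dhcpd.conf host declaration for a PXE client."""
    lines = [
        f"host {name} {{",
        f"  hardware ethernet {mac};",
        f"  fixed-address {address};",
        f'  filename "{filename}";',            # boot file pulled via TFTP
    ]
    if next_server is not None:
        # Omitted entirely when the DHCP server also serves the boot files.
        lines.append(f"  next-server {next_server};")
    lines.append(f'  option root-path "{root_path}";')  # virtual root on NFS
    lines.append("}")
    return "\n".join(lines)


stanza = render_pxe_host(
    name="node01",                # placeholder host name
    mac="00:a0:c9:00:00:01",      # placeholder MAC address
    address="10.0.0.101",
    filename="pxeboot",
    root_path="/usr/netboot",     # placeholder virtual root path
    next_server="10.0.0.1",
)
print(stanza)
```

The output can be appended to dhcpd.conf or written to a file pulled in with an include statement.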
The TFTP server requires no special configuration other than enabling it in inetd.conf and copying the pxeboot(8) binary from /boot/pxeboot into /tftpboot.
The virtual root should contain a boot/ directory, a kernel, and (most likely) a compressed filesystem image. A copy of the system boot/ directory serves as a good start. Modify /boot/loader.conf as appropriate for the netbooting machines. Copy the appropriate kernel for the machines into the virtual root, along with an mfsroot image containing the root filesystem for the netbooted machines. Finally, make sure the potential clients can mount the filesystem using NFS over UDP.
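A quick consistency check on the virtual root catches missing files before a client fails halfway through its boot. The Python sketch below is one way to do this; the REQUIRED list reflects the layout described above, but the file name mfsroot.gz and the function name are assumptions, and the demonstration uses a temporary directory rather than a real export.

```python
import os
import tempfile

# Paths relative to the virtual root. "mfsroot.gz" is an assumed name for
# the compressed filesystem image; adjust to match loader.conf.
REQUIRED = ["boot/loader.conf", "boot/loader.rc", "kernel", "mfsroot.gz"]


def missing_entries(root, required=REQUIRED):
    """Return the required paths that do not exist under the virtual root."""
    return [p for p in required if not os.path.exists(os.path.join(root, p))]


# Demonstration against a throwaway directory with an incomplete layout.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "boot"))
    for name in ("boot/loader.rc", "kernel"):
        open(os.path.join(root, name), "w").close()
    demo_missing = missing_entries(root)

print("missing:", demo_missing)
```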
The FreeBSD system installation utility, sysinstall(8), can be coerced into performing automated installs. sysinstall(8) contains a (very rudimentary) scripting language which can partition, slice, format, and install disks. Thanks to the work of Alfred Perlstein, magic incantations allow sysinstall(8) to run totally unattended, which is ideal for installing large numbers of systems. Combined with pxeboot(8), creating a netbooting, self-installing environment is easy.
The script file format is documented in the sysinstall(8) man page, which is not installed with the system. The man page can be found in the source tree in src/release/sysinstall/sysinstall.8 and formatted by hand using groff(1) or copied into the appropriate man directory.
The script file is a list of arguments and commands. Order matters, so great care is required in constructing functional files. Extensive testing is highly recommended.
For local changes, a package with pre- and post-install scripts works nicely. As the packages are installed in the order listed, simply list the custom package last and all the tools installed earlier will be available.
A sample script file is available on Alfred's web page and in the source repository. sysinstall(8) can be built to automatically look for the script file on start.
The NFS server housing the FreeBSD install files should look like the FTP site or CDROM, with the root directory containing the distribution directories (bin/, sbin/, etc.) and packages in the packages/ directory. The custom package built for post-install use must be placed in the packages/ directory and an entry added for it to the ports/INDEX file.
PicoBSD, the single-floppy build system that ships with FreeBSD, serves as a handy launching point for assembling the MFS image file. PicoBSD takes care of generating the images with the necessary magic, assembling a crunched binary, and setting up the boot configuration.
The PicoBSD install floppy type came about as a way to seamlessly install new systems in a datacenter. At the time pxeboot(8) had not been developed yet. The sample install script on the install disk sets up the disk with fdisk(8) and disklabel(8), then extracts a tarball from a server over NFS or FTP. Once the image is extracted, the script fixes up the fstab(5) file, selects a kernel, and applies any other local fixes. The tools fit on a 1.44MB floppy with a FreeBSD 3.2-RELEASE system. At the present time this disk is broken in 4.1-RELEASE but work is underway to bring it back to functionality.
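The fstab(5) fix-up step is needed when the image machine and the target use different disk devices. The Python sketch below shows one possible form of that fix-up; it is not the PicoBSD install script, which is a shell script, and the function name, sample table, and device names are hypothetical.

```python
import re

SAMPLE = """\
# Device        Mountpoint  FStype  Options  Dump  Pass#
/dev/da0s1b     none        swap    sw       0     0
/dev/da0s1a     /           ufs     rw       1     1
proc            /proc       procfs  rw       0     0
"""


def retarget_fstab(text, old_disk, new_disk):
    """Rewrite device names in an fstab from one disk to another.

    Only the device field at the start of a line is touched, and only when
    the disk name is followed by a slice (s1, s2, ...), so that "da1" does
    not accidentally match "da10". Comments and other lines pass through.
    """
    pattern = re.compile(r"^(\s*)/dev/" + re.escape(old_disk) + r"(?=s\d)")
    out = []
    for line in text.splitlines():
        out.append(pattern.sub(lambda m: m.group(1) + "/dev/" + new_disk, line))
    return "\n".join(out) + "\n"


# Image built on a SCSI disk (da0), installed onto an IDE disk (ad0).
fixed = retarget_fstab(SAMPLE, "da0", "ad0")
print(fixed, end="")
```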
Creating the tarball image is rather simple. As with most UNIX commands, the correct command-line options to tar(1) are key. The source of the image is a running system. This has two advantages--one, validating that the image is correct is easy, as the target software can run on the image machine, and two, if the image machine is destroyed, it is easily reconstituted by simply installing another machine.
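The main concern when imaging a running system is leaving out trees that should not be copied while keeping their mount points. The Python sketch below illustrates that idea using the standard tarfile module in place of tar(1); it is not the command used to build the original images. The exclusion list and function name are assumptions, and the demonstration runs against a small temporary tree rather than a live root filesystem.

```python
import os
import tarfile
import tempfile

# Top-level directories whose contents are skipped (assumed list). The
# directories themselves are kept so the mount points exist after extraction.
EXCLUDE = ("proc", "tmp", "mnt")


def make_image(root, tarball, exclude=EXCLUDE):
    """Archive the tree under root into a gzip-compressed tarball."""
    def keep(info):
        parts = [p for p in info.name.split("/") if p not in ("", ".")]
        if len(parts) > 1 and parts[0] in exclude:
            return None  # drop anything below an excluded directory
        return info

    with tarfile.open(tarball, "w:gz") as tar:
        # arcname="." stores relative paths so the image extracts anywhere.
        tar.add(root, arcname=".", filter=keep)


# Demonstration on a throwaway tree.
with tempfile.TemporaryDirectory() as work:
    root = os.path.join(work, "root")
    os.makedirs(os.path.join(root, "etc"))
    os.makedirs(os.path.join(root, "tmp"))
    open(os.path.join(root, "etc", "rc"), "w").close()
    open(os.path.join(root, "tmp", "junk"), "w").close()
    tarball = os.path.join(work, "image.tar.gz")
    make_image(root, tarball)
    with tarfile.open(tarball) as tar:
        demo_names = sorted(tar.getnames())

print(demo_names)
```

The tarball is written outside the tree being archived so that it cannot include itself.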
FreeBSD provides three components to ease automated installation of systems--sysinstall(8)'s internal scripting, PXE-based network booting, and install floppies. Scripting sysinstall(8) is the easiest way to start, as it involves simply writing a script file and placing it on one of the install floppies. However, this approach provides the least control, requires two floppies and a disk change to start, and takes considerable time and energy to test and debug. Where network booting is available, it simplifies and speeds both development and execution of the installation. Finally, simple scripts can replace most of the necessary functionality of sysinstall(8) during the install process. Generating the in-memory filesystem image is the single most difficult part of developing a custom install. In the end, however, the time and energy saved when installing a large number of systems with an automated tool is significant and worth pursuing for any medium- or large-sized site.