Introduction

FreeBSD presently has a monolithic /etc/rc. A large number of people want to change this (and some people stay away from FreeBSD just because of this), but there is no clear plan forward. This document is an attempt to gather information about what is needed from such a system, provide reference information for the implementors, and collect the community's thoughts about it. It is presently written in a form which indicates me as author (first person) - I'll change this if I get any large contributions that make it inappropriate.

Eivind.

P.S. Thanks to David Wolfskill for picking parts of this to pieces and contributing more requirement points - you can do this too! ;-)

Requirements

There are a bunch of different requirements and nice-to-haves - I suspect we can't satisfy all of them, but unless we actively detail them, each will only turn up as "hey! we can't do that!" when somebody presents a scheme, and will keep us at the present system.

I'm presenting these in a somewhat random order, numbering them as I go. I may add subgroups etc. later; the numbering is intended to give anybody attempting an implementation an easy way of referencing the requirements in their discussion/presentation of the scheme. This is not a prioritized list.

(1) It should be easy for an administrator to get an overview of what is started and in what order.
(2) It should be easy for an administrator to get an overview of the present system configuration - i.e., the values of system variables (as you can presently get by looking at /etc/rc.conf, if you are fully familiar with the system).
(3) It should be trivial for a program to add or change something in the configuration - e.g., add a database that needs to run before sendmail, or replace sendmail with qmail.
(4) Programs should not have to edit configuration files to accomplish (3). They should be able to do it by creating files, symlinks, or talking to some sort of daemon that avoids the problems.
(5) It should be possible to set variables in the same way as described for ordering under (4). (This is probably in the nice-to-have category).
(6) It should be possible to have a machine come up in different configurations - e.g., to have a laptop use PPP when at home, but use the Ethernet when docked at work.
(7) It should be possible to create dynamic configurations that will work identically or similarly on different hardware - e.g., different Ethernet cards. (This is from my personal requirements, but I suspect it covers other people's needs, too - I'll keep using a custom system if the final system doesn't support this).
(8) It should be possible to restart services etc. without forcing the system administrator to know the arcane details of restarting just that service (e.g., inetd wants a HUP, named wants an ndc reload).
(9) It should be able to track changes of startup configuration values and nudge the appropriate parts of the system. (This isn't as hard as it sounds.)
(10) It should be possible to differentiate between 'wants' and 'needs' for a service. E.g., sendmail may want but not need name service.
(11) It should be possible to make automated installs of SysV software under the scheme. (Nice-to-have from Terry)
(12) We should not need another transition later - transitional pain should be covered in one step.
(13) People should be able to update /etc/ and still keep their old /usr/local, and it should work.
(14) It should NOT introduce new security problems (example: A cracker should not be able to install a new package that gets run before secure-level is raised).
(15) Run-states - a system may have different sets of services for different 'states' (example: single-user vs multi-user, or X vs console). Being able to specify and switch between states is a nice-to-have; an implementation that makes it impossible to implement this later (without getting any transitional pain for those not actively using run-state changes) is a bad idea. An example of such an implementation
(16) It should be implemented with the tools we have already if possible. (This is very clearly a nice-to-have - do NOT constrain yourself to this if you need anything else.)
(17) It should not expand the contents of / more than absolutely necessary.
(18) It should be acceptable to NetBSD/OpenBSD too, to avoid divergence unless necessary. (This is a nice-to-have).
(19) It should support 'keeping services alive', like SysV does if you enter something into /etc/inittab
(20) It should be possible to get configuration values from the network, either from DHCP or another protocol. (Is it possible to use DHCP without grabbing your IP off it? I don't know - you'd better know before you implement ;-) It should also be possible to use strong authorization on the values, so it would be bad to only support DHCP. (Inspired by dhw).
(21) It would be very nice if there were a straightforward way to have a "layering," whereby machine-specific information takes precedence over network-specific information, which takes precedence over site-specific information, which takes precedence over the FreeBSD-default information. (Some sites may not need to distinguish between site-specific and network-specific information; others might need additional "layers.") (From dhw, with minor changes)
(22) There needs to be a way to cleanly handle the upgrade process (within reason). Parts that are appropriate to upgrade should be automatically upgraded, while things that are site/network-specific need to stay put. It would be nice if outdated variables etc. could be flagged automatically, to avoid an admin being stuck with an old knob turned when the knob has been replaced. (Partially from dhw)
(23) It would be nice if machine-, network-, or site-specific changes were amenable to change control. A lot of people like to use RCS/CVS to track the changes they make. (From dhw, with some changes by Y.T.)
(24) It would be nice to have the ability to update the configuration data consistently across a group of machines, in order to implement clustering.
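
To make the layering in (21) a little more concrete, here is a minimal sketch in Python. It is only an illustration of the lookup order, not a proposal for the storage format: the layer names, the variable names and all values are made up for the example, and the helper origin() is hypothetical. It also touches (1)/(2), since an administrator can ask which layer a value came from.

```python
from collections import ChainMap

# Hypothetical layers, most general first. Every key and value here is
# invented for the example; none of them is taken from a real rc.conf.
freebsd_defaults = {"sendmail_enable": "YES", "defaultrouter": "NO", "hostname": ""}
site = {"sendmail_enable": "NO"}
network = {"defaultrouter": "192.0.2.1"}
machine = {"hostname": "laptop.example.org"}

LAYER_NAMES = ["machine", "network", "site", "default"]

# ChainMap searches its maps from left to right, so the most specific
# layer goes first and the FreeBSD defaults go last.
config = ChainMap(machine, network, site, freebsd_defaults)


def origin(key):
    """Return the name of the layer that supplies the effective value of key."""
    for name, layer in zip(LAYER_NAMES, config.maps):
        if key in layer:
            return name
    raise KeyError(key)


print(config["sendmail_enable"], origin("sendmail_enable"))  # NO site
print(config["hostname"], origin("hostname"))                # laptop.example.org machine
```

Sites that need more (or fewer) layers would just use a longer or shorter chain; nothing in the lookup depends on there being exactly four.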

Available implementations

A very rough proof-of-concept implementation for dependency-based splits is available from http://www.freebsd.org/~eivind/newrc.tar.gz (I don't think I ever booted this one, so the only interest is in some of the details of the split - and there are a couple of interesting details).
The standard SysV approach (startup scripts in /etc/rc.d, symlinks from /etc/rc.1/, /etc/rc.2/ etc for the different runlevels).
The chkconfig system from Irix (based on SysV).
Standard /etc/rc
NetBSD has a new implementation of an rc system that fulfills a lot of these goals, and is likely to be integrated into FreeBSD.
Darwin also has a modern startup system. It is based on having a directory with an executable or script for each service to do the actual starting, and an XML file for each service describing metadata around that (what other services are required, which services this executable provides, etc.) More information is available in "Inside MacOS X - System Overview", in the "Customizing Booting Behaviour" section. This is available as a PDF from http://developer.apple.com/macosx/. Thanks to reader Andrew Stevenson for this information.
Loads and loads of others whose details I can't think of at the moment :-)

Useful concepts

I'm including a lot of stuff that probably is trivial to many of us. I want this to include enough background that somebody who doesn't know any of this can get my thoughts on it (by following the appropriate references if necessary).

Normalized representation vs de-normalized representation.

A normalized representation of something contains the information needed to construct the de-normalized version. In its correct form (where data is checked before using it), caching is storing both the normalized and the de-normalized version of something. Normalization / de-normalization is a question of viewpoint - a representation can be perfectly normalized from the point of view of one task, while being hopelessly de-normalized for another. The present task (starting services) is a great example. In order from less to more normalized:

The monolithic /etc/rc scheme has de-normalized just about everything - all services are grouped together, and a correct order is hard-coded.
The SysV scheme is somewhat less de-normalized - the services are split and their way of starting is normalized. The ordering is still de-normalized.
A more normalized scheme would be to have each service provider say 'I need this service to run', 'I want to have this service, but I can work without it' and 'I want to run before that service'. This is perfectly normalized from the view of starting services.
An even more normalized scheme would also track which variables a service uses, so it could do things to the services depending on how variables changed.

The advantage of normalized representations is that they allow the system to make changes based on factors outside the constraints, and usually allow people (or computers) to change the data more easily - the disadvantage is that they are usually slower than using a de-normalized form if the result of the normalized system is to do the same as the de-normalized system. However, the extra information available from a normalized system often allows the computer to do something else that achieves the same result - for instance, a normalized representation of the rc files would allow parallel execution of different services.
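
As an illustration of that last point, here is a small sketch in Python of how a normalized description ('needs', 'wants' and 'before' per service) can be turned into a de-normalized start plan, grouped into waves whose members could be started in parallel. Everything in it is an assumption made for the example: the SERVICES table, its field names and the function start_waves() are hypothetical, and the ordering work is delegated to graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical normalized declarations: each service states only its own
# constraints. The service names are examples, not a proposed layout.
SERVICES = {
    "network":  {"needs": [], "wants": [], "before": []},
    "named":    {"needs": ["network"], "wants": [], "before": []},
    "database": {"needs": ["network"], "wants": [], "before": ["sendmail"]},
    "sendmail": {"needs": ["network"], "wants": ["named"], "before": []},
}


def start_waves(services, enabled):
    """Group the enabled services into waves; each wave can start in parallel."""
    predecessors = {name: set() for name in enabled}
    for name in enabled:
        decl = services[name]
        for dep in decl["needs"]:
            if dep not in enabled:
                raise ValueError(f"{name} needs {dep}, which is not enabled")
            predecessors[name].add(dep)
        for dep in decl["wants"]:
            if dep in enabled:  # a missing 'want' is simply skipped
                predecessors[name].add(dep)
        for later in decl["before"]:
            if later in enabled:
                predecessors[later].add(name)

    sorter = TopologicalSorter(predecessors)
    sorter.prepare()  # raises graphlib.CycleError if the constraints form a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(start_waves(SERVICES, set(SERVICES)))
# [['network'], ['database', 'named'], ['sendmail']]
```

Disabling named only removes it from the plan, because sendmail merely wants it; disabling network is an error, because other services need it. The monolithic /etc/rc corresponds to writing down one flattened version of this plan by hand.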

Graphs.

You probably don't want to attempt to write a new rc system before you know what a graph is. You should know at least how to do a topological sort on a directed acyclic graph (described in any decent book on algorithms). This is not difficult to learn; it is just a couple of concepts you need to understand.
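
For readers who have not met it before, a topological sort can be sketched in a few lines of Python. The version below is Kahn's algorithm, one of the standard textbook methods; the function name topological_sort() and the example services are my own choices for the illustration. It also reports cycles, which a real rc system would have to handle somehow.

```python
from collections import deque


def topological_sort(requires):
    """Kahn's algorithm. `requires` maps each node to the nodes it must follow.

    Returns a list in which every node appears after everything it requires,
    or raises ValueError if the graph contains a cycle.
    """
    nodes = set(requires)
    for deps in requires.values():
        nodes.update(deps)

    # remaining[n]: how many requirements of n have not been emitted yet.
    remaining = {n: len(set(requires.get(n, ()))) for n in nodes}
    # dependents[n]: the nodes that are waiting for n.
    dependents = {n: [] for n in nodes}
    for node, deps in requires.items():
        for dep in set(deps):
            dependents[dep].append(node)

    # Sorting only makes the output deterministic; any order would be valid.
    ready = deque(sorted(n for n in nodes if remaining[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in sorted(dependents[node]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(nodes):
        stuck = sorted(nodes - set(order))
        raise ValueError("cycle among: " + ", ".join(stuck))
    return order


print(topological_sort({
    "sendmail": ["network", "named"],
    "named": ["network"],
    "inetd": ["network"],
}))
# ['network', 'inetd', 'named', 'sendmail']
```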

Event driven systems

It will be an advantage if you are used to thinking in terms of event-driven systems. The services can be viewed as variables in a dependency graph, with states 'on' and 'off' (and possibly others), and with variables also existing independently of services (example variable: the IP address of the default gateway).
However, you can also view the entire system as event- or message-driven. In this case, switching the state of any variable creates an event, and different parts of the configuration can wait for (subscribe to) different events. This has the nice side-effect of automatically doing the correct ordering (except for cycles) and making it reasonably easy to do (9), but makes it difficult to do (1), (10), (15) and probably (14) (unless we assume the parts of the system that need to run before switching the secure-level are run in a monolithic system, or create a couple of separate event systems).
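
A toy version of the event-driven view, sketched in Python, may make this easier to picture. All names in it (EventBus, service(), the variables and services used) are invented for the example, and it ignores everything a real system would need, such as actually running anything or handling failures.

```python
from collections import defaultdict, deque


class EventBus:
    """Holds variables; changing a variable queues an event for its subscribers."""

    def __init__(self):
        self.values = {}
        self.subscribers = defaultdict(list)
        self.queue = deque()

    def subscribe(self, variable, callback):
        self.subscribers[variable].append(callback)

    def set(self, variable, value):
        if variable in self.values and self.values[variable] == value:
            return  # nothing changed, so no event is created
        self.values[variable] = value
        self.queue.append(variable)

    def run(self):
        """Deliver queued events until the system has settled."""
        while self.queue:
            variable = self.queue.popleft()
            for callback in list(self.subscribers[variable]):
                callback(self, variable)


def service(name, waits_for, log):
    """Return a callback that switches `name` on once all awaited variables are 'on'."""
    def callback(bus, _variable):
        if all(bus.values.get(v) == "on" for v in waits_for):
            log.append(name)     # stands in for really starting the service
            bus.set(name, "on")  # which in turn is an event for others
    return callback


log = []
bus = EventBus()
for name, waits_for in [("sendmail", ["network", "named"]), ("named", ["network"])]:
    starter = service(name, waits_for, log)
    for variable in waits_for:
        bus.subscribe(variable, starter)

bus.set("network", "on")
bus.run()
print(log)  # ['named', 'sendmail']
```

Note that the start order is never written down anywhere; it falls out of the subscriptions. That is exactly why (1) becomes hard: to show an administrator the order in advance, you would have to simulate the events.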