Problems With The Upgrade Process

From kris on http://docs.freebsd.org/cgi/getmsg.cgi?fetch=392756+0+archive/2007/freebsd-hackers/20070513.freebsd-hackers:

I think the major area that needs further work is to do with improving the upgrade process. Historically, the ports collection and package tools did not support any notion of "do what needs to be done to upgrade this package". This was fine in the days when systems typically only had a handful of ports installed, and they each had only a few dependencies: you could update them by hand without much trouble using 'make deinstall' and 'make install'. Of course, it fails utterly to scale to the era of GNOME and modular X.org.

This problem was more or less solved by tools like portupgrade, but these are bolted on to the side of the ports collection and underlying package tools rather than being properly integrated with them.

(Note: There are other upgrade tools, but IMO none of them are as mature as portupgrade in terms of feature support or robustness, and they also do not attempt to solve the metadata management issues at all.)

One consequence of this is that the UPDATING file has a disturbing number of entries for manual steps required to update certain ports. Some of these are because of vendor screwups that we can't really fix (e.g. broken backwards compatibility with existing user files requiring data migration). But a lot of them come down to failures of the FreeBSD upgrade process: either a ports developer was lazy and didn't want to do the extra work to support automatic upgrades, or the upgrade tools are insufficiently flexible to handle the upgrade.

My opinion is that every time a ports developer adds an entry to UPDATING that lists manual steps required to update a port (except for the vendor case above), it usually means we have failed as a project.

Part of this problem comes down to metadata management and integration with the underlying tools. Some of the information that an upgrade tool needs to plan and manage the upgrades is not efficiently accessible using the pkg_tools (queries are O(N) or worse in the number of packages installed, etc). portupgrade currently solves this by maintaining its own parallel version of the metadata in a database that can be efficiently queried (it also solves part of the problem by ignoring it, which is a limitation that affected e.g. the xorg upgrade and required the UPDATING entry).
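To illustrate the scaling difference described above, here is a minimal sketch contrasting a linear scan with an indexed lookup for the question "which package installed this file?". The data model (a dict from package name to file list), the package names, and the function names `owner_by_scan` and `build_index` are all assumptions made for illustration; they are not the actual pkg_tools or portupgrade data structures.

```python
# Hypothetical model of the package database: package name -> installed files.
# Real package metadata is stored differently; this is only an illustration.
installed = {
    "libexample-1.0": ["/usr/local/lib/libexample.so.1"],
    "exampletool-2.3": [
        "/usr/local/bin/exampletool",
        "/usr/local/man/man1/exampletool.1.gz",
    ],
}


def owner_by_scan(installed, path):
    """Answer the query by checking every package: O(N) in installed packages."""
    for pkg, files in installed.items():
        if path in files:
            return pkg
    return None


def build_index(installed):
    """Build a reverse map (file -> package) once, so later lookups are cheap."""
    index = {}
    for pkg, files in installed.items():
        for path in files:
            index[path] = pkg
    return index


index = build_index(installed)
scan_result = owner_by_scan(installed, "/usr/local/bin/exampletool")
index_result = index.get("/usr/local/bin/exampletool")
```

Both approaches give the same answer; the difference is that the scan repeats its work for every query, while the index pays the cost once and must then be kept in sync with the package set.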

The problem with maintaining a parallel database is that unless users always manipulate their ports and packages using portupgrade, this database can become stale. Addressing this is the goal of Garrett Cooper's SoC project, which pushes the database management down into pkg_tools so that, hopefully, portupgrade and pkg_tools can be made to share the database (or portupgrade can revert to querying efficiently via the pkg_tools).
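The staleness problem can be sketched as follows: a cached index records a fingerprint of the package set it was built from, and any change made behind its back shows up as a fingerprint mismatch. This is only an illustration under stated assumptions. The `CachedIndex` class, the `fingerprint` helper, and the dict-based package model are hypothetical and do not describe how portupgrade or the SoC project actually detect or avoid stale data.

```python
import hashlib


def fingerprint(package_names):
    """Hash the sorted package names so any add or remove changes the result."""
    digest = hashlib.sha256()
    for name in sorted(package_names):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent names cannot run together
    return digest.hexdigest()


class CachedIndex:
    """Hypothetical parallel database: a file -> package map plus a fingerprint."""

    def __init__(self, installed):
        self.rebuild(installed)

    def rebuild(self, installed):
        self.stamp = fingerprint(installed)
        self.owner = {
            path: pkg for pkg, files in installed.items() for path in files
        }

    def is_stale(self, installed):
        return self.stamp != fingerprint(installed)


installed = {"exampletool-2.3": ["/usr/local/bin/exampletool"]}
cache = CachedIndex(installed)
fresh_before = not cache.is_stale(installed)

# Simulate a package installed without going through the tool that owns the cache.
installed["othertool-1.0"] = ["/usr/local/bin/othertool"]
stale_after_change = cache.is_stale(installed)

cache.rebuild(installed)
fresh_after_rebuild = not cache.is_stale(installed)
```

Detecting staleness this way still costs a pass over the package list on every check, which is one reason sharing a single database inside the underlying tools is attractive: there is then no second copy to fall out of date.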

There are other problems affecting package and upgrade management that are due to scaling limitations and data management issues that have become relevant as the ports collection has grown in size and complexity. Some information (e.g. dependency lists) can currently only be obtained by recursively walking the tree, which is a very expensive operation for things like GNOME. There are probably other operations that are O(N) in some large number N that I can't think of right now. It would make sense to explore whether there are ways of avoiding or optimizing these expensive operations.
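One possible optimization for the recursive dependency walk is to remember the result for each port the first time it is computed, so that shared dependencies are not walked again. The sketch below assumes a hypothetical `direct_deps` mapping from a port to its direct dependencies, uses made-up port names, and assumes the dependency graph has no cycles. It is not how the ports tree computes dependency lists today.

```python
def all_dependencies(port, direct_deps, cache):
    """Return the full set of dependencies of `port`, reusing cached results.

    Assumes `direct_deps` describes an acyclic graph.
    """
    if port in cache:
        return cache[port]
    closure = set()
    for dep in direct_deps.get(port, ()):
        closure.add(dep)
        closure |= all_dependencies(dep, direct_deps, cache)
    cache[port] = closure
    return closure


# Hypothetical dependency graph with a shared dependency ("baselib").
direct_deps = {
    "desktop": ["toolkit", "baselib"],
    "toolkit": ["baselib", "display"],
    "baselib": [],
    "display": [],
}

cache = {}
desktop_deps = all_dependencies("desktop", direct_deps, cache)
```

After the call, `cache` also holds the dependency sets of every port visited along the way, so a later query for `toolkit` returns immediately instead of walking the tree again.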