The FreeBSD Ports Monitoring System

Introduction

Among the least-publicized strengths of the FreeBSD development model are users' access to the CVS source tree and the continual QA work being done via ongoing build processes. The work described in this article attempts to leverage these strengths to help ease the process of porting, and maintaining, applications for FreeBSD.

Overview

The set of programs known as 'portsmon', currently running at portsmon.FreeBSD.org, is a tool to gather and disseminate information about FreeBSD ports. (In FreeBSD terminology, unlike NetBSD terminology, a 'port' refers to a Makefile and set of related source files to build an application; a 'package' refers to the binary file(s) created as its output.) portsmon is written in Python and carries a BSD license.

It works by data-mining existing information that was already available on the web in several different locations. For instance, there are several automated processes that provide Quality Assurance (QA) feedback for the ports tree. Each of these processes produces results that are generally posted in HTML format on a regular basis. In addition, there are other sources of information (such as the Problem Report (PR) database and the source files themselves, which are located on repository mirrors) that are also suitable for mining.

Until the creation of portsmon there was no way to correlate these sources of information in a way that could be browsed by a human. portsmon grabs the HTML pages, parses them, puts the results into a database, and allows interactive queries from HTML forms. In addition, it periodically sends email with the status of ports that have some kind of error.

The first instance of portsmon was installed in May 2003, and one instance or another of it has been running continuously since that time.

Why It Is Needed

Let's consider how people might actually need or want to use the existing information.
- A maintainer of an individual port is primarily interested in the question "do my ports work?" The majority of maintainers may be most interested in finding out whether their ports build on the "stable" branch (currently 6.X) on e.g. the Intel x86 architecture, which is available on a single HTML summary page. But beyond that, they should also be encouraged to make sure, firstly, that their ports build on the "current" branch, and secondly, that they build on other architectures as well. (Without this latter, there is not much point in having architectures other than x86 supported as "first-tier" architectures!) In addition to the build problems, they will likely also want to see pending PRs against their ports.
- An individual user is primarily interested in "will this port I just found out about work on my OS release and architecture?"
- A FreeBSD committer may be interested in finding out information that applies to one maintainer.
- A member of the port release management team ("portmgr") may be interested in finding any problems affecting either large numbers of ports, or the integrity of the ports build mechanism itself.
- Any general member of the FreeBSD community may be interested in seeing metrics about the overall "health" of the ports collection.

What portsmon does is provide a single place to see correlated results of the Problem Reports, the build error logs, and data from the CVS tree. To the extent that the PR database is lacking in certain features, portsmon serves as a supplement to it. This will be discussed further below.

Sources of Information

Build Logs

There exists a cluster of FreeBSD machines (known as the "bento cluster", from the former hostname of the coordinating machine) whose sole purpose is to run makes of the entire set of applications (known in the FreeBSD parlance as the "ports tree"). The source for each port is automatically fetched from wherever the source is hosted, and then built into a complete binary package.
The logfile that results from building each port is scanned for build errors; if any are found, an error summary line is created. This operation is repeated over all combinations of the "stable" OS release code versus the "current" OS release code, and the various architectures supported by FreeBSD. For each combination, several HTML summary pages are created, each sorted by a different column such as portname, maintainer, and so forth, having as their contents the error summary lines.

For purposes of this discussion, the individual combinations of OS release and processor architecture are termed "build environments" or "buildenvs". It should be noted that the bento cluster only produces its own HTML reports for each individual buildenv.

Problem Report Database (GNATS)

The PR database, known as GNATS, does not have any particular knowledge of what a "port" or a "maintainer" is. Anyone can send a PR, either via send-pr(1) from a FreeBSD machine that can send email, or from a web form. PRs vary greatly in quality; there is currently no individual who is guaranteed to "screen" them for applicability, accuracy, and so forth (although the present author, as a member of the bugmaster team, attempts to do so).

Further, GNATS has no concept of "individual component", and thus no concept of "maintainer of component". However, each port in the Ports Collection is assigned to a maintainer (although there is a fallback email address standing in for "no maintainer"). Therefore, an algorithm had to be introduced that would parse the raw data in incoming and modified PRs in the 'ports' category and attempt to assign a category and portname to each one. The current version of this algorithm is approximately 93% accurate.

As of mid-2006, FreeBSD is averaging somewhere around 40 ports PRs per day. There are usually somewhere between 500 and 1500 open ports PRs, depending on the stage of the release cycle. By comparison, there are over 14,000 ports.
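The article does not describe how the PR classification algorithm works internally, so the following is only a minimal sketch of one plausible approach, not portsmon's actual code. It assumes that a PR synopsis often contains a token shaped like `category/portname`, and that a set of known ports is available from the tree metadata. The function name `classify_pr`, the regular expression, and the port names are all hypothetical.

```python
import re

# Hypothetical pattern: a token shaped like "category/portname".
PORT_TOKEN = re.compile(
    r"\b([a-z0-9][a-z0-9_+.-]*)/([A-Za-z0-9][A-Za-z0-9_+.-]*)"
)


def classify_pr(synopsis, known_ports):
    """Return the first known 'category/portname' found in a PR synopsis.

    Returns None when no known port is mentioned, which corresponds to
    a PR that would have to be classified by hand.
    """
    for match in PORT_TOKEN.finditer(synopsis):
        category = match.group(1)
        # A sentence-ending period can be swallowed by the pattern; drop it.
        portname = match.group(2).rstrip(".")
        candidate = f"{category}/{portname}"
        if candidate in known_ports:
            return candidate
    return None


# Illustrative data only; these are not real ports.
known = {"www/exampleport", "editors/sampleeditor"}
result = classify_pr("[patch] www/exampleport: update to 1.2", known)
```

A real classifier would need further heuristics (for example, for synopses that give only a portname without its category), which is presumably where much of the remaining inaccuracy comes from.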
The percentage of ports with PRs has actually gone down in the past two years due to some concentrated effort by a number of committers.

CVS repository

The source files in the ports tree (in particular the Makefiles) are also used to extract 'metadata' about the port, such as its name, its maintainer, whether it is buildable, and others. cvsup is used to fetch the latest updates, and its output is examined to indicate when a port's metadata might have changed (and thus need to be regenerated). Changes to individual category Makefiles are used to detect when a port has been added or deleted.

In addition, a file called MOVED has entries for ports that have been renamed, obsoleted by some other port, or deleted because they are no longer maintained by their upstream authors or have security problems. The MOVED technology is used by automated tools such as portupgrade(1) to help the user deal with these changes. In particular, when ports from one category are moved to another (possibly new) category, these tools can update the tree and save the user from having to deal with the issues.

Information Extracted from the CVS Tree

- for individual ports:
  - Data from Makevars:
    - MAINTAINER
    - status: BROKEN/DEPRECATED/IGNORE/FORBIDDEN
      These are special designations which are useful to users and maintainers.
    - EXPIRATION_DATE
      Ports marked for deletion are usually not deleted immediately, unless there is a licensing problem. A mechanism using the Makevars DEPRECATED and EXPIRATION_DATE makes this process more transparent to the greater community. Since this process was instituted, a large number of ports that were previously broken and not being paid attention to were marked for deletion. This brought a much greater deal of attention to their state, and in nearly half the cases has led to the ports being fixed. Others have been deleted as being deemed to have outlived their usefulness; but in this case, at least potential users have been given a heads-up.
    - IS_SLAVE_PORT
    - MASTERPORT
      Since it would be too time-consuming to rebuild the entire set of metadata for the tree on every CVSup run, a shortcut is adopted so that only "affected ports" need to be updated. For portsmon's purposes, these are defined as the ports having a master-slave relationship; this is the most general case of variable inheritance in the Makefiles.
    - There are some others that only affect the display on portsoverall.py.
  - Data from raw files:
    - Makefile version
      This is used to decide when a port's metadata needs to be rescanned to start with.
- for the ports tree overall:
  - Data from raw files:
    - bsd.port.mk
      This file contains the master list of categories.
    - category Makefiles
      These files contain the list of ports within each category.
    - MOVED
      This file is a mechanism for tracking ports that have either been renamed (for instance, the project changed its name, or a new version has been released that is incompatible with an old one, forcing the renaming of the existing portname) or deleted (because they are no longer being developed, or because of licensing or security problems).

Schema: Tables

See the slides for a more complete explanation.

Schema: Attributes

See the slides for a more complete explanation.

Database updating

All of the database updating is done on the fly; static updates would take too long, as an entire dependency tree would have to be built. Since portsmon does not model the dependency tree, this time would be wasted. Also, portsmon relies on some metadata which is not included in the default 'make index' run, such as IS_SLAVE_PORT.

Some optimizations have had to be made to the database algorithms to enable incremental updates. For instance, IS_SLAVE_PORT is used to look up any extra ports that will need to have their metadata re-read because of an update to a master port. The existing INDEX file can be used to figure out if a port has a master port, but cannot be used (as such) to figure out which ports have slave ports.
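The reverse lookup described above (from a master port to the slave ports that inherit from it) can be illustrated with a short sketch. This is not portsmon's code or schema: the function names, the in-memory dictionaries, and the port names are all assumptions made for illustration, and the sketch further assumes that a slave port may itself act as the master of another port.

```python
from collections import defaultdict


def build_slave_index(master_of):
    """Invert a slave -> master mapping into master -> set of slaves."""
    slaves_of = defaultdict(set)
    for slave, master in master_of.items():
        slaves_of[master].add(slave)
    return slaves_of


def ports_to_rescan(changed_ports, slaves_of):
    """Return the changed ports plus every port that inherits from them.

    Inheritance is followed transitively, so a slave of a slave is
    also included.
    """
    pending = list(changed_ports)
    affected = set()
    while pending:
        port = pending.pop()
        if port in affected:
            continue
        affected.add(port)
        # .get() avoids creating empty entries in the defaultdict.
        pending.extend(slaves_of.get(port, ()))
    return affected


# Illustrative data only; these are not real ports.
master_of = {
    "cat/slave-a": "cat/master",
    "cat/slave-b": "cat/master",
    "cat/slave-of-slave": "cat/slave-a",
}
index = build_slave_index(master_of)
affected = ports_to_rescan(["cat/master"], index)
```

With an index like this, an update to a single master Makefile triggers a rescan of only the handful of ports that inherit from it, rather than the whole tree.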
Tables driven from CVS data are updated every 60 minutes; this is the CVSup interval of the servers. FREEBSD_ERRORLOG_TABLE is updated every 30 minutes; the scan of this file is very fast, so it could be done more often, but given that it takes several days to build all 14,000+ ports, even on the fast architectures, not that many entries really change more often than that.

Outputs

The outputs of portsmon are divided into two areas: on-demand HTML pages, and email notifications.

Reports Available Interactively via HTML: Regular Reports

This list is not all-inclusive, but focuses on the reports that are most useful to individual committers or maintainers.

Each report tries to focus on one particular area. It would be infeasible to try to show all possible data contained in the database; consider it as an N-dimensional space based on category/portname; maintainer; error type; PR number; and so forth.

The reports that focus on build errors:

- by type and buildenv (portserrcounts.py)
- by category (portserrcountsbycategory.py)
- by portname and buildenv (portscrossref.py)

Build errors and Problem Reports

- by portname (portsconcordance.py)
- for one maintainer (portsconcordanceformaintainer.py)
- error counts by maintainer (portsbymaintainer.py)
- by portname, for one maintainer (portsconcordanceforresponsible.py)
- by portname, unmaintained ports (portsconcordancefornoresponsible.py)
- by portname, ports for one status (e.g. BROKEN) (portsconcordanceforbroken.py et al.)
  Note: this is currently only evaluated on i386-CURRENT; see below.
- by portname, for one or more buildenvs (portsconcordanceforbuildenv.py)
- overall status (portsoverall.py)

Problem Reports

- by portname, for existing ports (portsprsbyportname.py)
- by PR number, for existing ports (portsprsbyexplanation.py?explanation=existing)
- by PR number, for new ports (portsprsbyexplanation.py?explanation=new)
- by PR number, for framework (portsprsbyexplanation.py?explanation=framework)
- by PR number, for unknown (portsprsbyexplanation.py?explanation=unknown)

Overview of one port (portoverview.py)
This gives all the PRs and error logs for one port; in addition, links to the CVS web page, the URL for the mastersite, the FreshPorts page, and other interesting entries are included.

Reports Available Interactively via HTML: Specialized Reports

This list is not all-inclusive, but focuses on the reports that are most useful to individual committers or maintainers.

Anomalies (portsanomalies.py)
This attempts to find both errors where the internal algorithms have generated inconsistent results, and also PRs which may have become mis-assigned (for instance, still assigned to a maintainer who has resigned).

Dependency tree for one port (portdependencytree.py)
The file /usr/ports/INDEX can show dependencies of ports as they are built by default. However, it does lag by some number of hours. portsmon, by contrast, has a complete mirror of the tree as of the last CVS update, so it can allow queries to be run from the web pages. Unlike the existing tools in the ports tree that will only show dependencies for ports installed on one particular machine, this HTML page will show all possible dependencies as though the total tree were installed (i.e., potential dependencies).

Port has moved (portsprsformoved.py)
This is used to find PRs with stale categorization; e.g., where the port has been renamed or repocopied, but the PR assignment is now stale.

Port has maintainer update (portsprsmaintainerupdates.py)
This allows committers, if they wish, to prioritize these PRs.
Ports where maintainer is committer (portsprsmaintaineriscommitter.py) or maintainer is not committer (portsprsmaintainerisnotcommitter.py)
These are mainly of interest for ensuring that PRs are assigned correctly.

Ports where maintainer might not know (portsprsmaintainermightnotknow.py)
Since GNATS has no concept of 'maintainer', it is possible for PRs to be entered where the maintainer will never know about them (e.g., the maintainer was not Cc:ed). An automated process on the main FreeBSD repository machine reminds people with @FreeBSD.org addresses about all their PR assignments (not just ports), but this report fills in the gap for non-committer maintainers.

Ports with no maintainer (portsprsunmaintained.py)
This is to allow committers to look for PRs that may not be being attended to because there is no maintainer; this may also allow interested users to see if there are unmaintained ports that they would like to adopt.

Ports with possibly misconveyed PRs (portsmisconveyedprs.py)
This, again, is mainly of interest for ensuring that PRs are assigned correctly.

Reports Generated via Email

Every two weeks a set of email reports is generated to attempt to alert users and maintainers to the status of individual ports. This is an attempt to make parts of the process more transparent (e.g. the process for deleting ports that are scheduled for removal, either due to such technical issues as fetch failures, or due to security-related issues):

- Ports which are currently marked broken; this is sent to the individual maintainers
- Ports which are currently scheduled for deletion; this is sent to freebsd-ports@FreeBSD.org
- Ports which are currently marked "forbidden", to maintainers
- Ports with PRs where maintainer might not be aware of them, to maintainers

This latter is necessary because GNATS, as stated before, does not have any concept of 'maintainer', and thus cannot notify a maintainer when a PR comes in.
A tool known as 'gtk-send-pr' takes care of this automatically, but not everyone uses it.

Charts and Graphs

There is an uncompleted project to provide some charts and graphs. This includes

- a bar chart of number of ports marked BROKEN, by maintainer;
- a bar chart of percent of ports with build errors by build environment;
- a pie graph of percent of PRs by explanation type;
- a pie graph of percent of PRs by state; and
- a bar chart of unique error counts in all buildenvs.

How Well Does It Work?

Despite its hacky origins, the system is quite robust. It has been online for public access for over 2 years, in one instance or another, and has not crashed during this time.

The GNATS PR-classifier gets about 93% true positives on PR assignment and about 3% false positives; the rest are classified as "unknown" and must be manually fixed. Often, it is easier to change the PR synopsis than to do the override in the database; this also allows people who use GNATS to search PRs to get the right answer.

The database only has to be totally rebuilt on new installs; the code keeps up with incremental changes otherwise. However, there are some bugs in the MOVED processing, so that MOVED needs to be rescanned periodically. There may be some interaction with respect to IS_SLAVE_PORT here.

What's Missing?

portsmon does not yet model packages that build successfully. Since all but a few percent of ports do successfully build into packages, this is a fair amount of data. An alpha-quality code implementation exists that scans the directory listings (e.g. via ftp), but due to its slowness it has not yet been integrated. Integrating this code would enable us to answer the questions "how far behind the underlying port is a given package?" and "how many ports successfully package on amd64 vs. i386?"

portsmon also does not yet model port metadata such as BROKEN/FORBIDDEN/IGNORE for anything other than the i386 architecture, and for one single OS revision.
Therefore, only one buildenv is completely represented in the database. Using i386 is easiest because the metadata are evaluated on an i386 machine. However, it is possible to override both the arch and the OS version -- in fact, the latest OS version is generally used because it is thought that that is where most new problems will arise that need maintainer attention. Using a single buildenv both simplifies the database and minimizes update time, since the make -V invocations are one of the rate-limiting factors of the whole process.

It is also unknown exactly how much metadata, other than the build status, depends on buildenv. Before portsmon and FreshPorts, it is entirely possible that no one even considered the possible impact of allowing other metadata to vary based on these variables. Therefore, it is hard to know exactly how many port Makefiles might make such changes, and thus exactly how much of the database would have to be generalized to be by-buildenv. (There is at least one case known where PORTREVISION varies depending on ARCH.)

Another source of data is the list of ports for which one or more distfiles fail to fetch, generated by Bill Fenner. This is discussed more thoroughly below.

One more source of data is a pair of new projects that attempt to scan the home pages for various port projects and identify new revisions that may be available. Both Edwin Groothuis and one other author have programs that do this. portsmon's author had also implemented an alpha-quality scanner, but it has not yet been finished and integrated. (It does work surprisingly well -- correctly identifying updates in at least 50% of the pages it scans -- but it has no caching or mastersite selection, and so just hits the same mastersites over and over again. This doesn't seem to be in the spirit of 'fair play'.) It is unknown how well the three implementations' algorithms compare with each other.
Finally, it is not currently possible to get data about such things as "maintainer timeouts" to learn when an individual may have lost interest in participating, and thus the maintainership should be passed to someone else. To do this, it will be necessary to mine the data from the CVS logs. An alpha implementation of the parser exists, but no database backend has been created. Having this data would then make it possible to ask questions such as "show me only ports that have failed to build after their last CVS update".

There are many other reports that could also be generated, including a more general-purpose query page. The latter would require some restructuring of the database.

Bugs

- Some of the reports still do not have the ability to sort on some columns.
- The design of the database was done to optimize the ability to tinker with the reports, not to make the queries efficient. This has the disadvantage of making the pages fairly slow. The advantage is that there have not needed to be any "flag days" where the database had to be completely reloaded as new features were added.
- The database update mechanism could be improved; for a larger deployment, moving some of the SQL to transactions would be necessary.
- Some of the configuration needs to be less hard-coded.

Related work

FreshPorts

Dan Langille's FreshPorts (www.freshports.org) is a set of web pages that correlate a great deal of information about FreeBSD ports. The scope of FreshPorts overlaps this work to some degree, but the projects are complementary. There are areas where portsmon has information that FreshPorts does not have, and vice versa. In each case we are trying to view various "meta-information" about individual ports. Both are necessary, but there is still more that can be added.

FreshPorts concentrates on all ports, not just ports that have either build errors or PRs. It is more oriented towards individual users than is portsmon.
It is also designed to automatically notify interested maintainers (via email) of problems as soon as they occur. The framework for the emailing and subscriber control is beyond the scope of this work. So, while there is some degree of overlap, the intended audience and application are different.

In addition, FreshPorts models the dependency tree, which is not a focus for portsmon. portsmon is more concerned with individual ports.

A final implementation note: FreshPorts parses CVS commit mail to track updates; portsmon uses the output of cvsup. This allows FreshPorts to be more up-to-date than portsmon; however, in the event of unreliable email, portsmon may prove to be more robust.

Bill Fenner's Distfile Survey

Bill Fenner maintains a report of ports that fail to fetch. Unlike the pointyhat errorlogs, which only assert the error "fetch" if the sourcefile cannot be fetched from _any_ server, Bill's reports include all sourcefiles that cannot be fetched from each of the servers on which they are supposed to reside. Further, his report shows how long each of the individual fetch failures has been ongoing. This data should be included in the database.

Edwin Groothuis' PR auto-assigner

When portsmon was first written, there was no automated way of seeing which ports PRs ought to be assigned to which maintainers. For the initial year or so of its deployment it was the only source of that information, and its author used the web page to scan through the list once or twice a day to do any missing assignments. Since then, Edwin Groothuis has written a program to scan the PRs and attempt to auto-assign them using edit-pr(1). It is unknown which PR classification algorithm is more effective for identifying existing ports, although they appear to be comparable.
This process has saved portsmon's author considerable time in poring over its output :-) However, occasional passes over the reports are still necessary, to catch PRs that are either not automatically classified, or incorrectly classified. Also note that the PR auto-assigner does not model new ports or ports framework issues, which portsmon does.

Dirk Meyer's Reports

Some work very similar to this work has been done by Dirk Meyer and is hosted at URL: http://ports.dinoex.net/errorlogs/. (The present author was unaware of this work when he began.) Like Fenner's work, this focuses on the build errors rather than the PRs; however, it's worth noting that unlike Fenner's reports, which are statically generated, Meyer's reports are database-driven. However, the presentation differs from the current work. Interested users may find one or the other presentation more useful.

Future Work

In addition to the data sources that are not yet being data-mined as mentioned above, various people have requested the ability to 'subscribe' via email to PRs about certain ports (or even proposed new ports). Currently, there is no provision for being able to do so.

An even more interesting project would be to automatically identify ports whose upstream authors have updated them, and then attempt to update each port's Makefile in a dedicated area of a ports tinderbox and build it. The build logs could then be scanned by a local instance of portsmon and possibly save an interested party some of the "detail-work" involved in updating ports.

Summary

The goal of this work is to reduce the time and frustration involved in maintaining source-based applications on FreeBSD.
It is hoped that with these reports, problems such as "maintainer hasn't noticed that new changes to system include files broke all his ports" or "maintainer hasn't made sure that his ports run on Sparc-64" or "maintainer has gone missing in action" can be spotted and the problems corrected with much less time and frustration on everyone's part.