The key step here was realizing that most tags already correspond to manpages. For the ones that do not:
The ability to correlate PRs with a manpage appears to help considerably, both in submitting a PR and in handling one. We might consider the manpage to be the primary category in the database.
As the above shows, categorization still needs more work. A list of 400 is much closer to human-browsable than a list of 5400, but it is still too long. Users and developers will need a further breakdown.
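
As a rough illustration of the PR-to-manpage correlation discussed above, here is a minimal Python sketch. It assumes each PR has already been reduced to a (number, tag) pair and that a set of known manpage names is available. The function name, the data layout, and the sample values in the usage note are hypothetical and do not come from the actual database.

```python
from collections import defaultdict


def group_by_manpage(prs, manpages):
    """Group PR numbers by tag when the tag names a known manpage.

    prs      -- iterable of (pr_number, tag) pairs (hypothetical layout)
    manpages -- set of manpage names (hypothetical input)

    Returns (grouped, unmatched): a dict mapping manpage name to a list
    of PR numbers, and a list of PR numbers whose tag matched no manpage.
    """
    grouped = defaultdict(list)
    unmatched = []
    for pr_number, tag in prs:
        if tag in manpages:
            grouped[tag].append(pr_number)
        else:
            # Tags with no corresponding manpage need separate handling.
            unmatched.append(pr_number)
    return dict(grouped), unmatched
```

Calling `group_by_manpage([(1, "ata"), (2, "ata"), (3, "misc")], {"ata", "fxp"})` with these made-up values groups PRs 1 and 2 under `ata` and reports PR 3 as unmatched. The unmatched list is the part that would still need a category of its own.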