FreeBSD Network Administrators Guide


Table of Contents

Preface
Purpose and Audience for This Book
Sources of Information
Documentation Available via FTP
Documentation Available via WWW
Documentation Available Commercially
FreeBSD Usenet Newsgroups
FreeBSD Mailing Lists
Online FreeBSD Support
FreeBSD User Groups
Obtaining FreeBSD
About This Book
The Official Printed Version
Overview
Conventions Used in This Book
Submitting Changes
Acknowledgments
The Hall of Fame
1. Introduction to Networking
History
TCP/IP Networks
Introduction to TCP/IP Networks
Ethernets
Other Types of Hardware
The Internet Protocol
IP Over Serial Lines
The Transmission Control Protocol
The User Datagram Protocol
More on Ports
The Socket Library
UUCP Networks
Maintaining Your System
System Security
2. Issues of TCP/IP Networking
Networking Interfaces
IP Addresses
Address Resolution
IP Routing
IP Networks
Subnetworks
Gateways
The Routing Table
Metric Values
The Internet Control Message Protocol
Resolving Host Names
3. Configuringthe NetworkingHardware
Kernel Configuration
Kernel Options in FreeBSD
The PLIP Driver
The PPP and SLIP Drivers
4. Configuring the Serial Hardware
Communications Software for Modem Links
Introduction to Serial Devices
Accessing Serial Devices
The Serial Device Special Files
Serial Hardware
Using the Configuration Utilities
The setserial Command
The stty Command
Serial Devices and the login: Prompt
Configuring the mgetty Daemon
5. Configuring TCP/IP Networking
Setting the Hostname
Assigning IP Addresses
Creating Subnets
Writing hosts and networks Files
Interface Configuration for IP
The Loopback Interface
Ethernet Interfaces
Routing Through a Gateway
Configuring a Gateway
The PLIP Interface
The SLIP and PPP Interfaces
IP Alias
The netstat Command
Displaying the Routing Table
Displaying Interface Statistics
Displaying Connections
The sockstat Command
Checking the ARP Tables
Viewing the arp tables
Adding entries
Deleting entries
Proxy arp
6. Name Service and Resolver Configuration
The Resolver Library
The host.conf File
Configuring Name Server Lookups Using resolv.conf
Resolver Robustness
How DNS Works
Name Lookups with DNS
Types of Name Servers
The DNS Database
Reverse Lookups
Running named
The BIND 8 named.conf File
The DNS Database Files
Caching-only named Configuration
Writing the Master Files
Verifying the Name Server Setup
Other Useful Tools
7. Serial Line IP
General Requirements
SLIP Operation
Dealing with Private IP Networks
Using dip
A Sample Script
A dip Reference
Running in Server Mode
8. The Point-to-Point Protocol
PPP on Linux
Running pppd
Using Options Files
Using chat to Automate Dialing
IP Configuration Options
Choosing IP Addresses
Routing Through a PPP Link
Link Control Options
General Security Considerations
Authentication with PPP
PAP Versus CHAP
The CHAP Secrets File
The PAP Secrets File
Debugging Your PPP Setup
More Advanced PPP Configurations
PPP Server
Demand Dialing
Persistent Dialing
9. TCP/IP Firewall
Methods of Attack
What Is a Firewall?
What Is IP Filtering?
Setting up FreeBSD for Firewalling
Kernel configuration for ipfw
Kernel configuration for IPFilter
Three Ways We Can Do Filtering
Basics of IP Packet Filtering
Building an ipfw firewall
ipfw Rules
Manipulating the ipfw Rules
Creating a Firewall with ipfw
Filtering udp in a stateful manner with ipfw
Logging traffic
Filtering ICMP
More things to filter on
Integrating in to FreeBSD
Testing a Firewall Configuration
A Sample Firewall Configuration
10. IP Accounting
Configuring the Kernel for IP Accounting
Configuring IP Accounting
Accounting by Address
Accounting by Service Port
Accounting of ICMP Datagrams
Accounting by Protocol
Using IP Accounting Results
Listing Accounting Data with ipfwadm
Listing Accounting Data with ipchains
Listing Accounting Data with iptables
Resetting the Counters
Flushing the Ruleset
Passive Collection of Accounting Data
11. Network Address Translation
Side Effects and Fringe Benefits of Dynamic NAT
NAT implementations for FreeBSD
Configuring the kernel for NAT
Configuring NAT with ipfw and natd
Configuring the kernel for ipfw and natd
Running natd at boot time
Adjusting the firewall to support natd
Dynamic NAT configuration
A sample firewall/natd configuration
Allowing access to internal hosts with dynamic NAT
Static NAT
Configuring NAT with IPFilter and IPNat
Handling Name Server Lookups
More About Network Address Translation
12. ImportantNetwork Features
The inetd Super Server
The tcpd Access Control Facility
The Services and Protocols Files
Remote Procedure Call
Configuring Remote Loginand Execution
Disabling the r; Commands
Installing and Configuring ssh
13. The Network Information System
Getting Acquainted with NIS
NIS Versus NIS+
The Client Side of NIS
Running an NIS Server
NIS Server Security
Setting Up an NIS Client with GNU libc
Choosing the Right Maps
Using the passwd and group Maps
Using NIS with Shadow Support
14. The NetworkFile System
Preparing NFS
Mounting an NFS Volume
The NFS Daemons
The exports File
Kernel-Based NFSv2 Server Support
Kernel-Based NFSv3 Server Support
15. Electronic Mail
What Is a Mail Message?
How Is Mail Delivered?
Email Addresses
RFC-822
Obsolete Mail Formats
Mixing Different Mail Formats
How Does Mail Routing Work?
Mail Routing on the Internet
Mail Routing in the UUCP World
Mixing UUCP and RFC-822
Configuring elm
Global elm Options
National Character Sets
16. Sendmail
Introduction to sendmail
Installing sendmail
Overview of Configuration Files
The sendmail.cf and sendmail.mc Files
Two Example sendmail.mc Files
Typically Used sendmail.mc Parameters
Generating the sendmail.cf File
Interpreting and Writing Rewrite Rules
sendmail.cf R and S Commands
Some Useful Macro Definitions
The Lefthand Side
The Righthand Side
A Simple Rule Pattern Example
Ruleset Semantics
Configuring sendmail Options
Some Useful sendmail Configurations
Trusting Users to Set the From: Field
Managing Mail Aliases
Using a Smart Host
Managing Unwanted or Unsolicited Mail (Spam)
Configuring Virtual Email Hosting
Testing Your Configuration
Running sendmail
Tips and Tricks
Managing the Mail Spool
Forcing a Remote Host to Process its Mail Queue
Analyzing Mail Statistics
17. Netnews
Usenet History
What Is Usenet, Anyway?
How Does Usenet Handle News?
18. C News
Delivering News
Installation
The sys File
The active File
Article Batching
Expiring News
Miscellaneous Files
Control Messages
The cancel Message
newgroup and rmgroup
The checkgroups Message
sendsys, version, and senduuname
C News in an NFS Environment
Maintenance Tools and Tasks
19. NNTP and thenntpd Daemon
The NNTP Protocol
Connecting to the News Server
Pushing a News Article onto a Server
Changing to NNRP Reader Mode
Listing Available Groups
Listing Active Groups
Posting an Article
Listing New Articles
Selecting a Group on Which to Operate
Listing Articles in a Group
Retrieving an Article Header Only
Retrieving an Article Body Only
Reading an Article from a Group
Installing the NNTP Server
Restricting NNTP Access
NNTP Authorization
nntpd Interaction with C News
20. Internet News
Some INN Internals
Newsreaders and INN
Installing INN
Configuring INN: the Basic Setup
INN Configuration Files
Global Parameters
Configuring Newsgroups
Configuring Newsfeeds
Controlling Newsreader Access
Expiring News Articles
Handling Control Messages
Running INN
Managing INN: The ctlinnd Command
Add a New Group
Change a Group
Remove a Group
Renumber a Group
Allow/Disallow Newsreaders
Reject Newsfeed Connections
Allow Newsfeed Connections
Disable News Server
Restart News Server
Display Status of a Newsfeed
Drop a Newsfeed
Begin a Newsfeed
Cancel an Article
21. Newsreader Configuration
tin Configuration
trn Configuration
nn Configuration
A. Example Network:The Virtual Brewery
Connecting the Virtual Subsidiary Network
B. Useful Cable Configurations
A PLIP Parallel Cable
A Serial NULL Modem Cable
C. Linux Network Administrator's Guide, Second Edition Copyright Information
0. Preamble
1. Applicability and Definitions
2. Verbatim Copying
3. Copying in Quantity
4. Modifications
5. Combining Documents
6. Collections of Documents
7. Aggregation with Independent Works
8. Translation
9. Termination
10. Future Revisions of this License
D. SAGE: The SystemAdministrators Guild

The Internet is now a household term in many countries. With otherwise serious people beginning to joyride along the Information Superhighway, computer networking seems to be moving toward the status of TV sets and microwave ovens. The Internet has unusually high media coverage, and social science majors are descending on Usenet newsgroups, online virtual reality environments, and the Web to conduct research on the new Internet Culture.

Of course, networking has been around for a long time. Connecting computers to form local area networks has been common practice, even at small installations, and so have long-haul links using transmission lines provided by telecommunications companies. A rapidly growing conglomerate of world-wide networks has, however, made joining the global village a perfectly reasonable option for even small non-profit organizations of private computer users. Setting up an Internet host with mail and news capabilities offering dialup and ISDN access has become affordable, and the advent of DSL (Digital Subscriber Line) and Cable Modem technologies will doubtlessly continue this trend.

Talking about computer networks often means talking about Unix. Of course, Unix is not the only operating system with network capabilities, nor will it remain a frontrunner forever, but it has been in the networking business for a long time, and will surely continue to be for some time to come.

What makes Unix particularly interesting to private users is that there has been much activity to bring free Unix-like operating systems to the PC, such as 386BSD, FreeBSD, and Linux.

FreeBSD is a freely distributable Unix clone for personal computers. It currently runs on machines based around the Intel 32 bit familty of processors386, 486, Pentium, and so on, as well as `clones' from AMD and other companiesand the Compaq/DEC Alpha platform. At the time of writing, porting efforts are underway to the Intel 64 bit platform (IA-64), AMD's 64 bit platform (Hammer), Sparc, and PowerPC.

FreeBSD is developed by a large team of volunteers across the Internet. The project had its genesis in the 386BSD project, which first ported BSD Unix to the PC. 386BSD, in turn, was based on code developed at the University of California, Berkeley. BSD Unix has a long heritage, going back to the 1970s, and the birth of Unix, and FreeBSD can trace its history as an unbroken line to these roots. This heritage gives FreeBSD an implementation of TCP/IP that is widely recognised as being the best in the industry, making FreeBSD an excellent networking platform. FreeBSD supports a wide and diverse range of hardware, and runs all manner of applicationsfrom word processors and spreadsheets to simulation modelling, to games, and everything in between.

FreeBSD is made available under the two-clause BSD software license. This license allows anyone to redistribute and modify the source software (free of charge, or for a profit). The only restrictions are that modifications must retain the original copyright notices on the code, and the initial warranty. In essence, don't claim that you wrote the code, and don't blame us if it doesn't work.

Purpose and Audience for This Book

This book was written to provide a single reference for network administration in a FreeBSD environment. Beginners and experienced users alike should find the information they need to cover nearly all important administration activities required to manage a FreeBSD network configuration. The possible range of topics to cover is nearly limitless, so of course it has been impossible to include everything there is to say on all subjects. We've tried to cover the most important and common ones. We've found that beginners to FreeBSD networking, even those with no prior exposure to Unix-like operating systems, have found this book good enough to help them successfully get their FreeBSD network configurations up and running and get them ready to learn more.

There are many books and other sources of information from which you can learn any of the topics covered in this book. We've provided a bibliography for you to use when you are ready to explore more.

Sources of Information

If you are new to the world of FreeBSD, there are a number of resources to explore and become familiar with. Having access to the Internet is helpful, but not essential.

FreeBSD Documentation Project Books

The FreeBSD Documentation Project is a group of volunteers who have worked to produce books, articles, and manual pages on topics covering the use and development of FreeBSD. Books include:

FreeBSD Frequently Asked Questions

The Frequently Asked Questions (FAQ) list contains hundreds of questions, with answers, relating to FreeBSD. It should be your first port of call.

The FreeBSD Handbook

A comprehensive guide to FreeBSD. It includes an illustrated installation guide, and chapters covering topics including kernel configuration, printing, and security.

The FreeBSD Developer's Handbook

The Developer's Handbook introduces FreeBSD as a development platform, both for people interested in developing their software using FreeBSD, and people interested in contributing to the FreeBSD development effort.

FreeBSD Documentation Project Articles

The FreeBSD articles are a series of documents detailing various aspects of the system. Typically, these are more specialised than the content that appears in the larger books, such as the Handbook, and have more of a niche audience.

Articles include FreeBSD and Solid State Devices, FreeBSD on Laptops, and Creating a diskless X server.

Documentation Available via FTP

If you have access to anonymous FTP, you can obtain all FreeBSD documentation listed above from various sites, typically under the pub/FreeBSD/doc/ directory. The canonical FTP site is ftp.FreeBSD.org, and per-country mirrors are available with name like ftp.uk.FreeBSD.org, ftp.jp.FreeBSD.org, and so on.

The documentation is available in a wide variety of formats, including HTML, Postscript, and Adobe PDF.

Documentation Available via WWW

The FreeBSD Documentation Project endeavours to keep a comprehensive list of available FreeBSD documentation at http://www.FreeBSD.org/docs.html.

Documentation Available Commercially

A number of commercial books on FreeBSD are available, with more appearing every month. Particularly recommended are:

  • The Complete FreeBSD

  • FreeBSD: An Open-Source Operating System for Your PC

In addition, the Unix System Administration Handbook is widely recognised as being the reference for cross-platform Unix system administration. The third edition, with a purple cover, includes FreeBSD, Linux, Solaris, and HP-UX in the examples, and is absolutely indispensable. It is also a very entertaining read.

FreeBSD Usenet Newsgroups

If you have access to Usenet news, the following FreeBSD-related newsgroups are available:

  • comp.unix.freebsd.announce

  • comp.unix.freebsd.misc

There are other newsgroups that may also be of interested, particularly those in the comp.unix hierarchy, such as comp.unix.admin and comp.unix.questions.

FreeBSD Mailing Lists

There is a large number of specialist FreeBSD mailing lists on which you will find many people willing to help with questions you might have.

These lists are hosted by the FreeBSD project. You may subscribe to them by sending an email formatted as follows:

To: majordomo@FreeBSD.org
Subject: anything at all
Body:
	  
subscribe listname

There are a large number of FreeBSD related mailing lists, and sending your message to the correct list is very important. Useful lists relating to networking are:

<questions@FreeBSD.org>

This is the mailing list for questions about FreeBSD. You should not send ``how to'' questions to the technical lists unless you consider the question to be pretty technical.

<ipfw@FreeBSD.org>

This is the forum for technical discussions concerning the redesign of the IP firewall code in FreeBSD. This is a technical mailing list for which strictly technical content is expected.

<hackers@FreeBSD.org>

This is a forum for technical discussions related to FreeBSD. This is the primary technical mailing list. It is for individuals actively working on FreeBSD, to bring up problems or discuss alternative solutions. Individuals interested in following the technical discussion are also welcome. This is a technical mailing list for which strictly technical content is expected.

<isp@FreeBSD.org>

This mailing list is for discussing topics relevant to Internet Service Providers (ISPs) using FreeBSD. This is a technical mailing list for which strictly technical content is expected.

Of these lists, <questions@FreeBSD.org> is the primary list for questions. The other lists tend to be concerned with the development of FreeBSD, or with specific deployments of FreeBSD.

Online FreeBSD Support

There are many ways of obtaining help online, where volunteers from around the world offer expertise and services to assist users with questions and problems.

The OpenProjects IRC Network is an IRC network devoted entirely to Open ProjectsOpen Source and Open Hardware alike. Some of its channels are designed to provide online FreeBSD support services. IRC stands for Internet Relay Chat, and is a network service that allows you to talk interactively on the Internet to other users. IRC networks support multiple channels on which groups of people talk. Whatever you type in a channel is seen by all other users of that channel.

The main channel providing FreeBSD help is #freebsdhelp. You can use this service by installing an IRC client like irc-II, connecting to servername irc.openprojects.org:6667 and joining #freebsdhelp. You may also see a #freebsd channel. This is a chat channel for FreeBSD users, and is not intended for questions.

FreeBSD User Groups

Many FreeBSD User Groups around the world offer direct support to users. Many FreeBSD User Groups engage in activities such as installation days, talks and seminars, demonstration nights, and other completely social events. FreeBSD User Groups are a great way of meeting other FreeBSD users in your area. There are a number of published lists of FreeBSD User Groups. Some of the better-known ones are:

The FreeBSD Project maintains an up-to-date list of usergroups on the FreeBSD web site.

Obtaining FreeBSD

The FreeBSD Project makes complete releases of FreeBSD available on the Internet, and full instructions for downloading and installing these releases are available on the FreeBSD web site.

Various commercial organisations also make available FreeBSD releases on CD and DVD. In some cases these releases are simple re-packaging of the releases put out by the project, in other cases the release is more customised than that, with additional documentation and other enhancements.

Some organisations of note are:

DaemonNews Mall

The DaemonNews Mall sells FreeBSD CD releases, as well as books, stickers, and other BSD paraphenalia.

FreeBSD Services

FreeBSD Services produces DVD and CD releases of FreeBSD.

About This Book

When Olaf joined the Linux Documentation Project in 1992, he wrote two small chapters on UUCP and smail, which he meant to contribute to the System Administrator's Guide. Development of TCP/IP networking was just beginning, and when those small chapters started to grow, he wondered aloud whether it would be nice to have a Networking Guide. Great! everyone said. Go for it! So he went for it and wrote the first version of the Linux Networking Guide, which was released in September 1993.

Olaf continued work on the Linux Networking Guide and eventually produced a much enhanced version of the guide. Vince Skahan contributed the original sendmail mail chapter, which was completely replaced in this edition because of a new interface to the sendmail configuration.

The second edition of the guide was a revision and update prompted by O'Reilly & Associates and undertaken by Terry Dawson.[1] Terry has been an amateur radio operator for over 20 years and has worked in the telecommunications industry for over 15 of those. He was co-author of the original NET-FAQ, and has since authored and maintained various networking-related HOWTO documents. Terry has always been an enthusiastic supporter of the Network Administrators Guide project, and added a few new chapters to this version describing features of Linux networking that have been developed since the first edition, plus a bunch of changes to bring the rest of the book up to date.

The FreeBSD edition of the guide was undertaken by Nik Clayton[2]. Nik has been involved with the FreeBSD project since 1993, and has headed up the FreeBSD Documentation Project since 1998. In 2001 Nik approached Terry, Olaf, and O'Reilly about taking the content of the Linux Network Administrator's Guide (which, in reality, is about 70% Unix-specific, rather than Linux specific), and using that as the base for the FreeBSD Network Administrator's Guide.

The exim chapter was contributed by Philip Hazel,[3] who is a lead developer and maintainer of the package.

The book is organized roughly along the sequence of steps you have to take to configure your system for networking. It starts by discussing basic concepts of networks, and TCP/IP-based networks in particular. It then slowly works its way up from configuring TCP/IP at the device level to firewall, accounting, and masquerade configuration, to the setup of common applications such as rlogin and friends, the Network File System, and the Network Information System. This is followed by a chapter on how to set up your machine as a UUCP node. Most of the remaining sections is dedicated to two major applications that run on top of TCP/IP and UUCP: electronic mail and news. A special chapter has been devoted to the IPX protocol and the NCP filesystem, because these are used in many corporate environments where FreeBSD is finding a home.

The email part features an introduction to the more intimate parts of mail transport and routing, and the myriad of addressing schemes you may be confronted with. It describes the configuration and management of exim, a mail transport agent ideal for use in most situations not requiring UUCP, and sendmail, which is for people who have to do more complicated routing involving UUCP.

The news part gives you an overview of how Usenet news works. It covers INN and C News, the two most widely used news transport software packages at the moment, and the use of NNTP to provide newsreading access to a local network. The book closes with a chapter on the care and feeding of the most popular newsreaders on FreeBSD.

Of course, a book can never exhaustively answer all questions you might have. So if you follow the instructions in this book and something still does not work, please be patient. Some of your problems may be due to mistakes on our part (see the section the section called “Submitting Changes”", later in this Preface), but they also may be caused by changes in the operating system. Therefore, you should check the listed information resources first. There's a good chance that you are not alone with your problems, so a fix or at least a proposed workaround is likely to be known.

The Official Printed Version

In Autumn 1993, Andy Oram, who had been around the LDP mailing list from almost the very beginning, asked Olaf about publishing this book at O'Reilly & Associates. He was excited about this book, never having imagined that it would become this successful. He and Andy finally agreed that O'Reilly would produce an enhanced Official Printed Version of the Networking Guide, while Olaf retained the original copyright so that the source of the book could be freely distributed. This means that you can choose freely: you can get the various free forms of the document from your nearest Linux Documentation Project mirror site and print it out, or you can purchase the official printed version from O'Reilly.

Why, then, would you want to pay money for something you can get for free? Is Tim O'Reilly out of his mind for publishing something everyone can print and even sell themselves? [4] Is there any difference between these versions?

The answers are it depends, no, definitely not, and yes and no. O'Reilly & Associates does take a risk in publishing the Networking Guide, and it seems to have paid off for them (they've asked us to do it again). We believe this project serves as a fine example of how the free software world and companies can cooperate to produce something both can benefit from. In our view, the great service O'Reilly is providing to the Linux community (apart from the book becoming readily available in your local bookstore) is that it has helped Linux become recognized as something to be taken seriously: a viable and useful alternative to other commercial operating systems. It's a sad technical bookstore that doesn't have at least one shelf stacked with O'Reilly Linux books.

Why are they publishing it? They see it as their kind of book. It's what they'd hope to produce if they contracted with an author to write about Linux. The pace, level of detail, and style fit in well with their other offerings.

The point of the LDP license is to make sure no one gets shut out. Other people can print out copies of this book, and no one will blame you if you get one of these copies. But if you haven't gotten a chance to see the O'Reilly version, try to get to a bookstore or look at a friend's copy. We think you'll like what you see, and will want to buy it for yourself.

So what about the differences between the printed and online versions? Andy Oram has made great efforts at transforming our ramblings into something actually worth printing. (He has also reviewed a few other books produced by the Linux Documentation Project, contributing whatever professional skills he can to the Linux community.)

Since Andy started reviewing the Networking Guide and editing the copies sent to him, the book has improved vastly from its original form, and with every round of submission and feedback it improves again. The opportunity to take advantage of a professional editor's skill is one not to be wasted. In many ways, Andy's contribution has been as important as that of the authors. The same is also true of the copyeditors, who got the book into the shape you see now. All these edits have been fed back into the online version, so there is no difference in content.

Still, the O'Reilly version will be different. It will be professionally bound, and while you may go to the trouble to print the free version, it is unlikely that you will get the same quality result, and even then it is more unlikely that you'll do it for the price. Secondly, our amateurish attempts at illustration will have been replaced with nicely redone figures by O'Reilly's professional artists. Indexers have generated an improved index, which makes locating information in the book a much simpler process. If this book is something you intend to read from start to finish, you should consider reading the official printed version.

Overview

Chapter 1., Introduction to Networking, discusses the history of FreeBSD and covers basic networking information on UUCP, TCP/IP, various protocols, hardware, and security. The next few chapters deal with configuring FreeBSD for TCP/IP networking and running some major applications. We examine IP a little more closely in Chapter 2., Issues of TCP/IP Networking, before getting our hands dirty with file editing and the like. If you already know how IP routing works and how address resolution is performed, you can skip this chapter.

Chapter 3., Configuringthe NetworkingHardware, deals with very basic configuration issues, such as building a kernel and setting up your Ethernet card. The configuration of your serial ports is covered separately in Chapter 4., Configuring the Serial Hardware, because the discussion does not apply to TCP/IP networking only, but is also relevant for UUCP.

Chapter 5., Configuring TCP/IP Networking, helps you set up your machine for TCP/IP networking. It contains installation hints for standalone hosts with loopback enabled only, and hosts connected to an Ethernet. It also introduces you to a few useful tools you can use to test and debug your setup. Chapter 6., Name Service and Resolver Configuration, discusses how to configure hostname resolution and explains how to set up a name server.

Chapter 7., Serial Line IP, explains how to establish SLIP connections and gives a detailed reference for dip, a tool that allows you to automate most of the necessary steps. Chapter 8., The Point-to-Point Protocol, covers PPP and pppd, the PPP daemon.

Chapter 9., TCP/IP Firewall, extends our discussion of network security and describes the FreeBSD TCP/IP firewall and its configuration tools. IP firewalling provides a means of controlling who can access your network and hosts very precisely.

Chapter 10., IP Accounting, explains how to configure IP Accounting in FreeBSD so you can keep track of how much traffic is going where and who is generating it.

Chapter 11., Network Address Translation, covers a feature called Network Address Translation (NAT), which allows whole IP networks to connect to and use the Internet through a single IP address, hiding internal systems from outsiders in the process.

Chapter 12., ImportantNetwork Features, gives a short introduction to setting up some of the most important network applications, such as rlogin, ssh, etc. This chapter also covers how services are managed by the inetd superuser, and how you may restrict certain security-relevant services to a set of trusted hosts.

Chapter 13., The Network Information System, and Chapter 14., The NetworkFile System, discuss NIS and NFS. NIS is a tool used to distribute administative information, such as user passwords in a local area network. NFS allows you to share filesystems between several hosts in your network.

In ???, we discuss the IPX protocol and the NCP filesystem. These allow FreeBSD to be integrated into a Novell NetWare environment, sharing files and printers with non-FreeBSD machines.

???, gives you an extensive introduction to the administration of Taylor UUCP, a free implementation of the UUCP suite.

The remainder of the book is taken up by a detailed tour of electronic mail and Usenet news. Chapter 15., Electronic Mail, introduces you to the central concepts of electronic mail, like what a mail address looks like, and how the mail handling system manages to get your message to the recipient.

Chapter 16., Sendmail, and ???, cover the configuration of sendmail and exim, two mail transport agents you can use for FreeBSD. This book explains both of them, because exim is easier to install for the beginner, while sendmail provides support for UUCP.

Chapter 17., Netnews, through Chapter 20., Internet News, explain the way news is managed in Usenet and how you install and use C News, nntpd, and INN: three popular software packages for managing Usenet news. After the brief introduction in Chapter 17., Netnews, you can read Chapter 18., C News, if you want to transfer news using C News, a traditional service generally used with UUCP. The following chapters discuss more modern alternatives to C News that use the Internet-based protocol NNTP (Network News Transfer Protocol). Chapter 19., NNTP and thenntpd Daemon covers how to set up a simple NNTP daemon, nntpd, to provide news reading access for a local network, while Chapter 20., Internet News describes a more robust server for more extensive NetNews transfers, the InterNet News daemon (INN). And finally, Chapter 21., Newsreader Configuration, shows you how to configure and maintain various newsreaders.

Conventions Used in This Book

All examples presented in this book assume you are using a sh compatible shell. If you happen to be a csh user, you will have to make appropriate adjustments.

The following is a list of the typographical conventions used in this book:

Italic

Used for file and directory names, program and command names, command-line options, email addresses and pathnames, URLs, and for emphasizing new terms.

Boldface

Used for machine names, hostnames, site names, usernames and IDs, and for occasional emphasis.

Constant Width

Used in examples to show the contents of code files or the output from commands and to indicate environment variables and keywords that appear in code.

Constant Width Italic

Used to indicate variable options, keywords, or text that the user is to replace with an actual value.

Constant Width Bold

Used in examples to show commands or other text that should be typed literally by the user.

Warning

Text appearing in this manner offers a warning. You can make a mistake here that hurts your system or is hard to recover from.

Submitting Changes

We have tested and verified the information in this book to the best of our ability, but you may find that features have changed (or even that we have made mistakes!). Please let us know about any errors you find, as well as your suggestions for future editions, by writing to:

      O'Reilly & Associates, Inc.
      101 Morris Street
      Sebastopol, CA 95472
      1-800-998-9938 (in the U.S. or Canada)
      1-707-829-0515 (international or local)
      1-707-829-0104 (FAX)
    

You can send us messages electronically. To be put on the mailing list or request a catalog, send email to:

      info@oreilly.com     
    

To ask technical questions or comment on the book, send email to:

      bookquestions@oreilly.com
    

We have a web site for the book, where we'll list examples, errata, and any plans for future editions. You can access this page at:

      http://www.oreilly.com/catalog/linag2
    

For more information about this book and others, see the O'Reilly web site:

      http://www.oreilly.com
    

Acknowledgments

This edition of the Networking Guide owes almost everything to the outstanding work of Olaf and Vince. It is difficult to appreciate the effort that goes into researching and writing a book of this nature until you've had a chance to work on one yourself. Updating the book was a challenging task, but with an excellent base to work from, it was an enjoyable one.

This book owes very much to the numerous people who took the time to proof-read it and help iron out many mistakes, both technical and grammatical (never knew that there was such a thing as a dangling participle). Phil Hughes, John Macdonald, and Erik Ratcliffe all provided very helpful (and on the whole, quite consistent) feedback on the content of the book.

We also owe many thanks to the people at O'Reilly we've had the pleasure to work with: Sarah Jane Shangraw, who got the book into the shape you can see now; Maureen Dempsey, who copyedited the text; Rob Romano, Rhon Porter, and Chris Reilley, who created all the figures; Hanna Dyer, who designed the cover; Alicia Cech, David Futato, and Jennifer Niedherst for the internal layout; Lars Kaufman for suggesting old woodcuts as a visual theme; Judy Hoer for the index; and finally, Tim O'Reilly for the courage to take up such a project.

We are greatly indebted to Andres Seplveda, Wolfgang Michaelis, Michael K. Johnson, and all developers who spared the time to check the information provided in the Networking Guide. Phil Hughes, John MacDonald, and Eric Ratcliffe contributed invaluable comments on the second edition. We also wish to thank all those who read the first version of the Networking Guide and sent corrections and suggestions. You can find a hopefully complete list of contributors in the file Thanks in the online distribution. Finally, this book would not have been possible without the support of Holger Grothe, who provided Olaf with the Internet connectivity he needed to make the original version happen.

Olaf would also like to thank the following groups and companies that printed the first edition of the Networking Guide and have donated money either to him or to the Linux Documentation Project as a whole: Linux Support Team, Erlangen, Germany; S.u.S.E. GmbH, Fuerth, Germany; and Linux System Labs, Inc., Clinton Twp., United States, RedHat Software, North Carolina, United States.

Terry thanks his wife, Maggie, who patiently supported him throughout his participation in the project despite the challenges presented by the birth of their first child, Jack. Additionally, he thanks the many people of the Linux community who either nurtured or suffered him to the point at which he could actually take part and actively contribute. I'll help you if you promise to help someone else in return.

The Hall of Fame

Besides those we have already mentioned, a large number of people have contributed to the Networking Guide, by reviewing it and sending us corrections and suggestions. We are very grateful.

Here is a list of those whose contributions left a trace in our mail folders.

Al Longyear, Alan Cox, Andres Seplveda, Ben Cooper, Cameron Spitzer, Colin McCormack, D.J. Roberts, Emilio Lopes, Fred N. van Kempen, Gert Doering, Greg Hankins, Heiko Eissfeldt, J.P. Szikora, Johannes Stille, Karl Eichwalder, Les Johnson, Ludger Kunz, Marc van Diest, Michael K. Johnson, Michael Nebel, Michael Wing, Mitch D'Souza, Paul Gortmaker, Peter Brouwer, Peter Eriksson, Phil Hughes, Raul Deluth Miller, Rich Braun, Rick Sladkey, Ronald Aarts, Swen Themmler, Terry Dawson, Thomas Quinot, and Yury Shevchuk.



[1] Terry Dawson can be reached at terry@linux.org.au.

[2] Nik Clayton can be reached at <nik@FreeBSD.org>.

[3] Philip Hazel can be reached at ph10@cus.cam.ac.uk.

[4] Note that while you are allowed to print out the online version, you may not run the O'Reilly book through a photocopier, much less sell any of its (hypothetical) copies.

History

The idea of networking is probably as old as telecommunications itself. Consider people living in the Stone Age, when drums may have been used to transmit messages between individuals. Suppose caveman A wants to invite caveman B over for a game of hurling rocks at each other, but they live too far apart for B to hear A banging his drum. What are A's options? He could 1) walk over to B's place, 2) get a bigger drum, or 3) ask C, who lives halfway between them, to forward the message. The last option is called networking.

Of course, we have come a long way from the primitive pursuits and devices of our forebears. Nowadays, we have computers talk to each other over vast assemblages of wires, fiber optics, microwaves, and the like, to make an appointment for Saturday's soccer match. [5] In the following description, we will deal with the means and ways by which this is accomplished, but leave out the wires, as well as the soccer part.

We will describe three types of networks in this guide. We will focus on TCP/IP most heavily because it is the most popular protocol suite in use on both Local Area Networks (LANs) and Wide Area Networks (WANs), such as the Internet. We will also take a look at UUCP and IPX. UUCP was once commonly used to transport news and mail messages over dialup telephone connections. It is less common today, but is still useful in a variety of situations. The IPX protocol is used most commonly in the Novell NetWare environment and we'll describe how to use it to connect your FreeBSD machine into a Novell network. Each of these protocols are networking protocols and are used to carry data between host computers. We'll discuss how they are used and introduce you to their underlying principles.

We define a network as a collection of hosts that are able to communicate with each other, often by relying on the services of a number of dedicated hosts that relay data between the participants. Hosts are often computers, but need not be; one can also think of X terminals or intelligent printers as hosts. Small agglomerations of hosts are also called sites.

Communication is impossible without some sort of language or code. In computer networks, these languages are collectively referred to as protocols. However, you shouldn't think of written protocols here, but rather of the highly formalized code of behavior observed when heads of state meet, for instance. In a very similar fashion, the protocols used in computer networks are nothing but very strict rules for the exchange of messages between two or more hosts.

TCP/IP Networks

Modern networking applications require a sophisticated approach to carrying data from one machine to another. If you are managing a FreeBSD machine that has many users, each of whom may wish to simultaneously connect to remote hosts on a network, you need a way of allowing them to share your network connection without interfering with each other. The approach that a large number of modern networking protocols uses is called packet-switching. A packet is a small chunk of data that is transferred from one machine to another across the network. The switching occurs as the datagram is carried across each link in the network. A packet-switched network shares a single network link among many users by alternately sending packets from one user to another across that link.

The solution that Unix systems, and subsequently many non-Unix systems, have adopted is known as TCP/IP. When talking about TCP/IP networks you will hear the term datagram, which technically has a special meaning but is often used interchangeably with packet. In this section, we will have a look at underlying concepts of the TCP/IP protocols.

Introduction to TCP/IP Networks

TCP/IP traces its origins to a research project funded by the United States Defense Advanced Research Projects Agency (DARPA) in 1969. The ARPANET was an experimental network that was converted into an operational one in 1975 after it had proven to be a success.

In 1983, the new protocol suite TCP/IP was adopted as a standard, and all hosts on the network were required to use it. When ARPANET finally grew into the Internet (with ARPANET itself passing out of existence in 1990), the use of TCP/IP had spread to networks beyond the Internet itself. Many companies have now built corporate TCP/IP networks, and the Internet has grown to a point at which it could almost be considered a mainstream consumer technology. It is difficult to read a newspaper or magazine now without seeing reference to the Internet; almost everyone can now use it.

For something concrete to look at as we discuss TCP/IP throughout the following sections, we will consider Groucho Marx University (GMU), situated somewhere in Fredland, as an example. Most departments run their own Local Area Networks, while some share one and others run several of them. They are all interconnected and hooked to the Internet through a single high-speed link.

Suppose your FreeBSD box is connected to a LAN of Unix hosts at the Mathematics department, and its name is erdos. To access a host at the Physics department, say quark, you enter the following command:

$ rlogin quark.physics
Welcome to the Physics Department at GMU
(ttyq2) login:

At the prompt, you enter your login name, say andres, and your password. You are then given a shell[6] on quark, to which you can type as if you were sitting at the system's console. After you exit the shell, you are returned to your own machine's prompt. You have just used one of the instantaneous, interactive applications that TCP/IP provides: remote login.

While being logged into quark, you might also want to run a graphical user interface application, like a word processing program, a graphics drawing program, or even a World Wide Web browser. The X windows system is a fully network-aware graphical user environment, and it is available for many different computing systems. To tell this application that you want to have its windows displayed on your host's screen, you have to set the DISPLAY environment variable:

$ DISPLAY=erdos.maths:0.0
$ export DISPLAY

If you now start your application, it will contact your X server instead of quark's, and display all its windows on your screen. Of course, this requires that you have X11 runnning on erdos. The point here is that TCP/IP allows quark and erdos to send X11 packets back and forth to give you the illusion that you're on a single system. The network is almost transparent here.

Another very important application in TCP/IP networks is NFS, which stands for Network File System. It is another form of making the network transparent, because it basically allows you to treat directory hierarchies from other hosts as if they were local file systems and look like any other directories on your host. For example, all users' home directories can be kept on a central server machine from which all other hosts on the LAN mount them. The effect is that users can log in to any machine and find themselves in the same home directory. Similarly, it is possible to share large amounts of data (such as a database, documentation or application programs) among many hosts by maintaining one copy of the data on a server and allowing other hosts to access it. We will come back to NFS in Chapter 14., The NetworkFile System.

Of course, these are only examples of what you can do with TCP/IP networks. The possibilities are almost limitless, and we'll introduce you to more as you read on through the book.

We will now have a closer look at the way TCP/IP works. This information will help you understand how and why you have to configure your machine. We will start by examining the hardware, and slowly work our way up.

Ethernets

The most common type of LAN hardware is known as Ethernet. In its simplest form, it consists of a single cable with hosts attached to it through connectors, taps, or transceivers. Simple Ethernets are relatively inexpensive to install, which together with a net transfer rate of 10, 100, or even 1,000 Megabits per second, accounts for much of its popularity.

Ethernets come in three flavors: thick, thin, and twisted pair. Thin and thick Ethernet each use a coaxial cable, differing in diameter and the way you may attach a host to this cable. Thin Ethernet uses a T-shaped BNC connector, which you insert into the cable and twist onto a plug on the back of your computer. Thick Ethernet requires that you drill a small hole into the cable, and attach a transceiver using a vampire tap. One or more hosts can then be connected to the transceiver. Thin and thick Ethernet cable can run for a maximum of 200 and 500 meters respectively, and are also called 10base-2 and 10base-5. The base refers to baseband modulation and simply means that the data is directly fed onto the cable without any modem. The number at the start refers to the speed in Megabits per second, and the number at the end is the maximum length of the cable in hundreds of metres. Twisted pair uses a cable made of two pairs of copper wires and usually requires additional hardware known as active hubs. Twisted pair is also known as 10base-T, the T meaning twisted pair. The 100 Megabits per second version is known as 100base-T.

To add a host to a thin Ethernet installation, you have to disrupt network service for at least a few minutes because you have to cut the cable to insert the connector. Although adding a host to a thick Ethernet system is a little complicated, it does not typically bring down the network. Twisted pair Ethernet is even simpler. It uses a device called a hub, which serves as an interconnection point. You can insert and remove hosts from a hub without interrupting any other users at all.

Many people prefer thin Ethernet for small networks because it is very inexpensive; PC cards come for as little as US 30 (many companies are literally throwing them out now), and cable is in the range of a few cents per meter. However, for large-scale installations, either thick Ethernet or twisted pair is more appropriate. For example, the Ethernet at GMU's Mathematics Department originally chose thick Ethernet because it is a long route that the cable must take so traffic will not be disrupted each time a host is added to the network. Twisted pair installations are now very common in a variety of installations. The Hub hardware is dropping in price and small units are now available at a price that is attractive to even small domestic networks. Twisted pair cabling can be significantly cheaper for large installations, and the cable itself is much more flexible than the coaxial cables used for the other Ethernet systems. The network administrators in GMU's mathematics department are planning to replace the existing network with a twisted pair network in the coming finanical year because it will bring them up to date with current technology and will save them significant time when installing new host computers and moving existing computers around.

One of the drawbacks of Ethernet technology is its limited cable length, which precludes any use of it other than for LANs. However, several Ethernet segments can be linked to one another using repeaters, bridges, or routers. Repeaters simply copy the signals between two or more segments so that all segments together will act as if they are one Ethernet. Due to timing requirements, there may not be more than four repeaters between any two hosts on the network. Bridges and routers are more sophisticated. They analyze incoming data and forward it only when the recipient host is not on the local Ethernet.

Ethernet works like a bus system, where a host may send packets (or frames) of up to 1,500 bytes to another host on the same Ethernet. A host is addressed by a six-byte address hardcoded into the firmware of its Ethernet network interface card (NIC). These addresses are usually written as a sequence of two-digit hex numbers separated by colons, as in aa:bb:cc:dd:ee:ff.

A frame sent by one station is seen by all attached stations, but only the destination host actually picks it up and processes it. If two stations try to send at the same time, a collision occurs. Collisions on an Ethernet are detected very quickly by the electronics of the interface cards and are resolved by the two stations aborting the send, each waiting a random interval and re-attempting the transmission. You'll hear lots of stories about collisions on Ethernet being a problem and that utilization of Ethernets is only about 30 percent of the available bandwidth because of them. Collisions on Ethernet are a normal phenomenon, and on a very busy Ethernet network you shouldn't be surprised to see collision rates of up to about 30 percent. Utilization of Ethernet networks is more realistically limited to about 60 percent before you need to start worrying about it.[7]

Other Types of Hardware

In larger installations, such as Groucho Marx University, Ethernet is usually not the only type of equipment used. There are many other data communications protocols available and in use. All of the protocols listed are supported by FreeBSD, but due to space constraints we'll describe them briefly. Many of the protocols have documents that describe them in detail, so you should refer to those if you're interested in exploring those that we don't describe in this book.

At Groucho Marx University, each department's LAN is linked to the campus high-speed backbone network, which is a fiber optic cable running a network technology called Fiber Distributed Data Interface (FDDI). FDDI uses an entirely different approach to transmitting data, which basically involves sending around a number of tokens, with a station being allowed to send a frame only if it captures a token. The main advantage of a token-passing protocol is a reduction in collisions. Therefore, the protocol can more easily attain the full speed of the transmission medium, up to 100 Mbps in the case of FDDI. FDDI, being based on optical fiber, offers a significant advantage because its maximum cable length is much greater than wire-based technologies. It has limits of up to around 200 km, which makes it ideal for linking many buildings in a city, or as in GMU's case, many buildings on a campus.

Similarly, if there is any IBM computing equipment around, an IBM Token Ring network is quite likely to be installed. Token Ring is used as an alternative to Ethernet in some LAN environments, and offers the same sorts of advantages as FDDI in terms of achieving full wire speed, but at lower speeds (4 Mbps or 16 Mbps), and lower cost because it is based on wire rather than fiber. In Linux, Token Ring networking is configured in almost precisely the same way as Ethernet, so we don't cover it specifically.

A more recent protocol commonly offered by telecommunications companies is called Frame Relay. The Frame Relay protocol shares a number of technical features with the X.25 protocol, but is much more like the IP protocol in behavior. Like X.25, Frame Relay requires special synchronous serial hardware. Because of their similarities, many cards support both of these protocols. An alternative is available that requires no special internal hardware, again relying on an external device called a Frame Relay Access Device (FRAD) to manage the encapsulation of Ethernet packets into Frame Relay packets for transmission across a network. Frame Relay is ideal for carrying TCP/IP between sites. FreeBSD provides drivers that support some types of internal Frame Relay devices.

If you need higher speed networking that can carry many different types of data, such as digitized voice and video, alongside your usual data, ATM (Asynchronous Transfer Mode) is probably what you'll be interested in. ATM is a new network technology that has been specifically designed to provide a manageable, high-speed, low-latency means of carrying data, and provide control over the Quality of Service (Q.S.). Many telecommunications companies are deploying ATM network infrastructure because it allows the convergence of a number of different network services into one platform, in the hope of achieving savings in management and support costs. ATM is often used to carry TCP/IP.

Other types of Internet access involve dialing up a central system over slow but cheap serial lines (telephone, ISDN, and so on). These require yet another protocol for transmission of packets, such as SLIP or PPP, which will be described later.

The Internet Protocol

Of course, you wouldn't want your networking to be limited to one Ethernet or one point-to-point data link. Ideally, you would want to be able to communicate with a host computer regardless of what type of physical network it is connected to. For example, in larger installations such as Groucho Marx University, you usually have a number of separate networks that have to be connected in some way. At GMU, the Math department runs two Ethernets: one with fast machines for professors and graduates, and another with slow machines for students. Both are linked to the FDDI campus backbone network.

This connection is handled by a dedicated host called a gateway that handles incoming and outgoing packets by copying them between the two Ethernets and the FDDI fiber optic cable. For example, if you are at the Math department and want to access quark on the Physics department's LAN from your FreeBSD box, the networking software will not send packets to quark directly because it is not on the same Ethernet. Therefore, it has to rely on the gateway to act as a forwarder. The gateway (named sophus) then forwards these packets to its peer gateway niels at the Physics department, using the backbone network, with niels delivering it to the destination machine. Data flow between erdos and quark is shown in Figure 1.1..

Figure 1.1. The three steps of sending a datagram from erdos to quark

This scheme of directing data to a remote host is called routing, and packets are often referred to as datagrams in this context. To facilitate things, datagram exchange is governed by a single protocol that is independent of the hardware used: IP, or Internet Protocol. In Chapter 2., Issues of TCP/IP Networking, we will cover IP and the issues of routing in greater detail.

The main benefit of IP is that it turns physically dissimilar networks into one apparently homogeneous network. This is called internetworking, and the resulting meta-network is called an internet. Note the subtle difference here between an internet and the Internet. The latter is the official name of one particular global internet.

Of course, IP also requires a hardware-independent addressing scheme. This is achieved by assigning each host a unique 32-bit number called the IP address. An IP address is usually written as four decimal numbers, one for each 8-bit portion, separated by dots. For example, quark might have an IP address of 0x954C0C04, which would be written as 149.76.12.4. This format is also called dotted decimal notation and sometimes dotted quad notation. It is increasingly going under the name IPv4 (for Internet Protocol, Version 4) because a new standard called IPv6 offers much more flexible addressing, as well as other modern features. It will be at least a year after the release of this edition before IPv6 is in use.

You will notice that we now have three different types of addresses: first there is the host's name, like quark, then there are IP addresses, and finally, there are hardware addresses, like the 6-byte Ethernet address. All these addresses somehow have to match so that when you type rlogin quark, the networking software can be given quark's IP address; and when IP delivers any data to the Physics department's Ethernet, it somehow has to find out what Ethernet address corresponds to the IP address.

We will deal with these situations in Chapter 2., Issues of TCP/IP Networking. For now, it's enough to remember that these steps of finding addresses are called hostname resolution, for mapping hostnames onto IP addresses, and address resolution, for mapping the latter to hardware addresses.

IP Over Serial Lines

On serial lines, a de facto standard exists known as SLIP, or Serial Line IP. A modification of SLIP known as CSLIP, or Compressed SLIP, performs compression of IP headers to make better use of the relatively low bandwidth provided by most serial links. Another serial protocol is PPP, or the Point-to-Point Protocol. PPP is more modern than SLIP and includes a number of features that make it more attractive. Its main advantage over SLIP is that it isn't limited to transporting IP datagrams, but is designed to allow just about any protocol to be carried across it.

The Transmission Control Protocol

Sending datagrams from one host to another is not the whole story. If you log in to quark, you want to have a reliable connection between your rlogin process on erdos and the shell process on quark. Thus, the information sent to and fro must be split up into packets by the sender and reassembled into a character stream by the receiver. Trivial as it seems, this involves a number of complicated tasks.

A very important thing to know about IP is that, by intent, it is not reliable. Assume that ten people on your Ethernet started downloading the latest release of Netscape's web browser source code from GMU's FTP server. The amount of traffic generated might be too much for the gateway to handle, because it's too slow and it's tight on memory. Now if you happen to send a packet to quark, sophus might be out of buffer space for a moment and therefore unable to forward it. IP solves this problem by simply discarding it. The packet is irrevocably lost. It is therefore the responsibility of the communicating hosts to check the integrity and completeness of the data and retransmit it in case of error.

This process is performed by yet another protocol, Transmission Control Protocol (TCP), which builds a reliable service on top of IP. The essential property of TCP is that it uses IP to give you the illusion of a simple connection between the two processes on your host and the remote machine, so you don't have to care about how and along which route your data actually travels. A TCP connection works essentially like a two-way pipe that both processes may write to and read from. Think of it as a telephone conversation.

TCP identifies the end points of such a connection by the IP addresses of the two hosts involved and the number of a port on each host. Ports may be viewed as attachment points for network connections. If we are to strain the telephone example a little more, and you imagine that cities are like hosts, one might compare IP addresses to area codes (where numbers map to cities), and port numbers to local codes (where numbers map to individual people's telephones). An individual host may support many different services, each distinguished by its own port number.

In the rlogin example, the client application (rlogin) opens a port on erdos and connects to port 513 on quark, to which the rlogind server is known to listen. This action establishes a TCP connection. Using this connection, rlogind performs the authorization procedure and then spawns the shell. The shell's standard input and output are redirected to the TCP connection, so that anything you type to rlogin on your machine will be passed through the TCP stream and be given to the shell as standard input.

The User Datagram Protocol

Of course, TCP isn't the only user protocol in TCP/IP networking. Although suitable for applications like rlogin, the overhead involved is prohibitive for applications like NFS, which instead uses a sibling protocol of TCP called UDP, or User Datagram Protocol. Just like TCP, UDP allows an application to contact a service on a certain port of the remote machine, but it doesn't establish a connection for this. Instead, you use it to send single packets to the destination servicehence its name.

Assume you want to request a small amount of data from a database server. It takes at least three datagrams to establish a TCP connection, another three to send and confirm a small amount of data each way, and another three to close the connection. UDP provides us with a means of using only two datagrams to achieve almost the same result. UDP is said to be connectionless, and it doesn't require us to establish and close a session. We simply put our data into a datagram and send it to the server; the server formulates its reply, puts the data into a datagram addressed back to us, and transmits it back. While this is both faster and more efficient than TCP for simple transactions, UDP was not designed to deal with datagram loss. It is up to the application, a name server for example, to take care of this.

More on Ports

Ports may be viewed as attachment points for network connections. If an application wants to offer a certain service, it attaches itself to a port and waits for clients (this is also called listening on the port). A client who wants to use this service allocates a port on its local host and connects to the server's port on the remote host. The same port may be open on many different machines, but on each machine only one process can open a port at any one time.

An important property of ports is that once a connection has been established between the client and the server, another copy of the server may attach to the server port and listen for more clients. This property permits, for instance, several concurrent remote logins to the same host, all using the same port 513. TCP is able to tell these connections from one another because they all come from different ports or hosts. For example, if you log in twice to quark from erdos, the first rlogin client will use the local port 1023, and the second one will use port 1022. Both, however, will connect to the same port 513 on quark. The two connections will be distinguished by use of the port numbers used at erdos.

This example shows the use of ports as rendezvous points, where a client contacts a specific port to obtain a specific service. In order for a client to know the proper port number, an agreement has to be reached between the administrators of both systems on the assignment of these numbers. For services that are widely used, such as rlogin, these numbers have to be administered centrally. This is done by the IETF (Internet Engineering Task Force), which regularly releases an RFC titled Assigned Numbers (RFC-1700). It describes, among other things, the port numbers assigned to well-known services. FreeBSD uses a file called /etc/services that maps service names to numbers.

It is worth noting that although both TCP and UDP connections rely on ports, these numbers do not conflict. This means that TCP port 513, for example, is different from UDP port 513. In fact, these ports serve as access points for two different services, namely rlogin (TCP) and rwho (UDP).

The Socket Library

In Unix operating systems, the software performing all the tasks and protocols described above is usually part of the kernel, and so it is in FreeBSD. The programming interface most common in the Unix world is the Berkeley Socket Library. Its name derives from a popular analogy that views ports as sockets and connecting to a port as plugging in. It provides the bind call to specify a remote host, a transport protocol, and a service that a program can connect or listen to (using connect, listen, and accept). The socket library is somewhat more general in that it provides not only a class of TCP/IP-based sockets (the AF_INET sockets), but also a class that handles connections local to the machine (the AF_UNIX class). Some implementations can also handle other classes, like the XNS (Xerox Networking System) protocol or X.25.

In FreeBSD, the socket library is part of the standard libc C library. It supports the AF_INET and AF_INET6 sockets for TCP/IP and AF_UNIX for Unix domain sockets. It also supports AF_IPX for Novell's network protocols, AF_ATM for the ATM network protocol. Other protocol families are being developed and will be added in time.

UUCP Networks

Unix-to-Unix Copy (UUCP) started out as a package of programs that transferred files over serial lines, scheduled those transfers, and initiated execution of programs on remote sites. It has undergone major changes since its first implementation in the late seventies, but it is still rather spartan in the services it offers. Its main application is still in Wide Area Networks, based on periodic dialup telephone links.

UUCP was first developed by Bell Laboratories in 1977 for communication between their Unix development sites. In mid-1978, this network already connected over 80 sites. It was running email as an application, as well as remote printing. However, the system's central use was in distributing new software and bug fixes. Today, UUCP is not confined solely to the Unix environment. There are free and commercial ports available for a variety of platforms, including AmigaOS, DOS, and Atari's TOS.

One of the main disadvantages of UUCP networks is that they operate in batches. Rather than having a permanent connection established between hosts, it uses temporary connections. A UUCP host machine might dial in to another UUCP host only once a day, and then only for a short period of time. While it is connected, it will transfer all of the news, email, and files that have been queued, and then disconnect. It is this queuing that limits the sorts of applications that UUCP can be applied to. In the case of email, a user may prepare an email message and post it. The message will stay queued on the UUCP host machine until it dials in to another UUCP host to transfer the message. This is fine for network services such as email, but is no use at all for services such as rlogin.

Despite these limitations, there are still many UUCP networks operating all over the world, run mainly by hobbyists, which offer private users network access at reasonable prices. The main reason for the longtime popularity of UUCP was that it was very cheap compared to having your computer directly connected to the Internet. To make your computer a UUCP node, all you needed was a modem, a working UUCP implementation, and another UUCP node that was willing to feed you mail and news. Many people were prepared to provide UUCP feeds to individuals because such connections didn't place much demand on their existing network.

We cover the configuration of UUCP in a chapter of its own later in the book, but we won't focus on it too heavily, as it's being replaced rapidly with TCP/IP, now that cheap Internet access has become commonly available in most parts of the world.

Maintaining Your System

Throughout this book, we will mainly deal with installation and configuration issues. Administration is, however, much more than thatafter setting up a service, you have to keep it running, too. For most services, only a little attendance will be necessary, while some, like mail and news, require that you perform routine tasks to keep your system up to date. We will discuss these tasks in later chapters.

The absolute minimum in maintenance is to check system and per-application log files regularly for error conditions and unusual events. Often, you will want to do this by writing a couple of administrative shell scripts and periodically running them from cron. The source distributions of some major applications, like inn or C News, contain such scripts. You only have to tailor them to suit your needs and preferences.

The output from any of your cron jobs should be mailed to an administrative account. By default, many applications will send error reports, usage statistics, or log file summaries to the root account. This makes sense only if you log in as root frequently; a much better idea is to forward root's mail to your personal account by setting up a mail alias as described in ??? or Chapter 16., Sendmail.

However carefully you have configured your site, Murphy's law guarantees that some problem will surface eventually. Therefore, maintaining a system also means being available for complaints. Usually, people expect that the system administrator can at least be reached via email as root, but there are also other addresses that are commonly used to reach the person responsible for a specific aspect of maintenence. For instance, complaints about a malfunctioning mail configuration will usually be addressed to postmaster, and problems with the news system may be reported to newsmaster or usenet. Mail to hostmaster should be redirected to the person in charge of the host's basic network services, and the DNS name service if you run a name server.

System Security

Another very important aspect of system administration in a network environment is protecting your system and users from intruders. Carelessly managed systems offer malicious people many targets. Attacks range from password guessing to Ethernet snooping, and the damage caused may range from faked mail messages to data loss or violation of your users' privacy. We will mention some particular problems when discussing the context in which they may occur and some common defenses against them.

This section will discuss a few examples and basic techniques for dealing with system security. Of course, the topics covered cannot treat all security issues you may be faced with in detail; they merely serve to illustrate the problems that may arise. Therefore, reading a good book on security is an absolute must, especially in a networked system.

System security starts with good system administration. This includes checking the ownership and permissions of all vital files and directories and monitoring use of privileged accounts. The COPS program, for instance, will check your file system and common configuration files for unusual permissions or other anomalies. It is also wise to use a password suite that enforces certain rules on the users' passwords that make them hard to guess. The shadow password suite, for instance, requires a password to have at least five letters and to contain both upper- and lowercase numbers, as well as non-alphabetic characters.

When making a service accessible to the network, make sure to give it least privilege; don't permit it to do things that aren't required for it to work as designed. For example, you should make programs setuid to root or some other privileged account only when necessary. Also, if you want to use a service for only a very limited application, don't hesitate to configure it as restrictively as your special application allows. For instance, if you want to allow diskless hosts to boot from your machine, you must provide Trivial File Transfer Protocol (TFTP) so that they can download basic configuration files from the /boot directory. However, when used unrestrictively, TFTP allows users anywhere in the world to download any world-readable file from your system. If this is not what you want, restrict TFTP service to the /boot directory. [8]

You might also want to restrict certain services to users from certain hosts, say from your local network. In Chapter 12., ImportantNetwork Features, we introduce tcpd, which does this for a variety of network applications. More sophisticated methods of restricting access to particular hosts or services will be explored later in Chapter 9., TCP/IP Firewall.

Another important point is to avoid dangerous software. Of course, any software you use can be dangerous because software may have bugs that clever people might exploit to gain access to your system. Things like this happen, and there's no complete protection against it. This problem affects free software and commercial products alike.[9] However, programs that require special privilege are inherently more dangerous than others, because any loophole can have drastic consequences.[10] If you install a setuid program for network purposes, be doubly careful to check the documentation so that you don't create a security breach by accident.

Another source of concern should be programs that enable login or command execution with limited authentication. The rlogin, rsh, and rexec commands are all very useful, but offer very limited authentication of the calling party. Authentication is based on trust of the calling host name obtained from a name server (we'll talk about these later), which can be faked. Today it should be standard practice to disable the r commands completely and replace them with the ssh suite of tools. The ssh tools use a much more reliable authentication method and provide other services, such as encryption and compression, as well.

You can never rule out the possibility that your precautions might fail, regardless of how careful you have been. You should therefore make sure you detect intruders early. Checking the system log files is a good starting point, but the intruder is probably clever enough to anticipate this action and will delete any obvious traces he or she left. However, there are tools like tripwire, written by Gene Kim and Gene Spafford, that allow you to check vital system files to see if their contents or permissions have been changed. tripwire computes various strong checksums over these files and stores them in a database. During subsequent runs, the checksums are recomputed and compared to the stored ones to detect any modifications.



[5] The original spirit of which (see above) still shows on some occasions in Europe.

[6] The shell is a command-line interface to the Unix operating system. It's similar to the DOS prompt in a Microsoft Windows environment, albeit much more powerful.

[7] The Ethernet FAQ at http://www.faqs.org/faqs/LANs/ethernet-faq/ talks about this issue, and a wealth of detailed historical and technical information is available at Charles Spurgeon's Ethernet web site at http://wwwhost.ots.utexas.edu/ethernet/.

[8] We will come back to this topic in Chapter 12., ImportantNetwork Features.

[9] There have been commercial Unix systems (that you have to pay lots of money for) that came with a setuid- root shell script, which allowed users to gain root privilege using a simple standard trick.

[10] In 1988, the RTM worm brought much of the Internet to a grinding halt, partly by exploiting a gaping hole in some programs including the sendmail program. This hole has long since been fixed.

In this chapter we turn to the configuration decisions you'll need to make when connecting your FreeBSD machine to a TCP/IP network, including dealing with IP addresses, hostnames, and routing issues. This chapter gives you the background you need in order to understand what your setup requires, while the next chapters cover the tools you will use.

To learn more about TCP/IP and the reasons behind it, refer to the three-volume set Internetworking with TCP/IP, by Douglas R. Comer (Prentice Hall). For a more detailed guide to managing a TCP/IP network, see TCP/IP Network Administration by Craig Hunt (O'Reilly).

Networking Interfaces

To hide the diversity of equipment that may be used in a networking environment, TCP/IP defines an abstract interface through which the hardware is accessed. This interface offers a set of operations that is the same for all types of hardware and basically deals with sending and receiving packets.

For each networking device, a corresponding interface has to be present in the kernel. For example, ethernet interfaces (that correspond to network cards) are named according to the driver for that card; xe0 for the first Xircom Ethernet card, fxp1 for the second Intel Fast Ether Express card, and so on. Logical interfaces (that correspond to an abstraction rather than physical hardware) have more logical names; for example, tunnel interfaces (used for PPP, and other services) have names like tun0 and tun1. These interface names are used when you want to specify a particular device in a configuration command, and they have no meaning beyond this use.

Before being used by TCP/IP networking, an interface must be assigned an IP address that serves as its identification when communicating with the rest of the world. This address is different from the interface name mentioned previously; if you compare an interface to a door, the address is like the nameplate pinned on it.

Other device parameters may be set, like the maximum size of datagrams that can be processed by a particular piece of hardware, which is referred to as Maximum Transfer Unit (MTU). Other attributes will be introduced later. Fortunately, most attributes have sensible defaults.

IP Addresses

As mentioned in Chapter 1., Introduction to Networking, the IP networking protocol understands addresses as 32-bit numbers. Each machine must be assigned a number unique to the networking environment. [11] If you are running a local network that does not have TCP/IP traffic with other networks, you may assign these numbers according to your personal preferences. There are some IP address ranges that have been reserved for such private networks. These ranges are listed in Table 2.1.. However, for sites on the Internet, numbers are assigned by a central authority, the Network Information Center (NIC).[12]

IP addresses are split up into four eight-bit numbers called octets for readability. For example, quark.physics.groucho.edu has an IP address of 0x954C0C04, which is written as 149.76.12.4. This format is often referred to as dotted quad notation.

Another reason for this notation is that IP addresses are split into a network number, which is contained in the leading octets, and a host number, which is the remainder. When applying to the NIC for IP addresses, you are not assigned an address for each single host you plan to use. Instead, you are given a network number and allowed to assign all valid IP addresses within this range to hosts on your network according to your preferences.

The size of the host part depends on the size of the network. To accommodate different needs, several classes of networks, defining different places to split IP addresses, have been defined. The class networks are described here:

Class A

Class A comprises networks 1.0.0.0 through 127.0.0.0. The network number is contained in the first octet. This class provides for a 24-bit host part, allowing roughly 1.6 million hosts per network.

Class B

Class B contains networks 128.0.0.0 through 191.255.0.0; the network number is in the first two octets. This class allows for 16,320 nets with 65,024 hosts each.

Class C

Class C networks range from 192.0.0.0 through 223.255.255.0, with the network number contained in the first three octets. This class allows for nearly 2 million networks with up to 254 hosts.

Classes D, E, and F

Addresses falling into the range of 224.0.0.0 through 254.0.0.0 are either experimental or are reserved for special purpose use and don't specify any network. IP Multicast, which is a service that allows material to be transmitted to many points on an internet at one time, has been assigned addresses from within this range.

If we go back to the example in Chapter 1, we find that 149.76.12.4, the address of quark, refers to host 12.4 on the class B network 149.76.0.0.

You may have noticed that not all possible values in the previous list were allowed for each octet in the host part. This is because octets 0 and 255 are reserved for special purposes. An address where all host part bits are 0 refers to the network, and an address where all bits of the host part are 1 is called a broadcast address. This refers to all hosts on the specified network simultaneously. Thus, 149.76.255.255 is not a valid host address, but refers to all hosts on network 149.76.0.0.

A number of network addresses are reserved for special purposes. 0.0.0.0 and 127.0.0.0 are two such addresses. The first is called the default route, and the latter is the loopback address. The default route has to do with the way the IP routes datagrams.

Network 127.0.0.0 is reserved for IP traffic local to your host. Usually, address 127.0.0.1 will be assigned to a special interface on your host, the loopback interface, which acts like a closed circuit. Any IP packet handed to this interface from TCP or UDP will be returned to them as if it had just arrived from some network. This allows you to develop and test networking software without ever using a real network. The loopback network also allows you to use networking software on a standalone host. This may not be as uncommon as it sounds; for instance, many UUCP sites don't have IP connectivity at all, but still want to run the INN news system. For proper operation on Linux, INN requires the loopback interface.

Some address ranges from each of the network classes have been set aside and designated reserved or private address ranges. These addresses are reserved for use by private networks and are not routed on the Internet. They are commonly used by organizations building their own intranet, but even small networks often find them useful. The reserved network addresses appear in Table 2.1..

Table 2.1. IP Address Ranges Reserved for Private Use

ClassNetworks
A10.0.0.0 through 10.255.255.255
B172.16.0.0 through 172.31.0.0
C192.168.0.0 through 192.168.255.0

Address Resolution

Now that you've seen how IP addresses are composed, you may be wondering how they are used on an Ethernet or Token Ring network to address different hosts. After all, these protocols have their own addresses to identify hosts that have absolutely nothing in common with an IP address, don't they? Right.

A mechanism is needed to map IP addresses onto the addresses of the underlying network. The mechanism used is the Address Resolution Protocol (ARP). In fact, ARP is not confined to Ethernet or Token Ring, but is used on other types of networks, such as the amateur radio AX.25 protocol. The idea underlying ARP is exactly what most people do when they have to find Mr. X in a throng of 150 people: the person who wants him calls out loudly enough that everyone in the room can hear them, expecting him to respond if he is there. When he responds, we know which person he is.

When ARP wants to find the Ethernet address corresponding to a given IP address, it uses an Ethernet feature called broadcasting, in which a datagram is addressed to all stations on the network simultaneously. The broadcast datagram sent by ARP contains a query for the IP address. Each receiving host compares this query to its own IP address and if it matches, returns an ARP reply to the inquiring host. The inquiring host can now extract the sender's Ethernet address from the reply.

You may wonder how a host can reach an Internet address that may be on a different network halfway around the world. The answer to this question involves routing, namely finding the physical location of a host in a network. We will discuss this issue further in the next section.

Let's talk a little more about ARP. Once a host has discovered an Ethernet address, it stores it in its ARP cache so that it doesn't have to query for it again the next time it wants to send a datagram to the host in question. However, it is unwise to keep this information forever; the remote host's Ethernet card may be replaced because of technical problems, so the ARP entry becomes invalid. Therefore, entries in the ARP cache are discarded after some time to force another query for the IP address.

Sometimes it is also necessary to find the IP address associated with a given Ethernet address. This happens when a diskless machine wants to boot from a server on the network, which is a common situation on Local Area Networks. A diskless client, however, has virtually no information about itselfexcept for its Ethernet address! So it broadcasts a message containing a request asking a boot server to provide it with an IP address. There's another protocol for this situation named Reverse Address Resolution Protocol (RARP). Along with the BOOTP protocol, it serves to define a procedure for bootstrapping diskless clients over the network.

IP Routing

We now take up the question of finding the host that datagrams go to based on the IP address. Different parts of the address are handled in different ways; it is your job to set up the files that indicate how to treat each part.

IP Networks

When you write a letter to someone, you usually put a complete address on the envelope specifying the country, state, and Zip Code. After you put it in the mailbox, the post office will deliver it to its destination: it will be sent to the country indicated, where the national service will dispatch it to the proper state and region. The advantage of this hierarchical scheme is obvious: wherever you post the letter, the local postmaster knows roughly which direction to forward the letter, but the postmaster doesn't care which way the letter will travel once it reaches its country of destination.

IP networks are structured similarly. The whole Internet consists of a number of proper networks, called autonomous systems. Each system performs routing between its member hosts internally so that the task of delivering a datagram is reduced to finding a path to the destination host's network. As soon as the datagram is handed to any host on that particular network, further processing is done exclusively by the network itself.

Subnetworks

This structure is reflected by splitting IP addresses into a host and network part, as explained previously. By default, the destination network is derived from the network part of the IP address. Thus, hosts with identical IP network numbers should be found within the same network.[13]

It makes sense to offer a similar scheme inside the network, too, since it may consist of a collection of hundreds of smaller networks, with the smallest units being physical networks like Ethernets. Therefore, IP allows you to subdivide an IP network into several subnets.

A subnet takes responsibility for delivering datagrams to a certain range of IP addresses. It is an extension of the concept of splitting bit fields, as in the A, B, and C classes. However, the network part is now extended to include some bits from the host part. The number of bits that are interpreted as the subnet number is given by the so-called subnet mask, or netmask. This is a 32-bit number too, which specifies the bit mask for the network part of the IP address.

The campus network of Groucho Marx University is an example of such a network. It has a class B network number of 149.76.0.0, and its netmask is therefore 255.255.0.0.

Internally, GMU's campus network consists of several smaller networks, such various departments' LANs. So the range of IP addresses is broken up into 254 subnets, 149.76.1.0 through 149.76.254.0. For example, the department of Theoretical Physics has been assigned 149.76.12.0. The campus backbone is a network in its own right, and is given 149.76.1.0. These subnets share the same IP network number, while the third octet is used to distinguish between them. They will thus use a subnet mask of 255.255.255.0.

Figure 2.1. shows how 149.76.12.4, the address of quark, is interpreted differently when the address is taken as an ordinary class B network and when used with subnetting.

Figure 2.1. Subnetting a class B network

It is worth noting that subnetting (the technique of generating subnets) is only an internal division of the network. Subnets are generated by the network owner (or the administrators). Frequently, subnets are created to reflect existing boundaries, be they physical (between two Ethernets), administrative (between two departments), or geographical (between two locations), and authority over each subnet is delegated to some contact person. However, this structure affects only the network's internal behavior, and is completely invisible to the outside world.

Gateways

Subnetting is not only a benefit to the organization; it is frequently a natural consequence of hardware boundaries. The viewpoint of a host on a given physical network, such as an Ethernet, is a very limited one: it can only talk to the host of the network it is on. All other hosts can be accessed only through special-purpose machines called gateways. A gateway is a host that is connected to two or more physical networks simultaneously and is configured to switch packets between them.

Figure 2.2. shows part of the network topology at Groucho Marx University (GMU). Hosts that are on two subnets at the same time are shown with both addresses.

Figure 2.2. A part of the net topology at Groucho Marx University

Different physical networks have to belong to different IP networks for IP to be able to recognize if a host is on a local network. For example, the network number 149.76.4.0 is reserved for hosts on the mathematics LAN. When sending a datagram to quark, the network software on erdos immediately sees from the IP address 149.76.12.4 that the destination host is on a different physical network, and therefore can be reached only through a gateway (sophus by default).

sophus itself is connected to two distinct subnets: the Mathematics department and the campus backbone. It accesses each through a different interface, eth0 and fddi0, respectively. Now, what IP address do we assign it? Should we give it one on subnet 149.76.1.0, or on 149.76.4.0?

The answer is: both. sophus has been assigned the address 149.76.1.1 for use on the 149.76.1.0 network and address 149.76.4.1 for use on the 149.76.4.0 network. A gateway must be assigned one IP address for each network it belongs to. These addressesalong with the corresponding netmaskare tied to the interface through which the subnet is accessed. Thus, the interface and address mapping for sophus would look like this:

InterfaceAddressNetmask
fxp0149.76.4.1255.255.255.0
fpa0149.76.1.1255.255.255.0
lo0127.0.0.1255.0.0.0

The last entry describes the loopback interface lo0, which we talked about earlier.

Generally, you can ignore the subtle difference between attaching an address to a host or its interface. For hosts that are on one network only, like erdos, you would generally refer to the host as having this-and-that IP address, although strictly speaking, it's the Ethernet interface that has this IP address. The distinction is really important only when you refer to a gateway.

The Routing Table

We now focus our attention on how IP chooses a gateway to use to deliver a datagram to a remote network.

We have seen that erdos, when given a datagram for quark, checks the destination address and finds that it is not on the local network. erdos therefore sends the datagram to the default gateway sophus, which is now faced with the same task. sophus recognizes that quark is not on any of the networks it is connected to directly, so it has to find yet another gateway to forward it through. The correct choice would be niels, the gateway to the Physics department. sophus thus needs information to associate a destination network with a suitable gateway.

IP uses a table for this task that associates networks with the gateways by which they may be reached. A catch-all entry (the default route) must generally be supplied too; this is the gateway associated with network 0.0.0.0. All destination addresses match this route, since none of the 32 bits are required to match, and therefore packets to an unknown network are sent through the default route. On sophus, the table might look like this:

NetworkNetmaskGatewayInterface
149.76.1.0255.255.255.0-fpa0
149.76.2.0255.255.255.0149.76.1.2fpa0
149.76.3.0255.255.255.0149.76.1.3fpa0
149.76.4.0255.255.255.0-fxp0
149.76.5.0255.255.255.0149.76.1.5fpa0
    
0.0.0.00.0.0.0149.76.1.2fpa0

If you need to use a route to a network that sophus is directly connected to, you don't need a gateway; the gateway column here contains a hyphen.

The process for identifying whether a particular destination address matches a route is a mathematical operation. The process is quite simple, but it requires an understanding of binary arithmetic and logic: A route matches a destination if the network address logically ANDed with the netmask precisely equals the destination address logically ANDed with the netmask.

Translation: a route matches if the number of bits of the network address specified by the netmask (starting from the left-most bit, the high order bit of byte one of the address) match that same number of bits in the destination address.

When the IP implementation is searching for the best route to a destination, it may find a number of routing entries that match the target address. For example, we know that the default route matches every destination, but datagrams destined for locally attached networks will match their local route, too. How does IP know which route to use? It is here that the netmask plays an important role. While both routes match the destination, one of the routes has a larger netmask than the other. We previously mentioned that the netmask was used to break up our address space into smaller networks. The larger a netmask is, the more specifically a target address is matched; when routing datagrams, we should always choose the route that has the largest netmask. The default route has a netmask of zero bits, and in the configuration presented above, the locally attached networks have a 24-bit netmask. If a datagram matches a locally attached network, it will be routed to the appropriate device in preference to following the default route because the local network route matches with a greater number of bits. The only datagrams that will be routed via the default route are those that don't match any other route.

You can build routing tables by a variety of means. For small LANs, it is usually most efficient to construct them by hand and feed them to IP using the route command at boot time (see Chapter 5., Configuring TCP/IP Networking). For larger networks, they are built and adjusted at runtime by routing daemons; these daemons run on central hosts of the network and exchange routing information to compute optimal routes between the member networks.

Depending on the size of the network, you'll need to use different routing protocols. For routing inside autonomous systems (such as the Groucho Marx campus), the internal routing protocols are used. The most prominent one of these is the Routing Information Protocol (RIP), which is implemented by the BSD routed daemon. For routing between autonomous systems, external routing protocols like External Gateway Protocol (EGP) or Border Gateway Protocol (BGP) have to be used; these protocols, including RIP, have been implemented in the University of Cornell's gated daemon.

Metric Values

We depend on dynamic routing to choose the best route to a destination host or network based on the number of hops. Hops are the gateways a datagram has to pass before reaching a host or network. The shorter a route is, the better RIP rates it. Very long routes with 16 or more hops are regarded as unusable and are discarded.

RIP manages routing information internal to your local network, but you have to run gated on all hosts. At boot time, gated checks for all active network interfaces. If there is more than one active interface (not counting the loopback interface), it assumes the host is switching packets between several networks and will actively exchange and broadcast routing information. Otherwise, it will only passively receive RIP updates and update the local routing table.

When broadcasting information from the local routing table, gated computes the length of the route from the so-called metric value associated with the routing table entry. This metric value is set by the system administrator when configuring the route, and should reflect the actual route cost.[14] Therefore, the metric of a route to a subnet that the host is directly connected to should always be zero, while a route going through two gateways should have a metric of two. You don't have to bother with metrics if you don't use RIP or gated.

The Internet Control Message Protocol

IP has a companion protocol that we haven't talked about yet. This is the Internet Control Message Protocol (ICMP), used by the kernel networking code to communicate error messages to other hosts. For instance, assume that you are on erdos again and want to telnet to port 12345 on quark, but there's no process listening on that port. When the first TCP packet for this port arrives on quark, the networking layer will recognize this arrival and immediately return an ICMP message to erdos stating Port Unreachable.

The ICMP protocol provides several different messages, many of which deal with error conditions. However, there is one very interesting message called the Redirect message. It is generated by the routing module when it detects that another host is using it as a gateway, even though a much shorter route exists. For example, after booting, the routing table of sophus may be incomplete. It might contain the routes to the Mathematics network, to the FDDI backbone, and the default route pointing at the Groucho Computing Center's gateway (gcc1). Thus, packets for quark would be sent to gcc1 rather than to niels, the gateway to the Physics department. When receiving such a datagram, gcc1 will notice that this is a poor choice of route and will forward the packet to niels, meanwhile returning an ICMP Redirect message to sophus telling it of the superior route.

This seems to be a very clever way to avoid manually setting up any but the most basic routes. However, be warned that relying on dynamic routing schemes, be it RIP or ICMP Redirect messages, is not always a good idea. ICMP Redirect and RIP offer you little or no choice in verifying that some routing information is indeed authentic. This situation allows malicious good-for-nothings to disrupt your entire network traffic, or even worse. Consequently, the Linux networking code treats Network Redirect messages as if they were Host Redirects. This minimizes the damage of an attack by restricting it to just one host, rather than the whole network. On the flip side, it means that a little more traffic is generated in the event of a legitimate condition, as each host causes the generation of an ICMP Redirect message. It is generally considered bad practice to rely on ICMP redirects for anything these days.

Resolving Host Names

As described previously, addressing in TCP/IP networking, at least for IP Version 4, revolves around 32-bit numbers. However, you will have a hard time remembering more than a few of these numbers. Therefore, hosts are generally known by ordinary names such as gauss or strange. It becomes the application's duty to find the IP address corresponding to this name. This process is called hostname resolution.

When an application needs to find the IP address of a given host, it relies on the library functions gethostbyname(3) and gethostbyaddr(3). Traditionally, these and a number of related procedures were grouped in a separate library called the resolverlibrary; on FreeBSD, these functions are part of the standard libc. Colloquially, this collection of functions is therefore referred to as the resolver. Resolver name configuration is detailed in Chapter 6., Name Service and Resolver Configuration.

On a small network like an Ethernet or even a cluster of Ethernets, it is not very difficult to maintain tables mapping hostnames to addresses. This information is usually kept in a file named /etc/hosts. When adding or removing hosts, or reassigning addresses, all you have to do is update the hosts file on all hosts. Obviously, this will become burdensome with networks that comprise more than a handful of machines.

One solution to this problem is the Network Information System (NIS), developed by Sun Microsystems, colloquially called YP or Yellow Pages. NIS stores the hosts file (and other information) in a database on a master host from which clients may retrieve it as needed. Still, this approach is suitable only for medium-sized networks such as LANs, because it involves maintaining the entire hosts database centrally and distributing it to all servers. NIS installation and configuration is discussed in detail in Chapter 13., The Network Information System.

On the Internet, address information was initially stored in a single HOSTS.TXT database, too. This file was maintained at the Network Information Center (NIC), and had to be downloaded and installed by all participating sites. When the network grew, several problems with this scheme arose. Besides the administrative overhead involved in installing HOSTS.TXT regularly, the load on the servers that distributed it became too high. Even more severe, all names had to be registered with the NIC, which made sure that no name was issued twice.

This is why a new name resolution scheme was adopted in 1994: the Domain Name System. DNS was designed by Paul Mockapetris and addresses both problems simultaneously. We discuss the Domain Name System in detail in Chapter 6., Name Service and Resolver Configuration.



[11] The version of the Internet Protocol most frequently used on the Internet is Version 4. A lot of effort has been expended in designing a replacement called IP Version 6. IPv6 uses a different addressing scheme and larger addresses. Linux has an implementation of IPv6, but it isn't ready to document it in this book yet. The Linux kernel support for IPv6 is good, but a large number of network applications need to be modified to support it as well. Stay tuned.

[12] Frequently, IP addresses will be assigned to you by the provider from whom you buy your IP connectivity. However, you may also apply to the NIC directly for an IP address for your network by sending email to hostmaster@internic.net, or by using the form at http://www.internic.net/.

[13] Autonomous systems are slightly more general. They may comprise more than one IP network.

[14] The cost of a route can be thought of, in a simple case, as the number of hops required to reach the destination. Proper calculation of route costs can be a fine art in complex network designs.

We've been talking quite a bit about network interfaces and general TCP/IP issues, but we haven't really covered what happens when the networking code in the kernel accesses a piece of hardware. In order to describe this accurately, we have to talk a little about the concept of interfaces and drivers.

First, of course, there's the hardware itself, for example an Ethernet, FDDI or Token Ring card: this is a slice of Epoxy cluttered with lots of tiny chips with strange numbers on them, sitting in a slot of your PC. This is what we generally call a physical device.

For you to use a network card, special functions have to be present in your FreeBSD kernel that understand the particular way this device is accessed. The software that implements these functions is called a device driver. FreeBSD has device drivers for many different types of network interface cards: ISA, PCI, MCA, EISA, Parallel port, PCMCIA, and more recently, USB.

But what do we mean when we say a driver handles a device? Let's consider an Ethernet card. The driver has to be able to communicate with the peripheral's on-card logic somehow: it has to send commands and data to the card, while the card should deliver any data received to the driver.

In IBM-style personal computers, this communication takes place through a cluster of I/O addresses that are mapped to registers on the card and/or through shared or direct memory transfers. All commands and data the kernel sends to the card have to go to these addresses. I/O and memory addresses are generally described by providing the starting or base address. Typical base addresses for ISA bus Ethernet cards are 0x280 or 0x300. PCI bus network cards generally have their I/O address automatically assigned.

Usually you don't have to worry about any hardware issues such as the base address because the kernel makes an attempt at boot time to detect a card's location. This is called auto probing, which means that the kernel reads several memory or I/O locations and compares the data it reads there with what it would expect to see if a certain network card were installed at that location. However, there may be network cards it cannot detect automatically; this is sometimes the case with cheap network cards that are not-quite clones of standard cards from other manufacturers. Also, the kernel will normally attempt to detect only one network device when booting. If you're using more than one card, you have to tell the kernel about the other cards explicitly.

Another parameter that you might have to tell the kernel about is the interrupt request line. Hardware components usually interrupt the kernel when they need to be taken care offor example, when data has arrived or a special condition occurs. In an ISA bus PC, interrupts may occur on one of 15 interrupt channels numbered 0, 1, and 3 through 15. The interrupt number assigned to a hardware component is called its interrupt request number (IRQ). [15]

As described in Chapter 2., Issues of TCP/IP Networking, the kernel accesses a piece of network hardware through a software construct called an interface. Interfaces offer an abstract set of functions that are the same across all types of hardware, such as sending or receiving a datagram.

FreeBSD interface names are defined internally in the kernel and are not device files in the /dev directory[16]. The assignment of interfaces to devices usually depends on the order in which the devices are discovered by the kernel, which is often related to the physical placement of the network cards in the host. For instance, if you have two network cards based on the DEC DC21x4x chipset, the first card will be de0, the second will be de1. If you then unplug the first card, the second card will be found as de0 when you restart FreeBSD.

Figure 3.1. illustrates the relationship between the hardware, device drivers, and interfaces.

Figure 3.1. The relationship between drivers, interfaces, and hardware

When booting, the kernel displays the devices it detects and the interfaces it installs. The following is an excerpt from typical boot messages:

.
.
pci0: <PCI bus> on pcib0
...
sis0: <NatSemi DP83815 10/100BaseTX> port 0xe000-0xe0ff mem 0xe4000000-0xe4000ff
f irq 10 at device 9.0 on pci0
sis0: Ethernet address: 00:02:e3:0a:c0:f6
miibus0: <MII bus> on sis0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xd800-0xd81f mem 0xe3800000-0xe38f
ffff,0xe7000000-0xe7000fff irq 10 at device 10.0 on pci0
fxp0: Ethernet address 00:90:27:30:9c:88
...
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
plip0: <PLIP network interface> on ppbus0
.
.

This example shows that the kernel has been compiled with support for the PCI bus. Two network devices have been found there. The sis driver provides support for ethernet adapters based on the Silicon Integrated Systems 900, and 7016 chips, and the National Semiconductor DP83815 chip. This particular card is based on the National Semiconductor DP83815 chip, and supports 10baseT and 100baseTX operation. The kernel also displays the IRQ and other resources used by the card, and its MAC address.

Many network cards now support a media independent interface (MII), providing a programming abstraction that shields the programmer from the specifics of the media. FreeBSD implements this through the miibus device, which must be present to use these cards. A PHY is the part of the hardware that implements the physical interface, supported in FreeBSD through the ukphy device. Drivers that support the MII/PHY abstraction need both devices in the kernel.

sis is one of these devices. So the kernel also shows that it has discovered the miibus on sis0, and the first physical device (ukphy0) on that bus.

Following that is the probe for the Intel Fast EtherExpress card, fxp0. Again, the kernel displays the IRQ and related resources used by the card, the speeds it supports, and the MAC address. The fxp driver does not use the miibus abstraction[17] so there are no probe messages relating to it.

If you have network cards in your computer that are not listed at boot time then they were not discovered by the kernel. The kernel will have to be recompiled with support for the card, discussed later.

A little further along the booting process the kernel probes the parallel ports. First the parallel port controller (ppc) is probed, and its IRQ and other resources are determined. Then, because the kernel was built with support for Parallel Link IP (PLIP), the plip0 network device is created.

Kernel Configuration

Most FreeBSD distributions are supplied with boot disks that work for all common types of PC hardware. Generally, the supplied kernel is highly modularized and includes nearly every possible driver. This is a great idea for boot disks, but is probably not what you'd want for long-term use. There isn't much point in having drivers cluttering up your disk that you will never use. Therefore, you will generally roll your own kernel and include only those drivers you actually need or want; that way you save a little disk space and reduce the time it takes to compile a new kernel.

In any case, when running a FreeBSD system, you should be familiar with building a kernel. Think of it as a rite of passage, an affirmation of the one thing that makes free software as powerful as it isyou have the source. It isn't a case of, I have to compile a kernel, rather it's a case of, I can compile a kernel. The basics of compiling a FreeBSD kernel are explained in the FreeBSD Handbook. Therefore, we will discuss only configuration options that affect networking in this section.

Not all options need to be configured in to the kernel. Many of them are variables that can be accessed using sysctl(8). For example, if a FreeBSD host has two or more network interfaces, it will only forward packets between them (i.e., act as a gateway) if the variable net.inet.ip.forwarding is set to 1. A listing of all these variables can be seen with this command:

# sysctl -a | grep net.inet

And they will be discussed throughout this book where appropriate.

Kernel Options in FreeBSD

The best source of information about the options that you can include in your kernel configuration file is LINT[18] . This is a kernel configuration file that contains every possible option you can include in your kernel, and comments that indicate what the options are for. When adjusting your kernel configuration file it is very common to examine LINT for the information you need, and then cut and paste, or copy, the appropriate lines from LINT in to your kernel configuration file.

LINT lays out the networking related options in several sections. The first section covers options that relate to FreeBSD networking as a whole. The second section covers options and configuration information for the different network devices that FreeBSD supports.

Note

FreeBSD is under constant development. Do not be surprised if the devices or options in LINT have changed slightly in the time between this book going to press and you reading it.

Protocols

FreeBSD supports a number of different networking protocols. INET is effectively mandatory, as without it, you will not be able to use IP. The other options are at your discretion. For example, if you know that you will not be using IPv6 and Appletalk you would comment out those sections of the configuration file.

options         INET                    #Internet communications protocols
options         INET6                   #IPv6 communications protocols
options         IPSEC                   #IP security
options         IPSEC_ESP               #IP security (crypto; define w/ IPSEC)
options         IPSEC_DEBUG             #debug for IP security

options         IPX                     #IPX/SPX communications protocols

options         NCP                     #NetWare Core protocol

options         NETATALK                #Appletalk communications protocols

Netgraph is a very powerful mechanism for hooking together various components of the FreeBSD networking infrastructure. For example, netgraph provides a Frame Relay component and a PPP component. These can be hooked together to support PPP over Frame Relay. The same PPP component could be hooked up to the ATM component to support PPP over ATM.

Netgraph is not enabled by default on FreeBSD, it must be compiled in to the kernel. It is fair to say that netgraph is quite esoteric, and that, in many cases, FreeBSD's standard network drivers will do everything you need. But if you find yourself needing to hook several disparate networking protocols together, netgraph is a God-send.

# netgraph(4). Enable the base netgraph code with the NETGRAPH option.
# Individual node types can be enabled with the corresponding option
# listed below; however, this is not strictly necessary as netgraph
# will automatically load the corresponding KLD module if the node type   
# is not already compiled into the kernel. Each type below has a
# corresponding man page, e.g., ng_async(8).
options         NETGRAPH                #netgraph(4) system
options         NETGRAPH_ASYNC
options         NETGRAPH_BPF
options         NETGRAPH_CISCO
options         NETGRAPH_ECHO
options         NETGRAPH_ETHER
options         NETGRAPH_FRAME_RELAY
options         NETGRAPH_HOLE
options         NETGRAPH_IFACE
options         NETGRAPH_KSOCKET
options         NETGRAPH_LMI
# MPPC compression requires proprietary files (not included)
#options        NETGRAPH_MPPC_COMPRESSION
options         NETGRAPH_MPPC_ENCRYPTION
options         NETGRAPH_ONE2MANY
options         NETGRAPH_PPP
options         NETGRAPH_PPPOE
options         NETGRAPH_PPTPGRE
options         NETGRAPH_RFC1490
options         NETGRAPH_SOCKET
options         NETGRAPH_TEE
options         NETGRAPH_TTY
options         NETGRAPH_UI
options         NETGRAPH_VJC
Interfaces

The next section in LINT is concerned with network interfaces. Again, you would comment out the ones you don't need. These interfaces all have manual pages; if you wanted to find out more information about the Berkeley Packet Filter interface you would run man 4 bpf.

# Network interfaces:
#  The `loop' pseudo-device is MANDATORY when networking is enabled.
#  The `ether' pseudo-device provides generic code to handle
#  Ethernets; it is MANDATORY when a Ethernet device driver is
#  configured or token-ring is enabled.
#  The 'fddi' pseudo-device provides generic code to support FDDI.
#  The `sppp' pseudo-device serves a similar role for certain types  
#  of synchronous PPP links (like `cx', `ar').
#  The `sl' pseudo-device implements the Serial Line IP (SLIP) service.  
#  The `ppp' pseudo-device implements the Point-to-Point Protocol.
#  The `bpf' pseudo-device enables the Berkeley Packet Filter.  Be
#  aware of the legal and administrative consequences of enabling this
#  option.  The number of devices determines the maximum number of
#  simultaneous BPF clients programs runnable.
#  The `disc' pseudo-device implements a minimal network interface,
#  which throws away all packets sent and never receives any.  It is
#  included for testing purposes.  This shows up as the 'ds' interface.
#  The `tun' pseudo-device implements (user-)ppp and nos-tun
#  The `gif' pseudo-device implements IPv6 over IP4 tunneling,
#  IPv4 over IPv6 tunneling, IPv4 over IPv4 tunneling and
#  IPv6 over IPv6 tunneling.
#  The `faith' pseudo-device captures packets sent to it and diverts them
#  to the IPv4/IPv6 translation daemon.
#  The `stf' device implements 6to4 encapsulation.
#  The `ef' pseudo-device provides support for multiple ethernet frame types
#  specified via ETHER_* options. See ef(4) for details.
#
# The PPP_BSDCOMP option enables support for compress(1) style entire
# packet compression, the PPP_DEFLATE is for zlib/gzip style compression.
# PPP_FILTER enables code for filtering the ppp data stream and selecting
# events for resetting the demand dial activity timer - requires bpf.
# See pppd(8) for more details.
#
pseudo-device   ether                   #Generic Ethernet
pseudo-device   vlan    1               #VLAN support
pseudo-device   token                   #Generic TokenRing
pseudo-device   fddi                    #Generic FDDI
pseudo-device   sppp                    #Generic Synchronous PPP
pseudo-device   loop                    #Network loopback device
pseudo-device   bpf                     #Berkeley packet filter
pseudo-device   disc                    #Discard device (ds0, ds1, etc)
pseudo-device   tun                     #Tunnel driver (ppp(8), nos-tun(8))
pseudo-device   sl      2               #Serial Line IP
pseudo-device   ppp     2               #Point-to-point protocol
options         PPP_BSDCOMP             #PPP BSD-compress support
options         PPP_DEFLATE             #PPP zlib/deflate/gzip support

pseudo-device   ef                      # Multiple ethernet frames support
options         ETHER_II                # enable Ethernet_II frame
options         ETHER_8023              # enable Ethernet_802.3 (Novell) frame
options         ETHER_8022              # enable Ethernet_802.2 frame
options         ETHER_SNAP              # enable Ethernet_802.2/SNAP frame

# for IPv6
pseudo-device   gif     4               #IPv6 and IPv4 tunneling
pseudo-device   faith   1               #for IPv6 and IPv4 translation
pseudo-device   stf     1               #6to4 IPv6 over IPv4 encapsulation

Note

PPP does not have to be included in the kernel configuration file. FreeBSD supports two PPP implementations, one in the kernel, and one as a regular daemon. The kernel implementation is a little more efficient, but is harder to confgure when compared to the daemon implementation.

IP options

LINT then covers various options that you can to control the behaviour of FreeBSD's IP stack. Many of these option relate to FreeBSD's firewall code, which is covered in more detail later.

# Internet family options:
# 
# TCP_COMPAT_42 causes the TCP code to emulate certain bugs present in
# 4.2BSD.  This option should not be used unless you have a 4.2BSD   
# machine and TCP connections fail.
# 
# MROUTING enables the kernel multicast packet forwarder, which works
# with mrouted(8).
# 
# IPFIREWALL enables support for IP firewall construction, in
# conjunction with the `ipfw' program.  IPFIREWALL_VERBOSE sends
# logged packets to the system logger.  IPFIREWALL_VERBOSE_LIMIT
# limits the number of times a matching entry can be logged.
#
# WARNING:  IPFIREWALL defaults to a policy of "deny ip from any to any"
# and if you do not add other rules during startup to allow access,
# YOU WILL LOCK YOURSELF OUT.  It is suggested that you set firewall_type=open
# in /etc/rc.conf when first enabling this feature, then refining the
# firewall rules in /etc/rc.firewall after you've tested that the new kernel
# feature works properly.
#
# IPFIREWALL_DEFAULT_TO_ACCEPT causes the default rule (at boot) to
# allow everything.  Use with care, if a cracker can crash your
# firewall machine, they can get to your protected machines.  However,
# if you are using it as an as-needed filter for specific problems as
# they arise, then this may be for you.  Changing the default to 'allow'
# means that you won't get stuck if the kernel and /sbin/ipfw binary get
# out of sync.
#
# IPDIVERT enables the divert IP sockets, used by ``ipfw divert''
#
# IPSTEALTH enables code to support stealth forwarding (i.e., forwarding
# packets without touching the ttl).  This can be useful to hide firewalls
# from traceroute and similar tools.
#
options         TCP_COMPAT_42           #emulate 4.2BSD TCP bugs
options         MROUTING                # Multicast routing
options         IPFIREWALL              #firewall
options         IPFIREWALL_VERBOSE      #print information about
                                        # dropped packets
options         IPFIREWALL_FORWARD      #enable transparent proxy support
options         IPFIREWALL_VERBOSE_LIMIT=100    #limit verbosity
options         IPFIREWALL_DEFAULT_TO_ACCEPT    #allow everything by default
options         IPV6FIREWALL            #firewall for IPv6
options         IPV6FIREWALL_VERBOSE
options         IPV6FIREWALL_VERBOSE_LIMIT=100
options         IPV6FIREWALL_DEFAULT_TO_ACCEPT
options         IPDIVERT                #divert sockets
options         IPFILTER                #ipfilter support
options         IPFILTER_LOG            #ipfilter logging
options         IPFILTER_DEFAULT_BLOCK  #block all packets by default
options         IPSTEALTH               #support for stealth forwarding

# Statically Link in accept filters
options                ACCEPT_FILTER_DATA
options                ACCEPT_FILTER_HTTP

# The following options add sysctl variables for controlling how certain
# TCP packets are handled.
#
# TCP_DROP_SYNFIN adds support for ignoring TCP packets with SYN+FIN. This
# prevents nmap et al. from identifying the TCP/IP stack, but breaks support
# for RFC1644 extensions and is not recommended for web servers.
#
# TCP_RESTRICT_RST adds support for blocking the emission of TCP RST packets.
# This is useful on systems which are exposed to SYN floods (e.g. IRC servers)
# or any system which one does not want to be easily portscannable.
#
options         TCP_DROP_SYNFIN         #drop TCP packets with SYN+FIN
options         TCP_RESTRICT_RST        #restrict emission of TCP RST

# ICMP_BANDLIM enables icmp error response bandwidth limiting.   You
# typically want this option as it will help protect the machine from
# D.O.S. packet attacks.
#
options         ICMP_BANDLIM

# DUMMYNET enables the "dummynet" bandwidth limiter. You need
# IPFIREWALL as well. See the dummynet(4) manpage for more info.
# BRIDGE enables bridging between ethernet cards -- see bridge(4).
# You can use IPFIREWALL and dummynet together with bridging.
options         DUMMYNET
options         BRIDGE
ATM
#
# ATM (HARP version) options
#
# ATM_CORE includes the base ATM functionality code.  This must be included
#       for ATM support.
#
# ATM_IP includes support for running IP over ATM.
#
# At least one (and usually only one) of the following signalling managers
# must be included (note that all signalling managers include PVC support):
# ATM_SIGPVC includes support for the PVC-only signalling manager `sigpvc'.
# ATM_SPANS includes support for the `spans' signalling manager, which runs
#       the FORE Systems's proprietary SPANS signalling protocol.
# ATM_UNI includes support for the `uni30' and `uni31' signalling managers,
#       which run the ATM Forum UNI 3.x signalling protocols.
#
# The `hea' driver provides support for the Efficient Networks, Inc.
# ENI-155p ATM PCI Adapter.
# 
# The `hfa' driver provides support for the FORE Systems, Inc.
# ENI-155p ATM PCI Adapter.
# 
# The `hfa' driver provides support for the FORE Systems, Inc.
# PCA-200E ATM PCI Adapter.
#
options         ATM_CORE                #core ATM protocol family
options         ATM_IP                  #IP over ATM support
options         ATM_SIGPVC              #SIGPVC signalling manager
options         ATM_SPANS               #SPANS signalling manager
options         ATM_UNI                 #UNI signalling manager
device          hea                     #Efficient ENI-155p ATM PCI
device          hfa                     #FORE PCA-200E ATM PCI
Network-aware filesystems

The kernel also allows you to include support for various filesystems. It is no longer necessary to specify these in the kernel configuration file. If you try and mount a filesystem that the kernel was not compiled with support for, the kernel will load the appropriate kernel module from the /modules directory to provide that support. Some people still prefer to compile these filesystems directly in to the kernel, however.

The network-aware filesystems, and their associated options, are:

options         NFS                     #Network File System
#options        NFS_NOSERVER            #Disable the NFS-server code.
options         NWFS                    #NetWare filesystem
options         NFS_ROOT                #NFS usable as root device
...
# If you are running a machine just as a fileserver for PC and MAC
# users, using SAMBA or Netatalk, you may consider setting this option
# and keeping all those users' directories on a filesystem that is
# mounted with the suiddir option. This gives new files the same
# ownership as the directory (similar to group). It's a security hole
# if you let these users run programs, so confine it to file-servers
# (but it'll save you lots of headaches in those cases). Root owned
# directories are exempt and X bits are cleared. The suid bit must be
# set on the directory as well; see chmod(1) PC owners can't see/set
# ownerships so they keep getting their toes trodden on. This saves
# you all the support calls as the filesystem it's used on will act as
# they expect: "It's my dir so it must be my file".
#
options         SUIDDIR

# NFS options:
options         NFS_MINATTRTIMO=3       # VREG attrib cache timeout in sec
options         NFS_MAXATTRTIMO=60
options         NFS_MINDIRATTRTIMO=30   # VDIR attrib cache timeout in sec
options         NFS_MAXDIRATTRTIMO=60
options         NFS_GATHERDELAY=10      # Default write gather delay (msec)
options         NFS_UIDHASHSIZ=29       # Tune the size of nfssvc_sock with this
options         NFS_WDELAYHASHSIZ=16    # and with this
options         NFS_MUIDHASHSIZ=63      # Tune the size of nfsmount with this
options         NFS_DEBUG               # Enable NFS Debugging

# Coda stuff:
options         CODA                    #CODA filesystem.
pseudo-device   vcoda   4               #coda minicache <-> venus comm.
Network card support

You must now include support for the network card or cards in your host. This is not always necessary, as some drivers, such as fxp are available as modules which can be dynamically loaded by the kernel. However, some network drivers must still be configured statically in to the kernel. In practice, this is no loss, since the network interface is such a fundamental component of the machine, there is little or no benefit to loading it dynamically.

ISA devices are typically listed first. If you have any ISA devices you must include support for them in the kernel by including the device isa line, as well as configuration lines for the devices themselves.

device          isa
...
#
# Network interfaces: `cx', `ed', `el', `ep', `ie', `is', `le', `lnc'
# 
# ar: Arnet SYNC/570i hdlc sync 2/4 port V.35/X.21 serial driver (requires sppp)
# cs: IBM Etherjet and other Crystal Semi CS89x0-based adapters
# cx: Cronyx/Sigma multiport sync/async (with Cisco or PPP framing)
# ed: Western Digital and SMC 80xx; Novell NE1000 and NE2000; 3Com 3C503
# el: 3Com 3C501 (slow!)
# ep: 3Com 3C509
# ex: Intel EtherExpress Pro/10 and other i82595-based adapters  
# fe: Fujitsu MB86960A/MB86965A Ethernet
# ie: AT&T StarLAN 10 and EN100; 3Com 3C507; unknown NI5210; Intel EtherExpress
# le: Digital Equipment EtherWorks 2 and EtherWorks 3 (DEPCA, DE100,
#     DE101, DE200, DE201, DE202, DE203, DE204, DE205, DE422)
# lnc: Lance/PCnet cards (Isolan, Novell NE2100, NE32-VL, AMD Am7990 & Am79C960)
# rdp: RealTek RTL 8002-based pocket ethernet adapters
# sr: RISCom/N2 hdlc sync 1/2 port V.35/X.21 serial driver (requires sppp)
# wl: Lucent Wavelan (ISA card only).
# awi: IEEE 802.11b PRISM I cards.
# wi: Lucent WaveLAN/IEEE 802.11 PCMCIA adapters. Note: this supports both
#     the PCMCIA and ISA cards: the ISA card is really a PCMCIA to ISA
#     bridge with a PCMCIA adapter plugged into it.
# an: Aironet 4500/4800 802.11 wireless adapters. Supports the PCMCIA,
#     PCI and ISA varieties.
# xe: Xircom/Intel EtherExpress Pro100/16 PC Card ethernet controller.
# ray: Raytheon Raylink 802.11 wireless NICs, OEM as Webgear Aviator 2.4GHz
# oltr: Olicom ISA token-ring adapters OC-3115, OC-3117, OC-3118 and OC-3133
#       (no options needed)
#
device ar0 at isa? port 0x300 irq 10 iomem 0xd0000
device cs0 at isa? port 0x300
device cx0 at isa? port 0x240 irq 15 drq 7
device ed0 at isa? port 0x280 irq 5 iomem 0xd8000
device el0 at isa? port 0x300 irq 9
device ep
device ex
device fe0 at isa? port 0x300
device ie0 at isa? port 0x300 irq 5 iomem 0xd0000
device ie1 at isa? port 0x360 irq 7 iomem 0xd0000
device le0 at isa? port 0x300 irq 5 iomem 0xd0000
device lnc0 at isa? port 0x280 irq 10 drq 0
device rdp0 at isa? port 0x378 irq 7 flags 2
device sr0 at isa? port 0x300 irq 5 iomem 0xd0000
device sn0 at isa? port 0x300 irq 10
device awi
device wi
device an
options         WLCACHE         # enables the signal-strength cache
options         WLDEBUG         # enables verbose debugging output
device wl0 at isa? port 0x300
device xe
device ray

Notice that some of the drivers must be configured explicitly with the IRQ and memory port addresses of the cards they work with. Other cards support auto-probing of these values, and can be left out.

For cards that support autoprobing you only need one device line, irrespective of how many cards of that type might be installed in the host. For cards that must be configured, you must have one entry per-card, with different IRQ and memory address settings. See the entries for ie0 and ie1 as an example.

ATM devices are specified in the same way.

#
# ATM related options
#
# The `en' device provides support for Efficient Networks (ENI)
# ENI-155 PCI midway cards, and the Adaptec 155Mbps PCI ATM cards (ANA-59x0).
#
# atm pseudo-device provides generic atm functions and is required for
# atm devices.
# NATM enables the netnatm protocol family that can be used to
# bypass TCP/IP.
#
# the current driver supports only PVC operations (no atm-arp, no multicast).
# for more details, please read the original documents at
# http://www.ccrc.wustl.edu/pub/chuck/tech/bsdatm/bsdatm.html
#
pseudo-device   atm
device          en
options         NATM                    #native ATM

Including support for PCI cards is a little easier, as you do not need to specify IRQs and other information, just the device name. Again, one entry per driver required will suffice, rather than one entry per device.

# 
# PCI devices & PCI options:
# 
# The main PCI bus device is `pci'.  It provides auto-detection and
# configuration support for all devices on the PCI bus, using either  
# configuration mode defined in the PCI specification.
  
device          pci
...
# 
# The `dc' device provides support for PCI fast ethernet adapters
# based on the DEC/Intel 21143 and various workalikes including:
# the ADMtek AL981 Comet and AN985 Centaur, the ASIX Electronics
# AX88140A and AX88141, the Davicom DM9100 and DM9102, the Lite-On
# 82c168 and 82c169 PNIC, the Lite-On/Macronix LC82C115 PNIC II
# and the Macronix 98713/98713A/98715/98715A/98725 PMAC. This driver
# replaces the old al, ax, dm, pn and mx drivers.  List of brands:
# Digital DE500-BA, Kingston KNE100TX, D-Link DFE-570TX, SOHOware SFA110,
# SVEC PN102-TX, CNet Pro110B, 120A, and 120B, Compex RL100-TX,
# LinkSys LNE100TX, LNE100TX V2.0, Jaton XpressNet, Alfa Inc GFC2204,
# KNE110TX.
#
# The `de' device provides support for the Digital Equipment DC21040
# self-contained Ethernet adapter.
#
# The `fxp' device provides support for the Intel EtherExpress Pro/100B
# PCI Fast Ethernet adapters.
#
# The pcn device provides support for PCI fast ethernet adapters based
# on the AMD Am79c97x chipsets, including the PCnet/FAST, PCnet/FAST+,
# PCnet/PRO and PCnet/Home. These were previously handled by the lnc
# driver (and still will be if you leave this driver out of the kernel).
#
# The 'rl' device provides support for PCI fast ethernet adapters based
# on the RealTek 8129/8139 chipset. Note that the RealTek driver defaults
# to using programmed I/O to do register accesses because memory mapped
# mode seems to cause severe lockups on SMP hardware. This driver also
# supports the Accton EN1207D `Cheetah' adapter, which uses a chip called
# the MPX 5030/5038, which is either a RealTek in disguise or a RealTek
# workalike.  Note that the D-Link DFE-530TX+ uses the RealTek chipset
# and is supported by this driver, not the 'vr' driver.
#
# The 'sf' device provides support for Adaptec Duralink PCI fast
# ethernet adapters based on the Adaptec AIC-6915 "starfire" controller.
# This includes dual and quad port cards, as well as one 100baseFX card.
# Most of these are 64-bit PCI devices, except for one single port
# card which is 32-bit.
#
# The 'ste' device provides support for adapters based on the Sundance
# Technologies ST201 PCI fast ethernet controller. This includes the
# D-Link DFE-550TX.
#
# The 'sis' device provides support for adapters based on the Silicon
# Integrated Systems SiS 900 and SiS 7016 PCI fast ethernet controller
# chips.
#
# The 'sk' device provides support for the SysKonnect SK-984x series
# PCI gigabit ethernet NICs. This includes the SK-9841 and SK-9842
# single port cards (single mode and multimode fiber) and the
# SK-9843 and SK-9844 dual port cards (also single mode and multimode).
# The driver will autodetect the number of ports on the card and
# attach each one as a separate network interface.
#
# The 'ti' device provides support for PCI gigabit ethernet NICs based
# on the Alteon Networks Tigon 1 and Tigon 2 chipsets. This includes the
# Alteon AceNIC, the 3Com 3c985, the Netgear GA620 and various others.
# Note that you will probably want to bump up NMBCLUSTERS a lot to use
# this driver.
#
# The 'tl' device provides support for the Texas Instruments TNETE100
# series 'ThunderLAN' cards and integrated ethernet controllers. This
# includes several Compaq Netelligent 10/100 cards and the built-in
# ethernet controllers in several Compaq Prosignia, Proliant and
# Deskpro systems. It also supports several Olicom 10Mbps and 10/100
# boards.
#
# The `tx' device provides support for the SMC 9432 TX, BTX and TX_2 cards.
#
# The `vr' device provides support for various fast ethernet adapters
# based on the VIA Technologies VT3043 `Rhine I' and VT86C100A `Rhine II'
# chips, including the D-Link DFE530TX (see 'rl' for DFE530TX+), the Hawking
# Technologies PN102TX, and the AOpen/Acer ALN-320.
#
# The `vx' device provides support for the 3Com 3C590 and 3C595
# early support
#
# The `wb' device provides support for various fast ethernet adapters
# based on the Winbond W89C840F chip. Note: this is not the same as
# the Winbond W89C940F, which is an NE2000 clone.
#
# The `wx' device provides support for the Intel Gigabit Ethernet
# PCI card (`Wiseman').
#
# The `xl' device provides support for the 3Com 3c900, 3c905 and
# 3c905B (Fast) Etherlink XL cards and integrated controllers. This
# includes the integrated 3c905B-TX chips in certain Dell Optiplex and
# Dell Precision desktop machines and the integrated 3c905-TX chips
# in Dell Latitude laptop docking stations.
#
# The `fpa' device provides support for the Digital DEFPA PCI FDDI
# adapter. pseudo-device fddi is also needed.
#
...
#
# The oltr driver supports the following Olicom PCI token-ring adapters
# OC-3136, OC-3137, OC-3139, OC-3140, OC-3141, OC-3540, OC-3250
#
...
# MII bus support is required for some PCI 10/100 ethernet NICs,
# namely those which use MII-compliant transceivers or implement
# tranceiver control interfaces that operate like an MII. Adding
# "device miibus0" to the kernel config pulls in support for
# the generic miibus API and all of the PHY drivers, including a
# generic one for PHYs that aren't specifically handled by an
# individual driver.
device          miibus

# PCI Ethernet NICs that use the common MII bus controller code.
device          dc              # DEC/Intel 21143 and various workalikes
device          rl              # RealTek 8129/8139
device          pcn             # AMD Am79C79x PCI 10/100 NICs
device          sf              # Adaptec AIC-6915 (``Starfire'')
device          sis             # Silicon Integrated Systems SiS 900/SiS 7016
device          ste             # Sundance ST201 (D-Link DFE-550TX)
device          tl              # Texas Instruments ThunderLAN
device          tx              # SMC EtherPower II (83c17x ``EPIC'')
device          vr              # VIA Rhine, Rhine II
device          wb              # Winbond W89C840F
device          xl              # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# PCI Ethernet NICs.
device          de              # DEC/Intel DC21x4x (``Tulip'')
device          fxp             # Intel EtherExpress PRO/100B (82557, 82558)
device          vx              # 3Com 3c590, 3c595 (``Vortex'')

device          sk
device          ti
device          wx
device          fpa

device         oltr0
Parallel Port (PLIP)

In order to enable support for PLIP you must ensure that the parallel port is configured in the kernel with support for PLIP. It is possible to configure the parallel port without this support if you're not going to use it. The appropriate options are:

device          ppc0    at isa? irq 7
...
device          plip

As with the other ISA devices you need to ensure that you have specified the correct IRQ for the parallel port. 7 is the default, but it is normally possible to change this in the BIOS settings.

USB Ethernet

FreeBSD also supports USB ethernet drivers. These drivers require USB support in the FreeBSD kernel, and the MIIBUS support discussed earlier.

# USB support
# UHCI controller
device          uhci
# OHCI controller
device          ohci
# General USB code (mandatory for USB)
device          usb
...
device          miibus
...
#
# ADMtek USB ethernet. Supports the LinkSys USB100TX,
# the Billionton USB100, the Melco LU-ATX, the D-Link DSB-650TX
# and the SMC 2202USB. Also works with the ADMtek AN986 Pegasus
# eval board.
device          aue
#
# CATC USB-EL1201A USB ethernet. Supports the CATC Netmate
# and Netmate II, and the Belkin F5U111.
device          cue
#
# Kawasaki LSI ethernet. Supports the LinkSys USB10T,
# Entrega USB-NET-E45, Peracom Ethernet Adapter, the
# 3Com 3c19250, the ADS Technologies USB-10BT, the ATen UC10T,
# the Netgear EA101, the D-Link DSB-650, the SMC 2102USB
# and 2104USB, and the Corega USB-T.
device          kue

The PLIP Driver

Parallel Line IP (PLIP) is a cheap way to network when you want to connect only two machines. It uses a parallel port and a special cable, achieving speeds of 10 kilobytes per second to 20 kilobytes per second.

PLIP on FreeBSD is described in the plip(4) manual page, which includes details on the cable that is required to connect two hosts. It should be noted that FreeBSD's PLIP driver supports two communication protocols, selected by an argument to ifconfig(8) when the device is configured. The -link0 option selects FreeBSD PLIP, also referred to as LPIP. This is the simpler protocol, and slightly more efficient. The link0 option (no leading “-”) selects Crynwr/Linux compatible mode (CLPIP). You should use this if you need to run PLIP to a non-FreeBSD host.

The PPP and SLIP Drivers

Point-to-Point Protocol (PPP) and Serial Line IP (SLIP) are widely used protocols for carrying IP packets over a serial link. A number of institutions offer dialup PPP and SLIP access to machines that are on the Internet, thus providing IP connectivity to private persons (something that's otherwise hardly affordable).

No hardware modifications are necessary to run PPP or SLIP; you can use any serial port. Since serial port configuration is not specific to TCP/IP networking, we have devoted a separate chapter to this. Please refer to Chapter 4., Configuring the Serial Hardware, for more information. We cover PPP in detail in Chapter 8., The Point-to-Point Protocol, and SLIP in Chapter 7., Serial Line IP.



[15] IRQs 2 and 9 are the same because the IBM PC design has two cascaded interrupt processors with eight IRQs each; the secondary processor is connected to IRQ 2 of the primary one.

[16] Yet. FreeBSD 5.0 will introduce entries in /dev for network devices.

[17] At least, it doesn't in FreeBSD 4.3-RELEASE, which is where this snapshot was taken. Later versions of the fxp driver in FreeBSD 4.4 and up do use the miibus abstraction.

[18] Normally found as /usr/src/sys/i386/conf/LINT

The Internet is growing at an incredible rate. Much of this growth is attributed to Internet users who can't afford high-speed permanent network connections and who use protocols such as SLIP, PPP, or UUCP to dial in to a network provider to retrieve their daily dose of email and news.

This chapter is intended to help all people who rely on modems to maintain their link to the outside world. We won't cover the mechanics of how to configure your modem (the manual that came with it will tell you more about it than we can), but we will cover most of the Linux-specific aspects of managing devices that use serial ports. Topics include serial communications software, creating the serial device files, serial hardware, and configuring serial devices using the setserial and stty commands. Many other related topics are covered in the Serial HOWTO by David Lawyer.[19]

Communications Software for Modem Links

There are a number of communications packages available for Linux. Many of these packages are terminal programs, which allow a user to dial in to another computer as if she were sitting in front of a simple terminal. The traditional terminal program for Unix-like environments is kermit. It is, however, fairly ancient now, and would probably be considered difficult to use. There are more comfortable programs available that support features, like telephone-dialing dictionaries, script languages to automate dialing and logging in to remote computer systems, and a variety of file exchange protocols. One of these programs is minicom, which was modeled after some of the most popular DOS terminal programs. X11 users are accommodated, too. seyon is a fully featured X11-based communications program.

Terminal programs aren't the only type of serial communication programs available. Other programs let you connect to a host and download news and email in a single bundle, to read and reply later at your leisure. This can save a lot of time, and is especially useful if you are unfortunate enough to live in an area where your local calls are time-charged. All of the reading and replying time can be spent offline, and when you are ready, you can redial and upload your responses in a single bundle. This all consumes a bit more hard disk because all of the messages have to be stored to your disk before you can read them, but this could be a reasonable trade-off at today's hard drive prices.

UUCP epitomizes this communication software style. It is a program suite that copies files from one host to another and executes programs on a remote host. It is frequently used to transport mail or news in private networks. Ian Taylor's UUCP package, which also runs under Linux, is described in detail in ???. Other noninteractive communications software is used throughout networks such as Fidonet. Fidonet application ports like ifmail are also available, although we expect that not many people still use them.

PPP and SLIP are in between, allowing both interactive and noninteractive use. Many people use PPP or SLIP to dial in to their campus network or other Internet Service Provider to run FTP and read web pages. PPP and SLIP are also, however, commonly used over permanent or semipermanent connections for LAN-to-LAN coupling, although this is really only interesting with ISDN or other high-speed network connections.

Introduction to Serial Devices

The Unix kernel provides devices for accessing serial hardware, typically called tty devices (pronounced as it is spelled: T-T-Y). This is an abbreviation for Teletype device, which used to be one of the major manufacturers of terminal devices in the early days of Unix. The term is used now for any character-based data terminal. Throughout this chapter, we use the term to refer exclusively to the Linux device files rather than the physical terminal.

Linux provides three classes of tty devices: serial devices, virtual terminals (all of which you can access in turn by pressing Alt-F1 through Alt-Fnn on the local console), and pseudo-terminals (similar to a two-way pipe, used by applications such as X11). The former were called tty devices because the original character-based terminals were connected to the Unix machine by a serial cable or telephone line and modem. The latter two were named after the tty device because they were created to behave in a similar fashion from the programmer's perspective.

SLIP and PPP are most commonly implemented in the kernel. The kernel doesn't really treat the tty device as a network device that you can manipulate like an Ethernet device, using commands such as ifconfig. However, it does treat tty devices as places where network devices can be bound. To do this, the kernel changes what is called the line discipline of the tty device. Both SLIP and PPP are line disciplines that may be enabled on tty devices. The general idea is that the serial driver handles data given to it differently, depending on the line discipline it is configured for. In its default line discipline, the driver simply transmits each character it is given in turn. When the SLIP or PPP line discipline is selected, the driver instead reads a block of data, wraps a special header around it that allows the remote end to identify that block of data in a stream, and transmits the new data block. It isn't too important to understand this yet; we'll cover both SLIP and PPP in later chapters, and it all happens automatically for you anyway.

Accessing Serial Devices

Like all devices in a Unix system, serial ports are accessed through device special files, located in the /dev directory. There are two varieties of device files related to serial drivers, and there is one device file of each type for each port. The device will behave slightly differently, depending on which of its device files we open. We'll cover the differences because it will help you understand some of the configurations and advice that you might see relating to serial devices, but in practice you need to use only one of these. At some point in the future, one of them may even disappear completely.

The most important of the two classes of serial device has a major number of 4, and its device special files are named ttyS0, ttyS1, etc. The second variety has a major number of 5, and was designed for use when dialing out (calling out) through a port; its device special files are called cua0, cua1, etc. In the Unix world, counting generally starts at zero, while laypeople tend to start at one. This creates a small amount of confusion for people because COM1: is represented by /dev/ttyS0, COM2: by /dev/ttyS1, etc. Anyone familiar with IBM PC-style hardware knows that COM3: and greater were never really standardized anyway.

The cua, or callout, devices were created to solve the problem of avoiding conflicts on serial devices for modems that have to support both incoming and outgoing connections. Unfortunately, they've created their own problems and are now likely to be discontinued. Let's briefly look at the problem.

Linux, like Unix, allows a device, or any other file, to be opened by more than one process simultaneously. Unfortunately, this is rarely useful with tty devices, as the two processes will almost certainly interfere with each other. Luckily, a mechanism was devised to allow a process to check if a tty device had already been opened by another device before opening it. The mechanism uses what are called lock files. The idea was that when a process wanted to open a tty device, it would check for the existence of a file in a special location, named similarly to the device it intends to open. If the file does not exist, the process creates it and opens the tty device. If the file does exist, the process assumes another process already has the tty device open and takes appropriate action. One last clever trick to make the lock file management system work was writing the process ID (pid) of the process that had created the lock file into the lock file itself; we'll talk more about that in a moment.

The lock file mechanism works perfectly well in circumstances in which you have a defined location for the lock files and all programs know where to find them. Alas, this wasn't always the case for Linux. It wasn't until the Linux Filesystem Standard defined a standard location for lock files when tty lock files began to work correctly. At one time there were at least four, and possibly more locations chosen by software developers to store lock files: /usr/spool/locks/, /var/spool/locks/, /var/lock/, and /usr/lock/. Confusion caused chaos. Programs were opening lock files in different locations that were meant to control a single tty device; it was as if lock files weren't being used at all.

The cua devices were created to provide a solution to this problem. Rather than relying on the use of lock files to prevent clashes between programs wanting to use the serial devices, it was decided that the kernel could provide a simple means of arbitrating who should be given access. If the ttyS device were already opened, an attempt to open the cua would result in an error that a program could interpret to mean the device was already being used. If the cua device were already open and an attempt was made to open the ttyS, the request would block; that is, it would be put on hold and wait until the cua device was closed by the other process. This worked quite well if you had a single modem that you had configured for dial-in access and you occasionally wanted to dial out on the same device. But it did not work very well in environments where you had multiple programs wanting to call out on the same device. The only way to solve the contention problem was to use lock files! Back to square one.

Suffice it to say that the Linux Filesystem Standard came to the rescue and now mandates that lock files be stored in the /var/lock directory, and that by convention, the lock file name for the ttyS1 device, for instance, is LCK..ttyS1. The cua lock files should also go in this directory, but use of cua devices is now discouraged.

The cua devices will probably still be around for some time to provide a period of backward compatibility, but in time they will be retired. If you are wondering what to use, stick to the ttyS device and make sure that your system is Linux FSSTND compliant, or at the very least that all programs using the serial devices agree on where the lock files are located. Most software dealing with serial tty devices provides a compile-time option to specify the location of the lock files. More often than not, this will appear as a variable called something like LOCKDIR in the Makefile or in a configuration header file. If you're compiling the software yourself, it is best to change this to agree with the FSSTND-specified location. If you're using a precompiled binary and you're not sure where the program will write its lock files, you can use the following command to gain a hint:

strings binaryfile | grep lock

If the location found does not agree with the rest of your system, you can try creating a symbolic link from the lock directory that the foreign executable wants to use back to /var/lock/. This is ugly, but it will work.

The Serial Device Special Files

Minor numbers are identical for both types of serial devices. If you have your modem on one of the ports COM1: through COM4:, its minor number will be the COM port number plus 63. If you are using special serial hardware, such as a high-performance multiple port serial controller, you will probably need to create special device files for it; it probably won't use the standard device driver. The Serial-HOWTO should be able to assist you in finding the appropriate details.

Assume your modem is on COM2:. Its minor number will be 65, and its major number will be 4 for normal use. There should be a device called ttyS1 that has these numbers. List the serial ttys in the /dev/ directory. The fifth and sixth columns show the major and minor numbers, respectively:

$ ls -l /dev/ttyS*
	  0 crw-rw----   1 uucp     dialout    4,  64 Oct 13  1997 /dev/ttyS0
	  0 crw-rw----   1 uucp     dialout    4,  65 Jan 26 21:55 /dev/ttyS1
	  0 crw-rw----   1 uucp     dialout    4,  66 Oct 13  1997 /dev/ttyS2
	  0 crw-rw----   1 uucp     dialout    4,  67 Oct 13  1997 /dev/ttyS3

If there is no device with major number 4 and minor number 65, you will have to create one. Become the superuser and type:

# mknod -m 666 /dev/ttyS1 c 4 65
	  # chown uucp.dialout /dev/ttyS1

The various Linux distributions use slightly differing strategies for who should own the serial devices. Sometimes they will be owned by root, and other times they will be owned by another user, such as uucp in our example. Modern distributions have a group specifically for dial-out devices, and any users who are allowed to use them are added to this group.

Some people suggest making /dev/modem a symbolic link to your modem device so that casual users don't have to remember the somewhat unintuitive ttyS1. However, you cannot use modem in one program and the real device file name in another. Their lock files would have different names and the locking mechanism wouldn't work.

Serial Hardware

RS-232 is currently the most common standard for serial communications in the PC world. It uses a number of circuits for transmitting single bits, as well as for synchronization. Additional lines may be used for signaling the presence of a carrier (used by modems) and for handshaking. Linux supports a wide variety of serial cards that use the RS-232 standard.

Hardware handshake is optional, but very useful. It allows either of the two stations to signal whether it is ready to receive more data, or if the other station should pause until the receiver is done processing the incoming data. The lines used for this are called Clear to Send (CTS) and Ready to Send (RTS), respectively, which explains the colloquial name for hardware handshake: RTS/CTS. The other type of handshake you might be familiar with is called XON/XOFF handshaking. XON/XOFF uses two nominated characters, conventionally Ctrl-S and Ctrl-Q, to signal to the remote end that it should stop and start transmitting data, respectively. While this method is simple to implement and okay for use by dumb terminals, it causes great confusion when you are dealing with binary data, as you may want to transmit those characters as part of your data stream, and not have them interpreted as flow control characters. It is also somewhat slower to take effect than hardware handshake. Hardware handshake is clean, fast, and recommended in preference to XON/XOFF when you have a choice.

In the original IBM PC, the RS-232 interface was driven by a UART chip called the 8250. PCs around the time of the 486 used a newer version of the UART called the 16450. It was slightly faster than the 8250. Nearly all Pentium-based machines have been supplied with an even newer version of the UART called the 16550. Some brands (most notably internal modems equipped with the Rockwell chip set) use completely different chips that emulate the behavior of the 16550 and can be treated similarly. Linux supports all of these in its standard serial port driver.[20]

The 16550 was a significant improvement over the 8250 and the 16450 because it offered a 16-byte FIFO buffer. The 16550 is actually a family of UART devices, comprising the 16550, the 16550A, and the 16550AFN (later renamed PC16550DN). The differences relate to whether the FIFO actually works; the 16550AFN is the one that is sure to work. There was also an NS16550, but its FIFO never really worked either.

The 8250 and 16450 UARTs had a simple 1-byte buffer. This means that a 16450 generates an interrupt for every character transmitted or received. Each interrupt takes a short period of time to service, and this small delay limits 16450s to a reliable maximum bit speed of about 9,600 bps in a typical ISA bus machine.

In the default configuration, the kernel checks the four standard serial ports, COM1: through COM4:. The kernel is also able to automatically detect what UART is used for each of the standard serial ports, and will make use of the enhanced FIFO buffer of the 16550, if it is available.

Using the Configuration Utilities

Now let's spend some time looking at the two most useful serial device configuration utilities: setserial and stty.

The setserial Command

The kernel will make its best effort to correctly determine how your serial hardware is configured, but the variations on serial device configuration makes this determination difficult to achieve 100 percent reliably in practice. A good example of where this is a problem is the internal modems we talked about earlier. The UART they use has a 16-byte FIFO buffer, but it looks like a 16450 UART to the kernel device driver: unless we specifically tell the driver that this port is a 16550 device, the kernel will not make use of the extended buffer. Yet another example is that of the dumb 4-port cards that allow sharing of a single IRQ among a number of serial devices. We may have to specifically tell the kernel which IRQ port it's supposed to use, and that IRQs may be shared.

setserial was created to configure the serial driver at runtime. The setserial command is most commonly executed at boot time from a script called 0setserial on some distributions, and rc.serial on others. This script is charged with the responsibility of initializing the serial driver to accommodate any nonstandard or unusual serial hardware in the machine.

The general syntax for the setserial command is:

setserial device [parameters]

in which the device is one of the serial devices, such as ttyS0.

The setserial command has a large number of parameters. The most common of these are described in Table 4.1.. For information on the remainder of the parameters, you should refer to the setserial manual page.

Table 4.1. setserial Command-Line Parameters

ParameterDescription
portport_number

Specify the I/O port address of the serial device. Port numbers should be specified in hexadecimal notation, e.g., 0x2f8.

irqnum

Specify the interrupt request line the serial device is using.

uartuart_type

Specify the UART type of the serial device. Common values are 16450, 16550, etc. Setting this value to none will disable this serial device.

fourport

Specifying this parameter instructs the kernel serial driver that this port is one port of an AST Fourport card.

spd_hi

Program the UART to use a speed of 57.6 kbps when a process requests 38.4 kbps.

spd_vhi

Program the UART to use a speed of 115 kbps when a process requests 38.4 kbps.

spd_normal

Program the UART to use the default speed of 38.4 kbps when requested. This parameter is used to reverse the effect of a spd_hi or spd_vhi performed on the specified serial device.

auto_irq

This parameter will cause the kernel to attempt to automatically determine the IRQ of the specified device. This attempt may not be completely reliable, so it is probably better to think of this as a request for the kernel to guess the IRQ. If you know the IRQ of the device, you should specify that it use the irq parameter instead.

autoconfig

This parameter must be specified in conjunction with the port parameter. When this parameter is supplied, setserial instructs the kernel to attempt to automatically determine the UART type located at the supplied port address. If the auto_irq parameter is also supplied, the kernel attempts to automatically determine the IRQ, too.

skip_test

This parameter instructs the kernel not to bother performing the UART type test during auto-configuration. This is necessary when the UART is incorrectly detected by the kernel.

A typical and simple rc file to configure your serial ports at boot time might look something like that shown in Example 4.1.. Most Linux distributions will include something slightly more sophisticated than this one.

Example 4.1. Example rc.serial setserial Commands

# /etc/rc.serial - serial line configuration script.
	  #
	  # Configure serial devices
	  /sbin/setserial /dev/ttyS0 auto_irq skip_test autoconfig
	  /sbin/setserial /dev/ttyS1 auto_irq skip_test autoconfig
	  /sbin/setserial /dev/ttyS2 auto_irq skip_test autoconfig
	  /sbin/setserial /dev/ttyS3 auto_irq skip_test autoconfig
	  #
	  # Display serial device configuration
	  /sbin/setserial -bg /dev/ttyS*

The -bg /dev/ttyS* argument in the last command will print a neatly formatted summary of the hardware configuration of all active serial devices. The output will look like that shown in Example 4.2..

Example 4.2. Output of setserial -bg /dev/ttyS Command

/dev/ttyS0 at 0x03f8 (irq = 4) is a 16550A
	  /dev/ttyS1 at 0x02f8 (irq = 3) is a 16550A

The stty Command

The name stty probably means set tty, but the stty command can also be used to display a terminal's configuration. Perhaps even more so than setserial, the stty command provides a bewildering number of characteristics you can configure. We'll cover the most important of these in a moment. You can find the rest described in the stty manual page.

The stty command is most commonly used to configure terminal parameters, such as whether characters will be echoed or what key should generate a break signal. We explained earlier that serial devices are tty devices and the stty command is therefore equally applicable to them.

One of the more important uses of the stty for serial devices is to enable hardware handshaking on the device. We talked briefly about hardware handshaking earlier. The default configuration for serial devices is for hardware handshaking to be disabled. This setting allows three wire serial cables to work; they don't support the necessary signals for hardware handshaking, and if it were enabled by default, they'd be unable to transmit any characters to change it.

Surprisingly, some serial communications programs don't enable hardware handshaking, so if your modem supports hardware handshaking, you should configure the modem to use it (check your modem manual for what command to use), and also configure your serial device to use it. The stty command has a crtscts flag that enables hardware handshaking on a device; you'll need to use this. The command is probably best issued from the rc.serial file (or equivalent) at boot time using commands like those shown in Example 4.3..

Example 4.3. Example rc.serial stty Commands

#
	  stty crtscts < /dev/ttyS0
	  stty crtscts < /dev/ttyS1
	  stty crtscts < /dev/ttyS2
	  stty crtscts < /dev/ttyS3
	  #

The stty command works on the current terminal by default, but by using the input redirection (<) feature of the shell, we can have stty manipulate any tty device. It's a common mistake to forget whether you are supposed to use < or >; modern versions of the stty command have a much cleaner syntax for doing this. To use the new syntax, we'd rewrite our sample configuration to look like that shown in Example 4.4..

Example 4.4. Example rc.serial stty Commands Using Modern Syntax

#
	  stty crtscts -F /dev/ttyS0
	  stty crtscts -F /dev/ttyS1
	  stty crtscts -F /dev/ttyS2
	  stty crtscts -F /dev/ttyS3
	  #

We mentioned that the stty command can be used to display the terminal configuration parameters of a tty device. To display all of the active settings on a tty device, use:

$ stty -a -F /dev/ttyS1

The output of this command, shown in Example 4.5., gives you the status of all flags for that device; a flag shown with a preceding minus, as in crtscts, means that the flag has been turned off.

Example 4.5. Output of stty -a Command

speed 19200 baud; rows 0; columns 0; line = 0;
	  intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; 
	  eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
	  werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0;
	  -parenb -parodd cs8 hupcl -cstopb cread clocal -crtscts
	  -ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon
	  -ixoff -iuclc -ixany -imaxbel
	  -opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0
	  bs0 vt0 ff0
	  -isig -icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop
	  -echoprt echoctl echoke

A description of the most important of these flags is given in Table 4.2.. Each of these flags is enabled by supplying it to stty and disabled by supplying it to stty with the character in front of it. Thus, to disable hardware handshaking on the ttyS0 device, you would use:

$ stty -crtscts -F /dev/ttyS0

Table 4.2. stty Flags Most Relevant to Configuring Serial Devices

FlagsDescription
N

Set the line speed to N bits per second.

crtsdts

Enable/Disable hardware handshaking.

ixon

Enable/Disable XON/XOFF flow control.

clocal

Enable/Disable modem control signals such as DTR/DTS and DCD. This is necessary if you are using a three wire serial cable because it does not supply these signals.

cs5 cs6 cs7 cs8

Set number of data bits to 5, 6, 7, or 8, respectively.

parodd

Enable odd parity. Disabling this flag enables even parity.

parenb

Enable parity checking. When this flag is negated, no parity is used.

cstopb

Enable use of two stop bits per character. When this flag is negated, one stop bit per character is used.

echo

Enable/Disable echoing of received characters back to sender.

The next example combines some of these flags and sets the ttyS0 device to 19,200 bps, 8 data bits, no parity, and hardware handshaking with echo disabled:

$ stty 19200 cs8 -parenb crtscts -echo -F /dev/ttyS0

Serial Devices and the login: Prompt

It was once very common that a Unix installation involved one server machine and many dumb character mode terminals or dial-up modems. Today that sort of installation is less common, which is good news for many people interested in operating this way, because the dumb terminals are now very cheap to acquire. Dial-up modem configurations are no less common, but these days they would probably be used to support a SLIP or PPP login (discussed in Chapter 7., Serial Line IP and Chapter 8., The Point-to-Point Protocol) than to be used for a simple login. Nevertheless, each of these configurations can make use of a simple program called a getty program.

The term getty is probably a contraction of get tty. A getty program opens a serial device, configures it appropriately, optionally configures a modem, and waits for a connection to be made. An active connection on a serial device is usually indicated by the Data Carrier Detect (DCD) pin on the serial device being raised. When a connection is detected, the getty program issues a login: prompt, and then invokes the login program to handle the actual system login. Each of the virtual terminals (e.g., /dev/tty1) in Linux has a getty running against it.

There are a number of different getty implementations, each designed to suit some configurations better than others. The getty that we'll describe here is called mgetty. It is quite popular because it has all sorts of features that make it especially modem-friendly, including support for automatic fax programs and voice modems. We'll concentrate on configuring mgetty to answer conventional data calls and leave the rest for you to explore at your convenience.

Configuring the mgetty Daemon

The mgetty daemon is available in source form from ftp://alpha.greenie.net/pub/mgetty/source/, and is available in just about all Linux distributions in prepackaged form. The mgetty daemon differs from most other getty implementations in that it has been designed specifically for Hayes-compatible modems. It still supports direct terminal connections, but is best suited for dialup applications. Rather than using the DCD line to detect an incoming call, it listens for the RING message generated by modern modems when they detect an incoming call and are not configured for auto-answer.

The main executable program is called /usr/sbin/mgetty, and its main configuration file is called /etc/mgetty/mgetty.config. There are a number of other binary programs and configuration files that cover other mgetty features.

For most installations, configuration is a matter of editing the /etc/mgetty/ mgetty.config file and adding appropriate entries to the /etc/inittab file to execute mgetty automatically.

Example 4.6. shows a very simple mgetty configuration file. This example configures two serial devices. The first, /dev/ttyS0, supports a Hayes-compatible modem at 38,400 bps. The second, /dev/ttyS0, supports a directly connected VT100 terminal at 19,200 bps.

Example 4.6. Sample /etc/mgetty/mgetty.config File

#
	    # mgetty configuration file
	    #
	    # this is a sample configuration file, see mgetty.info for details
	    #
	    # comment lines start with a "#", empty lines are ignored
	    #
	    # ----- global section -----
	    #
	    # In this section, you put the global defaults, per-port stuff is below
	    #
	    # access the modem(s) with 38400 bps
	    speed 38400
	    #
	    # set the global debug level to "4" (default from policy.h)
	    debug 4
	    #
	    
	    # ----- port specific section -----
	    # 
	    # Here you can put things that are valid only for one line, not the others
	    #
	    #
	    # Hayes modem connected to ttyS0: don't do fax, less logging
	    #
	    port ttyS0
	    debug 3
	    data-only y
	    #
	    # direct connection of a VT100 terminal which doesn't like DTR drops
	    #
	    port ttyS1
	    direct y
	    speed 19200
	    toggle-dtr n
	    #

The configuration file supports global and port-specific options. In our example we used a global option to set the speed to 38,400 bps. This value is inherited by the ttyS0 port. Ports we apply mgetty to use this speed setting unless it is overwritten by a port-specific speed setting, as we have done in the ttyS1 configuration.

The debug keyword controls the verbosity of mgetty logging. The data-only keyword in the ttyS0 configuration causes mgetty to ignore any modem fax features, to operate just as a data modem. The direct keyword in the ttyS1 configuration instructs mgetty not to attempt any modem initialization on the port. Finally, the toggle-dtr keyword instructs mgetty not to attempt to hang up the line by dropping the DTR (Data Terminal Ready) pin on the serial interface; some terminals don't like this to happen.

You can also choose to leave the mgetty.config file empty and use command-line arguments to specify most of the same parameters. The documentation accompanying the application includes a complete description of the mgetty configuration file parameters and command-line arguments. See the following example.

We need to add two entries to the /etc/inittab file to activate this configuration. The inittab file is the configuration file of the Unix System V init command. The init command is responsible for system initialization; it provides a means of automatically executing programs at boot time and re-executing them when they terminate. This is ideal for the goals of running a getty program.

T0:23:respawn:/sbin/mgetty ttyS0
	  T1:23:respawn:/sbin/mgetty ttyS1

Each line of the /etc/inittab file contains four fields, separated by colons. The first field is an identifier that uniquely labels an entry in the file; traditionally it is two characters, but modern versions allow four. The second field is the list of run levels at which this entry should be active. A run level is a means of providing alternate machine configurations and is implemented using trees of startup scripts stored in directories called /etc/rc1.d, /etc/rc2.d, etc. This feature is typically implemented very simply, and you should model your entries on others in the file or refer to your system documentation for more information. The third field describes when to take action. For the purposes of running a getty program, this field should be set to respawn, meaning that the command should be re-executed automatically when it dies. There are several other options, as well, but they are not useful for our purposes here. The fourth field is the actual command to execute; this is where we specify the mgetty command and any arguments we wish to pass it. In our simple example we're starting and restarting mgetty whenever the system is operating at either of run levels two or three, and are supplying as an argument just the name of the device we wish it to use. The mgetty command assumes the /dev/, so we don't need to supply it.

This chapter was a quick introduction to mgetty and how to offer login prompts to serial devices. You can find more extensive information in the Serial-HOWTO.

After you've edited the configuration files, you need to reload init to make the changes take effect. Simply send a hangup signal to the init process; it always has a process ID of one, so you can use the following command safely:

# kill -HUP 1


[19] David can be reached at bf347@lafn.org.

[20] Note that we are not talking about WinModem here! WinModems have very simple hardware and rely completely on the main CPU of your computer instead of dedicated hardware to do all of the hard work. If you're purchasing a modem, it is our strongest recommendation to not purchase such a modem; get a real modem. You may find Linux support for WinModems, but that makes them only a marginally more attractive solution.

In this chapter, we walk you through all the necessary steps to set up TCP/IP networking on your machine. Starting with the assignment of IP addresses, we slowly work our way through the configuration of TCP/IP network interfaces and introduce a few tools that come in handy when hunting down network installation problems.

Most of the tasks covered in this chapter will generally have to be done only once. Afterward, you have to touch most configuration files only when adding a new system to your network or when you reconfigure your system entirely. Some of the commands used to configure TCP/IP, however, have to be executed each time the system is booted. This is usually done by invoking them from the system /etc/rc* scripts.

As with other system configuration options, entries in /etc/rc.conf determine your systems initial settings. This file is then used by the scripts /etc/rc.network (for IPv4 network configuration) and /etc/rc.network6 (for IPv6 network configuration). The defaults for all these settings are stored in /etc/defaults/rc.conf. Thus /etc/rc.conf only contains entries that are different from the default values. If you have not already done so, taking a look through /etc/defaults/rc.conf gives you a good overview of all the parameters you can change in this way.

This chapter discusses parts of the script that configure your network interfaces, while applications will be covered in later chapters. After finishing this chapter, you should have established a sequence of commands that properly configure TCP/IP networking on your computer. You should then replace any sample commands in your configuration scripts with your commands, make sure the script is executed from the basic rc script at startup time, and reboot your machine.

Setting the Hostname

Most, if not all, network applications rely on you to set the local host's name to some reasonable value. This setting is usually made during the boot procedure by executing the hostname command. To set the hostname to name, enter:

# hostname name

You should use the hosts Fully Qualified Domain Name (FQDN) when setting the host name. For instance, hosts at the Virtual Brewery (described in Appendix A.) might be called vale.vbrew.com or vlager.vbrew.com. These are their FQDNs, and should be used.

For example:

# hostname vlager.vbrew.com

The appropriate entry in /etc/rc.conf is:

hostname="vlager.vbrew.com"

Do not use domainname

You may have noticed the domainname(1) command, and thought that you should use this to set the host's domain name. Don't. This command sets the current NIS domain name, and is not related to the host's Internet domain.

Assigning IP Addresses

If you configure the networking software on your host for standalone operation (for instance, to be able to run the INN Netnews software), you can safely skip this section, because the only IP address you will need is for the loopback interface, which is always 127.0.0.1.

Things are a little more complicated with real networks like Ethernets. If you want to connect your host to an existing network, you have to ask its administrators to give you an IP address on this network. When setting up a network all by yourself, you have to assign IP addresses yourself.

Hosts within a local network should usually share addresses from the same logical IP network. Hence, you have to assign an IP network address. If you have several physical networks, you have to either assign them different network numbers, or use subnetting to split your IP address range into several subnetworks. Subnetting will be revisited in the next section, the section called “Creating Subnets”.

When picking an IP network number, much depends on whether you intend to get on the Internet in the near future. If so, you should obtain an official IP address now. Ask your network service provider to help you. If you want to obtain a network number, just in case you might get on the Internet someday, request a Network Address Application Form from hostmaster@internic.net, or your country's own Network Information Center, if there is one.

If your network is not connected to the Internet and won't be in the near future, you are free to choose any legal network address. Just make sure no packets from your internal network escape to the real Internet. To make sure no harm can be done even if packets did escape, you should use one of the network numbers reserved for private use. The Internet Assigned Numbers Authority (IANA) has set aside several network numbers from classes A, B, and C that you can use without registering. These addresses are valid only within your private network and are not routed between real Internet sites. The numbers are defined by RFC 1597 and are listed in Table 2.1. in Chapter 2., Issues of TCP/IP Networking. Note that the second and third blocks contain 16 and 256 networks, respectively.

Picking your addresses from one of these network numbers is not only useful for networks completely unconnected to the Internet; you can still implement a slightly more restricted access using a single host as a gateway. To your local network, the gateway is accessible by its internal IP address, while the outside world knows it by an officially registered address (assigned to you by your provider). We come back to this concept in connection with the IP masquerade facility in Chapter 11., Network Address Translation.

Throughout the remainder of the book, we will assume that the brewery's network manager uses a class B network number, say 172.16.0.0. Of course, a class C network number would definitely suffice to accommodate both the Brewery's and the Winery's networks. We'll use a class B network here for the sake of simplicity; it will make the subnetting examples in the next section of this chapter a little more intuitive.

Creating Subnets

To operate several Ethernets (or other networks, once a driver is available), you have to split your network into subnets. Note that subnetting is required only if you have more than one broadcast networkpoint-to-point links don't count. For instance, if you have one Ethernet, and one or more SLIP links to the outside world, you don't need to subnet your network. This is explained in more detail in Chapter 7., Serial Line IP.

To accommodate the two Ethernets, the Brewery's network manager decides to use 8 bits of the host part as additional subnet bits. This leaves another 8 bits for the host part, allowing for 254 hosts on each of the subnets. She then assigns subnet number 1 to the brewery, and gives the winery number 2. Their respective network addresses are thus 172.16.1.0 and 172.16.2.0. The subnet mask is 255.255.255.0.

vlager, which is the gateway between the two networks, is assigned a host number of 1 on both of them, which gives it the IP addresses 172.16.1.1 and 172.16.2.1, respectively.

Note that in this example we are using a class B network to keep things simple, but a class C network would be more realistic. With the new networking code, subnetting is not limited to byte boundaries, so even a class C network may be split into several subnets. For instance, you could use two bits of the host part for the netmask, giving you 4 possible subnets with 64 hosts on each. [21]

Writing hosts and networks Files

After you have subnetted your network, you should prepare for some simple sort of hostname resolution using the /etc/hosts file. If you are not going to use DNS or NIS for address resolution, you have to put all hosts in the hosts file.

Even if you want to run DNS or NIS during normal operation, you should have some subset of all hostnames in /etc/hosts. You should have some sort of name resolution, even when no network interfaces are running, for example, during boot time. This is not only a matter of convenience, but it allows you to use symbolic hostnames in your network rc scripts. Thus, when changing IP addresses, you only have to copy an updated hosts file to all machines and reboot, rather than edit a large number of rc files separately. Usually you put all local hostnames and addresses in hosts, adding those of any gateways and NIS servers used.[22]

You should make sure your resolver only uses information from the hosts file during initial testing. Sample files that come with your DNS or NIS software may produce strange results. To make all applications use /etc/hosts exclusively when looking up the IP address of a host, you have to edit the /etc/host.conf file.

/etc/host.conf lists the order in which servers will be queried, one per line. Yours should look like this:

# First try the /etc/hosts file
hosts
# Now try the nameserver next
bind
# If you have YP/NIS configured, uncomment the next line
# nis

This indicates that the /etc/hosts file will be checked first, and then the DNS. For the duration of this testing you should also comment out the bind line.

The configuration of the resolver library is covered in detail in Chapter 6., Name Service and Resolver Configuration.

The hosts file contains one entry per line, consisting of an IP address, a hostname, and an optional list of aliases for the hostname. The fields are separated by spaces or tabs, and the address field must begin in the first column. Anything following a hash sign (#) is regarded as a comment and is ignored.

Hostnames can be either fully qualified or relative to the local domain. For vale, you would usually enter the fully qualified name, vale.vbrew.com, and vale by itself in the hosts file, so that it is known by both its official name and the shorter local name.

This is an example how a hosts file at the Virtual Brewery might look. Two special names are included, vlager-if1 and vlager-if2, which give the addresses for both interfaces used on vlager:

#
# Hosts file for Virtual Brewery/Virtual Winery
#
# IP            FQDN                 aliases
#
127.0.0.1       localhost
#
172.16.1.1      vlager.vbrew.com      vlager vlager-if1
172.16.1.2      vstout.vbrew.com      vstout
172.16.1.3      vale.vbrew.com        vale
#
172.16.2.1      vlager-if2
172.16.2.2      vbeaujolais.vbrew.com vbeaujolais
172.16.2.3      vbardolino.vbrew.com  vbardolino
172.16.2.4      vchianti.vbrew.com    vchianti

Just as with a host's IP address, you should sometimes use a symbolic name for network numbers, too. Therefore, the hosts file has a companion called /etc/networks that maps network names to network numbers, and vice versa. At the Virtual Brewery, we might install a networks file like this: [23]

# /etc/networks for the Virtual Brewery
brew-net      172.16.1.0
wine-net      172.16.2.0

Interface Configuration for IP

After setting up your hardware as explained in Chapter 4., Configuring the Serial Hardware, you have to make these devices known to the kernel networking software. A couple of commands are used to configure the network interfaces and initialize the routing table. These tasks are usually performed from the network initialization script each time you boot the system. The basic tools for this process are called ifconfig (where if stands for interface) and route.

ifconfig is used to make an interface accessible to the kernel networking layer. This involves the assignment of an IP address and other parameters, and activation of the interface, also known as bringing up the interface. Being active here means that the kernel will send and receive IP datagrams through the interface. The simplest way to invoke it is with:

# ifconfig interface inet ip-address

This command assigns ip-address to interface and activates it. inet indicates that this is an IPv4 address. You can use ifconfig to assign addresses from other address families, such as IPv6 (inet6), AppleTalk (atalk), and IPX (ipx).All other parameters are set to default values. For instance, the default network mask is derived from the network class of the IP address, such as 255.255.0.0 for a class B address. ifconfig is described in detail in the section ???.

route allows you to add or remove routes from the kernel routing table. It can be invoked as:

#  route command modifiers args

command can be one of add, flush, delete, change, get, and monitor. When adding or deleting routes you normally use the following syntax.

# route [add | delete] [-net | -host] destination gateway [netmask]

add and delete determine whether to add or delete the route to the destination. If you are adding a default route, or the combination of destination and netmask indicate a network then route adds a network route, otherwise it will default to a host route. Use -net and -host to enforce a particular interpretation.

The Loopback Interface

The very first interface to be activated is the loopback interface:

# ifconfig lo0 inet 127.0.0.1

This will typically have been carried out automatically when the system booted. /etc/defaults/rc.conf contains the following line:

ifconfig_lo0="inet 127.0.0.1"

to initialise the loopback interface.

Occasionally, you will see the dummy hostname localhost being used instead of the IP address. ifconfig will look up the name in the hosts file, where an entry should declare it as the hostname for 127.0.0.1:

# Sample /etc/hosts entry for localhost
127.0.0.01    localhost

To view the configuration of an interface, you invoke ifconfig, giving it only the interface name as argument:

$ ifconfig lo0
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000

As you can see, the loopback interface has been assigned a netmask of 255.0.0.0 (or 0xff000000 in hexadecimal), since 127.0.0.1 is a class A address.

Next, you should check that everything works fine, for example by using ping. ping is the networking equivalent of a sonar device. [24] The command is used to verify that a given address is actually reachable, and to measure the delay that occurs when sending a datagram to it and back again. The time required for this process is often referred to as the round-trip time:

$ ping localhost
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0.048 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0.062 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=0.045 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=255 time=0.072 ms
^C
--- localhost ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.045/0.057/0.072/0.011 ms

When you invoke ping as shown here, it will continue emitting packets forever, unless interrupted by the user. The ^C marks the place where we pressed Ctrl-C.

The previous example shows that packets for 127.0.0.1 are properly delivered and a reply is returned to ping almost instantaneously. This shows that you have successfully set up your first network interface.

Ethernet Interfaces

Configuring an Ethernet interface is pretty much the same as the loopback interface; it just requires a few more parameters when you are using subnetting.

At the Virtual Brewery, we have subnetted the IP network, which was originally a class B network, into class C subnetworks. To make the interface recognize this, the ifconfig incantation would look like this:

# ifconfig fxp0 inet vstout netmask 255.255.255.0

This command assigns the fxp0 interface the IP address of vstout (172.16.1.2). If we omitted the netmask, ifconfig would deduce the netmask from the IP network class, which would result in an incorrect netmask of 255.255.0.0. Now a quick check shows:

# ifconfig fxp0
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 172.16.1.2 netmask 0xffffff00 broadcast 172.16.1.255
        ether 00:50:8b:64:bf:43 
        media: autoselect status: no carrier
        supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP

You can see that ifconfig automatically sets the broadcast address (the broadcast field) to the usual value, which is the host's network number with all the host bits set. Also, the maximum transmission unit (the maximum size of IP datagrams the kernel will generate for this interface) has been set to the maximum size of Ethernet packets: 1,500 bytes. The defaults are usually what you will use, but all these values can be overidden if required, with special options that will be described under ???.

Now that you've finished the basic configuration steps, we want to make sure that your Ethernet interface is indeed running happily. Choose a host from your Ethernet, for instance vlager, and type:

# ping vlager
PING vlager.vbrew.com (172.16.1.1): 56 data bytes
64 bytes from 172.16.1.1: icmp_seq=0 ttl=255 time=0.278 ms
64 bytes from 172.16.1.1: icmp_seq=1 ttl=255 time=0.230 ms
64 bytes from 172.16.1.1: icmp_seq=2 ttl=255 time=0.281 ms
64 bytes from 172.16.1.1: icmp_seq=3 ttl=255 time=0.234 ms
^C
--- vstout.vbrew.com ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.230/0.256/0.281/0.024 ms

If you don't see similar output, something is broken. If you encounter unusual packet loss rates, this hints at a hardware problem, like bad or missing terminators. If you don't receive any replies at all, you should check the interface configuration with netstat described later in the section called “The netstat Command”. The packet statistics displayed by netstat -i should tell you whether any packets have been sent out on the interface at all. If you have access to the remote host too, you should go over to that machine and check the interface statistics. This way you can determine exactly where the packets got dropped. In addition, you should display the routing information with netstat -r to see if both hosts have the correct routing entry (n just makes it print addresses as dotted quad instead of using the hostname):

# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags     Refs     Use     Netif Expire
127.0.0.1          127.0.0.1          UH          0      136      lo0
172.16.1           link#1             UC          0       10     fxp0 =>

The detailed meaning of these fields is explained later in the section called “The netstat Command”." The Flags column contains a list of flags set for each interface. U is always set for active interfaces, and H says the destination address denotes a host. If the H flag is set for a route that you meant to be a network route, you have to reissue the route command with the net option. To check whether a route you have entered is used at all, check to see if the Use field in the second to last column increases between two invocations of ping.

To have this network configuration recreated when the host restarts you must add an entry to /etc/rc.conf. Interface configuration variables are called ifconfig_interface, and the value of the variable should be the arguments you pass to ifconfig, without including the interface name.

To run the earlier command:

# ifconfig fxp0 inet vstout netmask 255.255.255.0

automatically when the system restarts, add this line to /etc/rc.conf:

ifconfig_fxp0="inet vstout netmask 255.255.255.0"

Routing Through a Gateway

In the previous section, we covered only the case of setting up a host on a single Ethernet. Quite frequently, however, one encounters networks connected to one another by gateways. These gateways may simply link two or more Ethernets, but may also provide a link to the outside world, such as the Internet. In order to use a gateway, you have to provide additional routing information to the networking layer.

The Ethernets of the Virtual Brewery and the Virtual Winery are linked through such a gateway, namely the host vlager. Assuming that vlager has already been configured, we just have to add another entry to vstout's routing table that tells the kernel it can reach all hosts on the Winery's network through vlager. The appropriate incantation of route is shown :

# route add wine-net vlager

Of course, any host on the Winery network you wish to talk to must have a routing entry for the Brewery's network. Otherwise you would only be able to send data to the Winery network from the Brewery network, but the hosts on the Winery would be unable to reply.

This example describes only a gateway that switches packets between two isolated Ethernets. Now assume that vlager also has a connection to the Internet (say, through an additional SLIP link). Then we would want datagrams to any destination network other than the Brewery to be handed to vlager. This action can be accomplished by making it the default gateway for vstout:

# route add default vlager

The network name default is a shorthand for 0.0.0.0, which denotes the default route. The default route matches every destination and will be used if there is no more specific route that matches. You do not have to add this name to /etc/networks because it is built into route.

If you see high packet loss rates when pinging a host behind one or more gateways, this may hint at a very congested network. Packet loss is not so much due to technical deficiencies as to temporary excess loads on forwarding hosts, which makes them delay or even drop incoming datagrams.

Routing information can be configured in to /etc/rc.conf. The hostname or IP address of the default route should be added to the defaultrouter variable. To add routes to other networks you create as many route_var variables as you need, each of which should contain the parameters to pass to route add. You must also list each var in the static_routes variable.

The previous configuration would be written like this:

defaultrouter="vlager"
static_routes="wine"
route_wine="wine-net vlager"

Note

The route for wine-net in this example is a little redundant. Since wine-net and the default route share the same gateway (vlager) you only actually need the default route in order for the routing to work properly.

Configuring a Gateway

Configuring a machine to switch packets between two Ethernets is pretty straightforward. Assume we're back at vlager, which is equipped with two Ethernet cards, each connected to one of the two networks. All you have to do is configure both interfaces separately, giving them their respective IP addresses and matching routes, configure the host to forward packets between the two interfaces, and that's it.

It is quite useful to add information on the two interfaces to the hosts file as shown in the following example, so we have handy names for them, too:

172.16.1.1      vlager.vbrew.com    vlager vlager-if1
172.16.2.1      vlager-if2

The sequence of commands to set up the two interfaces is then:

# ifconfig fxp0 inet vlager-if1
# ifconfig fxp1 inet vlager-if2
# sysctl -w net.inet.ip.forwarding=1

That last line sets a variable in the kernel to enable forwarding. This variable, net.inet.ip.forwarding, defaults to 0, or off. Setting it to 1 turns on IP forwarding.

To configure this in /etc/rc.conf, add variables for ifconfig_fxp0 and ifconfig_fxp1. To make the host a gateway (i.e., enable IP forwarding) set the gateway_enable variable to YES.

ifconfig_fxp0="inet vlager-if1"
ifconfig_fxp1="inet vlager-if2"
gateway_enable="YES"

The PLIP Interface

A PLIP link used to connect two machines is a little different from an Ethernet. PLIP links are an example of what are called point-to-point links, meaning that there is a single host at each end of the link. Networks like Ethernet are called broadcast networks. Configuration of point-to-point links is different because unlike broadcast networks, point-to-point links don't support a network of their own.

PLIP provides very cheap and portable links between computers. As an example, we'll consider the laptop computer of an employee at the Virtual Brewery that is connected to vlager via PLIP. The laptop itself is called vlite and has only one parallel port. At boot time, this port will be registered as lp0. To activate the link, you have to configure the lp0 interface using the following commands:

# ifconfig lp0 inet vlite vlager

This configures lp0 on vlite with a point-to-point link to vlager. You need to run a similar command on vlager to activate the link.

# ifconfig lp0 vlager vlite

Note that the lp0 interface on vlager does not need a separate IP address, but may also be given the address 172.16.1.1. Point-to-point networks don't support a network directly, so the interfaces don't require an address on any supported network. The kernel uses the interface information in the routing table to avoid any possible confusion.

Now we have configured routing from the laptop to the Brewery's network; what's still missing is a way to route from any of the Brewery's hosts to vlite. One particularly cumbersome way is to add a specific route to every host's routing table that names vlager as a gateway to vlite:

# route add vlite vlager

Dynamic routing offers a much better option for temporary routes. You could use gated, a routing daemon, which you would have to install on each host in the network in order to distribute routing information dynamically. The easiest option, however, is to use proxy ARP (Address Resolution Protocol). With proxy ARP, vlager will respond to any ARP query for vlite by sending its own Ethernet address. All packets for vlite will wind up at vlager, which then forwards them to the laptop.

The arp command is used to do this. The command takes the form:

arp -s hostname MAC

where hostname is the host at the other end of the link (i.e., vlite), and MAC is the MAC address of the ethernet interface that is connected to the network. On vlager this is fxp0. The easiest way to find the MAC address is to examine the output from ifconfig.

# ifconfig fxp0
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 172.16.1.2 netmask 0xffffff00 broadcast 172.16.1.255
        ether 00:50:8b:64:bf:43 
        media: autoselect status: no carrier
        supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP
# arp -s vlite 00:50:8b:64:bf:43

vlager must also be configured as a gateway, in order to forward packets between the fxp0 and lp0 interfaces.

# sysctl -w net.inet.ip.forwarding=1

We will come back to proxy ARP in the section the section called “Checking the ARP Tables”.

The SLIP and PPP Interfaces

Although SLIP and PPP links are only simple point-to-point links like PLIP connections, there is much more to be said about them. Usually, establishing a SLIP connection involves dialing up a remote site through your modem and setting the serial line to SLIP mode. PPP is used in a similar fashion. We discuss SLIP and PPP in detail in Chapter 7., Serial Line IP and Chapter 8., The Point-to-Point Protocol.

IP Alias

IP Alias allows you to configure multiple IP addresses onto a single interface. This allows you to configure your host to look like many different hosts, each with its own IP address. This configuration is sometimes called Virtual Hosting, although technically it is also used for a variety of other techniques.[25]

An example of when this can be useful is when you know you are going to start a small network, perhaps with many services running on a single server. You can assign a different IP address to each service you want to run. As your network expands, and you need to migrate services to additional servers, you can configure the additional servers with existing IP addresses, and remove the aliases[26].

To configure an alias for an interface you use alias keyword in the ifconfig command. You must also use the all-ones netmask.

vlager has already been configured with the IP address 172.16.1.2 on the fxp0 interface. To add the 172.16.1.3 and 172.16.1.4 addresses to the same interface you would run these commands:

# ifconfig fxp0 inet 172.16.1.3 netmask 255.255.255.255 alias
# ifconfig fxp0 inet 172.16.1.4 netmask 255.255.255.255 alias

These IP addresses will then be visible to other hosts on the network, can be connected to using ping, and so forth. They are indistinguishable from the first IP address configured on the interface.

Aliases are defined in /etc/rc.conf using variable named ifconfig_interface_aliasn where interface is the interface name, and n is the interface number. The variable's value is the ifconfig arguments you would normally use, without the alias. The complete configuration for fxp0 would look like this:

# First IP address
ifconfig_fxp0="inet vstout netmask 255.255.255.0"

# Aliases
ifconfig_fxp0_alias0="inet 172.16.1.3 netmask 255.255.255.255"
ifconfig_fxp0_alias1="inet 172.16.1.4 netmask 255.255.255.255"

The netstat Command

netstat is a useful tool for checking your network configuration and activity. It is in fact a collection of several tools lumped together. We discuss each of its functions in the following sections.

Displaying the Routing Table

When you invoke netstat with the r flag, it displays the kernel routing table . On vstout, it produces:

# netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags     Refs     Use     Netif Expire
default            172.16.1.1         UGSc       17        0     fxp0
127.0.0.1          127.0.0.1          UH          0     1657      lo0
172.16.1           link#1             UC          0        0     fxp0 =>
172.16.2           172.16.1.1         UGS         0        0     fxp0

The n option makes netstat print addresses as dotted quad IP numbers rather than the symbolic host and network names. This option is especially useful when you want to avoid address lookups over the network (e.g., to a DNS or NIS server).

The Gateway column of netstat's output shows the gateway to which the routing entry points. If no gateway is used, an asterisk is printed instead.

The Flags column displays the following flags that describe the route:

G

The route uses a gateway.

U

The interface to be used is up.

H

Only a single host can be reached through the route. For example, this is the case for the loopback entry 127.0.0.1.

S

This route is static, and was configured manually using route add ...

The Netif field displays the network interface that this route will use.

Displaying Interface Statistics

netstat can display statistics about the amount of traffic travelling over an interface, the number of collisions, and so forth. Use the -i flag to do this. If you also use the -b flag then netstat will display the number of bytes transferred, while using the -d flag as well will show the number of dropped packets.

# netstat -i
Name  Mtu   Network       Address            Ipkts Ierrs    Opkts Oerrs  Coll
fxp0  1500  <Link#1>    00:02:b3:3a:58:0a  4584308     2  3682517     0 459178
fxp0  1500  172.16.1      vstout           3637833     -  3713419     -     -
lp0   1500  <Link#2>                             0     0        0     0     0
lo0   16384 <Link#7>                         83949     0    83949     0     0
lo0   16384 127           localhost           1657     -     1657     -     -

Displaying Connections

netstat supports a set of options to display active sockets, and to specify which address families (IPv4, IPv6, IPX, etc) to display. This can be used to give you a list of all servers that are currently running on your system.

# netstat -f inet
Active Internet Connections
Proto Recv-Q Send-Q Local Address    Foreign Address    (state)
tcp        0      0 *:domain         *:*                LISTEN  
tcp        0      0 *:time           *:*                LISTEN  
tcp        0      0 *:smtp           *:*                LISTEN  
tcp        0      0 vlager:smtp      vstout:1040        ESTABLISHED  
tcp        0      0 *:telnet         *:*                LISTEN  
tcp        0      0 localhost:1046   vbardolino:telnet  ESTABLISHED  
tcp        0      0 *:chargen        *:*                LISTEN  
tcp        0      0 *:daytime        *:*                LISTEN  
tcp        0      0 *:discard        *:*                LISTEN  
tcp        0      0 *:echo           *:*                LISTEN  
tcp        0      0 *:shell          *:*                LISTEN  
tcp        0      0 *:login          *:*                LISTEN

This output shows most servers simply waiting for an incoming connection. However, the fourth line shows an incoming SMTP connection from vstout, and the sixth line tells you there is an outgoing telnet connection to vbardolino. [27]

The sockstat Command

The sockstat command gives a similar display to that of netstat, but also shows the command and process ID of the process that has opened the socket, as well as the user name it is running under. This can be very useful in tracing down which process is listening on a particular port, or determining who is connecting where.

These are the first few lines obtained when running sockstat on a user's workstation.

# sockstat
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
nik      mozilla- 62453   28 tcp4   192.168.1.20:2962     212.67.196.230:80    
nik      ssh      61773    3 tcp4   192.168.1.20:2895     192.168.1.16:22      
nik      ssh      58784    3 tcp4   192.168.1.20:1744     192.168.1.16:22      
nik      xchat    20137    8 tcp4   192.168.1.20:2628     195.159.0.90:6667    
nik      xchat    20137   12 tcp4   192.168.1.20:4670     194.242.139.170:6667 
nik      slrn     17273    3 tcp4   192.168.1.20:3059     194.117.133.24:119   
nik      licq     88462   15 tcp4   *:49153               *:*                  
...

This example shows what seems to be a web connection open to 212.67.196.230, two ssh connections, two IRC connections, a Usenet news session, and an ICQ client.

Checking the ARP Tables

On some occasions, it is useful to view or alter the contents of the kernel's ARP tables, for example when you suspect a duplicate Internet address is the cause for some intermittent network problem.

All hostname arguments may be either symbolic hostnames or IP addresses in dotted quad notation.

Viewing the arp tables

# arp [-n] hostname
# arp [-n] -a

This displays the ARP entry for the IP address or host specified, or all hosts known if no hostname is given. For example, invoking arp on vlager may yield:

# arp -a
vlager.vbrew.com (172.16.1.3) at 00:00:C0:5A:42:C1 permanent [ethernet]
vstout.vbrew.com (172.16.1.2) at 00:00:C0:90:B3:42 [ethernet]
vale.vbrew.com (172.16.2.4) at 0:00:C0:04:69:AA [ethernet]

which shows the Ethernet addresses of vlager, vstout and vale.

Adding entries

# arp [-s|-S] hostname ether_addr [temp] [pub]

The s option is used to permanently add hostname's Ethernet address to the ARP tables. The hwaddr argument specifies the hardware address, which is by default expected to be an Ethernet address specified as six hexadecimal bytes separated by colons.

If you use -S then any existing arp entries for the host will be deleted first. Append temp to the options to make this a temporary, instead of permanent, assignment. The pub option is covered in the section called “Proxy arp”.

Deleting entries

# arp -d [-a|hostname]

Invoking arp using the d switch deletes all ARP entries relating to the given host. This switch may be used to force the interface to re-attempt obtaining the Ethernet address for the IP address in question. This is useful when a misconfigured system has broadcasted wrong ARP information (of course, you have to reconfigure the broken host first).

Using -a instead of specifying a hostname can be used to delete all arp entries.

Proxy arp

The s option may also be used to implement proxy ARP. This is a special technique through which a host, say gate, acts as a gateway to another host named fnord by pretending that both addresses refer to the same host, namely gate. It does so by publishing an ARP entry for fnord that points to its own Ethernet interface. Now when a host sends out an ARP query for fnord, gate will return a reply containing its own Ethernet address. The querying host will then send all datagrams to gate, which dutifully forwards them to fnord.

These contortions may be necessary when you want to access fnord from a DOS machine with a broken TCP implementation that doesn't understand routing too well. When you use proxy ARP, it will appear to the DOS machine as if fnord was on the local subnet, so it doesn't have to know about how to route through a gateway.

Another useful application of proxy ARP is when one of your hosts acts as a gateway to some other host only temporarily, for instance, through a dial-up link. In a previous example, we encountered the laptop vlite, which was connected to vlager through a PLIP link from time to time. Of course, this application will work only if the address of the host you want to provide proxy ARP for is on the same IP subnet as your gateway. vstout could proxy ARP for any host on the Brewery subnet (172.16.1.0), but never for a host on the Winery subnet (172.16.2.0).

The proper invocation to provide proxy ARP for fnord is given below; of course, the given Ethernet address must be that of gate:

# arp -s fnord 00:00:c0:a1:42:e0 pub

The proxy ARP entry may be removed again by invoking:

# arp -d fnord


[21] The first number on each subnet is the subnetwork address, and the last number on each subnet is reserved as the broadcast address, so it's actually 62 hosts per subnet.

[22] You need the address of an NIS server only if you use Peter Eriksson's NYS. Other NIS implementations locate their servers only at runtime by using ypbind.

[23] Note that names in networks must not collide with hostnames from the hosts file, or else some programs may produce strange results.

[24] Anyone remember Pink Floyd's Echoes?

[25] More correctly, using IP aliasing is known as network layer virtual hosting. It is more common in the WWW and STMP worlds to use application layer virtual hosting, in which the same IP address is used for each virtual host, but a different hostname is passed with each application layer request. Services like FTP are not capable of operating in this way, and they demand network layer virtual hosting.

[26] You can do something similar using CNAMEs in the DNS. It's often personal preference as to which you decide to use.

[27] You can tell whether a connection is outgoing from the port numbers. The port number shown for the calling host will always be a simple integer. On the host being called, a well-known service port will be in use for which netstat uses the symbolic name such as smtp, found in /etc/services.

As we discussed in Chapter 2., Issues of TCP/IP Networking, TCP/IP networking may rely on different schemes to convert names into addresses. The simplest way is a host table stored in /etc/hosts. This is useful only for small LANs that are run by one single administrator and otherwise have no IP traffic with the outside world. The format of the hosts file has already been described in Chapter 5., Configuring TCP/IP Networking.

Alternatively, you can use the Berkeley Internet Name Domain service (BIND) for resolving hostnames to IP addresses. Configuring BIND can be a real chore, but once you've done it, you can easily make changes in the network topology. On Linux, as on many other Unixish systems, name service is provided through a program called named. At startup, it loads a set of master files into its internal cache and waits for queries from remote or local user processes. There are different ways to set up BIND, and not all require you to run a name server on every host.

This chapter can do little more than give a rough sketch of how DNS works and how to operate a name server. It should be sufficient if you have a small LAN and an Internet uplink. For the most current information, you may want to check the documentation contained in the BIND source package, which supplies manual pages, release notes, and the BIND Operator's Guide (BOG). Don't let this name scare you off; it's actually a very useful document. For a more comprehensive coverage of DNS and associated issues, you may find DNS and BIND by Paul Albitz and Cricket Liu (O'Reilly) a useful reference. DNS questions may be answered in a newsgroup called comp.protocols.tcp-ip.domains. For technical details, the Domain Name System is defined by RFC numbers 1033, 1034, and 1035.

The Resolver Library

The term resolver refers not to a special application, but to the resolver library. This is a collection of functions that can be found in the standard C library. The central routines are gethostbyname(2) and gethostbyaddr(2), which look up all IP addresses associated with a host name, and vice versa. They may be configured to simply look up the information in hosts, to query a number of DNS name servers, or to use the hosts database of Network Information Service (NIS).

The resolver functions read configuration files when they are invoked. From these configuration files, they determine what databases to query, in which order, and other details relevant to how you've configured your environment.

The FreeBSD resolver uses two files, /etc/host.conf and /etc/resolv.conf, when determining how to resolve IP addresses from host names.

The host.conf File

The /etc/host.conf file tells the library resolver functions which services to use, and in what order. Services are tried in the order they are listed in the file until one of them returns a match, or until they all fail.

Options in host.conf must appear on separate lines. Fields may be separated by white space (spaces or tabs). A hash sign (#) introduces a comment that extends to the next newline. The following options are available:

hosts

Search the /etc/hosts file

bind

Search the DNS, using the entries in /etc/resolv.conf to locate a nameserver.

nis

Search the YP/NIS system

A sample file for vlager is shown in Example 6.1..

Example 6.1. Sample host.conf File

# First try /etc/hosts
hosts
# Now try the DNS
bind
# Uncomment this line to search the YP/NIS tables
# nis

Configuring Name Server Lookups Using resolv.conf

When configuring the resolver library to use the BIND name service for host lookups, you also have to tell it which name servers to use. There is a separate file for this called resolv.conf. If this file does not exist or is empty, the resolver assumes the name server is on your local host.

To run a name server on your local host, you have to set it up separately, as will be explained in the following section. If you are on a local network and have the opportunity to use an existing name server, this should always be preferred. If you use a dialup IP connection to the Internet, you would normally specify the name server of your service provider in the resolv.conf file.

The most important option in resolv.conf is nameserver, which gives the IP address of a name server to use. If you specify several name servers by giving the nameserver option several times, they are tried in the order given. You should therefore put the most reliable server first. The current implementation allows you to have up to three nameserver statements in resolv.conf. If no nameserver option is given, the resolver attempts to connect to the name server on the local host.

Two other options, domain and search, let you use shortcut names for hosts in your local domain. Usually, when just telnetting to another host in your local domain, you don't want to type in the fully qualified hostname, but use a name like gauss on the command line and have the resolver tack on the mathematics.groucho.edu part.

This is just the domain statement's purpose. It lets you specify a default domain name to be appended when DNS fails to look up a hostname. For instance, when given the name gauss, the resolver fails to find gauss. in DNS, because there is no such top-level domain. When given mathematics.groucho.edu as a default domain, the resolver repeats the query for gauss with the default domain appended, this time succeeding.

That's just fine, you may think, but as soon you get out of the Math department's domain, you're back to those fully qualified domain names. Of course, you would also want to have shorthands like quark.physics for hosts in the Physics department's domain.

This is when the search list comes in. A search list can be specified using the search option, which is a generalization of the domain statement. Where the latter gives a single default domain, the former specifies a whole list of them, each to be tried in turn until a lookup succeeds. This list must be separated by blanks or tabs.

The search and domain statements are mutually exclusive and may not appear more than once. If neither option is given, the resolver will try to guess the default domain from the local hostname using the getdomainname(2) system call. If the local hostname doesn't have a domain part, the default domain will be assumed to be the root domain.

If you decide to put a search statement into resolv.conf, you should be careful about what domains you add to this list. Resolver libraries prior to BIND 4.9 used to construct a default search list from the domain name when no search list was given. This default list was made up of the default domain itself, plus all of its parent domains up to the root. This caused some problems because DNS requests wound up at name servers that were never meant to see them.

Assume you're at the Virtual Brewery and want to log in to foot.groucho.edu. By a slip of your fingers, you mistype foot as foo, which doesn't exist. GMU's name server will therefore tell you that it knows no such host. With the old-style search list, the resolver would now go on trying the name with vbrew.com and com appended. The latter is problematic because groucho.edu.com might actually be a valid domain name. Their name server might then even find foo in their domain, pointing you to one of their hostswhich clearly was not intended.

For some applications, these bogus host lookups can be a security problem. Therefore, you should usually limit the domains on your search list to your local organization, or something comparable. At the Mathematics department of Groucho Marx University, the search list would commonly be set to maths.groucho.edu and groucho.edu.

If default domains sound confusing to you, consider this sample resolv.conf file for the Virtual Brewery:

# /etc/resolv.conf
# Our domain
domain         vbrew.com
#
# We use vlager as central nameserver:
nameserver     172.16.1.1

When resolving the name vale, the resolver looks up vale and, failing this, vale.vbrew.com.

Resolver Robustness

When running a LAN inside a larger network, you definitely should use central name servers if they are available. The name servers develop rich caches that speed up repeat queries, since all queries are forwarded to them. However, this scheme has a drawback: when a fire destroyed the backbone cable at Olaf's university, no more work was possible on his department's LAN because the resolver could no longer reach any of the name servers. This situation caused difficulties with most network services, such as X terminal logins and printing.

Although it is not very common for campus backbones to go down in flames, one might want to take precautions against cases like this.

One option is to set up a local name server that resolves hostnames from your local domain and forwards all queries for other hostnames to the main servers. Of course, this is applicable only if you are running your own domain.

Alternatively, you can maintain a backup host table for your domain or LAN in /etc/hosts. This is very simple to do. You simply ensure that the resolver library queries DNS first, and the hosts file next.

In an /etc/host.conf file you'd make sure that bind was listed before hosts to make the resolver fall back to the hosts file if the central name server is unreachable.

How DNS Works

DNS organizes hostnames in a domain hierarchy. A domain is a collection of sites that are related in some sensebecause they form a proper network (e.g., all machines on a campus, or all hosts on BITNET), because they all belong to a certain organization (e.g., the U.S. government), or because they're simply geographically close. For instance, universities are commonly grouped in the edu domain, with each university or college using a separate subdomain, below which their hosts are subsumed. Groucho Marx University have the groucho.edu domain, while the LAN of the Mathematics department is assigned maths.groucho.edu. Hosts on the departmental network would have this domain name tacked onto their hostname, so erdos would be known as erdos.maths.groucho.edu. This is called the fully qualified domain name (FQDN), which uniquely identifies this host worldwide.

Figure 6.1. shows a section of the namespace. The entry at the root of this tree, which is denoted by a single dot, is quite appropriately called the root domain and encompasses all other domains. To indicate that a hostname is a fully qualified domain name, rather than a name relative to some (implicit) local domain, it is sometimes written with a trailing dot. This dot signifies that the name's last component is the root domain.

Figure 6.1. A part of the domain namespace

Depending on its location in the name hierarchy, a domain may be called top-level, second-level, or third-level. More levels of subdivision occur, but they are rare. This list details several top-level domains you may see frequently:

DomainDescription
edu

(Mostly U.S.) educational institutions like universities.

com

Commercial organizations and companies.

org

Non-commercial organizations. Private UUCP networks are often in this domain.

net

Gateways and other administrative hosts on a network.

mil

U.S. military institutions.

gov

U.S. government institutions.

uucp

Officially, all site names formerly used as UUCP names without domains have been moved to this domain.

Historically, the first four of these were assigned to the U.S., but recent changes in policy have meant that these domains, named global Top Level Domains (gTLD), are now considered global in nature. Negotiations are currently underway to broaden the range of gTLDs, which may result in increased choice in the future.

Outside the U.S., each country generally uses a top-level domain of its own named after the two-letter country code defined in ISO-3166. Finland, for instance, uses the fi domain; fr is used by France, de by Germany, and au by Australia. Below this top-level domain, each country's NIC is free to organize hostnames in whatever way they want. Australia has second-level domains similar to the international top-level domains, named com.au and edu.au. Other countries, like Germany, don't use this extra level, but have slightly long names that refer directly to the organizations running a particular domain. It's not uncommon to see hostnames like ftp.informatik.uni-erlangen.de. Chalk that up to German efficiency.

Of course, these national domains do not imply that a host below that domain is actually located in that country; it means only that the host has been registered with that country's NIC. A Swedish manufacturer might have a branch in Australia and still have all its hosts registered with the se top-level domain.

Organizing the namespace in a hierarchy of domain names nicely solves the problem of name uniqueness; with DNS, a hostname has to be unique only within its domain to give it a name different from all other hosts worldwide. Furthermore, fully qualified names are easy to remember. Taken by themselves, these are already very good reasons to split up a large domain into several subdomains.

DNS does even more for you than this. It also allows you to delegate authority over a subdomain to its administrators. For example, the maintainers at the Groucho Computing Center might create a subdomain for each department; we already encountered the math and physics subdomains above. When they find the network at the Physics department too large and chaotic to manage from outside (after all, physicists are known to be an unruly bunch of people), they may simply pass control of the physics.groucho.edu domain to the administrators of this network. These administrators are free to use whatever hostnames they like and assign them IP addresses from their network in whatever fashion they desire, without outside interference.

To this end, the namespace is split up into zones, each rooted at a domain. Note the subtle difference between a zone and a domain: the domain groucho.edu encompasses all hosts at Groucho Marx University, while the zone groucho.edu includes only the hosts that are managed by the Computing Center directly; those at the Mathematics department, for example. The hosts at the Physics department belong to a different zone, namely physics.groucho.edu. In Figure 6.1., the start of a zone is marked by a small circle to the right of the domain name.

Name Lookups with DNS

At first glance, all this domain and zone fuss seems to make name resolution an awfully complicated business. After all, if no central authority controls what names are assigned to which hosts, how is a humble application supposed to know?

Now comes the really ingenious part about DNS. If you want to find the IP address of erdos, DNS says, Go ask the people who manage it, and they will tell you.

In fact, DNS is a giant distributed database. It is implemented by so-called name servers that supply information on a given domain or set of domains. For each zone there are at least two, or at most a few, name servers that hold all authoritative information on hosts in that zone. To obtain the IP address of erdos, all you have to do is contact the name server for the groucho.edu zone, which will then return the desired data.

Easier said than done, you might think. So how do I know how to reach the name server at Groucho Marx University? In case your computer isn't equipped with an address-resolving oracle, DNS provides for this, too. When your application wants to look up information on erdos, it contacts a local name server, which conducts a so-called iterative query for it. It starts off by sending a query to a name server for the root domain, asking for the address of erdos.maths.groucho.edu. The root name server recognizes that this name does not belong to its zone of authority, but rather to one below the edu domain. Thus, it tells you to contact an edu zone name server for more information and encloses a list of all edu name servers along with their addresses. Your local name server will then go on and query one of those, for instance, a.isi.edu. In a manner similar to the root name server, a.isi.edu knows that the groucho.edu people run a zone of their own, and points you to their servers. The local name server will then present its query for erdos to one of these, which will finally recognize the name as belonging to its zone, and return the corresponding IP address.

This looks like a lot of traffic being generated for looking up a measly IP address, but it's really only miniscule compared to the amount of data that would have to be transferred if we were still stuck with HOSTS.TXT. There's still room for improvement with this scheme, however.

To improve response time during future queries, the name server stores the information obtained in its local cache. So the next time anyone on your local network wants to look up the address of a host in the groucho.edu domain, your name server will go directly to the groucho.edu name server.[28]

Of course, the name server will not keep this information forever; it will discard it after some time. The expiration interval is called the time to live, or TTL. Each datum in the DNS database is assigned such a TTL by administrators of the responsible zone.

Types of Name Servers

Name servers that hold all information on hosts within a zone are called authoritative for this zone, and sometimes are referred to as master name servers. Any query for a host within this zone will end up at one of these master name servers.

Master servers must be fairly well synchronized. Thus, the zone's network administrator must make one the primary server, which loads its zone information from data files, and make the others secondary servers, which transfer the zone data from the primary server at regular intervals.

Having several name servers distributes workload; it also provides backup. When one name server machine fails in a benign way, like crashing or losing its network connection, all queries will fall back to the other servers. Of course, this scheme doesn't protect you from server malfunctions that produce wrong replies to all DNS requests, such as from software bugs in the server program itself.

You can also run a name server that is not authoritative for any domain.[29] This is useful, as the name server will still be able to conduct DNS queries for the applications running on the local network and cache the information. Hence it is called a caching-only server.

The DNS Database

We have seen that DNS not only deals with IP addresses of hosts, but also exchanges information on name servers. DNS databases may have, in fact, many different types of entries.

A single piece of information from the DNS database is called a resource record (RR). Each record has a type associated with it describing the sort of data it represents, and a class specifying the type of network it applies to. The latter accommodates the needs of different addressing schemes, like IP addresses (the IN class), Hesiod addresses (used by MIT's Kerberos system), and a few more. The prototypical resource record type is the A record, which associates a fully qualified domain name with an IP address.

A host may be known by more than one name. For example you might have a server that provides both FTP and World Wide Web servers, which you give two names: ftp.machine.org and www.machine.org. However, one of these names must be identified as the official or canonical hostname, while the others are simply aliases referring to the official hostname. The difference is that the canonical hostname is the one with an associated A record, while the others only have a record of type CNAME that points to the canonical hostname.

We will not go through all record types here, but we will give you a brief example. Example 6.2. shows a part of the domain database that is loaded into the name servers for the physics.groucho.edu zone.

Example 6.2. An Excerpt from the named.hosts File for the Physics Department

; Authoritative Information on physics.groucho.edu.
@  IN  SOA niels.physics.groucho.edu. janet.niels.physics.groucho.edu. {
	  1999090200       ; serial no
	  360000           ; refresh
	  3600             ; retry
	  3600000          ; expire
	  3600             ; default ttl
	  }
;
; Name servers
IN    NS       niels
IN    NS       gauss.maths.groucho.edu.
gauss.maths.groucho.edu. IN A 149.76.4.23
;
; Theoretical Physics (subnet 12)
niels         IN    A        149.76.12.1
              IN    A        149.76.1.12
nameserver    IN    CNAME    niels
otto          IN    A        149.76.12.2
quark         IN    A        149.76.12.4
down          IN    A        149.76.12.5
strange       IN    A        149.76.12.6
...
; Collider Lab. (subnet 14)
boson         IN    A        149.76.14.1
muon          IN    A        149.76.14.7
bogon         IN    A        149.76.14.12
...

Apart from the A and CNAME records, you can see a special record at the top of the file, stretching several lines. This is the SOA resource record signaling the Start of Authority, which holds general information on the zone the server is authoritative for. The SOA record comprises, for instance, the default time to live for all records.

Note that all names in the sample file that do not end with a dot should be interpreted relative to the physics.groucho.edu domain. The special name (@) used in the SOA record refers to the domain name by itself.

We have seen earlier that the name servers for the groucho.edu domain somehow have to know about the physics zone so that they can point queries to their name servers. This is usually achieved by a pair of records: the NS record that gives the server's FQDN, and an A record that associates an address with that name. Since these records are what holds the namespace together, they are frequently called glue records. They are the only instances of records in which a parent zone actually holds information on hosts in the subordinate zone. The glue records pointing to the name servers for physics.groucho.edu are shown in Example 6.3..

Example 6.3. An Excerpt from the named.hosts File for GMU

; Zone data for the groucho.edu zone.
@  IN  SOA vax12.gcc.groucho.edu. joe.vax12.gcc.groucho.edu. {
          1999070100       ; serial no
	  360000           ; refresh
	  3600             ; retry
	  3600000          ; expire
	  3600             ; default ttl
	  }
....
;
; Glue records for the physics.groucho.edu zone
physics        IN     NS        niels.physics.groucho.edu.
               IN     NS        gauss.maths.groucho.edu.
niels.physics  IN     A         149.76.12.1
gauss.maths    IN     A         149.76.4.23
...

Reverse Lookups

Finding the IP address belonging to a host is certainly the most common use for the Domain Name System, but sometimes you'll want to find the canonical hostname corresponding to an address. Finding this hostname is called reverse mapping, and is used by several network services to verify a client's identity. When using a single hosts file, reverse lookups simply involve searching the file for a host that owns the IP address in question. With DNS, an exhaustive search of the namespace is out of the question. Instead, a special domain, in-addr.arpa, has been created that contains the IP addresses of all hosts in a reversed dotted quad notation. For instance, an IP address of 149.76.12.4 corresponds to the name 4.12.76.149.in-addr.arpa. The resource-record type linking these names to their canonical hostnames is PTR.

Creating a zone of authority usually means that its administrators have full control over how they assign addresses to names. Since they usually have one or more IP networks or subnets at their hands, there's a one-to-many mapping between DNS zones and IP networks. The Physics department, for instance, comprises the subnets 149.76.8.0, 149.76.12.0, and 149.76.14.0.

Consequently, new zones in the in-addr.arpa domain have to be created along with the physics zone, and delegated to the network administrators at the department: 8.76.149.in-addr.arpa, 12.76.149.in-addr.arpa, and 14.76.149.in-addr.arpa. Otherwise, installing a new host at the Collider Lab would require them to contact their parent domain to have the new address entered into their in-addr.arpa zone file.

The zone database for subnet 12 is shown in Example 6.4.. The corresponding glue records in the database of their parent zone are shown in Example 6.5..

Example 6.4. An Excerpt from the named.rev File for Subnet 12

; the 12.76.149.in-addr.arpa domain.
@  IN  SOA  niels.physics.groucho.edu. janet.niels.physics.groucho.edu. {
	  1999090200 360000 3600 3600000 3600
	  }
2        IN     PTR       otto.physics.groucho.edu.
4        IN     PTR       quark.physics.groucho.edu.
5        IN     PTR       down.physics.groucho.edu.
6        IN     PTR       strange.physics.groucho.edu.

Example 6.5. An Excerpt from the named.rev File for Network 149.76

; the 76.149.in-addr.arpa domain.
@  IN  SOA vax12.gcc.groucho.edu. joe.vax12.gcc.groucho.edu. {
	  1999070100 360000 3600 3600000 3600
	  }
...
; subnet 4: Mathematics Dept.
1.4        IN     PTR      sophus.maths.groucho.edu.
17.4       IN     PTR      erdos.maths.groucho.edu.
23.4       IN     PTR      gauss.maths.groucho.edu.
...
; subnet 12: Physics Dept, separate zone
12         IN     NS       niels.physics.groucho.edu.
IN     NS       gauss.maths.groucho.edu.
niels.physics.groucho.edu. IN  A 149.76.12.1
gauss.maths.groucho.edu. IN  A   149.76.4.23
...

in-addr.arpa system zones can only be created as supersets of IP networks. An even more severe restriction is that these networks' netmasks have to be on byte boundaries. All subnets at Groucho Marx University have a netmask of 255.255.255.0, hence an in-addr.arpa zone could be created for each subnet. However, if the netmask were 255.255.255.128 instead, creating zones for the subnet 149.76.12.128 would be impossible, because there's no way to tell DNS that the 12.76.149.in-addr.arpa domain has been split into two zones of authority, with hostnames ranging from 1 through 127, and 128 through 255, respectively.

Running named

The Berekeley Internet Name Daemon (BIND) version 8 is the DNS server shipped with FreeBSD; there are alternatives in the FreeBSD ports collection. BIND includes a program, named (pronounce name-dee) which is the actual DNS server.

This section requires some understanding of the way DNS works. If the following discussion is all Greek to you, you may want to reread the section the section called “How DNS Works”."

named is usually started at system boot time and runs until the machine stops. On FreeBSD, BIND is normally started with a named_enable="YES" entry in /etc/rc.conf. There are additional variables, named_program and named_flags that you can set if you have installed an alternative to named, or want to run it with different options.

To run named at the prompt, enter:

# /usr/sbin/named

named will come up and read the /etc/namedb/named.conf file and any zone files specified therein. It writes its process ID to /var/run/named.pid in ASCII, downloads any zone files from primary servers, if necessary, and starts listening on port 53 for DNS queries.

The BIND 8 named.conf File

BIND's configuration file changed drastically with version 8. BIND is much more configurable than it used to be, but the tradeoff is that the configuration file format is more complex.

A BIND configuration file consists of a series of one or more statements, each of which must be ended with a semi-colon. Many statements can contain sub-statements, and these are grouped using curly brackets. The most common statements you will normally use in a named.conf file are:

options

Controls global configuration options, and sets the defaults for other statements in the configuration file. They can only be one options statement per file.

There a large number of variables that can be defined inside an options statement. The most frequently used are:

version

The version number the server will report if asked. This is normally the same value as the server's actual version number. You can change this to some other string. This is useful because BIND has been a source of security problems in the past, and attackers have been able to determine whether the server is vulnerable by looking at the version string it returns. Setting this to surely you must be joking is a common practice.

directory

The server's working directory. Any other relative paths in this file (i.e., those without a leading /) will be relative to this directory.

forward

If this is set to the string only then named will not try to resolve the query itself; instead, it forwards it on to the nameservers listed in the forwarders entry. This configuration is referred to as a caching only name server.

forwarders

The IP addresses of the nameservers that will be consulted if forward is set to only.

listen-on

named will normally listen on all the host's active network interfaces. Use this directive if you want named to only listen on specific interfaces.

zone

Defines a zone, and lists the configuration files that contain the DNS information for that zone. There are a large number of configuration variables that can be defined inside a zone statement. Some of them are:

type

One of master, slave, stub, forward or hint.

master

Indicates that the name server will answer queries for this zone. There must be an associated file entry which specifies the filename that contains the zone data.

slave

Indicates that the name server will answer queries for this zone, and that the zone information will be retrieved from another name server. There must be an associated masters entry which lists the IP addresses of the master name servers for this zone.

forward

The nameserver will answer queries for this zone by forwarding them on the list of name servers configured in the forwarders option.

hint

This zone contains the initial set of root name servers. There must be an associated file entry which lists the file that contains the root name server data.

Example 6.6. named.conf for vlager

This is an example named.conf for vlager, which will be the main DNS server for vbrew.com.

// 
// /etc/namedb/named.conf file for vlager.vbrew.com
//

// Set global options
options {
  directory "/etc/namedb";
};

// Load the root nameserver entries from named.root
zone "." {
  type hint;
  file "named.root";
};

// You always need a reverse entry for 127.0.0.1
zone "0.0.127.in-addr.arpa" {
  type master;
  file "localhost.rev";
};
	  
// Master server entries for vbrew.com
zone "vbrew.com" {
  type master;
  file "vbrew.com/db";
};
	  
// Reverse entries for vbrew.com
zone "16.172.in-addr.arpa" {
  type master;
  file "rev";
};

Organising your DNS directories

There are several ways to organise your zone files. Some people keep them in the DNS directory, and give them names like domainname.db for the forward mapping, and domainname.rev for the reverse.

This is sufficient if you're hosting the domain information for one or two zones, but can become unwieldy if you have to host tens or hundreds of zones.

Others do the exact opposite, and construct directory trees that mirror the domain hierarchy; /etc/namedb/com/vlager, and so forth. This approach scales well, but does mean that you can spend a lot of time changing up and down the directory hierarchy if you need to edit a lot of files.

A good compromise is to have a relatively flat directory structure, with the directories named after the domain names, and with no sub-directories for sub-domains. Each directory should have two filesone for the zone file (called db in these examples), and one for the reverse lookup file (called rev in these examples).

The scheme that you use is not as important as having a scheme in the first place. If you find yourself having to manage many different DNS domains then it is essential you have a good way to manage the content of the zone tables.

The DNS Database Files

Master files included with named, like named.hosts, always have a domain associated with them, which is called the origin. This is the domain name specified with the cache and primary options. Within a master file, you are allowed to specify domain and host names relative to this domain. A name given in a configuration file is considered absolute if it ends in a single dot, otherwise it is considered relative to the origin. The origin by itself may be referred to using (@).

The data contained in a master file is split up in resource records(RRs). RRs are the smallest units of information available through DNS. Each resource record has a type. A records, for instance, map a hostname to an IP address, and a CNAME record associates an alias for a host with its official hostname. To see an example, look at Example 6.8., which shows the named.hosts master file for the Virtual Brewery.

Resource record representations in master files share a common format:

[domain] [ttl] [class] type rdata

Fields are separated by spaces or tabs. An entry may be continued across several lines if an opening brace occurs before the first newline and the last field is followed by a closing brace. Anything between a semicolon and a newline is ignored. A description of the format terms follows:

domain

This term is the domain name to which the entry applies. If no domain name is given, the RR is assumed to apply to the domain of the previous RR.

ttl

In order to force resolvers to discard information after a certain time, each RR is associated a time to live (ttl). The ttl field specifies the time in seconds that the information is valid after it has been retrieved from the server. It is a decimal number with at most eight digits.

If no ttl value is given, the field value defaults to that of the minimum field of the preceding SOA record.

class

This is an address class, like IN for IP addresses or HS for objects in the Hesiod class. For TCP/IP networking, you have to specify IN.

If no class field is given, the class of the preceding RR is assumed.

type

This describes the type of the RR. The most common types are A, SOA, PTR, and NS. The following sections describe the various types of RRs.

rdata

This holds the data associated with the RR. The format of this field depends on the type of RR. In the following discussion, it will be described for each RR separately.

The following is partial list of RRs to be used in DNS master files. There are a couple more of them that we will not explain; they are experimental and of little use, generally.

SOA

This RR describes a zone of authority (SOA means Start of Authority). It signals that the records following the SOA RR contain authoritative information for the domain. Every master file included by a primary statement must contain an SOA record for this zone. The resource data contains the following fields:

origin

This field is the canonical hostname of the primary name server for this domain. It is usually given as an absolute name.

contact

This field is the email address of the person responsible for maintaining the domain, with the "@" sign replaced by a dot. For instance, if the responsible person at the Virtual Brewery were janet, this field would contain janet.vbrew.com.

serial

This field is the version number of the zone file, expressed as a single decimal number. Whenever data is changed in the zone file, this number should be incremented. A common convention is to use a number that reflects the date of the last update, with a version number appended to it to cover the case of multiple updates occurring on a single day, e.g., 2000012600 being update 00 that occurred on January 26, 2000.

The serial number is used by secondary name servers to recognize zone information changes. To stay up to date, secondary servers request the primary server's SOA record at certain intervals and compare the serial number to that of the cached SOA record. If the number has changed, the secondary servers transfer the whole zone database from the primary server.

refresh

This field specifies the interval in seconds that the secondary servers should wait between checking the SOA record of the primary server. Again, this is a decimal number with at most eight digits.

Generally, the network topology doesn't change too often, so this number should specify an interval of roughly a day for larger networks, and even more for smaller ones.

retry

This number determines the intervals at which a secondary server should retry contacting the primary server if a request or a zone refresh fails. It must not be too low, or a temporary failure of the server or a network problem could cause the secondary server to waste network resources. One hour, or perhaps one-half hour, might be a good choice.

expire

This field specifies the time in seconds after which a secondary server should finally discard all zone data if it hasn't been able to contact the primary server. You should normally set this field to at least a week (604,800 seconds), but increasing it to a month or more is also reasonable.

minimum

This field is the default ttl value for resource records that do not explicitly contain one. The ttl value specifies the maximum amount of time other name servers may keep the RR in their cache. This time applies only to normal lookups, and has nothing to do with the time after which a secondary server should try to update the zone information.

If the topology of your network does not change frequently, a week or even more is probably a good choice. If single RRs change more frequently, you could still assign them smaller ttls individually. If your network changes frequently, you may want to set minimum to one day (86,400 seconds).

A

This record associates an IP address with a hostname. The resource data field contains the address in dotted quad notation.

For each hostname, there must be only one A record. The hostname used in this A record is considered the official or canonical hostname. All other hostnames are aliases and must be mapped onto the canonical hostname using a CNAME record. If the canonical name of our host were vlager, we'd have an A record that associated that hostname with its IP address. Since we may also want another name associated with that address, say news, we'd create a CNAME record that associates this alternate name with the canonical name. We'll talk more about CNAME records shortly.

NS

NS records are used to specify a zone's primary server and all its secondary servers. An NS record points to a master name server of the given zone, with the resource data field containing the hostname of the name server.

You will meet NS records in two situations: The first situation is when you delegate authority to a subordinate zone; the second is within the master zone database of the subordinate zone itself. The sets of servers specified in both the parent and delegated zones should match.

The NS record specifies the name of the primary and secondary name servers for a zone. These names must be resolved to an address so they can be used. Sometimes the servers belong to the domain they are serving, which causes a chicken and egg problem; we can't resolve the address until the name server is reachable, but we can't reach the name server until we resolve its address. To solve this dilemma, we can configure special A records directly into the name server of the parent zone. The A records allow the name servers of the parent domain to resolve the IP address of the delegated zone name servers. These records are commonly called glue records because they provide the glue that binds a delegated zone to its parent.

CNAME

This record associates an alias with a host's canonical hostname. It provides an alternate name by which users can refer to the host whose canonical name is supplied as a parameter. The canonical hostname is the one the master file provides an A record for; aliases are simply linked to that name by a CNAME record, but don't have any other records of their own.

PTR

This type of record is used to associate names in the in-addr.arpa domain with hostnames. It is used for reverse mapping of IP addresses to hostnames. The hostname given must be the canonical hostname.

MX

This RR announces a mail exchanger for a domain. Mail exchangers are discussed in the section called “Mail Routing on the Internet”. The syntax of an MX record is:

[domain] [ttl] [class] MX preference host

host names the mail exchanger for domain. Every mail exchanger has an integer preference associated with it. A mail transport agent that wants to deliver mail to domain tries all hosts who have an MX record for this domain until it succeeds. The one with the lowest preference value is tried first, then the others, in order of increasing preference value.

HINFO

This record provides information on the system's hardware and software. Its syntax is:

[domain] [ttl] [class] HINFO hardware software

The hardware field identifies the hardware used by this host. Special conventions are used to specify this. A list of valid machine names is given in the Assigned Numbers RFC (RFC-1700). If the field contains any blanks, it must be enclosed in double quotes. The software field names the operating system software used by the system. Again, a valid name from the Assigned Numbers RFC should be chosen.

An HINFO record to describe an Intel-based Linux machine should look something like:

tao	 36500  IN  HINFO  IBM-PC  LINUX2.2

and HINFO records for Linux running on Motorola 68000-based machines might look like:

cevad 36500 IN  HINFO  ATARI-104ST LINUX2.0
jedd  36500 IN  HINFO  AMIGA-3000  LINUX2.0

Caching-only named Configuration

There is a special type of named configuration that we'll talk about before we explain how to build a full name server configuration. It is called a caching-only configuration. It doesn't really serve a domain, but acts as a relay for all DNS queries produced on your host. The advantage of this scheme is that it builds up a cache so only the first query for a particular host is actually sent to the name servers on the Internet. Any repeated request will be answered directly from the cache in your local name server. This may not seem useful yet, but it will when you are dialing in to the Internet, as described in Chapter 7., Serial Line IP and Chapter 8., The Point-to-Point Protocol.

A named.conf file for a caching-only server looks like this:

options {
   directory "/etc/namedb";
   forward only;
   forwarders {
      A.B.C.D;
      W.X.Y.Z;
   };
};

zone "." {
   type hint;
   file "named.root";
};

zone "0.0.127.in-addr.arpa" {
   type master;
   file "localhost.rev";
};

Change the entries in the forwarders clause to list the IP addresses of the name servers you want to query.

In addition to this named.confile, you must set up the named.root file with a valid list of root name servers. No other files are needed for a caching-only server configuration.

Writing the Master Files

Example 6.7., Example 6.8., Example 6.9., and Example 6.10. give sample files for a name server at the brewery, located on vlager. Due to the nature of the network discussed (a single LAN), the example is pretty straightforward.

The named.root cache file shown in Example 6.7. shows sample hint records for a root name server. A typical cache file usually describes about a dozen name servers. You can obtain the current list of name servers for the root domain using the nslookup tool described in the next section.[30]

Example 6.7. The named.root File

;
; /etc/namedb/named.root         Cache file for the brewery.
;                We're not on the Internet, so we don't need
;                any root servers. To activate these
;                records, remove the semicolons.
;
;.                        3600000  IN  NS    A.ROOT-SERVERS.NET.
;A.ROOT-SERVERS.NET.      3600000      A     198.41.0.4
;.                        3600000      NS    B.ROOT-SERVERS.NET.
;B.ROOT-SERVERS.NET.      3600000      A     128.9.0.107
;.                        3600000      NS    C.ROOT-SERVERS.NET.
;C.ROOT-SERVERS.NET.      3600000      A     192.33.4.12
;.                        3600000      NS    D.ROOT-SERVERS.NET.
;D.ROOT-SERVERS.NET.      3600000      A     128.8.10.90
;.                        3600000      NS    E.ROOT-SERVERS.NET.
;E.ROOT-SERVERS.NET.      3600000      A     192.203.230.10
;.                        3600000      NS    F.ROOT-SERVERS.NET.
;F.ROOT-SERVERS.NET.      3600000      A     192.5.5.241
;.                        3600000      NS    G.ROOT-SERVERS.NET.
;G.ROOT-SERVERS.NET.      3600000      A     192.112.36.4
;.                        3600000      NS    H.ROOT-SERVERS.NET.
;H.ROOT-SERVERS.NET.      3600000      A     128.63.2.53
;.                        3600000      NS    I.ROOT-SERVERS.NET.
;I.ROOT-SERVERS.NET.      3600000      A     192.36.148.17
;.                        3600000      NS    J.ROOT-SERVERS.NET.
;J.ROOT-SERVERS.NET.      3600000      A     198.41.0.10
;.                        3600000      NS    K.ROOT-SERVERS.NET.
;K.ROOT-SERVERS.NET.      3600000      A     193.0.14.129 
;.                        3600000      NS    L.ROOT-SERVERS.NET.
;L.ROOT-SERVERS.NET.      3600000      A     198.32.64.12
;.                        3600000      NS    M.ROOT-SERVERS.NET.
;M.ROOT-SERVERS.NET.      3600000      A     202.12.27.33
;

Example 6.8. The /etc/namedb/vbrew.com/db File

;
; /etc/namedb/vbrew.com/db       Local hosts at the brewery
;                               Origin is vbrew.
;
@                IN  SOA   vlager.vbrew.com. janet.vbrew.com. (
          2000012601 ; serial
	  86400      ; refresh: once per day
	  3600       ; retry:   one hour
	  3600000    ; expire:  42 days
	  604800     ; minimum: 1 week
	  )
IN  NS    vlager.vbrew.com.
;
; local mail is distributed on vlager
IN  MX    10 vlager
;
; loopback address
localhost.       IN  A     127.0.0.1
;
; Virtual Brewery Ethernet
vlager           IN  A     172.16.1.1
vlager-if1       IN  CNAME vlager
; vlager is also news server
news             IN  CNAME vlager
vstout           IN  A     172.16.1.2
vale             IN  A     172.16.1.3
;
; Virtual Winery Ethernet
vlager-if2       IN  A     172.16.2.1
vbardolino       IN  A     172.16.2.2
vchianti         IN  A     172.16.2.3
vbeaujolais      IN  A     172.16.2.4
;
; Virtual Spirits (subsidiary) Ethernet
vbourbon         IN  A     172.16.3.1
vbourbon-if1     IN  CNAME vbourbon

Example 6.9. The localhost.rev File

;
; /etc/namedb/localhost.rev       Reverse mapping of 127.0.0
;                              Origin is 0.0.127.in-addr.arpa.
;
@             IN  SOA   vlager.vbrew.com. joe.vbrew.com. (
	  1          ; serial
	  360000     ; refresh: 100 hrs
	  3600       ; retry:   one hour
	  3600000    ; expire:  42 days
	  360000     ; minimum: 100 hrs
	  )
IN  NS    vlager.vbrew.com.
1             IN  PTR   localhost.

Example 6.10. The /etc/namedb/vbrew.com/rev File

;
; /etc/namedb/vbrew.com/rev         Reverse mapping of our IP addresses
;                               Origin is 16.172.in-addr.arpa.
;
@             IN  SOA   vlager.vbrew.com. joe.vbrew.com. (
	  16         ; serial
	  86400      ; refresh: once per day
	  3600       ; retry:   one hour
	  3600000    ; expire:  42 days
	  604800     ; minimum: 1 week
	  )
IN  NS    vlager.vbrew.com.
; brewery
1.1           IN  PTR   vlager.vbrew.com.
2.1           IN  PTR   vstout.vbrew.com.
3.1           IN  PTR   vale.vbrew.com.
; winery
1.2           IN  PTR   vlager-if2.vbrew.com.
2.2           IN  PTR   vbardolino.vbrew.com.
3.2           IN  PTR   vchianti.vbrew.com.
4.2           IN  PTR   vbeaujolais.vbrew.com.

Verifying the Name Server Setup

nslookup is a great tool for checking the operation of your name server setup. It can be used both interactively with prompts and as a single command with immediate output. In the latter case, you simply invoke it as:

$ nslookup hostname

nslookup queries the name server specified in resolv.conf for hostname. (If this file names more than one server, nslookup chooses one at random.)

The interactive mode, however, is much more exciting. Besides looking up individual hosts, you may query for any type of DNS record and transfer the entire zone information for a domain.

When invoked without an argument, nslookup displays the name server it uses and enters interactive mode. At the > prompt, you may type any domain name you want to query. By default, it asks for class A records, those containing the IP address relating to the domain name.

You can look for record types by issuing:

> set type=type

in which type is one of the resource record names described earlier, or ANY.

You might have the following nslookup session:

$ nslookup
Default Server:  tao.linux.org.au
Address:  203.41.101.121

> metalab.unc.edu
Server:  tao.linux.org.au
Address:  203.41.101.121
	  
Name:    metalab.unc.edu
Address:  152.2.254.81
	  
>

The output first displays the DNS server being queried, and then the result of the query.

If you try to query for a name that has no IP address associated with it, but other records were found in the DNS database, nslookup returns with an error message saying No type A records found. However, you can make it query for records other than type A by issuing the set type command. To get the SOA record of unc.edu, you would issue:

$ unc.edu
Server:  tao.linux.org.au
Address:  203.41.101.121
	  
*** No address (A) records available for unc.edu
> set type=SOA
> unc.edu
Server:  tao.linux.org.au
Address:  203.41.101.121
	  
unc.edu
          origin = ns.unc.edu
          mail addr = host-reg.ns.unc.edu
          serial = 1998111011
          refresh = 14400 (4H)
	  retry   = 3600 (1H)
	  expire  = 1209600 (2W)
	  minimum ttl = 86400 (1D)
unc.edu name server = ns2.unc.edu
unc.edu name server = ncnoc.ncren.net
unc.edu name server = ns.unc.edu
ns2.unc.edu     internet address = 152.2.253.100
ncnoc.ncren.net internet address = 192.101.21.1
ncnoc.ncren.net internet address = 128.109.193.1
ns.unc.edu      internet address = 152.2.21.1

In a similar fashion, you can query for MX records:

> set type=MX
> unc.edu
Server:  tao.linux.org.au
Address:  203.41.101.121
	  
unc.edu preference = 0, mail exchanger = conga.oit.unc.edu
unc.edu preference = 10, mail exchanger = imsety.oit.unc.edu
unc.edu name server = ns.unc.edu
unc.edu name server = ns2.unc.edu
unc.edu name server = ncnoc.ncren.net
conga.oit.unc.edu       internet address = 152.2.22.21
imsety.oit.unc.edu      internet address = 152.2.21.99
ns.unc.edu      internet address = 152.2.21.1
ns2.unc.edu     internet address = 152.2.253.100
ncnoc.ncren.net internet address = 192.101.21.1
ncnoc.ncren.net internet address = 128.109.193.1

Using a type of ANY returns all resource records associated with a given name.

A practical application of nslookup, besides debugging, is to obtain the current list of root name servers. You can obtain this list by querying for all NS records associated with the root domain:

> set type=NS
> .
Server:  tao.linux.org.au
Address:  203.41.101.121
	  
Non-authoritative answer:
(root)  name server = A.ROOT-SERVERS.NET
(root)  name server = H.ROOT-SERVERS.NET
(root)  name server = B.ROOT-SERVERS.NET
(root)  name server = C.ROOT-SERVERS.NET
(root)  name server = D.ROOT-SERVERS.NET
(root)  name server = E.ROOT-SERVERS.NET
(root)  name server = I.ROOT-SERVERS.NET
(root)  name server = F.ROOT-SERVERS.NET
(root)  name server = G.ROOT-SERVERS.NET
(root)  name server = J.ROOT-SERVERS.NET
(root)  name server = K.ROOT-SERVERS.NET
(root)  name server = L.ROOT-SERVERS.NET
(root)  name server = M.ROOT-SERVERS.NET
	  
Authoritative answers can be found from:
A.ROOT-SERVERS.NET      internet address = 198.41.0.4
H.ROOT-SERVERS.NET      internet address = 128.63.2.53
B.ROOT-SERVERS.NET      internet address = 128.9.0.107
C.ROOT-SERVERS.NET      internet address = 192.33.4.12
D.ROOT-SERVERS.NET      internet address = 128.8.10.90
E.ROOT-SERVERS.NET      internet address = 192.203.230.10
I.ROOT-SERVERS.NET      internet address = 192.36.148.17
F.ROOT-SERVERS.NET      internet address = 192.5.5.241
G.ROOT-SERVERS.NET      internet address = 192.112.36.4
J.ROOT-SERVERS.NET      internet address = 198.41.0.10
K.ROOT-SERVERS.NET      internet address = 193.0.14.129
L.ROOT-SERVERS.NET      internet address = 198.32.64.12
M.ROOT-SERVERS.NET      internet address = 202.12.27.33

To see the complete set of available commands, use the help command in nslookup.

Other Useful Tools

There are a number of third party tools that can help you set up, trace, and debug DNS problems. These are not part of the FreeBSD Operating System, but are available through the ports system.

Searching the ports tree for `dns'[31] will show you the utilities available, which can be installed in the usual manner.



[28] If information weren't cached, then DNS would be as inefficient as any other method because each query would involve the root name servers.

[29] Well, almost. A name server has to provide at least name service for localhost and reverse lookups of 127.0.0.1.

[30] Note that you can't query your name server for the root servers if you don't have any root server hints installed. To escape this dilemma, you can either make nslookup use a different name server, or use the sample file in Example 6.7. as a starting point, and then obtain the full list of valid servers.

[31] cd /usr/ports; make search key=dns

Packet protocols like IP or IPX rely upon the receiver host knowing where the start and end of each packet are in the data stream. The mechanism used to mark and detect the start and end of packets is called delimitation. The Ethernet protocol manages this mechanism in a LAN environment, and the SLIP and PPP protocols manage it for serial communications lines.

The comparatively low cost of low-speed dialup modems and telephone circuits has made the serial line IP protocols immensely popular, especially for providing connectivity to end users of the Internet. The hardware required to run SLIP or PPP is simple and readily available. All that is required is a modem and a serial port equipped with a FIFO buffer.

The SLIP protocol is very simple to implement and at one time was the more common of the two. Today almost everyone uses the PPP protocol instead. The PPP protocol adds a host of sophisticated features that contribute to its popularity today, and we'll look at the most important of these later.

Linux supports kernel-based drivers for both SLIP and PPP. The drivers have both been around for some time and are stable and reliable. In this chapter and the next, we'll discuss both protocols and how to configure them.

General Requirements

To use SLIP or PPP, you have to configure some basic networking features as described in the previous chapters. You must set up the loopback interface and configure the name resolver. When connecting to the Internet, you will want to use DNS. Your options here are the same as for PPP: you can perform your DNS queries across your serial link by configuring your Internet Service Provider's IP address into your /etc/resolv.conf file, or configure a caching-only name server as described under the section called “Caching-only named Configuration”, in Chapter 6., Name Service and Resolver Configuration."

SLIP Operation

Dialup IP servers frequently offer SLIP service through special user accounts. After logging in to such an account, you are not dropped into the common shell; instead, a program or shell script is executed that enables the server's SLIP driver for the serial line and configures the appropriate network interface. Then you have to do the same at your end of the link.

On some operating systems, the SLIP driver is a user-space program; under Linux, it is part of the kernel, which makes it a lot faster. This speed requires, however, that the serial line be converted to the SLIP mode explicitly. This conversion is done by means of a special tty line discipline, SLIPDISC. While the tty is in normal line discipline (DISC0), it exchanges data only with user processes, using the normal read(2) and write(2) calls, and the SLIP driver is unable to write to or read from the tty. In SLIPDISC, the roles are reversed: now any user-space processes are blocked from writing to or reading from the tty, while all data coming in on the serial port is passed directly to the SLIP driver.

The SLIP driver itself understands a number of variations on the SLIP protocol. Apart from ordinary SLIP, it also understands CSLIP, which performs the so-called Van Jacobson header compression (described in RFC-1144) on outgoing IP packets. This compression improves throughput for interactive sessions noticeably. There are also six-bit versions for each of these protocols.

A simple way to convert a serial line to SLIP mode is by using the slattach tool. Assume you have your modem on /dev/ttyS3 and have logged in to the SLIP server successfully. You will then execute:

# slattach /dev/ttyS3 &

This tool switches the line discipline of ttyS3 to SLIPDISC and attaches it to one of the SLIP network interfaces. If this is your first active SLIP link, the line will be attached to sl0; the second will be attached to sl1, and so on. The current kernels support a default maximum of 256 simultaneous SLIP links.

The default line discipline chosen by slattach is CSLIP. You may choose any other discipline using the p switch. To use normal SLIP (no compression), you use:

# slattach -p slip /dev/ttyS3 &

The disciplines available are listed in Table 7.1.. A special pseudo-discipline is available called adaptive, which causes the kernel to automatically detect which type of SLIP encapsulation is being used by the remote end.

Table 7.1. Linux Slip-Line Disciplines

Disclipline

Description
slip

Traditional SLIP encapsulation.

cslip

SLIP encapsulation with Van Jacobsen header compression.

slip6

SLIP encapsulation with six-bit encoding. The encoding method is similar to that used by the uuencode command, and causes the SLIP datagram to be converted into printable ASCII characters. This conversion is useful when you do not have a serial link that is eight bit clean.

cslip6

SLIP encapsulation with Van Jacobsen header compression and six-bit encoding.

adaptive

This is not a real line discipline; instead, it causes the kernel to attempt to identify the line discipline being used by the remote machine and to match it.

Note that you must use the same encapsulation as your peer. For example, if cowslip uses CSLIP, you also have to do so. If your SLIP connection doesn't work, the first thing you should do is ensure that both ends of the link agree on whether to use header compression or not. If you are unsure what the remote end is using, try configuring your host for adaptive slip. The kernel might figure out the right type for you.

slattach lets you enable not only SLIP, but other protocols that use the serial line, like PPP or KISS (another protocol used by ham radio people). Doing this is not common, though, and there are better tools available to support these protocols. For details, please refer to the slattach(8) manual page.

After turning over the line to the SLIP driver, you must configure the network interface. Again, you do this using the standard ifconfig and route commands. Assume that we have dialed up a server named cowslip from vlager. On vlager you would execute:

# ifconfig sl0 vlager-slip pointopoint cowslip
	# route add cowslip
	# route add default gw cowslip

The first command configures the interface as a point-to-point link to cowslip, while the second and third add the route to cowslip and the default route, using cowslip as a gateway.

Two things are worth noting about the ifconfig invocation: The pointopoint option that specifies the address of the remote end of a point-to-point link and our use of vlager-slip as the address of the local SLIP interface.

We have mentioned that you can use the same address you assigned to vlager's Ethernet interface for your SLIP link, as well. In this case, vlager-slip might just be another alias for address 172.16.1.1. However, it is also possible that you have to use an entirely different address for your SLIP link. One such case is when your network uses an unregistered IP network address, as the Brewery does. We will return to this scenario in greater detail in the next section.

For the remainder of this chapter we will always use vlager-slip to refer to the address of the local SLIP interface.

When taking down the SLIP link, you should first remove all routes through cowslip using route with the del option, then take the interface down, and send slattach the hangup signal. The you must hang up the modem using your terminal program again:

# route del default
	# route del cowslip
	# ifconfig sl0 down
	# kill -HUP 516

Note that the 516 should be replaced with the process id (as shown in the output of ps ax) of the slattach command controlling the slip device you wish to take down.

Dealing with Private IP Networks

You will remember from Chapter 5., Configuring TCP/IP Networking, that the Virtual Brewery has an Ethernet-based IP network using unregistered network numbers that are reserved for internal use only. Packets to or from one of these networks are not routed on the Internet; if we were to have vlager dial into cowslip and act as a router for the Virtual Brewery network, hosts within the Brewery's network could not talk to real Internet hosts directly because their packets would be dropped silently by the first major router.

To work around this dilemma, we will configure vlager to act as a kind of launch pad for accessing Internet services. To the outside world, it will present itself as a normal SLIP-connected Internet host with a registered IP address (probably assigned by the network provider running cowslip). Anyone logged in to vlager can use text-based programs like ftp, telnet, or even lynx to make use of the Internet. Anyone on the Virtual Brewery LAN can therefore telnet and log in to vlager and use the programs there. For some applications, there may be solutions that avoid logging in to vlager. For WWW users, for example, we could run a so-called proxy server on vlager, which would relay all requests from your users to their respective servers.

Having to log in to vlager to make use of the Internet is a little clumsy. But apart from eliminating the paperwork (and cost) of registering an IP network, it has the added benefit of going along well with a firewall setup. Firewalls are dedicated hosts used to provide limited Internet access to users on your local network without exposing the internal hosts to network attacks from the outside world. Simple firewall configuration is covered in more detail in Chapter 9., TCP/IP Firewall. In Chapter 11., Network Address Translation, we'll discuss a Linux feature called IP masquerade that provides a powerful alternative to proxy servers.

Assume that the Brewery has been assigned the IP address 192.168.5.74 for SLIP access. All you have to do to realize that the setup discussed above is to enter this address into your /etc/hosts file, naming it vlager-slip. The procedure for bringing up the SLIP link itself remains unchanged.

Using dip

Now that was rather simple. Nevertheless, you might want to automate the steps previously described. It would be much better to have a simple command that performs all the steps necessary to open the serial device, cause the modem to dial the provider, log in, enable the SLIP line discipline, and configure the network interface. This is what the dip command is for.

dip means Dialup IP. It was written by Fred van Kempen and has been patched very heavily by a number of people. Today there is one strain that is used by almost everyone: Version dip337p-uri, which is included with most modern Linux distributions, or is available from the metalab.unc.edu FTP archive.

dip provides an interpreter for a simple scripting language that can handle the modem for you, convert the line to SLIP mode, and configure the interfaces. The script language is powerful enough to suit most configurations.

To be able to configure the SLIP interface, dip requires root privilege. It would now be tempting to make dip setuid to root so that all users can dial up some SLIP server without having to give them root access. This is very dangerous, though, because setting up bogus interfaces and default routes with dip may disrupt routing on your network. Even worse, this action would give your users power to connect to any SLIP server and launch dangerous attacks on your network. If you want to allow your users to fire up a SLIP connection, write small wrapper programs for each prospective SLIP server and have these wrappers invoke dip with the specific script that establishes the connection. Carefully written wrapper programs can then safely be made setuid to root.[32] An alternative, more flexible approach is to give trusted users root access to dip using a program like sudo.

A Sample Script

Assume that the host to which we make our SLIP connection is cowslip, and that we have written a script for dip to run called cowslip.dip that makes our connection. We invoke dip with the script name as argument:

# dip cowslip.dip
	  DIP: Dialup IP Protocol Driver version 3.3.7 (12/13/93)
	  Written by Fred N. van Kempen, MicroWalt Corporation.
	  connected to cowslip.moo.com with addr 192.168.5.74
	  #

The script itself is shown in Example 7.1..

Example 7.1. A Sample dip Script

# Sample dip script for dialing up cowslip
	  # Set local and remote name and address
	  get $local vlager-slip
	  get $remote cowslip
	  port ttyS3                # choose a serial port
	  speed 38400              # set speed to max
	  modem HAYES              # set modem type
	  reset                    # reset modem and tty
	  flush                    # flush out modem response
	  # Prepare for dialing.
	  send ATQ0V1E1X1\r
	  wait OK 2
	  if $errlvl != 0 goto error
	  dial 41988
	  if $errlvl != 0 goto error
	  wait CONNECT 60
	  if $errlvl != 0 goto error
	  # Okay, we're connected now
	  sleep 3
	  send \r\n\r\n
	  wait ogin: 10
	  if $errlvl != 0 goto error
	  send Svlager\n
	  wait ssword: 5
	  if $errlvl != 0 goto error
	  send knockknock\n
	  wait running 30
	  if $errlvl != 0 goto error
	  # We have logged in, and the remote side is firing up SLIP.
	  print Connected to $remote with address $rmtip
	  default                  # Make this link our default route
	  mode SLIP                # We go to SLIP mode, too
	  # fall through in case of error
	  error:
	  print SLIP to $remote failed.

After connecting to cowslip and enabling SLIP, dip will detach from the terminal and go to the background. You can then start using the normal networking services on the SLIP link. To terminate the connection, simply invoke dip with the k option. This sends a hangup signal to dip, using the process ID dip records in /etc/dip.pid:

# dip -k

In dip's scripting language, keywords prefixed with a dollar symbol denote variable names. dip has a predefined set of variables, which will be listed below. $remote and $local, for instance, contain the hostnames of the remote and local hosts involved in the SLIP link.

The first two statements in the sample script are get commands, which is dip's way to set a variable. Here, the local and remote hostnames are set to vlager and cowslip, respectively.

The next five statements set up the terminal line and the modem. reset sends a reset string to the modem. The next statement flushes out the modem response so that the login chat in the next few lines works properly. This chat is pretty straightforward: it simply dials 41988, the phone number of cowslip, and logs in to the account Svlager using the password knockknock. The wait command makes dip wait for the string given as its first argument; the number given as its second argument makes the wait time out after that many seconds if no such string is received. The if commands interspersed in the login procedure check that no error occurred while executing the command.

The final commands executed after logging in are default, which makes the SLIP link the default route to all hosts, and mode, which enables SLIP mode on the line and configures the interface and routing table for you.

A dip Reference

In this section, we will give a reference for most of dip's commands. You can get an overview of all the commands it provides by invoking dip in test mode and entering the help command. To learn about the syntax of a command, you may enter it without any arguments. Remember that this does not work with commands that take no arguments. The following example illustrates the help command:

# dip -t
	  DIP: Dialup IP Protocol Driver version 3.3.7p-uri (25 Dec 96)
	  Written by Fred N. van Kempen, MicroWalt Corporation.
	  Debian version 3.3.7p-2 (debian).
	  
	  DIP> help
	  DIP knows about the following commands:
	  
	  beep         bootp        break        chatkey      config       
	  databits     dec          default      dial         echo         
	  flush        get          goto         help         if           
	  inc          init         mode         modem        netmask      
	  onexit       parity       password     proxyarp     print        
	  psend        port         quit         reset        securidfixed 
	  securid      send         shell        skey         sleep        
	  speed        stopbits     term         timeout      wait         
	  
	  DIP> echo
	  Usage: echo on|off
	  DIP>

Throughout the following section, examples that display the DIP> prompt show how to enter a command in test mode and what output it produces. Examples lacking this prompt should be taken as script excerpts.

The modem commands

dip provides a number of commands that configure your serial line and modem. Some of these are obvious, such as port, which selects a serial port, and speed, databits, stopbits, and parity, which set the common line parameters. The modem command selects a modem type. Currently, the only type supported is HAYES (capitalization required). You have to provide dip with a modem type, or else it will refuse to execute the dial and reset commands. The reset command sends a reset string to the modem; the string used depends on the modem type selected. For Hayes-compatible modems, this string is ATZ.

The flush code can be used to flush out all responses the modem has sent so far. Otherwise, a chat script following reset might be confused because it reads the OK responses from earlier commands.

The init command selects an initialization string to be passed to the modem before dialing. The default for Hayes modems is ATE0 Q0 V1 X1, which turns on echoing of commands and long result codes, and selects blind dialing (no checking of dial tone). Modern modems have a good factory default configuration, so this is a little unnecessary, though it does no harm.

The dial command sends the initialization string to the modem and dials up the remote system. The default dial command for Hayes modems is ATD.

The echo command

The echo command serves as a debugging aid. Calling echo on makes dip echo to the console everything it sends to the serial device. This can be turned off again by calling echo off.

dip also allows you to leave script mode temporarily and enter terminal mode. In this mode, you can use dip just like any ordinary terminal program, writing the characters you type to the serial line, reading data from the serial line, and displaying the characters. To leave this mode, enter Ctrl-].

The get command

The get command is dip's way of setting a variable. The simplest form is to set a variable to a constant, as we did in cowslip.dip. You may, however, also prompt the user for input by specifying the keyword ask instead of a value:

DIP> get $local ask
	    Enter the value for $local: _

A third method is to obtain the value from the remote host. Bizarre as it seems at first, this is very useful in some cases. Some SLIP servers will not allow you to use your own IP address on the SLIP link, but will rather assign you one from a pool of addresses whenever you dial in, printing some message that informs you about the address you have been assigned. If the message looks something like Your address: 192.168.5.74, the following piece of dip code would let you pick up the address:

# finish login
	    wait address: 10
	    get $locip remote

The print command

This is the command used to echo text to the console from which dip was started. Any of dip's variables may be used in print commands. Here's an example:

DIP> print Using port $port at speed $speed
	    Using port ttyS3 at speed 38400

Variable names

dip understands only a predefined set of variables. A variable name always begins with a dollar symbol and must be written in lowercase letters.

The $local and $locip variables contain the local host's name and IP address. When you store the canonical hostname in $local, dip will automatically attempt to resolve the hostname to an IP address and to store it in the $locip variable. A similar but backward process occurs when you assign an IP address to the $locip variable; dip will attempt to perform a reverse lookup to identify the name of the host and store it in the $local variable.

The $remote and $rmtip variables operate in the same way for the remote host's name and address. $mtu contains the MTU value for the connection.

These five variables are the only ones that may be assigned values directly using the get command. A number of other variables are set as a result of the configuration commands bearing the same name, but may be used in print statements; these variables are $modem, $port, and $speed.

$errlvl is the variable through which you can access the result of the last command executed. An error level of 0 indicates success, while a nonzero value denotes an error.

The if and goto commands

The if command is a conditional branch, rather than a full-featured programming if statement. Its syntax is:

if var op number goto label

The expression must be a simple comparison between one of the variables $errlvl, $locip, and $rmtip. var must be an integer number; the operator op may be one of ==, !=, <, >, <=, and >=.

The goto command makes the execution of the script continue at the line following that bearing the label. A label must be the first word on the line and must be followed immediately by a colon.

send, wait, and sleep

These commands help implement simple chat scripts in dip. The send command outputs its arguments to the serial line. It does not support variables, but understands all C-style backslash character sequences, such as \n for newline and \b for backspace. The tilde character (~) can be used as an abbreviation for carriage return/newline.

The wait command takes a word as an argument and will read all input on the serial line until it detects a sequence of characters that match this word. The word itself may not contain any blanks. Optionally, you may give wait a timeout value as a second argument; if the expected word is not received within that many seconds, the command will return with an $errlvl value of 1. This command is used to detect login and other prompts.

The sleep command may be used to wait for a certain amount of time; for instance, to patiently wait for any login sequence to complete. Again, the interval is specified in seconds.

mode and default

These commands are used to flip the serial line to SLIP mode and configure the interface.

The mode command is the last command executed by dip before going into daemon mode. Unless an error occurs, the command does not return.

mode takes a protocol name as argument. dip currently recognizes SLIP, CSLIP, SLIP6, CSLIP6, PPP, and TERM as valid names. The current version of dip does not understand adaptive SLIP, however.

After enabling SLIP mode on the serial line, dip executes ifconfig to configure the interface as a point-to-point link, and invokes route to set the route to the remote host.

If, in addition, the script executes the default command before mode, dip creates a default route that points to the SLIP link.

Running in Server Mode

Setting up your SLIP client was the hard part. Configuring your host to act as a SLIP server is much easier.

There are two ways of configuring a SLIP server. Both ways require that you set up one login account per SLIP client. Assume you provide SLIP service to Arthur Dent at dent.beta.com. You might create an account named dent by adding the following line to your passwd file:

dent:*:501:60:Arthur Dent's SLIP account:/tmp:/usr/sbin/diplogin

Afterwards, you would set dent's password using the passwd utility.

The dip command can be used in server mode by invoking it as diplogin. Usually diplogin is a link to dip. Its main configuration file is /etc/diphosts, which is where you specify what IP address a SLIP user will be assigned when he or she dials in. Alternatively, you can also use the sliplogin command, a BSD-derived tool featuring a more flexible configuration scheme that lets you execute shell scripts whenever a host connects and disconnects.

When our SLIP user dent logs in, dip starts up as a server. To find out if he is indeed permitted to use SLIP, it looks up the username in /etc/diphosts. This file details the access rights and connection parameter for each SLIP user. The general format for an /etc/diphosts entry looks like:

# /etc/diphosts
	user:password:rem-addr:loc-addr:netmask:comments:protocol,MTU
	#

Each of the fields is described in Table 7.2..

Table 7.2. /etc/diphosts Field Description

FieldDescription
user

The username of the user invoking dip that this entry will apply to.

password

Field 2 of the /etc/diphosts file is used to add an extra layer of password-based security on the connection. You can place a password in encrypted form here (just as in /etc/passwd) and diplogin will prompt for the user to enter the password before allowing SLIP access. Note that this password is used in addition to the normal login-based password the user will enter.

rem-addr

The address that will be assigned to the remote machine. This address may be specified either as a hostname that will be resolved or an IP address in dotted quad notation.

loc-addr

The IP address that will be used for this end of the SLIP link. This may also be specified as a resolvable hostname or in dotted quad format.

netmask

The netmask that will be used for routing purposes. Many people are confused by this entry. The netmask doesn't apply to the SLIP link itself, but is used in combination with the rem-addr field to produce a route to the remote site. The netmask should be that used by the network supported by that of the remote host.

comments

This field is free-form text that you may use to help document the /etc/diphosts file. It serves no other purpose.

protocol

This field is where you specify what protocol or line discipline you want applied to this connection. Valid entries here are the same as those valid for the p argument to the slattach command.

MTU

The maximum transmission unit that this link will carry. This field describes the largest datagram that will be transmitted across the link. Any datagram routed to the SLIP device that is larger than the MTU will be fragmented into datagrams no larger than this value. Usually, the MTU is configured identically at both ends of the link.

A sample entry for dent could look like this:

dent::dent.beta.com:vbrew.com:255.255.255.0:Arthur Dent:CSLIP,296

Our example gives our user dent access to SLIP with no additional password required. He will be assigned the IP address associated with dent.beta.com with a netmask of 255.255.255.0. His default route should be directed to the IP address of vbrew.com, and he will use the CSLIP protocol with an MTU of 296 bytes.

When dent logs in, diplogin extracts the information on him from the diphosts file. If the second field contains a value, diplogin will prompt for an external security password. The string entered by the user is encrypted and compared to the password from diphosts. If they do not match, the login attempt is rejected. If the password field contains the string s/key, and dip was compiled with S/Key support, S/Key authentication will take place. S/Key authentication is described in the documentation that comes in the dip source package.

After a successful login, diplogin proceeds by flipping the serial line to CSLIP or SLIP mode, and sets up the interface and route. This connection remains established until the user disconnects and the modem drops the line. diplogin then returns the line to normal line discipline and exits.

diplogin requires superuser privilege. If you don't have dip running setuid root, you should make diplogin a separate copy of dip instead of a simple link. diplogin can then safely be made setuid without affecting the status of dip itself.



[32] diplogin must be run as setuid to root, too. See the section at the end of this chapter.

Like SLIP, PPP is a protocol used to send datagrams across a serial connection; however, it addresses a couple of the deficiencies of SLIP. First, it can carry a large number of protocols and is thus not limited to the IP protocol. It provides error detection on the link itself, while SLIP accepts and forwards corrupted datagrams as long as the corruption does not occur in the header. Equally important, it lets the communicating sides negotiate options, such as the IP address and the maximum datagram size at startup time, and provides client authorization. This built-in negotiation allows reliable automation of the connection establishment, while the authentication removes the need for the clumsy user login accounts that SLIP requires. For each of these capabilities, PPP has a separate protocol. In this chapter, we briefly cover these basic building blocks of PPP. This discussion of PPP is far from complete; if you want to know more about PPP, we urge you to read its RFC specification and the dozen or so companion RFCs.[33] There is also a comprehensive O'Reilly book on the topic of Using & Managing PPP, by Andrew Sun.

At the very bottom of PPP is the High-Level Data Link Control (HDLC) protocol, which defines the boundaries around the individual PPP frames and provides a 16-bit checksum.[34] As opposed to the more primitive SLIP encapsulation, a PPP frame is capable of holding packets from protocols other than IP, such as Novell's IPX or Appletalk. PPP achieves this by adding a protocol field to the basic HDLC frame that identifies the type of packet carried by the frame.

The Link Control Protocol, (LCP) is used on top of HDLC to negotiate options pertaining to the data link. For instance, the Maximum Receive Unit (MRU), states the maximum datagram size that one side of the link agrees to receive.

An important step at the configuration stage of a PPP link is client authorization. Although it is not mandatory, it is really a must for dialup lines in order to keep out intruders. Usually the called host (the server) asks the client to authorize itself by proving it knows some secret key. If the caller fails to produce the correct secret, the connection is terminated. With PPP, authorization works both ways; the caller may also ask the server to authenticate itself. These authentication procedures are totally independent of each other. There are two protocols for different types of authorization, which we will discuss further in this chapter: Password Authentication Protocol (PAP) and Challenge Handshake Authentication Protocol (CHAP).

Each network protocol that is routed across the data link (like IP and AppleTalk) is configured dynamically using a corresponding Network Control Protocol (NCP). To send IP datagrams across the link, both sides running PPP must first negotiate which IP address each of them uses. The control protocol used for this negotiation is the Internet Protocol Control Protocol (IPCP).

Besides sending standard IP datagrams across the link, PPP also supports Van Jacobson header compression of IP datagrams. This technique shrinks the headers of TCP packets to as little as three bytes. It is also used in CSLIP, and is more colloquially referred to as VJ header compression. The use of compression may be negotiated at startup time through IPCP, as well.

PPP on Linux

On Linux, PPP functionality is split into two parts: a kernel component that handles the low-level protocols (HDLC, IPCP, IPXCP, etc.) and the user space pppd daemon that handles the various higher-level protocols, such as PAP and CHAP. The current release of the PPP software for Linux contains the PPP daemon pppd and a program named chat that automates the dialing of the remote system.

The PPP kernel driver was written by Michael Callahan and reworked by Paul Mackerras. pppd was derived from a free PPP implementation[35] for Sun and 386BSD machines that was written by Drew Perkins and others, and is maintained by Paul Mackerras. It was ported to Linux by Al Longyear. chat was written by Karl Fox.[36]

Like SLIP, PPP is implemented by a special line discipline. To use a serial line as a PPP link, you first establish the connection over your modem as usual, and subsequently convert the line to PPP mode. In this mode, all incoming data is passed to the PPP driver, which checks the incoming HDLC frames for validity (each HDLC frame carries a 16-bit checksum), and unwraps and dispatches them. Currently, PPP is able to transport both the IP protocol, optionally using Van Jacobson header compression, and the IPX protocol.

pppd aids the kernel driver, performing the initialization and authentication phase that is necessary before actual network traffic can be sent across the link. pppd's behavior may be fine-tuned using a number of options. As PPP is rather complex, it is impossible to explain all of them in a single chapter. This book therefore cannot cover all aspects of pppd, but only gives you an introduction. For more information, consult Using & Managing PPP or the pppd manual pages, and READMEs in the pppd source distribution, which should help you sort out most questions this chapter fails to discuss. The PPP-HOWTO might also be of use.

Probably the greatest help you will find in configuring PPP will come from other users of the same Linux distribution. PPP configuration questions are very common, so try your local usergroup mailing list or the IRC Linux channel. If your problems persist even after reading the documentation, you could try the comp.protocols.ppp newsgroup. This is the place where you can find most of the people involved in pppd development.

Running pppd

When you want to connect to the Internet through a PPP link, you have to set up basic networking capabilities, such as the loopback device and the resolver. Both have been covered in Chapter 5., Configuring TCP/IP Networking, and Chapter 6., Name Service and Resolver Configuration. You can simply configure the name server of your Internet Service Provider in the /etc/resolv.conf file, but this will mean that every DNS request is sent across your serial link. This situation is not optimal; the closer (network-wise) you are to your name server, the faster the name lookups will be. An alternative solution is to configure a caching-only name server at a host on your network. This means that the first time you make a DNS query for a particular host, your request will be sent across your serial link, but every subsequent request will be answered directly by your local name server, and will be much faster. This configuration is described in Chapter 6, in the section called “Caching-only named Configuration”.

As an introductory example of how to establish a PPP connection with pppd, assume you are at vlager again. First, dial in to the PPP server c3po and log in to the ppp account. c3po will execute its PPP driver. After exiting the communications program you used for dialing, execute the following command, substituting the name of the serial device you used for the ttyS3 shown here:

# pppd /dev/ttyS3 38400 crtscts defaultroute

This command flips the serial line ttyS3 to the PPP line discipline and negotiates an IP link with c3po. The transfer speed used on the serial port will be 38,400 bps. The crtscts option turns on hardware handshake on the port, which is an absolute must at speeds above 9,600 bps.

The first thing pppd does after starting up is negotiate several link characteristics with the remote end using LCP. Usually, the default set of options pppd tries to negotiate will work, so we won't go into this here. Expect to say that part of this negotiation involves requesting or assigning the IP addresses at each end of the link.

For the time being, we also assume that c3po doesn't require any authentication from us, so the configuration phase is completed successfully.

pppd will then negotiate the IP parameters with its peer using IPCP, the IP control protocol. Since we didn't specify any particular IP address to pppd earlier, it will try to use the address obtained by having the resolver look up the local hostname. Both will then announce their addresses to each other.

Usually, there's nothing wrong with these defaults. Even if your machine is on an Ethernet, you can use the same IP address for both the Ethernet and the PPP interface. Nevertheless, pppd allows you to use a different address, or even to ask your peer to use some specific address. These options are discussed later in the the section called “IP Configuration Options” section.

After going through the IPCP setup phase, pppd will prepare your host's networking layer to use the PPP link. It first configures the PPP network interface as a point-to-point link, using ppp0 for the first PPP link that is active, ppp1 for the second, and so on. Next, it sets up a routing table entry that points to the host at the other end of the link. In the previous example, pppd made the default network route point to c3po, because we gave it the defaultroute option.[37] The default route simplifies your routing by causing any IP datagram destined to a nonlocal host to be sent to c3po; this makes sense since it is the only way they can be reached. There are a number of different routing schemes pppd supports, which we will cover in detail later in this chapter.

Using Options Files

Before pppd parses its command-line arguments, it scans several files for default options. These files may contain any valid command-line arguments spread out across an arbitrary number of lines. Hash signs introduce comments.

The first options file is /etc/ppp/options, which is always scanned when pppd starts up. Using it to set some global defaults is a good idea, because it allows you to keep your users from doing several things that may compromise security. For instance, to make pppd require some kind of authentication (either PAP or CHAP) from the peer, you add the auth option to this file. This option cannot be overridden by the user, so it becomes impossible to establish a PPP connection with any system that is not in your authentication databases. Note, however, that some options can be overridden; the connect string is a good example.

The other options file, which is read after /etc/ppp/options, is .ppprc in the user's home directory. It allows each user to specify her own set of default options.

A sample /etc/ppp/options file might look like this:

# Global options for pppd running on vlager.vbrew.com
	lock                 # use UUCP-style device locking
	auth                 # require authentication
	usehostname          # use local hostname for CHAP
	domain vbrew.com     # our domain name

The lock keyword makes pppd comply to the standard UUCP method of device locking. With this convention, each process that accesses a serial device, say /dev/ttyS3, creates a lock file with a name like LCK..ttyS3 in a special lock-file directory to signal that the device is in use. This is necessary to prevent signal other programs, such as minicom or uucico, from opening the serial device while it is used by PPP.

The next three options relate to authentication and, therefore, to system security. The authentication options are best placed in the global configuration file because they are privileged and cannot be overridden by users' ~/.ppprc options files.

Using chat to Automate Dialing

One of the things that may have struck you as inconvenient in the previous example is that you had to establish the connection manually before you could fire up pppd. Unlike dip, pppd does not have its own scripting language for dialing the remote system and logging in, but relies on an external program or shell script to do this. The command to be executed can be given to pppd with the connect command-line option. pppd will redirect the command's standard input and output to the serial line.

The pppd software package is supplied with a very simple program called chat, which is capable of being used in this way to automate simple login sequences. We'll talk about this command in some detail.

If your login sequence is complex, you will need something more powerful than chat. One useful alternative you might consider is expect, written by Don Libes. It has a very powerful language based on Tcl, and was designed exactly for this sort of application. Those of you whose login sequence requires, for example, challenge/response authentication involving calculator-like key generators will find expect powerful enough to handle the task. Since there are so many possible variations on this theme, we won't describe how to develop an appropriate expect script in this book. Suffice it to say, you'd call your expect script by specifying its name using the pppd connect option. It's also important to note that when the script is running, the standard input and output will be attached to the modem, not to the terminal that invoked pppd. If you require user interaction, you should manage it by opening a spare virtual terminal, or arrange some other means.

The chat command lets you specify a UUCP-style chat script. Basically, a chat script consists of an alternating sequence of strings that we expect to receive from the remote system, and the answers we are to send. We will call them expect and send strings, respectively. This is a typical excerpt from a chat script:

ogin: b1ff ssword: s3|<r1t

This script tells chat to wait for the remote system to send the login prompt and return the login name b1ff. We wait only for ogin: so that it doesn't matter if the login prompt starts with an uppercase or lowercase l, or if it arrives garbled. The following string is another expect string that makes chat wait for the password prompt and send our response password.

This is basically what chat scripts are all about. A complete script to dial up a PPP server would, of course, also have to include the appropriate modem commands. Assume that your modem understands the Hayes command set, and the server's telephone number is 318714. The complete chat invocation to establish a connection with c3po would then be:

$ chat -v '' ATZ OK ATDT318714 CONNECT '' ogin: ppp word: GaGariN

By definition, the first string must be an expect string, but as the modem won't say anything before we have kicked it, we make chat skip the first expect by specifying an empty string. We then send ATZ, the reset command for Hayes-compatible modems, and wait for its response (OK). The next string sends the dial command along with the phone number to chat, and expects the CONNECT message in response. This is followed by an empty string again because we don't want to send anything now, but rather wait for the login prompt. The remainder of the chat script works exactly as described previously. This description probably looks a bit confusing, but we'll see in a moment that there is a way to make chat scripts a lot easier to understand.

The v option makes chat log all activities to the syslog daemon local2 facility.[38]

Specifying the chat script on the command line bears a certain risk because users can view a process's command line with the ps command. You can avoid this risk by putting the chat script in a file like dial-c3po. You make chat read the script from the file instead of the command line by giving it the f option, followed by the filename. This action has the added benefit of making our chat expect sequences easier to understand. To convert our example, our dial-c3po file would look like:

''      ATZ
	OK      ATDT318714
	CONNECT ''
	ogin:   ppp
	word:   GaGariN

When we use a chat script file in this way, the string we expect to receive is on the left and the response we will send is on the right. They are much easier to read and understand when presented this way.

The complete pppd incantation would now look like this:

# pppd connect "chat -f dial-c3po" /dev/ttyS3 38400 -detach \
	  crtscts modem defaultroute

Besides the connect option that specifies the dialup script, we have added two more options to the command line: detach, which tells pppd not to detach from the console and become a background process, and the modem keyword, which makes it perform modem-specific actions on the serial device, like disconnecting the line before and after the call. If you don't use this keyword, pppd will not monitor the port's DCD line and will therefore not detect whether the remote end hangs up unexpectedly.

The examples we have shown are rather simple; chat allows for much more complex scripts. For instance, it can specify strings on which to abort the chat with an error. Typical abort strings are messages like BUSY or NO CARRIER that your modem usually generates when the called number is busy or doesn't answer. To make chat recognize these messages immediately rather than timing out, you can specify them at the beginning of the script using the ABORT keyword:

$ chat -v ABORT BUSY ABORT 'NO CARRIER' '' ATZ OK ...

Similarly, you can change the timeout value for parts of the chat scripts by inserting TIMEOUT options.

Sometimes you also need to have conditional execution for parts of the chat script: when you don't receive the remote end's login prompt, you might want to send a BREAK or a carriage return. You can achieve this by appending a subscript to an expect string. The subscript consists of a sequence of send and expect strings, just like the overall script itself, which are separated by hyphens. The subscript is executed whenever the expected string it is appended to is not received in time. In the example above, we would modify the chat script as follows:

ogin:-BREAK-ogin: ppp ssword: GaGariN

When chat doesn't see the remote system send the login prompt, the subscript is executed by first sending a BREAK, and then waiting for the login prompt again. If the prompt now appears, the script continues as usual; otherwise, it will terminate with an error.

IP Configuration Options

IPCP is used to negotiate a number of IP parameters at link configuration time. Usually, each peer sends an IPCP Configuration Request packet, indicating which values it wants to change from the defaults and the new value. Upon receipt, the remote end inspects each option in turn and either acknowledges or rejects it.

pppd gives you a lot of control over which IPCP options it will try to negotiate. You can tune it through various command-line options that we will discuss in this section.

Choosing IP Addresses

All IP interfaces require IP addresses assigned to them; a PPP device always has an IP address. The PPP suite of protocols provides a mechanism that allows the automatic assignment of IP addresses to PPP interfaces. It is possible for the PPP program at one end of a point-to-point link to assign an IP address for the remote end to use, or each may use its own.

Some PPP servers that handle a lot of client sites assign addresses dynamically; addresses are assigned to systems only when calling in and are reclaimed after they have logged off again. This allows the number of IP addresses required to be limited to the number of dialup lines. While limitation is convenient for managers of the PPP dialup server, it is often less convenient for users who are dialing in. We discussed the way that hostnames are mapped to IP addresses by use of a database in Chapter 6., Name Service and Resolver Configuration. In order for people to connect to your host, they must know your IP address or the hostname associated with it. If you are a user of a PPP service that assigns you an IP address dynamically, this knowledge is difficult without providing some means of allowing the DNS database to be updated after you are assigned an IP address. Such systems do exist, but we won't cover them in detail here; instead, we will look at the more preferable approach, which involves you being able to use the same IP address each time you establish your network connection.[39]

In the previous example, we had pppd dial up c3po and establish an IP link. No provisions were taken to choose a particular IP address on either end of the link. Instead, we let pppd take its default action. It attempts to resolve the local hostname, vlager in our example, to an IP address, which it uses for the local end, while letting the remote machine, c3po, provide its own. PPP supports several alternatives to this arrangement.

To ask for particular addresses, you generally provide pppd with the following option:

local_addr:remote_addr

local_addr and remote_addr may be specified either in dotted quad notation or as hostnames.[40] This option makes pppd attempt to use the first address supplied as its own IP address, and the second as the peer's. If the peer rejects either of the addresses during IPCP negotiation, no IP link will be established.[41]

If you are dialing in to a server and expect it to assign you an IP address, you should ensure that pppd does not attempt to negotiate one for itself. To do this, use the noipdefault option and leave the local_addr blank. The noipdefault option will stop pppd from trying to use the IP address associated with the hostname as the local address.

If you want to set only the local address but accept any address the peer uses, simply leave out the remote_addr part. To make vlager use the IP address 130.83.4.27 instead of its own, give it 130.83.4.27: on the command line. Similarly, to set the remote address only, leave the local_addr field blank. By default, pppd will then use the address associated with your hostname.

Routing Through a PPP Link

After setting up the network interface, pppd will usually set up a host route to its peer only. If the remote host is on a LAN, you certainly want to be able to connect to hosts behind your peer as well; in that case, a network route must be set up.

We have already seen that pppd can be asked to set the default route using the defaultroute option. This option is very useful if the PPP server you dialed up acts as your Internet gateway.

The reverse case, in which your system acts as a gateway for a single host, is also relatively easy to accomplish. For example, take some employee at the Virtual Brewery whose home machine is called oneshot. Let's also assume that we've configured vlager as a dialin PPP server. If we've configured vlager to dynamically assign an IP address that belongs to the Brewery's subnet, then we can use the proxyarp option with pppd, which will install a proxy ARP entry for oneshot. This automatically makes oneshot accessible from all hosts at the Brewery and the Winery.

However, things aren't always that simple. Linking two local area networks usually requires adding a specific network route because these networks may have their own default routes. Besides, having both peers use the PPP link as the default route would generate a loop, through which packets to unknown destinations would ping-pong between the peers until their time to live expired.

Suppose the Virtual Brewery opens a branch in another city. The subsidiary runs an Ethernet of its own using the IP network number 172.16.3.0, which is subnet 3 of the Brewery's class B network. The subsidiary wants to connect to the Brewery's network via PPP to update customer databases. Again, vlager acts as the gateway for the brewery network and will support the PPP link; its peer at the new branch is called vbourbon and has an IP address of 172.16.3.1. This network is illustrated in Figure A.2. in Appendix A..

When vbourbon connects to vlager, it makes the default route point to vlager as usual. On vlager, however, we will have only the point-to-point route to vbourbon and will have to specially configure a network route for subnet 3 that uses vbourbon as its gateway. We could do this manually using the route command by hand after the PPP link is established, but this is not a very practical solution. Fortunately, we can configure the route automatically by using a feature of pppd that we haven't discussed yetthe ip-up command. This command is a shell script or program located in /etc/ppp that is executed by pppd after the PPP interface has been configured. When present, it is invoked with the following parameters:

ip-up iface device speed local_addr remote_addr

The following table summarizes the meaning of each of the arguments (in the first column, we show the number used by the shell script to refer to each argument):

ArgumentNamePurpose
1iface

The network interface used, e.g., ppp0

2device

The pathname of the serial device file used (/dev/tty, if stdin/stdout are used)

3speed

The speed of the serial device in bits per second

4local_addr

The IP address of the link's remote end in dotted quad notation

5remote_addr

The IP address of the remote end of the link in dotted quad notation

In our case, the ip-up script may contain the following code fragment:[42]

#!/bin/sh
	  case $5 in
	  172.16.3.1)            # this is vbourbon
	  route add -net 172.16.3.0 gw 172.16.3.1;;
	  ...
	  esac
	  exit 0

Similarly, /etc/ppp/ip-down can be used to undo any actions of ip-up after the PPP link has been taken down again. So in our /etc/ppp/ip-down script we would have a route command that removed the route we created in the /etc/ppp/ip-up script.

However, the routing scheme is not yet complete. We have set up routing table entries on both PPP hosts, but so far none of the hosts on either network knows anything about the PPP link. This is not a big problem if all hosts at the subsidiary have their default route pointing at vbourbon, and all Brewery hosts route to vlager by default. If this is not the case, your only option is usually to use a routing daemon like gated. After creating the network route on vlager, the routing daemon broadcasts the new route to all hosts on the attached subnets.

Link Control Options

We already encountered the Link Control Protocol (LCP), which is used to negotiate link characteristics and test the link.

The two most important options negotiated by LCP are the Asynchronous Control Character Map and the Maximum Receive Unit. There are a number of other LCP configuration options, but they are far too specialized to discuss here.

The Asynchronous Control Character Map, colloquially called the async map, is used on asynchronous links, such as telephone lines, to identify control characters that must be escaped (replaced by a specific two-character sequence) to avoid them being interpreted by equipment used to establish the link. For instance, you may want to avoid the XON and XOFF characters used for software handshake because a misconfigured modem might choke upon receipt of an XOFF. Other candidates include Ctrl-l (the telnet escape character). PPP allows you to escape any of the characters with ASCII codes 0 through 31 by specifying them in the async map.

The async map is a 32-bit-wide bitmap expressed in hexadecimal. The least significant bit corresponds to the ASCII NULL character, and the most significant bit corresponds to ASCII 31 decimal. These 32 ASCII characters are the control characters. If a bit is set in the bitmap, it signals that the corresponding character must be escaped before it is transmitted across the link.

To tell your peer that it doesn't have to escape all control characters, but only a few of them, you can specify an async map to pppd using the asyncmap option. For example, if only ^S and ^Q (ASCII 17 and 19, commonly used for XON and XOFF) must be escaped, use the following option:

asyncmap 0x000A0000

The conversion is simple as long as you can convert binary to hex. Lay out 32 bits in front of you. The right-most bit corresponds to ASCII 00 (NULL), and the left-most bit corresponds to ASCII 32 decimal. Set the bits corresponding to the characters you want escaped to one, and all others to zero. To convert that into the hexadecimal number pppd expects, simply take each set of 4 bits and convert them into hex. You should end up with eight hexadecimal figures. String them all together and preprend 0x to signify it is a hexadecimal number, and you are done.

Initially, the async map is set to 0xffffffffthat is, all control characters will be escaped. This is a safe default, but is usually much more than you need. Each character that appears in the async map results in two characters being transmitted across the link, so escaping comes at the cost of increased link utilization and a corresponding performance reduction.

In most circumstances, an async map of 0x0 works fine. No escaping is performed.

The Maximum Receive Unit (MRU), signals to the peer the maximum size of HDLC frames we want to receive. Although this may remind you of the Maximum Transfer Unit (MTU) value, these two have little in common. The MTU is a parameter of the kernel networking device and describes the maximum frame size the interface is able to transmit. The MRU is more of an advice to the remote end not to generate frames larger than the MRU; the interface must nevertheless be able to receive frames of up to 1,500 bytes.

Choosing an MRU is therefore not so much a question of what the link is capable of transferring, but of what gives you the best throughput. If you intend to run interactive applications over the link, setting the MRU to values as low as 296 is a good idea, so that an occasional larger packet (say, from an FTP session) doesn't make your cursor jump. To tell pppd to request an MRU of 296, you give it the option mru 296. Small MRUs, however, make sense only if you have VJ header compression (it is enabled by default), because otherwise you'd waste a large amount of your bandwidth just carrying the IP header for each datagram.

pppd also understands a couple of LCP options that configure the overall behavior of the negotiation process, such as the maximum number of configuration requests that may be exchanged before the link is terminated. Unless you know exactly what you are doing, you should leave these options alone.

Finally, there are two options that apply to LCP echo messages. PPP defines two messages, Echo Request and Echo Response. pppd uses this feature to check if a link is still operating. You can enable this by using the lcp-echo-interval option together with a time in seconds. If no frames are received from the remote host within this interval, pppd generates an Echo Request and expects the peer to return an Echo Response. If the peer does not produce a response, the link is terminated after a certain number of requests are sent. This number can be set using the lcp-echo-failure option. By default, this feature is disabled altogether.

General Security Considerations

A misconfigured PPP daemon can be a devastating security breach. It can be as bad as letting anyone plug their machine into your Ethernet (and that can be very bad). In this section, we discuss a few measures that should make your PPP configuration safe.

Note

Root privilege is required to configure the network device and routing table. You will usually solve this by running pppd setuid root. However, pppd allows users to set various security-relevant options.

To protect against any attacks a user may launch by manipulating pppd options, you should set a couple of default values in the global /etc/ppp/options file, like those shown in the sample file in the section called “Using Options Files”, earlier in this chapter. Some of them, such as the authentication options, cannot be overridden by the user, and thus provide reasonable protection against manipulations. An important option to protect is the connect option. If you intend to allow non-root users to invoke pppd to connect to the Internet, you should always add the connect and noauth options to the global options file /etc/ppp/options. If you fail to do this, users will be able to execute arbitrary commands with root privileges by specifying the command as their connect command on the pppd line or in their personal options file.

Another good idea is to restrict which users may execute pppd by creating a group in /etc/group and adding only those users who you wish to have the ability to execute the PPP daemon. You should then change group ownership of the pppd daemon to that group and remove the world execute privileges. To do this, assuming you've called your group dialout, you could use something like:

# chown root /usr/sbin/pppd
	# chgrp dialout /usr/sbin/pppd
	# chmod 4750 /usr/sbin/pppd

Of course, you have to protect yourself from the systems you speak PPP with, too. To fend off hosts posing as someone else, you should always require some sort of authentication from your peer. Additionally, you should not allow foreign hosts to use any IP address they choose, but restrict them to at most a few. The following section will deal with these topics in detail.

Authentication with PPP

With PPP, each system may require its peer to authenticate itself using one of two authentication protocols: the Password Authentication Protocol (PAP), and the Challenge Handshake Authentication Protocol (CHAP). When a connection is established, each end can request the other to authenticate itself, regardless of whether it is the caller or the callee. In the description that follows, we will loosely talk of client and server when we want to distinguish between the system sending authentication requests and the system responding to them. A PPP daemon can ask its peer for authentication by sending yet another LCP configuration request identifying the desired authentication protocol.

PAP Versus CHAP

PAP, which is offered by many Internet Service Providers, works basically the same way as the normal login procedure. The client authenticates itself by sending a username and a (optionally encrypted) password to the server, which the server compares to its secrets database.[43] This technique is vulnerable to eavesdroppers, who may try to obtain the password by listening in on the serial line, and to repeated trial and error attacks.

CHAP does not have these deficiencies. With CHAP, the server sends a randomly generated challenge string to the client, along with its hostname. The client uses the hostname to look up the appropriate secret, combines it with the challenge, and encrypts the string using a one-way hashing function. The result is returned to the server along with the client's hostname. The server now performs the same computation, and acknowledges the client if it arrives at the same result.

CHAP also doesn't require the client to authenticate itself only at startup time, but sends challenges at regular intervals to make sure the client hasn't been replaced by an intruder, for instance by switching phone lines, or because of a modem configuration error that causes the PPP daemon not to notice that the original phone call has dropped out and someone else has dialed in.

pppd keeps the secret keys for PAP and CHAP in two separate files called /etc/ppp/pap-secrets and /etc/ppp/chap-secrets. By entering a remote host in one or the other file, you have fine control over whether PAP or CHAP is used to authenticate yourself with your peer, and vice versa.

By default, pppd doesn't require authentication from the remote host, but it will agree to authenticate itself when requested by the remote host. Since CHAP is so much stronger than PAP, pppd tries to use the former whenever possible. If the peer does not support it, or if pppd can't find a CHAP secret for the remote system in its chap-secrets file, it reverts to PAP. If it doesn't have a PAP secret for its peer either, it refuses to authenticate altogether. As a consequence, the connection is shut down.

You can modify this behavior in several ways. When given the auth keyword, pppd requires the peer to authenticate itself. pppd agrees to use either CHAP or PAP as long as it has a secret for the peer in its CHAP or PAP database. There are other options to turn a particular authentication protocol on or off, but I won't describe them here.

If all systems you talk to with PPP agree to authenticate themselves with you, you should put the auth option in the global /etc/ppp/options file and define passwords for each system in the chap-secrets file. If a system doesn't support CHAP, add an entry for it to the pap-secrets file. That way, you can make sure no unauthenticated system connects to your host.

The next two sections discuss the two PPP secrets files, pap-secrets and chap-secrets. They are located in /etc/ppp and contain triplets of clients, servers, and passwords, optionally followed by a list of IP addresses. The interpretation of the client and server fields is different for CHAP and PAP, and also depends on whether we authenticate ourselves with the peer, or whether we require the server to authenticate itself with us.

The CHAP Secrets File

When it has to authenticate itself with a server using CHAP, pppd searches the chap-secrets file for an entry with the client field equal to the local hostname, and the server field equal to the remote hostname sent in the CHAP challenge. When requiring the peer to authenticate itself, the roles are simply reversed: pppd then looks for an entry with the client field equal to the remote hostname (sent in the client's CHAP response), and the server field equal to the local hostname.

The following is a sample chap-secrets file for vlager:[44]

# CHAP secrets for vlager.vbrew.com
	  #
	  # client         server           secret                addrs
	  #---------------------------------------------------------------------
	  vlager.vbrew.com  c3po.lucas.com   "Use The Source Luke" vlager.vbrew.com
	  c3po.lucas.com    vlager.vbrew.com "arttoo! arttoo!"     c3po.lucas.com
	  *                 vlager.vbrew.com "TuXdrinksVicBitter"  pub.vbrew.com

When vlager establishes a PPP connection with c3po, c3po asks vlager to authenticate itself by sending a CHAP challenge. pppd on vlager then scans chap-secrets for an entry with the client field equal to vlager.vbrew.com and the server field equal to c3po.lucas.com, and finds the first line shown in the example.[45] It then produces the CHAP response from the challenge string and the secret (Use The Source Luke), and sends it off to c3po.

pppd also composes a CHAP challenge for c3po containing a unique challenge string and its fully qualified hostname, vlager.vbrew.com. c3po constructs a CHAP response in the way we discussed, and returns it to vlager. pppd then extracts the client hostname (c3po.vbrew.com) from the response and searches the chap-secrets file for a line matching c3po as a client and vlager as the server. The second line does this, so pppd combines the CHAP challenge and the secret arttoo! arttoo!, encrypts them, and compares the result to c3po's CHAP response.

The optional fourth field lists the IP addresses that are acceptable for the client named in the first field. The addresses can be given in dotted quad notation or as hostnames that are looked up with the resolver. For instance, if c3po asks to use an IP address during IPCP negotiation that is not in this list, the request is rejected, and IPCP is shut down. In the sample file shown above, c3po is therefore limited to using its own IP address. If the address field is empty, any addresses are allowed; a value of - prevents the use of IP with that client altogether.

The third line of the sample chap-secrets file allows any host to establish a PPP link with vlager because a client or server field of * is a wildcard matching any hostname. The only requirements are that the connecting host must know the secret and that it must use the IP address associated with pub.vbrew.com. Entries with wildcard hostnames may appear anywhere in the secrets file, since pppd will always use the best match it can find for the server/client pair.

pppd may need some help forming hostnames. As explained before, the remote hostname is always provided by the peer in the CHAP challenge or response packet. The local hostname is obtained by calling the gethostname(2) function by default. If you have set the system name to your unqualified hostname, you also have to provide pppd with the domain name using the domain option:

# pppd  domain vbrew.com

This provision appends the Brewery's domain name to vlager for all authentication related activities. Other options that modify pppd's idea of the local hostname are usehostname and name. When you give the local IP address on the command line using local:remote and local as a name instead of a dotted quad, pppd uses this as the local hostname.

The PAP Secrets File

The PAP secrets file is very similar to CHAP's. The first two fields always contain a username and a server name; the third holds the PAP secret. When the remote host sends its authentication information, pppd uses the entry that has a server field equal to the local hostname, and a user field equal to the username sent in the request. When it is necessary for us to send our credentials to the peer, pppd uses the secret that has a user field equal to the local username and the server field equal to the remote hostname.

A sample PAP secrets file might look like this:

# /etc/ppp/pap-secrets
	  #
	  # user          server          secret          addrs
	  vlager-pap      c3po            cresspahl       vlager.vbrew.com
	  c3po            vlager          DonaldGNUth     c3po.lucas.com

The first line is used to authenticate ourselves when talking to c3po. The second line describes how a user named c3po has to authenticate itself with us.

The name vlager-pap in the first column is the username we send to c3po. By default, pppd picks the local hostname as the username, but you can also specify a different name by giving the user option followed by that name.

When picking an entry from the pap-secrets file to identify us to a remote host, pppd must know the remote host's name. As it has no way of finding that out, you must specify it on the command line using the remotename keyword followed by the peer's hostname. To use the above entry for authentication with c3po, for example, we must add the following option to pppd's command line:

# pppd ... remotename c3po user vlager-pap

In the fourth field of the PAP secrets file (and all following fields), you can specify what IP addresses are allowed for that particular host, just as in the CHAP secrets file. The peer will be allowed to request only addresses from that list. In the sample file, the entry that c3po will use when it dials inthe line where c3po is the clientallows it to use its real IP address and no other.

Note that PAP is a rather weak authentication method, you should use CHAP instead whenever possible. We will therefore not cover PAP in greater detail here; if you are interested in using it, you will find more PAP features in the pppd(8) manual page.

Debugging Your PPP Setup

By default, pppd logs any warnings and error messages to syslog's daemon facility. You have to add an entry to syslog.conf that redirects these messages to a file or even the console; otherwise, syslog simply discards them. The following entry sends all messages to /var/log/ppp-log:

daemon.*                /var/log/ppp-log

If your PPP setup doesn't work right away, you should look in this log file. If the log messages don't help, you can also turn on extra debugging output using the debug option. This output makes pppd log the contents of all control packets sent or received to syslog. All messages then go to the daemon facility.

Finally, the most drastic way to check a problem is to enable kernel-level debugging by invoking pppd with the kdebug option. It is followed by a numeric argument that is the sum of the following values: 1 for general debug messages, 2 for printing the contents of all incoming HDLC frames, and 4 to make the driver print all outgoing HDLC frames. To capture kernel debugging messages, you must either run a syslogd daemon that reads the /proc/kmsg file, or the klogd daemon. Either of them directs kernel debugging to the syslog kernel facility.

More Advanced PPP Configurations

While configuring PPP to dial in to a network like the Internet is the most common application, there are those of you who have more advanced requirements. In this section we'll talk about a few of the more advanced configurations possible with PPP under Linux.

PPP Server

Running pppd as a server is just a matter of configuring a serial tty device to invoke pppd with appropriate options when an incoming data call has been received. One way to do this is to create a special account, say ppp, and give it a script or program as a login shell that invokes pppd with these options. Alternatively, if you intend to support PAP or CHAP authentication, you can use the mgetty program to support your modem and exploit its /AutoPPP/ feature.

To build a server using the login method, you add a line similar to the following to your /etc/passwd file:[46]

ppp:x:500:200:Public PPP Account:/tmp:/etc/ppp/ppplogin

If your system supports shadow passwords, you also need to add an entry to the /etc/shadow file:

ppp:!:10913:0:99999:7:::

Of course, the UID and GID you use depends on which user you wish to own the connection, and how you've created it. You also have to set the password for the mentioned account using the passwd command.

The ppplogin script might look like this:

#!/bin/sh
	  # ppplogin - script to fire up pppd on login
	  mesg n
	  stty -echo
	  exec pppd -detach silent modem crtscts

The mesg command disables other users from writing to the tty by using, for instance, the write command. The stty command turns off character echoing. This command is necessary; otherwise, everything the peer sends would be echoed back to it. The most important pppd option given is detach because it prevents pppd from detaching from the controlling tty. If we didn't specify this option, it would go to the background, making the shell script exit. This in turn would cause the serial line to hang up and the connection to be dropped. The silent option causes pppd to wait until it receives a packet from the calling system before it starts sending. This option prevents transmit timeouts from occurring when the calling system is slow in firing up its PPP client. The modem option makes pppd drive the modem control lines of the serial port. You should always turn this option on when using pppd with a modem. The crtscts option turns on hardware handshake.

Besides these options, you might want to force some sort of authentication, for example, by specifying auth on pppd's command line or in the global options file. The manual page also discusses more specific options for turning individual authentication protocols on and off.

If you wish to use mgetty, all you need to do is configure mgetty to support the serial device your modem is connected to (see the section called “Configuring the mgetty Daemon” for details), configure pppd for either PAP or CHAP authentication with appropriate options in its options file, and finally, add a section similar to the following to your /etc/mgetty/login.config file:

# Configure mgetty to automatically detect incoming PPP calls and invoke
	  # the pppd daemon to handle the connection.
	  #
	  /AutoPPP/ -     ppp   /usr/sbin/pppd auth -chap +pap login

The first field is a special piece of magic used to detect that an incoming call is a PPP one. You must not change the case of this string; it is case sensitive. The third column is the username that appears in who listings when someone has logged in. The rest of the line is the command to invoke. In our example, we've ensured that PAP authentication is required, disabled CHAP, and specified that the system passwd file should be used for authenticating users. This is probably similar to what you'll want. Remember, you can specify the options in the options file or on the command line if you prefer.

Here is a small checklist of tasks to perform and the sequence you should perform them to get PPP dial in working on your machine. Make sure each step works before moving on to the next:

  1. Configure the modem for auto-answer mode. On Hayes-compatible modems, this is performed using a command like ATS0=3. If you're going to be using the mgetty daemon, this isn't necessary.

  2. Configure the serial device with a getty type of command to answer incoming calls. A commonly used getty variant is mgetty.

  3. Consider authentication. Will your callers authenticate using PAP, CHAP, or system login?

  4. Configure pppd as server as described in this section.

  5. Consider routing. Will you need to provide a network route to callers? Routing can be performed using the ip-up script.

Demand Dialing

When there is IP traffic to be carried across the link, demand dialing causes your telephone modem to dial and to establish a connection to a remote host. Demand dialing is most useful when you can't leave your telephone line permanently switched to your Internet provider. For example, you might have to pay timed local calls, so it might be cheaper to have the telephone line switched on only when you need it and disconnected when you aren't using the Internet.

Traditional Linux solutions have used the diald command, which worked well but was fairly tricky to configure. Versions 2.3.0 and later of the PPP daemon have built-in support for demand dialing and make it very simple to configure. You must use a modern kernel for this to work, too. Any of the later 2.0 kernels will work just fine.

To configure pppd for demand dialing, all you need to do is add options to your options file or the pppd command line. The following table summarizes the options related to demand dialing:

OptionDescription
demand

This option specifies that the PPP link should be placed in demand dial mode. The PPP network device will be created, but the connect command will not be used until a datagram is transmitted by the local host. This option is mandatory for demand dialing to work.

active-filterexpression

This option allows you to specify which data packets are to be considered active traffic. Any traffic matching the specified rule will restart the demand dial idle timer, ensuring that pppd waits again before closing the link. The filter syntax has been borrowed from the tcpdump command. The default filter matches all datagrams.

holdoffn

This option allows you to specify the minimum amount of time, in seconds, to wait before reconnecting this link if it terminates. If the connection fails while pppd believes it is in active use, it will be re-established after this timer has expired. This timer does not apply to reconnections after an idle timeout.

idlen

If this option is configured, pppd will disconnect the link whenever this timer expires. Idle times are specified in seconds. Each new active data packet will reset the timer.

A simple demand dialing configuration would therefore look something like this:

demand
	  holdoff 60
	  idle 180

This configuration would enable demand dialing, wait 60 seconds before re-establishing a failed connection, and drop the link if 180 seconds pass without any active data on the link.

Persistent Dialing

Persistent dialing is what people who have permanent dialup connections to a network will want to use. There is a subtle difference between demand dialing and persistent dialing. With persistent dialing, the connection is automatically established as soon as the PPP daemon is started, and the persistent aspect comes into play whenever the telephone call supporting the link fails. Persistent dialing ensures that the link is always available by automatically rebuilding the connection if it fails.

You might be fortunate to not have to pay for your telephone calls; perhaps they are local and free, or perhaps they're paid by your company. The persistent dialing option is extremely useful in this situation. If you do have to pay for your telephone calls, then you have to be a little careful. If you pay for your telephone calls on a time-charged basis, persistent dialing is almost certainly not what you want, unless you're very sure you'll be using the connection fairly steadily twenty-four hours a day. If you do pay for calls, but they are not time charged, you need to be careful to protect yourself against situations that might cause the modem to endlessly redial. The pppd daemon provides an option that can help reduce the effects of this problem.

To enable persistent dialing, you must include the persist option in one of your pppd options files. Including this option alone is all you need to have pppd automatically invoke the command specified by the connect option to rebuild the connection when the link fails. If you are concerned about the modem redialing too rapidly (in the case of modem or server fault at the other end of the connection), you can use the holdoff option to set the minimum amount of time that pppd will wait before attempting to reconnect. This option won't solve the problem of a fault costing you money in wasted phone calls, but it will at least serve to reduce the impact of one.

A typical configuration might have persistent dialing options that look like this:

persist
	  holdoff 600

The holdoff time is specified in seconds. In our example, pppd waits a full five minutes before redialing after the call drops out.

It is possible to combine persistent dialing with demand dialing, using idle to drop the link if it has been idle for a specified period of time. We doubt many users would want to do so, but this scenario is described briefly in the pppd manual page, if you'd like to pursue it.



[33] Relevant RFCs are listed in the Bibiliography at the end of this book.

[34] In fact, HDLC is a much more general protocol devised by the International Standards Organization (ISO) and is also an essential component of the X.25 specification.

[35] If you have any general questions about PPP, ask the people on the Linux-net mailing list at vger.rutgers.edu.

[36] Karl can be reached at karl@morningstar.com.

[37] The default network route is installed only if none is already present.

[38] If you edit syslog.conf to redirect these log messages to a file, make sure this file isn't world readable, as chat also logs the entire chat script by defaultincluding passwords.

[39] More information on two dynamic host assignment mechanisms can be found at http://www.dynip.com/ and http://www.justlinux.com/dynamic_dns.html.

[40] Using hostnames in this option has consequences for CHAP authentication. Please refer to the the section called “Authentication with PPP” section later in this chapter.

[41] The ipcp-accept-local and ipcp-accept-remote options instruct your pppd to accept the local and remote IP addresses being offered by the remote PPP, even if you've supplied some in your configuration. If these options are not configured, your pppd will reject any attempt to negotiate the IP addresses used.

[42] If we wanted to have routes for other sites created when they dial in, we'd add appropriate case statements to cover those in which the ... appears in the example.

[43] Secret is just the PPP name for passwords. PPP secrets don't have the same length limitation as Linux login passwords.

[44] The double quotes are not part of the secret; they merely serve to protect the whitespace within it.

[45] This hostname is taken from the CHAP challenge.

[46] The useradd or adduser utility, if you have it, will simplify this task.

Security is increasingly important for companies and individuals alike. The Internet has provided them with a powerful tool to distribute information about themselves and obtain information from others, but it has also exposed them to dangers that they have previously been exempt from. Computer crime, information theft, and malicious damage are all potential dangers.

An unauthorized and unscrupulous person who gains access to a computer system may guess system passwords or exploit the bugs and idiosyncratic behavior of certain programs to obtain a working account on that machine. Once they are able to log in to the machine, they may have access to information that may be damaging, such as commercially sensitive information like marketing plans, new project details, or customer information databases. Damaging or modifying this type of data can cause severe setbacks to the company.

The safest way to avoid such widespread damage is to prevent unauthorized people from gaining network access to the machine. This is where firewalls come in.

Warning

Constructing secure firewalls is an art. It involves a good understanding of technology, but equally important, it requires an understanding of the philosophy behind firewall designs. We won't cover everything you need to know in this book; we strongly recommend you do some additional research before trusting any particular firewall design, including any we present here.

There is enough material on firewall configuration and design to fill a whole book, and indeed there are some good resources that you might like to read to expand your knowledge on the subject. Two of these are:

Building Internet Firewalls

by Elizabeth D. Zwicky, Simon Cooper, and D. Brent Chapman (O'Reilly). A guide explaining how to design and install firewalls for Unix, Linux, and Windows NT, and how to configure Internet services to work with the firewalls.

Firewalls and Internet Security

by W. Cheswick and S. Bellovin (Addison Wesley). This book covers the philosophy of firewall design and implementation.

We will focus on the FreeBSD-specific technical issues in this chapter. Later we will present a sample firewall configuration that should serve as a useful starting point in your own configuration, but as with all security-related matters, trust no one. Double check the design, make sure you understand it, and then modify it to suit your requirements. To be safe, be sure.

Methods of Attack

As a network administrator, it is important that you understand the nature of potential attacks on computer security. We'll briefly describe the most important types of attacks so that you can better understand precisely what the FreeBSD IP firewall will protect you against. You should do some additional reading to ensure that you are able to protect your network against other types of attacks. Here are some of the more important methods of attack and ways of protecting yourself against them:

Unauthorized access

This simply means that people who shouldn't use your computer services are able to connect and use them. For example, people outside your company might try to connect to your company accounting machine or to your NFS server.

There are various ways to avoid this attack by carefully specifying who can gain access through these services. You can prevent network access to all except the intended users.

Exploitation of known weaknesses in programs

Some programs and network services were not originally designed with strong security in mind and are inherently vulnerable to attack. The BSD remote services (rlogin, rexec, etc.) are an example.

The best way to protect yourself against this type of attack is to disable any vulnerable services or find alternatives. With Open Source, it is sometimes possible to repair the weaknesses in the software.

Denial of service

Denial of service attacks cause the service or program to cease functioning or prevent others from making use of the service or program. These may be performed at the network layer by sending carefully crafted and malicious datagrams that cause network connections to fail. They may also be performed at the application layer, where carefully crafted application commands are given to a program that cause it to become extremely busy or stop functioning.

Preventing suspicious network traffic from reaching your hosts and preventing suspicious program commands and requests are the best ways of minimizing the risk of a denial of service attack. It's useful to know the details of the attack method, so you should educate yourself about each new attack as it gets publicized.

Spoofing

This type of attack causes a host or application to mimic the actions of another. Typically the attacker pretends to be an innocent host by following IP addresses in network packets. For example, a well-documented exploit of the BSD rlogin service can use this method to mimic a TCP connection from another host by guessing TCP sequence numbers.

To protect against this type of attack, verify the authenticity of datagrams and commands. Prevent datagram routing with invalid source addresses. Introduce unpredictablility into connection control mechanisms, such as TCP sequence numbers and the allocation of dynamic port addresses.

Eavesdropping

This is the simplest type of attack. A host is configured to "listen" to and capture data not belonging to it. Carefully written eavesdropping programs can take usernames and passwords from user login network connections. Broadcast networks like Ethernet are especially vulnerable to this type of attack.

To protect against this type of threat, avoid use of broadcast network technologies and enforce the use of data encryption.

IP firewalling is very useful in preventing or reducing unauthorized access, network layer denial of service, and IP spoofing attacks. It not very useful in avoiding exploitation of weaknesses in network services or programs and eavesdropping.

What Is a Firewall?

A firewall is a secure and trusted machine that sits between a private network and a public network.[47] The firewall machine is configured with a set of rules that determine which network traffic will be allowed to pass and which will be blocked or refused. In some large organizations, you may even find a firewall located inside their corporate network to segregate sensitive areas of the organization from other employees. Many cases of computer crime occur from within an organization, not just from outside.

Firewalls can be constructed in quite a variety of ways. The most sophisticated arrangement involves a number of separate machines and is known as a perimeter network. Two machines act as "filters" called chokes to allow only certain types of network traffic to pass, and between these chokes reside network servers such as a mail gateway or a World Wide Web proxy server. This configuration can be very safe and easily allows quite a great range of control over who can connect both from the inside to the outside, and from the outside to the inside. This sort of configuration might be used by large organizations.

Typically though, firewalls are single machines that serve all of these functions. These are a little less secure, because if there is some weakness in the firewall machine itself that allows people to gain access to it, the whole network security has been breached. Nevertheless, these types of firewalls are cheaper and easier to manage than the more sophisticated arrangement just described. Figure 9.1. illustrates the two most common firewall configurations.

Figure 9.1. The two major classes of firewall design

The FreeBSD kernel provides a range of built-in features that allow it to function quite nicely as an IP firewall. The network implementation includes code to do IP filtering in a number of different ways, and provides a mechanism to quite accurately configure what sort of rules you'd like to put in place. A FreeBSD firewall is flexible enough to make it very useful in either of the configurations illustrated in Figure 9.1.. FreeBSD firewall software provides two other useful features that we'll discuss in separate chapters: IP Accounting (Chapter 10., IP Accounting) and IP masquerade (Chapter 11., Network Address Translation).

What Is IP Filtering?

IP filtering is simply a mechanism that decides which types of IP datagrams will be processed normally and which will be discarded. By discarded we mean that the datagram is deleted and completely ignored, as if it had never been received. You can apply many different sorts of criteria to determine which datagrams you wish to filter; some examples of these are:

  • Protocol type: TCP, UDP, ICMP, etc.

  • Socket number (for TCP/UPD)

  • Datagram type: SYN/ACK, data, ICMP Echo Request, etc.

  • Datagram source address: where it came from

  • Datagram destination address: where it is going to

It is important to understand at this point that IP filtering is a network layer facility. This means it doesn't understand anything about the application using the network connections, only about the connections themselves. For example, you may deny users access to your internal network on the default telnet port, but if you rely on IP filtering alone, you can't stop them from using the telnet program with a port that you do allow to pass trhough your firewall. You can prevent this sort of problem by using proxy servers for each service that you allow across your firewall. The proxy servers understand the application they were designed to proxy and can therefore prevent abuses, such as using the telnet program to get past a firewall by using the World Wide Web port. If your firewall supports a World Wide Web proxy, their telnet connection will always be answered by the proxy and will allow only HTTP requests to pass. A large number of proxy-server programs exist. Some are free software and many others are commercial products. The Firewall-HOWTO discusses one popular set of these, but they are beyond the scope of this book.

The IP filtering ruleset is made up of many combinations of the criteria listed previously. For example, let's imagine that you wanted to allow World Wide Web users within the Virtual Brewery network to have no access to the Internet except to use other sites' web servers. You would configure your firewall to allow forwarding of:

  • datagrams with a source address on Virtual Brewery network, a destination address of anywhere, and with a destination port of 80 (WWW)

  • datagrams with a destination address of Virtual Brewery network and a source port of 80 (WWW) from a source address of anywhere

Note that we've used two rules here. We have to allow our data to go out, but also the corresponding reply data to come back in.

Setting up FreeBSD for Firewalling

FreeBSD supports two different packet filtering firewalls “out of the box”, ipfw and IPFilter, they have slightly different functionality and feature support.

ipfw is specific to FreeBSD, while IPFilter is available for a variety of different operating systems. So if your network consists of hosts running different operating systems you may wish to use IPFilter, so that you only need to learn one firewall configuration language.

Both applications support Network Address Translation (NAT, sometimes also referred to as IP Masquerading). ipfw does this using a daemon that runs in userland, while IPFilter does so in the kernel, which can be faster and more efficient if you are transmitting and receiving a lot of information.

ipfw has an interface towards “dummynet”. This is FreeBSD functionality that allows you to apply rules that specify how much traffic can flow through a particular interface, to enforce bandwidth constraints and quality of service guarantees. IPFilter does not have this interface.

Both applications support tracking state, allowing you to define rules that implement policies such as “Allow incoming UDP traffic from any host that I've sent UDP traffic to in the past 60 seconds”. IPFilter's state tracking is more extensive than ipfw's, which may be important to you.

ipfw and IPFilter can be run at the same time, and you can use features from them both together. So you might choose to use ipfw for it's dummynet support, while using IPFilter to carry out filtering and NAT.

To enable one or both of these firewalls you must include specific options in your kernel configuration file, and then build, install, and boot from the new kernel.

Kernel configuration for ipfw

The following entries enable ipfw in the kernel, and configure some related options.

options IPFIREWALL

This option is required to enable the ipfw code in the kernel.

Strictly speaking that's not true. The ipfw functionality is also available in a kernel module, ipfw.ko. If you load the module in to the kernel after booting then you can configure the firewall as normal. However, this is really only of benefit if you are developing variations on the core firewall code, and want to be able to load and unload various versions of the module without needing to reboot the system. For a production firewall you would always compile the firewall code in to the kernel.

options IPFIREWALL_VERBOSE

This option enables logging within ipfw. ipfw supports the log keyword within the firewall rules, which can be used to log information about packets that match particular rules. However, this keyword does nothing if this option is not.

options IPFIREWALL_VERBOSE_LIMIT=n

In conjunction with IPFIREWALL_VERBOSE, this option controls how many matching packets will be logged (per rule) before the logging is disabled.

Your firewall rules can also specify this using the logamount parameter, so IPFIREWALL_VERBOSE_LIMIT acts as the hard limit for those firewall rules that specify log but do not specify logamount.

This value may be changed while the system is running using sysctl. The appropriate variable is net.inet.ip.fw.verbose_limit.

options IPFIREWALL_DEFAULT_TO_ACCEPT

ipfw's default configuration (i.e., when you have provided no firewall rules of your own), is to deny all packets. This is also the default if a packet matches none of your defined firewall rules. This is a sensible default, as generally you would rather have the firewall block all packets in the case of misconfiguration, rather than allowing all packets.

This option reverses that default, so that all packets will be allowed in the absence of any explicit rules that deny the packet. This may be a potential security problem, depending on your system, but can also be useful if you are managing a host's firewall configuration remotely, where you do not want to risk accidentally firewalling yourself from the remote host.

options IPFIREWALL_FORWARD

This option allows you to use the fwd keyword in your ipfw rules. This keyword is used to change the next-hop address on matching packets, allowing you to divert traffic to the host and port of your choice.

For example, if your site runs a web cache, you might decide to divert all outbound HTTP traffic (where the traffic is destined for port 80 on a remote host) to your local web cache. This forces all the hosts behind the firewall to use your web cache, irrespective of their browser settings.

options IPSTEALTH

By using this option your firewalls have the option of not decrementing the packets time-to-live (ttl) value. This value is used by tools such as traceroute, and can be useful if you wish to hide the presence of your firewall from the outside world.

Kernel configuration for IPFilter

The following entries enable IPFilter in the kernel, and configure some related options.

options IPFILTER

This option is required to enable the IPFilter code in the kernel.

option IPFILTER_LOG

Enables support for the /dev/ipl device, which IPFilter logs packets to, and which IPFilter applications such as ipmon use to display firewall logs. Without this option IPFilter's logging functionality will be disabled.

options IPFILTER_DEFAULT_BLOCK

IPFilter's default rules will allow any packets through the firewall (this is contrary to ipfw's default). To reverse this, and deny packets by default, set this option.

Three Ways We Can Do Filtering

Consider how a Unix machine, or in fact any machine capable of IP routing, processes IP datagrams. The basic steps, shown in Figure 9.2. are:

Figure 9.2. The stages of IP datagram processing

  • The IP datagram is received. (1)

  • The incoming IP datagram is examined to determine if it is destined for a process on this machine.

  • If the datagram is for this machine, it is processed locally. (2)

  • If it is not destined for this machine, a search is made of the routing table for an appropriate route and the datagram is forwarded to the appropriate interface or dropped if no route can be found. (3)

  • Datagrams from local processes are sent to the routing software for forwarding to the appropriate interface. (4)

  • The outgoing IP datagram is examined to determine if there is a valid route for it to take, if not, it is dropped.

  • The IP datagram is transmitted. (5)

In our diagram, the flow 135 represents our machine routing data between a host on our Ethernet network to a host reachable via our PPP link. The flows 12 and 45 represent the data input and output flows of a network program running on our local host. The flow 432 would represent data flow via a loopback connection. Naturally data flows both into and out of network devices. The question marks on the diagram represent the points where the IP layer makes routing decisions.

The Linux kernel IP firewall is capable of applying filtering at various stages in this process. That is, you can filter the IP datagrams that come in to your machine, filter those datagrams being forwarded across your machine, and filter those datagrams that are ready to be transmitted.

In ipfwadm and ipchains, an Input rule applies to flow 1 on the diagram, a Forwarding rule to flow 3, and an Output rule to flow 5. We'll see when we discuss netfilter later that the points of interception have changed so that an Input rule is applied at flow 2, and an Output rule is applied at flow 4. This has important implications for how you structure your rulesets, but the general principle holds true for all versions of Linux firewalling.

This may seem unnecessarily complicated at first, but it provides flexibility that allows some very sophisticated and powerful configurations to be built.

Basics of IP Packet Filtering

Broadly speaking, there are three different types of IP traffic that you will want to filter, irrespective of the technology that you choose to do so.

Internet Control Message Protocol (ICMP)

ICMP is the mechanism that IP hosts use to indicate and control error conditions and send diagnostic information back and forth. Incorrectly filtering ICMP can leave your firewall host (and the network it protects) unable to access portions of the Internet. There are various types of ICMP message, including "destination unreachable", "time-to-live exceeded", and "source quench".

However, some ICMP message are less critical. As an example, the ping(8) utility sends ICMP "echo request" messages, and listens for the corresponding "echo reply" message. It is possible for an attacker to send a large number of these (a "ping flood") as part of a denial of service attack, so filtering these can be beneficial.

Each ICMP message has three parameters that can be used to identify it; the IP address of the host that generated the message (the source), the IP address of the host that the message has been sent to (the destination), and the ICMP message type.

Transmission Control Protocol (TCP)

TCP is probably the most widely used protocol on the Internet. A TCP connection sets up a virtual circuit between two ports on two hosts. This gives each TCP packet four values that can be used to uniquely identify it for filtering; the IP address of the host that generated the packet (the source IP), the port number that the packet was sent from (the source port), the IP address of the host that the packet is being sent to (the destination IP), and the port number that the packet is being sent to (the destination port).

In addition, a TCP connection can be in one of three states; it is either being set up (in which case the two hosts are negotiating the connection information between themselves), established (the connection has been set up, and information is being sent back and forth), or being closed (the hosts are making sure that all information that has been sent has been received, and the resources used by the connection are freed).

This can be used to make your firewall rules simpler. You can safely permit all TCP traffic that has been established (irrespective of the hosts the connection originated with) with a single rule, and then create rules that only allow the set up of specific connections from or two specific hosts and ports.

User Datagram Protocol (UDP)

UDP is the second most common protocol on the Internet. In shares the same four properties as TCP (source IP and port, destination IP and port), but, unlike TCP, UDP has no concept of a connection. A host simply sends a UDP packet to its destination. This means that UDP does not support the setup/established/closing states that TCP does.

This can make UDP harder to firewall correctly. For example, the DNS generally uses UDP. Your local DNS server will send DNS requests out to other DNS servers using UDP, and will expect UDP replies to its queries. Your firewall will need to allow incoming UDP from these servers in order for the DNS to function properly. Since there's no way of knowing which DNS servers will be queried day-to-day, this can open up a gaping hole in your firewall.

To work around this problem, many firewalls now support what is called stateful-filtering. A stateful firewall watches outgoing UDP traffic, and creates firewall rules on the fly to allow incoming UDP traffic from the original destination for a limited period of time. After that period has elapsed the dynamic firewall rules are deleted. In effect, this opens up small holes in the firewall to each host that you send UDP to, and closes those holes after a minute or so.

DNS is not the only protocol that requires this. Many of today's Internet games use UDP, and benefit from this feature.

Building an ipfw firewall

ipfw works by passing each packet received on any interface through a series of rules. The rules govern whether or not the packet will be allowed through, or rejected, and whether or not any dummynet (trafic shaping) or packet divert rules (for things like NAT) will be applied.

This section will concentrate on simple packet filtering. Dummynet and NAT have their own chapters later.

ipfw Rules

ifpw processes packets by applying rules to them. Each rule has a number; rules are generally processed in sequence, although it is possible for one rule to cause processing to jump to a rule further down the list. Each rule has an associated action (allow or deny), and processing stops as soon as a matching rule is found.

In other words, the action for the first matching rule determines how the packet is processed.

ipfw rules have the following general appearance.

action protocol from src to dst options

Note

This is a simplified example to introduce the basic concepts. Other options will be introduced later.

Two of the allowed values for action are:

allow

Allow any packets that match this rule to pass. Synonyms for this are pass, pass, permit, and accept.

deny

Prevent packets that match this rule from passing. A synonym for this is drop.

src and dst specify the source and destination of the packets respectively. You can specify IP addresses in a number of different formats.

options specifies various options that packets must have set. Two options of particular interest when filtering TCP connections are:

setup

Matches packets that have the SYN bit set but not the ACK bit. This indicates that this is the first packet in an attempt to establish a connection to the local or remote machine.

established

Matches packets that have the RST or ACK bits set. This indicates that this is a packet for a TCP connection that has been established.

The ipfw rule set is managed using the ipfw utility. Rules are added with ipfw add number rule, and deleted with ipfw delete number. If you omit the number when adding a rule then ipfw takes the number of the last rule in the list, adds 100, and uses that.

One rule, numbered 65535, always exists, and can not be added or deleted. This rule either accepts all packets, or denies all packets, depending on whether you configured your kernel with options IPFIREWALL_DEFAULT_TO_ACCEPT.

Manipulating the ipfw Rules

All changes to the ipfw rules are made using the ipfw command. This command has a number of different sub-commands that are normally specified as the first parameter. Each sub-command then has its own options that apply to it.

Adding rules is accomplished with the add sub-command. This sub-command takes an optional rule number as the first parameter, followed by the body of the rule you want to add. If a rule number is not specified for the add sub-command then the number of the last rule in the list is incremented by 100, and used.

For example:

# ipfw add 100 rule

is used to add a new rule with an explicit number of 100, while:

# ipfw add rule

is used to add a new rule with a rule number 100 greater than the last rule.

Deleting rules is accomplished with the delete sub-command, which takes a list of one or more rule numbers to delete.

The current list of rules can be seen with the show or list sub-commands.

Creating a Firewall with ipfw

You can now create a simple firewall. Assume that your firewall machine also runs a web server and a mail server (not necessarily best practice, but it helps keep the example simple), and that you want to allow all outgoing connections, and incoming connections to the web server running on port 80 and the mail server running on port 25:

# ipfw add allow ip from me to any
# ipfw add allow ip from any to me established
# ipfw add allow tcp from any to me port 25 setup
# ipfw add allow tcp from any to me port 80 setup
# ipfw add deny ip from any to any

In order, these rules behave as follows.

  1. Allow all IP traffic that originated from this host (me) to any other host (any), whether it is on the local network or the wider Internet.

  2. Allow all incoming traffic from any host, as long as it is part of an already established connection. At this point, no rules have been specified to permit connections to be established.

  3. Allow tcp connections from the outside world to connect to this host, as long as they are directed at the local port 25 (the smtp port).

  4. Allow tcp connections from the outside world to connect to the this host, as long as they are directed at the local port 80 (the http port).

  5. Explicitly disallow any other traffic.

The complete set of firewall rules can then be shown with the ipfw show command.

# ipfw show

Although no sequence numbers were specified when the rules were added, you can see how ipfw has numbered each rule, starting from 100. You can use this to delete individual rules. If this host is no longer running a web server then the command:

# ipfw delete 

will delete just the rule relating to port 80.

It is possible for multiple rules to have the same number. This can be useful if you want to group rules that are for related protocols, or applications. Your web server might also provide secure HTTP (https://...) functionality. This requires packets for port 443 to be allowed through in addition ot port 80. To group all the rules for the web server under one rule number you must explicitly specify the rule number, like so.

# ipfw add 500 allow ip from any to me 80
# ipfw add 500 allow ip from any to me 443

Here, the web server firewall rules are both numbered 500. To delete them both you use the single delete command.

# ipfw delete 500

It is not possible to manipulate individual rules that share a rule number.

Filtering udp in a stateful manner with ipfw

As discussed earlier, many protocols that use UDP must be firewalled in a stateful manner. The firewall has to keep track of outgoing UDP packets, and create temporary rules that allow responses to those packets for a short period of time. These temporary rules are stored in a different ruleset, the dynamic ruleset, and are managed by ipfw itself. You do not need to add or remove rules from the dynamic ruleset.

Outgoing packets that should cause an entry to be made in the dynamic ruleset are indicated with the keep-state keyword. Incoming packets are checked against the dynamic ruleset as soon as ipfw encounters a rule that includes keep-state, or the check-state keyword.

The following simple rule allows all outgoing DNS queries (which will use UDP, and will be to port 53 on the remote host), and creates the appropriate entry in the dynamic ruleset.

# ipfw add allow udp from me to any 53 keep-state

If this rule had a high number, and you had many more rules before this one, incoming UDP packets would not be checked against the dynamic ruleset until they reach this rule. If you expect a lot of incoming UDP traffic you can mitigate this by putting a check-state rule earlier in your ruleset, like this:

# ipfw add 1 check-state

Logging traffic

You can use ipfw to keep track of how much of each type of traffic you are sending. Each rule in the ruleset has several counters associated with it. Each counter is incremented for each packet that matches the rule.

The counters are:

packet count

This counter is incremented by one for each packet that matched the rule.

byte

This counter is incremented by the number of bytes contained in each packet that matched the rule.

log

A log message can be generated for each packet that matches a rule. This counter keeps track of how many log messages have been generated. ipfw can be configured to stop sending log messages, on a per-rule basis, when a pre-defined logging limit is reached.

timestamp

Not strictly a counter. This value is updated with a timestamp of the last packet that matched the rule.

Together, these allow you to keep track of the amount of traffic your firewall is handling, and understand what percentage of the traffic is devoted to which protocols. It is even possible to create a firewall that allows all traffic past, and simply counts the number of packets that are destined for each port.

The counters can be seen with the XXX command, and counters for individual rule numbers can be reset to zero with the zero command.

The log counters are slightly different. The kernel must be configured with the IPFIREWALL_VERBOSE option. Then, an entry in the ruleset of the form:

action log [logamount nnn] protocol ...

causes all packets that match this rule to cause a log message to be sent to syslogd(8), with a LOG_SECURITY facility. By default, these will be appended to /var/log/security.

This example will log all attempts to set up incoming telnet connections (to port 23), and deny the attempt.

# ipfw add log deny tcp from any to me 23

Because this may generate a large number of log messages you may limit the number of messages that are generated on a per-rule basis. This can be done in two ways.

  1. The kernel can be configured with the IPFIREWALL_VERBOSE_LIMIT option. This option specifies the maximum number of log messages that will be generated for each rule before logging is disabled for that rule.

    This option sets the initial value for the sysctl(8) variable net.inet.ip.fw.verbose_limit, and you can adjust it while the system is running as necessary.

  2. You can use the logamount keyword to set the limit on a per-rule basis. To limit the logging on the telnet rule so that only 100 messages are logged you would use this rule:

    # ipfw add log logamount 100 deny tcp from any to me 23

    If you have also defined IPFIREWALL_VERBOSE_LIMIT then you can use a logamount of 0 to specify that there should be no limits for this rule.

Once a rule reaches its pre-defined logging limit (if set) no more messages will be generated. You can reset the log counter (and therefore start generating log messages again) with the resetlog command. This command resets the log counter for rule 5,300.

# ipfw resetlog 5300

Filtering ICMP

More things to filter on

Integrating in to FreeBSD

The FreeBSD start up scripts have a hook that enables your firewall configuration to be run as soon as the system configures networking. FreeBSD also comes with a sample script, /etc/rc.firewall, that contains many sample rules, and shows how you might write one script that can cater for a variety of different packet filtering requirements.

/etc/rc.conf has three variables that control overall firewall functionality, and two variables that are specific to the FreeBSD /etc/rc.firewall script.

The generic firewall variables are:

  1. firewall_enable

  2. firewall_script

  3. firewall_logging

The boot scripts check to see if firewall_enable is set to YES. If it is, the filename in firewall_script is run, on the assumption that it contains a series of ipfw commands to specify the ruleset. firewall_script defaults to /etc/rc.firewall. firewall_logging sets the net.inet.ip.fw.verbose to 1, the equivalent of setting IPFIREWALL_VERBOSE in the kernel config file.

/etc/rc.firewall honours two variables; firewall_type can be one of open, closed, client, or simple. These correspond to different sets of rules that are chosen from the default rc.firewall script. The firewall_quiet variable can be set to YES, in which case the firewall rules will not be displayed on the console as they are displayed by the scriptin other words, -q is added to each ipfw command line in the script.

Because ipfw's divert functionality is used to send packets to and from the Network Address Translation daemon, /etc/rc.firewall also contains examples of how to do this, conditionalised on the natd_enable variable. More information about this can be found in the XXX chapter.

The default /etc/rc.firewall script is not really suitable for use unchanged in production. Instead, you should examine the script to see how the firewall rules are built up, how to filter various protocols, and so forth. A good start is to copy the script to another file, edit it to taste, and then put this script in to /etc/rc.conf. For example:

# cd /etc
# cp rc.firewall rc.firewall.local
[ edit rc.firewall.local, and test]
# echo 'firewall_enable="YES"' >> rc.conf
# echo 'firewall_secript="/etc/rc.firewall.local"' >> rc.conf

Of course, your script can honour the firewall_type and firewall_quiet variables if you want.

Testing a Firewall Configuration

After you've designed an appropriate firewall configuration, it's important to validate that it does in fact do what you want it to do. One way to do this is to use a test host outside your network to attempt to pierce your firewall: this can be quite clumsy and slow, though, and is limited to testing only those addresses that you can actually use.

A faster and easier method is available with the Linux firewall implementation. It allows you to manually generate tests and run them through the firewall configuration just as if you were testing with actual datagrams. All varieties of the Linux kernel firewall software, ipfwadm, ipchains, and iptables, provide support for this style of testing. The implementation involves use of the relevant check command.

The general test procedure is as follows:

  1. Design and configure your firewall using ipfwadm, ipchains, or iptables.

  2. Design a series of tests that will determine whether your firewall is actually working as you intend. For these tests you may use any source or destination address, so choose some address combinations that should be accepted and some others that should be dropped. If you're allowing or disallowing only certain ranges of addresses, it is a good idea to test addresses on either side of the boundary of the rangeone address just inside the boundary and one address just outside the boundary. This will help ensure that you have the correct boundaries configured, because it is sometimes easy to specify netmasks incorrectly in your configuration. If you're filtering by protocol and port number, your tests should also check all important combinations of these parameters. For example, if you intend to accept only TCP under certain circumstances, check that UDP datagrams are dropped.

  3. Develop ipfwadm, ipchains, or iptables rules to implement each test. It is probably worthwhile to write all the rules into a script so you can test and re-test easily as you correct mistakes or change your design. Tests use almost the same syntax as rule specifications, but the arguments take on slightly differing meanings. For example, the source address argument in a rule specification specifies the source address that datagrams matching this rule should have. The source address argument in test syntax, in contrast, specifies the source address of the test datagram that will be generated. For ipfwadm, you must use the c option to specify that this command is a test, while for ipchains and iptables, you must use the C option. In all cases you must always specify the source address, destination address, protocol, and interface to be used for the test. Other arguments, such as port numbers or TOS bit settings, are optional.

  4. Execute each test command and note the output. The output of each test will be a single word indicating the final target of the datagram after running it through the firewall configurationthat is, where the processing ended. For ipchains and iptables, user-specified chains will be tested in addition to the built-in ones.

  5. Compare the output of each test against the desired result. If there are any discrepancies, you will need to analyse your ruleset to determine where you've made the error. If you've written your test commands into a script file, you can easily rerun the test after correcting any errors in your firewall configuration. It's a good practice to flush your rulesets completely and rebuild them from scratch, rather than to make changes dynamically. This helps ensure that the active configuration you are testing actually reflects the set of commands in your configuration script.

Let's take a quick look at what a manual test transcript would look like for our nave example with ipchains. You will remember that our local network in the example was 172.16.1.0 with a netmask of 255.255.255.0, and we were to allow TCP connections out to web servers on the net. Nothing else was to pass our forward chain. Start with a transmission that we know should work, a connection from a local host to a web server outside:

# ipchains -C forward -p tcp -s 172.16.1.0 1025 -d 44.136.8.2 80 -i eth0
	accepted

Note the arguments had to be supplied and the way they've been used to describe a datagram. The output of the command indicates that that the datagram was accepted for forwarding, which is what we hoped for.

Now try another test, this time with a source address that doesn't belong to our network. This one should be denied:

# ipchains -C forward -p tcp -s 172.16.2.0 1025 -d 44.136.8.2 80 -i eth0
	denied

Try some more tests, this time with the same details as the first test, but with different protocols. These should be denied, too:

# ipchains -C forward -p udp -s 172.16.1.0 1025 -d 44.136.8.2 80 -i eth0
denied
# ipchains -C forward -p icmp -s 172.16.1.0 1025 -d 44.136.8.2 80 -i eth0
denied

Try another destination port, again expecting it to be denied:

# ipchains -C forward -p tcp -s 172.16.1.0 1025 -d 44.136.8.2 23 -i eth0
denied

You'll go a long way toward achieving peace of mind if you design a series of exhaustive tests. While this can sometimes be as difficult as designing the firewall configuration, it's also the best way of knowing that your design is providing the security you expect of it.

A Sample Firewall Configuration

We've discussed the fundamentals of firewall configuration. Let's now look at what a firewall configuration might actually look like.

The configuration in this example has been designed to be easily extended and customized. We've provided three versions. The first version is implemented using the ipfwadm command (or the ipfwadm-wrapper script), the second uses ipchains, and the third uses iptables. The example doesn't attempt to exploit user-defined chains, but it will show you the similarities and differences between the old and new firewall configuration tool syntaxes:

#!/bin/bash
##########################################################################
# IPFWADM VERSION
# This sample configuration is for a single host firewall configuration
# with no services supported by the firewall machine itself.
##########################################################################
      
# USER CONFIGURABLE SECTION
      
# The name and location of the ipfwadm utility. Use ipfwadm-wrapper for
# 2.2.* kernels.
IPFWADM=ipfwadm
      
# The path to the ipfwadm executable.
PATH="/sbin"
      
# Our internal network address space and its supporting network device.
OURNET="172.29.16.0/24"
OURBCAST="172.29.16.255"
OURDEV="eth0"
      
# The outside address and the network device that supports it.
ANYADDR="0/0"
ANYDEV="eth1"
      
# The TCP services we wish to allow to pass - "" empty means all ports
# note: space separated
TCPIN="smtp www"
TCPOUT="smtp www ftp ftp-data irc"
      
# The UDP services we wish to allow to pass - "" empty means all ports
# note: space separated
UDPIN="domain"
UDPOUT="domain"
      
# The ICMP services we wish to allow to pass - "" empty means all types
# ref: /usr/include/netinet/ip_icmp.h for type numbers
# note: space separated
ICMPIN="0 3 11"
ICMPOUT="8 3 11"
      
# Logging; uncomment the following line to enable logging of datagrams
# that are blocked by the firewall.
# LOGGING=1
      
# END USER CONFIGURABLE SECTION
###########################################################################
# Flush the Incoming table rules
$IPFWADM -I -f

# We want to deny incoming access by default.
$IPFWADM -I -p deny

# SPOOFING
# We should not accept any datagrams with a source address matching ours
# from the outside, so we deny them.
$IPFWADM -I -a deny -S $OURNET -W $ANYDEV

# SMURF
# Disallow ICMP to our broadcast address to prevent "Smurf" style attack.
$IPFWADM -I -a deny -P icmp -W $ANYDEV -D $OURBCAST

# TCP
# We will accept all TCP datagrams belonging to an existing connection
# (i.e. having the ACK bit set) for the TCP ports we're allowing through.
# This should catch more than 95 % of all valid TCP packets.
$IPFWADM -I -a accept -P tcp -D $OURNET $TCPIN -k -b

# TCP - INCOMING CONNECTIONS
# We will accept connection requests from the outside only on the
# allowed TCP ports.
$IPFWADM -I -a accept -P tcp -W $ANYDEV -D $OURNET $TCPIN -y

# TCP - OUTGOING CONNECTIONS
# We accept all outgoing tcp connection requests on allowed TCP ports.
$IPFWADM -I -a accept -P tcp -W $OURDEV -D $ANYADDR $TCPOUT -y

# UDP - INCOMING
# We will allow UDP datagrams in on the allowed ports.
$IPFWADM -I -a accept -P udp -W $ANYDEV -D $OURNET $UDPIN

# UDP - OUTGOING
# We will allow UDP datagrams out on the allowed ports.
$IPFWADM -I -a accept -P udp -W $OURDEV -D $ANYADDR $UDPOUT

# ICMP - INCOMING
# We will allow ICMP datagrams in of the allowed types.
$IPFWADM -I -a accept -P icmp -W $ANYDEV -D $OURNET $UDPIN

# ICMP - OUTGOING
# We will allow ICMP datagrams out of the allowed types.
$IPFWADM -I -a accept -P icmp -W $OURDEV -D $ANYADDR $UDPOUT

# DEFAULT and LOGGING
# All remaining datagrams fall through to the default
# rule and are dropped. They will be logged if you've
# configured the LOGGING variable above.
#
if [ "$LOGGING" ]
then
# Log barred TCP
$IPFWADM -I -a reject -P tcp -o

# Log barred UDP
$IPFWADM -I -a reject -P udp -o

# Log barred ICMP
$IPFWADM -I -a reject -P icmp -o
fi
#
# end.
    

Now we'll reimplement it using the ipchains command:

#!/bin/bash
##########################################################################
# IPCHAINS VERSION
# This sample configuration is for a single host firewall configuration
# with no services supported by the firewall machine itself.
##########################################################################

# USER CONFIGURABLE SECTION

# The name and location of the ipchains utility.
IPCHAINS=ipchains

# The path to the ipchains executable.
PATH="/sbin"

# Our internal network address space and its supporting network device.
OURNET="172.29.16.0/24"
OURBCAST="172.29.16.255"
OURDEV="eth0"

# The outside address and the network device that supports it.
ANYADDR="0/0"
ANYDEV="eth1"

# The TCP services we wish to allow to pass - "" empty means all ports
# note: space separated
TCPIN="smtp www"
TCPOUT="smtp www ftp ftp-data irc"

# The UDP services we wish to allow to pass - "" empty means all ports
# note: space separated
UDPIN="domain"
UDPOUT="domain"

# The ICMP services we wish to allow to pass - "" empty means all types
# ref: /usr/include/netinet/ip_icmp.h for type numbers
# note: space separated
ICMPIN="0 3 11"
ICMPOUT="8 3 11"

# Logging; uncomment the following line to enable logging of datagrams
# that are blocked by the firewall.
# LOGGING=1

# END USER CONFIGURABLE SECTION
##########################################################################
# Flush the Input table rules
$IPCHAINS -F input

# We want to deny incoming access by default.
$IPCHAINS -P input deny

# SPOOFING
# We should not accept any datagrams with a source address matching ours
# from the outside, so we deny them.
$IPCHAINS -A input -s $OURNET -i $ANYDEV -j deny

# SMURF
# Disallow ICMP to our broadcast address to prevent "Smurf" style attack.
$IPCHAINS -A input -p icmp -w $ANYDEV -d $OURBCAST -j deny

# We should accept fragments, in ipchains we must do this explicitly.
$IPCHAINS -A input -f -j accept

# TCP
# We will accept all TCP datagrams belonging to an existing connection
# (i.e. having the ACK bit set) for the TCP ports we're allowing through.
# This should catch more than 95 % of all valid TCP packets.
$IPCHAINS -A input -p tcp -d $OURNET $TCPIN ! -y -b -j accept

# TCP - INCOMING CONNECTIONS
# We will accept connection requests from the outside only on the
# allowed TCP ports.
$IPCHAINS -A input -p tcp -i $ANYDEV -d $OURNET $TCPIN -y -j accept

# TCP - OUTGOING CONNECTIONS
# We accept all outgoing TCP connection requests on allowed TCP ports.
$IPCHAINS -A input -p tcp -i $OURDEV -d $ANYADDR $TCPOUT -y -j accept

# UDP - INCOMING
# We will allow UDP datagrams in on the allowed ports.
$IPCHAINS -A input -p udp -i $ANYDEV -d $OURNET $UDPIN -j accept

# UDP - OUTGOING
# We will allow UDP datagrams out on the allowed ports.
$IPCHAINS -A input -p udp -i $OURDEV -d $ANYADDR $UDPOUT -j accept

# ICMP - INCOMING
# We will allow ICMP datagrams in of the allowed types.
$IPCHAINS -A input -p icmp -w $ANYDEV -d $OURNET $UDPIN -j accept

# ICMP - OUTGOING
# We will allow ICMP datagrams out of the allowed types.
$IPCHAINS -A input -p icmp -i $OURDEV -d $ANYADDR $UDPOUT -j accept

# DEFAULT and LOGGING
# All remaining datagrams fall through to the default
# rule and are dropped. They will be logged if you've
# configured the LOGGING variable above.
#
if [ "$LOGGING" ]
then
# Log barred TCP
$IPCHAINS -A input -p tcp -l -j reject

# Log barred UDP
$IPCHAINS -A input -p udp -l -j reject

# Log barred ICMP
$IPCHAINS -A input -p icmp -l -j reject
fi
#
# end.
    

In our iptables example, we've switched to using the FORWARD ruleset because of the difference in meaning of the INPUT ruleset in the netfilter implementation. This has implications for us; it means that none of the rules protect the firewall host itself. To accurately mimic our ipchains example, we would replicate each of our rules in the INPUT chain. For clarity, we've dropped all incoming datagrams received from our outside interface instead.

#!/bin/bash
##########################################################################
# IPTABLES VERSION
# This sample configuration is for a single host firewall configuration
# with no services supported by the firewall machine itself.
##########################################################################

# USER CONFIGURABLE SECTION

# The name and location of the ipchains utility.
IPTABLES=iptables

# The path to the ipchains executable.
PATH="/sbin"

# Our internal network address space and its supporting network device.
OURNET="172.29.16.0/24"
OURBCAST="172.29.16.255"
OURDEV="eth0"

# The outside address and the network device that supports it.
ANYADDR="0/0"
ANYDEV="eth1"

# The TCP services we wish to allow to pass - "" empty means all ports
# note: comma separated
TCPIN="smtp,www"
TCPOUT="smtp,www,ftp,ftp-data,irc"

# The UDP services we wish to allow to pass - "" empty means all ports
# note: comma separated
UDPIN="domain"
UDPOUT="domain"

# The ICMP services we wish to allow to pass - "" empty means all types
# ref: /usr/include/netinet/ip_icmp.h for type numbers
# note: comma separated
ICMPIN="0,3,11"
ICMPOUT="8,3,11"

# Logging; uncomment the following line to enable logging of datagrams
# that are blocked by the firewall.
# LOGGING=1

# END USER CONFIGURABLE SECTION
###########################################################################
# Flush the Input table rules
$IPTABLES -F FORWARD

# We want to deny incoming access by default.
$IPTABLES -P FORWARD deny

# Drop all datagrams destined for this host received from outside.
$IPTABLES -A INPUT -i $ANYDEV -j DROP

# SPOOFING
# We should not accept any datagrams with a source address matching ours
# from the outside, so we deny them.
$IPTABLES -A FORWARD -s $OURNET -i $ANYDEV -j DROP

# SMURF
# Disallow ICMP to our broadcast address to prevent "Smurf" style attack.
$IPTABLES -A FORWARD -m multiport -p icmp -i $ANYDEV -d $OURNET -j DENY

# We should accept fragments, in iptables we must do this explicitly.
$IPTABLES -A FORWARD -f -j ACCEPT

# TCP
# We will accept all TCP datagrams belonging to an existing connection
# (i.e. having the ACK bit set) for the TCP ports we're allowing through.
# This should catch more than 95 % of all valid TCP packets.
$IPTABLES -A FORWARD -m multiport -p tcp -d $OURNET --dports $TCPIN /
! --tcp-flags SYN,ACK ACK -j ACCEPT
$IPTABLES -A FORWARD -m multiport -p tcp -s $OURNET --sports $TCPIN /
! --tcp-flags SYN,ACK ACK -j ACCEPT


# TCP - INCOMING CONNECTIONS
# We will accept connection requests from the outside only on the
# allowed TCP ports.
$IPTABLES -A FORWARD -m multiport -p tcp -i $ANYDEV -d $OURNET $TCPIN /
--syn -j ACCEPT

# TCP - OUTGOING CONNECTIONS
# We will accept all outgoing tcp connection requests on the allowed /
TCP ports.
$IPTABLES -A FORWARD -m multiport -p tcp -i $OURDEV -d $ANYADDR /
--dports $TCPOUT --syn -j ACCEPT
# UDP - INCOMING
# We will allow UDP datagrams in on the allowed ports and back.
$IPTABLES -A FORWARD -m multiport -p udp -i $ANYDEV -d $OURNET /
--dports $UDPIN -j ACCEPT
$IPTABLES -A FORWARD -m multiport -p udp -i $ANYDEV -s $OURNET /
--sports $UDPIN -j ACCEPT
# UDP - OUTGOING
# We will allow UDP datagrams out to the allowed ports and back.
$IPTABLES -A FORWARD -m multiport -p udp -i $OURDEV -d $ANYADDR /
--dports $UDPOUT -j ACCEPT
$IPTABLES -A FORWARD -m multiport -p udp -i $OURDEV -s $ANYADDR /
--sports $UDPOUT -j ACCEPT
# ICMP - INCOMING
# We will allow ICMP datagrams in of the allowed types.
$IPTABLES -A FORWARD -m multiport -p icmp -i $ANYDEV -d $OURNET /
--dports $ICMPIN -j ACCEPT
# ICMP - OUTGOING
# We will allow ICMP datagrams out of the allowed types.
$IPTABLES -A FORWARD -m multiport -p icmp -i $OURDEV -d $ANYADDR /
--dports $ICMPOUT -j ACCEPT
# DEFAULT and LOGGING
# All remaining datagrams fall through to the default
# rule and are dropped. They will be logged if you've
# configured the LOGGING variable above.
#
if [ "$LOGGING" ]
then
# Log barred TCP
$IPTABLES -A FORWARD -m tcp -p tcp -j LOG
# Log barred UDP
$IPTABLES -A FORWARD -m udp -p udp -j LOG
# Log barred ICMP
$IPTABLES -A FORWARD -m udp -p icmp -j LOG
fi
#
# end.
    

In many simple situations, to use the sample all you have to do is edit the top section of the file labeled USER CONFIGURABLE section to specify which protocols and datagrams type you wish to allow in and out. For more complex configurations, you will need to edit the section at the bottom, as well. Remember, this is a simple example, so scrutinize it very carefully to ensure it does what you want while implementing it.



[47] The term firewall comes from a device used to protect people from fire. The firewall is a shield of material resistant to fire that is placed between a potential fire and the people it is protecting.

In todays world of commercial Internet service, it is becoming increasingly important to know how much data you are transmitting and receiving on your network connections. If you are an Internet Service Provider and you charge your customers by volume, this will be essential to your business. If you are a customer of an Internet Service Provider that charges by data volume, you will find it useful to collect your own data to ensure the accuracy of your Internet charges.

There are other uses for network accounting that have nothing to do with dollars and bills. If you manage a server that offers a number of different types of network services, it might be useful to you to know exactly how much data is being generated by each one. This sort of information could assist you in making decisions, such as what hardware to buy or how many servers to run.

The Linux kernel provides a facility that allows you to collect all sorts of useful information about the network traffic it sees. This facility is called IP accounting.

Configuring the Kernel for IP Accounting

The Linux IP accounting feature is very closely related to the Linux firewall software. The places you want to collect accounting data are the same places that you would be interested in performing firewall filtering: into and out of a network host, and in the software that does the routing of datagrams. If you haven't read the section on firewalls, now is probably a good time to do so, as we will be using some of the concepts described in Chapter 9., TCP/IP Firewall.

To activate the Linux IP accounting feature, you should first see if your Linux kernel is configured for it. Check to see if the /proc/net/ip_acct file exists. If it does, your kernel already supports IP accounting. If it doesn't, you must build a new kernel, ensuring that you answer Y to the options in 2.0 and 2.2 series kernels:

Networking options  --->
	[*] Network firewalls
	[*] TCP/IP networking
	...
	[*] IP: accounting

or in 2.4 series kernels:

Networking options  --->
	[*] Network packet filtering (replaces ipchains)

Configuring IP Accounting

Because IP accounting is closely related to IP firewall, the same tool was designated to configure it, so ipfwadm, ipchains or iptables are used to configure IP accounting. The command syntax is very similar to that of the firewall rules, so we won't focus on it, but we will discuss what you can discover about the nature of your network traffic using this feature.

The general syntax for IP accounting with ipfwadm is:

# ipfwadm -A [direction] [command] [parameters]

The direction argument is new. This is simply coded as in, out, or both. These directions are from the perspective of the linux machine itself, so in means data coming into the machine from a network connection and out means data that is being transmitted by this host on a network connection. The both direction is the sum of both the incoming and outgoing directions.

The general command syntax for ipchains and iptables is:

# ipchains -A chain rule-specification
	# iptables -A chain rule-specification

The ipchains and iptables commands allow you to specify direction in a manner more consistent with the firewall rules. IP Firewall Chains doesn't allow you to configure a rule that aggregates both directions, but it does allow you to configure rules in the forward chain that the older implementation did not. We'll see the difference that makes in some examples a little later.

The commands are much the same as firewall rules, except that the policy rules do not apply here. We can add, insert, delete, and list accounting rules. In the case of ipchains and iptables, all valid rules are accounting rules, and any command that doesn't specify the -j option performs accounting only.

The rule specification parameters for IP accounting are the same as those used for IP firewall. These are what we use to define precisely what network traffic we wish to count and total.

Accounting by Address

Let's work with an example to illustrate how we'd use IP accounting.

Imagine we have a Linux-based router that serves two departments at the Virtual Brewery. The router has two Ethernet devices, eth0 and eth1, each of which services a department; and a PPP device, ppp0, that connects us via a high-speed serial link to the main campus of the Groucho Marx University.

Let's also imagine that for billing purposes we want to know the total traffic generated by each of the departments across the serial link, and for management purposes we want to know the total traffic generated between the two departments.

The following table shows the interface addresses we will use in our example:

ifaceaddressnetmask
eth0172.16.3.0255.255.255.0
eth1172.16.4.0255.255.255.0

To answer the question, How much data does each department generate on the PPP link?, we could use a rule that looks like this:

# ipfwadm -A both -a -W ppp0 -S 172.16.3.0/24 -b
	  # ipfwadm -A both -a -W ppp0 -S 172.16.4.0/24 -b

or:

# ipchains -A input -i ppp0 -d 172.16.3.0/24
	  # ipchains -A output -i ppp0 -s 172.16.3.0/24
	  # ipchains -A input -i ppp0 -d 172.16.4.0/24
	  # ipchains -A output -i ppp0 -s 172.16.4.0/24

and with iptables:

# iptables -A FORWARD -i ppp0 -d 172.16.3.0/24
	  # iptables -A FORWARD -o ppp0 -s 172.16.3.0/24
	  # iptables -A FORWARD -i ppp0 -d 172.16.4.0/24
	  # iptables -A FORWARD -o ppp0 -s 172.16.4.0/24

The first half of each of these set of rules say, Count all data traveling in either direction across the interface named ppp0 with a source or destination (remember the function of the -b flag in ipfwadm and iptables) address of 172.16.3.0/24. The second half of each ruleset is the same, but for the second Ethernet network at our site.

To answer the second question, How much data travels between the two departments?, we need a rule that looks like this:

# ipfwadm -A both -a -S 172.16.3.0/24 -D 172.16.4.0/24 -b

or:

# ipchains -A forward -s 172.16.3.0/24 -d 172.16.4.0/24 -b

or:

# iptables -A FORWARD -s 172.16.3.0/24 -d 172.16.4.0/24
	  # iptables -A FORWARD -s 172.16.4.0/24 -d 172.16.3.0/24

These rules will count all datagrams with a source address belonging to one of the department networks and a destination address belonging to the other.

Accounting by Service Port

Okay, let's suppose we also want a better idea of exactly what sort of traffic is being carried across our PPP link. We might, for example, want to know how much of the link the FTP, smtp, and World Wide Web services are consuming.

A script of rules to enable us to collect this information might look like:

#!/bin/sh
	  # Collect FTP, smtp and www volume statistics for data carried on our
	  # PPP link using ipfwadm
	  #
	  ipfwadm -A both -a -W ppp0 -P tcp -S 0/0 ftp ftp-data
	  ipfwadm -A both -a -W ppp0 -P tcp -S 0/0 smtp
	  ipfwadm -A both -a -W ppp0 -P tcp -S 0/0 www

or:

#!/bin/sh
	  # Collect ftp, smtp and www volume statistics for data carried on our
	  # PPP link using ipchains
	  #
	  ipchains -A input -i ppp0 -p tcp -s 0/0 ftp-data:ftp
	  ipchains -A output -i ppp0 -p tcp -d 0/0 ftp-data:ftp
	  ipchains -A input -i ppp0 -p tcp -s 0/0 smtp
	  ipchains -A output -i ppp0 -p tcp -d 0/0 smtp
	  ipchains -A input -i ppp0 -p tcp -s 0/0 www
	  ipchains -A output -i ppp0 -p tcp -d 0/0 www

or:

#!/bin/sh
	  # Collect ftp, smtp and www volume statistics for data carried on our
	  # PPP link using iptables.
	  #
	  iptables -A FORWARD -i ppp0 -m tcp -p tcp --sport ftp-data:ftp
	  iptables -A FORWARD -o ppp0 -m tcp -p tcp --dport ftp-data:ftp
	  iptables -A FORWARD -i ppp0 -m tcp -p tcp --sport smtp
	  iptables -A FORWARD -o ppp0 -m tcp -p tcp --dport smtp
	  iptables -A FORWARD -i ppp0 -m tcp -p tcp --sport www
	  iptables -A FORWARD -o ppp0 -m tcp -p tcp --dport www

There are a couple of interesting features to this configuration. Firstly, we've specified the protocol. When we specify ports in our rules, we must also specify a protocol because TCP and UDP provide separate sets of ports. Since all of these services are TCB-based, we've specified it as the protocol. Secondly, we've specified the two services ftp and ftp-data in one command. ipfwadm allows you to specify single ports, ranges of ports, or arbitrary lists of ports. The ipchains command allows either single ports or ranges of ports, which is what we've used here. The syntax "ftp-data:ftp" means "ports ftp-data (20) through ftp (21)," and is how we encode ranges of ports in both ipchains and iptables. When you have a list of ports in an accounting rule, it means that any data received for any of the ports in the list will cause the data to be added to that entry's totals. Remembering that the FTP service uses two ports, the command port and the data transfer port, we've added them together to total the FTP traffic. Lastly, we've specified the source address as 0/0, which is special notation that matches all addresses and is required by both the ipfwadm and ipchains commands in order to specify ports.

We can expand on the second point a little to give us a different view of the data on our link. Let's now imagine that we class FTP, SMTP, and World Wide Web traffic as essential traffic, and all other traffic as nonessential. If we were interested in seeing the ratio of essential traffic to nonessential traffic, we could do something like:

# ipfwadm -A both -a -W ppp0 -P tcp -S 0/0 ftp ftp-data smtp www
	  # ipfwadm -A both -a -W ppp0 -P tcp -S 0/0 1:19 22:24 26:79 81:32767

If you have already examined your /etc/services file, you will see that the second rule covers all ports except (ftp, ftp-data, smtp, and www).

How do we do this with the ipchains or iptables commands, since they allow only one argument in their port specification? We can exploit user-defined chains in accounting just as easily as in firewall rules. Consider the following approach:

# ipchains -N a-essent
	  # ipchains -N a-noness
	  # ipchains -A a-essent -j ACCEPT
	  # ipchains -A a-noness -j ACCEPT
	  # ipchains -A forward -i ppp0 -p tcp -s 0/0 ftp-data:ftp -j a-essent
	  # ipchains -A forward -i ppp0 -p tcp -s 0/0 smtp -j a-essent
	  # ipchains -A forward -i ppp0 -p tcp -s 0/0 www -j a-essent
	  # ipchains -A forward -j a-noness

Here we create two user-defined chains, one called a-essent, where we capture accounting data for essential services and another called a-noness, where we capture accounting data for nonessential services. We then add rules to our forward chain that match our essential services and jump to the a-essent chain, where we have just one rule that accepts all datagrams and counts them. The last rule in our forward chain is a rule that jumps to our a-noness chain, where again we have just one rule that accepts all datagrams and counts them. The rule that jumps to the a-noness chain will not be reached by any of our essential services, as they will have been accepted in their own chain. Our tallies for essential and nonessential services will therefore be available in the rules within those chains. This is just one approach you could take; there are others. Our iptables implementation of the same approach would look like:

# iptables -N a-essent
	  # iptables -N a-noness
	  # iptables -A a-essent -j ACCEPT
	  # iptables -A a-noness -j ACCEPT
	  # iptables -A FORWARD -i ppp0 -m tcp -p tcp --sport ftp-data:ftp -j a-essent
	  # iptables -A FORWARD -i ppp0 -m tcp -p tcp --sport smtp -j a-essent
	  # iptables -A FORWARD -i ppp0 -m tcp -p tcp --sport www -j a-essent
	  # iptables -A FORWARD -j a-noness

This looks simple enough. Unfortunately, there is a small but unavoidable problem when trying to do accounting by service type. You will remember that we discussed the role the MTU plays in TCP/IP networking in an earlier chapter. The MTU defines the largest datagram that will be transmitted on a network device. When a datagram is received by a router that is larger than the MTU of the interface that needs to retransmit it, the router performs a trick called fragmentation. The router breaks the large datagram into small pieces no longer than the MTU of the interface and then transmits these pieces. The router builds new headers to put in front of each of these pieces, and these are what the remote machine uses to reconstruct the original data. Unfortunately, during the fragmentation process the port is lost for all but the first fragment. This means that the IP accounting can't properly count fragmented datagrams. It can reliably count only the first fragment, or unfragmented datagrams. There is a small trick permitted by ipfwadm that ensures that while we won't be able to know exactly what port the second and later fragments were from, we can still count them. An early version of Linux accounting software assigned the fragments a fake port number, 0xFFFF, that we could count. To ensure that we capture the second and later fragments, we could use a rule like:

# ipfwadm -A both -a -W ppp0 -P tcp -S 0/0 0xFFFF

The IP chains implementation has a slightly more sophisticated solution, but the result is much the same. If using the ipchains command we'd instead use:

# ipchains -A forward -i ppp0 -p tcp -f

and with iptables we'd use:

# iptables -A FORWARD -i ppp0 -m tcp -p tcp -f

These won't tell us what the original port for this data was, but at least we are able to see how much of our data is fragments, and be able to account for the volume of traffic they consume.

In 2.2 kernels you can select a kernel compile-time option that negates this whole issue if your Linux machine is acting as the single access point for a network. If you enable the IP: always defragment option when you compile your kernel, all received datagrams will be reassembled by the Linux router before routing and retransmission. This operation is performed before the firewall and accounting software sees the datagram, and thus you will have no fragments to deal with. In 2.4 kernels you compile and load the netfilter forward-fragment module.

Accounting of ICMP Datagrams

The ICMP protocol does not use service port numbers and is therefore a little bit more difficult to collect details on. ICMP uses a number of different types of datagrams. Many of these are harmless and normal, while others should only be seen under special circumstances. Sometimes people with too much time on their hands attempt to maliciously disrupt the network access of a user by generating large numbers of ICMP messages. This is commonly called ping flooding. While IP accounting cannot do anything to prevent this problem (IP firewalling can help, though!) we can at least put accounting rules in place that will show us if anybody has been trying.

ICMP doesn't use ports as TCP and UDP do. Instead ICMP has ICMP message types. We can build rules to account for each ICMP message type. To do this, we place the ICMP message and type number in place of the port field in the ipfwadm accounting commands. We listed the ICMP message types in ???, so refer to it if you need to remember what they are.

An IP accounting rule to collect information about the volume of ping data that is being sent to you or that you are generating might look like:

# ipfwadm -A both -a -P icmp -S 0/0 8
	  # ipfwadm -A both -a -P icmp -S 0/0 0
	  # ipfwadm -A both -a -P icmp -S 0/0 0xff

or, with ipchains:

# ipchains -A forward -p icmp -s 0/0 8
	  # ipchains -A forward -p icmp -s 0/0 0
	  # ipchains -A forward -p icmp -s 0/0 -f

or, with iptables:

# iptables -A FORWARD -m icmp -p icmp --sports echo-request
	  # iptables -A FORWARD -m icmp -p icmp --sports echo-reply
	  # iptables -A FORWARD -m icmp -p icmp -f

The first rule collects information about the ICMP Echo Request datagrams (ping requests), and the second rule collects information about the ICMP Echo Reply datagrams (ping replies). The third rule collects information about ICMP datagram fragments. This is a trick similar to that described for fragmented TCP and UDP datagrams.

If you specify source and/or destination addresses in your rules, you can keep track of where the pings are coming from, such as whether they originate inside or outside your network. Once you've determined where the rogue datagrams are coming from, you can decide whether you want to put firewall rules in place to prevent them or take some other action, such as contacting the owner of the offending network to advise them of the problem, or perhaps even legal action if the problem is a malicious act.

Accounting by Protocol

Let's now imagine that we are interested in knowing how much of the traffic on our link is TCP, UDP, and ICMP. We would use rules like the following:

# ipfwadm -A both -a -W ppp0 -P tcp -D 0/0
	  # ipfwadm -A both -a -W ppp0 -P udp -D 0/0
	  # ipfwadm -A both -a -W ppp0 -P icmp -D 0/0

or:

# ipchains -A forward -i ppp0 -p tcp -d 0/0
	  # ipchains -A forward -i ppp0 -p udp -d 0/0
	  # ipchains -A forward -i ppp0 -p icmp -d 0/0

or:

# iptables -A FORWARD -i ppp0 -m tcp -p tcp
	  # iptables -A FORWARD -o ppp0 -m tcp -p tcp
	  # iptables -A FORWARD -i ppp0 -m udp -p udp
	  # iptables -A FORWARD -o ppp0 -m udp -p udp
	  # iptables -A FORWARD -i ppp0 -m icmp -p icmp
	  # iptables -A FORWARD -o ppp0 -m icmp -p icmp

With these rules in place, all of the traffic flowing across the ppp0 interface will be analyzed to determine whether it is TCP, UDP, or IMCP traffic, and the appropriate counters will be updated for each. The iptables example splits incoming flow from outgoing flow as its syntax demands it.

Using IP Accounting Results

It is all very well to be collecting this information, but how do we actually get to see it? To view the collected accounting data and the configured accounting rules, we use our firewall configuration commands, asking them to list our rules. The packet and byte counters for each of our rules are listed in the output.

The ipfwadm, ipchains, and iptables commands differ in how accounting data is handled, so we will treat them independently.

Listing Accounting Data with ipfwadm

The most basic means of listing our accounting data with the ipfwadm command is to use it like this:

# ipfwadm -A -l
	  IP accounting rules
	  pkts bytes dir prot source               destination          ports
	  9833 2345K i/o all  172.16.3.0/24      anywhere             n/a
	  56527   33M i/o all  172.16.4.0/24      anywhere             n/a

This will tell us the number of packets sent in each direction. If we use the extended output format with the -e option (not shown here because the output is too wide for the page), we are also supplied the options and applicable interface names. Most of the fields in the output will be self-explanatory, but the following may not:

dir

The direction in which the rule applies. Expected values here are in, out, or i/o, meaning both ways.

prot

The protocols to which the rule applies.

opt

A coded form of the options we use when invoking ipfwadm.

ifname

The name of the interface to which the rule applies.

ifaddress

The address of the interface to which the rule applies.

By default, ipfwadm displays the packet and byte counts in a shortened form, rounded to the nearest thousand (K) or million (M). We can ask it to display the collected data in exact units by using the expanded option as follows:

	  # ipfwadm -A -l -e -x

Listing Accounting Data with ipchains

The ipchains command will not display our accounting data (packet and byte counters) unless we supply it the -v argument. The simplest means of listing our accounting data with the ipchains is to use it like this:

	  # ipchains -L -v

Again, just as with ipfwadm, we can display the packet and byte counters in units by using the expanded output mode. The ipchains uses the -x argument for this:

	  # ipchains -L -v -x

Listing Accounting Data with iptables

The iptables command behaves very similarly to the ipchains command. Again, we must use the -v when listing tour rules to see the accounting counters. To list our accounting data, we would use:

	# iptables -L -v

Just as for the ipchains command, you can use the -x argument to show the output in expanded format with unit figures.

Resetting the Counters

The IP accounting counters will overflow if you leave them long enough. If they overflow, you will have difficulty determining the value they actually represent. To avoid this problem, you should read the accounting data periodically, record it, and then reset the counters back to zero to begin collecting accounting information for the next accounting interval.

The ipfwadm and ipchains commands provide you with a means of doing this quite simply:

# ipfwadm -A -z

or:

# ipchains -Z

or:

# iptables -Z

You can even combine the list and zeroing actions together to ensure that no accounting data is lost in between:

# ipfwadm -A -l -z

or:

# ipchains -L -Z

or:

# iptables -L -Z -v

These commands will first list the accounting data and then immediately zero the counters and begin counting again. If you are interested in collecting and using this information regularly, you would probably want to put this command into a script that recorded the output and stored it somewhere, and execute the script periodically using the cron command.

Flushing the Ruleset

One last command that might be useful allows you to flush all the IP accounting rules you have configured. This is most useful when you want to radically alter your ruleset without rebooting the machine.

The -f argument in combination with the ipfwadm command will flush all of the rules of the type you specify. ipchains supports the -F argument, which does the same:

# ipfwadm -A -f

or:

# ipchains -F

or:

# iptables -F

This flushes all of your configured IP accounting rules, removing them all and saving you having to remove each of them individually. Note that flushing the rules with ipchains does not cause any user-defined chains to be removed, only the rules within them.

Passive Collection of Accounting Data

One last trick you might like to consider: if your Linux machine is connected to an Ethernet, you can apply accounting rules to all of the data from the segment, not only that which it is transmitted by or destined for it. Your machine will passively listen to all of the data on the segment and count it.

You should first turn IP forwarding off on your Linux machine so that it doesn't try to route the datagrams it receives.[48] In the 2.0.36 and 2.2 kernels, this is a matter of:

# echo 0 >/proc/sys/net/ipv4/ip_forward

You should then enable promiscuous mode on your Ethernet interface using the ifconfig command. Now you can establish accounting rules that allow you to collect information about the datagrams flowing across your Ethernet without involving your Linux in the route at all.



[48] This isn't a good thing to do if your Linux machine serves as a router. If you disable IP forwarding, it will cease to route! Do this only on a machine with a single physical network interface.

You don't have to have a good memory to remember a time when only large organizations could afford to have a number of computers networked together by a LAN. Today network technology has dropped so much in price that two things have happened. First, LANs are now commonplace, even in many household environments. Certainly many FreeBSD users will have two or more computers connected by some Ethernet. Second, network resources, particularly IP addresses, are now a scarce resource and while they used to be free, they are now being bought and sold.

Most people with a LAN will probably also want an Internet connection that every computer on the LAN can use. The IP routing rules are quite strict in how they deal with this situation. Traditional solutions to this problem would have involved requesting an IP network address, perhaps a class C address for small sites, assigning each host on the LAN an address from this network and using a router to connect the LAN to the Internet.

In a commercialized Internet environment, this is quite an expensive proposition. First, you'd be required to pay for the network address that is assigned to you. Second, you'd probably have to pay your Internet Service Provider for the privilege of having a suitable route to your network put in place so that the rest of the Internet knows how to reach you. This might still be practical for companies, but domestic installations don't usually justify the cost.

FreeBSD supports two different types of NAT.

The first of these is dynamic NAT. Linux users refer to this as `IP Masquerading'. Dynamic NAT is the most typical usage, and will be referred to as just `NAT' for the rest of this chapter.

NAT allows you to use a private (reserved) IP network address on your LAN and have your FreeBSD-based router perform some clever, real-time translation of IP addresses and ports. When it receives a datagram from a computer on the LAN, it takes note of the type of datagram it is, TCP, UDP, ICMP, etc., and modifies the datagram so that it looks like it was generated by the router machine itself (and remembers that it has done so). It then transmits the datagram onto the Internet with its single connected IP address. When the destination host receives this datagram, it believes the datagram has come from the routing host and sends any reply datagrams back to that address. When the FreeBSD router receives a datagram from its Internet connection, it looks in its table of established connections to see if this datagram actually belongs to a computer on the LAN, and if it does, it reverses the modification it did on the forward path and transmits the datagram to the LAN computer.

A simple example is illustrated in Figure 11.1..

Figure 11.1. A typical Network Address Translation configuration

We have a small Ethernet network using one of the reserved network addresses. The network has a FreeBSD-based router providing access to the Internet. One of the workstations on the network (192.168.1.3) wishes to establish a connection to the remote host 209.1.106.178 on port 8888. The workstation routes its datagram to the router, which identifies this connection request as requiring NAT services. It accepts the datagram and allocates a port number to use (1035), substitutes its own IP address and port number for those of the originating host, and transmits the datagram to the destination host. The destination host believes it has received a connection request from the FreeBSD router and generates a reply datagram. The router, upon receiving this datagram, finds the association in its NAT table and reverses the substution it performed on the outgoing datagram. It then transmits the reply datagram to the originating host.

The local host believes it is speaking directly to the remote host. The remote host knows nothing about the local host at all and believes it has received a connection from the FreeBSD router. The FreeBSD router knows these two hosts are speaking to each other, and on what ports, and performs the address and port translations necessary to allow communication.

The second form of NAT is `Static NAT'.

Static NAT is used to make hosts on your internal network visible to the outside world, without putting them on your external network. This can be useful when your ISP has allocated you several IP addresses, which you want to use to refer to hosts inside your network.

Suppose that you have 10 hosts on your internal network, and a FreeBSD router (11 hosts in all), your ISP has allocated 3 IP addresses to you, and you want two of your internal hosts to also be visible on the wider Internet. With static NAT you can configure your router so that it has all 3 IP addresses associated with it (one primary address, and two aliases). Your static NAT configuration then forwards all incoming connections to two of the addresses to the appropriate internal hosts, effectively making them visible to the Internet. This configuration is discussed in more detail later.

Side Effects and Fringe Benefits of Dynamic NAT

The NAT facility comes with its own set of side effects, some of which are useful and some of which might become bothersome.

The hosts behind the NAT router are effectively hidden from the rest of the Internet. It is impossible to connect directly to them, which affords an additional layer of protection over and above any firewall settings you may have. This is not a hard and fast rule--it is possible to configure NAT on the router so that incoming connections on the router are forwarded to machines inside the firewall, and examples of doing this (and the potential pitfalls) are covered later.

Second, because none of the masqueraded hosts are visible, they are relatively protected from attacks from outside; this could simplify or even remove the need for firewall configuration on the masquerade host. You shouldn't rely too heavily on this, though. Your whole network will be only as safe as your masquerade host, so you should use firewall to protect it if security is a concern.

Last, some network services just won't work through NAT, or at least not without a lot of help. Typically, these are services that rely on incoming sessions to work, such as some types of Direct Communications Channels (DCC), features in IRC, or certain types of video and audio multicasting services.

NAT implementations for FreeBSD

There are (broadly speaking) three NAT implementations for FreeBSD.

The user-land PPP daemon supports NAT with the -nat flag (also -alias for backward compatability with earlier implementations). NAT with PPP is covered in more detail in the PPP chapter, and won't be discussed further here.

The IPFW firewall package has a companion NATD application. Used in conjunction with ipfw's divert functionality, natd supports dynamic and static NAT.

The IPFilter firewall package has a companion IPNAT application, used to manipulate IPFilter's NAT rules.

Configuring the kernel for NAT

If you will be using IPFilter you must rebuild your kernel with the following line in the configuration file.

options        IPFILTER

Rebuild your kernel with the appropriate options, and then reboot the system with the new kernel.

Configuring NAT with ipfw and natd

ipfw and natd work together to support dynamic and static NAT. The process and requirements are as follows.

  1. ipfw must be enabled in the kernel, with support for divert sockets.

  2. natd must be running, and listening on a local network port. By default, this is port 8668, listed as natd in /etc/services. However, this can be adjusted in the natd configuration.

  3. ipfw must be configured to divert some or all packets to natd, via this port.

  4. natd must be (optionally) configured with any special NAT requirements you have.

Configuring the kernel for ipfw and natd

You must ensure that the following two lines are in the kernel configuration file.

options        IPFIREWALL
options        IPDIVERT

The first line is needed to enable ipfw functionality, the second line enables support for `divert' sockets.

Running natd at boot time

There are several entries in /etc/rc.conf that can be used to enable natd and change its configuration. As always, examine /etc/defaults/rc.conf and rc.conf(5) for more information.

natd_enable

Whether or not to run natd_program. Set to YES to enable this functionality.

natd_program

The program that will be run if natd_enable is set to YES.

natd_interface

The network interface natd will use.

natd_flags

Any additional flags to pass to natd.

A typical configuration for natd in /etc/rc.conf would be:

firewall_enable="YES"         # Enable ipfw
gateway_enable="YES"          # Enable packet forwarding
...
natd_enable="YES"
natd_interface="fxp0"
natd_flags="-f /etc/natd.conf"

This enables natd on the fxp0 interface, and directs natd to load further configuration information from the /etc/natd.conf file.

Adjusting the firewall to support natd

natd requires a rule in the firewall configuration to divert packets to natd for further processing.

In general terms, the rule will look like this:

divert natd all from any to any via interface

where natd is the port that natd listens on (defined in /etc/services), and interface is the network interface that natd operates on.

The sample firewall rules in /etc/rc.firewall, which include the configuration variables from /etc/rc.conf, have lines like this:

${fwcmd} add divert natd all from any to any via ${natd_interface}

Dynamic NAT configuration

natd's default configuration supports dynamic NAT with no other changes. However, there are a number of options you can set to finetune its behaviour. This is not a comprehensive list of all the options described in natd(8).

These options can either be given on the natd command line (most easily through the natd_flags variable in /etc/rc.conf) or through a configuration file specified using the -f flag. More detail about these options can also be found in the libalias(3) manual page.

use_sockets

This option is useful when the host running natd will be generating traffic as well as simply forwarding traffic from other hosts--for example, if the natd host is also running a web cache. It causes natd to pre-allocate a socket when waiting for a connection in order to prevent port conflicts.

same_ports

natd will try to use the same port numbers for connections that it is maintaining.

For example, if host A on the internal network makes an outgoing connection from it's local port 4000, natd will try to make its outgoing connection from the same port. This can be useful with certain protocols such as RPC.

This option does not guarantee that the same port will be used. For example, if host B had already caused natd to open a connection from port 4000, natd will silently pick a different port number, even though host A wanted to use port 4000.

dynamic

natd will monitor the IP address of the interface it is using, and adjust its internal tables as necessary if the IP address changes. This can be useful if your ISP allocates you addresses from a range, perhaps using DHCP. This allows natd to cope with these changes without needing to be restarted.

punch_fw base:count

Directs natd to “punch holes” in the firewall. Certain protocols, such as IRC DCC and FTP do not support NAT very well, because they encode information about port numbers that are being used in the data between the two communicating hosts. natd has support for some of these protocols allowing it to watch the data and become aware of ports that need to be opened. Enabling this directive allows natd to reconfigure the firewall to allow these extra communication channels through.

natd does this be inserting additional, temporary, rules in to the firewall. These rules will start at the rule number given by base, and no more than count additional rules will be inserted.

A sample firewall/natd configuration

Suppose that your firewall/natd host has two interfaces. The internal interface is called fxp0 and numbered 192.168.1.1/24. The external interface is called fxp1, and has a public IP address, which in these examples we'll call A.B.C.D. A good basic firewall/natd configuration would look something like this:

# Allow all traffic over the loopback address, deny all
# over traffic to 127.0.0.0/8

100 allow all from any to any via lo0
110 deny all from any to 127.0.0.0/8

# Make sure that traffic that appears to be from our
# internal network does not arrive from the external interface.  And 
# make sure that traffic that appears to be from the outside does not
# arrive from the internal interface.

200 deny all from 192.168.1.0/24 to any in via fxp1
210 deny all from A.B.C.D to any in via fxp0

# Block all outgoing traffic that seems to be from a private address 
# (as described in RFC1918) going out on the outside interface.

300 deny all from any to 10.0.0.0/8 via fxp1
...                          # Many of these rules

# Divert all traffic going through the outside interface through natd

400 divert natd all from any to any via fxp1

# Block all incoming traffic that seems to be from a private address
# coming in on the outside interface

500 deny all from 10.0.0.0/8 to any via fxp1

# Allow all established connections, and TCP fragments

600 allow tcp from any to any established
610 allow all from any to any frag

# Allow any incoming services, as necessary 
# (ssh and web in this example)

700 allow tcp from any to me 22 setup
710 allow tcp from any to me 80 setup

# Disallow all other incoming connections from the outside interface
# Allow all other incoming connections

800 deny tcp from any to me in recv fxp1 setup
810 allow tcp from any to any setup

# Allow all UDP, keeping state

900 allow udp from any to any keep-state

This is a fairly simple firewall. It allows all traffic along the internal network, all traffic to the outside world, and places restrictions on what incoming connections are allowed. In addition, there are rules in place to ensure that private (RFC1918) addresses are never accepted on the outside interface, nor are they allowed out to the Internet by the firewall.

Any traffic that is going via the outside interface (fxp1), whether it is incoming or outgoing, is diverted through natd for any processing that might be necessary.

The rules in the 700-799 range cover incoming connections to the host. The sample rules show how you would allow certain incoming TCP/IP protocols.

Allowing access to internal hosts with dynamic NAT

As already mentioned, it is possible to allow limited access to internal hosts while still using dynamic NAT. You can think of this as a halfway house between dynamic NAT and full-blown static NAT. This can be very useful in situations where a limited number of ports on an internal host must be made available to hosts outside your network.

For example, suppose that you have a host on the internal network that has the IP address 192.168.1.250. This is a private address, and must therefore not appear on the wider Internet. You run a web server on this host, and need to be able to access this web server from the Internet. The web server runs on the standard port 80.

natd allows you to specify a port (or ports) on the host running natd. Any connections to these ports will be automatically forwarded on to the host inside the network.

So you might decide to allocate port 8080 on the natd host. From the Internet you want to be able to go to the URL http://natd-host.example.com:8080/, and actually reach the web server running on 192.168.1.250:80.

The natd configuration option for this is redirect_port. The settings for our example would be:

redirect_port tcp 192.168.1.250:80 8080

which redirects all incoming traffic on port 8080 to port 80 on 192.168.1.250.

However, this is not sufficient with the firewall example we have been building up. Simply adding this line to /etc/natd.conf and restarting natd does not suffice. Recall that the firewall has these two rules configured:

# Disallow all other incoming connections from the outside interface
# Allow all other incoming connections

800 deny tcp from any to me in recv fxp1 setup
810 allow tcp from any to any setup

You might think this is sufficient for your redirection to work. However, you must add two additional rules.

The first is straightforward. Remember that we want to redirect incoming traffic on port 8080, so we need to allow that traffic through, with a rule like this.

720 allow tcp from any to A.B.C.D 8080 setup

The second is slightly less intuitive.

Your might be reasoning along the following lines[49].

natd has changed the packet so that it looks like it was originated on the firewall machine. My firewall is configured to allow all traffic from the firewall machine to the internal network. Therefore, these packets altered by natd will be allowed by my existing rules.

Unfortunately, it doesn't work like that. natd only changes the source address of incoming packets. It does not change any other characteristics of the packet, including the interface the packet was originally received on. Rule 800 denies any packets that have been received by the outside interface (fxp1) and that are trying to set up a new connection. natd has rewritten the packet's source address, but the original interface remains the same. So the firewall configuration blocks the packet.

In order for this to work as expected you will need to add this rule:

730 allow tcp from me to 192.168.1.250 80 setup

which allows the rewritten packets through the firewall.

Static NAT

Static NAT is useful where your ISP has allocated a block of IP addresses to you, you have services running on several machines that you want to make available to the Internet, but where you do not want those hosts directly connected to the Internet.

Suppose that your ISP has allocated A.B.C.1, A.B.C.2, and A.B.C.3 to you. You give your firewall the .1 address, and want two other machines on your internal network to appear to have the .2 and .3 addresses. However, in reality, those two hosts will have internal addresses, 192.168.1.250, and 192.168.1.251.

There are three steps to take to enable this:

  1. Configure the natd host with all three IP addresses.

    If the outside interface is fxp1 then you need to configure the interface with all three IP addresses.

    # ifconfig fxp1 inet A.B.C.1 netmask ...
    # ifconfig fxp1 inet A.B.C.2 netmask 0xffffffff alias
    # ifconfig fxp1 inet A.B.C.3 netmask 0xffffffff alias

    Add these entries to /etc/rc.conf as necessary.

  2. Configure natd to redirect incoming packets to A.B.D.{2,3} to the IP addresses of the internal hosts. The configuration option for this is:

    redirect_address privateIP publicIP

    So for this example you would have:

    redirect_address 192.168.1.250 A.B.C.2
    redirect_address 192.168.1.252 A.B.C.3
  3. Configure the firewall to allow the redirected connections.

Configuring NAT with IPFilter and IPNat

Handling Name Server Lookups

Handling domain name server lookups from the hosts on the LAN with IP masquerading has always presented a problem. There are two ways of accomodating DNS in a masquerade environment. You can tell each of the hosts that they use the same DNS that the FreeBSD router machine does, and let IP masquerade do its magic on their DNS requests. Alternatively, you can run a caching name server on the FreeBSD machine and have each of the hosts on the LAN use the FreeBSD machine as their DNS. Although a more aggressive action, this is probably the better option because it reduces the volume of DNS traffic travelling on the Internet link and will be marginally faster for most requests, since they'll be served from the cache. The downside to this configuration is that it is more complex. the section called “Caching-only named Configuration”, in Chapter 6, describes how to configure a caching name server.

More About Network Address Translation

The netfilter software is capable of many different types of Network Address Translation. IP Masquerade is one simple application of it.

It is possible, for example, to build NAT rules that translate only certain addresses or ranges of addresses and leave all others untouched, or to translate addresses into pools of addresses rather than just a single address, as masquerade does. You can in fact use the iptables command to generate NAT rules that map just about anything, with combinations of matches using any of the standard attributes, such as source address, destination address, protocol type, port number, etc.

Translating the Source Address of a datagram is referred to as Source NAT, or SNAT, in the netfilter documentation. Translating the Destination Address of a datagram is known as Destination NAT, or DNAT. Translating the TCP or UDP port is known by the term REDIRECT. SNAT, DNAT, and REDIRECT are targets that you may use with the iptables command to build more complex and sophisticated rules.

The topic of Network Address Translation and its uses warrants at least a whole chapter of its own.[50] Unfortunately we don't have the space in this book to cover it in any greater depth. You should read the IPTABLES-HOWTO for more information, if you're interested in discovering more about how you might use Network Address Translation.



[49] I know I was when I first started looking at this.

[50] ... and perhaps even a whole book!

After successfully setting up IP and the resolver, you then must look at the services you want to provide over the network. This chapter covers the configuration of a few simple network applications, including the inetd server and the programs from the rlogin family. We'll also deal briefly with the Remote Procedure Call interface, upon which services like the Network File System (NFS) and the Network Information System (NIS) are based. The configuration of NFS and NIS, however, is more complex and are described in separate chapters, as are electronic mail and network news.

Of course, we can't cover all network applications in this book. If you want to install one that's not discussed here, like talk, gopher, or http, please refer to the manual pages of the server for details.

The inetd Super Server

Programs that provide application services via the network are called network daemons. A daemon is a program that opens a port, most commonly a well-known service port, and waits for incoming connections on it. If one occurs, the daemon creates a child process that accepts the connection, while the parent continues to listen for further requests. This mechanism works well, but has a few disadvantages; at least one instance of every possible service you wish to provide must be active in memory at all times. In addition, the software routines that do the listening and port handling must be replicated in every network daemon.

To overcome these inefficiencies, most Unix installations run a special network daemon, what you might consider a super server. This daemon creates sockets on behalf of a number of services and listens on all of them simultaneously. When an incoming connection is received on any of these sockets, the super server accepts the connection and spawns the server specified for this port, passing the socket across to the child to manage. The server then returns to listening.

The most common super server is called inetd, the Internet Daemon. It is started at system boot time and takes the list of services it is to manage from a startup file named /etc/inetd.conf. In addition to those servers, there are a number of trivial services performed by inetd itself called internal services. They include chargen, which simply generates a string of characters, and daytime, which returns the system's idea of the time of day.

An entry in this file consists of a single line made up of the following fields:

service type protocol wait user server cmdline

Each of the fields is described in the following list:

service

Gives the service name. The service name has to be translated to a port number by looking it up in the /etc/services file. This file will be described later in this chapter in the section the section called “The Services and Protocols Files”.

type

Specifies a socket type, either stream (for connection-oriented protocols) or dgram (for datagram protocols). TCP-based services should therefore always use stream, while UDP-based services should always use dgram.

protocol

Names the transport protocol used by the service. This must be a valid protocol name found in the protocols file, explained later.

wait

This option applies only to dgram sockets. It can be either wait or nowait. If wait is specified, inetd executes only one server for the specified port at any time. Otherwise, it immediately continues to listen on the port after executing the server.

This is useful for single-threaded servers that read all incoming datagrams until no more arrive, and then exit. Most RPC servers are of this type and should therefore specify wait. The opposite type, multi-threaded servers, allow an unlimited number of instances to run concurrently. These servers should specify nowait.

stream sockets should always use nowait.

user

This is the login ID of the user who will own the process when it is executing. This will frequently be the root user, but some services may use different accounts. It is a very good idea to apply the principle of least privilege here, which states that you shouldn't run a command under a privileged account if the program doesn't require this for proper functioning. For example, the NNTP news server runs as news, while services that may pose a security risk (such as tftp or finger) are often run as nobody.

server

Gives the full pathname of the server program to be executed. Internal services are marked by the keyword internal.

cmdline

This is the command line to be passed to the server. It starts with the name of the server to be executed and can include any arguments that need to be passed to it. If you are using the TCP wrapper, you specify the full pathname to the server here. If not, then you just specify the server name as you'd like it to appear in a process list. We'll talk about the TCP wrapper shortly.

This field is empty for internal services.

A sample inetd.conf file is shown in Example 12.1.. The finger service is commented out so that it is not available. This is often done for security reasons, because it can be used by attackers to obtain names and other details of users on your system.

Example 12.1. A Sample /etc/inetd.conf File

# 
	# inetd services
	ftp      stream tcp nowait root  /usr/sbin/ftpd    in.ftpd -l
	telnet   stream tcp nowait root  /usr/sbin/telnetd in.telnetd -b/etc/issue
	#finger  stream tcp nowait bin   /usr/sbin/fingerd in.fingerd
	#tftp    dgram  udp wait  nobody /usr/sbin/tftpd   in.tftpd
	#tftp    dgram  udp wait  nobody /usr/sbin/tftpd   in.tftpd /boot/diskless
	#login   stream tcp nowait root  /usr/sbin/rlogind in.rlogind
	#shell   stream tcp nowait root  /usr/sbin/rshd    in.rshd
	#exec    stream tcp nowait root  /usr/sbin/rexecd  in.rexecd
	#
	#       inetd internal services
	#
	daytime  stream tcp nowait root internal
	daytime  dgram  udp nowait root internal
	time     stream tcp nowait root internal
	time     dgram  udp nowait root internal
	echo     stream tcp nowait root internal
	echo     dgram  udp nowait root internal
	discard  stream tcp nowait root internal
	discard  dgram  udp nowait root internal
	chargen  stream tcp nowait root internal
	chargen  dgram  udp nowait root internal

The tftp daemon is shown commented out as well. tftp implements the Trivial File Transfer Protocol (TFTP), which allows someone to transfer any world-readable files from your system without password checking. This is especially harmful with the /etc/passwd file, and even more so when you don't use shadow passwords.

TFTP is commonly used by diskless clients and Xterminals to download their code from a boot server. If you need to run tftpd for this reason, make sure to limit its scope to those directories from which clients will retrieve files; you will need to add those directory names to tftpd's command line. This is shown in the second tftp line in the example.

The tcpd Access Control Facility

Since opening a computer to network access involves many security risks, applications are designed to guard against several types of attacks. Some security features, however, may be flawed (most drastically demonstrated by the RTM Internet worm, which exploited a hole in a number of programs, including old versions of the sendmail mail daemon), or do not distinguish between secure hosts from which requests for a particular service will be accepted and insecure hosts whose requests should be rejected. We've already briefly discussed the finger and tftp services. Network Administrator would want to limit access to these services to trusted hosts only, which is impossible with the usual setup, for which inetd provides this service either to all clients or not at all.

A useful tool for managing host-specific access is tcpd, often called the daemon wrapper.[51] For TCP services you want to monitor or protect, it is invoked instead of the server program. tcpd checks if the remote host is allowed to use that service, and only if this succeeds will it execute the real server program. tcpd also logs the request to the syslog daemon. Note that this does not work with UDP-based services.

For example, to wrap the finger daemon, you have to change the corresponding line in inetd.conf from this:

# unwrapped finger daemon
	finger    stream tcp nowait bin    /usr/sbin/fingerd in.fingerd

to this:

# wrap finger daemon
	finger  stream  tcp     nowait  root    /usr/sbin/tcpd   in.fingerd

Without adding any access control, this will appear to the client as the usual finger setup, except that any requests are logged to syslog's auth facility.

Two files called /etc/hosts.allow and /etc/hosts.deny implement access control. They contain entries that allow and deny access to certain services and hosts. When tcpd handles a request for a service such as finger from a client host named biff.foobar.com, it scans hosts.allow and hosts.deny (in this order) for an entry matching both the service and client host. If a matching entry is found in hosts.allow, access is granted and tcpd doesn't consult the hosts.deny file. If no match is found in the hosts.allow file, but a match is found in hosts.deny, the request is rejected by closing down the connection. The request is accepted if no match is found at all.

Entries in the access files look like this:

servicelist: hostlist [:shellcmd]

servicelist is a list of service names from /etc/services, or the keyword ALL. To match all services except finger and tftp, use ALL EXCEPT finger, tftp.

hostlist is a list of hostnames, IP addresses, or the keywords ALL, LOCAL, UNKNOWN or PARANOID. ALL matches any host, while LOCAL matches hostnames that don't contain a dot.[52] UNKNOWN matches any hosts whose name or address lookup failed. PARANOID matches any host whose hostname does not resolve back to its IP address.[53] A name starting with a dot matches all hosts whose domain is equal to this name. For example, .foobar.com matches biff.foobar.com, but not nurks.fredsville.com. A pattern that ends with a dot matches any host whose IP address begins with the supplied pattern, so 172.16. matches 172.16.32.0, but not 172.15.9.1. A pattern of the form n.n.n.n/m.m.m.m is treated as an IP address and network mask, so we could specify our previous example as 172.16.0.0/255.255.0.0 instead. Lastly, any pattern beginning with a / character allows you to specify a file that is presumed to contain a list of hostname or IP address patterns, any of which are allowed to match. So a pattern that looked like /var/access/trustedhosts would cause the tcpd daemon to read that file, testing if any of the lines in it matched the connecting host.

To deny access to the finger and tftp services to all but the local hosts, put the following in /etc/hosts.deny and leave /etc/hosts.allow empty:

in.tftpd, in.fingerd: ALL EXCEPT LOCAL, .your.domain

The optional shellcmd field may contain a shell command to be invoked when the entry is matched. This is useful to set up traps that may expose potential attackers. The following example creates a log file listing the user and host connecting, and if the host is not vlager.vbrew.com it will append the output of a finger to that host:

in.ftpd: ALL EXCEPT LOCAL, .vbrew.com : \
	echo "request from %d@%h: >> /var/log/finger.log; \
	if [ %h != "vlager.vbrew.com:" ]; then \ 
	finger -l @%h >> /var/log/finger.log \
	fi

The %h and %d arguments are expanded by tcpd to the client hostname and service name, respectively. Please refer to the hosts_access(5) manual page for details.

The Services and Protocols Files

The port numbers on which certain standard services are offered are defined in the Assigned Numbers RFC. To enable server and client programs to convert service names to these numbers, at least part of the list is kept on each host; it is stored in a file called /etc/services. An entry is made up like this:

service port/protocol   [aliases]

Here, service specifies the service name, port defines the port the service is offered on, and protocol defines which transport protocol is used. Commonly, the latter field is either udp or tcp. It is possible for a service to be offered for more than one protocol, as well as offering different services on the same port as long as the protocols are different. The aliases field allows you to specify alternative names for the same service.

Usually, you don't have to change the services file that comes along with the network software on your Linux system. Nevertheless, we give a small excerpt from that file in Example 12.2..

Example 12.2. A Sample /etc/services File

# The services file:
	#
	# well-known services
	echo           7/tcp                 # Echo
	echo           7/udp                 #
	discard        9/tcp  sink null      # Discard
	discard        9/udp  sink null      #
	daytime       13/tcp                 # Daytime
	daytime       13/udp                 #
	chargen       19/tcp  ttytst source  # Character Generator
	chargen       19/udp  ttytst source  #
	ftp-data      20/tcp                 # File Transfer Protocol (Data)
	ftp           21/tcp                 # File Transfer Protocol (Control)
	telnet        23/tcp                 # Virtual Terminal Protocol
	smtp          25/tcp                 # Simple Mail Transfer Protocol
	nntp         119/tcp  readnews       # Network News Transfer Protocol
	#
	# UNIX services
	exec         512/tcp                 # BSD rexecd
	biff         512/udp  comsat         # mail notification
	login        513/tcp                 # remote login
	who          513/udp  whod           # remote who and uptime
	shell        514/tcp  cmd            # remote command, no passwd used
	syslog       514/udp                 # remote system logging
	printer      515/tcp  spooler        # remote print spooling
	route        520/udp  router routed  # routing information protocol

Note that the echo service is offered on port 7 for both TCP and UDP, and that port 512 is used for two different services: remote execution (rexec) using TCP, and the COMSAT daemon, which notifies users of new mail, over UDP (see xbiff(1x)).

Like the services file, the networking library needs a way to translate protocol namesfor example, those used in the services fileto protocol numbers understood by the IP layer on other hosts. This is done by looking up the name in the /etc/protocols file. It contains one entry per line, each containing a protocol name, and the associated number. Having to touch this file is even more unlikely than having to meddle with /etc/services. A sample file is given in Example 12.3..

Example 12.3. A Sample /etc/protocols File

#
	# Internet (IP) protocols
	#
	ip      0       IP              # internet protocol, pseudo protocol number
	icmp    1       ICMP            # internet control message protocol
	igmp    2       IGMP            # internet group multicast protocol
	tcp     6       TCP             # transmission control protocol
	udp     17      UDP             # user datagram protocol
	raw     255     RAW             # RAW IP interface

Remote Procedure Call

The general mechanism for client-server applications is provided by the Remote Procedure Call (RPC) package. RPC was developed by Sun Microsystems and is a collection of tools and library functions. Important applications built on top of RPC are NIS, the Network Information System (described in Chapter 13., The Network Information System), and NFS, the Network File System (described in Chapter 14., The NetworkFile System), which are both described in this book.

An RPC server consists of a collection of procedures that a client can call by sending an RPC request to the server along with the procedure parameters. The server will invoke the indicated procedure on behalf of the client, handing back the return value, if there is any. In order to be machine-independent, all data exchanged between client and server is converted to the External Data Representation format (XDR) by the sender, and converted back to the machine-local representation by the receiver. RPC relies on standard UDP and TCP sockets to transport the XDR formatted data to the remote host. Sun has graciously placed RPC in the public domain; it is described in a series of RFCs.

Sometimes improvements to an RPC application introduce incompatible changes in the procedure call interface. Of course, simply changing the server would crash all applications that still expect the original behavior. Therefore, RPC programs have version numbers assigned to them, usually starting with 1, and with each new version of the RPC interface, this counter will be bumped up. Often, a server may offer several versions simultaneously; clients then indicate by the version number in their requests which implementation of the service they want to use.

The communication between RPC servers and clients is somewhat peculiar. An RPC server offers one or more collections of procedures; each set is called a program and is uniquely identified by a program number. A list that maps service names to program numbers is usually kept in /etc/rpc, an excerpt of which is shown in Example 12.4..

Example 12.4. A Sample /etc/rpc File

#
	# /etc/rpc - miscellaneous RPC-based services
	#
	portmapper      100000  portmap sunrpc
	rstatd          100001  rstat rstat_svc rup perfmeter
	rusersd         100002  rusers
	nfs             100003  nfsprog
	ypserv          100004  ypprog
	mountd          100005  mount showmount
	ypbind          100007
	walld           100008  rwall shutdown
	yppasswdd       100009  yppasswd
	bootparam       100026
	ypupdated       100028  ypupdate

In TCP/IP networks, the authors of RPC faced the problem of mapping program numbers to generic network services. They designed each server to provide both a TCP and a UDP port for each program and each version. Generally, RPC applications use UDP when sending data, and fall back to TCP only when the data to be transferred doesn't fit into a single UDP datagram.

Of course, client programs need to find out to which port a program number maps. Using a configuration file for this would be too unflexible; since RPC applications don't use reserved ports, there's no guarantee that a port originally meant to be used by our database application hasn't been taken by some other process. Therefore, RPC applications pick any port they can get and register it with a special program called the portmapper daemon. The portmapper acts as a service broker for all RPC servers running on its machine. A client that wishes to contact a service with a given program number first queries the portmapper on the server's host, which returns the TCP and UDP port numbers the service can be reached at.

This method introduces a single point of failure, much like the inetd daemon does for the standard Berkeley services. However, this case is even a little worse because when the portmapper dies, all RPC port information is lost; this usually means you have to restart all RPC servers manually or reboot the entire machine.

On Linux, the portmapper is called /sbin/portmap, or sometimes /usr/sbin/rpc.portmap. Other than making sure it is started from your network boot scripts, the portmapper doesn't require any configuration.

Configuring Remote Loginand Execution

It's often very useful to execute a command on a remote host and have input or output from that command be read from, or written to, a network connection.

The traditional commands used for executing commands on remote hosts are rlogin, rsh and rcp. We saw an example of the rlogin command in Chapter 1., Introduction to Networking in the section the section called “Introduction to TCP/IP Networks”. We briefly discussed the security issues associated with it in the section called “System Security” and suggested ssh as a replacement. The ssh package provides replacements called slogin, ssh, and scp.

Each of these commands spawns a shell on the remote host and allows the user to execute commands. Of course, the client needs to have an account on the remote host where the command is to be executed. Thus, all these commands use an authentication process. The r commands use a simple username and password exchange between the hosts with no encryption, so anyone listening could easily intercept the passwords. The ssh command suite provides a higher level of security: it uses a technique called Public Key Cryptography, which provides authentication and encryption between the hosts to ensure that neither passwords nor session data are easily intercepted by other hosts.

It is possible to relax authentication checks for certain users even further. For instance, if you frequently have to log into other machines on your LAN, you might want to be admitted without having to type your password every time. This was always possible with the r commands, but the ssh suite allows you to do this a little more easily. It's still not a great idea because it means that if an account on one machine is breached, access can be gained to all other accounts that user has configured for password-less login, but it is very convenient and people will use it.

Let's talk about removing the r commands and getting ssh to work instead.

Disabling the r; Commands

Start by removing the r commands if they're installed. The easiest way to disable the old r commands is to comment out (or remove) their entries in the /etc/inetd.conf file. The relevant entries will look something like this:

# Shell, login, exec and talk are BSD protocols.
	  shell    stream  tcp     nowait  root  /usr/sbin/tcpd /usr/sbin/in.rshd
	  login    stream  tcp     nowait  root  /usr/sbin/tcpd /usr/sbin/in.rlogind
	  exec     stream  tcp     nowait  root  /usr/sbin/tcpd /usr/sbin/in.rexecd

You can comment them by placing a # character at the start of each line, or delete the lines completely. Remember, you need to restart the inetd daemon for this change to take effect. Ideally, you should remove the daemon programs themselves, too.

Installing and Configuring ssh

OpenSSH is a free version of the ssh suite of programs; the Linux port can be found at http://violet.ibs.com.au/openssh/ and in most modern Linux distributions.[54] We won't describe compilation here; good instructions are included in the source. If you can install it from a precompiled package, then it's probably wise to do so.

There are two parts to an ssh session. There is an ssh client that you need to configure and run on the local host and an ssh daemon that must be running on the remote host.

The ssh daemon

The sshd daemon is the program that listens for network connections from ssh clients, manages authentication, and executes the requested command. It has one main configuration file called /etc/ssh/sshd_config and a special file containing a key used by the authentication and encryption processes to represent the host end. Each host and each client has its own key.

A utility called ssh-keygen is supplied to generate a random key. This is usually used once at installation time to generate the host key, which the system administrator usually stores in a file called /etc/ssh/ssh_host_key. Keys can be of any length of 512 bits or greater. By default, ssh-keygen generates keys of 1024 bits in length, and most people use the default. To generate a random key, you would invoke the ssh-keygen command like this:

# ssh-keygen -f /etc/ssh/ssh_host_key

You will be prompted to enter a passphrase. However, host keys must not use a passphrase, so just press the return key to leave it blank. The program output will look something like:

Generating RSA keys:  ......oooooO...............................oooooO
	    Key generation complete.
	    Enter passphrase (empty for no passphrase): 
	    Enter same passphrase again: 
	    Your identification has been saved in /etc/ssh/ssh_host_key
	    Your public key has been saved in /etc/ssh/ssh_host_key.pub
	    The key fingerprint is:
	    1024 3a:14:78:8e:5a:a3:6b:bc:b0:69:10:23:b7:d8:56:82 root@moria

You will find at the end that two files have been created. The first is called the private key, which must be kept secret and will be in /etc/ssh/ssh_host_key. The second is called the public key and is one that you can share; it will be in /etc/ssh/ssh_host_key.pub.

Armed with the keys for ssh communication, you need to create a configuration file. The ssh suite is very powerful and the configuration file may contain many options. We'll present a simple example to get you started; you should refer to the ssh documentation to enable other features. The following code shows a safe and minimal sshd configuration file. The rest of the configuration options are detailed in the sshd(8) manpage:

# /etc/ssh/sshd_config
	  #
	  
	  # The IP adddresses to listen for connections on. 0.0.0.0 means all
	  # local addresses.
	  ListenAddress 0.0.0.0
	  
	  # The TCP port to listen for connections on. The default is 22.
	  Port 22
	  
	  # The name of the host key file.
	  HostKey /etc/ssh/ssh_host_key
	  
	  # The length of the key in bits.
	  ServerKeyBits 1024
	  
	  # Should we allow root logins via ssh?
	  PermitRootLogin no
	  
	  # Should the ssh daemon check users' home directory and files permissions?
	  # are safe before allowing login?
	  StrictModes yes
	  
	  # Should we allow old ~/.rhosts and /etc/hosts.equiv authentication method?
	  RhostsAuthentication no
	  # Should we allow pure RSA authentication?
	  RSAAuthentication yes
	  # Should we allow password authentication?
	  PasswordAuthentication yes
	  
	  # Should we allow /etc/hosts.equiv combined with RSA host authentication?
	  RhostsRSAAuthentication no
	  # Should we ignore ~/.rhosts files?
	  IgnoreRhosts yes
	  
	  # Should we allow logins to accounts with empty passwords?
	  PermitEmptyPasswords no

It's important to make sure the permissions of the configuration files are correct to ensure that system security is maintained. Use the following commands:

# chown -R root:root /etc/ssh
	    # chmod 755 /etc/ssh
	    # chmod 600 /etc/ssh/ssh_host_key
	    # chmod 644 /etc/ssh/ssh_host_key.pub
	    # chmod 644 /etc/ssh/sshd_config

The final stage of sshd administration daemon is to run it. Normally you'd create an rc file for it or add it to an existing one, so that it is automatically executed at boot time. The daemon runs standalone and doesn't require any entry in the /etc/inetd.conf file. The daemon must be run as the root user. The syntax is very simple:

/usr/sbin/sshd

The sshd daemon will automatically place itself into the background when being run. You are now ready to accept ssh connections.

The ssh client

There are a number of ssh client programs: slogin, scp and ssh. They each read the same configuration file, usually called /etc/ssh/ssh_config. They each also read configuration files from the .ssh directory in the home directory of the user executing them. The most important of these files is the .ssh/config file, which may contain options that override those specified in the /etc/ssh/ssh_config file, the .ssh/identity file, which contains the user's own private key, and the corresponding .ssh/identity.pub file, containing the user's public key. Other important files are .ssh/known_hosts and .ssh/authorized_keys; we'll talk about those later in the section called “Using ssh”. First, let's create the global configuration file and the user key file.

/etc/ssh/ssh_config is very similar to the server configuration file. Again, there are lots of features you can configure, but a minimal configuration looks like that presented in Example 12.5.. The rest of the configuration options are detailed in the sshd(8) manpage. You can add sections that match specific hosts or groups of hosts. The parameter to the Host statement may be either the full name of a host or a wildcard specification, as we've used in our example, to match all hosts. We could create an entry that used, for example, Host *.vbrew.com to match any host in the vbrew.com domain.

Example 12.5. Example ssh Client Configuration File

# /etc/ssh/ssh_config
	    
	    # Default options to use when connecting to a remote host
	    Host *
	    # Compress the session data?
	    Compression yes
	    # .. using which compression level? (1 - fast/poor, 9 - slow/good)
	    CompressionLevel 6
	    
	    # Fall back to rsh if the secure connection fails?
	    FallBackToRsh no
	    
	    # Should we send keep-alive messages? Useful if you use IP masquerade
	    KeepAlive yes
	    
	    # Try RSA authentication?
	    RSAAuthentication yes
	    # Try RSA authentication in combination with .rhosts authentication?
	    RhostsRSAAuthentication yes

We mentioned in the server configuration section that every host and user has a key. The user's key is stored in his or her ~/.ssh/indentity file. To generate the key, use the same ssh-keygen command as we used to generate the host key, except this time you do not need to specify the name of the file in which you save the key. The ssh-keygen defaults to the correct location, but it prompts you to enter a filename in case you'd like to save it elsewhere. It is sometimes useful to have multiple identity files, so ssh allows this. Just as before, ssh-keygen will prompt you to entry a passphrase. Passphrases add yet another level of security and are a good idea. Your passphrase won't be echoed on the screen when you type it.

Warning

There is no way to recover a passphrase if you forget it. Make sure it is something you will remember, but as with all passwords, make it something that isn't obvious, like a proper noun or your name. For a passphrase to be truly effective, it should be between 10 and 30 characters long and not be plain English prose. Try to throw in some unusual characters. If you forget your passphrase, you will be forced to generate a new key.

You should ask each of your users to run the ssh-keygen command just once to ensure their key file is created correctly. The ssh-keygen will create their ~/.ssh/ directories for them with appropriate permissions and create their private and public keys in .ssh/identity and .ssh/identity.pub, respectively. A sample session should look like:

$ ssh-keygen
	    Generating RSA keys:  .......oooooO..............................
	    Key generation complete.
	    Enter file in which to save the key (/home/maggie/.ssh/identity): 
	    Enter passphrase (empty for no passphrase): 
	    Enter same passphrase again: 
	    Your identification has been saved in /home/maggie/.ssh/identity.
	    Your public key has been saved in /home/maggie/.ssh/identity.pub.
	    The key fingerprint is:
	    1024 85:49:53:f4:8a:d6:d9:05:d0:1f:23:c4:d7:2a:11:67 maggie@moria
	    $

Now ssh is ready to run.

Using ssh

We should now have the ssh command and it's associated programs installed and ready to run. Let's now take a quick look at how to run them.

First, we'll try a remote login to a host. We can use the slogin program in much the same way as we used the rlogin program in our example earlier in the book. The first time you attempt a connection to a host, the ssh client will retrieve the public key of the host and ask you to confirm its identity by prompting you with a shortened version of the public key called a fingerprint.

The administrator at the remote host should have supplied you in advance with its public key fingerprint, which you should add to your .ssh/known_hosts file. If the remote administrator has not supplied you the appropriate key, you can connect to the remote host, but ssh will warn you that it does have a key and prompt you whether you wish to accept the one offered by the remote host. Assuming that you're sure no one is engaging in DNS spoofing and you are in fact talking to the correct host, answer yes to the prompt. The relevant key is then stored automatically in your .ssh/known_hosts and you will not be prompted for it again. If, on a future connection attempt, the public key retrieved from that host does not match the one that is stored, you will be warned, because this represents a potential security breach.

A first-time login to a remote host will look something like:

$ slogin vchianti.vbrew.com
	    The authenticity of host 'vchianti.vbrew.com' can't be established.
	    Key fingerprint is 1024 7b:d4:a8:28:c5:19:52:53:3a:fe:8d:95:dd:14:93:f5.
	    Are you sure you want to continue connecting (yes/no)? yes
	    Warning: Permanently added 'vchianti.vbrew.com,172.16.2.3' to the list of/
	    known hosts.
	    maggie@vchianti.vbrew.com's password: 
	    Last login: Tue Feb  1 23:28:58 2000 from vstout.vbrew.com
	    $

You will be prompted for a password, which you should answer with the password belonging to the remote account, not the local one. This password is not echoed when you type it.

Without any special arguments, slogin will attempt to log in with the same userid used on the local machine. You can override this using the -l argument, supplying an alternate login name on the remote host. This is what we did in our example earlier in the book.

We can copy files to and from the remote host using the scp program. Its syntax is similar to the conventional cp with the exception that you may specify a hostname before a filename, meaning that the file path is on the specified host. The following example illustrates scp syntax by copying a local file called /tmp/fred to the /home/maggie/ of the remote host chianti.vbrew.com:

$ scp /tmp/fred vchianti.vbrew.com:/home/maggie/
	    maggie@vchianti.vbrew.com's password:
	    fred                 100% |*****************************| 50165   00:01 ETA

Again, you'll be prompted for a password. The scp command displays useful progress messages by default. You can copy a file from a remote host with the same ease; simply specify its hostname and filepath as the source and the local path as the destination. It's even possible to copy a file from a remote host to some other remote host, but it is something you wouldn't normally want to do, because all of the data travels via your host.

You can execute commands on remote hosts using the ssh command. Again, its syntax is very simple. Let's have our user maggie retrieve the root directory of the remote host vchianti.vbrew.com. She'd do this with:

$ ssh vchianti.vbrew.com ls -CF /
	    maggie@vchianti.vbrew.com's password:
	    bin/    console@  dos/     home/    lost+found/  pub@   tmp/  vmlinuz@
	    boot/   dev/      etc/     initrd/  mnt/         root/  usr/  vmlinuz.old@
	    cdrom/  disk/     floppy/  lib/     proc/        sbin/  var/

You can place ssh in a command pipeline and pipe program input/output to or from it just like any other command, except that the input or output is directed to or from the remote host via the ssh connection. Here is an example of how you might use this capability in combination with the tar command to copy a whole directory with subdirectories and files from a remote host to the local host:

$ ssh vchianti.vbrew.com "tar cf - /etc/" | tar xvf -
	    maggie@vchianti.vbrew.com's password:
	    etc/GNUstep
	    etc/Muttrc
	    etc/Net
	    etc/X11
	    etc/adduser.conf
	    ..
	    ..

Here we surrounded the command we will execute with quotation marks to make it clear what is passed as an argument to ssh and what is used by the local shell. This command executes the tar command on the remote host to archive the /etc/ directory and write the output to standard output. We've piped to an instance of the tar command running on our local host in extract mode reading from standard input.

Again, we were prompted for the password. Now you can see why we encouraged you to configure ssh so that it doesn't prompt you for passwords all the time! Let's now configure our local ssh client so that it won't prompt for a password when connecting to the vchianti.vbrew.com host. We mentioned the .ssh/authorized_keys file earlier; this is where it is used. The .ssh/authorized_keys file contains the public keys on any remote user accounts that we wish to automatically log in to. You can set up automatic logins by copying the contents of the .ssh/identity.pub from the remote account into our local .ssh/authorized_keys file. It is vital that the file permissions of .ssh/authorized_keys allow only that you read and write it; anyone may steal and use the keys to log in to that remote account. To ensure the permissions are correct, change .ssh/authorized_keys, as shown:

$ chmod 600 ~/.ssh/authorized_keys

The public keys are a long single line of plain text. If you use copy and paste to duplicate the key into your local file, be sure to remove any end of line characters that might have been introduced along the way. The .ssh/authorized_keys file may contain many such keys, each on a line of its own.

The ssh suite of tools is very powerful and there are many other useful features and options that you will be interested in exploring. Please refer to the manual pages and other documentation that is supplied with the package for more information.



[51] Written by Wietse Venema, wietse@wzv.win.tue.nl.

[52] Usually only local hostnames obtained from lookups in /etc/hosts contain no dots.

[53] While its name suggests it is an extreme measure, the PARANOID keyword is a good default, as it protects you against mailicious hosts pretending to be someone they are not. Not all tcpd are supplied with PARANOID compiled in; if yours is not, you need to recompile tcpd to use it.

[54] OpenSSH was developed by the OpenBSD project and is a fine example of the benefit of free software.

When you're running a local area network, your overall goal is usually to provide an environment for your users that makes the network transparent. An important stepping stone is keeping vital data such as user account information synchronized among all hosts. This provides users with the freedom to move from machine to machine without the inconvenience of having to remember different passwords and copy data from one machine to another. Data that is centrally stored doesn't need to be replicated, so long as there is some convenient means of accessing it from a network-connected host. By storing important administrative information centrally, you can make ensure consistency of that data, increase flexibility for the users by allowing them to move from host to host in a transparent way, and make the system administrator's life much easier by maintaining a single copy of information to maintain when required.

We previously discussed an important example of this concept that is used on the Internetthe Domain Name System (DNS). DNS serves a limited range of information, the most important being the mapping between hostname and IP address. For other types of information, there is no such specialized service. Moreover, if you manage only a small LAN with no Internet connectivity, setting up DNS may not seem to be worth the trouble.

This is why Sun developed the Network Information System (NIS). NIS provides generic database access facilities that can be used to distribute, for example, information contained in the passwd and groups files to all hosts on your network. This makes the network appear as a single system, with the same accounts on all hosts. Similarly, you can use NIS to distribute the hostname information from /etc/hosts to all machines on the network.

NIS is based on RPC, and comprises a server, a client-side library, and several administrative tools. Originally, NIS was called Yellow Pages, or YP, which is still used to refer to it. Unfortunately, the name is a trademark of British Telecom, which required Sun to drop that name. As things go, some names stick with people, and so YP lives on as a prefix to the names of most NIS-related commands such as ypserv and ypbind.

Today, NIS is available for virtually all Unixes, and there are even free implementations. BSD Net-2 released one that has been derived from a public domain reference implementation donated by Sun. The library client code from this release had been in the Linux libc for a long time, and the administrative programs were ported to Linux by Swen Thmmler.[55] An NIS server is missing from the reference implementation, though.

Peter Eriksson developed a new implementation called NYS.[56] It supports both plain NIS and Sun's much enhanced NIS+. NYS not only provides a set of NIS tools and a server, but also adds a whole new set of library functions that need to be compiled into your libc if you wish to use it. This includes a new configuration scheme for hostname resolution that replaces the current scheme using host.conf.

The GNU libc, known as libc6 in the Linux community, includes an updated version of the traditional NIS support developed by Thorsten Kukuk.[57] It supports all of the library functions that NYS provided and also uses the enhanced configuration scheme of NYS. You still need the tools and server, but using GNU libc saves you the trouble of having to meddle with patching and recompiling the library.

This chapter focuses on the NIS support included in the GNU libc rather than the other two packages. If you do want to run any of these packages, the instructions in this chapter may or may not be enough. For additional information, refer to the NIS-HOWTO or a book such as Managing NFS and NIS by Hal Stern (O'Reilly).

Getting Acquainted with NIS

NIS keeps database information in files called maps, which contain key-value pairs. An example of a key-value pair is a user's login name and the encrypted form of their login password. Maps are stored on a central host running the NIS server, from which clients may retrieve the information through various RPC calls. Quite frequently, maps are stored in DBM files.[58]

The maps themselves are usually generated from master text files such as /etc/hosts or /etc/passwd. For some files, several maps are created, one for each search key type. For instance, you may search the hosts file for a hostname as well as for an IP address. Accordingly, two NIS maps are derived from it, called hosts.byname and hosts.byaddr. Table 13.1. lists common maps and the files from which they are generated.

Table 13.1. Some Standard NIS Maps and Corresponding Files

Master FileMap(s)Description
/etc/hosts

hosts.byname, hosts.byaddr

Maps IP addresses to host names

/etc/networks

networks.byname, networks.byaddr

Maps IP network addresses to network names

/etc/passwd

passwd.byname, passwd.byuid

Maps encrypted passwords to user login names

/etc/group

group.byname, group.bygid

Maps Group IDs to group names

/etc/services

services.byname, services.bynumber

Maps service descriptions to service names
/etc/rpc

rpc.byname, rpc.bynumber

Maps Sun RPC service numbers to RPC service names

/etc/protocols

protocols.byname, protocols.bynumber

Maps protocol numbers to protocol names

/usr/lib/aliasesmail.aliases

Maps mail aliases to mail alias names

You may find support for other files and maps in other NIS packages. These usually contain information for applications not discussed in this book, such as the bootparams map that is used by Sun's bootparamd server.

For some maps, people commonly use nicknames, which are shorter and therefore easier to type. Note that these nicknames are understood only by ypcat and ypmatch, two tools for checking your NIS configuration. To obtain a full list of nicknames understood by these tools, run the following command:

$ ypcat -x
	Use "passwd" for "passwd.byname"
	Use "group" for "group.byname"
	Use "networks" for "networks.byaddr"
	Use "hosts" for "hosts.byaddr"
	Use "protocols" for "protocols.bynumber"
	Use "services" for "services.byname"
	Use "aliases" for "mail.aliases"
	Use "ethers" for "ethers.byname"

The NIS server program is traditionally called ypserv. For an average network, a single server usually suffices; large networks may choose to run several of these on different machines and different segments of the network to relieve the load on the server machines and routers. These servers are synchronized by making one of them the master server, and the others slave servers. Maps are created only on the master server's host. From there, they are distributed to all slaves.

We have been talking very vaguely about networks. There's a distinctive term in NIS that refers to a collection of all hosts that share part of their system configuration data through NIS: the NIS domain. Unfortunately, NIS domains have absolutely nothing in common with the domains we encountered in DNS. To avoid any ambiguity throughout this chapter, we will therefore always specify which type of domain we mean.

NIS domains have a purely administrative function. They are mostly invisible to users, except for the sharing of passwords between all machines in the domain. Therefore, the name given to an NIS domain is relevant only to the administrators. Usually, any name will do, as long as it is different from any other NIS domain name on your local network. For instance, the administrator at the Virtual Brewery may choose to create two NIS domains, one for the Brewery itself, and one for the Winery, which she names brewery and winery respectively. Another quite common scheme is to simply use the DNS domain name for NIS as well.

To set and display the NIS domain name of your host, you can use the domainname command. When invoked without any argument, it prints the current NIS domain name; to set the domain name, you must become the superuser:

# domainname brewery

NIS domains determine which NIS server an application will query. For instance, the login program on a host at the Winery should, of course, query only the Winery's NIS server (or one of them, if there are several) for a user's password information, while an application on a Brewery host should stick with the Brewery's server.

One mystery now remains to be solved: how does a client find out which server to connect to? The simplest approach would use a configuration file that names the host on which to find the server. However, this approach is rather inflexible because it doesn't allow clients to use different servers (from the same domain, of course) depending on their availability. Therefore, NIS implementations rely on a special daemon called ypbind to detect a suitable NIS server in their NIS domain. Before performing any NIS queries, an application first finds out from ypbind which server to use.

ypbind probes for servers by broadcasting to the local IP network; the first to respond is assumed to be the fastest one and is used in all subsequent NIS queries. After a certain interval has elapsed, or if the server becomes unavailable, ypbind probes for active servers again.

Dynamic binding is useful only when your network provides more than one NIS server. Dynamic binding also introduces a security problem. ypbind blindly believes whoever answers, whether it be a humble NIS server or a malicious intruder. Needless to say, this becomes especially troublesome if you manage your password databases over NIS. To guard against this, the Linux ypbind program provides you with the option of probing the local network to find the local NIS server, or configuring the NIS server hostname in a configuration file.

NIS Versus NIS+

NIS and NIS+ share little more than their name and a common goal. NIS+ is structured entirely differently from NIS. Instead of a flat namespace with disjoint NIS domains, NIS+ uses a hierarchical namespace similar to that of DNS. Instead of maps, so-called tables are used that are made up of rows and columns, in which each row represents an object in the NIS+ database and the columns cover properties of the objects that NIS+ knows and cares about. Each table for a given NIS+ domain comprises those of its parent domains. In addition, an entry in a table may contain a link to another table. These features make it possible to structure information in many ways.

NIS+ additionally supports secure and encrypted RPC, which helps greatly to solve the security problems of NIS.

Traditional NIS has an RPC Version number of 2, while NIS+ is Version 3. At the time we're writing, there isn't yet a good working implementation of NIS+ for Linux, so it isn't covered here.

The Client Side of NIS

If you are familiar with writing or porting network applications, you may notice that most of the NIS maps listed previously correspond to library functions in the C library. For instance, to obtain passwd information, you generally use the getpwnam and getpwuid functions, which return the account information associated with the given username or numerical user ID, respectively. Under normal circumstances, these functions perform the requested lookup on the standard file, such as /etc/passwd.

An NIS-aware implementation of these functions, however, modifies this behavior and places an RPC call to the NIS server, which looks up the username or user ID. This happens transparently to the application. The function may treat the NIS data as though it has been appended to the original passwd file so both sets of information are available to the application and used, or as though it has completely replaced it so that the information in the local passwd is ignored and only the NIS data is used.

For traditional NIS implementations, there were certain conventions for which maps were replaced and which were appended to the original information. Some, like the passwd maps, required kludgy modifications of the passwd file which, when done incorrectly, would open up security holes. To avoid these pitfalls, NYS and the GNU libc use a general configuration scheme that determines whether a particular set of client functions uses the original files, NIS, or NIS+, and in which order. This scheme will be described later in this chapter.

Running an NIS Server

After so much theoretical techno-babble, it's time to get our hands dirty with actual configuration work. In this section, we will cover the configuration of an NIS server. If an NIS server is running on your network, you won't have to set up your own; in this case, you may safely skip this section.

Note that if you are just going to experiment with the server, make sure you don't set it up for an NIS domain name that is already in use on your network. This may disrupt the entire network service and make a lot of people very unhappy and very angry.

There are two possible NIS server configurations: master and slave. The slave configuration provides a live backup machine, should your master server fail. We will cover the configuration only for a master server here. The server documentation will explain the differences, should you wish to configure a slave server.

There are currently two NIS servers freely available for Linux: one contained in Tobias Reber's yps package, and the other in Peter Eriksson's ypserv package. It doesn't matter which one you run.

After installing the server program (ypserv) in /usr/sbin, you should create the directory that is going to hold the map files your server is to distribute. When setting up an NIS domain for the brewery domain, the maps would go to /var/yp/brewery. The server determines whether it is serving a particular NIS domain by checking if the map directory is present. If you are disabling service for some NIS domain, make sure to remove the directory as well.

Maps are usually stored in DBM files to speed up lookups. They are created from the master files using a program called makedbm (for Tobias's server) or dbmload (for Peter's server).

Transforming a master file into a form that dbmload can parse usually requires some awk or sed magic, which tends to be a little tedious to type and hard to remember. Therefore, Peter Eriksson's ypserv package contains a Makefile (called ypMakefile) that manages the conversion of the most common master files for you. You should install it as Makefile in your map directory and edit it to reflect the maps you want the NIS server to share. Towards the top of the file, you'll find the all target that lists the services ypserv offers. By default, the line looks something like this:

all: ethers hosts networks protocols rpc services passwd group netid

If you don't want to produce, for example, the ethers.byname and ethers.byaddr maps, simply remove the ethers prerequisite from this rule. To test your setup, you can start with just one or two maps, like the services.* maps.

After editing the Makefile, while in the map directory, type make. This will automatically generate and install the maps. You have to make sure to update the maps whenever you change the master files, otherwise the changes will remain invisible to the network.

The section Setting Up an NIS Client with GNU libc will explain how to configure the NIS client code. If your setup doesn't work, you should try to find out whether requests are arriving at your server. If you specify the debug command-line flag to ypserv, it prints debugging messages to the console about all incoming NIS queries and the results returned. These should give you a hint as to where the problem lies. Tobias's server doesn't have this option.

NIS Server Security

NIS used to have a major security flaw: it left your password file readable by virtually anyone in the entire Internet, which made for quite a number of possible intruders. As long as an intruder knew your NIS domain name and the address of your server, he could simply send it a request for the passwd.byname map and instantly receive all your system's encrypted passwords. With a fast password-cracking program like crack and a good dictionary, guessing at least a few of your users' passwords is rarely a problem.

This is what the securenets option is all about. It simply restricts access to your NIS server to certain hosts, based on their IP addresses or network numbers. The latest version of ypserv implements this feature in two ways. The first relies on a special configuration file called /etc/ypserv.securenets and the second conveniently uses the /etc/hosts.allow and /etc/hosts.deny files we already encountered in Chapter 12., ImportantNetwork Features.[59] Thus, to restrict access to hosts from within the Brewery, their network manager would add the following line to hosts.allow:

ypserv: 172.16.2.

This would let all hosts from IP network 172.16.2.0 access the NIS server. To shut out all other hosts, a corresponding entry in hosts.deny would have to read:

ypserv: ALL

IP numbers are not the only way you can specify hosts or networks in hosts.allow and hosts.deny. Please refer to the hosts_access(5) manual page on your system for details. However, be warned that you cannot use host or domain names for the ypserv entry. If you specify a hostname, the server tries to resolve this hostnamebut the resolver in turn calls ypserv, and you fall into an endless loop.

To configure securenets security using the /etc/ypserv.securenets method, you need to create its configuration file, /etc/ypserv.securenets. This configuration file is simple in structure. Each line describes a host or network of hosts that will be allowed access to the server. Any address not described by an entry in this file will be refused access. A line beginning with a # will be treated as a comment. Example 13-1 shows what a simple /etc/ypserv.securenets would look like:

Example 13.1. Sample ypserv.securenets File

# allow connections from local host -- necessary
	host 127.0.0.1
	# same as 255.255.255.255 127.0.0.1
	#
	# allow connections from any host on the Virtual Brewery network
	255.255.255.0   172.16.1.0
	#

The first entry on each line is the netmask to use for the entry, with host being treated as a special keyword meaning netmask 255.255.255.255. The second entry on each line is the IP address to which to apply the netmask.

A third option is to use the secure portmapper instead of the securenets option in ypserv. The secure portmapper (portmap-5.0) uses the hosts.allow scheme as well, but offers this for all RPC servers, not just ypserv.[60] However, you should not use both the securenets option and the secure portmapper at the same time, because of the overhead this authorization incurs.

Setting Up an NIS Client with GNU libc

We will now describe and discuss the configuration of an NIS client using the GNU libc library support.

Your first step should be to tell the GNU libc NIS client which server to use for NIS service. We mentioned earlier that the Linux ypbind allows you to configure the NIS server to use. The default behavior is to query the server on the local network. If the host you are configuring is likely to move from one domain to another, such as a laptop, you would leave the /etc/yp.conf file empty and it would query on the local network for the local NIS server wherever it happens to be.

A more secure configuration for most hosts is to set the server name in the /etc/yp.conf configuration file. A very simple file for a host on the Winery's network may look like this:

# yp.conf - YP configuration for GNU libc library.
	#
	ypserver vbardolino

The ypserver statement tells your host to use the host supplied as the NIS server for the local domain. In this example we've specified the NIS server as vbardolino. Of course, the IP address corresponding to vbardolino must be set in the hosts file; alternatively, you may use the IP address itself with the server argument.

In the form shown in the example, the ypserver command tells ypbind to use the named server regardless of what the current NIS domain may be. If, however, you are moving your machine between different NIS domains frequently, you may want to keep information for several domains in the yp.conf file. You can have information on the servers for various NIS domains in yp.conf by specifying the information using the domain statement. For instance, you might change the previous sample file to look like this for a laptop:

# yp.conf - YP configuration for GNU libc library.
	# 
	domain winery server vbardolino 
	domain brewery server vstout

This lets you bring up the laptop in either of the two domains by simply setting the desired NIS domain at boot time using the domainname command. The NIS client then uses whichever server is relevant for the current domain.

There is a third option you may want to use. It covers the case when you don't know the name or IP address of the server to use in a particular domain, but still want the ability use a fixed server on certain domains. Imagine we want to insist on using a specified server while operating within the Winery domain, but want to probe for the server to use while in the Brewery domain. We would modify our yp.conf file again to look like this instead:

# yp.conf - YP configuration for GNU libc library.
	# 
	domain winery server vbardolino 
	domain brewery broadcast

The broadcast keyword tells ypbind to use whichever NIS server it finds for the domain.

After creating this basic configuration file and making sure it is world-readable, you should run your first test to connect to your server. Make sure to choose a map your server distributes, like hosts.byname, and try to retrieve it by using the ypcat utility:

# ypcat hosts.byname
	172.16.2.2      vbeaujolais.vbrew.com    vbeaujolais
	172.16.2.3      vbardolino.vbrew.com     vbardolino
	172.16.1.1      vlager.vbrew.com         vlager
	172.16.2.1      vlager.vbrew.com         vlager
	172.16.1.2      vstout.vbrew.com         vstout
	172.16.1.3      vale.vbrew.com           vale
	172.16.2.4      vchianti.vbrew.com       vchianti

The output you get should resemble that just shown. If you get an error message instead that says: Can't bind to server which serves domain, then either the NIS domain name you've set doesn't have a matching server defined in yp.conf, or the server is unreachable for some reason. In the latter case, make sure that a ping to the host yields a positive result, and that it is indeed running an NIS server. You can verify the latter by using rpcinfo, which should produce the following output:

# rpcinfo -u serverhost ypserv
	program 100004 version 1 ready and waiting
	program 100004 version 2 ready and waiting

Choosing the Right Maps

Having made sure you can reach the NIS server, you have to decide which configuration files to replace or augment with NIS maps. Commonly, you will want to use NIS maps for the host and password lookup functions. The former is especially useful if you do not have the BIND name service. The password lookup lets all users log into their accounts from any system in the NIS domain; this usually goes along with sharing a central /home directory among all hosts via NFS. The password map is explained detail in the next section.

Other maps, like services.byname, don't provide such dramatic gains, but do save you some editing work. The services.byname map is valuable if you install any network applications that use a service name not in the standard services file.

Generally, you want to have some choice of when a lookup function uses the local files, when it queries the NIS server, and when it uses other servers such as DNS. GNU libc allows you to configure the order in which a function accesses these services. This is controlled through the /etc/nsswitch.conf file, which stands for Name Service Switch, but of course isn't limited to the name service. For any of the data lookup functions supported by GNU libc, the file contains a line naming the services to use.

The right order of services depends on the type of data each service is offering. It is unlikely that the services.byname map will contain entries differing from those in the local services file; it will only contain additional entries. So it appears reasonable to query the local files first and check NIS only if the service name isn't found. Hostname information, on the other hand, may change very frequently, so DNS or the NIS server should always have the most accurate account, while the local hosts file is only kept as a backup if DNS and NIS should fail. For hostnames, therefore, you normally want to check the local file last.

The following example shows how to force gethostbyname and gethostbyaddr to look in NIS and DNS before the hosts file and how to have the getservbyname function look in the local files before querying NIS. These resolver functions will try each of the listed services in turn; if a lookup succeeds, the result is returned; otherwise, they will try the next service in the list. The file setting for these priorities is:

# small sample /etc/nsswitch.conf
      #
      hosts:     nis dns files 
      services:  files nis

The following is a complete list of services and locations that may be used with an entry in the nsswitch.conf file. The actual maps, files, servers, and objects queried depend on the entry name. The following can appear to the right of a colon:

nis

Use the current domain NIS server. The location of the server queried is configured in the yp.conf file, as shown in the previous section. For the hosts entry, the hosts.byname and hosts.byaddr maps are queried.

nisplus or nis+

Use the NIS+ server for this domain. The location of the server is obtained from the /etc/nis.conf file.

dns

Use the DNS name server. This service type is useful only with the hosts entry. The name servers queried are still determined by the standard resolv.conf file.

files

Use the local file, such as the /etc/hosts file for the hosts entry.

compat

Be compatible with older file formats. This option can be used when either NYS or glibc 2.x is used for NIS or NIS+ lookups. While these versions normally can't interpret older NIS entries in passwd and group files, compat option allows them to work with those formats.

db

Look up the information from DBM files located in the /var/db directory. The corresponding NIS map name is used for that file.

Currently, the NIS support in GNU libc caters to the following nsswitch.conf databases: aliases, ethers.group, hosts, netgroup, network, passwd, protocols, publickey, rpc, services, and shadow. More entries are likely to be added.

Example 13.2. shows a more complete example that introduces another feature of nsswitch.conf. The [NOTFOUND=return] keyword in the hosts entry tells the NIS client to return if the desired item couldn't be found in the NIS or DNS database. That is, the NIS client will continue searching the local files only if calls to the NIS and DNS servers fail for some other reason. The local files will then be used only at boot time and as a backup when the NIS server is down.

Example 13.2. Sample nsswitch.conf File

# /etc/nsswitch.conf
	#
	hosts:      nis dns [NOTFOUND=return] files
	networks:   nis [NOTFOUND=return] files
	services:   files nis
	protocols:  files nis
	rpc:        files nis

GNU libc provides some other actions that are described in the nsswitch manpage.

Using the passwd and group Maps

One of the major applications of NIS is synchronizing user and account information on all hosts in an NIS domain. Consequently, you usually keep only a small local /etc/passwd file, to which site-wide information from the NIS maps is appended. However, simply enabling NIS lookups for this service in nsswitch.conf is not nearly enough.

When relying on the password information distributed by NIS, you first have to make sure that the numeric user IDs of any users you have in your local passwd file match the NIS server's idea of user IDs. Consistency in user IDs is important for other purposes as well, like mounting NFS volumes from other hosts in your network.

If any of the numeric IDs in /etc/passwd or /etc/group differ from those in the maps, you have to adjust file ownerships for all files that belong to that user. First, you should change all uids and gids in passwd and group to the new values, then find that all files that belong to the users just changed and change their ownership. Assume news used to have a user ID of 9 and okir had a user ID of 103, which were changed to some other value; you could then issue the following commands as root:

# find / -uid   9 -print >/tmp/uid.9
	# find / -uid 103 -print >/tmp/uid.103
	# cat /tmp/uid.9   | xargs chown news
	# cat /tmp/uid.103 | xargs chown okir

It is important that you execute these commands with the new passwd file installed, and that you collect all filenames before you change the ownership of any of them. To update the group ownerships of files, use a similar method with the gid instead of the uid, and chgrp instead of chown.

Once you do this, the numerical uids and gids on your system will agree with those on all other hosts in your NIS domain. The next step is to add configuration lines to nsswitch.conf that enable NIS lookups for user and group information:

# /etc/nsswitch.conf - passwd and group treatment
	passwd: nis files
	group:  nis files

This affects where the login command and all its friends look for user information. When a user tries to log in, login queries the NIS maps first, and if this lookup fails, falls back to the local files. Usually, you will remove almost all users from your local files, and only leave entries for root and generic accounts like mail in it. This is because some vital system tasks may have to map uids to usernames or vice versa. For example, administrative cron jobs may execute the su command to temporarily become news, or the UUCP subsystem may mail a status report. If news and uucp don't have entries in the local passwd file, these jobs will fail miserably during an NIS brownout.

Lastly, if you are using either the old NIS implementation (supported by the compat mode for the passwd and group files in the NYS or glibc implementations), you must insert the unwieldy special entries into them. These entries represent where the NIS derived records will be inserted into the database of information. The entries can be added anywhere, but are usually just added to the end. The entries to add for the /etc/passwd file are:

+::::::

and for the /etc/groups file:

+:::

With both glibc 2.x and NYS you can override parameters in a users record received from the NIS server by creating entries with a + prepended to the login name, and exclude specified users by creating entries with a - prepended to the login name. For example the entries:

+stuart::::::/bin/jacl
	-jedd::::::

would override the shell specified for the user stuart supplied by the NIS server, and would disallow the user jedd from logging in on this machine. Any fields left blank use the information supplied by the NIS server.

There are two big caveats in order here. First, the setup as described up to here works only for login suites that don't use shadow passwords. The intricacies of using shadow passwords with NIS will be discussed in the next section. Second, the login commands are not the only ones that access the passwd filelook at the ls command, which most people use almost constantly. Whenever compiling a long listing, ls displays the symbolic names for user and group owners of a file; that is, for each uid and gid it encounters, it has to query the NIS server. An NIS query takes slightly longer to perform than the equivalent lookup in a local file. You may find that sharing your passwd and group information using NIS causes a noticable reduction in the performance of some programs that use this information frequently.

Still, this is not the whole story. Imagine what happens if a user wants to change her password. Usually, she will invoke passwd, which reads the new password and updates the local passwd file. This is impossible with NIS, since that file isn't available locally anymore, but having users log into the NIS server whenever they want to change their passwords is not an option, either. Therefore, NIS provides a drop-in replacement for passwd called yppasswd, which handles password changes under NIS. To change the password on the server host, it contacts the yppasswdd daemon on that host via RPC, and provides it with the updated password information. Usually you install yppasswd over the normal program by doing something like this:

# cd /bin
	# mv passwd passwd.old
	# ln yppasswd passwd

At the same time, you have to install rpc.yppasswdd on the server and start it from a network script. This will effectively hide any of the contortions of NIS from your users.

Using NIS with Shadow Support

Using NIS in conjunction with shadow password files is somewhat problematic. First we have some bad news: using NIS defeats the goals of shadow passwords. The shadow password scheme was designed to prevent nonroot users from having access to the encrypted form of the login passwords. Using NIS to share shadow data by necessity makes the encrypted passwords available to any user who can listen to the NIS server replies on the network. A policy to enforce users to choose good passwords is arguably better than trying to shadow passwords in an NIS environment. Let's take a quick look at how you do it, should you decide to forge on ahead.

In libc5 there is no real solution to sharing shadow data using NIS. The only way to distribute password and user information by NIS is through the standard passwd.* maps. If you do have shadow passwords installed, the easiest way to share them is to generate a proper passwd file from /etc/shadow using tools like pwuncov, and create the NIS maps from that file.

Of course, there are some hacks necessary to use NIS and shadow passwords at the same time, for instance, by installing an /etc/shadow file on each host in the network, while distributing user information, through NIS. However, this hack is really crude and defies the goal of NIS, which is to ease system administration.

The NIS support in the GNU libc library (libc6) provides support for shadow password databases. It does not provide any real solution to making your passwords accessible, but it does simplify password management in environments in which you do want to use NIS with shadow passwords. To use it, you must create a shadow.byname database and add the following line to your /etc/nsswitch.conf:

# Shadow password support
	shadow:         compat

If you use shadow passwords along with NIS, you must try to maintain some security by restricting access to your NIS database. See the section called “NIS Server Security” earlier in this chapter.



[55] Swen can be reached at swen@uni-paderborn.de. The NIS clients are available as yp-linux.tar.gz from metalab.unc.edu in system/Network.

[56] Peter may be reached at pen@lysator.liu.se. The current version of NYS is 1.2.8.

[57] Thorsten may be reached at kukuk@uni-paderborn.de.

[58] DBM is a simple database management library that uses hashing techniques to speed up search operations. There's a free DBM implementation from the GNU project called gdbm, which is part of most Linux distributions.

[59] To enable use of the /etc/hosts.allow method, you may have to recompile the server. Please read the instructions in the README included in the distribution.

[60] The secure portmapper is available via anonymous FTP from ftp.win.tue.nl below the /pub/security/ directory.

The Network File System (NFS) is probably the most prominent network service using RPC. It allows you to access files on remote hosts in exactly the same way you would access local files. A mixture of kernel support and user-space daemons on the client side, along with an NFS server on the server side, makes this possible. This file access is completely transparent to the client and works across a variety of server and host architectures.

NFS offers a number of useful features:

  • Data accessed by all users can be kept on a central host, with clients mounting this directory at boot time. For example, you can keep all user accounts on one host and have all hosts on your network mount /home from that host. If NFS is installed beside NIS, users can log into any system and still work on one set of files.

  • Data consuming large amounts of disk space can be kept on a single host. For example, all files and programs relating to LaTeX and METAFONT can be kept and maintained in one place.

  • Administrative data can be kept on a single host. There is no need to use rcp to install the same stupid file on 20 different machines.

It's not too hard to set up basic NFS operation on both the client and server; this chapter tells you how.

Linux NFS is largely the work of Rick Sladkey, who wrote the NFS kernel code and large parts of the NFS server.[61] The latter is derived from the unfsd user space NFS server, originally written by Mark Shand, and the hnfs Harris NFS server, written by Donald Becker.

Let's have a look at how NFS works. First, a client tries to mount a directory from a remote host on a local directory just the same way it does a physical device. However, the syntax used to specify the remote directory is different. For example, to mount /home from host vlager to /users on vale, the administrator issues the following command on vale:[62]

# mount -t nfs vlager:/home /users

mount will try to connect to the rpc.mountd mount daemon on vlager via RPC. The server will check if vale is permitted to mount the directory in question, and if so, return it a file handle. This file handle will be used in all subsequent requests to files below /users.

When someone accesses a file over NFS, the kernel places an RPC call to rpc.nfsd (the NFS daemon) on the server machine. This call takes the file handle, the name of the file to be accessed, and the user and group IDs of the user as parameters. These are used in determining access rights to the specified file. In order to prevent unauthorized users from reading or modifying files, user and group IDs must be the same on both hosts.

On most Unix implementations, the NFS functionality of both client and server is implemented as kernel-level daemons that are started from user space at system boot. These are the NFS Daemon (rpc.nfsd) on the server host, and the Block I/O Daemon (biod) on the client host. To improve throughput, biod performs asynchronous I/O using read-ahead and write-behind; also, several rpc.nfsd daemons are usually run concurrently.

The current NFS implementation of Linux is a little different from the classic NFS in that the server code runs entirely in user space, so running multiple copies simultaneously is more complicated. The current rpc.nfsd implementation offers an experimental feature that allows limited support for multiple servers. Olaf Kirch developed kernel-based NFS server support featured in 2.2 Version Linux kernels. Its performance is significantly better than the existing userspace implementation. We'll describe it later in this chapter.

Preparing NFS

Before you can use NFS, be it as server or client, you must make sure your kernel has NFS support compiled in. Newer kernels have a simple interface on the proc filesystem for this, the /proc/filesystems file, which you can display using cat:

$ cat /proc/filesystems
	minix
	ext2
	msdos
	nodev	proc
	nodev	nfs

If nfs is missing from this list, you have to compile your own kernel with NFS enabled, or perhaps you will need to load the kernel module if your NFS support was compiled as a module. Configuring the kernel network options is explained in the Kernel Configuration section of Chapter 3., Configuringthe NetworkingHardware.

Mounting an NFS Volume

The mounting of NFS volumes closely resembles regular file systems. Invoke mount using the following syntax:[63]

# mount -t nfs nfs_volume local_dir options

nfs_volume is given as remote_host:remote_dir. Since this notation is unique to NFS filesystems, you can leave out the t nfs option.

There are a number of additional options that you can specify to mount upon mounting an NFS volume. These may be given either following the o switch on the command line or in the options field of the /etc/fstab entry for the volume. In both cases, multiple options are separated by commas and must not contain any whitespace characters. Options specified on the command line always override those given in the fstab file.

Here is a sample entry from /etc/fstab:

# volume              mount point       type  options
	news:/var/spool/news  /var/spool/news   nfs   timeo=14,intr

This volume can then be mounted using this command:

# mount news:/var/spool/news

In the absence of an fstab entry, NFS mount invocations look a lot uglier. For instance, suppose you mount your users' home directories from a machine named moonshot, which uses a default block size of 4 K for read/write operations. You might increase the block size to 8 K to obtain better performance by issuing the command:

# mount moonshot:/home /home -o rsize=8192,wsize=8192

The list of all valid options is described in its entirety in the nfs(5) manual page. The following is a partial list of options you would probably want to use:

rsize=n and wsize=n

These specify the datagram size used by the NFS clients on read and write requests, respectively. The default depends on the version of kernel, but is normally 1,024 bytes.

timeo=n

This sets the time (in tenths of a second) the NFS client will wait for a request to complete. The default value is 7 (0.7 seconds). What happens after a timeout depends on whether you use the hard or soft option.

hard

Explicitly mark this volume as hard-mounted. This is on by default. This option causes the server to report a message to the console when a major timeout occurs and continues trying indefinitely.

soft

Soft-mount (as opposed to hard-mount) the driver. This option causes an I/O error to be reported to the process attempting a file operation when a major timeout occurs.

intr

Allow signals to interrupt an NFS call. Useful for aborting when the server doesn't respond.

Except for rsize and wsize, all of these options apply to the client's behavior if the server should become temporarily inaccessible. They work together in the following way: Whenever the client sends a request to the NFS server, it expects the operation to have finished after a given interval (specified in the timeout option). If no confirmation is received within this time, a so-called minor timeout occurs, and the operation is retried with the timeout interval doubled. After reaching a maximum timeout of 60 seconds, a major timeout occurs.

By default, a major timeout causes the client to print a message to the console and start all over again, this time with an initial timeout interval twice that of the previous cascade. Potentially, this may go on forever. Volumes that stubbornly retry an operation until the server becomes available again are called hard-mounted. The opposite variety, called soft-mounted, generate an I/O error for the calling process whenever a major timeout occurs. Because of the write-behind introduced by the buffer cache, this error condition is not propagated to the process itself before it calls the write function the next time, so a program can never be sure that a write operation to a soft-mounted volume has succeeded at all.

Whether you hard- or soft-mount a volume depends partly on taste but also on the type of information you want to access from a volume. For example, if you mount your X programs by NFS, you certainly would not want your X session to go berserk just because someone brought the network to a grinding halt by firing up seven copies of Doom at the same time or by pulling the Ethernet plug for a moment. By hard-mounting the directory containing these programs, you make sure that your computer waits until it is able to re-establish contact with your NFS server. On the other hand, non-critical data such as NFS-mounted news partitions or FTP archives may also be soft-mounted, so if the remote machine is temporarily unreachable or down, it doesn't hang your session. If your network connection to the server is flaky or goes through a loaded router, you may either increase the initial timeout using the timeo option or hard-mount the volumes. NFS volumes are hard-mounted by default.

Hard mounts present a problem because, by default, the file operations are not interruptible. Thus, if a process attempts, for example, a write to a remote server and that server is unreachable, the user's application hangs and the user can't do anything to abort the operation. If you use the intr option in conjuction with a hard mount, any signals received by the process interrupt the NFS call so that users can still abort hanging file accesses and resume work (although without saving the file).

Usually, the rpc.mountd daemon in some way or other keeps track of which directories have been mounted by what hosts. This information can be displayed using the showmount program, which is also included in the NFS server package:

# showmount -e moonshot
	Export list for localhost:
	/home <anon clnt>
	
	# showmount -d moonshot
	Directories on localhost:
	/home
	
	# showmount -a moonshot
	All mount points on localhost:
	localhost:/home

The NFS Daemons

If you want to provide NFS service to other hosts, you have to run the rpc.nfsd and rpc.mountd daemons on your machine. As RPC-based programs, they are not managed by inetd, but are started up at boot time and register themselves with the portmapper; therefore, you have to make sure to start them only after rpc.portmap is running. Usually, you'd use something like the following example in one of your network boot scripts:

if [ -x /usr/sbin/rpc.mountd ]; then
        /usr/sbin/rpc.mountd; echo -n " mountd"
	fi
	if [ -x /usr/sbin/rpc.nfsd ]; then
        /usr/sbin/rpc.nfsd; echo -n " nfsd"
	fi

The ownership information of the files an NFS daemon provides to its clients usually contains only numerical user and group IDs. If both client and server associate the same user and group names with these numerical IDs, they are said to their share uid/gid space. For example, this is the case when you use NIS to distribute the passwd information to all hosts on your LAN.

On some occasions, however, the IDs on different hosts do not match. Rather than updating the uids and gids of the client to match those of the server, you can use the rpc.ugidd mapping daemon to work around the disparity. Using the map_daemon option explained a little later, you can tell rpc.nfsd to map the server's uid/gid space to the client's uid/gid space with the aid of the rpc.ugidd on the client. Unfortunately, the rpc.ugidd daemon isn't supplied on all modern Linux distributions, so if you need it and yours doesn't have it, you will need to compile it from source.

rpc.ugidd is an RPC-based server that is started from your network boot scripts, just like rpc.nfsd and rpc.mountd:

if [ -x /usr/sbin/rpc.ugidd ]; then
      /usr/sbin/rpc.ugidd; echo -n " ugidd"
      fi

The exports File

Now we'll look at how we configure the NFS server. Specifically, we'll look at how we tell the NFS server what filesystems it should make available for mounting, and the various parameters that control the access clients will have to the filesystem. The server determines the type of access that is allowed to the server's files. The /etc/exports file lists the filesystems that the server will make available for clients to mount and use.

By default, rpc.mountd disallows all directory mounts, which is a rather sensible attitude. If you wish to permit one or more hosts to NFS-mount a directory, you must export it, that is, specify it in the exports file. A sample file may look like this:

# exports file for vlager
	/home             vale(rw) vstout(rw) vlight(rw)
	/usr/X11R6        vale(ro) vstout(ro) vlight(ro)
	/usr/TeX          vale(ro) vstout(ro) vlight(ro)
	/                 vale(rw,no_root_squash)
	/home/ftp         (ro)

Each line defines a directory and the hosts that are allowed to mount it. A hostname is usually a fully qualified domain name but may additionally contain the * and ? wildcards, which act the way they do with the Bourne shell. For instance, lab*.foo.com matches lab01.foo.com as well as laboratory.foo.com. The host may also be specified using an IP address range in the form address/netmask. If no hostname is given, as with the /home/ftp directory in the previous example, any host matches and is allowed to mount the directory.

When checking a client host against the exports file, rpx.mountd looks up the client's hostname using the gethostbyaddr call. With DNS, this call returns the client's canonical hostname, so you must make sure not to use aliases in exports. In an NIS environment the returned name is the first match from the hosts database, and with neither DNS or NIS, the returned name is the first hostname found in the hosts file that matches the client's address.

The hostname is followed by an optional comma-separated list of flags, enclosed in parentheses. Some of the values these flags may take are:

secure

This flag insists that requests be made from a reserved source port, i.e., one that is less than 1,024. This flag is set by default.

insecure

This flag reverses the effect of the secure flag.

ro

This flag causes the NFS mount to be read-only. This flag is enabled by default.

rw

This option mounts file hierarchy read-write.

root_squash

This security feature denies the superusers on the specified hosts any special access rights by mapping requests from uid 0 on the client to the uid 65534 (that is, -2) on the server. This uid should be associated with the user nobody.

no_root_squash

Don't map requests from uid 0. This option is on by default, so superusers have superuser access to your system's exported directories.

link_relative

This option converts absolute symbolic links (where the link contents start with a slash) into relative links. This option makes sense only when a host's entire filesystem is mounted; otherwise, some of the links might point to nowhere, or even worse, to files they were never meant to point to. This option is on by default.

link_absolute

This option leaves all symbolic links as they are (the normal behavior for Sun-supplied NFS servers).

map_identity

This option tells the server to assume that the client uses the same uids and gids as the server. This option is on by default.

map_daemon

This option tells the NFS server to assume that client and server do not share the same uid/gid space. rpc.nfsd then builds a list that maps IDs between client and server by querying the client's rpc.ugidd daemon.

map_static

This option allows you to specify the name of a file that contains a static map of uids and gids. For example, map_static=/etc/nfs/vlight.map would specify the /etc/nfs/vlight.map file as a uid/gid map. The syntax of the map file is described in the exports(5) manual page.

map_nis

This option causes the NIS server to do the uid and gid mapping.

anonuid and anongid

These options allow you to specify the uid and gid of the anonymous account. This is useful if you have a volume exported for public mounts.

Any error in parsing the exports file is reported to syslogd's daemon facility at level notice whenever rpc.nfsd or rpc.mountd is started up.

Note that hostnames are obtained from the client's IP address by reverse mapping, so the resolver must be configured properly. If you use BIND and are very security conscious, you should enable spoof checking in your host.conf file. We discuss these topics in Chapter 6., Name Service and Resolver Configuration.

Kernel-Based NFSv2 Server Support

The user-space NFS server traditionally used in Linux works reliably but suffers performance problems when overworked. This is primarily because of the overhead the system call interface adds to its operation, and because it must compete for time with other, potentially less important, user-space processes.

The 2.2.0 kernel supports an experimental kernel-based NFS server developed by Olaf Kirch and further developed by H.J. Lu, G. Allan Morris, and Trond Myklebust. The kernel-based NFS support provides a significant boost in server performance.

In current release distributions, you may find the server tools available in prepackaged form. If not, you can locate them at http://csua.berkeley.edu/~gam3/knfsd/. You need to build a 2.2.0 kernel with the kernel-based NFS daemon included in order to make use of the tools. You can check if your kernel has the NFS daemon included by looking to see if the /proc/sys/sunrpc/nfsd_debug file exists. If it's not there, you may have to load the rpc.nfsd module using the modprobe utility.

The kernel-based NFS daemon uses a standard /etc/exports configuration file. The package supplies replacement versions of the rpc.mountd and rpc.nfsd daemons that you start much the same way as their userspace daemon counterparts.

Kernel-Based NFSv3 Server Support

The version of NFS that has been most commonly used is NFS Version 2. Technology has rolled on ahead and it has begun to show weaknesses that only a revision of the protocol could overcome. Version 3 of the Network File System supports larger files and filesystems, adds significantly enhanced security, and offers a number of performance improvements that most users will find useful.

Olaf Kirch and Trond Myklebust are developing an experimental NFSv3 server. It is featured in the developer Version 2.3 kernels and a patch is available against the 2.2 kernel source. It builds on the Version 2 kernel-based NFS daemon.

The patches are available from the Linux Kernel based NFS server home page at http://csua.berkeley.edu/~gam3/knfsd/.



[61] Rick can be reached at jrs@world.std.com.

[62] Actually, you can omit the t nfs argument because mount sees from the colon that this specifies an NFS volume.

[63] One doesn't say filesystem because these are not proper filesystems.

Electronic mail transport has been one of the most prominent uses of networking since the first networks were devised. Email started as a simple service that copied a file from one machine to another and appended it to the recipient's mailbox file. The concept remains the same, although an ever-growing net, with its complex routing requirements and its ever increasing load of messages, has made a more elaborate scheme necessary.

Various standards of mail exchange have been devised. Sites on the Internet adhere to one laid out in RFC-822, augmented by some RFCs that describe a machine-independent way of transferring just about anything, including graphics, sound files, and special characters sets, by email.[64] CCITT has defined another standard, X.400. It is still used in some large corporate and government environments, but is progressively being retired.

Quite a number of mail transport programs have been implemented for Unix systems. One of the best known is sendmail, which was developed by Eric Allman at the University of California at Berkeley. Eric Allman now offers sendmail through a commercial venture, but the program remains free software. sendmail is supplied as the standard mail agent in some Linux distributions. We describe sendmail configuration in Chapter 16., Sendmail.

Linux also uses Exim, written by Philip Hazel of the University of Cambridge. We describe Exim configuration in ???.

Compared to sendmail, Exim is rather young. For the vast bulk of sites with email requirements, their capabilities are pretty close.

Both Exim and sendmail support a set of configuration files that have to be customized for your system. Apart from the information that is required to make the mail subsystem run (such as the local hostname), there are many parameters that may be tuned. sendmail's main configuration file is very hard to understand at first. It looks as if your cat has taken a nap on your keyboard with the shift key pressed. Exim configuration files are more structured and easier to understand than sendmail's. Exim, however, does not provide direct support for UUCP and handles only domain addresses. Today that isn't as big a limitation as it once might have been; most sites stay within Exim's limitations. However, for most sites, the work required in setting up either of them is roughly the same.

In this chapter, we deal with what email is and what issues administrators have to deal with. Chapter 16., Sendmail and ??? provide instructions on setting up sendmail and Exim and for the first time. The included information should help smaller sites become operational, but there are several more options and you can spend many happy hours in front of your computer configuring the fanciest features.

Toward the end of this chapter we briefly cover setting up elm, a very common mail user agent on many Unix-like systems, including Linux.

For more information about issues specific to electronic mail on Linux, please refer to the Electronic Mail HOWTO by Guylhem Aznar,[65] which is posted to comp.os.linux.answers regularly. The source distributions of elm, Exim, and sendmail also contain extensive documentation that should answer most questions on setting them up, and we provide references to this documentation in their respective chapters. If you need general information on email, a number of RFCs deal with this topic. They are listed in the bibliography at the end of the book.

What Is a Mail Message?

A mail message generally consists of a message body, which is the text of the message, and special administrative data specifying recipients, transport medium, etc., like what you see when you look at a physical letter's envelope.

This administrative data falls into two categories. In the first category is any data that is specific to the transport medium, like the address of sender and recipient. It is therefore called the envelope. It may be transformed by the transport software as the message is passed along.

The second variety is any data necessary for handling the mail message, which is not particular to any transport mechanism, such as the message's subject line, a list of all recipients, and the date the message was sent. In many networks, it has become standard to prepend this data to the mail message, forming the so-called mail header. It is offset from the mail body by an empty line.[66]

Most mail transport software in the Unix world use a header format outlined in RFC-822. Its original purpose was to specify a standard for use on the ARPANET, but since it was designed to be independent from any environment, it has been easily adapted to other networks, including many UUCP-based networks.

RFC-822 is only the lowest common denominator, however. More recent standards have been conceived to cope with growing needs such as data encryption, international character set support, and MIME (Multipurpose Internet Mail Extensions, described in RFC-1341 and other RFCs).

In all these standards, the header consists of several lines separated by an end-of-line sequence. A line is made up of a field name, beginning in column one, and the field itself, offset by a colon and white space. The format and semantics of each field vary depending on the field name. A header field can be continued across a newline if the next line begins with a whitespace character such as tab. Fields can appear in any order.

A typical mail header may look like this:

Return-Path: <ph10@cus.cam.ac.uk>
	Received: ursa.cus.cam.ac.uk (cusexim@ursa.cus.cam.ac.uk [131.111.8.6])
	by al.animats.net (8.9.3/8.9.3/Debian 8.9.3-6) with ESMTP id WAA04654
	for <terry@animats.net>; Sun, 30 Jan 2000 22:30:01 +1100
	Received: from ph10 (helo=localhost) by ursa.cus.cam.ac.uk with local-smtp
	(Exim 3.13 #1) id 12EsYC-0001eF-00; Sun, 30 Jan 2000 11:29:52 +0000
	Date: Sun, 30 Jan 2000 11:29:52 +0000 (GMT)
	From: Philip Hazel <ph10@cus.cam.ac.uk>
	Reply-To: Philip Hazel <ph10@cus.cam.ac.uk>
	To: Terry Dawson <terry@animats.net>, Andy Oram <andyo@oreilly.com>
	Subject: Electronic mail chapter
	In-Reply-To: <38921283.A58948F2@animats.net>
	Message-ID: <Pine.SOL.3.96.1000130111515.5800A-200000@ursa.cus.cam.ac.uk>

Usually, all necessary header fields are generated by the mailer interface you use, like elm, pine, mush, or mailx. However, some are optional and may be added by the user. elm, for example, allows you to edit part of the message header. Others are added by the mail transport software. If you look into a local mailbox file, you may see each mail message preceded by a From line (note: no colon). This is not an RFC-822 header; it has been inserted by your mail software as a convenience to programs reading the mailbox. To avoid potential trouble with lines in the message body that also begin with From, it has become standard procedure to escape any such occurrence by preceding it with a > character.

This list is a collection of common header fields and their meanings:

From:

This contains the sender's email address and possibly the real name. A complete zoo of formats is used here.

To:

This is a list of recipient email addresses. Multiple recipient addresses are separated by a comma.

Cc:

This is a list of email addresses that will receive carbon copies of the message. Multiple recipient addresses are separated by a comma.

Bcc:

This is a list of email addresses that will receive carbon copies of the message. The key difference between a Cc: and a Bcc: is that the addresses listed in a Bcc: will not appear in the header of the mail messages delivered to any recipient. It's a way of alerting recipients that you've sent copies of the message to other people without telling them who those others are. Multiple recipient addresses are separated by a comma.

Subject:

Describes the content of the mail in a few words.

Date:

Supplies the date and time the mail was sent.

Reply-To:

Specifies the address the sender wants the recipient's reply directed to. This may be useful if you have several accounts, but want to receive the bulk of mail only on the one you use most frequently. This field is optional.

Organization:

The organization that owns the machine from which the mail originates. If your machine is owned by you privately, either leave this out, or insert private or some complete nonsense. This field is not described by any RFC and is completely optional. Some mail programs support it directly, many don't.

Message-ID:

A string generated by the mail transport on the originating system. It uniquely identifies this message.

Received:

Every site that processes your mail (including the machines of sender and recipient) inserts such a field into the header, giving its site name, a message ID, time and date it received the message, which site it is from, and which transport software was used. These lines allow you to trace which route the message took, and you can complain to the person responsible if something went wrong.

X-anything:

No mail-related programs should complain about any header that starts with X-. It is used to implement additional features that have not yet made it into an RFC, or never will. For example, there was once a very large Linux mailing list server that allowed you to specify which channel you wanted the mail to go to by adding the string X-Mn-Key: followed by the channel name.

How Is Mail Delivered?

Generally, you will compose mail using a mailer interface like mail or mailx, or more sophisticated ones like mutt, tkrat, or pine. These programs are called mail user agents, or MUAs. If you send a mail message, the interface program will in most cases hand it to another program for delivery. This is called the mail transport agent, or MTA. On most systems the same MTA is used for both local and remote delivery and is usually invoked as /usr/sbin/sendmail, or on non-FSSTND compliant systems as /usr/lib/sendmail. On UUCP systems it is not uncommon to see mail delivery handled by two separate programs: rmail for remote mail delivery and lmail for local mail delivery.

Local delivery of mail is, of course, more than just appending the incoming message to the recipient's mailbox. Usually, the local MTA understands aliasing (setting up local recipient addresses pointing to other addresses) and forwarding (redirecting a user's mail to some other destination). Also, messages that cannot be delivered must usually be bounced, that is, returned to the sender along with some error message.

For remote delivery, the transport software used depends on the nature of the link. Mail delivered over a network using TCP/IP commonly uses Simple Mail Transfer Protocol (SMTP), which is described in RFC-821. SMTP was designed to deliver mail directly to a recipient's machine, negotiating the message transfer with the remote side's SMTP daemon. Today it is common practice for organizations to establish special hosts that accept all mail for recipients in the organization and for that host to manage appropriate delivery to the intended recipient.

Mail is usually not delivered directly in UUCP networks, but rather is forwarded to the destination host by a number of intermediate systems. To send a message over a UUCP link, the sending MTA usually executes rmail on the forwarding system using uux, and feeds it the message on standard input.

Since uux is invoked for each message separately, it may produce a considerable workload on a major mail hub, as well as clutter the UUCP spool queues with hundreds of small files taking up a disproportionate amount of disk space.[67] Some MTAs therefore allow you to collect several messages for a remote system in a single batch file. The batch file contains the SMTP commands that the local host would normally issue if a direct SMTP connection were used. This is called BSMTP, or batched SMTP. The batch is then fed to the rsmtp or bsmtp program on the remote system, which processes the input almost as if a normal SMTP connection has occurred.

Email Addresses

Email addresses are made up of at least two parts. One part is the name of a mail domain that will ultimately translate to either the recipient's host or some host that accepts mail on behalf of the recipient. The other part is some form of unique user identification that may be the login name of that user, the real name of that user in Firstname.Lastname format, or an arbitrary alias that will be translated into a user or list of users. Other mail addressing schemes, like X.400, use a more general set of attributes that are used to look up the recipient's host in an X.500 directory server.

How email addresses are interpreted depends greatly on what type of network you use. We'll concentrate on how TCP/IP and UUCP networks interpret email addresses.

RFC-822

Internet sites adhere to the RFC-822 standard, which requires the familiar notation of user@host.domain, for which host.domain is the host's fully qualified domain name. The character separating the two is properly called a commercial at sign, but it helps if you read it as at. This notation does not specify a route to the destination host. Routing of the mail message is left to the mechanisms we'll describe shortly.

You will see a lot of RFC-822 if you run an Internet connected site. Its use extends not only to mail, but has also spilled over into other services, such as news. We discuss how RFC-822 is used for news in Chapter 17., Netnews.

Obsolete Mail Formats

In the original UUCP environment, the prevalent form was path!host!user, for which path described a sequence of hosts the message had to travel through before reaching the destination host. This construct is called the bang path notation, because an exclamation mark is colloquially called a bang. Today, many UUCP-based networks have adopted RFC-822 and understand domain-based addresses.

Other networks have still different means of addressing. DECnet-based networks, for example, use two colons as an address separator, yielding an address of host::user.[68] The X.400 standard uses an entirely different scheme, describing a recipient by a set of attribute-value pairs, like country and organization.

Lastly, on FidoNet, each user is identified by a code like 2:320/204.9, consisting of four numbers denoting zone (2 is for Europe), net (320 being Paris and Banlieue), node (the local hub), and point (the individual user's PC). Fidonet addresses can be mapped to RFC-822; the above, for example, would be written as Thomas.Quinot@p9.f204.n320.z2.fidonet.org. Now didn't we say domain names were easy to remember?

Mixing Different Mail Formats

It is inevitable that when you bring together a number of different systems and a number of clever people, they will seek ways to interconnect the differing systems so they are capable of internetworking. Consequently, there are a number of different mail gateways that are able to link two different email systems together so that mail may be forwarded from one to another. Addressing is the critical question when linking two systems. We won't look at the gateways themselves in any detail, but let's take a look at some of the addressing complications that may arise when gateways of this sort are used.

Consider mixing the UUCP style bang-path notation and RFC-822. These two types of addressing don't mix too well. Assume there is an address of domainA!user@domainB. It is not clear whether the @ sign takes precedence over the path, or vice versa: do we have to send the message to domainB, which mails it to domainA!user, or should it be sent to domainA, which forwards it to user@domainB?

Addresses that mix different types of address operators are called hybrid addresses. The most common type, which we just illustrated, is usually resolved by giving the @ sign precedence over the path. In domainA!user@domainB, this means sending the message to domainB first.

However, there is a way to specify routes in RFC-822 conformant ways: <@domainA,@domainB:user@domainC> denotes the address of user on domainC, where domainC is to be reached through domainA and domainB (in that order). This type of address is frequently called a source routed address. It's not a good idea to rely on this behavior, as revisions to the RFCs describing mail routing recommend that source routing in a mail address be ignored and instead an attempt should be made to deliver directly to the remote destination.

Then there is the % address operator: user%domainB@domainA is first sent to domainA, which expands the rightmost (in this case, the only) percent sign to an @ sign. The address is now user@domainB, and the mailer happily forwards your message to domainB, which delivers it to user. This type of address is sometimes referred to as Ye Olde ARPAnet Kludge, and its use is discouraged.

There are some implications to using these different types of addressing that will be described throughout the following sections. In an RFC-822 environment, you should avoid using anything other than absolute addresses, such as user@host.domain.

How Does Mail Routing Work?

The process of directing a message to the recipient's host is called routing. Apart from finding a path from the sending site to the destination, it involves error checking and may involve speed and cost optimization.

There is a big difference between the way a UUCP site handles routing and the way an Internet site does. On the Internet, the main job of directing data to the recipient host (once it is known by its IP address) is done by the IP networking layer, while in the UUCP zone, the route has to be supplied by the user or generated by the mail transfer agent.

Mail Routing on the Internet

On the Internet, the destination host's configuration determines whether any specific mail routing is performed. The default is to deliver the message to the destination by first determining what host the message should be sent to and then delivering it directly to that host. Most Internet sites want to direct all inbound mail to a highly available mail server that is capable of handling all this traffic and have it distribute the mail locally. To announce this service, the site publishes a so-called MX record for its local domain in its DNS database. MX stands for Mail Exchanger and basically states that the server host is willing to act as a mail forwarder for all mail addresses in the domain. MX records can also be used to handle traffic for hosts that are not connected to the Internet themselves, like UUCP networks or FidoNet hosts that must have their mail passed through a gateway.

MX records are always assigned a preference. This is a positive integer. If several mail exchangers exist for one host, the mail transport agent will try to transfer the message to the exchanger with the lowest preference value, and only if this fails will it try a host with a higher value. If the local host is itself a mail exchanger for the destination address, it is allowed to forward messages only to MX hosts with a lower preference than its own; this is a safe way of avoiding mail loops. If there is no MX record for a domain, or no MX records left that are suitable, the mail transport agent is permitted to see if the domain has an IP address associated with it and attempt delivery directly to that host.

Suppose that an organization, say Foobar, Inc., wants all its mail handled by its machine mailhub. It will then have MX records like this in the DNS database:

green.foobar.com.        IN   MX      5    mailhub.foobar.com.

This announces mailhub.foobar.com as a mail exchanger for green.foobar.com with a preference of 5. A host that wishes to deliver a message to joe@green.foobar.com checks DNS and finds the MX record pointing at mailhub. If there's no MX with a preference smaller than 5, the message is delivered to mailhub, which then dispatches it to green.

This is a very simple description of how MX records work. For more information on mail routing on the Internet, refer to RFC-821, RFC-974, and RFC-1123.

Mail Routing in the UUCP World

Mail routing on UUCP networks is much more complicated than on the Internet because the transport software does not perform any routing itself. In earlier times, all mail had to be addressed using bang paths. Bang paths specified a list of hosts through which to forward the message, separated by exclamation marks and followed by the user's name. To address a letter to a user called Janet on a machine named moria, you would use the path eek!swim!moria!janet. This would send the mail from your host to eek, from there on to swim, and finally to moria.

The obvious drawback of this technique is that it requires you to remember much more about network topology, fast links, etc. than Internet routing requires. Much worse than that, changes in the network topologylike links being deleted or hosts being removedmay cause messages to fail simply because you aren't aware of the change. And finally, in case you move to a different place, you will most likely have to update all these routes.

One thing, however, that made the use of source routing necessary was the presence of ambiguous hostnames. For instance, assume there are two sites named moria, one in the U.S. and one in France. Which site does moria!janet refer to now? This can be made clear by specifying what path to reach moria through.

The first step in disambiguating hostnames was the founding of the UUCP Mapping Project. It is located at Rutgers University and registers all official UUCP hostnames, along with information on their UUCP neighbors and their geographic location, making sure no hostname is used twice. The information gathered by the Mapping Project is published as the Usenet Maps, which are distributed regularly through Usenet. A typical system entry in a map (after removing the comments) looks like this:[69]

moria
	  bert(DAILY/2),
	  swim(WEEKLY)

This entry says moria has a link to bert, which it calls twice a day, and swim, which it calls weekly. We will return to the map file format in more detail later.

Using the connectivity information provided in the maps, you can automatically generate the full paths from your host to any destination site. This information is usually stored in the paths file, also called the pathalias database. Assume the maps state that you can reach bert through ernie; a pathalias entry for moria generated from the previous map snippet may then look like this:

moria           ernie!bert!moria!%s

If you now give a destination address of janet@moria.uucp, your MTA will pick the route shown above and send the message to ernie with an envelope address of bert!moria!janet.

Building a paths file from the full Usenet maps is not a very good idea, however. The information provided in them is usually rather distorted and occasionally out of date. Therefore, only a number of major hosts use the complete UUCP world maps to build their paths files. Most sites maintain routing information only for sites in their neighborhood and send any mail to sites they don't find in their databases to a smarter host with more complete routing information. This scheme is called smart-host routing. Hosts that have only one UUCP mail link (so-called leaf sites) don't do any routing of their own; they rely entirely on their smart host.

Mixing UUCP and RFC-822

The best cure for the problems of mail routing in UUCP networks so far is the adoption of the domain name system in UUCP networks. Of course, you can't query a name server over UUCP. Nevertheless, many UUCP sites have formed small domains that coordinate their routing internally. In the maps, these domains announce one or two hosts as their mail gateways so that there doesn't have to be a map entry for each host in the domain. The gateways handle all mail that flows into and out of the domain. The routing scheme inside the domain is completely invisible to the outside world.

This works very well with the smart-host routing scheme. Global routing information is maintained by the gateways only; minor hosts within a domain get along with only a small, handwritten paths file that lists the routes inside their domain and the route to the mail hub. Even the mail gateways do not need routing information for every single UUCP host in the world anymore. Besides the complete routing information for the domain they serve, they only need to have routes to entire domains in their databases now. For instance, this pathalias entry will route all mail for sites in the sub.org domain to smurf:

.sub.org        swim!smurf!%s

Mail addressed to claire@jones.sub.org will be sent to swim with an envelope address of smurf!jones!claire.

The hierarchical organization of the domain namespace allows mail servers to mix more specific routes with less specific ones. For instance, a system in France may have specific routes for subdomains of fr, but route any mail for hosts in the us domain toward some system in the U.S. In this way, domain-based routing (as this technique is called) greatly reduces the size of routing databases, as well as the administrative overhead needed.

The main benefit of using domain names in a UUCP environment, however, is that compliance with RFC-822 permits easy gatewaying between UUCP networks and the Internet. Many UUCP domains nowadays have a link with an Internet gateway that acts as their smart host. Sending messages across the Internet is faster, and routing information is much more reliable because Internet hosts can use DNS instead of the Usenet Maps.

In order to be reachable from the Internet, UUCP-based domains usually have their Internet gateway announce an MX record for them (MX records were described previously in the section the section called “Mail Routing on the Internet”). For instance, assume that moria belongs to the orcnet.org domain. gcc2.groucho.edu acts as its Internet gateway. moria would therefore use gcc2 as its smart host so that all mail for foreign domains is delivered across the Internet. On the other hand, gcc2 would announce an MX record for *.orcnet.org and deliver all incoming mail for orcnet sites to moria. The asterisk in *.orcnet.org is a wildcard that matches all hosts in that domain that don't have any other record associated with them. This should normally be the case for UUCP-only domains.

The only remaining problem is that the UUCP transport programs can't deal with fully qualified domain names. Most UUCP suites were designed to cope with site names of up to eight characters, some even less, and using nonalphanumeric characters such as dots is completely out of the question for most.

Therefore, we need mapping between RFC-822 names and UUCP hostnames. This mapping is completely implementation-dependent. One common way of mapping FQDNs to UUCP names is to use the pathalias file:

moria.orcnet.org  ernie!bert!moria!%s

This will produce a pure UUCP-style bang path from an address that specifies a fully qualified domain name. Some mailers provide a special file for this; sendmail, for instance, uses the uucpxtable.

The reverse transformation (colloquially called domainizing) is sometimes required when sending mail from a UUCP network to the Internet. As long as the mail sender uses the fully qualified domain name in the destination address, this problem can be avoided by not removing the domain name from the envelope address when forwarding the message to the smart host. However, there are still some UUCP sites that are not part of any domain. They are usually domainized by appending the pseudo-domain uucp.

The pathalias database provides the main routing information in UUCP-based networks. A typical entry looks like this (site name and path are separated by tabs):

moria.orcnet.org  ernie!bert!moria!%s
	  moria             ernie!bert!moria!%s

This makes any message to moria be delivered via ernie and bert. Both moria's fully qualified name and its UUCP name have to be given if the mailer does not have a separate way to map between these namespaces.

If you want to direct all messages to hosts inside a domain to its mail relay, you may also specify a path in the pathalias database, giving the domain name preceded by a dot as the target. For example, if all hosts in sub.org can be reached through swim!smurf, the pathalias entry might look like this:

.sub.org        swim!smurf!%s

Writing a pathalias file is acceptable only when you are running a site that does not have to do much routing. If you have to do routing for a large number of hosts, a better way is to use the pathalias command to create the file from map files. Maps can be maintained much more easily, because you may simply add or remove a system by editing the system's map entry and recreating the map file. Although the maps published by the Usenet Mapping Project aren't used for routing very much anymore, smaller UUCP networks may provide routing information in their own set of maps.

A map file mainly consists of a list of sites that each system polls or is polled by. The system name begins in the first column and is followed by a comma-separated list of links. The list may be continued across newlines if the next line begins with a tab. Each link consists of the name of the site followed by a cost given in brackets. The cost is an arithmetic expression made up of numbers and symbolic expressions like DAILY or WEEKLY. Lines beginning with a hash sign are ignored.

As an example, consider moria, which polls swim.twobirds.com twice a day and bert.sesame.com once per week. The link to bert uses a slow 2,400 bps modem. moria would publish the following maps entry:

moria.orcnet.org
	  bert.sesame.com(DAILY/2),
	  swim.twobirds.com(WEEKLY+LOW)
	  moria.orcnet.org = moria

The last line makes moria known under its UUCP name, as well. Note that its cost must be specified as DAILY/2 because calling twice a day actually halves the cost for this link.

Using the information from such map files, pathalias is able to calculate optimal routes to any destination site listed in the paths file and produce a pathalias database from this which can then be used for routing to these sites.

pathalias provides a couple of other features like site-hiding (i.e., making sites accessible only through a gateway). See the pathalias manual page for details and a complete list of link costs.

Comments in the map file generally contain additional information on the sites described in it. There is a rigid format in which to specify this information so that it can be retrieved from the maps. For instance, a program called uuwho uses a database created from the map files to display this information in a nicely formatted way. When you register your site with an organization that distributes map files to its members, you generally have to fill out such a map entry. Below is a sample map entry (in fact, it's the one for Olaf's site):

#N      monad, monad.swb.de, monad.swb.sub.org
	  #S      AT 486DX50; Linux 0.99
	  #O      private
	  #C      Olaf Kirch
	  #E      okir@monad.swb.de
	  #P      Kattreinstr. 38, D-64295 Darmstadt, FRG
	  #L      49 52 03 N / 08 38 40 E
	  #U      brewhq
	  #W      okir@monad.swb.de (Olaf Kirch); Sun Jul 25 16:59:32 MET DST 1993
	  #
	  monad   brewhq(DAILY/2)
	  # Domains
	  monad = monad.swb.de
	  monad = monad.swb.sub.org

The whitespace after the first two characters is a tab. The meaning of most of the fields is pretty obvious; you will receive a detailed description from whichever domain you register with. The L field is the most fun to find out: it gives your geographical position in latitude/longitude and is used to draw the PostScript maps that show all sites for each country, as well as worldwide.[70]

Configuring elm

elm stands for electronic mail and is one of the more reasonably named Unix tools. It provides a full-screen interface with a good help feature. We won't discuss how to use elm here, but only dwell on its configuration options.

Theoretically, you can run elm unconfigured, and everything works wellif you are lucky. But there are a few options that must be set, although they are required only on occasion.

When it starts, elm reads a set of configuration variables from the elm.rc file in /etc/elm. Then it attempts to read the file .elm/elmrc in your home directory. You don't usually write this file yourself. It is created when you choose Save new options from elm's options menu.

The set of options for the private elmrc file is also available in the global elm.rc file. Most settings in your private elmrc file override those of the global file.

Global elm Options

In the global elm.rc file, you must set the options that pertain to your host's name. For example, at the Virtual Brewery, the file for vlager contains the following:

#
	  # The local hostname
	  hostname = vlager
	  #
	  # Domain name
	  hostdomain = .vbrew.com
	  #
	  # Fully qualified domain name
	  hostfullname = vlager.vbrew.com

These options set elm's idea of the local hostname. Although this information is rarely used, you should set the options. Note that these particular options only take effect when giving them in the global configuration file; when found in your private elmrc, they will be ignored.

National Character Sets

A set of standards and RFCs have been developed that amend the RFC-822 standard to support various types of messages, such as plain text, binary data, PostScript files, etc. These standards are commonly referred to as MIME, or Multipurpose Internet Mail Extensions. Among other things, MIME also lets the recipient know if a character set other than standard ASCII has been used when writing the message, for example, using French accents or German umlauts. elm supports these characters to some extent.

The character set used by Linux internally to represent characters is usually referred to as ISO-8859-1, which is the name of the standard it conforms to. It is also known as Latin-1. Any message using characters from this character set should have the following line in its header:

Content-Type: text/plain; charset=iso-8859-1

The receiving system should recognize this field and take appropriate measures when displaying the message. The default for text/plain messages is a charset value of us-ascii.

To be able to display messages with character sets other than ASCII, elm must know how to print these characters. By default, when elm receives a message with a charset field other than us-ascii (or a content type other than text/plain, for that matter), it tries to display the message using a command called metamail. Messages that require metamail to be displayed are shown with an M in the very first column in the overview screen.

Since Linux's native character set is ISO-8859-1, calling metamail is not necessary to display messages using this character set. If elm is told that the display understands ISO-8859-1, it will not use metamail, but will display the message directly instead. This can be enabled by setting the following option in the global elm.rc:

displaycharset = iso-8859-1

Note that you should set this option even when you are never going to send or receive any messages that actually contain characters other than ASCII. This is because people who do send such messages usually configure their mailer to put the proper Content-Type: field into the mail header by default, whether or not they are sending ASCII-only messages.

However, setting this option in elm.rc is not enough. When displaying the message with its built-in pager, elm calls a library function for each character to determine whether it is printable. By default, this function will only recognize ASCII characters as printable and display all other characters as ^?. You may overcome this function by setting the environment variable LC_CTYPE to ISO-8859-1, which tells the library to accept Latin-1 characters as printable. Support for this and other features have been available since Version 4.5.8 of the Linux standard library.

When sending messages that contain special characters from ISO-8859-1, you should make sure to set two more variables in the elm.rc file:

charset = iso-8859-1
	  textencoding = 8bit

This makes elm report the character set as ISO-8859-1 in the mail header and send it as an 8-bit value (the default is to strip all characters to 7-bit).

Of course, all character set options we've discussed here may also be set in the private elmrc file instead of the global one so individual users can have their own default settings if the global one doesn't suit them.



[64] Read RFC-1437 if you don't believe this statement!

[65] Guylhem can be reached at guylhem@danmark.linux.eu.org.

[66] It is customary to append a signature or .sig to a mail message, usually containing information on the author along with a joke or a motto. It is offset from the mail message by a line containing -- followed by a space.

[67] This is because disk space is usually allocated in blocks of 1,024 bytes. So even a message of a few dozen bytes will eat a full kilobyte.

[68] When trying to reach a DECnet address from an RFC-822 environment, you can use host::user"@relay, for which relay is the name of a known Internet-DECnet relay.

[69] Maps for sites registered with the UUCP Mapping Project are distributed through the newsgroup comp.mail.maps; other organizations may publish separate maps for their networks.

[70] They are posted regularly in news.lists.ps-maps. Beware. They're HUGE.

Introduction to sendmail

It's been said that you aren't a real Unix system administrator until you've edited a sendmail.cf file. It's also been said that you're crazy if you've attempted to do so twice.

sendmail is an incredibly powerful mail program. It's also incredibly difficult to learn and understand. Any program whose definitive reference (sendmail, by Bryan Costales and Eric Allman, published by O'Reilly) is 1,050 pages long scares most people off. Information on the sendmail reference is contained in the bibliography at the end of this book.

Fortunately, new versions of sendmail are different. You no longer need to directly edit the cryptic sendmail.cf file; the new version provides a configuration utility that will create the sendmail.cf file for you based on much simpler macro files. You do not need to understand the complex syntax of the sendmail.cf file; the macro files don't require you to. Instead, you need only list items, such as the name of features you wish to include in your configuration, and specify some of the parameters that determine how that feature operates. A traditional Unix utility called m4 then takes your macro configuration data and mixes it with the data it reads from template files containing the actual sendmail.cf syntax, to produce your sendmail.cf file.

In this chapter we introduce sendmail and describe how to install, configure and test it, using the Virtual Brewery as an example. If the information presented here helps make the task of configuring sendmail less daunting for you, we hope you'll gain the confidence to tackle more complex configurations on your own.

Installing sendmail

The sendmail mail transport agent is included in prepackaged form in most Linux distributions. Installation in this case is relatively simple. Despite this fact, there are some good reasons to install sendmail from source, especially if you are security conscious. The sendmail program is very complex and has earned a reputation over the years for containing bugs that allow security breaches. One of the best known examples is the RTM Internet worm that exploited a buffer overflow problem in early versions of sendmail. We touched on this briefly in Chapter 9., TCP/IP Firewall. Most security exploits involving buffer overflows rely on all copies of sendmail on different machines being identical, as the exploits rely on data being stored in specific locations. This, of course, is precisely what happens with sendmail installed from Linux distributions. Compiling sendmail from source yourself can help reduce this risk. Modern versions of sendmail are less vulnerable because they have come under exceedingly close scrutiny as security has become a more widespread concern throughout the Internet community.

The sendmail source code is available via anonymous FTP from ftp.sendmail.org.

Compilation is very simple bceause the sendmail source package directly supports Linux. The steps involved in compiling sendmail are:

	  # cd /usr/local/src
	  # tar xvfz sendmail.8.9.3.tar.gz
	  # cd src
	  # ./Build
	

You need root permissions to complete the installation of the resulting binary files using:

	  # cd obj.Linux.2.0.36.i586
	  # make install
	

You have now installed the sendmail binary into the /usr/sbin directory. Several symbolic links to the sendmail binary will be installed into the /usr/bin/ directory. We'll talk about those links when we discuss common tasks in running sendmail.

Overview of Configuration Files

Traditionally, sendmail was set up through a system configuration file (typically called /etc/mail/sendmail.cf, or in older distributions, /etc/sendmail.cf, or even /usr/lib/sendmail.cf) that is not anything close to any language you've seen before. Editing the sendmail.cf file to provide customized behavior can be a humbling experience.

Today sendmail makes all configuration options macro driven with an easy-to-understand syntax. The macro method generates configurations to cover most installations, but you always have the option of tuning the resultant sendmail.cf manually to work in a more complex environment.

The sendmail.cf and sendmail.mc Files

The m4 macro processor program generates the sendmail.df file when it processes the macro configuration file provided by the local system administrator. Throughout the remainder of this chapter we will refer to this configuration file as the sendmail.mc file.

The configuration process is basically a matter of creating a suitable sendmail.mc file that includes macros that describe your desired configuration. The macros are expressions that the m4 macro processor understands and expands into the complex sendmail.cf syntax. The macro expressions are made up of the macro name (the text in capital letters at the start), which can be likened to a function in a programming language, and some parameters (the text within brackets) that are used in the expansion. The parameters may be passed literally into the sendmail.cf output or may be used to govern the way the macro processing occurs.

A sendmail.mc file for a minimal configuration (UUCP or SMTP with all nonlocal mail being relayed to a directly connected smart host) can be as short as 10 or 15 lines, excluding comments.

Two Example sendmail.mc Files

If you're an administator of a number of different mail hosts, you might not want to name your configuration file sendmail.mc. Instead, it is common practice to name it after the hostvstout.m4 in our case. The name doesn't really matter as long as the output is called sendmail.cf. Providing a unique name for the configuration file for each host allows you to keep all configuration files in the same directory and is just an administrative convenience. Let's look at two example macro configuration files so we know where we are heading.

Most sendmail configurations today use SMTP only. It is very simple to configure sendmail for SMTP. Example 16.1. expects a DNS name server to be available to resolve hosts and will attempt to accept and deliver all mail for hosts using just SMTP.

Example 16.1. Sample Configuration File vstout.smtp.m4

divert(-1)
	  #
	  # Sample configuration file for vstout - smtp only
	  #
	  divert(0)
	  VERSIONID(`@(#)sendmail.mc	8.7 (Linux) 3/5/96')
	  OSTYPE(`linux')
	  #
	  # Include support for the local and smtp mail transport protocols.
	  MAILER(`local')
	  MAILER(`smtp')
	  #
	  FEATURE(rbl)
	  FEATURE(access_db)
	  # end

A sendmail.mc file for vstout at the Virtual Brewery is shown in Example 16.2.. vstout uses SMTP to talk to all hosts on the Brewery's LAN, and you'll see the commonality with the generic SMTP-only configuration just presented. In addition, the vstout configuration sends all mail for other destinations to moria, its Internet relay host, via UUCP.

Example 16.2. Sample Configuration File vstout.uucpsmtp.m4

divert(-1)
	  #
	  # Sample configuration file for vstout
	  #
	  divert(0)
	  VERSIONID(`@(#)sendmail.mc	8.7 (Linux) 3/5/96')
	  OSTYPE(`linux')
	  dnl
	  # moria is our smart host, using the "uucp-new" transport.
	  define(`SMART_HOST', `uucp-new:moria')
	  dnl
	  # Support the local, smtp and uucp mail transport protocols.
	  MAILER(`local')
	  MAILER(`smtp')
	  MAILER(`uucp')
	  LOCAL_NET_CONFIG
	  # This rule ensures that all local mail is delivered using the
	  # smtp transport, everything else will go via the smart host.
	  R$* < @ $* .$m. > $*	$#smtp $@ $2.$m. $: $1 < @ $2.$m. > $3
	  dnl
	  #
	  FEATURE(rbl)
	  FEATURE(access_db)
	  # end

If you compare and contrast the two configurations, you might be able to work out what each of the configuration parameters does. We'll explain them all in detail.

Typically Used sendmail.mc Parameters

A few of the items in the sendmail.mc file are required all the time; others can be ignored if you can get away with defaults. The general sequence of the definitions in the sendmail.mc is as follows:

  1. VERSIONID

  2. OSTYPE

  3. DOMAIN

  4. FEATURE

  5. Local macro definitions

  6. MAILER

  7. LOCAL_* rulesets

We'll talk about each of these in turn in the following sections and refer to our examples in Example 16.1. and Example 16.2., when appropriate, to explain them.

Comments

Lines in the sendmail.mc file that begin with the # character are not parsed by m4, and will by default be output directly into the sendmail.cf file. This is useful if you want to comment on what your configuration is doing in both the input and output files.

To allow comments in your sendmail.mc that are not placed into the sendmail.cf, you can use the m4 divert and dnl tokens. divert(-1) will cause all output to cease. divert(0) will cause output to be restored to the default. Any output generated by lines between these will be discarded. In our example, we've used this mechanism to provide a comment that appears only in the sendmail.mc file. To achieve the same result for a single line, you can use the dnl token that means, literally, starting at the beginning of the next line, delete all characters up to and including the next newline. We've used this in our example, too.

These are standard m4 features, and you can obtain more information on them from its manual page.

VERSIONID and OSTYPE

VERSIONID(`@(#)sendmail.mc  8.9 (Linux) 01/10/98')

The VERSIONID macro is optional, but is useful to record the version of the sendmail configuration in the sendmail.cf file. So you'll often encounter it, and we recommend it. In any case, be sure to include:

OSTYPE(`linux')

This is probably the most important definition. The OSTYPE macro causes a file of definitions to be included that are good defaults for your operating system. Most of the definitions in an OSTYPE macro file set the pathnames of various configuration files, mailer program paths and arguments, and the location of directories sendmail uses to store messages. The standard sendmail source code release includes such a file for Linux, which would be included by the previous example. Some Linux distributions, notably the Debian distribution, include their own definition file that is completely Linux-FHS compliant. When your distribution does this, you should probably use its definition instead of the Linux default one.

The OSTYPE definition should be one of the first definitions to appear in your sendmail.mc file, as many other definitions depend upon it.

DOMAIN

The DOMAIN macro is useful when you wish to configure a large number of machines on the same network in a standard way. It you're configuring a small number of hosts, it probably isn't worth bothering with. You typically configure items, such as the name of mail relay hosts or hubs that all hosts on your network will use.

The standard installation contains a directory of m4 macro templates used to drive the configuration process. This directory is usually named /usr/share/sendmail.cf or something similar. Here you will find a subdirectory called domain that contains domain-specific configuration templates. To make use of the DOMAIN macro, you must create your own macro file containing the standard definitions you require for your site, and write it into the domain subdirectory. You'd normally include only the macro definitions that were unique to your domain here, such as smart host definitions or relay hosts, but you are not limited to these.

The sendmail source distribution comes with a number of sample domain macro files that you can use to model your own.

If you saved your domain macro file as /usr/share/sendmail.cf/domain/vbrew.m4, you'd include definitions in your sendmail.mc using:

DOMAIN(`vbrew')

FEATURE

The FEATURE macro enables you to include predefined sendmail features in your configuration. These sendmail features make the supported configurations very simple to use. There are a large number, and throughout this chapter we'll talk about only a few of the more useful and important ones. You can find full details of the features available in the CF file included in the source package.

To use any of the features listed, you should include a line in your sendmail.mc that looks like:

FEATURE(name)

where name is substituted with the feature name. Some features take one optional parameter. If you wish to use something other than the default, you should use an entry that looks like:

FEATURE(name, param)

where param is the parameter to supply.

Local macro definitions

The standard sendmail macro configuration files provide lots of hooks and variables with which you can customize your configuration. These are called local macro definitions. Many of them are listed in the CF file in the sendmail source package.

The local macro definitions are usually invoked by supplying the name of the macro with an argument representing the value you wish to assign to the variable the macro manages. Again, we'll explore some of the more common local macro definitions in the examples we present later in the chapter.

Defining mail transport protocols

If you want sendmail to transport mail in any way other than by local delivery, you must tell it which transports to use. The MAILER macro makes this very easy. The current version of sendmail supports a variety of mail transport protocols; some of these are experimental, others are probably rarely used.

In our network we need the SMTP transport to send and receive mail among the hosts on our local area network, and the UUCP transport to send and receive mail from our smart host. To achieve this, we simply include both the smtp and uucp mail transports. The local mail transport is included by default, but may be defined for clarity, if you wish. If you are including both the smtp and the uucp mailers in your configuration, you must always be sure to define the smtp mailer first.

The more commonly used transports available to you using the MAILER macro are described in the following list:

local

This transport includes both the local delivery agent used to send mail into the mailbox of users on this machine and the prog mailer used to send messages to local programs. This transport is included by default.

smtp

This transport implements the Simple Mail Transport Protocol (SMTP), which is the most common means of transporting mail on the Internet. When you include this transport, four mailers are configured: smtp (basic SMTP), esmtp (Extended SMTP), smtp8 (8bit binary clean SMTP), and relay (specifically designed for gatewaying messages between hosts).

uucp

The uucp transport provides support for two mailers: uucp-old, which is the traditional UUCP, and uucp-new, which allows multiple recipients to be handled in one transfer.

usenet

This mailer allows you to send mail messages directly into Usenet style news networks. Any local message directed to an address of news.group.usenet will be fed into the news network for the news.group newsgroup.

fax

If you have the HylaFAX software installed, this mailer will allow you to direct email to it so that you may build an email-fax gateway. This feature is experimental at the time of writing and more information may be obtained from http://www.vix.com/hylafax/.

There are others, such as the pop, procmail, mail11, phquery, and cyrus that are useful, but less common. If your curiosity is piqued, you can read about these in the sendmail book or the documentation supplied in the source package.

Configure mail routing for local hosts

The Virtual Brewery's configuration is probably more complex than most sites require. Most sites today would use the SMTP transport only and do not have to deal with UUCP at all. In our configuration we've configured a smart host that is used to handle all outgoing mail. Since we are using the SMTP transport on our local network we must tell sendmail that it is not to send local mail via the smart host. The LOCAL_NET_CONFIG macro allows you to insert sendmail rules directly into the output sendmail.cf to modify the way that local mail is handled. We'll talk more about rewrite rules later on, but for the moment you should accept that the rule we've supplied in our example specifies that any mail destined for hosts in the vbrew.com domain should be delivered directly to the target hosts using the SMTP mail transport.

Generating the sendmail.cf File

When you have completed editing your m4 configuration file, you must process it to produce the /etc/mail/sendmail.cf file read by sendmail. This is straightforward, as illustrated by the following example:

# cd /etc/mail
      # m4 /usr/share/sendmail.cf/m4/cf.m4 vstout.uucpsmtp.mc >sendmail.cf

This command invokes the m4 macro processor, supplying it the name of two macro definition files to process. m4 processes the files in the order given. The first file is a standard sendmail macro template supplied with the sendmail source package, the second, of course, is the file containing our own macro definitions. The output of the command is directed to the /etc/mail/sendmail.cf file, which is our target file.

You may now start sendmail with the new configuration.

Interpreting and Writing Rewrite Rules

Arguably the most powerful feature of sendmail is the rewrite rule. Rewrite rules are used by sendmail to determine how to process a received mail message. sendmail passes the addresses from the headers of a mail message through collections of rewrite rules called rulesets The rewrite rules transform a mail address from one form to another and you can think of them as being similar to a command in your editor that replaces all text matching a specified pattern with another.

Each rule has a lefthand side and a righthand side, separated by at least one tab character. When sendmail is processing mail it scans through the rewriting rules looking for a match on the lefthand side. If an address matches the lefthand side of a rewrite rule, the address is replaced by the righthand side and processed again.

sendmail.cf R and S Commands

In the sendmail.cf file, the rulesets are defined using commands coded as Sn, where n specifies the ruleset that is considered the current one.

The rules themselves appear in commands coded as R. As each R command is read, it is added to the current ruleset.

If you're dealing only with the sendmail.mc file, you won't need to worry about S commands at all, as the macros will build those for you. You will need to manually code your R rules.

A sendmail ruleset therefore looks like:

Sn
	  Rlhs rhs
	  Rlhs2 rhs2

Some Useful Macro Definitions

sendmail uses a number of standard macro definitions internally. The most useful of these in writing rulesets are:

$j

The fully qualified domain name of this host.

$w

The hostname component of the FQDN.

$m

The domain name component of the FQDN.

We can incorporate these macro definitions in our rewrite rules. Our Virtual Brewery configuration uses the $m macro.

The Lefthand Side

In the lefthand side of a rewriting rule, you specify a pattern that will match an address you wish to transform. Most characters are matched literally, but there are a number of characters that have special meaning; these are described in the following list. The rewrite rules for the lefthand side are:

$@

Match exactly zero tokens

$*

Match zero or more tokens

$+

Match one or more tokens

$-

Match exactly one token

$=x

Match any phrase in class x

$~x

Match any word not in class x

A token is a string of characters delimited by spaces. There is no way to include spaces in a token, nor is it necessary, as the expression patterns are flexible enough to work around this need. When a rule matches an address, the text matched by each of the patterns in the expression will be assigned to special variables that we'll use in the righthand side. The only exception to this is the $@, which matches no tokens and therefore will never generate text to be used on the righthand side.

The Righthand Side

When the lefthand side of a rewrite rule matches an address, the original text is deleted and replaced by the righthand side of the rule. All tokens in the righthand side are copied literally, unless they begin with a dollar sign. Just as for the lefthand side, a number of metasymbols may be used on the righthand side. These are described in the following list. The rewrite rules for the righthand side are:

$n

This metasymbol is replaced with the n'th expression from the lefthand side.

$[name$]

This metasymbol resolves hostname to canonical name. It is replaced by the canonical form of the host name supplied.

$(map key $@arguments $:default $)

This is the more general form of lookup. The output is the result of looking up key in the map named map passing arguments as arguments. The map can be any of the maps that sendmail supports such as the virtusertable that we describe a little later. If the lookup is unsuccessful, default will be output. If a default is not supplied and lookup fails, the input is unchanged and key is output.

$>n

This will cause the rest of this line to be parsed and then given to ruleset n to evaluate. The output of the called ruleset will be written as output to this rule. This is the mechanism that allows rules to call other rulesets.

$#mailer

This metasymbol causes ruleset evaluation to halt and specifies the mailer that should be used to transport this message in the next step of its delivery. This metasymbol should be called only from ruleset 0 or one of its subroutines. This is the final stage of address parsing and should be accompanied by the next two metasymbols.

$@host

This metasymbol specifies the host that this message will be forwarded to. If the destination host is the local host, it may be omitted. The host may be a colon-separated list of destination hosts that will be tried in sequence to deliver the message.

$:user

This metasymbol specifies the target user for the mail message.

A rewrite rule that matches is normally tried repeatedly until it fails to match, then parsing moves on to the next rule. This behavior can be changed by preceding the righthand side with one of two special righthand side metaymbols described in the following list. The rewrite rules for a righthand side loop control metasymbols are:

$@

This metasymbol causes the ruleset to return with the remainder of the righthand side as the value. No other rules in the ruleset are evaluated.

$:

This metasymbol causes this rule to terminate immediately, but the rest of the current ruleset is evaluated.

A Simple Rule Pattern Example

To better see how the macro substitution patterns operate, consider the following rule lefthand side:

$* < $+ >

This rule matches Zero or more tokens, followed by the < character, followed by one or more tokens, followed by the > character.

If this rule were applied to brewer@vbrew.com or Head Brewer < >, the rule would not match. The first string would not match because it does not include a < character, and the second would fail because $+ matches one or more tokens and there are no tokens between the <> characters. In any case in which a rule does not match, the righthand side of the rule is not used.

If the rule were applied to Head Brewer < brewer@vbrew.com >, the rule would match, and on the righthand side $1 would be substituted with Head Brewer and $2 would be substituted with brewer@vbrew.com.

If the rule were applied to < brewer@vbrew.com > the rule would match because $* matches zero or more tokens, and on the righthand side $1 would be substituted with the empty string.

Ruleset Semantics

Each of the sendmail rulesets is called upon to perform a different task in mail processing. When you are writing rules, it is important to understand what each of the rulesets are expected to do. We'll look at each of the rulesets that the m4 configuration scripts allow us to modify:

LOCAL_RULE_3

Ruleset 3 is responsible for converting an address in an arbitrary format into a common format that sendmail will then process. The output format expected is the familiar looking local-part@host-domain-spec.

Ruleset 3 should place the hostname part of the converted address inside the < and > characters to make parsing by later rulesets easier. Ruleset 3 is applied before sendmail does any other processing of an email address, so if you want sendmail to gateway mail from some system that uses some unusual address format, you should add a rule using the LOCAL_RULE_3 macro to convert addresses into the common format.

LOCAL_RULE_0 and LOCAL_NET_CONFIG

Ruleset 0 is applied to recipient addresses by sendmail after Ruleset 3. The LOCAL_NET_CONFIG macro causes rules to be inserted into the bottom half of Ruleset 0.

Ruleset 0 is expected to perform the delivery of the message to the recipient, so it must resolve to a triple that specifies each of the mailer, host, and user. The rules will be placed before any smart host definition you may include, so if you add rules that resolve addresses appropriately, any address that matches a rule will not be handled by the smart host. This is how we handle the direct smtp for the users on our local LAN in our example.

LOCAL_RULE_1 and LOCAL_RULE_2

Ruleset 1 is applied to all sender addresses and Ruleset 2 is applied to all recipient addresses. They are both usually empty.

Interpreting the rule in our example

Our sample in Example 16.3. uses the LOCAL_NET_CONFIG macro to declare a local rule that ensures that any mail within our domain is delivered directly using the smtp mailer. Now that we've looked at how rewrite rules are constructed, we will be able to understand how this rule works. Let's take another look at it.

Example 16.3. Rewrite Rule from vstout.uucpsmtp.m4

LOCAL_NET_CONFIG
	    # This rule ensures that all local mail is delivered using the
	    # smtp transport, everything else will go via the smart host.
	    R$* < @ $* .$m. > $*  $#smtp $@ $2.$m. $: $1 < @ $2.$m. > $3

We know that the LOCAL_NET_CONFIG macro will cause the rule to be inserted somewhere near the end of ruleset 0, but before any smart host definition. We also know that ruleset 0 is the last ruleset to be executed and that it should resolve to a three-tuple specifying the mailer, user, and host.

We can ignore the two comment lines; they don't do anything useful. The rule itself is the line beginning with R. We know that the R is a sendmail command and that it adds this rule to the current ruleset, in this case ruleset 0. Let's look at the lefthand side and the righthand side in turn.

The lefthand side looks like: $* < @ $* .$m. > $*.

Ruleset 0 expects < and > characters because it is fed by ruleset 3. Ruleset 3 converts addresses into a common form and to make parsing easier, it also places the host part of the mail address inside <>s.

This rule matches any mail address that looks like: 'DestUser < @ somehost.ourdomain. > Some Text'. That is, it matches mail for any user at any host within our domain.

You will remember that the text matched by metasymbols on the lefthand side of a rewrite rule is assigned to macro definitions for use on the righthand side. In our example, the first $* matches all text from the start of the address until the < character. All of this text is assigned to $1 for use on the righthand side. Similarly the second $* in our rewrite rule is assigned to $2, and the last is assigned to $3.

We now have enough to understand the lefthand side. This rule matches mail for any user at any host within our domain. It assigns the username to $1, the hostname to $2, and any trailing text to $3. The righthand side is then invoked to process these.

Let's now look at what we're expecting to see outputed. The righthand side of our example rewrite rule looks like: $#smtp $@ $2.$m. $: $1 < @ $2.$m. > $3.

When the righthand side of our ruleset is processed, each of the metasymbols are interpreted and relevant substitutions are made.

The $# metasymbol causes this rule to resolve to a specific mailer, smtp in our case.

The $@ resolves the target host. In our example, the target host is specified as $2.$m., which is the fully qualified domain name of the host on in our domain. The FQDN is constructed of the hostname component assigned to $2 from our lefthand side with our domain name (.$m.) appended.

The $: metasymbol specifies the target user, which we again captured from the lefthand side and had stored in $1.

We preserve the contents of the <> section, and any trailing text, using the data we collected from the lefthand side of the rule.

Since this rule resolves to a mailer, the message is forwarded to the mailer for delivery. In our example, the message would be forwarded to the destination host using the SMTP protocol.

Configuring sendmail Options

sendmail has a number of options that allow you to customize the way it performs certain tasks. There are a large number of these, so we've listed only a few of the more commonly used ones in the upcoming list.

To configure any of these options, you may either define them in the m4 configuration file, which is the preferable method, or you may insert them directly into the sendmail.cf file. For example, if we wished to have sendmail fork a new job for each mail message to be delivered, we might add the following line to our m4 configuration file:

define(confSEPARATE_PROC,true)

The corresponding sendmail.cf entry created is:

O ForkEachJob=true

The following list describes common sendmail m4 options (and sendmail.cf equivalents):

confMIN_FREE_BLOCKS (MinFreeBlocks)

There are occasions when a problem might prevent the immediate delivery of mail messages, causing messages to be queued in the mail spool. If your mail host processes large volumes of mail, it is possible for the mail spool to grow to such a size that it fills the filesystem supporting the spool. To prevent this, sendmail provides this option to specify the minimum number of free disk blocks that must exist before a mail message will be accepted. This allows you to ensure that sendmail never causes your spool filesystem to be filled (Default: 100).

confME_TOO (MeToo)

When a mail target such as an email alias is expanded, it is sometimes possible for the sender to appear in the recipient list. This option determines whether the originators of an email message will receive a copy if they appear in the expanded recipient list. Valid values are true and false (Default: false).

confMAX_DAEMON_CHILDREN (MaxDaemonChildren)

Whenever sendmail receives an SMTP connection from a remote host, it spawns a new copy of itself to deal with the incoming mail message. This way, it is possible for sendmail to be processing multiple incoming mail messages simulatanenously. While this is useful, each new copy of sendmail consumes memory in the host computer. If an unusually large number of incoming connections are received, by chance, because of a problem or a malicious attack, it is possible for sendmail daemons to consume all system memory. This option provides you with a means of limiting the maximum number of daemon children that will be spawned. When this number is reached, new connections are rejected until some of the existing children have terminated (Default: undefined).

confSEPARATE_PROC (ForkEachJob)

When processing the mail queue and sending mail messages, sendmail processes one mail message at a time. When this option is enabled, sendmail will fork a new copy of itself for each message to be delivered. This is particularly useful when there are some mail messages that are stuck in the queue because of a problem with the target host (Default: false).

confSMTP_LOGIN_MSG (SmtpGreetingMessage)

Whenever a connection is made to sendmail, a greeting message is sent. By default, this message contains the hostname, name of the mail transfer agent, the sendmail version number, the local version number, and the current date. RFC821 specifies that the first word of the greeting should be the fully qualified domain name of the host, but the rest of the greeting can be configured however you please. You can specify sendmail macros here and they will be expanded when used. The only people who will see this message are suffering system administrators diagnosing mail delivery problems or strongly curious people interested in discovering how your machine is configured. You can relieve some of the tedium of their task by customizing the welcome message with some witticisms; be nice. The word EMSTP will be inserted between the first and second words by sendmail, as this is the signal to remote hosts that we support the ESMTP protocol (Default: $j Sendmail $v/$Z; $b).

Some Useful sendmail Configurations

There are myriad possible sendmail configurations. In this space we'll illustrate just a few important types of configuration that will be useful in many sendmail installations.

Trusting Users to Set the From: Field

It is sometimes useful to overwrite the From: field of an outgoing mail message. Let's say you have a web-based program that generates email. Normally the mail message would appear to come from the user who owned the web server process. We might want to specify some other source address so that the mail appears to have originated from some other user or address on that machine. sendmail provides a means of specifying which systems users are to be entrusted with the ability to do this.

The use_ct_file feature enables the specification and use of a file that lists the names of trusted users. By default, a small number of system users are trusted by sendmail (root, for example). The default filename for this feature is /etc/mail/trusted-users in systems exploiting the /etc/mail/ configuration directory and /etc/sendmail.ct in those that don't. You can specify the name and location of the file by overriding the confCT_FILE definition.

Add FEATURE(use_ct_file) to your sendmail.mc file to enable the feature.

Managing Mail Aliases

Mail aliases are a powerful feature that enable mail to be directed to mailboxes that are alternate names for users or processes on a destination host. For example, it is common practice to have feedback or comments relating to a World Wide Web server to be directed to webmaster. Often there isn't a user known as webmaster on the target machine, instead it is an alias of another system user. Another common use of mail aliases is exploited by mailing list server programs in which an alias directs incoming messages to the list server program for handling.

The /etc/aliases file is where the aliases are stored. The sendmail program consults this file when determining how to handle an incoming mail message. If it finds an entry in this file matching the target user in the mail message, it redirects the message to wherever the entry describes.

Specifically there are three things that aliases allow to happen:

  • They provide a shorthand or well-known name for mail to be addressed to in order to go to one or more persons.

  • They can invoke a program with the mail message as the input to the program.

  • They can send mail to a file.

All systems require aliases for Postmaster and MAILER-DAEMON to be RFC-compliant.

Always be extremely aware of security when defining aliases that invoke programs or write to programs, since sendmail generally runs with root permissions.

Details concerning mail aliases may be found in the aliases(5) manual page. A sample aliases file is shown in Example 16.4..

Example 16.4. Sample aliases File

#
	  # The following two aliases must be present to be RFC-compliant.
	  # It is important to resolve them to 'a person' who reads mail routinely.
	  #
	  postmaster:    root                            # required entry
	  MAILER-DAEMON: postmaster                      # required entry
	  #
	  #
	  # demonstrate the common types of aliases
	  #
	  usenet:        janet                           # alias for a person
	  admin:         joe,janet                       # alias for several people
	  newspak-users: :include:/usr/lib/lists/newspak # read recipients from file
	  changefeed:    |/usr/local/lib/gup             # alias that invokes program
	  complaints:    /var/log/complaints             # alias writes mail to file
	  #

Whenever you update the /etc/aliases file, be sure to run the command:

# /usr/bin/newaliases

to rebuild the database that sendmail uses internally. The /usr/bin/newaliases command is a symbolic link to the sendmail executable, and when invoked this way, behaves exactly as though it were invoked as:

# /usr/lib/sendmail -bi

The newaliases command is an alternative and more convenient way to do this.

Using a Smart Host

Sometimes a host finds mail that it is unable to deliver directly to the desired remote host. It is often convenient to have a single host on a network take on the role of managing transmission of mail to remote hosts that are difficult to reach, rather than have each local host try to do this independently.

There are a few good reasons to have a single host take on mail management. You can simplify management by having only one host with a comprehensive mail configuration that knows how to handle all of the different mail transport types, such as UUCP, Usenet, etc. All other hosts need only a single tranport protocol to send their mail to this central host. Hosts that fill this central mail routing and forwarding role are called smart hosts. If you have a smart host that will accept mail from you, you can send it mail of any sort and it will manage the routing and transmission of that mail to the desired remote destinations.

Another good application for smart host configurations is to manage transmission of mail across a private firewall. An organization may elect to install a private IP network and use their own, unregistered IP addresses. The private network may be connected to the Internet through a firewall. Sending mail to and from hosts in the private network to the outside world using SMTP would not be possible in a conventional configuration because the hosts are not able to accept or establish direct network connections to hosts on the Internet. Instead, the organization could elect to have the firewall provide a mail smart host function. The smart host running on the firewall is able to establish direct network connections with hosts both on the private network and on the Internet. The smart host would accept mail from both hosts on the private network and the Internet, store them in local storage and then manage the retransmission of that mail to the correct host directly.

Smart hosts are usually used when all other methods of delivery have failed. In the case of the organization with the private network, it would be perfectly reasonable to have the hosts attempt to deliver mail directly first, and if that fails then to send it to the smart host. This relieves the smart host of a lot of traffic because other hosts can directly send mail to other hosts on the private network.

sendmail provides a simple method of configuring a smart host using the SMART_HOST feature; when implementing it in the Virtual Brewery configuration, we do exactly this. The relevant portions of our configuration that define the smart host are:

define(`SMART_HOST', `uucp-new:moria')
	  LOCAL_NET_CONFIG
	  # This rule ensures that all local mail is delivered using the
	  # smtp transport, everything else will go via the smart host.
	  R$* < @ $* .$m. > $*	$#smtp $@ $2.$m. $: $1 < @ $2.$m. > $3

The SMART_HOST macro allows you to specify the host that should relay all outgoing mail that you are unable to deliver directly, and the mail transport protocol to use to talk to it.

In our configuration we are using the uucp-new transport to UUCP host moria. If we wanted to configure sendmail to use an SMTP-based Smart Host, we would instead use something like:

define(`SMART_HOST', `mail.isp.net')

We don't need to specify SMTP as the transport, as it is the default.

Can you guess what the LOCAL_NET_CONFIG macro and the rewrite rule might be doing?

The LOCAL_NET_CONFIG macro allows you to add raw sendmail rewrite rules to your configuration that define what mail should stay within the local mail system. In our example, we've used a rule that matches any email address where the host belongs to our domain (.$m.) and rewrite it so that it is sent directly to the SMTP mailer. This ensures that any message for a host on our local domain is directed immediately to the SMTP mailer and forwarded to that host, rather than falling through to our smart host, which is the default treatment.

Managing Unwanted or Unsolicited Mail (Spam)

If you've subscribed to a mailing list, published your email address on a web site, or posted an article to UseNet, you will most likely have begun to receive unsolicited advertising email. It is commonplace now for people to scour the net in search of email addresses to add to mailing lists that they then sell to companies seeking to advertise their products. This sort of mass-mailing behavior is commonly called spamming.

The Free On-line Dictionary of Computing offers a mail-specific definition of spam as:[71]

2. (A narrowing of sense 1, above) To indiscrimately send large amounts of unsolicited e-mail meant to promote a product or service. Spam in this sense is sort of like the electronic equivalent of junk mail sent to "Occupant."

In the 1990s, with the rise in commercial awareness of the net, there are actually scumbags who offer spamming as a "service" to companies wishing to advertise on the net. They do this by mailing to collections of e-mail addresses, Usenet news, or mailing lists. Such practises have caused outrage and aggressive reaction by many net users against the individuals concerned.

Fortunately, sendmail includes some support for mechanisms that can help you deal with unsolicited mail.

The Real-time Blackhole List

The Real-time Blackhole List is a public facility provided to help reduce the volume of unsolicited advertising you have to contend with. Known email sources and hosts are listed in a queryable database on the Internet. They're entered there by people who have received unsolicited advertising from some email address. Major domains sometimes find themselves on the list because of slip-ups in shutting down spam. While some people complain about particular choices made by the maintainers of the list, it remains very popular and disagreements are usually worked out quickly. Full details on how the service is operated may be found from the home site of the Mail Abuse Protection System at http://maps.vix.com/rbl/.

If you enable this sendmail feature, it will test the source address of each incoming mail message against the Real-time Blackhole List to determine whether to accept the message. If you run a large site with many users, this feature could save a considerable volume of disk space. This feature accepts a parameter to specify the name of the server to use. The default is the main server at rbl.maps.vix.com.

To configure the Real-time Blackhole List feature, add the following macro declaration to your sendmail.mc file:

FEATURE(rbl)

Should you wish to specify some other RBL server, you would use a declaration that looks like:

FEATURE(rbl,`rbl.host.net')

The access database

An alternative system that offers greater flexibility and control at the cost of manual configuration is the sendmail access_db feature. The access database allows you to configure which hosts or users you will accept mail from and which you will relay mail for.

Managing who you will relay mail for is important, as it is another technique commonly employed by spamming hosts to circumvent systems such as the Real-time Blackhole List just described. Instead of sending the mail to you directly, spammers will relay the mail via some other unsuspecting host who allows it. The incoming SMTP connection then doesn't come from the known spamming host, it instead comes from the relay host. To ensure that your own mail hosts aren't used in this way, you should relay mail only for known hosts. Versions of sendmail that are 8.9.0 or newer have relaying disabled by default, so for those you'll need to use the access database to enable individual hosts to relay.

The general idea is simple. When a new incoming SMTP connection is received, sendmail retrieves the message header information and then consults the access database to see whether it should proceed to accept the body of the message itself.

The access database is a collection of rules that describe what action should be taken for messages received from nominated hosts. The default access control file is called /etc/mail/access. The table has a simple format. Each line of the table contains an access rule. The lefthand side of each rule is a pattern used to match the sender of an incoming mail message. It may be a complete email address, a hostname, or an IP address. The righthand side is the action to take. There are five types of action you may configure. These are:

OK

Accept the mail message.

RELAY

Accept messages from this host or user even if they are not destined for our host; that is, accept messages for relaying to other hosts from this host.

REJECT

Reject the mail with a generic message.

DISCARD

Discard the message using the $#discard mailer.

### any text

Return an error message using ### as the error code (which should be RFC-821 compliant) and any text as the message.

An example /etc/mail/access might look like:

friends@cybermail.com   REJECT
	    aol.com                 REJECT
	    207.46.131.30           REJECT
	    postmaster@aol.com      OK
	    linux.org.au            RELAY

This example would reject any email received from friends@cybermail.com, any host in the domain aol.com and the host 207.46.131.30. The next rule would accept email from postmaster@aol.com despite the fact that the domain itself has a reject rule. The last rule allows relaying of mail from any host in the linux.org.au domain.

To enable the access database feature, use the following declaration in your sendmail.mc file:

FEATURE(access_db)

The default definition builds the database using hash -o /etc/mail/access, which generates a simple hashed database from the plain text file. This is perfectly adequate in most installations. There are other options that you should consider if you intend to have a large access database. Consult the sendmail book or other sendmail documentation for details.

Barring users from receiving mail

If you have users or automated processes that send mail but will never need to receive it, it is sometimes useful to refuse to accept mail destined for them. This saves wasted disk-space storing mail that will never be read. The blacklist_recipients feature, when used in combination with the access_db feature, allows you to disable the receipt of mail for local users.

To enable the feature, you add the following lines to your sendmail.mc file, if they're not already there:

FEATURE(access_db)
	    FEATURE(blacklist_recipients)

To disable receipt of mail for a local user, simply add his details into the access database. Usually you would use the ### entry style that would return a meaningful error message to the sender so they know why the mail is not being delivered. This feature applies equally well to users in virtual mail domains, and you must include the virtual mail domain in the access database specification. Some sample /etc/mail/access entries might look like:

daemon          550 Daemon does not accept or read mail.
	    flacco          550 Mail for this user has been administratively disabled.
	    grump@dairy.org 550 Mail disabled for this recipient.

Configuring Virtual Email Hosting

Virtual email hosting provides a host the capability of accepting and delivering mail on behalf of a number of different domains as though it were a number of separate mail hosts. Most commonly, virtual hosting is exploited by Internet Application Providers in combination with virtual web hosting, but it's simple to configure and you never know when you might be in a position to virtual host a mailing list for your favorite Linux project, so we'll describe it here.

Accepting mail for other domains

When sendmail receives an email message, it compares the destination host in the message headers to the local host name. If they match, sendmail accepts the message for local delivery; if they differ, sendmail may decide to accept the message and attempt to forward it on to the final destination (See the section called “The access database” earlier in this chapter for details on how to configure sendmail to accept mail for forwarding).

If we wish to configure virtual email hosting, the first thing we need to do is to convince sendmail that it should also accept mail for the domains that we are hosting. Fortunately, this is a very simple thing to do.

The sendmail use_cw_file feature allows us to specify the name of a file where we store domain names for which sendmail accepts mail. To configure the feature, add the feature declaration to your sendmail.mc file:

FEATURE(use_cw_file)

The default name of the file will be /etc/mail/local-host-names for distributions using the /etc/mail/ configuration directory or /etc/sendmail.cw for those that don't. Alternatively, you can specify the name and location of the file by overriding the confCW_FILE macro using a variation on:

define(`confCW_FILE',`/etc/virtualnames')

To stick with the default filename, if we wished to offer virtual hosting to the bovine.net, dairy.org, and artist.org domains, we would create a /etc/mail/local-host-names that looks like:

bovine.net
	    dairy.org
	    artist.org

When this is done, and assuming appropriate DNS records exist that point those domain names to our host, sendmail will accept mail messages for those domains as though they were destined for our real domain name.

Forwarding virtual-hosted mail to other destinations

The sendmail virtusertable feature configures support for the virtual user table, where we configure virtual email hosting. The virtual user table maps incoming mail destined for some user@host to some otheruser@otherhost. You can think of this as an advanced mail alias feature, one that operates using not just the destination user, but also the destination domain.

To configure the virtusertable feature, add the feature to your sendmail.mc configuration as shown:

FEATURE(virtusertable)

By default, the file containing the rules to perform translations will be /etc/mail/virtusertable. You can override this by supplying an argument to the macro definition; consult a detailed sendmail reference to learn about what options are available.

The format of the virtual user table is very simple. The lefthand side of each line contains a pattern representing the original destination mail address; the righthand side has a pattern representing the mail address the virtual hosted address will be mapped to.

The following example shows three possible types of entries:

samiam@bovine.net     colin
	    sunny@bovine.net      darkhorse@mystery.net
	    @dairy.org            mail@jhm.org
	    @artist.org           $1@red.firefly.com

In this example, we are virtual hosting three domains: bovine.net, dairy.org, and artist.org.

The first entry redirects mail sent to a user in the bovine.net virtual domain to a local user on the machine. The second entry redirects mail to a user in the same virtual domain to a user in another domain. The third example redirects all mail addressed to any user in the dairly.org virtual domain to a single remote mail address. Finally, the last entry redirects any mail to a user in the artist.org virtual domain to the same user in another domain; for example, julie@artists.org would be redirected to julie@red.firefly.com.

Testing Your Configuration

The m4 command processes the macro definition files according to its own syntax rules without understanding anything about correct sendmail syntax; so there won't be any error messages if you've gotten anything wrong in your macro definition file. For this reason, it is very important that you thoroughly test your configuration. Fortunately, sendmail provides a relatively easy way of doing this.

sendmail supports an address test mode that allows us to test our configuration and identify any errors. In this mode of operation, we invoke sendmail from the command line, and it prompts us for a ruleset specification and a destination mail address. sendmail then processes that destination address using the rules specified, displaying the output of each rewrite rule as it proceeds. To place sendmail into this mode, we invoke it with the bt argument:

# /usr/sbin/sendmail -bt
      ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
      Enter <ruleset> <address>
      >

The default configuration file used is the /etc/mail/sendmail.cf file; you can specify an alternate configuration file using the C argument. To test our configuration, we need to select a number of addresses to process that will tell us that each of our mail-handing requirements are met. To illustrate this, we'll work through a test of our more complicated UUCP configuration shown in Example 16.2..

First we'll test that sendmail is able to deliver mail to local users on the system. In these tests we expect all addresses to be rewritten to the local mailer on this machine:

# /usr/sbin/sendmail -bt
      ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
      Enter <ruleset> <address>
      > 3,0 isaac
      rewrite: ruleset   3   input: isaac
      rewrite: ruleset  96   input: isaac
      rewrite: ruleset  96 returns: isaac
      rewrite: ruleset   3 returns: isaac
      rewrite: ruleset   0   input: isaac
      rewrite: ruleset 199   input: isaac
      rewrite: ruleset 199 returns: isaac
      rewrite: ruleset  98   input: isaac
      rewrite: ruleset  98 returns: isaac
      rewrite: ruleset 198   input: isaac
      rewrite: ruleset 198 returns: $# local $: isaac
      rewrite: ruleset   0 returns: $# local $: isaac

This output shows us how sendmail processes mail addressed to isaac on this system. Each line shows us what information has been supplied to a ruleset or the result obtained from processing by a ruleset. We told sendmail we wished to use rulesets 3 and 0 to process the address. Ruleset 0 is what is normally invoked and we forced ruleset 3 because it is not tested by default. The last line shows us that the result of ruleset 0 does indeed direct mail to isaac to the local mailer.

Next we'll test mail addressed to our SMTP address: isaac@vstout.vbrew.com. We should be able to produce the same end result as our last example:

# /usr/sbin/sendmail -bt
      ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
      Enter <ruleset> <address>
      > 3,0 isaac@vstout.vbrew.com
      rewrite: ruleset   3   input: isaac @ vstout . vbrew . com
      rewrite: ruleset  96   input: isaac < @ vstout . vbrew . com >
      rewrite: ruleset  96 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset   3 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset   0   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 199   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 199 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset  98   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset  98 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 198   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 198 returns: $# local $: isaac
      rewrite: ruleset   0 returns: $# local $: isaac

Again, this test passed. Next we'll test mail to our UUCP style address: vstout!isaac.

# /usr/sbin/sendmail -bt
      ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
      Enter <ruleset> <address>
      > 3,0 vstout!isaac
      rewrite: ruleset   3   input: vstout ! isaac
      rewrite: ruleset  96   input: isaac < @ vstout . UUCP >
      rewrite: ruleset  96 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset   3 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset   0   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 199   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 199 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset  98   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset  98 returns: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 198   input: isaac < @ vstout . vbrew . com . >
      rewrite: ruleset 198 returns: $# local $: isaac
      rewrite: ruleset   0 returns: $# local $: isaac

This test has also passed. These tests confirm that any mail received for local users on this machine will be properly delivered irrespective of how the address is formatted. If you've defined any aliases for your machine, such as virtual hosts, you should repeat these tests for each of the alternate names by which this host is known to ensure they also work correctly.

Next we will test that mail addressed to other hosts in the vbrew.com domain is delivered directly to that host using the SMTP mailer:

# /usr/sbin/sendmail -bt
      ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
      Enter <ruleset> <address>
      > 3,0 isaac@vale.vbrew.com
      rewrite: ruleset   3   input: isaac @ vale . vbrew . com
      rewrite: ruleset  96   input: isaac < @ vale . vbrew . com >
      rewrite: ruleset  96 returns: isaac < @ vale . vbrew . com . >
      rewrite: ruleset   3 returns: isaac < @ vale . vbrew . com . >
      rewrite: ruleset   0   input: isaac < @ vale . vbrew . com . >
      rewrite: ruleset 199   input: isaac < @ vale . vbrew . com . >
      rewrite: ruleset 199 returns: isaac < @ vale . vbrew . com . >
      rewrite: ruleset  98   input: isaac < @ vale . vbrew . com . >
      rewrite: ruleset  98 returns: isaac < @ vale . vbrew . com . >
      rewrite: ruleset 198   input: isaac < @ vale . vbrew . com . >
      rewrite: ruleset 198 returns: $# smtp $@ vale . vbrew . com . /
      $: isaac < @ vale . vbrew . com . >
      rewrite: ruleset   0 returns: $# smtp $@ vale . vbrew . com . /
      $: isaac < @ vale . vbrew . com . >

We can see that this test has directed the message to the SMTP mailer to be forwarded directly to the vale.vbrew.com host and specifies the user isaac. This test confirms that our LOCAL_NET_CONFIG definition works correctly. For this test to succeed, the destination hostname must be able to be resolved correctly, so it must either have an entry in our /etc/hosts file, or in our local DNS. We can see what happens if the destination hostname isn't able to be resolved by intentionally specifying an unknown host:

# /usr/sbin/sendmail -bt
      ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
      Enter <ruleset> <address>
      > 3,0 isaac@vXXXX.vbrew.com
      rewrite: ruleset   3   input: isaac @ vXXXX . vbrew . com
      rewrite: ruleset  96   input: isaac < @ vXXXX . vbrew . com >
      vXXXX.vbrew.com: Name server timeout
      rewrite: ruleset  96 returns: isaac < @ vXXXX . vbrew . com >
      rewrite: ruleset   3 returns: isaac < @ vXXXX . vbrew . com >
      == Ruleset 3,0 (3) status 75
      rewrite: ruleset   0   input: isaac < @ vXXXX . vbrew . com >
      rewrite: ruleset 199   input: isaac < @ vXXXX . vbrew . com >
      rewrite: ruleset 199 returns: isaac < @ vXXXX . vbrew . com >
      rewrite: ruleset  98   input: isaac < @ vXXXX . vbrew . com >
      rewrite: ruleset  98 returns: isaac < @ vXXXX . vbrew . com >
      rewrite: ruleset 198   input: isaac < @ vXXXX . vbrew . com >
      rewrite: ruleset  95   input: < uucp-new : moria > isaac </
      @ vXXXX . vbrew . com >
      rewrite: ruleset  95 returns: $# uucp-new $@ moria $: isaac </
      @ vXXXX . vbrew . com >
      rewrite: ruleset 198 returns: $# uucp-new $@ moria $: isaac </
      @ vXXXX . vbrew . com >
      rewrite: ruleset   0 returns: $# uucp-new $@ moria $: isaac </
      @ vXXXX . vbrew . com >

This result is very different. First, ruleset 3 returned an error message indicating the hostname could not be resolved. Second, we deal with this situation by relying on the other key feature of our configuration, the smart host. The smart host will is to handle any mail that is otherwise undeliverable. The hostname we specified in this test was unable to be resolved and the rulesets determined that the mail should be forwarded to our smart host moria using the uucp-new mailer. Our smart host might be better connected and know what to do with the address.

Our final test ensures that any mail addressed to a host not within our domain is delivered to our smart host. This should produce a result similar to our previous example:

# /usr/sbin/sendmail -bt
      ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
      Enter <ruleset> <address>
      > 3,0 isaac@linux.org.au
      rewrite: ruleset   3   input: isaac @ linux . org . au
      rewrite: ruleset  96   input: isaac < @ linux . org . au >
      rewrite: ruleset  96 returns: isaac < @ linux . org . au . >
      rewrite: ruleset   3 returns: isaac < @ linux . org . au . >
      rewrite: ruleset   0   input: isaac < @ linux . org . au . >
      rewrite: ruleset 199   input: isaac < @ linux . org . au . >
      rewrite: ruleset 199 returns: isaac < @ linux . org . au . >
      rewrite: ruleset  98   input: isaac < @ linux . org . au . >
      rewrite: ruleset  98 returns: isaac < @ linux . org . au . >
      rewrite: ruleset 198   input: isaac < @ linux . org . au . >
      rewrite: ruleset  95   input: < uucp-new : moria > isaac </
      @ linux . org . au . >
      rewrite: ruleset  95 returns: $# uucp-new $@ moria $: isaac </
      @ linux . org . au . >
      rewrite: ruleset 198 returns: $# uucp-new $@ moria $: isaac </
      @ linux . org . au . >
      rewrite: ruleset   0 returns: $# uucp-new $@ moria $: isaac </
      @ linux . org . au . >

The results of this test indicate that the hostname was resolved, and that the message would still have been routed to our smart host. This proves that our LOCAL_NET_CONFIG definition works correctly and it handled both cases correctly. This test was also successful, so we can happily assume our configuration is correct and use it.

Running sendmail

The sendmail daemon can be run in either of two ways. One way is to have to have it run from the inetd daemon; the alternative, and more commonly used method is to run sendmail as a standalone daemon. It is also common for mailer programs to invoke sendmail as a user command to accept locally generated mail for delivery.

When running sendmail in standalone mode, place the command in an rc file so it starts at boot time. The syntax used is commonly:

/usr/sbin/sendmail -bd -q10m

The -bd argument tells sendmail to run as a daemon. It will fork and run in the background. The -q10m argument tells sendmail to check its queue every ten minutes. You may choose to use a different queue to check time.

To run sendmail from the inetd network daemon, you'd use an entry like:

smtp  stream  tcp nowait  nobody  /usr/sbin/sendmail -bs

The -bs argument here tells sendmail to use the SMTP protocol on stdin/stdout, which is required for use with inetd.

The runq command is usually a symlink to the sendmail binary and is a more convenient form of:

# sendmail -q

When sendmail is invoked this way, it processes any mail waiting in the queue to be transmitted. When running sendmail from inetd you must also create a cron job that runs the runq command periodically to ensure that the mail spool is serviced periodically.

A suitable cron table entry would be similar to:

# Run the mail spool every fifteen minutes
	0,15,30,45     *     *     *     *     /usr/bin/runq

In most installations sendmail processes the queue every 15 minutes as shown in our crontab example, attempting to transmit any messages there.

Tips and Tricks

There are a number of things you can do to make managing a sendmail site efficient. A number of management tools are provided in the sendmail package; let's look at the most important of these.

Managing the Mail Spool

Mail is queued in the /var/spool/mqueue directory before being transmitted. This directory is called the mail spool. The sendmail program provides a means of displaying a formatted list of all spooled mail messages and their status.

The /usr/bin/mailq command is a symbolic link to the sendmail executable and behaves indentically to:

# sendmail -bp

The output displays the message ID, its size, the time it was placed in the queue, who sent it, and a message indicating its current status. The following example shows a mail message stuck in the queue with a problem:

$ mailq
	  Mail Queue (1 request)
	  --Q-ID-- --Size-- -----Q-Time----- ------------Sender/Recipient------------
	  RAA00275      124 Wed Dec  9 17:47 root
	  (host map: lookup (tao.linux.org.au): deferred)
	  terry@tao.linux.org.au

This message is still in the mail queue because the destination host IP address could not be resolved.

We can force sendmail to process the queue now by issuing the /usr/bin/runq command.

The runq command produces no output. sendmail will begin processing the mail queue in the background.

Forcing a Remote Host to Process its Mail Queue

If you use a temporary dial-up Internet connection with a fixed IP address and rely on an MX host to collect your mail while you are disconnected, you will find it useful to force the MX host to process its mail queue soon after you establish your connection.

A small perl program is included with the sendmail distribution that makes this simple for mail hosts that support it. The etrn script has much the same effect on a remote host as the runq command has on our own. If we invoke the command as shown in this example:

# etrn vstout.vbrew.com

we will force the host vstout.vbrew.com to process any mail queued for our local machine.

Typically you'd add this command to your PPP ip-up script so that it is executed soon after your network connection is established.

Analyzing Mail Statistics

sendmail collects data on mail traffic volumes and some information on hosts to which it has delivered mail. There are two commands available to display this information, mailstats, and hoststat.

mailstats

The mailstats command displays statistics on the volume of mail processed by sendmail. The time at which data collection commenced is printed first, followed by a table with one row for each configured mailer and one showing a summary total of all mail. Each line presents eight items of information:

FieldMeaning
MThe mailer (transport protocol) number
msgsfrThe number of messages received from the mailer
bytes_fromThe Kbytes of mail from the mailer
msgstoThe number of messages sent to the mailer
bytes_toThe Kbytes of mail sent to the mailer
msgsregThe number of messages rejected
msgsdisThe number of messages discarded
MailerThe name of the mailer

A sample of the output of the mailstats command is shown in Example 16.5..

Example 16.5. Sample Output of the mailstats Command

# /usr/sbin/mailstats
	    Statistics from Sun Dec 20 22:47:02 1998
	    M   msgsfr  bytes_from   msgsto    bytes_to  msgsrej msgsdis  Mailer
	    0        0          0K       19        515K        0       0  prog
	    3       33        545K        0          0K        0       0  local
	    5       88        972K      139       1018K        0       0  esmtp
	    =============================================================
	    T      121       1517K      158       1533K        0       0

This data is collected if the StatusFile option is enabled in the sendmail.cf file and the status file exists. Typically you'd add the following to your sendmail.cf file:

# status file
	    O StatusFile=/var/log/sendmail.st

To restart the statistics collection, you need to make the statistics file zero length:

> /var/log/sendmail.st

and restart sendmail.

hoststat

The hoststat command displays information about the status of hosts that sendmail has attempted to deliver mail to. The hoststat command is equivalent to invoking sendmail as:

sendmail -bh

The output presents each host on a line of its own, and for each the time since delivery was attempted to it, and the status message received at that time.

Example 16.6. shows the sort of output you can expect from the hoststat command. Note that most of the results indicate successful delivery. The result for earthlink.net, on the other hand, indicates that delivery was unsuccessful. The status message can sometimes help determine the cause of the failure. In this case, the connection timed out, probably because the host was down or unreachable at the time delivery was attempted.

Example 16.6. Sample Output of the oststat Command

# hoststat
	    -------------- Hostname ---------- How long ago ---------Results---------
	    mail.telstra.com.au                    04:05:41 250 Message accepted for
	    scooter.eye-net.com.au              81+08:32:42 250 OK id=0zTGai-0008S9-0
	    yarrina.connect.com.a               53+10:46:03 250 LAA09163 Message acce
	    happy.optus.com.au                  55+03:34:40 250 Mail accepted
	    mail.zip.com.au                        04:05:33 250 RAA23904 Message acce
	    kwanon.research.canon.com.au        44+04:39:10 250 ok 911542267 qp 21186
	    linux.org.au                        83+10:04:11 250 IAA31139 Message acce
	    albert.aapra.org.au                    00:00:12 250 VAA21968 Message acce
	    field.medicine.adelaide.edu.au      53+10:46:03 250 ok 910742814 qp 721
	    copper.fuller.net                   65+12:38:00 250 OAA14470 Message acce
	    amsat.org                            5+06:49:21 250 UAA07526 Message acce
	    mail.acm.org                        53+10:46:17 250 TAA25012 Message acce
	    extmail.bigpond.com                 11+04:06:20 250 ok
	    earthlink.net                       45+05:41:09 Deferred: Connection time

The purgestat command flushes the collected host data and is equivalent to invoking sendmail as:

# sendmail -bH

The statistics will continue to grow until you purge them. You might want to periodically run the purgestat command to make it easier to search and find recent entries, especially if you have a busy site. You could put the command into a crontab file so it runs automatically, or just do it yourself occasionally.



[71] The Free On-Line Dictionary of Computing can be found packaged in many Linux distributions, or online at its home page at http://wombat.doc.ic.ac.uk/foldoc/.

Netnews, or Usenet news, remains one of the most important and highly valued services on computer networks today. Dismissed by some as a mire of unsolicited commercial email and pornography, Netnews still maintains several cases of the high-quality discussion groups that made it a critical resource in pre-web days. Even in these times of a billion web pages, Netnews is still a source for online help and community on many topics.

Usenet History

The idea of network news was born in 1979 when two graduate students, Tom Truscott and Jim Ellis, thought of using UUCP to connect machines for information exchange among Unix users. They set up a small network of three machines in North Carolina.

Initially, traffic was handled by a number of shell scripts (later rewritten in C), but they were never released to the public. They were quickly replaced by A News, the first public release of news software.

A News was not designed to handle more than a few articles per group and day. When the volume continued to grow, it was rewritten by Mark Horton and Matt Glickman, who called it the B release (a.k.a. B News). The first public release of B News was version 2.1 in 1982. It was expanded continuously, with several new features added. Its current version is B News 2.11. It is slowly becoming obsolete; its last official maintainer switched to INN.

Geoff Collyer and Henry Spencer rewrote B News and released it in 1987; this is release C, or C News. Since its release, there have been a number of patches to C News, the most prominent being the C News Performance Release. On sites that carry a large number of groups, the overhead involved in frequently invoking relaynews, which is responsible for dispatching incoming articles to other hosts, is significant. The Performance Release adds an option to relaynews that allows it to run in daemon mode, through which the program puts itself in the background. The Performance Release is the C News version currently included in most Linux releases. We describe C News in detail in Chapter 18., C News.

All news releases up to C were primarily targeted for UUCP networks, although they could be used in other environments, as well. Efficient news transfer over networks like TCP/IP or DECNet required a new scheme. So in 1986, the Network News Transfer Protocol (NNTP) was introduced. It is based on network connections and specifies a number of commands to interactively transfer and retrieve articles.

There are a number of NNTP-based applications available from the Net. One of them is the nntpd package by Brian Barber and Phil Lapsley, which you can use to provide newsreading service to a number of hosts inside a local network. nntpd was designed to complement news packages, such as B News or C News, to give them NNTP features. If you want to use NNTP with the C News server, you should read Chapter 19., NNTP and thenntpd Daemon, which explains how to configure the nntpd daemon and run it with C News.

An alternative package supporting NNTP is INN, or Internet News. It is not just a frontend, but a news system in its own right. It comprises a sophisticated news relay daemon that can maintain several concurrent NNTP links efficiently, and is therefore the news server of choice for many Internet sites. We discuss it in detail in Chapter 20., Internet News.

What Is Usenet, Anyway?

One of the most astounding facts about Usenet is that it isn't part of any organization, nor does it have any sort of centralized network management authority. In fact, it's part of Usenet lore that except for a technical description, you cannot define what it is; at the risk of sounding stupid, one might define Usenet as a collaboration of separate sites that exchange Usenet news. To be a Usenet site, all you have to do is find another Usenet site and strike an agreement with its owners and maintainers to exchange news with you. Providing another site with news is called feeding it, whence another common axiom of Usenet philosophy originates: Get a feed, and you're on it.

The basic unit of Usenet news is the article. This is a message a user writes and posts to the net. In order to enable news systems to deal with it, it is prepended with administrative information, the so-called article header. It is very similar to the mail header format laid down in the Internet mail standard RFC-822, in that it consists of several lines of text, each beginning with a field name terminated by a colon, which is followed by the field's value.[72]

Articles are submitted to one or more newsgroup. One may consider a newsgroup a forum for articles relating to a common topic. All newsgroups are organized in a hierarchy, with each group's name indicating its place in the hierarchy. This often makes it easy to see what a group is all about. For example, anybody can see from the newsgroup name that comp.os.linux.announce is used for announcements concerning a computer operating system named Linux.

These articles are then exchanged between all Usenet sites that are willing to carry news from this group. When two sites agree to exchange news, they are free to exchange whatever newsgroups they like, and may even add their own local news hierarchies. For example, groucho.edu might have a news link to barnyard.edu, which is a major news feed, and several links to minor sites which it feeds news. Now Barnyard College might receive all Usenet groups, while GMU only wants to carry a few major hierarchies like sci, comp, or rec. Some of the downstream sites, say a UUCP site called brewhq, will want to carry even fewer groups, because they don't have the network or hardware resources. On the other hand, brewhq might want to receive newsgroups from the fj hierarchy, which GMU doesn't carry. It therefore maintains another link with gargleblaster.com, which carries all fj groups and feeds them to brewhq. The news flow is shown in Figure 17.1..

Figure 17.1. Usenet newsflow through Groucho Marx University

The labels on the arrows originating from brewhq may require some explanation, though. By default, it wants all locally generated news to be sent to groucho.edu. However, as groucho.edu does not carry the fj groups, there's no point in sending it any messages from those groups. Therefore, the feed from brewhq to GMU is labeled all,!fj, meaning that all groups except those below fj are sent to it.

How Does Usenet Handle News?

Today, Usenet has grown to enormous proportions. Sites that carry the whole of Netnews usually transfer something like a paltry 60 MB a day.[73] Of course, this requires much more than pushing files around. So let's take a look at the way most Unix systems handle Usenet news.

News begins when users create and post articles. Each user enters a message into a special application called a newsreader, which formats it appropriately for transmission to the local news server. In Unix environments the newsreader commonly uses the inews command to transmit articles to the newsserver using the TCP/IP protocol. But it's also possible to write the article directly into a file in a special directory called the news spool. Once the posting is delivered to the local news server, it takes responsibility for delivering the article to other news users.

News is distributed through the net by various transports. The medium used to be UUCP, but today the main traffic is carried by Internet sites. The routing algorithm used is called flooding. Each site maintains a number of links (news feeds) to other sites. Any article generated or received by the local news system is forwarded to them, unless it has already been at that site, in which case it is discarded. A site may find out about all other sites the article has already traversed by looking at the Path: header field. This header contains a list of all systems through which the article has been forwarded in bang path notation.

To distinguish articles and recognize duplicates, Usenet articles have to carry a message ID (specified in the Message-Id: header field), which combines the posting site's name and a serial number into <serial@site>. For each article processed, the news system logs this ID into a history file, against which all newly arrived articles are checked.

The flow between any two sites may be limited by two criteria. For one, an article is assigned a distribution (in the Distribution: header field), which may be used to confine it to a certain group of sites. On the other hand, the newsgroups exchanged may be limited by both the sending and receiving systems. The set of newsgroups and distributions allowed to be transmitted to a site are usually kept in the sys file.

The sheer number of articles usually requires that improvements be made to the above scheme. On UUCP networks, systems collect articles over a period of time and combine them into a single file, which is compressed and sent to the remote site. This is called batching.

An alternative technique is the ihave/sendme protocol that prevents duplicate articles from being transferred, thus saving net bandwidth. Instead of putting all articles in batch files and sending them along, only the message IDs of articles are combined into a giant ihave message and sent to the remote site. The remote site reads this message, compares it to its history file, and returns the list of articles it wants in a sendme message. Only the requested articles are sent.

Of course, ihave/sendme makes sense only if it involves two big sites that receive news from several independent feeds each, and that poll each other often enough for an efficient flow of news.

Sites that are on the Internet generally rely on TCP/IP-based software that uses the Network News Transfer Protocol (NNTP). NNTP is described in RFC-977; it is responsible for the transfer of news between news servers and provides Usenet access to single users on remote hosts.

NNTP knows three different ways to transfer news. One is a real-time version of ihave/sendme, also referred to as pushing news. The second technique is called pulling news, in which the client requests a list of articles in a given newsgroup or hierarchy that have arrived at the server's site after a specified date, and chooses those it cannot find in its history file. The third technique is for interactive newsreading and allows you or your newsreader to retrieve articles from specified newgroups, as well as post articles with incomplete header information.

At each site, news is kept in a directory hierarchy below /var/spool/news, each article in a separate file, and each newsgroup in a separate directory. The directory name is made up of the newsgroup name, with the components being the path components. Thus, comp.os.linux.misc articles are kept in /var/spool/news/comp/os/linux/misc. The articles in a newsgroup are assigned numbers in the order they arrive. This number serves as the file's name. The range of numbers of articles currently online is kept in a file called active, which at the same time serves as a list of newsgroups your site knows.

Since disk space is a finite resource, you have to start throwing away articles after some time.[74] This is called expiring. Usually, articles from certain groups and hierarchies are expired at a fixed number of days after they arrive. This may be overridden by the poster by specifying a date of expiration in the Expires: field of the article header.

You now have enough information to choose what to read next. UUCP users should read about C-News in Chapter 18., C News. If you're using a TCP/IP network, read about NNTP in Chapter 19., NNTP and thenntpd Daemon. If you need to transfer moderate amounts of news over TCP/IP, the server described in that chapter may be enough for you. To install a heavy-duty news server that can handle huge volumes of material, go on to read about InterNet News in Chapter 20., Internet News.



[72] The format of Usenet news messages is specified in RFC-1036, Standard for interchange of USENET messages.

[73] Wait a minute: 60 Megs at 9,600 bps, that's 60 million multiplied by 1,024, that is mutter, mutter Hey! That's 34 hours!

[74] Some people claim that Usenet is a conspiracy by modem and hard disk vendors.

One of the most popular software packages for Netnews is C News. It was designed for sites that carry news over UUCP links. This chapter will discuss the central concepts of C News, basic installation, and maintenance tasks.

C News stores its configuration files in /etc/news, and most of its binaries are kept below the /usr/lib/news/ directory. Articles are kept below /var/spool/news. You should make sure that virtually all files in these directories are owned by user news or group news. Most problems arise from files being inaccessible to C News. Use su to become the user news before you touch anything in the directory. The only exception is the setnewsids command, which is used to set the real user ID of some news programs. It must be owned by root and have the setuid bit set.

In this chapter, we describe all C News configuration files in detail and show you what you have to do to keep your site running.

Delivering News

Articles can be fed to C News in several ways. When a local user posts an article, the newsreader usually hands it to the inews command, which completes the header information. News from remote sites, be it a single article or a whole batch, is given to the rnews command, which stores it in the /var/spool/news/in.coming directory, from where it will be picked up at a later time by newsrun. With any of these two techniques, however, the article will eventually be handed to the relaynews command.

For each article, the relaynews command first checks if the article has already been seen at the local site by looking up the message ID in the history file. Duplicate articles are dropped. Then relaynews looks at the Newsgroups: header line to find out if the local site requests articles from any of these groups. If it does, and the newsgroup is listed in the active file, relaynews tries to store the article in the corresponding directory in the news spool area. If this directory does not exist, it is created. The article's message ID is then logged to the history file. Otherwise, relaynews drops the article.

Sometimes relaynews fails to store an incoming article because a group to which it has been posted is not listed in your active file. In this case, the article is moved to the junk group.[75] relaynews also checks for stale or misdated articles and reject them. Incoming batches that fail for any other reason are moved to /var/spool/news/in.coming/bad, and an error message is logged.

After this, the article is relayed to all other sites that request news from these groups, using the transport specified for each particular site. To make sure an article isn't sent to a site that has already seen it, each destination site is checked against the article's Path: header field, which contains the list of sites the article has traversed so far, written in the UUCP-style bang-path source-routing style described in Chapter 15., Electronic Mail. If the destination site's name does not appear in this list, the article is sent to it.

C News is commonly used to relay news between UUCP sites, although it is also possible to use it in an NNTP environment. To deliver news to a remote UUCP site, either in single articles or whole batches, uux is used to execute the rnews command on the remote site and feed the article or batch to it on standard input. Refer to ???, for more information on UUCP.

Batching is the term used to describe sending large bundles of individual articles all in one transmission. When batching is enabled for a given site, C News does not send any incoming article immediately; instead, it appends its path name to a file, usually called out.going/site/togo. Periodically, a program is executed from a crontab entry by the cron program, which reads this file and bundles all of the listed articles into one or more file, optionally compressing them and sending them to rnews at the remote site.[76]

Figure 18.1. shows the news flow through relaynews. Articles may be relayed to the local site (denoted by ME), to a site named ponderosa via email, and a site named moria, for which batching is enabled.

Figure 18.1. News flow through relaynews

Installation

C News should be available in a prepackaged format for any modern Linux distribution, so installation will be easy. If not, or if you want to install from the original source distribution, then of course you can.[77] No matter how you install it, you will need to edit the C News configuration files. Their formats are described in the following list:

sys

The sys file controls which newsgroups your site receives and forwards. We discuss it in detail in the following section.

active

Not usually edited by the administration; contains directions for handling articles in each newsgroup the site handles.

organization

This file should contain your organization's name, for example, Virtual Brewery, Inc. On your home machine, enter private site, or anything else you like. Most people will not consider your site properly configured if you haven't customized this file.

newsgroups

This file is a list of all newsgroups, with a one-line description of each one's purpose. These descriptions are frequently used by your newsreader when displaying the list of all groups to which you are subscribed.

mailname

Your site's mail name, e.g., vbrew.com.

whoami

Your site's name for news purposes. Quite often, the UUCP site name is used, e.g., vbrew.

explist

You should probably edit this file to reflect your preferred expiration times for special newsgroups. Disk space may play an important role in your choices.

To create an initial hierarchy of newsgroups, obtain active and newsgroups files from the site that feeds you. Install them in /etc/news, making sure they are owned by news and have a mode of 644, using the chmod command. Remove all to.* groups from the active file, and add to.my-site, to.feed-site, junk, and control. The to.* groups are normally used for exchanging ihave/sendme messages, but you should list them regardless of whether you plan to use ihave/sendme or not. Next, replace all article numbers in the second and third field of active using the following commands:

# cp active active.old
# sed 's/ [0-9]* [0-9]* / 0000000000 00001 /' active.old > active
# rm active.old

The second command invokes the sed stream editor. This invocation replaces two strings of digits with a string of zeroes and the string 000001, respectively.

Finally, create the news spool directory and the subdirectories used for incoming and outgoing news:

# cd /var/spool
# mkdir news news/in.coming news/out.going news/out.master
# chown -R news.news news
# chmod -R 755 news

If you're using precompiled newsreaders sourced from a different distribution to the C News server you have running, you may find that some expect the news spool in /usr/spool/news rather than /var/spool/news. If your newsreader doesn't seem to find any articles, create a symbolic link from /usr/spool/news to /var/spool/news like this:

# ln -sf /usr/spool/news /var/spool/news

Now you are ready to receive news. Note that you don't have to create the individual newsgroup spool directories. C News automatically creates spool directories for any newsgroup it receives an article for, if one doesn't already exist.

In particular, this happens to all groups to which an article has been cross-posted. So, after a while, you will find your news spool cluttered with directories for newsgroups you have never subscribed to, like alt.lang.teco. You may prevent this by either removing all unwanted groups from active, or by regularly running a shell script that removes all empty directories below /var/spool/news (except out.going and in.coming, of course).

C News needs a user to send error messages and status reports to. By default, this is usenet. If you use the default, you have to set up an alias for it that forwards all of its mail to one or more responsible person. You may also override this behavior by setting the environment variable NEWSMASTER to the appropriate name. You have to do so in news's crontab file, as well as every time you invoke an administrative tool manually, so installing an alias is probably easier. Mail aliases are described in Chapter 16., Sendmail, and ???.

While you're hacking /etc/passwd, make sure that every user has her real name in the pw_gecos field of the password file (this is the fourth field). It is a question of Usenet netiquette that the sender's real name appears in the From: field of the article. Of course, you will want to do so anyway when you use mail.

The sys File

The sys file, located in /etc/news, controls which hierarchies you receive and forward to other sites. Although there are maintenance tools named addfeed and delfeed, we think it's better to maintain this file by hand.

The sys file contains entries for each site to which you forward news, as well as a description of the groups you will accept. The first line is a ME entry that describes your system. It's a safe bet to use the following:

ME:all/all::

You also have to add a line for each site to which you feed news. Each line looks like this:

site[/exclusions]:grouplist[/distlist][:flags[:cmds]]

Entries may be continued across newlines using a backslash (\) at the end of the line to be continued. A hash sign (#) denotes a comment.

site

This is the name of the site the entry applies to. One usually chooses the site's UUCP name for this. There has to be an entry for your site in the sys file too, or you will not receive any articles yourself.

The special site name ME denotes your site. The ME entry defines all groups you are willing to store locally. Articles that aren't matched by the ME line will go to the junk group.

C News rejects any articles that have already passed through this site to prevent loops. C News does this by ensuring that the local site name does not appear in the Path: of the article. Some sites may be known by a number of valid names. For example, some sites use their fully qualified domain name in this field, or an alias like news.site.domain. To ensure the loop prevention mechanism works, it is important to add all aliases to the exclusion list, separating them by commas.

For the entry applying to site moria, for instance, the site field would contain moria/moria.orcnet.org. If moria were also by an alias of news.orcnet.org, then our site field would contain moria/moria.orcnet.org,news.orcnet.org.

grouplist

This is a comma-separated subscription list of groups and hierarchies for this particular site. A hierarchy may be specified by giving the hierarchy's prefix (such as comp.os for all groups whose names start with this prefix), optionally followed by the keyword all (e.g., comp.os.all).

You can exclude a hierarchy or group from forwarding by preceding it with an exclamation mark. If a newsgroup is checked against the list, the longest match applies. For example, if grouplist contains this list:

!comp,comp.os.linux,comp.folklore.computers

no groups from the comp hierarchy except comp.folklore.computers and all groups below comp.os.linux will be fed to that site.

If the site requests to be forwarded all news you receive yourself, enter all as grouplist.

distlist

This value is offset from the grouplist by a slash and contains a list of distributions to be forwarded. Again, you may exclude certain distributions by preceding them with an exclamation mark. All distributions are denoted by all. Omitting distlist implies a list of all.

For example, you may use a distribution list of all,!local to prevent news meant only for local use from being sent to remote sites.

There are usually at least two distributions: world, which is often the default distribution used when none is specified by the user, and local. There may be other distributions that apply to a certain region, state, country, etc. Finally, there are two distributions used by C News only; these are sendme and ihave, and are used for the sendme/ihave protocol.

The use of distributions is a subject of debate. The distribution field in a news article can be created arbitrarily, but for a distribution to be effective, the news servers in the network must know it. Some misbehaving newsreaders create bogus distributions by simply assuming the top-level newsgroup hierarchy of the article destination is a reasonable distribution. For example, one might assume comp to be a reasonable distribution to use when posting to the comp.os.linux.networking newsgroup. Distributions that apply to regions are often questionable, too, because news may travel outside of your region when sent across the Internet.[78] Distributions applying to an organization, however, are very meaningful; e.g., to prevent confidential information from leaving the company network. This purpose, however, is generally served better by creating a separate newsgroup or hierarchy.

flags

This option describes certain parameters for the feed. It may be empty or a combination of the following:

F

This flag enables batching.

f

This is almost identical to the F flag, but allows C News to calculate the size of outgoing batches more precisely, and should probably be used in preference.

I

This flag makes C News produce an article list suitable for use by ihave/sendme. Additional modifications to the sys and the batchparms file are required to enable ihave/sendme.

n

This creates batch files for active NNTP transfer clients like nntpxmit (see Chapter 19., NNTP and thenntpd Daemon). The batch files contain the article's filename along with its message ID.

L

This tells C News to transmit only articles posted at your site. This flag may be followed by a decimal number n, which makes C News transfer articles posted only within n hops from your site. C News determines the number of hops from the Path: field.

u

This tells C News to batch only articles from unmoderated groups.

m

This tells C News to batch only articles from moderated groups.

You may use at most one of F, f, I, or n.

cmds

This field contains a command that will be executed for each article, unless you enable batching. The article will be fed to the command on standard input. This should be used for very small feed only; otherwise, the load on both systems will be too high.

The default command is:

uux - -r -z remote-system!rnews

This invokes rnews on the remote system, feeding it the article on standard input.

The default search path for commands given in this field is /bin:/usr/bin:/usr/lib/news/batch. The latter directory contains a number of shell scripts whose names start with via; they are briefly described later in this chapter.

If batching is enabled using one of the F, f, I, or n flags, C News expects to find a filename in this field rather than a command. If the filename does not begin with a slash (/), it is assumed to be relative to /var/spool/news/out.going. If the field is empty, it defaults to remote-system/togo. The file is expected to be in the same format as the remote-system/togo file and contain a list of articles to transmit.

When setting up C News, you will most probably have to write your own sys file. Here is a sample file for vbrew.com, from which you may copy what you need:

# We take whatever they give us.
ME:all/all::
# We send everything we receive to moria, except for local and
# brewery-related articles. We use batching.
moria/moria.orcnet.org:all,!to,to.moria/all,!local,!brewery:f:
# We mail comp.risks to jack@ponderosa.uucp
ponderosa:comp.risks/all::rmail jack@ponderosa.uucp
# swim gets a minor feed
swim/swim.twobirds.com:comp.os.linux,rec.humor.oracle/all,!local:f:
# Log mail map articles for later processing
usenet-maps:comp.mail.maps/all:F:/var/spool/uumaps/work/batch

The active File

The active file is located in /etc/, and lists all groups known at your site and the articles currently online. You will rarely have to touch it, but we explain it nevertheless for sake of completion. Entries take the following form:

newsgroup high low perm

newsgroup is the group's name. low and high are the lowest and highest numbers of articles currently available. If none are available at the moment, low is equal to high+1.

At least that's what the low field is meant to do. However, for efficiency, C News doesn't update this field. This wouldn't be such a big loss if there weren't newsreaders that depend on it. For instance, trn checks this field to see if it can purge any articles from its thread database. To update the low field, you therefore have to run the updatemin command regularly (or, in earlier versions of C News, the upact script).

perm is a parameter detailing the access users are granted to the group. It takes one of the following values:

y

Users are allowed to post to this group.

n

Users are not allowed to post to this group. However, the group may still be read.

x

This group has been disabled locally. This happens sometimes when news administrators (or their superiors) take offense at articles posted to certain groups.

Articles received for this group are not stored locally, although they are forwarded to the sites that request them.

m

This denotes a moderated group. When a user tries to post to this group, an intelligent newsreader notifies her of this and send the article to the moderator instead. The moderator's address is taken from the moderators file in /var/lib/news.

=real-group

This marks newsgroup as being a local alias for another group, namely real-group. All articles posted to newsgroup will be redirected to it.

In C News, you will generally not have to access this file directly. Groups can be added or deleted locally using addgroup and delgroup (see the section the section called “Maintenance Tools and Tasks” later in this chapter). A newgroup control message adds a group for the whole of Usenet, while a rmgroup message deletes a group. Never send such a message yourself! For instructions on how to create a newsgroup, read the monthly postings in news.announce.newusers.

The active.times file is closely related to the active file. Whenever a group is created, C News logs a message to this file containing the name of the group created, the date of creation, whether it was done by a newgroup control message or locally, and who did it. This is convenient for newsreaders that may notify the user of any recently created groups. It is also used by the NEWGROUPS command of NNTP.

Article Batching

News batches follow a particular format that is the same for B News, C News, and INN. Each article is preceded by a line like this:

#! rnews count

count is the number of bytes in the article. When you use batch compression, the resulting file is compressed as a whole and preceded by another line, indicated by the message to be used for unpacking. The standard compression tool is compress, which is marked by:

#! cunbatch

Sometimes, when the news server sends batches via mail software that removes the eighth bit from all data, a compressed batch may be protected using what is called c7-encoding; these batches will be marked by c7unbatch.

When a batch is fed to rnews on the remote site, it checks for these markers and processes the batch appropriately. Some sites also use other compression tools, like gzip, and precede their gzipped files with the word zunbatch instead. C News does not recognize nonstandard headers like these; you have to modify the source to support them.

In C News, article batching is performed by /usr/lib/news/batch/sendbatches, which takes a list of articles from the site/togo file and puts them into several newsbatches. It should be executed once per hour, or even more frequently, depending on the volume of traffic. Its operation is controlled by the batchparms file in /var/lib/news. This file describes the maximum batch size allowed for each site, the batching and optional compression program to be used, and the transport for delivering it to the remote site. You may specify batching parameters on a per-site basis, as well as a set of default parameters for sites not explicitly mentioned.

When installing C News, you will most likely find a batchparms file in your distribution that contains a reasonable default entry, so there's a good chance that you won't have to touch the file. Just in case, we describe its format. Each line consists of six fields, separated by spaces or tabs:

site size max batcher muncher transport
site

site is the name of the site to which the entry applies. The togo file for this site must reside in out.going/togo below the news spool. A site name of /default/ denotes the default entry and is to match any site not directly specified with an entry unique to it.

size

size is the maximum size of article batches created (before compression). For single articles larger than this, C News makes an exception and puts each in a single batch by itself.

max

max is the maximum number of batches created and scheduled for transfer before batching stalls for this particular site. This is useful in case the remote site should be down for a long time, because it prevents C News from cluttering your UUCP spool directories with zillions of newsbatches.

C News determines the number of queued batches using the queuelen script in /usr/lib/news/. If you've installed C News in a prepackaged format, the script should not need any editing, but if you choose to use a different flavor of spool directories, for example, Taylor UUCP, you might have to write your own. If you don't care about the number of spool files (because you're the only person using your computer and you don't write articles by the megabyte), you may replace the script's contents by a simple exit 0 statement.

batcher

The batcher field contains the command used for producing a batch from the list of articles in the togo file. For regular feeds, this is usually batcher. For other purposes, alternative batchers may be provided. For instance, the ihave/sendme protocol requires the article list to be turned into ihave or sendme control messages, which are posted to the newsgroup to.site. This is performed by batchih and batchsm.

muncher

The muncher field specifies the compression command. Usually, this is compcun, a script that produces a compressed batch.[79] Alternatively, suppose you create a muncher that uses gzip, say gzipcun (note that you have to write it yourself). You have to make sure that uncompress on the remote site is patched to recognize files compressed with gzip.

If the remote site does not have an uncompress command, you may specify nocomp, which does not do any compression.

transport

The last field, transport, describes the transport to be used. A number of standard commands for different transports are available; their names begin with via. sendbatches passes them the destination sitename on the command line. If the batchparms entry is not /default/, sendbatches derives the sitename from the site field by stripping it of anything after and including the first dot or slash. If the batchparms entry is /default/, the directory names in out.going are used.

To perform batching for a specific site, use the following command:

# su news -c "/usr/lib/news/batch/sendbatches site"

When invoked without arguments, sendbatches handles all batch queues. The interpretation of all depends on the presence of a default entry in batchparms. If one is found, all directories in /var/spool/news/out.going are checked; otherwise, sendbatches cycles through all entries in batchparms, processing just the sites found there. Note that sendbatches, when scanning the out.going directory, takes only those directories that contain no dots or at signs (@) as sitenames.

There are two commands that use uux to execute rnews on the remote system: viauux and viauuxz. The latter sets the z flag for uux to keep older versions from returning success messages for each article delivered. Another command, viamail, sends article batches to the user rnews on the remote system via mail. Of course, this requires that the remote system somehow feeds all mail for rnews to its local news system. For a complete list of these transports, refer to the newsbatch manual page.

All commands from the last three fields must be located in either out.going/site or /usr/lib/news/batch. Most of them are scripts; you can easily tailor new tools for your personal needs. They are invoked through pipes. The list of articles is fed to the batcher on standard input, which produces the batch on standard output. This is piped into the muncher, and so on.

Here is a sample file:

# batchparms file for the brewery
# site        | size   |max    |batcher  |muncher    |transport
#-------------+--------+-------+---------+-----------+-----------
/default/       100000  22      batcher   compcun     viauux
swim             10000  10      batcher   nocomp      viauux

Expiring News

In B News, expiration needs to be performed by a program called expire, which took a list of newsgroups as arguments, along with a time specification after which articles had to be expired. To have different hierarchies expire at different times, you had to write a script that invoked expire for each of them separately. C News offers a more convenient solution. In a file called explist, you may specify newsgroups and expiration intervals. A command called doexpire is usually run once a day from cron and processes all groups according to this list.

Occasionally, you may want to retain articles from certain groups even after they have been expired; for example, you might want to keep programs posted to comp.sources.unix. This is called archiving. explist permits you to mark groups for archiving.

An entry in explist looks like this:

grouplist perm times archive

grouplist is a comma-separated list of newsgroups to which the entry applies. Hierarchies may be specified by giving the group name prefix, optionally appended with all. For example, for an entry applying to all groups below comp.os, enter either comp.os or comp.os.all.

When expiring news from a group, the name is checked against all entries in explist in the order given. The first matching entry applies. For example, to throw away the majority of comp after four days, except for comp.os.linux.announce, which you want to keep for a week, you simply have an entry for the latter, which specifies a seven-day expiration period, followed by an expiration period for comp, which specifies four days.

The perm field details if the entry applies to moderated, unmoderated, or any groups. It may take the values m, u, or x, which denote moderated, unmoderated, or any type.

The third field, times, usually contains only a single number. This is the number of days after which articles expire if they haven't been assigned an artificial expiration date in an Expires: field in the article header. Note that this is the number of days counting from its arrival at your site, not the date of posting.

The times field may, however, be more complex than that. It may be a combination of up to three numbers separated from one another by dashes. The first denotes the number of days that have to pass before the article is considered a candidate for expiration, even if the Expires: field would have it expire already. It is rarely useful to use a value other than zero. The second field is the previously mentioned default number of days after which it will be expired. The third is the number of days after which an article will be expired unconditionally, regardless of whether it has an Expires: field or not. If only the middle number is given, the other two take default values. These may be specified using the special entry /bounds/, which is described a little later.

The fourth field, archive, denotes whether the newsgroup is to be archived and where. If no archiving is intended, a dash should be used. Otherwise, you either use a full pathname (pointing to a directory) or an at sign (@). The at sign denotes the default archive directory, which must then be given to doexpire by using the a flag on the command line. An archive directory should be owned by news. When doexpire archives an article from say, comp.sources.unix, it stores it in the directory comp/sources/unix below the archive directory, creating it if necessary. The archive directory itself, however, will not be created.

There are two special entries in your explist file that doexpire relies on. Instead of a list of newsgroups, they have the keywords /bounds/ and /expired/. The /bounds/ entry contains the default values for the three values of the times field described previously.

The /expired/ field determines how long C News will hold onto lines in the history file. C News will not remove a line from the history file once the corresponding article(s) have been expired, but will hold onto it in case a duplicate should arrive after this date. If you are fed by only one site, you can keep this value small. Otherwise, a couple of weeks is advisable on UUCP networks, depending on the delays you experience with articles from these sites.

Here is a sample explist file with rather tight expiry intervals:

# keep history lines for two weeks. No article gets more than three months
/expired/                       x       14      -
/bounds/                        x       0-1-90  -
# groups we want to keep longer than the rest
comp.os.linux.announce          m       10      -
comp.os.linux                   x       5       -
alt.folklore.computers          u       10      -
rec.humor.oracle                m       10      -
soc.feminism                    m       10      -
# Archive *.sources groups
comp.sources,alt.sources        x       5       @
# defaults for tech groups
comp,sci                        x       7       -
# enough for a long weekend
misc,talk                       x       4       -
# throw away junk quickly
junk                            x       1       -
# control messages are of scant interest, too
control                         x       1       -
# catch-all entry for the rest of it
all                             x       2       -

Expiring presents several potential problems. One is that your newsreader might rely on the third field of the active file described earlier, which contains the number of the lowest article online. When expiring articles, C News does not update this field. If you need (or want) to have this field represent the real situation, you need to run a program called updatemin after each run of doexpire. (In older versions of C News, a script called upact did this.)

C News does not expire by scanning the newsgroup's directory, but simply checks the history file if the article is due for expiration.[80] If your history file somehow gets out of sync, articles may be around on your disk forever because C News has literally forgotten them.[81] You can repair this by using the addmissing script in /usr/lib/news/maint, which will add missing articles to the history file or mkhistory, which rebuilds the entire file from scratch. Don't forget to become user news before invoking it, or else you will wind up with a history file unreadable by C News.

Miscellaneous Files

There are a number of files that control the behavior of C News, but are not essential. All of them reside in /etc/news. We describe them briefly here:

newsgroups

This is a companion file of active that contains a list of each newsgroup name along with a one-line description of its main topic. This file is automatically updated when C News receives a checknews control message.

localgroups

If you have a lot of local groups, you can keep C News from complaining about them each time you receive a checkgroups message by putting their names and descriptions in this file, just as they would appear in newsgroups.

mailpaths

This file contains the moderator's address for each moderated group. Each line contains the group name followed by the moderator's email address (offset by a tab).

Two special entries are provided as defaults: backbone and internet. Both provide, in bang-path notation, the path to the nearest backbone site and the site that understands RFC-822 style addresses (user@host). The default entries are:

internet           backbone

You do not have to change the internet entry if you have exim or sendmail installed; they understand RFC-822 addressing.

The backbone entry is used whenever a user posts to a moderated group whose moderator is not listed explicitly. If the newsgroup's name is alt.sewer and the backbone entry contains path!%s, C News will mail the article to path!alt-sewer, hoping that the backbone machine is able to forward the article. To find out which path to use, ask the news-admin at the site that feeds you. As a last resort, you can also use uunet.uu.net!%s.

distributions

This file is not really a C News file, but is used by some newsreaders and nntpd. It contains the list of distributions recognized by your site and a description of their (intended) effects. For example, Virtual Brewery has the following file:

world         everywhere in the world
local         Only local to this site
nl            Netherlands only
mugnet        MUGNET only
fr            France only
de            Germany only
brewery       Virtual Brewery only
log

This file contains a log of all C News activities. It is culled regularly by running newsdaily; copies of the old log files are kept in log.o, log.oo, etc.

errlog

This is a log of all error messages created by C News. These messages do not include logs of articles junked due to being sent to an invalid wrong group or other user errors. This file is mailed to the newsmaster (usenet by default) automatically by newsdaily if it is not found empty.

errlog is cleared by newsdaily. errlog.o keeps old copies and companions.

batchlog

This file logs all runs of sendbatches. It is usually of scant interest. It is also attended by newsdaily.

watchtime

This is an empty file created each time newswatch runs.

Control Messages

The Usenet news protocol knows a special category of articles that evoke certain responses or actions by the news system. These are called control messages. They are recognized by the presence of a Control: field in the article header, which contains the name of the control operation to be performed. There are several types of them, all of which are handled by shell scripts located in /usr/lib/news/ctl.

Most of these messages perform their action automatically at the time the article is processed by C News without notifying the newsmaster. By default, only checkgroups messages will be handed to the newsmaster, but you may change this by editing the scripts.

The cancel Message

The most widely known message is cancel, with which a user can cancel an article sent earlier. This effectively removes the article from the spool directories, if it exists. The cancel message is forwarded to all sites that receive news from the groups affected, regardless of whether the article has been seen already. This takes into account the possibility that the original article has been delayed over the cancellation message. Some news systems allow users to cancel other people's messages; this is, of course, a definite no-no.

newgroup and rmgroup

Two messages dealing with creation or removal of newsgroups are the newgroup and rmgroup messages. Newsgroups below the usual hierarchies may be created only after a discussion and voting has been held among Usenet readers. The rules applying to the alt hierarchy allow for something close to anarchy. For more information, see the regular postings in news.announce.newusers and news.announce.newgroups. Never send a newgroup or rmgroup message yourself unless you definitely know that you are allowed to.

The checkgroups Message

checkgroups messages are sent by news administrators to make all sites within a network synchronize their active files with the realities of Usenet. For example, commercial Internet Service Providers might send out such a message to their customers' sites. Once a month, the official checkgroups message for the major hierarchies is posted to comp.announce.newgroups by its moderator. However, it is posted as an ordinary article, not as a control message. To perform the checkgroups operation, save this article to a file, say /tmp/check, remove everything up to the beginning of the control message itself, and feed it to the checkgroups script using the following command:

# su news -c "/usr/lib/news/ctl/checkgroups" < /tmp/check

This will update your newsgroups file from the new list of groups, adding the groups listed in localgroups. The old newsgroups file will be moved to newsgroups.bac. Note that posting the message locally rarely works, because inews, the command that accepts and posts articles from users, refuses to accept that large an article.

If C News finds mismatches between the checkgroups list and the active file, it produces a list of commands that would bring your site up to date and mails it to the news administrator.

The output typically looks like this:

From news Sun Jan 30 16:18:11 1994
Date: Sun, 30 Jan 94 16:18 MET
From: news (News Subsystem)
To: usenet
Subject: Problems with your active file
The following newsgroups are not valid and should be removed.
        alt.ascii-art
        bionet.molbio.gene-org
        comp.windows.x.intrisics
        de.answers
You can do this by executing the commands:
         /usr/lib/news/maint/delgroup alt.ascii-art
         /usr/lib/news/maint/delgroup bionet.molbio.gene-org
         /usr/lib/news/maint/delgroup comp.windows.x.intrisics
         /usr/lib/news/maint/delgroup de.answers
The following newsgroups were missing.
        comp.binaries.cbm
        comp.databases.rdb
        comp.os.geos
        comp.os.qnx
        comp.unix.user-friendly
        misc.legal.moderated
        news.newsites
        soc.culture.scientists
        talk.politics.crypto
        talk.politics.tibet

When you receive a message like this from your news system, don't believe it automatically. Depending on who sent the checkgroups message, it may lack a few groups or even entire hierarchies; you should be careful about removing any groups. If you find groups are listed as missing that you want to carry at your site, you have to add them using the addgroup script. Save the list of missing groups to a file and feed it to the following little script:

#!/bin/sh
#
WHOIAM=`whoami`
if [ "$WHOIAM" != "news" ]
then
	echo "You must run $0 as user 'news'" >&2
	exit 1
fi
#
cd /usr/lib/news
while read group; do
    if grep -si "^$group[[:space:]].*moderated" newsgroup; then
        mod=m
    else
        mod=y
    fi
    /usr/lib/news/maint/addgroup $group $mod
done

sendsys, version, and senduuname

Finally, there are three messages that can be used to find out about the network's topology. These are sendsys, version, and senduuname. They cause C News to return the sys file to the sender, as well as a software version string and the output of uuname, respectively. C News is very laconic about version messages; it returns a simple, unadorned C.

Again, you should never issue such a message unless you have made sure that it cannot leave your (regional) network. Replies to sendsys messages can quickly bring down a UUCP network.[82]

C News in an NFS Environment

A simple way to distribute news within a local network is to keep all news on a central host and export the relevant directories via NFS so that newsreaders may scan the articles directly. The overhead involved in retrieving and threading articles is significantly lower than NNTP. NNTP, on the other hand, wins in a heterogeneous network where equipment varies widely among hosts, or where users don't have equivalent accounts on the server machine.

When you use NFS, articles posted on a local host have to be forwarded to the central machine because accessing adminstrative files might otherwise expose the system to race conditions that leave the files inconsistent. Also, you might want to protect your news spool area by exporting it read-only, which also requires forwarding to the central machine.

C News handles this central machine configuration transparently to the user. When you post an article, your newsreader usually invokes inews to inject the article into the news system. This command runs a number of checks on the article, completes the header, and checks the file server in /etc/news. If this file exists and contains a hostname different from the local host's name, inews is invoked on that server host via rsh. Since the inews script uses a number of binary commands and support files from C News, you have to either have C News installed locally or mount the news software from the server.

For the rsh invocation to work properly, each user who posts news must have an equivalent account on the server system, i.e., one to which she can log in without being asked for a password.

Make sure that the hostname given in server literally matches the output of the hostname command on the server machine, or else C News will loop forever in an attempt to deliver the article. We discuss NFS is detail in Chapter 14., The NetworkFile System.

Maintenance Tools and Tasks

Despite the complexity of C News, a news administrator's life can be fairly easy; C News provides you with a wide variety of maintenance tools. Some of these are intended to be run regularly from cron, like newsdaily. Using these scripts greatly reduces daily care and feeding requirements of your C News installation.

Unless stated otherwise, these commands are located in /usr/lib/news/maint. (Note that you must become user news before invoking these commands. Running them as a superuser may render critical newsfiles inaccessible to C News.):

newsdaily

The name already says it: run this once a day. It is an important script that helps you keep log files small, retaining copies of each from the last three runs. It also tries to sense anomalies, like stale batches in the incoming and outgoing directories, postings to unknown or moderated newsgroups, etc. Resulting error messages are mailed to the newsmaster.

newswatch

This script should be run regularly to look for anomalies in the news system, once an hour or so. It is intended to detect problems that will have an immediate effect on the operability of your news system, in which case it mails a trouble report to the newsmaster. Things checked include stale lock files that don't get removed, unattended input batches, and disk space shortage.

addgroup

This script adds a group to your site locally. The proper invocation is:

addgroup groupname y|n|m|=realgroup

The second argument has the same meaning as the flag in the active file, meaning that anyone may post to the group (y), that no one may post (n), that it is moderated (m), or that it is an alias for another group (=realgroup). You might also want to use addgroup when the first articles in a newly created group arrive earlier than the newgroup control message that is intended to create it.

delgroup

This script allows you to delete a group locally. Invoke it as:

delgroup groupname

You still have to delete the articles that remain in the newsgroup's spool directory. Alternatively, you might leave it to the natural course of events (i.e., expiration) to make them go away.

addmissing

This script adds missing articles to the history file. Run it when there are articles that seem to hang around forever.

newsboot

This script should be run at system boot time. It removes any lock files left over when news processes were killed at shutdown, and closes and executes any batches left over from NNTP connections that were terminated when shutting down the system.

newsrunning

This script resides in /usr/lib/news/input and may be used to disable unbatching of incoming news, for instance during work hours. You may turn off unbatching by invoking:

/usr/lib/news/input/newsrunning off

It is turned on by using on instead of off.



[75] There may be a difference between the groups that exist at your site and those that your site is willing to receive. For example, the subscription list might specify comp.all, which should send all newsgroups below the comp hierarchy, but at your site you might not list several of the comp newsgroups in the active file. Articles posted to those groups will be moved to junk.

[76] Note that this should be the crontab of news; file permissions will not be mangled.

[77] You can obtain the C News source distribution from its home site at ftp.cs.toronto.edu /pub/c-news/c-news.tar.Z

[78] It is not uncommon for an article posted in say, Hamburg, to go to Frankfurt via reston.ans.net in the Netherlands, or even via some site in the U.S.

[79] As shipped with C News, compcun uses compress with the 12-bit option, since this is the lowest common denominator for most sites. You may produce a copy of the script, say compcun16, for which you use 16-bit compression. The improvement is not too impressive, though.

[80] The article's date of arrival is kept in the middle field of the history line and given in seconds since January 1, 1970.

[81] I don't know why this happens, but it does from time to time.

[82] I wouldn't try this on the Internet, either.

Network News Transfer Protocol (NNTP) provides for a vastly different approach to news exchange from C News and other news servers without native NNTP support. Rather than rely on a batch technology like UUCP to transfer news articles between machines, it allows articles to be exchanged via an interactive network connection. NNTP is not a particular software package, but an Internet standard described in RFC-977. It is based on a stream-oriented connection, usually over TCP, between a client anywhere in the network and a server on a host that keeps Netnews on disk storage. The stream connection allows the client and server to interactively negotiate article transfer with nearly no turnaround delay, thus keeping the number of duplicate articles low. Together with the Internet's high-transfer rates, this adds up to a news transport that surpasses the original UUCP networks by far. While some years ago it was not uncommon for an article to take two weeks or more before it arrived in the last corner of Usenet; it is now often less than two days. On the Internet itself, it is even within the range of minutes.

Various commands allow clients to retrieve, send, and post articles. The difference between sending and posting is that the latter may involve articles with incomplete header information; it generally means that the user has just written the article.[83] Article retrieval may be used by news transfer clients as well as newsreaders. This makes NNTP an excellent tool for providing news access to many clients on a local network without going through the contortions that are necessary when using NFS.

NNTP also provides for an active and a passive way to transfer news, colloquially called pushing and pulling. Pushing is basically the same as the ihave/sendme protocol used by C News (described in Chapter 18., C News). The client offers an article to the server through the IHAVE msgid command, and the server returns a response code that indicates whether it already has the article or if it wants it. If the server wants the article, the client sends the article, terminated by a single dot on a separate line.

Pushing news has the single disadvantage that it places a heavy load on the server system, since the system has to search its history database for every single article.

The opposite technique is pulling news, in which the client requests a list of all (available) articles from a group that have arrived after a specified date. This query is performed by the NEWNEWS command. From the returned list of message IDs, the client selects those articles it does not yet have, using the ARTICLE command for each of them in turn.

Pulling news needs tight control by the server over which groups and distributions it allows a client to request. For example, it has to make sure that no confidential material from newsgroups local to the site is sent to unauthorized clients.

There are also a number of convenience commands for newsreaders that permit them to retrieve the article header and body separately, or even single header lines from a range of articles. This lets you keep all news on a central host, with all users on the (presumably local) network using NNTP-based client programs for reading and posting. This is an alternative to exporting the news directories via NFS, as described in Chapter 18., C News.

An overall problem of NNTP is that it allows a knowledgeable person to insert articles into the news stream with false sender specification. This is called news faking or spoofing.[84] An extension to NNTP allows you to require user authentication for certain commands, providing some measure of protection against people abusing your news server in this way.

There are a number of NNTP packages. One of the more widely known is the NNTP daemon, also known as the reference implementation. Originally, it was written by Stan Barber and Phil Lapsley to illustrate the details of RFC-977. As with much of the good software available today, you may find it prepackaged for your Linux distribution, or you can obtain the source and compile it yourself. If you choose to compile it yourself, you will need to be quite familiar with your distribution to ensure you configure all of the file paths correctly.

The nntpd package has a server, two clients for pulling and pushing news, and an inews replacement. They live in a B News environment, but with a little tweaking, they will be happy with C News, too. However, if you plan to use NNTP for more than offering newsreaders access to your news server, the reference implementation is not really an option. We will therefore discuss only the NNTP daemon contained in the nntpd package and leave out the client programs.

If you wish to run a large news site, you should look at the InterNet News package, or INN, that was written by Rich Salz. It provides both NNTP and UUCP-based news transport. News transport is definitely better than nntpd. We discuss INN in detail in Chapter 20., Internet News.

The NNTP Protocol

We've mentioned two NNTP commands that are key to how news articles are pushed or pulled between servers. Now we'll look at these in the context of an actual NNTP session to show you how simple the protocol is. For the purposes of our illustration, we'll use a simple telnet client to connect to an INN-based news server at the Virtual Brewery called news.vbrew.com. The server is running a minimal configuration to keep the examples short. We'll look at how to complete the configuration of this server in Chapter 20., Internet News. In our testing we'll be very careful to generate articles in the junk newsgroup only, to avoid disturbing anyone else.

Connecting to the News Server

Connecting to the news server is a simple as opening a TCP connection to its NNTP port. When you are connected, you will be greeted with a welcome banner. One of the first commands you might try is help. The response you get generally depends upon whether the server believes you are a remote NNTP server or a newsreader, as there are different command sets required. You can change your operating mode using the mode command; we'll look at that in a moment:

$ telnet news.vbrew.com nntp
	Trying 172.16.1.1...
	Connected to localhost.
	Escape character is '^]'.
	200 news.vbrew.com InterNetNews server INN 1.7.2 08-Dec-1997 ready
	help
	100 Legal commands
        authinfo
	help
	ihave
	check
	takethis
	list
	mode
	xmode
	quit
	head
	stat
	xbatch
	xpath
	xreplic
	For more information, contact "usenet" at this machine.
	.

The responses to NNTP commands always end with a period (.) on a line by itself. The numbers you see in the output listing are response codes and are used by the server to indicate success or failure of a command. The response codes are described in RFC-977; we'll talk about the most important ones as we proceed.

Pushing a News Article onto a Server

We mentioned the IHAVE command when we talked about pushing news articles onto a news server. Let's now have a look at how the IHAVE command actually works:

ihave <123456@gw.vk2ktj.ampr.org>
	335
	From: terry@gw.vk2ktj.ampr.org
	  Subject: test message sent with ihave
	  Newsgroups: junk
	  Distribution: world
	  Path: gw.vk2ktj.ampr.org
	  Date: 26 April 1999
	  Message-ID: <123456@gw.vk2ktj.ampr.org>
	  Body: 
	  
	  This is a test message sent using the NNTP IHAVE command.
	  
	  .
	235

All NNTP commands are case insensitive, so you may enter them in either upper- or lowercase. The IHAVE command takes one mandatory argument, it being the Message ID of the article that is being pushed. Every news article is assigned a unique message ID when it is created. The IHAVE command provides a way of the NNTP server to say which articles it has when it wants to push articles to another server. The sending server will issue an IHAVE command for each article it wishes to push. If the command response code generated by the receiving NNTP server is in the 3xx range, the sending NNTP server will transmit the complete article, including it's full header, terminating the article with a period on a line by itself. If the response code was in the 4xx range, the receiving server has chosen not to accept this article, possibly because it already has it, or because of some problem, such as running out of disk space.

When the article has been transmitted, the receiving serve issues another response code indicating whether the article transmission was successful.

Changing to NNRP Reader Mode

Newsreaders use their own set of commands when talking to a news server. To activate these commands, the news server has to be operating in reader mode. Most news servers default to reader mode, unless the IP address of the connecting host is listed as a news-forwarding peer. In any case, NNTP provides a command to explicitly switch into reader mode:

mode reader
	200 news.vbrew.com InterNetNews NNRP server INN 1.7.2 08-Dec-1997 ready/
	(posting ok).
	help
	100 Legal commands
	authinfo user Name|pass Password|generic <prog> <args>
	article [MessageID|Number]
	body [MessageID|Number]
	date
	group newsgroup
	head [MessageID|Number]
	help
	ihave
	last
	list [active|active.times|newsgroups|distributions|distrib.pats|/
	overview.fmt|subscriptions]
	listgroup newsgroup
	mode reader
	newgroups yymmdd hhmmss ["GMT"] [<distributions>]
	newnews newsgroups yymmddhhmmss ["GMT"] [<distributions>]
	next
	post
	slave
	stat [MessageID|Number]
	xgtitle [group_pattern]
	xhdr header [range|MessageID]
	xover [range]
	xpat header range|MessageID pat [morepat...]
	xpath MessageID
	Report problems to <usenet@vlager.vbrew.com>
	.

NNTP reader mode has a lot of commands. Many of these are designed to make the life of a newsreader easier. We mentioned earlier that there are commands that instruct the server to send the head and the body of articles separately. There are also commands that list the available groups and articles, and others that allow posting, an alternate means of sending news articles to the server.

Listing Available Groups

The list command lists a number of different types of information; notably the groups supported by the server:

list newsgroups
	215 Descriptions in form "group description".
	control                 News server internal group
	junk                    News server internal group
	local.general           General local stuff
	local.test              Local test group
	.

Listing Active Groups

list active shows each supported group and provides information about them. The two numbers in each line of the output are the high-water mark and the low-water markthat is, the highest numbered article and lowest numbered article in each group. The newsreader is able to form an idea of the number of articles in the group from these. We'll talk a little more about these numbers in a moment. The last field in the output displays flags that control whether posting is allowed to the group, whether the group is moderated, and whether articles posted are actually stored or just passed on. These flags are described in detail in Chapter 20., Internet News. An example looks like this:

list active
	215 Newsgroups in form "group high low flags".
	control 0000000000 0000000001 y
	junk 0000000003 0000000001 y
	alt.test 0000000000 0000000001 y
	.

Posting an Article

We mentioned there was a difference between pushing an article and posting an article. When you are pushing an article, there is an implicit assumption that the article already exists, that it has a message identifier that has been uniquely assigned to it by the server to which it was originally posted, and that it has a complete set of headers. When posting an article, you are creating the article for the first time and the only headers you supply are those that are meaningful to you, such as the Subject and the Newgroups to which you are posting the article. The news server you post the article on will add all the other headers for you and create a message ID that it will use when pushing the article onto other servers.

All of this means that posting an article is even easier than pushing one. An example posting looks like this:

post
	340 Ok
	From: terry@richard.geek.org.au
	  Subject: test message number 1
	  Newsgroups: junk
	  Body: 
	  
	  This is a test message, please feel free to ignore it.
	  
	  .
	240 Article posted

We've generated two more messages like this one to give our following examples some realism.

Listing New Articles

When a newsreader first connects to a new server and the user chooses a newsgroup to browse, the newsreader will want to retrieve a list of new articles, those posted or received since the last login by the user. The newnews command is used for this purpose. Three mandatory arguments must be supplied: the name of the group or groups to query, the start date, and the start time from which to list. The date and time are each specified as six-digit numbers, with the most significant information first; yymmdd and hhmmss, respectively:

newnews junk 990101 000000
	230 New news follows
	<7g2o5r$aa$6@news.vbrew.com>
	<7g5bhm$8f$2@news.vbrew.com>
	<7g5bk5$8f$3@news.vbrew.com>
	.

Selecting a Group on Which to Operate

When the user selects a newsgroup to browse, the newsreader may tell the news server that the group was selected. This simplifies the interaction between newsreader and news server; it removes the need to constantly send the name of the newsgroup with each command. The group command simply takes the name of the selected group as an argument. Many following commands use the group selected as the default, unless another newsgroup is specified explicitly:

group junk
	211 3 1 3 junk

The group command returns a message indicating the number of active messages, the low-water mark, the high-water mark, and the name of the group, respectively. Note that while the number of active messages and the high-water mark are the same in our example, this is not often the case; in an active news server, some articles may have expired or been deleted, lowering the number of active messages but leaving the high-water mark untouched.

Listing Articles in a Group

To address newsgroup articles, the newsreader must know which article numbers represent active articles. The listgroup command offers a list of the active article numbers in the current group, or an explicit group if the group name is supplied:

listgroup junk
	211 Article list follows
	1
	2
	3
	.

Retrieving an Article Header Only

The user must have some information about an article before she can know whether she wishes to read it. We mentioned earlier that some commands allow the article header and body to be transferred separately. The head command is used to request that the server transmit just the header of the specified article to the newsreader. If the user doesn't want to read this article, we haven't wasted time and network bandwidth transferring a potentially large article body unnecessarily.

Articles may be referenced using either their number (from the listgroup command) or their message identifier:

head 2
	221 2 <7g5bhm$8f$2@news.vbrew.com> head
	Path: news.vbrew.com!not-for-mail
	From: terry@richard.geek.org.au
	Newsgroups: junk
	Subject: test message number 2
	Date: 27 Apr 1999 21:51:50 GMT
	Organization: The Virtual brewery
	Lines: 2
	Message-ID: <7g5bhm$8f$2@news.vbrew.com>
	NNTP-Posting-Host: localhost
	X-Server-Date: 27 Apr 1999 21:51:50 GMT
	Body: 
	Xref: news.vbrew.com junk:2
	.

Retrieving an Article Body Only

If, on the other hand, the user decides she does want to read the article, her newsreader needs a way of requesting that the message body be transmitted. The body command is used for this purpose. It operates in much the same way as the head command, except that only the message body is returned:

body 2
	222 2 <7g5bhm$8f$2@news.vbrew.com> body
	This is another test message, please feel free to ignore it too.
	
	.

Reading an Article from a Group

While it is normally most efficient to separately transfer the headers and bodies of selected articles, there are occasions when we are better off transferring the complete article. A good example of this is in applications through which we want to transfer all of the artices in a group without any sort of preselection, such as when we are using an NNTP cache program like leafnode.[85]

Naturally, NNTP provides a means of doing this, and not surprisingly, it operates almost identically to the head command as well. The article command also accepts an article number or message ID as an argument, but returns the whole article including its header:

article 1
	220 1 <7g2o5r$aa$6@news.vbrew.com> article
	Path: news.vbrew.com!not-for-mail
	From: terry@richard.geek.org.au
	Newsgroups: junk
	Subject: test message number 1
	Date: 26 Apr 1999 22:08:59 GMT
	Organization: The Virtual brewery
	Lines: 2
	Message-ID: <7g2o5r$aa$6@news.vbrew.com>
	NNTP-Posting-Host: localhost
	X-Server-Date: 26 Apr 1999 22:08:59 GMT
	Body: 
	Xref: news.vbrew.com junk:1
	
	This is a test message, please feel free to ignore it.
	
	.

If you attempt to retrieve an unknown article, the server will return a message with an appropriately coded response code and perhaps a readable text message:

article 4
	423 Bad article number

We've described how the most important NNTP commands are used in this section. If you're interested in developing software that implements the NNTP protocol, you should refer to the relevant RFC documents; they provide a great deal of detail that we couldn't include here.

Let's now look at NNTP in action through the nntpd server.

Installing the NNTP Server

The NNTP server (nntpd) may be compiled in two ways, depending on the expected load on the news system. There are no compiled versions available, because of some site-specific defaults that are hardcoded into the executable. All configuration is done through macros defined in common/conf.h.

nntpd may be configured as either a standalone server that is started at system boot time from an rc file, or a daemon managed by inetd. In the latter case, you have to have the following entry in /etc/inetd.conf:

nntp    stream  tcp nowait      news    /usr/etc/in.nntpd    nntpd

The inetd.conf syntax is described in detail in Chapter 12., ImportantNetwork Features. If you configure nntpd as standalone, make sure that any such line in inetd.conf is commented out. In either case, you have to make sure the following line appears in /etc/services:

nntp    119/tcp   readnews untp    # Network News Transfer Protocol

To temporarily store any incoming articles, nntpd also needs a .tmp directory in your news spool. You should create it using the following commands:

# mkdir /var/spool/news/.tmp
	# chown news.news /var/spool/news/.tmp

Restricting NNTP Access

Access to NNTP resources is governed by the file nntp_access in /etc/news. Lines in this file describe the access rights granted to foreign hosts. Each line has the following format:

	site   read|xfer|both|no    post|no      [!exceptgroups]

If a client connects to the NNTP port, nntpd attempts to obtain the host's fully qualified domain name from its IP address using reverse lookup. The client's hostname and IP address are checked against the site field of each entry in the order in which they appear in the file. Matches may be either partial or exact. If an entry matches exactly, it applies; if the match is partial, it applies only if there is no other match following it that is at least as good. site may be specified in one of the following ways:

Hostname

This is a fully qualified domain name of a host. If this matches the client's canonical hostname literally, the entry applies, and all following entries are ignored.

IP address

This is an IP address in dotted quad notation. If the client's IP address matches this, the entry applies, and all following entries are ignored.

Domain name

This is a domain name, specified as *.domain. If the client's hostname matches the domain name, the entry matches.

Network name

This is the name of a network as specified in /etc/networks. If the network number of the client's IP address matches the network number associated with the network name, the entry matches.

Default

The string default matches any client.

Entries with a more general site specification should be specified earlier, because any matches will be overridden by later, more exact matches.

The second and third fields describe the access rights granted to the client. The second field details the permissions to retrieve news by pulling (read), and transmit news by pushing (xfer). A value of both enables both; no denies access altogether. The third field grants the client the right to post articles, i.e., deliver articles with incomplete header information, which is completed by the news software. If the second field contains no, the third field is ignored.

The fourth field is optional and contains a comma-separated list of groups to which the client is denied access.

This is a sample nntp_access file:

#
	# by default, anyone may transfer news, but not read or post
	default                 xfer            no
	#
	# public.vbrew.com offers public access via modem. We allow
	# them to read and post to any but the local.* groups
	public.vbrew.com        read            post    !local
	#
	# all other hosts at the brewery may read and post
	*.vbrew.com             read            post

NNTP Authorization

The nntpd daemon provides a simple authorization scheme. If you capitalize any of the access tokens in the nntp_access file, nntpd requires authorization from the client for the respective operation. For instance, when specifying a permission of Xfer or XFER, (as opposed to xfer), nntpd will not let the client transfer articles to your site unless it passes authorization.

The authorization procedure is implemented by means of a new NNTP command named AUTHINFO. Using this command, the client transmits a username and a password to the NNTP server. nntpd validates them by checking them against the /etc/passwd database and verifies that the user belongs to the nntp group.

The current implementation of NNTP authorization is only experimental and has therefore not been implemented very portably. The result of this is that it works only with plain-style password databases; shadow passwords are not recognized. If you are compiling from source and have the PAM package installed, the password check is fairly simple to change.

nntpd Interaction with C News

When nntpd receives an article, it has to deliver it to the news subsystem. Depending on whether it was received as a result of an IHAVE or POST command, the article is handed to rnews or inews, respectively. Instead of invoking rnews, you may also configure it (at compile time), to batch the incoming articles and move the resulting batches to /var/spool/news/in.coming, where they are left for relaynews to pick them up at the next queue run.

nntpd has to have access to the history file to be able to properly perform the ihave/sendme protocol. At compile time, you have to make sure the path to that file is set correctly. If you use C News, make sure that C News and nntpd agree on the format of your history file. C News uses dbm hashing functions to access it; however, there are quite a number of different and slightly incompatible implementations of the dbm library. If C News has been linked with a different dbm library than you have in your standard libc, you have to link nntpd with this library, too.

nntpd and C news disagreement sometimes produces error messages in the system log that nntpd can not open it properly, or you might see duplicate articles being received via NNTP. A good test of a malfunctioning news transfer is to pick an article from your spool area, telnet to the nntp port, and offer it to nntpd as shown in the next example. Of course, you have to replace msg@id with the message ID of the article you want to feed to nntpd:

$ telnet localhost nntp
      Trying 127.0.0.1...
      Connected to localhost
      Escape characters is '^ ]'.
      201 vstout NNTP[auth] server version 1.5.11t (16 November 1991) ready at
      Sun Feb 6 16:02:32 1194 (no posting)
      IHAVE msg@id
      435 Got it.
      QUIT

This conversation shows nntpd's proper reaction; the message Got it tells you that it already has this article. If you get a message of 335 Ok instead, the lookup in the history file failed for some reason. Terminate the conversation by typing Ctrl-D. You can check what has gone wrong by checking the system log; nntpd logs all kinds of messages to the daemon facility of syslog. An incompatible dbm library usually manifests itself in a message complaining that dbminit failed.



[83] When posting an article over NNTP, the server always adds at least one header field, NNTP-Posting-Host:. The field contains the client's hostname.

[84] The same problem exists with the Simple Mail Transfer Protocol (SMTP), although most mail transport agents now provide mechanisms to prevent spoofing.

[85] leafnode is available by anonymous FTP from wpxx02.toxi.uni-wuerzburg.de in the /pub/ directory.

The Internet News daemon (INN) is arguably the most popular Netnews server in use today. INN is extremely flexible and is suitable for all but the smallest news sites.[86] INN scales well and is suited to large news server configurations.

The INN server comprises a number of components, each with their own configuration files that we will discuss in turn. Configuration of INN can be a little involved, but we'll describe each of the stages in this chapter and arm you with enough information to make sense of the INN manual pages and documentation and build configurations for just about any application.

Some INN Internals

INN's core program is the innd daemon. innd's task is to handle all incoming articles, storing them locally, and to pass them on to any outgoing newsfeeds if required. It is started at boot time and runs continually as a background process. Running as a daemon improves performance because it has to read its status files only once when starting. Depending on the volume of your news feed, certain files such as history (which contain a list of all recently processed articles) may range from a few megabytes to tens of megabytes.

Another important feature of INN is that there is always only one instance of innd running at any time. This is also very beneficial to performance, because the daemon can process all articles without having to worry about synchronizing its internal states with other copies of innd rummaging around the news spool at the same time. However, this choice affects the overall design of the news system. Because it is so important that incoming news is processed as quickly as possible, it is unacceptable that the server be tied up with such mundane tasks as serving newsreaders accessing the news spool via NNTP, or decompressing newsbatches arriving via UUCP. Therefore, these tasks have been broken out of the main server and implemented in separate support programs. Figure 20.1. attempts to illustrate the relationships between innd, the other local tasks, and remote news servers and newsreaders.

Today, NNTP is the most common means of transporting news articles around, and innd doesn't directly support anything else. This means that innd listens on a TCP socket (port 119) for connections and accepts news articles using the ihave protocol.

Articles arriving by transports other than NNTP are supported indirectly by having another process accept the articles and forward them to innd via NNTP. Newsbatches coming in over a UUCP link, for instance, are traditionally handled by the rnews program. INN's rnews decompresses the batch if necessary, and breaks it up into individual articles; it then offers them to innd one by one.

Newsreaders can deliver news when a user posts an article. Since the handling of newsreaders deserves special attention, we will come back to this a little later.

Figure 20.1. INN architecture (simplified for clarity)

When receiving an article, innd first looks up its message ID in the history file. Duplicate articles are dropped and the occurrences are optionally logged. The same goes for articles that are too old or lack some required header field, such as Subject:.[87] If innd finds that the article is acceptable, it looks at the Newsgroups: header line to find out what groups it has been posted to. If one or more of these groups are found in the active file, the article is filed to disk. Otherwise, it is filed to the special group junk.

Individual articles are kept below /var/spool/news, also called the news spool. Each newsgroup has a separate directory, in which each article is stored in a separate file. The file names are consecutive numbers, so that an article in comp.risks may be filed as comp/risks/217, for instance. When innd finds that the directory it wants to store the article in does not exist, it creates it automatically.

Apart from storing articles locally, you may also want to pass them on to outgoing feeds. This is governed by the newsfeeds file that lists all downstream sites along with the newsgroups that should be fed to them.

Just like innd's receiving end, the processing of outgoing news is handled by a single interface, too. Instead of doing all the transport-specific handling itself, innd relies on various backends to manage the transmission of articles to other news servers. Outgoing facilities are collectively dubbed channels. Depending on its purpose, a channel can have different attributes that determine exactly what information innd passes on to it.

For an outgoing NNTP feed, for instance, innd might fork the innxmit program at startup, and, for each article that should be sent across that feed, pass its message ID, size, and filename to innxmit's standard input. For an outgoing UUCP feed, on the other hand, it might write the article's size and file name to a special logfile, which is head by a different process at regular intervals in order to create batches and queue them to the UUCP subsystem.

Besides these two examples, there are other types of channels that are not strictly outgoing feeds. These are used, for instance, when archiving certain newsgroups, or when generating overview information. Overview information is intended to help newsreaders thread articles more efficiently. Old-style newsreaders had to scan all articles separately in order to obtain the header information required for threading. This would put an immense strain on the server machine, especially when using NNTP; furthermore, it was very slow.[88] The overview mechanism alleviates this problem by prerecording all relevant headers in a separate file (called .overview) for each newsgroup. This information can then be picked up by newsreaders either by reading it directly from the spool directory, or by using the XOVER command when connected via NNTP. INN has the innd daemon feed all articles to the overchan command, which is attached to the daemon through a channel. We'll see how this is done when we discuss configuring news feeds later.

Newsreaders and INN

Newsreaders running on the same machine as the server (or having mounted the server's news spool via NFS) can read articles from the spool directly. To post an article composed by the user, they invoke the inews program, which adds any header fields that are missing and forwards them to the daemon via NNTP.

Alternatively, newsreaders can access the server remotely via NNTP. This type of connection is handled differently from NNTP-based news feeds, to avoid tying up the daemon. Whenever a newsreader connects to the NNTP server, innd forks a separate program called nnrpd, which handles the session while innd returns to the more important things (receiving incoming news, for example).[89] You may be wondering how the innd process can distinguish between an incoming news feed and a connecting newsreader. The answer is quite simple: the NNTP protocol requires that an NNTP-based newsreader issue a mode reader command after connecting to the server; when this command is received, the server starts the nnrpd process, hands the connection to it, and returns to listening for connections from another news server. There used to be at least one DOS-based newsreader which was not configured to do this, and hence failed miserably when talking to INN, because innd itself does not recognize any of the commands used to read news if it doesn't know the connection is from a news reader.

We'll talk a little more about newsreader access to INN under "Controlling Newsreader Access," later in the chapter.

Installing INN

Before diving into INN's configuration, let's talk about its installation. Read this section, even if you've installed INN from one of the various Linux distributions; it contains some hints about security and compatibility.

Linux distributions included Verson INN-1.4sec for quite some time. Unfortunately, this version had two subtle security problems. Modern versions don't have these problems and most distributions include a precompiled Linux binary of INN Version 2 or later.

If you choose, you can build INN yourself. You can obtain the source from ftp.isc.org in the /isc/inn/ directory. Building INN requires that you edit a configuration file that tells INN some detail about your operating system, and some features may require minor modifications to the source itself.

Compiling the package itself is pretty simple; there's a script called BUILD that will guide you through the process. The source also contains extensive documentation on how to install and configure INN.

After installing all binaries, some manual fixups may be required to reconcile INN with any other applications that may want to access its rnews or inews programs. UUCP, for instance, expects to find the rnews program in /usr/bin or /bin, while INN installs it in /usr/lib/bin by default. Make sure /usr/lib/bin/ is in the default search path, or that there are symbolic links pointing to the actual location of the rnews and inews commands.

Configuring INN: the Basic Setup

One of the greatest obstacles beginners may face is that INN requires a working network setup to function properly, even when running on a standalone host. Therefore, it is essential that your kernel supports TCP/IP networking when running INN, and that you have set up the loopback interface as explained in Chapter 5., Configuring TCP/IP Networking.

Next, you have to make sure that innd is started at boot time. The default INN installation provides a script file called boot in the /etc/news/ directory. If your distribution uses the SystemV-style init package, all you have to do is create a symbolic link from your /etc/init.d/inn file pointing to /etc/news/boot. For other flavors of init, you have to make sure /etc/news/boot is executed from one of your rc scripts. Since INN requires networking support, the startup script should be run after the network interfaces are configured.

INN Configuration Files

Having completed these general tasks, you can now turn to the really interesting part of INN: its configuration files. All these files reside in /etc/news. Some changes to configurations files were introduced in Version 2, and it is Version 2 that we describe here. If you're running an older version, you should find this chapter useful to guide you in upgrading your configuration. During the next few sections, we will discuss them one by one, building the Virtual Brewery's configuration as an example.

If you want to find out more about the features of individual configuration files, you can also consult the manual pages; the INN distribution contains individual manual pages for each of them.

Global Parameters

There are a number of INN parameters that are global in nature; they are relevant to all newsgroups carried.

The inn.conf file

INN's main configuration file is inn.conf. Among other things, it determines the name by which your machine is known on Usenet. Version 2 of INN allows a baffling number of parameters to be configured in this file. Fortunately, most parameters have default values that are reasonable for most sites. The inn.conf(5) file details all of the parameters, and you should read it carefully if you experience any problems.

A simple example inn.conf might look like:

	    # Sample inn.conf for the Virtual Brewery
	    server:          vlager.vbrew.com
	    domain:          vbrew.com
	    fromhost:        vbrew.com
	    pathhost:        news.vbrew.com
	    organization:    The Virtual Brewery
	    mta:             /usr/sbin/sendmail -oi %s
	    moderatormailer: %s@uunet.uu.net
	    #
	    # Paths to INN components and files.
	    #
	    pathnews:               /usr/lib/news
	    pathbin:                /usr/lib/news/bin
	    pathfilter:             /usr/lib/news/bin/filter
	    pathcontrol:            /usr/lib/news/bin/control
	    pathdb:                 /var/lib/news
	    pathetc:                /etc/news
	    pathrun:                /var/run/news
	    pathlog:                /var/log/news
	    pathhttp:               /var/log/news
	    pathtmp:                /var/tmp
	    pathspool:              /var/spool/news
	    patharticles:           /var/spool/news/articles
	    pathoverview:           /var/spool/news/overview
	    pathoutgoing:           /var/spool/news/outgoing
	    pathincoming:           /var/spool/news/incoming
	    patharchive:            /var/spool/news/archive
	    pathuniover:            /var/spool/news/uniover
	    overviewname:           .overview

The first line tells the programs rnews and inews which host to contact when delivering articles. This entry is absolutely crucial; to pass articles to innd, they have to establish an NNTP connection with the server.

The domain keyword should specify the domain portion of the host's fully qualified domain name. A couple of programs must look up your host's fully qualified domain name; if your resolver library returns the unqualified hostname only, the name given in the domain attribute is tacked onto it. It's not a problem to configure it either way, so it's best to define domain.

The next line defines what hostname inews is going to use when adding a From: line to articles posted by local users. Most newsreaders use the From: field when composing a reply mail message to the author of an article. If you omit this field, it will default to your news host's fully qualifed domain name. This is ot always the best choice. You might, for example, have news and mail handled by two different hosts. In this case, you would supply the fully qualified domain name of your mail host after the fromhost statement.

The pathhost line defines the hostname INN is to add to the Path: header field whenever it receives an article. In most cases, you will want to use the fully qualified domain name of your news server; you can then omit this field since that is the default. Occasionally you may want to use a generic name, such as news.vbrew.com, when serving a large domain. Doing this allows you to move the news system easily to a different host, should you choose to at some time.

The next line contains the organization keyword. This statement allows you to configure what text inews will put into the Organization: line of articles posted by your local users. Formally, you would place a description of your organization or your organization's name in full here. Should you not wish to be so formal, it is fashionable for organizations with a sense of humor to exhibit it here.

The organization keyword is mandatory and specifies the pathname of the mail transport agent that will be used for posting moderator messages. %s is replaced by the moderator email address.

The moderatormailer entry defines a default address used when a user tries to post to a moderated newsgroup. The list of moderator addresses for each newsgroup is usually kept in a separate file, but you will have a hard time keeping track of all of them. The moderatormailer entry is therefore consulted as a last resort; if it is defined, inews will replace the %s string with the (slightly transformed) newsgroup name and send the entire article to this address. For instance, when posting to soc.feminism, the article is mailed to soc-feminism@uunet.uu.net, given the above configuration. At UUNET, there should be a mail alias installed for each of these submissions addresses that automatically forwards all messages to the appropriate moderator.

Finally, each of the remaining entries specifies the location of some component file or executable belonging to INN. If you've installed INN from a package, these paths should have been configured for you. If you're installing from source, you'll need to ensure that they reflect where you've installed INN.

Configuring Newsgroups

The news administrator on a system is able to control which newsgroups users have access to. INN provides two configuration files that allow the administrator to decide which newsgroups to support and provide descriptions for them.

The active and newsgroups files

The active and newsgroups files are used to store and describe the newsgroups hosted by this news server. They list which newsgroups we are interested in receiving and serving articles for, and administrative information about them. These files are found in the /var/lib/news/ directory.

The active file determines which newsgroups this server supports. Its syntax is straightforward. Each line in the active file has four fields delimited by whitespace:

name himark lomark flags

The name field is the name of the newsgroup. The himark field is the highest number that has been used for an article in that newsgroup. The lomark field is the lowest active number in use in the newsgroup. To illustrate how this works, consider the follow scenario. Imagine that we have a newly created newsgroup: himark and lowmark are both 0 because there are no articles. If we post 5 articles, they will be numbered 1 through 5. himark will now equal 5, the highest numbered article, and lowmark will equal 1, the lowest active article. If article 5 is cancelled there will be no change; himark will remain at 5 to ensure that that article number is not reallocated and lowmark will remain at 1, the lowest active article. If we now cancel article 1, himark will remain unchanged, but lowmark will now equal 2, because 1 is no longer active. If we now post a new article, it will be assigned article number 6, so himark will now equal 6. Article 5 has been in use, so we won't reassign it. lowmark remains at 2. This mechanism allows us to easily allocate unique article numbers for new articles and to calculate approximately how many active articles there are in the group: himarklowmark.

The field may contain one of the following:

y

Posting directly to this news server is allowed.

n

Posting directly to this news server is not allowed. This prevents newsreaders from posting directly to this news server. New articles may only be received from other news servers.

m

The group is moderated. Any articles posted to this newsgroup are forwarded to the newsgroup moderator for approval before they enter the newsgroup. Most newsgroups are unmoderated.

j

Articles in this group are not kept, but only passed on. This causes the news server to accept the article, but all it will do with it is pass it to the up-stream news servers. It will not make the articles available to newsreaders reading from this server.

x

Articles cannot be posted to this newsgroup. The only way that news articles are delivered to this server is by feeding them from another news server. Newsreaders may not directly write articles to this server.

=foo.bar

Articles are locally filed into the ``foo.bar'' group.

In our simple server configuration we'll carry a small number of newsgroups, so our /var/lib/news/active file will look like:

	    control 0000000000 0000000001 y
	    junk 0000000000 0000000001 y
	    rec.crafts.brewing 0000000000 0000000001 y
	    rec.crafts.brewing.ales 0000000000 0000000001 y
	    rec.crafts.brewing.badtaste 0000000000 0000000001 y
	    rec.crafts.brewing.brandy 0000000000 0000000001 y
	    rec.crafts.brewing.champagne 0000000000 0000000001 y
	    rec.crafts.brewing.private 0000000000 0000000001 y

The himark and lomark numbers in this example are those you would use when creating new newsgroups. The himark and lomark numbers will look quite different for a newsgroup that has been active for some time.

The newsgroups file is even simpler. It provides one-line descriptions of newsgroups. Some newsreaders are able to read and present this information to a user to help them decide whether they want to subscribe.

The format of the newsgroups file is simply:

name description

The name field is the name of a newsgroup, and the description is a single line description of that newsgroup.

We want to describe the newsgroups that our server supports, so we'll build our newsgroups file as follows:

	    rec.crafts.brewing.ales         Home brewing Ales and Lagers
	    rec.crafts.brewing.badtaste     Home brewing foul tasting brews
	    rec.crafts.brewing.brandy       Home brewing your own Brandy
	    rec.crafts.brewing.champagne    Home brew your own Champagne
	    rec.crafts.brewing.private      The Virtual Brewery home brewers group

Configuring Newsfeeds

INN provides the news administrator the ability to control which newsgroups are forwarded on to other news servers and how they will be forwarded. The most common method uses the NNTP protocol described earlier, but INN also allows newsfeeds via other protocols, such as UUCP.

The newsfeeds file

The newsfeeds file determines where news articles will be sent. It normally resides in the /etc/news/ directory.

The format of the newsfeeds is a little complicated at first. We'll describe the general layout here, and the newsfeeds(5) manual page describes what we leave out. The format is as follows:

# newsfeeds file format
	    site:pattern:flags:param
	    site2:pattern2\
	    :flags2:param2

Each news feed to a site is described by a single line, or may be spread across multiple lines using the \ continuation character. The : characters delimit the fields in each line. The # character at the start of a line marks that line as a comment.

The site field names the site to which this feed description relates. The sitename can be coded any way you like and doesn't have to be the domain name of the site. The site name will be used later and will refer to an entry in a table that supplies the hostname to the innxmit program that transmits the news articles by NNTP to the remote server. You may have multiple entries for each site; each entry will be treated individually.

The pattern field specifies which news groups are to be sent to this site. The default is to send all groups, so if that is what you want, just make this field empty. This field is usually a comma-delimited list of pattern-matching expressions. The * character matches zero or more of any character, the . character has no special significance, the ! character (if used at the start of an expression) performs a logical NOT, and the character at the start of a newsgroup name means Do not forward any articles that are posted or crossposted to this group. The list is read and parsed from left to right, so you should ensure that you place the more specific rules first. The pattern:

rec.crafts.brewing*,!rec.crafts.brewing.poison,@rec.crafts.brewing.private

would send all of the rec.crafts.brewing news heirarchy except the rec.crafts.brewing.poison. It would not feed any articles that were either posted or crossposted to the rec.crafts.brewing.private newsgroup; these articles will be trapped and available only to those people who use this server. If you reversed the first two patterns, the first pattern would be overridden by the second and you would end up feeding articles for the rec.crafts.brewing.poison newsgroup. The same is true of the first and last patterns; you must always place the more specific patterns before any less specific patterns for them to take effect.

flags controls and places constraints on the feed of news articles to this site. The flags field is a comma delimited list can contain any of the items from the following list, delimited by commands:

<size

Article must be less then size bytes.

Aitems

Article checks. items can be one or more of d (must have Distribution header) or p (don't check for site in Path header).

Bhigh/low

Internal buffer size before writing to output.

H[count]

Article must have less then count hops; the default is 1.

Isize

Internal buffer size (for a file feed).

Mpattern

Only moderated groups that match the pattern.

Npattern

Only unmoderated groups that match the pattern.

Ssize

Start spooling if more than size bytes get queued.

Ttype

Feed types: f (file), m (funnel; the param field names the entry that articles will be funneled to), p (pipe to program), c (send to stdin channel of the param field's subprocess), and x (like c, but handles commands on stdin).

Witems

What to write: b (article bytesize), f (full path), g (first newsgroup), m (Message ID), n (relative path), s (site that fed article), t (time received), * (names of funnel feed-ins or all sites that get the article), N (newsgroups header), D (distribution header), H (all headers), O (overview data), and R (replication data).

The param field has special coding that is dependent on the type of feed. In the most common configuration it is where you specify the name of the output file to which you will write the outgoing feed. In other configurations you can leave it out. In yet other configurations it takes on different meanings. If you want to do something unusual, the newsfeeds(5) manual page will explain the use of the param field in some detail.

There is a special site name that should be coded as ME and should be the first entry in the file. This entry is used to control the default settings for your news feeds. If the ME entry has a distribution list associated with it, this list will be prepended to each of the other site entries before they are sent. This allows you to, for example, declare some newsgroups to be automatically fed, or automatically blocked from feeding, without having to repeat the pattern in each site entry.

We mentioned earlier that it was possible to use some special feeds to generate thread data that makes the newsreader's job easier. We'll do this by exploiting the overchan command that is part of the INN distribution. To do this, we've created a special local feed called overview that will pass the news articles to the overchan command for processing into overview data.

Our news server will provide only one external news feed, which goes to the Groucho Marx University, and they receive articles for all newsgroups except the control and junk newsgroups, the rec.crafts.brewing.private newsgroup, which will be kept locally, and the rec.crafts.brewing.poison newsgroup, which we don't want people from our brewery seen posting to.

We'll use the nntpsend command to transport the news via NNTP to the news.groucho.edu server. nntpsend requires us to use the file delivery method and to write the article's pathname and article ID. Note that we've set the param field to the name of the output file. We'll talk a little more about the nntpsend command in a moment. Our resulting newsfeed's configuration is:

	    # /etc/news/newsfeeds file for the Virtual Brewery
	    #
	    # Send all newsgroups except the control and junk ones by default
	    ME:!control,!junk::
	    #
	    # Generate overview data for any newsreaders to use.
	    overview::Tc,WO:/usr/lib/news/bin/overchan
	    #
	    # Feed the Groucho Marx University everything except our private newsgroup
	    # and any articles posted to the rec.crafts.brewing.poison newsgroup.
	    gmarxu:!rec.crafts.brewing.poison,@rec.crafts.brewing.private:\
	    Tf,Wnm:news.groucho.edu
	    #

The nntpsend.ctl file

The nntpsend program manages the transmission of news articles using the NNTP protocol by calling the innxmit command. We saw a simple use of the nntpsend command earlier, but it too has a configuration file that provides us with some flexibility in how we configure our news feeds.

The nntpsend command expects to find batch files for the sites it will feed. It expects those batch files to be named /var/spool/news/out.going/sitename. innd creates these batch files when acting on an entry in the newsfeeds, which we saw in the previous sections. We specified the sitename as the filename in the param field, and that satisfies the nntpsend command's input requirements.

The nntpsend command has a configuration file called nntpsend.ctl that is usually stored in the /etc/news/ directory.

The nntpsend.ctl file allows us to associate a fully qualified domain name, some news feed size constraints, and a number of transmission parameters with a news feed site name. The sitename is a means of uniquely identifying a logical feed of articles. The general format of the file is:

sitename:fqdn:max_size:[args]

The following list describes the elements of this format:

sitename

The sitename as supplied in the newsfeeds file

fqdn

The fully qualified domain name of the news server to which we will be feeding the news articles

max_size

The maximum volume of news to feed in any single transfer

args

Additional arguments to pass to the innxmit command

Our sample configuration requires a very simple nntpsend.ctl file. We have only one news feed. We'll restrict the feed to a maximum of 2 MB of traffic and we'll pass an argument to the innxmit that sets a 3-minute (180 second) timeout. If we were a larger site and had many news feeds, we'd simply create new entries for each new feed site that looked much the same as this one:

# /etc/news/nntpsend.ctl
	    #
	    gmarxu:news.groucho.edu:2m:-t 180
	    #

Controlling Newsreader Access

Not so many years ago, it was common for organizations to provide public access to their news servers. Today it is difficult to locate public news servers; most organizations carefully control who has access to their servers, typically restricting access to users supported on their network. INN provides configuration files to control this access.

The incoming.conf file

We mentioned in our introduction to INN that it achieves some of its efficiency and size by separating the news feed mechanism from the newsreading mechanism. The /etc/news/incoming.conf file is where you specify which hosts will be feeding you news using the NNTP protocol, as well as where you define some parameters that control the way articles are fed to you from these hosts. Any host not listed in this file that connects to the news socket will not be handled by the innd daemon; instead, it will be handled by the nnrpd daemon.

The /etc/news/incoming.conf file syntax is very simple, but it takes a moment to come to terms with. Three types of valid entries are allowed: key/value pairs, which are how you specify attributes and their values; peers, which is how you specify the name of a host allowed to send articles to us using NNTP; and groups, a means of applying key/value pairs to groups of peers. Key/value pairs can have three different types of scope. Global pairs apply to every peer defined in the file. Group pairs apply to all peers defined within that group. Peer pairs apply only to that one peer. Specific definitions override less specific ones: therefore, peer definitions override group definitions, which in turn override global pairs.

Curly brace characters ({}) are used to delimit the start and end of the group and peer specifications. The # character marks the rest of the line it appears on as a comment. Key/value pairs are separated by the colon character and appear one to a line.

A number of different keys may be specified. The more common and useful are:

hostname

This key specifies a comma-separated list of fully qualifed names or IP addresses of the peers that we'll allow to send us articles. If this key is not supplied, the hostname defaults to the label of the peer.

streaming

This key determines whether streaming commands are allowed from this host. It is a Boolean value that defaults to true.

max-connections

This key specifies the maximum number of connections allowed from this group or peer. A value of zero means unlimited (which can also be specified using none).

password

This key allows you to specify the password that must be used by a peer if it is to be allowed to transfer news. The default is to not require a password.

patterns

This key specifies the newsgroups that we accept from the associated peer. This field is coded according to precisely the same rules as we used in our newsfeeds file.

In our example we have only one host that we are expecting to feed us news: our upstream news provider at Groucho Marx University. We'll have no password, but we will ensure that we don't accept any articles for our private newsgroup from outside. Our hosts.nntp looks like:

	    # Virtual Brewery incoming.conf file.
	    
	    # Global settings
	    streaming:       true
	    max-connections: 5
	    
	    # Allow NNTP posting from our local host.
	    peer ME {
	    hostname: "localhost, 127.0.0.1"
	    }
	    
	    # Allow groucho to send us all newsgroup except our local ones.
	    peer groucho {
	    hostname: news.groucho.edu
	    patterns: !rec.crafts.brewing.private
	    }

The nnrp.access file

We mentioned earlier that newsreaders, and in fact any host not listed in the hosts.nntp, that connect to the INN news server are handled by the nnrpd program. nnrpd uses the /etc/news/nnrp.access file to determine who is allowed to make use of the news server, and what permissions they should have.

The nnrp.access file has a similar structure to the other configuration files we've looked at. It comprises a set of patterns used to match against the connecting host's domain name or IP address, and fields that determine what access and permission it should be given. Each entry should appear on a line by itself, and fields are separated by colons. The last entry in this file that matches the connecting host will be the one used, so again, you should put general patterns first and follow them with more specific ones later in the file. The five fields of each entry in the order they should appear are:

Hostname or IP address

This field conforms to wildmat(3) pattern-matching rules. It is a pattern that describes the connecting host's name or IP address.

Permissions

This field determines what permissions the matching host should be granted. There are two permissons you may configure: R gives read permissions, and P gives posting permissions.

Username

This field is optional and allows you to specify a username that an NNTP client must log into the server before being allowed to post news articles. This field may be left blank. No user authentication is required to read articles.

Password

This field is optional and is the password accompanying the username field. Leaving this field blank means that no password is required to post articles.

Newsgroups

This field is a pattern specifying which newsgroups the client is allowed to access. The pattern follows the same rules as those used in the newsfeeds file. The default for this field is no newsgroups, so you would normally have a pattern configured here.

In the virtual brewery example, we will allow any NNTP client in the Virtual Brewery domain to both read and post to all newsgroups. We will allow any NNTP client read-only access to all newsgroups except our private internal newsgroup. Our nnrp.access file will look like this:

	    # Virtual Brewery - nnrp.access
	    # We will allow public reading of all newsgroups except our private one.
	    *:R:::*,!rec.crafts.brewing.private
	    
	    # Any host with the Virtual Brewery domain may Read and Post to all 
	    # newsgroups
	    *.vbrew.com:RP::*

Expiring News Articles

When news articles are received by a news server, they are stored to disk. News articles need to be available to users for some period of time to be useful, so a large operating news server can consume lots of disk space. To ensure that the disk space is used effectively, you can opt to delete news articles automatically after a period of time. This is called article expiration. Naturally, INN provides a means of automatically expiring news articles.

The expire.ctl file

The INN server uses a program called expire to delete expired news articles. The expire program in turn uses a file called /etc/news/expire.ctl to configure the rules that govern article expiration.

The syntax of /etc/news/expire.ctl is fairly simple. As with most configuration files, empty lines or lines beginning with the # character are ignored. The general idea is that you specify one rule per line. Each rule defines how article expiration will be performed on newsgroups matching a supplied pattern. The rule syntax looks like this:

	    pattern:modflag:keep:default:purge

The following list describes the fields:

pattern

This field is a comma-delimited list of patterns matching names of newsgroups. The wildmat(3) routine is used to match these patterns. The last rule matching a newsgroup name is the one that is applied, so if you want to specify wildcard (*) rules, they should be listed first in this file.

modflag

This flag describes how this rule applies to moderated newsgroups. It can be coded with an M to mean that this rule applies only to moderated newsgroups, a U to mean that this rule applies only to unmoderated newsgroups, or an A to mean that this rule ignores the moderated status and applies to all groups.

keep

This field allows you to specify the minimum time an article with an Expires header will be kept before it is expired. The units are days, and are a floating point, so you may specify values like 7.5 for seven-and-a-half days. You may also specify never if you wish articles to stay in a newsgroup forever.

default

This field is the most important. This field allows you to specify the time an article without an Expires header will be kept. Most articles won't have an Expires header. This field is coded in the same way as the keep field, with never meaning that articles without Expires headers will never be expired.

purge

This field allows you to specify the maximum time an article with an Expires header will be kept before it is expired. The coding of this field is the same as for the keep field.

Our requirements are simple. We will keep all articles in all newsgroups for 14 days by default, and between 7 and 21 days for articles that have an Expires header. The rec.crafts.brewing.private newsgroup is our internal newsgroup, so we'll make sure we don't expire any articles from it:

	    # expire.ctl file for the Virtual Brewery
	    
	    # Expire all articles in 14 days by default, 7-21 days for those with
	    # Expires: headers
	    *:A:7:14:21
	    
	    # This is a special internal newsgroup, which we will never expire.
	    rec.crafts.brewing.private:A:never:never:never

We will mention one special type of entry you may have in your /etc/news/expires.ctl file. You may have exactly one line that looks like this:

/remember/:days

This entry allows you to specify the minimum number of days that an article will be remembered in the history file, irrespective of whether the article itself has been expired or not. This might be useful if one of the sites that is feeding you articles is infrequent and has a habit of sending you old articles every now and again. Setting the /remember/ field helps to prevent the upstream server from sending you the article again, even if it has already been expired from your server. If your server remembers it has already received the article, it will reject attempts to resend it. It is important to remember that this setting has no effect at all on article expiration; it affects only the time that details of an article are kept in the history database.

Handling Control Messages

Just as with C News, INN can automatically process control messages. INN provides a powerful configuration mechanism to control what action will occur for each of a variety of control messages, and an access control mechanism to control who can initiate actions against which newsgroups.

The control.ctl file

The control.ctl file is fairly simple in structure. The syntax rules for this file are much the same as for the other INN configuration files. Lines beginning with # are ignored, lines may be continued using /, and fields are delimited by :.

When a control message is received, it is tested against each rule in turn. The last rule in the file that matches the message is the rule that will be used, so you should put any generic rules at the start of the file and more specific rules at the end of the file. The general syntax of the file is:

	    message:from:newsgroups:action

The meanings of each of the fields are:

message

This is the name of the control message. Typical control messages are described later.

from

This is a shell-style pattern matching the email address of the person sending the message. The email address is converted to lowercase before comparison.

newsgroups

If the control message is newgroup or rmgroup, this field is a shell-style pattern matching the newsgroup created or removed.

action

This field specifies what action to take for any message matching the rule. There are quite a number of actions we can take; they are described in the next list.

The message field of each line can have one of the following values:

checkgroups

This message requests that news administrators resynchonrize their active newsgroups database against the list of newsgroups supplied in the control message.

newgroup

This message requests the creation of a new newsgroup. The body of the control message should contain a short description of the purpose of the newsgroup to be created.

rmgroup

requests that a newsgroup be removed.

sendsys

This message requests that the sys file of this news server be transmitted by mail to the originator of the control message. RFC-1036 states that it is a requirement of Usenet membership that this information be publicly available because it is used to keep the map of Usenet up to date.

version

This message requests that the hostname and version of news server software be returned to the originator of the control message.

all

This is a special coding that will match any control message.

The message field may include the following actions:

doit

The requested command is performed. In many cases, a mail message will be sent to the administrator to advise them that the action has taken place.

doit=file

This is the same as the doit action except that a log message will be written to the file log file. If the specified file is mail, the log entry is sent by email. If the specified file is the null string, the log message is written to /dev/null and is equivalent to using the unqualified doit action. If the file name begins with a / character, the name is taken to be an absolute filename for the logfile; otherwise, the specified name is translated to /var/log/news/file.log.

doifarg

The requested command is performed if the command has an argument. If the command has no argument, the control message is ignored.

drop

The requested command is ignored.

log

A log message is sent to the stderr output of the innd process. This is normally directed out to the /var/log/news/errlog file.

log=file

This is the same as a log action, except the logfile is specified as per the rules given for the doit=file action.

mail

An email message is sent to the news administrator containing the requested command details. No other action takes place.

verify-*

If an action begins with the string verify-, then the control message is authenticated using PGP (or GPG).[90]

So that you can see what a control.ctl file would look like in practice, here is a very short illustrative sample:

	  ## Sample /etc/news/control.ctl
	  ##
	  ## Warning: You should not use this file, it is illustrative only.
	  
	  ##	Control Message Handling
	  all:*:*:mail
	  checkgroups:*:*:mail
	  ihave:*:*:drop
	  sendme:*:*:drop
	  sendsys:*:*:log=sendsys
	  senduuname:*:*:log=senduuname
	  version:*:*:log=version
	  newgroup:*:*:mail
	  rmgroup:*:*:mail
	  
	  ##  Handle control messages for the eight most important news heirarchies
	  ##  COMP, HUMANITIES, MISC, NEWS, REC, SCI, SOC, TALK
	  checkgroups:*:comp.*|humanities.*|misc.*|news.*|rec.*|sci.*|soc.*|talk.*:drop
	  newgroup:*:comp.*|humanities.*|misc.*|news.*|rec.*|sci.*|soc.*|talk.*:drop
	  rmgroup:*:comp.*|humanities.*|misc.*|news.*|rec.*|sci.*|soc.*|talk.*:drop
	  checkgroups:group-admin@isc.org:*:verify-news.announce.newgroups
	  newgroup:group-admin@isc.org:comp.*|misc.*|news.*:verify-news.announce.newgroups
	  newgroup:group-admin@isc.org:rec.*|sci.*|soc.*:verify-news.announce.newgroups
	  newgroup:group-admin@isc.org:talk.*|humanities.*:verify-news.announce.newgroups
	  rmgroup:group-admin@isc.org:comp.*|misc.*|news.*:verify-news.announce.newgroups
	  rmgroup:group-admin@isc.org:rec.*|sci.*|soc.*:verify-news.announce.newgroups
	  rmgroup:group-admin@isc.org:talk.*|humanities.*:verify-news.announce.newgroups
	  
	  ## GNU ( Free Software Foundation )
	  newgroup:gnu@prep.ai.mit.edu:gnu.*:doit
	  newgroup:news@*ai.mit.edu:gnu.*:doit
	  rmgroup:gnu@prep.ai.mit.edu:gnu.*:doit
	  rmgroup:news@*ai.mit.edu:gnu.*:doit
	  
	  ## LINUX (Newsfeed from news.lameter.com)
	  checkgroups:christoph@lameter.com:linux.*:doit
	  newgroup:christoph@lameter.com:linux.*:doit
	  rmgroup:christoph@lameter.com:linux.*:doit

Running INN

The inn source package provides a script suitable for starting inn at boot time. The script is usually called /usr/lib/news/bin/rc.news. The script reads arguments from another script, usually called /usr/lib/news/innshellvars, which contains definitions of the filenames and filepaths that inn will use to locate components it needs. It is generally considered a good idea to execute inn with the permissions of a non-root user, such as news.

To ensure that inn is started at boot time, you should check that /usr/lib/news/innshellvars is configured correctly and then call the /usr/lib/news/bin/rc.news script from a script executed at boot time.

Additionally, there are administrative tasks that must be performed periodically. These tasks are usually configured to be executed by the cron command. The best way to do this is to add the appropriate commands to your /etc/crontab file, or even better, create a file suitable for the /etc/cron.d directory, if your distribution provides one. An example of such a file might look like:

	# Example /etc/cron.d/inn file, as used in the Debian distribution.
	#
	SHELL=/bin/sh
	PATH=/usr/lib/news/bin:/sbin:/bin:/usr/sbin:/usr/bin
	
	# Expire old news and overview entries nightly, generate reports.
	
	15 0 * * *      news    news.daily expireover lowmark delayrm
	
	# Every hour, run an rnews -U. This is not only for UUCP sites, but
	# also to process queued up articles put there by in.nnrpd in case
	# innd wasn't accepting any articles.
	
	10 * * * *      news    rnews -U

These commands will ensure that old news is automatically expired each day, and that any queued articles are processed each hour. Note also that they are executed with the permissions of the news user.

Managing INN: The ctlinnd Command

The INN news server comes with a command to manage its day-to-day operation. The ctlinnd command can be used to manipulate newsgroups and newsgroup feeds, to obtain the status, of the server, and to reload, stop, and start the server.

You'd normally get a summary of the ctlinnd command syntax using:

# ctlinnd -h

We'll cover some of the more important uses of ctlinnd here; please consult the ctlinnd manual page for more detail.

Add a New Group

Use the following syntax to add a new group:

	  ctlinnd newgroup group rest creator

The arguments are defined as follows:

group

The name of the group to create.

rest

This argument should be coded in the same way as the flags field of the active file. It defaults to y if not supplied.

creator

The name of the person creating the group. Enclose it in quotes if there are any spaces in the name.

Change a Group

Use the following syntax to change a group:

	  ctlinnd changegroup group rest

The arguments are defined as follows:

group

The name of the group to change.

rest

This argument should be coded in the same way as the flags field of the active file.

This command is useful to change the moderation status of a group.

Remove a Group

Use the following syntax to remove a group:

ctlinnd rmgroup group

The argument is defined as follows:

group

The name of the group to remove.

This command removes the specified newsgroup from the active file. It has no effect on the news spool. All articles in the spool for the specified group will be expired in the usual fashion, but no new articles will be accepted.

Renumber a Group

Use the following syntax to renumber a group:

ctlinnd renumber group

The argument is defined as follows:

group

The name of the group to renumber. If a group is an empty string, all groups are renumbered.

This command updates the low-water mark for the specified group.

Allow/Disallow Newsreaders

Use the following syntax to allow or disallow newsreaders:

ctlinnd readers flag text

The arguments are defined as follows:

flag

Specifying n causes all newsreader connections to be disallowed. Specifying y allows newsreader connections.

text

The text supplied will be given to newsreaders who attempt to connect, and usually describes the reason for disabling newsreader access. When reenabling newsreader access, this field must be either an empty string or a copy of the text supplied when the newsreader was disabled.

This command does not affect incoming newsfeeds. It only controls connections from newsreaders.

Reject Newsfeed Connections

Use the following syntax to reject newsfeed connections:

ctlinnd reject reason

The argument is defined as follows:

reason

The text supplied should explain why incoming connections to innd are rejected.

This command does not affect connections that are handed off to nnrpd (i.e., newsreaders); it only affects connections that would be handled by innd directly, such as remote newsfeeds.

Allow Newsfeed Connections

Use the following syntax to allow newsfeed connections:

ctlinnd allow reason

The argument is defined as follows:

reason

The supplied text must be the same as that supplied to the preceding reject command or an empty string.

This command reverses the effect of a reject command.

Disable News Server

Use the following syntax to disable the news server:

ctlinnd throttle reason

The argument is defined as follows:

reason

The reason for throttling the server.

This command is simultaneously equivalent to a newsreaders no and a reject, and is useful when emergency work is performed on the news database. It ensures that nothing attempts to update it while you are working on it.

Restart News Server

Use the following syntax to restart the news server:

ctlinnd go reason

The argument is defined as follows:

reason

The reason given when stopping the server. If this field is an empty string, the server will be reenabled unconditionally. If a reason is given, only those functions disabled with a reason matching the supplied text will be restarted.

This command is used to restart a server function after a throttle, pause, or reject command.

Display Status of a Newsfeed

Use the following syntax to display the status of a newsfeed:

ctlinnd feedinfo site

The argument is defined as follows:

site

The site name (taken from the newsfeeds file) for which you wish to display the newsfeed's status.

Drop a Newsfeed

Use the following syntax to drop a newsfeed:

ctlinnd drop site

The argument is defined as follows:

site

The name of the site (taken from the newsfeeds file) to which feeds are dropped. If this field is an empty string, all active feeds will be dropped.

Dropping a newsfeed to a site halts any active feeds to the site. It is not a permanent change. This command would be useful if you've modified the feed details for a site and a feed to that site is active.

Begin a Newsfeed

Use the following syntax to begin a newsfeed:

ctlinnd begin site

The argument is defined as follows:

site

The name of the site from the newsfeeds file to which feeds are started. If a feed to the site is already active, a drop command is done first automatically.

This command causes the server to reread the newsfeeds file, locate the matching entry, and commence a newsfeed to the named site using the details found. You can use this command to test a new news feed to a site after you've added or modified its entry in the newsfeeds file.

Cancel an Article

Use the following syntax to cancel an article:

ctlinnd cancel Message-Id

The argument is defined as follows:

Message-ID

The ID of the article to be cancelled.

This command causes the specified article to be deleted from the server. It does not generate a cancel message.



[86] Very small news sites should consider a caching NNTP server program like leafnode, which is available at http://wpxx02.toxi.uni-wuerzburg.de/~krasel/leafnode.html.

[87] This is indicated by the Date: header field; the limit is usually two weeks.

[88] Threading 1,000 articles when talking to a loaded server could easily take around five minutes, which only the most dedicated Usenet addict would find acceptable.

[89] The name apparently stands for NetNews Read & Post Daemon.

[90] PGP and GPG are tools designed to authenticate or encrypt messages using public key techniques. GPG is the GNU free version of PGP. GPG may be found at http://www.gnupg.org/, and PGP may be found at http://www.pgp.com/.

A newsreader is a program that users invoke to view, store, and create news articles. Several newsreaders have been ported to Linux. We will describe the basic setup for the three most popular newsreaders: tin, trn, and nn.

One of the most effective newsreaders is:

$ find /var/spool/news -name '[0-9]*' -exec cat {} \; | more

This is the way Unix die-hards read their news.

Most newsreaders, however, are much more sophisticated. They usually offer a full-screen interface with separate levels for displaying all groups the user has subscribed to, an overview of all articles in each group, and individual articles. Many web browsers double as newsreaders, but if you want to use a standalone newsreader, this chapter explains how to configure two classic ones: trn and nn.

At the newsgroup level, most newsreaders display a list of articles, showing their subject lines and authors. In big groups, it is difficult for the user to keep track of articles relating to each other, although it is possible to identify responses to earlier articles.

A response usually repeats the original article's subject, prepending it with Re:. Additionally, the References: header line should include the message ID of the article on which the response is directly following up. Sorting articles by these two criteria generates small clusters (in fact, trees) of articles, which are called threads. One of the tasks of writing a newsreader is devising an efficient scheme of threading, because the time required for this is proportional to the square of the number of articles.

We will not go into how the user interfaces are built here. All newsreaders currently available for Linux have a good help function; please refer to it for more details.

In the following sections, we will deal only with administrative tasks. Most of these relate to the creation of threads databases and accounting.

tin Configuration

The most versatile newsreader with respect to threading is tin. It was written by Iain Lea and is loosely modeled on an older newsreader named tass (written by Rich Skrenta). It does its threading when the user enters the newsgroup, and it is pretty fast unless you're getting posts via NNTP.

On a 486DX50, it ta