Currently, the PAT changes consist of four independent patches.

1) pat_mmap_single.patch

   This adds a new cdevsw routine d_mmap_single() which gets called to
   fill an entire mmap() request for a character device.  It is an
   optional routine and if it is not present or returns ENODEV, then
   the mmap() request will fall back to using the device pager and
   d_mmap().  One can use d_mmap_single() to validate a request and
   return ENODEV to have it still be backed by the device pager.
   However, its intended use is to "claim" a device mmap() request and
   redirect it to a different VM object that is not the device's
   device pager object.  In this case, d_mmap_single() should return a
   reference to the desired VM object.  It may also wish to adjust the
   starting offset of the mapping relative to the desired VM object.
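
   The fallback rule described above can be modeled in plain userland
   C.  This is only a sketch of the decision logic, not the kernel
   code: every model_* name is hypothetical, and it assumes that any
   error other than ENODEV fails the request outright.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct model_object { int id; };

typedef int model_mmap_single_t(uint64_t *offset, size_t size,
    struct model_object **object);

struct model_cdevsw {
	model_mmap_single_t *d_mmap_single;	/* optional, may be NULL */
	struct model_object *dev_pager_obj;	/* default device pager object */
};

/*
 * Pick the object that backs a mapping request.  Returns 0 on success or
 * an errno value if the handler rejected the request (assumed behavior).
 */
static int
model_pick_object(const struct model_cdevsw *csw, uint64_t *offset,
    size_t size, struct model_object **object)
{
	int error;

	assert(csw != NULL && offset != NULL && object != NULL);
	*object = NULL;
	if (csw->d_mmap_single != NULL) {
		error = csw->d_mmap_single(offset, size, object);
		if (error == 0)
			return (0);	/* handler claimed the request */
		if (error != ENODEV)
			return (error);	/* handler rejected the request */
	}
	/* No handler, or it returned ENODEV: use the device pager. */
	*object = csw->dev_pager_obj;
	return (0);
}

static struct model_object model_private_obj = { 2 };

/* Example handler: claim requests at offset 4096 and rebase them to 0. */
static int
model_example_single(uint64_t *offset, size_t size,
    struct model_object **object)
{
	(void)size;
	if (*offset != 4096)
		return (ENODEV);
	*offset = 0;
	*object = &model_private_obj;
	return (0);
}
```

   The example handler shows both uses: it declines most requests with
   ENODEV, and for the one it claims it returns its own object and
   adjusts the starting offset.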

2) pat_cache_mode.patch

   This patch adds caching mode support to the MI VM layer.  What I
   have done is to add a new field to each vm_map_entry that holds
   the cache mode for that mapping range.  It is treated very similarly
   to protection (VM_PROT_*) in that each entry has a cache mode and
   adjacent VM map entries may only be coalesced if they have the same
   cache mode.  The actual cache mode is stored in a MI typedef of a
   uchar.  However, the valid values for the cache mode are defined by
   each architecture in <machine/pmap.h>.  Drivers can use #ifdef's to
   see if a specific cache mode is supported at compile time.  Each
   architecture is required to support VM_CACHE_DEFAULT at a minimum.
   It would probably be best for architectures to use the same
   constant names to describe the same effective mode.

   I also added a VM cache mode to each VM object.  It defaults to
   VM_CACHE_DEFAULT.  When an object is inserted into a map, the
   object's cache mode is used.

   None of this attempts to solve the problem of multiple mappings of
   the same page with different cache modes.  I'm not sure that is
   something we should solve, however.  Some other arch may not have
   that requirement some day (or at the least, it may not make a lot
   of sense on, e.g., MIPS, where you have direct maps in hardware
   for both WB and UC).  What this does right now is
   punt and require the driver to get this correct.  However, if the
   driver is careful to ensure that all the mappings are done via a VM
   object it controls, then it can use that to ensure that all
   mappings use the same cache mode.

   Note that this does require changes to the pmap in that a few
   routines that create physical mappings now accept a cache mode
   parameter.  I have updated amd64 and i386 but I have only run-tested
   amd64.  Other archs can simply #define only VM_CACHE_DEFAULT and
   ignore the cache mode parameters for now.

   I also have additional patches to add an mcache() system call
   similar to mprotect() that changes the cache mode on a range.  I
   have not included this in this patch as I'm not sure it is useful
   (it has a high foot-shooting potential and I believe it is not
   needed for DRM/Nvidia).  I also have not implemented the pmap
   routine it needs on amd64 or i386.

3) pat_sg.patch

   In addition to this patch, one needs the new files under this tree
   at kern/subr_sglist.c, sys/sglist.h, and vm/sg_pager.c.  sglist(9)
   is a new data type used to describe a scatter/gather list of
   physical memory ranges.  I originally developed it for the unmapped
   buffer I/O project which is why it has a bit of a rich API.  On top
   of this I created a new VM object type and pager: OBJT_SG.  This
   pager is very much like the device pager.  However, instead of
   calling d_mmap() to determine the physical address of a given page
   in the VM object, the physical address is looked up using the
   scatter/gather list.  Note that scatter/gather lists are immutable
   after they have been created (similar to credential structures in
   the kernel).  These objects can be useful to export physical
   address ranges like BARs, etc.
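
   As a rough illustration of how a pager can turn an offset within
   the object into a physical address by walking the list, consider
   the userland sketch below.  The types and names are hypothetical
   and are not the sglist(9) API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment: one contiguous range of physical memory. */
struct model_sg_seg {
	uint64_t paddr;		/* physical start address */
	size_t len;		/* length in bytes */
};

#define	MODEL_PADDR_INVALID	UINT64_MAX

/*
 * Translate a byte offset within the object into a physical address by
 * walking the segments in order.  Returns MODEL_PADDR_INVALID if the
 * offset lies beyond the end of the list.
 */
static uint64_t
model_sg_lookup(const struct model_sg_seg *segs, size_t nsegs,
    uint64_t offset)
{
	size_t i;

	assert(segs != NULL || nsegs == 0);
	for (i = 0; i < nsegs; i++) {
		if (offset < segs[i].len)
			return (segs[i].paddr + offset);
		offset -= segs[i].len;
	}
	return (MODEL_PADDR_INVALID);
}
```

   Because the list never changes after creation, a lookup like this
   needs no locking against concurrent modification.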

4) pat_mmap_prefault.patch

   This adds two new flags to mmap(): MAP_PREFAULT_READ and
   MAP_PREFAULT_WRITE.  If either of these flags is set for an mmap()
   request, then the pages will be prefaulted using vm_fault() before mmap()
   returns.  If MAP_PREFAULT_WRITE is set, then the pages will be
   prefaulted for read/write as dirty pages.  Otherwise the pages will be
   prefaulted for read.

A small test demo is available at modules/patdev/.  It creates a /dev/pat
device that implements mmap() using d_mmap_single().  It exports two different
VM objects and uses the offset passed to the mmap() system call to determine
which object is exported.  For requests with an offset of 0, a shared
anonymous object is used to satisfy mapping requests.  The object is created
on the first mmap() request and its size is set to the size passed in to
the mmap() call.  It is mapped WC.  Subsequent mmap()'s at offset 0 will all
share this same bit of anonymous memory.  I do not demonstrate doing DMA from
this region.  One would need to wire the pages first.  That could be done by
something like this:

	vm_ooffset_t foff;
	vm_offset_t kva, ofs;
	vm_object_t obj;
	vm_size_t size;
	int rv;

	foff = starting_offset_to_map();
	size = range_to_map();
	obj = my_vm_object();

	/* Map the object into the kernel_map. */
	vm_object_reference(obj);
	kva = vm_map_min(kernel_map);
	ofs = foff & PAGE_MASK;
	foff = trunc_page(foff);
	size = round_page(size + ofs);
	rv = vm_map_find(kernel_map, obj, foff, &kva, size, TRUE,
	    VM_PROT_READ | VM_PROT_WRITE, VM_PROT_READ | VM_PROT_WRITE, 0);
	if (rv != KERN_SUCCESS) {
		vm_object_deallocate(obj);
		/* handle error */
	}

	/* Wire this mapping. */
	rv = vm_map_wire(kernel_map, kva, kva + size, VM_MAP_WIRE_SYSTEM |
	    VM_MAP_WIRE_NOHOLES);
	if (rv != KERN_SUCCESS) {
		vm_map_remove(kernel_map, kva, kva + size);
		/* handle error */
	}

	bus_dmamap_load(..., kva, size, ...);

Later, once the DMA is finished, the buffer can be unwired and unmapped using
vm_map_remove():

	vm_map_remove(kernel_map, kva, kva + size);
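
The page-alignment arithmetic in the wiring example is easy to get wrong, so
here is a small userland sketch of it with concrete numbers.  It assumes a
4096-byte page; MODEL_PAGE_SIZE, MODEL_PAGE_MASK and model_page_align() are
hypothetical names standing in for PAGE_SIZE, PAGE_MASK and the
trunc_page()/round_page() calls used above.

```c
#include <assert.h>
#include <stdint.h>

#define	MODEL_PAGE_SIZE	4096u	/* assumed; kernel code uses PAGE_SIZE */
#define	MODEL_PAGE_MASK	(MODEL_PAGE_SIZE - 1)

struct model_range {
	uint64_t foff;	/* page-aligned object offset */
	uint64_t size;	/* page-rounded length */
	uint64_t ofs;	/* offset of the first byte within its page */
};

/*
 * Expand an arbitrary (offset, length) pair to whole pages: keep the
 * sub-page offset, truncate the start down to a page boundary, and round
 * the length up so that the last byte is still covered.
 */
static struct model_range
model_page_align(uint64_t foff, uint64_t size)
{
	struct model_range r;

	assert(size > 0);
	r.ofs = foff & MODEL_PAGE_MASK;
	r.foff = foff & ~(uint64_t)MODEL_PAGE_MASK;
	r.size = (size + r.ofs + MODEL_PAGE_MASK) &
	    ~(uint64_t)MODEL_PAGE_MASK;
	return (r);
}
```

For example, a request for 0x2000 bytes at offset 0x1234 becomes a three-page
mapping starting at offset 0x1000, with the data beginning 0x234 bytes into
the first page.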

The second object that the test device exports is a scatter/gather object
(OBJT_SG).  When the module is loaded, it creates a scatter/gather list with
a single entry that covers the local APIC.  It then creates a VM object using
that list and sets its cache mode to UC.  mmap() requests that have a starting
offset of PAGE_SIZE use this object.  Note that I set the starting offset of
the internal mapping request to 0 in my d_mmap_single() handler so that the
resulting mapping starts at the beginning of the VM object.  In this case,
the effect is that doing:

	fd = open("/dev/pat", O_RDWR);
	r = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, fd, getpagesize());

actually maps the local APIC into a process's address space at 'r'.