Patch to introduce bus_dmamap_sync_size() for i386

Our busdma interface currently differs from NetBSD's in that you can synchronize only the entire DMA map. This is a problem where a device writes buffers piecewise and the driver needs to read them piecewise as well. The patch implements an additional function, bus_dmamap_sync_size(), that takes the same parameters as NetBSD's bus_dmamap_sync(). If it works, it could replace our bus_dmamap_sync(). The implementation should be the same for all architectures except sparc.

This patch also fixes one bug. It is easy to spot, so I simply included it in this patch (but I'll commit it separately): bus_dmamap_load fails to return the error code. ia64, alpha and amd64 suffer from the same problem. Sparc64 is OK.

I also found something I think is a bug, but I'm not sure. bus_dma_tag_create and bus_dmamap_create compute the number of bounce pages needed. The computation seems to round dmat->maxsize down to a multiple of PAGE_SIZE, which leads to failures if the map size (this is the maxsize) is not a multiple of PAGE_SIZE (specifically, if it is smaller than one page).
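To illustrate the suspected rounding problem, here is a minimal stand-alone sketch. It is not the kernel code: EX_PAGE_SIZE and both helper names are made up for the example, and the truncating division only models what the computation seems to do.

```c
#include <stddef.h>

/* Assumed page size for the example (4 KB, as on i386). */
#define EX_PAGE_SIZE 4096u

/* Hypothetical model of the suspected behaviour: the division
 * truncates, so any maxsize below one page yields zero pages. */
static size_t
ex_pages_round_down(size_t maxsize)
{
	return (maxsize / EX_PAGE_SIZE);
}

/* Rounding up instead: a partial page still needs a whole bounce page. */
static size_t
ex_pages_round_up(size_t maxsize)
{
	return ((maxsize + EX_PAGE_SIZE - 1) / EX_PAGE_SIZE);
}
```

With a maxsize of 512 bytes, the first helper reserves no bounce page at all, while the second reserves one.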

And finally the new function itself. For testing purposes I decided to just add a function. We may, of course, just rename this to bus_dmamap_sync() and adjust all the callers. Two remarks:

1. Because the internals of our implementation and NetBSD's are very different, this function is actually harder to implement than NetBSD's. The problem is that at the time of the sync we have no access to the original object that was loaded. For this reason I needed to add an offset member to the bounce page, so that at sync time I know what the offset of this page within the object is (an object may be only partially bounced).

2. For maps with bounce pages the overhead is clearly higher than before when using this function to sync the entire map. I think, however, that this can be neglected (the copying itself costs much more).
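The idea behind the offset member can be sketched in a few lines of stand-alone C. This is only a model under assumptions, not the code from the patch: struct ex_bounce_page, its fields, and ex_sync_postread() are hypothetical names, and only the device-to-memory (post-read) direction is shown.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounce page record. 'offset' is the position of this
 * chunk within the loaded object, which is what lets a partial sync
 * decide whether the page is affected. */
struct ex_bounce_page {
	unsigned char *bounce;	/* bounce buffer the device wrote to */
	unsigned char *client;	/* where this chunk lives in the object */
	size_t offset;		/* offset of the chunk in the object */
	size_t len;		/* number of bytes bounced */
};

/* Copy back only the bounced bytes that overlap [off, off + len).
 * Returns the number of bytes copied. */
static size_t
ex_sync_postread(struct ex_bounce_page *pages, size_t npages,
    size_t off, size_t len)
{
	size_t copied = 0, end = off + len, i;

	for (i = 0; i < npages; i++) {
		struct ex_bounce_page *bp = &pages[i];
		size_t start = bp->offset > off ? bp->offset : off;
		size_t stop = bp->offset + bp->len < end ?
		    bp->offset + bp->len : end;

		if (start >= stop)
			continue;	/* page lies outside the range */
		memcpy(bp->client + (start - bp->offset),
		    bp->bounce + (start - bp->offset), stop - start);
		copied += stop - start;
	}
	return (copied);
}
```

Syncing the entire map with such a loop costs one range check per bounce page on top of the copy, which is the extra overhead mentioned in the second remark.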

busdma0.diff