FreeBSD ZFS
The Zettabyte File System
File Range Locking for ZFS.
#include <sys/zfs_rlock.h>
Functions | |
static void | zfs_range_lock_writer (znode_t *zp, rl_t *new) |
Check if a write lock can be grabbed, or wait and recheck until available. | |
static rl_t * | zfs_range_proxify (avl_tree_t *tree, rl_t *rl) |
If this is an original (non-proxy) lock then replace it by a proxy and return the proxy. | |
static rl_t * | zfs_range_split (avl_tree_t *tree, rl_t *rl, uint64_t off) |
Split the range lock at the supplied offset returning the *front* proxy. | |
static void | zfs_range_new_proxy (avl_tree_t *tree, uint64_t off, uint64_t len) |
Create and add a new proxy range lock for the supplied range. | |
static void | zfs_range_add_reader (avl_tree_t *tree, rl_t *new, rl_t *prev, avl_index_t where) |
static void | zfs_range_lock_reader (znode_t *zp, rl_t *new) |
Check if a reader lock can be grabbed, or wait and recheck until available. | |
rl_t * | zfs_range_lock (znode_t *zp, uint64_t off, uint64_t len, rl_type_t type) |
Lock an object range. | |
static void | zfs_range_unlock_reader (znode_t *zp, rl_t *remove) |
Unlock a reader lock. | |
void | zfs_range_unlock (rl_t *rl) |
void | zfs_range_reduce (rl_t *rl, uint64_t off, uint64_t len) |
Reduce range locked as RL_WRITER from whole file to specified range. | |
int | zfs_range_compare (const void *arg1, const void *arg2) |
AVL comparison function used to order range locks. Locks are ordered on the start offset of the range. |
This file contains the code to implement file range locking in ZFS, although there isn't much specific to ZFS (all that comes to mind is support for growing the blocksize).
Interface
---------
Defined in zfs_rlock.h but essentially:
    rl = zfs_range_lock(zp, off, len, lock_type);
    zfs_range_unlock(rl);
    zfs_range_reduce(rl, off, len);
AVL tree
--------
An AVL tree is used to maintain the state of the existing ranges that are locked for exclusive (writer) or shared (reader) use. The starting range offset is used for searching and sorting the tree.
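As an illustration of ordering by start offset, here is a minimal sketch of a comparator. The `range_t` type is a hypothetical, simplified stand-in for `rl_t` (only an offset and a length), and `range_compare` is an illustrative name, not the function defined in zfs_rlock.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for rl_t: only the fields used here. */
typedef struct range {
    uint64_t r_off;   /* start offset of the range */
    uint64_t r_len;   /* length of the range in bytes */
} range_t;

/*
 * Order two ranges by start offset only. The length is ignored, so two
 * ranges that begin at the same offset compare as equal.
 * Returns -1, 0, or 1.
 */
int
range_compare(const void *arg1, const void *arg2)
{
    const range_t *a = arg1;
    const range_t *b = arg2;

    assert(a != NULL && b != NULL);
    if (a->r_off > b->r_off)
        return (1);
    if (a->r_off < b->r_off)
        return (-1);
    return (0);
}
```

Because the length does not take part in the comparison, a tree keyed this way can hold only one node per start offset, which is the constraint the proxy-lock scheme described further on has to work around.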
Common case
-----------
The (hopefully) usual case is of no overlaps or contention for locks. On entry to zfs_range_lock() an rl_t is allocated, the tree is searched and no overlap is found, and *this* rl_t is placed in the tree.
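The "search finds no overlap" step can be sketched with a sorted array standing in for the AVL tree. This is an assumption-laden model, not the code in zfs_rlock.c: `range_t`, `ranges_overlap`, and `range_is_free` are hypothetical names, ranges are treated as half-open intervals `[off, off + len)`, and the existing entries are assumed not to overlap each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for rl_t. */
typedef struct range {
    uint64_t r_off;
    uint64_t r_len;
} range_t;

/* Overlap test written so that off + len is never computed (no overflow). */
static bool
ranges_overlap(const range_t *a, const range_t *b)
{
    if (a->r_off <= b->r_off)
        return (b->r_off - a->r_off < a->r_len);
    return (a->r_off - b->r_off < b->r_len);
}

/*
 * locked[] is sorted by r_off and its entries do not overlap each other.
 * Returns true if [off, off + len) conflicts with none of them.
 */
bool
range_is_free(const range_t *locked, size_t n, uint64_t off, uint64_t len)
{
    range_t want = { off, len };
    size_t i = 0;

    assert(len > 0);
    /* Find the first entry that starts at or after the requested offset. */
    while (i < n && locked[i].r_off < off)
        i++;
    /* Only that entry and its immediate predecessor can overlap. */
    if (i < n && ranges_overlap(&want, &locked[i]))
        return (false);
    if (i > 0 && ranges_overlap(&locked[i - 1], &want))
        return (false);
    return (true);
}
```

With entries sorted by start offset, checking the two neighbours of the insertion point is enough; a balanced tree finds those neighbours in logarithmic time where this sketch scans linearly.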
Overlaps/Reference counting/Proxy locks
---------------------------------------
The avl code only allows one node at a particular offset. Also it's very inefficient to search through all previous entries looking for overlaps (because the very 1st in the ordered list might be at offset 0 but cover the whole file). So this implementation uses reference counts and proxy range locks. Firstly, only reader locks use reference counts and proxy locks, because writer locks are exclusive. When a reader lock overlaps with another then a proxy lock is created for that range and replaces the original lock. If the overlap is exact then the reference count of the proxy is simply incremented. Otherwise, the proxy lock is split into smaller lock ranges and new proxy locks are created for non-overlapping ranges. The reference counts are adjusted accordingly. Meanwhile, the original lock is kept around (this is the caller's handle) and its offset and length are used when releasing the lock.
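To make the splitting step concrete, here is a minimal sketch of cutting one reference-counted reader range in two at a given offset. `proxy_t` and `proxy_split` are hypothetical names for this illustration; the sketch leaves out the tree manipulation and memory allocation a real implementation needs, and it assumes `r_off + r_len` does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical proxy record: a range plus the number of readers holding it. */
typedef struct proxy {
    uint64_t r_off;   /* start offset */
    uint64_t r_len;   /* length in bytes */
    uint32_t r_cnt;   /* reader reference count */
} proxy_t;

/*
 * Split *front at off. Afterwards *front covers [r_off, off) and *rear
 * covers [off, old end). Both halves keep the original reference count,
 * because every reader of the old range still holds both halves.
 */
void
proxy_split(proxy_t *front, uint64_t off, proxy_t *rear)
{
    /* The split point must fall strictly inside the range. */
    assert(off > front->r_off);
    assert(off < front->r_off + front->r_len);

    rear->r_off = off;
    rear->r_len = front->r_off + front->r_len - off;
    rear->r_cnt = front->r_cnt;
    front->r_len = off - front->r_off;
}
```

For example, if one reader holds `[0, 100)` and a second reader asks for `[50, 150)`, splitting at 50 gives `[0, 50)` and `[50, 100)`, each with a count of 1. The second reader then raises the count on `[50, 100)` to 2, and a new proxy with a count of 1 covers the non-overlapping tail `[100, 150)`.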
Thread coordination
-------------------
In order to make wakeups efficient and to ensure multiple continuous readers on a range don't starve a writer for the same range lock, two condition variables are allocated in each rl_t. If a writer (or reader) can't get a range it initialises the writer (or reader) cv; sets a flag saying there's a writer (or reader) waiting; and waits on that cv. When a thread unlocks that range it wakes up all writers then all readers before destroying the lock.
Append mode writes
------------------
Append mode writes need to lock a range at the end of a file. The offset of the end of the file is determined under the range locking mutex, and the lock type converted from RL_APPEND to RL_WRITER and the range locked.
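A minimal sketch of the append conversion, assuming a POSIX mutex in place of the kernel mutex. `file_state_t`, `request_t`, and `resolve_append` are hypothetical names, and the enum below only mirrors the lock type names mentioned in this file; the actual insertion of the range into the tree is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { RL_READER, RL_WRITER, RL_APPEND } rl_type_t;

/* Hypothetical per-file state: size is only read while holding the mutex. */
typedef struct file_state {
    pthread_mutex_t range_lock;
    uint64_t        size;
} file_state_t;

/* Hypothetical lock request. */
typedef struct request {
    uint64_t  r_off;
    uint64_t  r_len;
    rl_type_t r_type;
} request_t;

/*
 * Turn an append request into an ordinary writer request that starts at
 * the current end of file. Other request types are left unchanged.
 */
void
resolve_append(file_state_t *fs, request_t *req)
{
    assert(fs != NULL && req != NULL);

    pthread_mutex_lock(&fs->range_lock);
    if (req->r_type == RL_APPEND) {
        req->r_off = fs->size;      /* end of file, stable under the mutex */
        req->r_type = RL_WRITER;
    }
    /* A real implementation would lock the range here, before unlocking. */
    pthread_mutex_unlock(&fs->range_lock);
}
```

Reading the size and claiming the range under the same mutex is what keeps two concurrent appenders from resolving to the same offset.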
Grow block handling
-------------------
ZFS supports multiple block sizes, currently up to 128K. The smallest block size is used for the file, which is grown as needed. During this growth all other writers and readers must be excluded. So if the block size needs to be grown then the whole file is exclusively locked, then later the caller will reduce the lock range to just the range to be written using zfs_range_reduce().
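The lock-everything-then-reduce pattern can be sketched as follows. This is a simplified model under stated assumptions: `writer_range`, `range_reduce`, `range_t`, and `MAX_BLKSZ` are hypothetical names, the growth test is reduced to "the write ends beyond the current block and the block is below the maximum size", and `off + len` is assumed not to overflow. The real condition in zfs_rlock.c may differ.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLKSZ (128 * 1024)   /* assumed maximum block size: 128K */

/* Hypothetical, simplified stand-in for rl_t. */
typedef struct range {
    uint64_t r_off;
    uint64_t r_len;
} range_t;

/*
 * Choose the range a writer should lock. If the write could force the
 * block size to grow, lock the whole file instead of just [off, off + len).
 */
range_t
writer_range(uint64_t file_size, uint64_t blksz, uint64_t off, uint64_t len)
{
    uint64_t end = off + len;
    uint64_t end_size = (end > file_size) ? end : file_size;
    range_t r = { off, len };

    if (end_size > blksz && blksz < MAX_BLKSZ) {
        r.r_off = 0;
        r.r_len = UINT64_MAX;    /* whole file */
    }
    return (r);
}

/* Shrink a whole-file writer lock to the range actually being written. */
void
range_reduce(range_t *r, uint64_t off, uint64_t len)
{
    /* Only a whole-file lock may be reduced. */
    assert(r->r_off == 0 && r->r_len == UINT64_MAX);
    r->r_off = off;
    r->r_len = len;
}
```

A caller would take the range returned by `writer_range`, grow the block while holding the whole-file lock, and then call `range_reduce` so that other readers and writers can proceed on the rest of the file.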
Definition in file zfs_rlock.c.
static void zfs_range_add_reader(avl_tree_t *tree, rl_t *new, rl_t *prev, avl_index_t where)
Definition at line 274 of file zfs_rlock.c.
int zfs_range_compare(const void *arg1, const void *arg2)
AVL comparison function used to order range locks. Locks are ordered on the start offset of the range.
Definition at line 585 of file zfs_rlock.c.
rl_t *zfs_range_lock(znode_t *zp, uint64_t off, uint64_t len, rl_type_t type)
Lock an object range.
off | Offset into the file that begins the range |
len | Length of the range to lock |
type | Either shared (RL_READER) or exclusive (RL_WRITER or RL_APPEND). RL_APPEND is a special type that is converted to RL_WRITER, with the locked range starting at the end of the file. |
Definition at line 423 of file zfs_rlock.c.
static void zfs_range_lock_reader(znode_t *zp, rl_t *new)
Check if a reader lock can be grabbed, or wait and recheck until available.
Definition at line 359 of file zfs_rlock.c.
static void zfs_range_lock_writer(znode_t *zp, rl_t *new)
Check if a write lock can be grabbed, or wait and recheck until available.
Definition at line 107 of file zfs_rlock.c.
static void zfs_range_new_proxy(avl_tree_t *tree, uint64_t off, uint64_t len)
Create and add a new proxy range lock for the supplied range.
Definition at line 257 of file zfs_rlock.c.
static rl_t *zfs_range_proxify(avl_tree_t *tree, rl_t *rl)
If this is an original (non-proxy) lock then replace it by a proxy and return the proxy.
Definition at line 194 of file zfs_rlock.c.
void zfs_range_reduce(rl_t *rl, uint64_t off, uint64_t len)
Reduce range locked as RL_WRITER from whole file to specified range.
Asserts the whole file is exclusively locked and so there's only one entry in the tree.
Definition at line 562 of file zfs_rlock.c.
static rl_t *zfs_range_split(avl_tree_t *tree, rl_t *rl, uint64_t off)
Split the range lock at the supplied offset returning the *front* proxy.
Definition at line 226 of file zfs_rlock.c.
void zfs_range_unlock(rl_t *rl)
Unlock range and destroy range lock structure.
Definition at line 524 of file zfs_rlock.c.
static void zfs_range_unlock_reader(znode_t *zp, rl_t *remove)
Unlock a reader lock.
Definition at line 460 of file zfs_rlock.c.