The FreeBSD Symmetric MultiProcessing (SMP) project, often referred to as SMPng (SMP next generation), is focused on introducing parallelism into the FreeBSD kernel. While earlier versions of FreeBSD (3.x, 4.x) supported parallelism in user processes, the kernel itself could execute on only one processor at a time, a restriction enforced by what is referred to as a "Giant lock" around the kernel. For many interesting workloads, running user processes in parallel still yields a substantial speed-up, because significant computation occurs in user processes, especially for applications such as rendering and compilation. However, for kernel-intensive workloads, such as heavy network or file system I/O, contention on the kernel lock results in little speed-up. The end goal of the SMPng Project is to decompose the Giant lock into a number of smaller locks, reducing contention and improving SMP performance. Important steps along the way include redesigning significant portions of the FreeBSD kernel architecture around the notion of ubiquitous parallelism: at any moment, many processors might enter the kernel at the same time. This includes the introduction of more mature threading and synchronization primitives, interrupt threads, cache-aware allocation and scheduling, and topology-aware scheduling.
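
The difference between a single Giant lock and finer-grained locking can be illustrated with a small userland analogy. The following is only a sketch using POSIX threads, not FreeBSD kernel code: the `subsystem` structure, the `net` and `vfs` instances, and the `op_giant()`, `op_fine()`, and `run_pair()` helpers are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical "subsystem": some state plus a lock that protects only it. */
struct subsystem {
    pthread_mutex_t lock;
    long counter;
};

/* One global lock standing in for Giant (illustration only). */
static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;
static struct subsystem net = { PTHREAD_MUTEX_INITIALIZER, 0 };
static struct subsystem vfs = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Giant style: every operation serializes on the same lock,
 * even when the two threads touch unrelated subsystems. */
static void op_giant(struct subsystem *s)
{
    pthread_mutex_lock(&giant);
    s->counter++;
    pthread_mutex_unlock(&giant);
}

/* Fine-grained style: only operations on the same subsystem contend. */
static void op_fine(struct subsystem *s)
{
    pthread_mutex_lock(&s->lock);
    s->counter++;
    pthread_mutex_unlock(&s->lock);
}

struct job {
    struct subsystem *s;
    void (*op)(struct subsystem *);
    int n;
};

static void *worker(void *arg)
{
    struct job *j = arg;

    assert(j != NULL && j->s != NULL);
    for (int i = 0; i < j->n; i++)
        j->op(j->s);
    return NULL;
}

/* Run one worker against each subsystem; returns 0 on success. */
static int run_pair(void (*op)(struct subsystem *), int n)
{
    struct job a = { &net, op, n }, b = { &vfs, op, n };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, worker, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}
```

Calling `run_pair(op_giant, n)` and `run_pair(op_fine, n)` produces the same counter values; the difference is that in the fine-grained variant the two workers never wait on each other, which is the contention reduction the project is after.
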
SMPng debuted in FreeBSD 5.0-RELEASE in January of 2003, and involved over five years of continuous development by a large number of members of the FreeBSD development team, as well as many external contributors. In the 5.0 release, the basic architectural changes required to support SMPng were complete, including new kernel memory allocators, synchronization routines, the move to ithreads, and the removal of the Giant lock from activities such as process scheduling and several common forms of IPC. Since 5.0, the implementation has substantially matured, and the architecture has been refined in a number of ways, including optimizing synchronization approaches, stability and performance testing on larger systems (up to 12 processors), and the removal of Giant from several significant parts of the operating system: large parts of Virtual Memory (VM), the Virtual File System (VFS), the UNIX File System (UFS), most parts of the network stack (including IPv4, IPv6, FAST_IPSEC, UNIX domain sockets, and NetGraph), and additional inter-process communication primitives. The SMP-aware kernel slab allocator is now used almost universally, and the focus has changed from "make it work" to "optimize it!". Simultaneous work on KSE 1:1 and M:N threading has also allowed applications to take advantage of new kernel parallelism. The FreeBSD 5.3 kernel introduced Giant-free network stack execution for most relevant code paths, and the FreeBSD 6.x kernel introduces MPSAFE VFS, as well as widespread performance optimization.
Continuing work on SMPng includes sweeping up the "loose ends" that remain under Giant, such as parts of NFS, less widely used file systems such as NTFS, and less commonly used network stack components, such as SLIP. Another important focus is performance measurement and optimization, which builds on and refines the SMPng architecture: the introduction of features such as the kernel trace facility (KTR), hardware performance monitor counters (hwpmc), lock profiling, and improved memory monitoring plays an important role in this process. Other important debugging and testing facilities include WITNESS, a run-time kernel lock order verifier, and widespread use of lock assertions and run-time invariants testing.
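
To make the idea of a run-time lock order verifier concrete, here is a heavily simplified sketch of the kind of check such a tool performs. It is not the WITNESS implementation: the lock class names, the `before` table, and the `order_check()` function are hypothetical and exist only for this illustration. The sketch remembers every "held, then acquired" pair it has seen and reports a reversal when a new acquisition would contradict a previously observed order, including transitively.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, for illustration only. */
enum { LK_PROC, LK_VNODE, LK_SOCKET, MAX_CLASSES };

/* before[a][b] is true once class a has been seen held while acquiring b. */
static bool before[MAX_CLASSES][MAX_CLASSES];

/* Depth-first search: is there a recorded ordering path from 'from' to 'to'? */
static bool reaches(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int n = 0; n < MAX_CLASSES; n++)
        if (before[from][n] && !seen[n] && reaches(n, to, seen))
            return true;
    return false;
}

/*
 * Called before acquiring 'wanted' while 'held' is already held.
 * Returns false if this would reverse a previously observed order
 * (or re-acquire the same class); otherwise records the order.
 */
static bool order_check(int held, int wanted)
{
    bool seen[MAX_CLASSES];

    assert(held >= 0 && held < MAX_CLASSES);
    assert(wanted >= 0 && wanted < MAX_CLASSES);
    memset(seen, 0, sizeof(seen));
    if (held == wanted || reaches(wanted, held, seen))
        return false;
    before[held][wanted] = true;
    return true;
}
```

For example, after observing `LK_PROC` before `LK_VNODE` and `LK_VNODE` before `LK_SOCKET`, an attempt to take `LK_PROC` while holding `LK_SOCKET` is flagged, even though that exact pair was never seen directly. Checking before the lock is actually taken is what allows a warning to be issued rather than a deadlock.
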
The FreeBSD Project recognizes, in particular, the contributions of the BSD/OS development team, including architectural direction and suggestions, initial locking strategies for some system components, and source code for some of the starting primitives, not to mention BSDI's contributions in staffing and resources for early parts of this project.
This web page contains information relating to the SMPng effort; because of the large amount of work and rapid pace of development, it can fall a bit behind reality.
The task list below is not intended to be complete, but does represent a set of relevant and/or important components of the overall work. The "Responsible" field identifies a developer who has expressed willingness to be responsible for completing the identified task; this doesn't preclude others working on it, but suggests that coordination with the responsible party might be appropriate so as to avoid unnecessary duplication of work, and to maximize forward progress. If beginning work on a new area of substantial size, or one that appears unclaimed, it may be worth dropping an e-mail to the FreeBSD SMP mailing list to see if any progress has been made.
The definition of the date field varies depending on the status of a task. For completed tasks, it refers to the date completed or reported completed. For in-progress tasks, it refers to the date of the last update of the entry. For stalled tasks, it refers to the date that the task was declared stalled. For new tasks, it refers to the date the task was added to the list.
Locking down of individual device drivers is tracked at the busdma and SMPng driver conversion webpage.
Network stack locking information is available at Robert Watson's netperf web page and the FreeBSD.org Netperf web page. An SMP network performance cluster has also been created for the purposes of testing.
Tasks are sorted first by status, then by date.
Most SMP-related discussion takes place on the freebsd-smp mailing list. You can read more about mailing lists in the Resources on the Internet appendix of the FreeBSD Handbook.
Steve Passe has been maintaining an SMP project page that contains additional information, and goes back further in time than this web page.
Robert Watson is maintaining a page for SMP-related network performance work for the Netperf project. In addition, he has a personal web page with a change log and other information.
OSNews has run an interview with FreeBSD developers Robert Watson, John Baldwin, and Scott Long, who talk about many features in 6.x, including recent SMPng work, SMPVFS, and more. The article is here.
A series of man pages on kernel synchronization and threading primitives can be found here:
In addition, the old SPL man page may be of interest, as it reflects the previous SMP synchronization model.
Hiten Pandya's SMP kernel synchronization rules.
Matt Dillon maintained a web page that documented the work he was doing on the SMP project.
"Locking in the Multithreaded FreeBSD Kernel" USENIX paper by John Baldwin.
"Reasoning about SMP in FreeBSD" BSDCon'03 paper by Jeffrey Hsu.
"ULE: A Modern Scheduler for FreeBSD" BSDCon'03 paper by Jeff Roberson.
This is an incomplete list of high-level kernel subsystems and current, active staff working on SMP architecture and stability.
Subsystem | Status | Last updated | Staffing |
---|---|---|---|
Newbus | In progress | 5 October 2003 | Warner Losh |
VM | In progress | 4 October 2003 | Alan L. Cox |
Buffer cache | In progress | 4 October 2003 | Jeff Roberson, Poul-Henning Kamp |
VFS | In progress | 4 October 2003 | Jeff Roberson |
Processes and thread operations | In progress | 5 May 2003 | John Baldwin |
Scheduler | Done | 23 April 2004 | John Baldwin, Jeff Roberson |
GEOM | Done | 5 February 2003 | Poul-Henning Kamp |
File descriptors | Done | 5 February 2003 | Alfred Perlstein, Seigo Tanimura, Robert Watson |
TTY subsystem | In progress | 24 July 2004 | Poul-Henning Kamp |
Pipe IPC | Done | 4 October 2003 | Alfred Perlstein |
Socket structures and system calls | Done | 25 November 2004 | Sam Leffler, Robert Watson |
KQueue | Done | 24 November 2004 | John-Mark Gurney, Brian Feldman |
IPv4 | Done | 23 April 2004 | Jennifer Yang, Jeffrey Hsu, Sam Leffler, Robert Watson, George V. Neville-Neil |
IPv6 | In progress | 01 July 2005 | Robert Watson, George V. Neville-Neil |
IPX/SPX | Done | 09 January 2005 | Robert Watson |
netatalk | Done | 02 February 2005 | Robert Watson |
Network stack infrastructure | In progress | 4 October 2003 | Jeffrey Hsu, Sam Leffler, Robert Watson, Max Laier, Luigi Rizzo, Maurycy Pawlowski-Wieronski <maurycy@fouk.org>, Brooks Davis, Roman Kurakin |
NFS Client | In progress | 23 April 2004 | |
NFS Server | In progress | 24 November 2004 | Robert Watson |
Following is an incomplete list of general tasks.
Task | Responsible | Last updated | Status |
---|---|---|---|
Convert the giant lock from spinning to blocking, add the scheduler lock, add per-CPU idle processes. | Matt Dillon | 25 June 2000 | Done |
Port the BSD/OS locking primitives (i386). | Jake Burkholder | 3 July 2000 | Done |
Implement heavy-weight interrupt threads (i386). | Greg Lehey | 3 August 2000 | Done |
Rewrite the low level interrupt code (i386 UP). | Greg Lehey | 3 August 2000 | Done |
Demonstrated reasonable stability (self-hosted buildworld) (i386 UP). | -smp developers | 12 August 2000 | Done |
Port the BSD/OS locking primitives (alpha). | Doug Rabson | 24 August 2000 | Done |
Stub out (disable) spl()s. | Greg Lehey | 30 August 2000 | Done |
Port the BSD/OS ktr code. | Greg Lehey, John Baldwin | 30 August 2000 | Done |
Rewrite the low level interrupt code (i386 SMP). | John Baldwin | 1 September 2000 | Done |
Demonstrated reasonable stability (self-hosted buildworld) (i386 SMP). | -smp developers | 6 September 2000 | Done |
Demonstrated reasonable stability (self-hosted buildworld) (alpha). | -smp developers | 6 September 2000 | Done |
Make malloc and friends thread-safe. | Jason Evans | 10 September 2000 | Done |
Implement msleep(), make tsleep() an msleep() wrapper. | Jake Burkholder | 11 September 2000 | Done |
Make fxp driver thread-safe. | Chuck Paterson | 17 September 2000 | Done |
Make mbuf's thread-safe. | Bosko Milekic | 29 September 2000 | Done |
Lock manager re-work. | Jason Evans | 3 October 2000 | Done |
Implement heavy-weight interrupt threads (alpha). | John Baldwin, Doug Rabson | 5 October 2000 | Done |
Rewrite the low level interrupt code (alpha). | Doug Rabson, John Baldwin | 5 October 2000 | Done |
Process accounting. | Tor Egge, John Baldwin | 5 October 2000 | Done |
Make ethernet drivers thread-safe. | Bill Paul | 15 October 2000 | Done |
Make the mutex headers mostly machine-independent. | John Baldwin | 20 October 2000 | Done |
Rename SMP_DEBUG to MUTEX_DEBUG. | John Baldwin | 20 October 2000 | Done |
Give each soft interrupt its own thread. | Chuck Paterson | 25 October 2000 | Done |
Make sf_bufs (sendfile(2)) thread-safe. | Bosko Milekic | 5 November 2000 | Done |
Make the witness code work correctly. | John Baldwin | 18 November 2000 | Done |
Split the ktr-specific code out of db_interface.c. | John Baldwin | 15 December 2000 | Done |
Convert the sio driver to using a spin mutex. | John Baldwin | 18 December 2000 | Done |
Implement condition variables. | Jake Burkholder, Jason Evans | 15 January 2001 | Done |
Add a flag to mtx_init() (MTX_RECURSE) that denotes whether a mutex is allowed to recurse. | Bosko Milekic | 19 January 2001 | Done |
Make the zone allocator thread-safe. | Dag-Erling Smorgrav | 21 January 2001 | Done |
Convert simplelocks to mutexes. | Jason Evans | 24 January 2001 | Done |
Make kernel preemptive with respect to interrupts. | Jake Burkholder | 31 January 2001 | Done |
Cleanup of mutex API. | Bosko Milekic | 8 February 2001 | Done |
Remove COM_LOCK. | Mark Murray | 11 February 2001 | Done |
Merge various scheduling classes into one run queue. Modify scheduler to support preemptable kernel. | Jake Burkholder | 11 February 2001 | Done |
Make priority propagation work correctly. | Jake Burkholder | 11 February 2001 | Done |
Make most of the interrupt thread code MI and shared between hardware and software interrupts. | John Baldwin | 18 February 2001 | Done |
Add protection to struct jail and jail-related functionality. | Robert Watson | 20 February 2001 | Done |
Implement sx (shared/exclusive) locks. | Jason Evans | 5 March 2001 | Done |
Generalize/improve witness to handle more complex locking primitives (mtx, sx). | John Baldwin | 28 March 2001 | Done |
Convert the allproc and proctree locks from lockmgr locks to sx locks. | John Baldwin | 28 March 2001 | Done |
Make mbuf system use condition variables instead of msleep()/wakeup(). | Bosko Milekic | 2 April 2001 | Done |
Remove <sys/mutex.h> includes from other kernel headers such as <vm/vm_zone.h>, <sys/resourcevar.h>, <sys/ucred.h>, and <sys/mbuf.h>. | Mark Murray | 15 May 2001 | Done |
Cleanup the various mp_machdep.c's, unify various SMP API's such as IPI delivery, etc. | John Baldwin | 15 May 2001 | Done |
Make most of the forward_* and forwarded_* functions MI. | John Baldwin | 15 May 2001 | Done |
Complete the MD support for SMP on the Alpha platform. | Andrew Gallatin, Doug Rabson, John Baldwin | 15 May 2001 | Done |
Convert select() to use condition variables. | Seigo Tanimura | 15 May 2001 | Done |
Add a "giant" lock around the VM subsystem. | Alfred Perlstein | 13 June 2001 | Done |
Introduce a modified slab allocator for the mbuf subsystem. | Bosko Milekic | 21 June 2001 | Done |
Add a witness_assert() function to handle lock assertions. | John Baldwin | 27 June 2001 | Done |
Extend sx locks to support try lock operations. | John Baldwin | 27 June 2001 | Done |
Document KTR. | John Baldwin | 28 June 2001 | Done |
Make fork_return, fork_exit, ast, and userret MI. | John Baldwin | 29 June 2001 | Done |
Make sched_lock's savecrit a per-process property saved and restored in mi_switch and initialized in fork_exit. | John Baldwin | 30 June 2001 | Done |
Make ast() loop. | John Baldwin | 10 August 2001 | Done |
Add upgrade/downgrade sx lock operations. | Alexander Kabaev, Jason Evans | 13 August 2001 | Done |
Implement semaphores. | Jason Evans | 14 August 2001 | Done |
Add support for upgrade/downgrades in witness. | John Baldwin | 23 August 2001 | Done |
Make most of cpu_wait() and cpu_exit() MI. | Peter Wemm | 9 September 2001 | Done |
Split NFS into client and server. | Peter Wemm | 18 October 2001 | Done |
Lock taskqueues. | Andrew Reiter, John Baldwin | 25 October 2001 | Done |
Add a per-thread ucred reference. | John Baldwin | 25 October 2001 | Done |
Make most of the per-CPU stuff MI. | John Baldwin | 11 December 2001 | Done |
Make critical section saved state per-thread instead of per-lock so that interlocking spin locks work properly. | John Baldwin | 17 December 2001 | Done |
Replace the APIC-specific imen_mtx with a MI-named icu_lock to protect interrupt controllers and associated data within the kernel for both i386 and alpha. | John Baldwin | 20 December 2001 | Done |
Use the per-thread critical section nesting level in the mutex and interrupt thread code to automatically determine when to not preempt. This makes the MTX_NOSWITCH, SWI_SWITCH, and SWI_NOSWITCH flags obsolete as the kernel will be able to figure out the proper behavior on its own. | John Baldwin | 5 January 2002 | Done |
Lock struct filedesc and struct file. | Seigo Tanimura, Alfred Perlstein | 12 January 2002 | Done |
Lock struct pgrp, struct session, and struct sigio. | Seigo Tanimura | 23 February 2002 | Done |
Lock pipe implementation, but not sigio/fown, VM interactions. | Alfred Perlstein | 27 February 2002 | Done |
Move to explicit reference counting for soft vnode references. | Poul-Henning Kamp | 8 March 2002 | Done |
Initialize mutex pools early enough that sx locks can be used for VM. | Brian Feldman | 14 March 2002 | Done |
Place a global lock (sellock) around selinfo structures to fix a variety of lock order reversals, and make select() MP-safe. | Alfred Perlstein, Chad David | 14 March 2002 | Done |
Push down Giant on read, write, pread, pwrite system calls, acquiring Giant in the per-subsystem fileop layer for sockets, VFS, etc. | Alfred Perlstein | 15 March 2002 | Done |
Lock down kernel module structures. | Andrew Reiter | 18 March 2002 | Done |
Lock down kernel linker globals. | Andrew Reiter | 18 March 2002 | Done |
Rewrite kernel memory allocator to be a slab allocator that uses per-cpu caches. | Jeff Roberson | 21 March 2002 | Done |
Replace incorrect use of MD critical section API to disable interrupts with a specific interrupt disable API. | Warner Losh, Doug Rabson, Benno Rice, John Baldwin | 21 March 2002 | Done |
Lock down access to the shared p_args "process arguments" structure through appropriate protection of that structure and references to it. | Jonathan Mini | 31 March 2002 | Done |
Move from flags/tsleep lock to sx locks to protect sysctl tree from updates during sysctl operations. | Jonathan Mini | 1 April 2002 | Done |
Create/port userland tool to manage KTR event dumps. | Jake Burkholder | 1 April 2002 | Done |
Create MTX_SYSINIT and SX_SYSINIT macros that allow for initializing locks that are subsystem independent. | Andrew Reiter | 2 April 2002 | Done |
Lock down the global securelevel variable. | Andrew Reiter | 2 April 2002 | Done |
Make grow_stack() MI. Possibly even a macro or inline. | Alan L. Cox | 6 April 2002 | Done |
Lock use of p_fd, which otherwise can result in corrupted p_fd panics during heavy operation. Start with a global, and move to per-proc locking. | Alfred Perlstein, Seigo Tanimura | 8 April 2002 | Done |
Lock struct pargs. | Jonathan Mini | 9 April 2002 | Done |
Lock sysctl hierarchy. | Jonathan Mini | 9 April 2002 | Done |
Make {o,}sigreturn() MPSAFE. | Alan L. Cox | 11 April 2002 | Done |
Rewrite kernel memory allocator so that Giant is not required for malloc() or free(). | Jeff Roberson | 2 May 2002 | Done |
Replace complex shared/exclusive locking scheme in the VM system with a purely exclusive lockmgr locking scheme, simplifying locking and removing potential livelock/deadlock scenarios. | Brian Feldman, Alan L. Cox | 3 May 2002 | Done |
Push down Giant into readv/writev system calls in style of read/write/pread/pwrite once malloc no longer requires Giant in the handling of iovec structures for uio. | Alan L. Cox | 9 May 2002 | Done |
Push down Giant in mprotect(), minherit(), and madvise() so that it is no longer acquired and released directly. | Alan L. Cox | 18 May 2002 | Done |
Update suser() and p_can*() APIs to accept threads instead of processes. | John Baldwin | 18 May 2002 | Done |
Broadly transition to td_ucred from p_ucred once KSE dependencies are in place. | John Baldwin | 18 May 2002 | Done |
Add a witness_sleep() check to uma_zalloc() to catch code calling malloc() or uma_zalloc() while holding non-sleepable locks. | John Baldwin | 20 May 2002 | Done |
Optimize UP support by changing spin locks to only perform critical section enter and exits. | John Baldwin | 21 May 2002 | Done |
Make sleep mutexes spin if the current lock holder is executing on another CPU. | John Baldwin | 21 May 2002 | Done |
Add support for the IA32 pause instruction to spin loops in locks. | John Baldwin | 21 May 2002 | Done |
Make KTRACE write into tracefiles asynchronously. | John Baldwin | 7 June 2002 | Done |
Remove Giant from modnext(2), modfnext(2), modstat(2), and modfind(2). | Andrew Reiter | 25 June 2002 | Done |
Fix synchronization of TLB flushes and invlpg() on x86 SMP. | Peter Wemm | 12 July 2002 | Done |
Add KTR(9) tracing for mutex contention. | Ian Dowse | 26 August 2002 | Done |
Make cpu_coredump MI. | Peter Wemm | 7 September 2002 | Done |
Add a subsystem lock to the accounting code. | Andrew Reiter | 11 September 2002 | Done |
Allow KTR(9) to write trace records to alq(9) record facility. | Jeff Roberson | 22 September 2002 | Done |
Create mechanism in cdevsw structure to protect thread-unsafe drivers. | Poul-Henning Kamp | 27 September 2002 | Done |
Fix SIGXCPU and other #if 0'd things in mi_switch(). | John Baldwin | 30 September 2002 | Done |
Lock down TrustedBSD MAC implementation. | Robert Watson | 11 November 2002 | Done |
Lock eventhandlers. | Mike Smith, Jonathan Mini, John Baldwin | 11 March 2003 | Done |
Fix PHOLD() so that it blocks to guarantee PS_INMEM. | John Baldwin | 22 April 2003 | Done |
Fix various procfs_machdep.c to not use sched_lock. | John Baldwin | 22 April 2003 | Done |
Lock all references to process credentials and remove Giant from process credential-related system calls. | John Baldwin | 1 May 2003 | Done |
Merge the procsig and sigacts structures, move the new sigacts structure out of the U-area and add appropriate locking. | John Baldwin | 13 May 2003 | Done |
Remove Giant from the kill() and killpg() system calls. | John Baldwin | 13 May 2003 | Done |
Enhance the mutex pool implementation to allow creation and use of multiple, dynamically allocated pools with adjustable pool sizes and mutex options. | Don Lewis | 16 July 2003 | Done |
Create mutex profiling tool for the kernel so as to measure contention and behavior of kernel mutexes. | Eivind Eklund, Dag-Erling Smorgrav | 31 March 2002 | Done |
Lock down linker_file_t structures in the kernel linker. | Andrew Reiter | 19 June 2002 | Done |
Lock pipe implementation: VM optimizations. | | 4 October 2003 | Done |
Reimplement i386 interrupt and SMP code so that SMP kernels work on UP boxes and SMP can be enabled in GENERIC. | John Baldwin | 3 November 2003 | Done |
Implement generic turnstiles to use when blocking on non-sleepable locks. | John Baldwin | 11 November 2003 | Done |
Split witness_lock() into witness_checkorder() and witness_lock(). witness_checkorder() would be called before acquiring a lock to increase the chances of detecting and warning about a reversal prior to deadlocking. witness_lock() would simply update witness' internal state to note that a lock has been acquired. | John Baldwin | 24 January 2004 | Done |
Lock per-process resource limits. | Mike Makonnen, John Baldwin | 4 February 2004 | Done |
Implement a sleep queue abstraction to be used by both msleep() and condition variables. This new abstraction should use a hash table of sleep queues with a spin lock on each sleep queue chain, similar to turnstile chain locks, to make sched_lock finer grained. | John Baldwin | 27 February 2004 | Done |
Remove Giant from jail(2). | Andrew Reiter, Robert Watson | 23 April 2004 | Done |
Add subsystem locking to NFSv2, NFSv3 server, permitting upcalls and other network-related elements to run Giant-free. | Robert Watson | 24 July 2004 | Done |
Add KTR(9) tracing for UMA allocation/free events. | Robert Watson | 05 August 2004 | Done |
Add KTR(9) tracing for GEOM I/O events. | Robert Watson | 21 October 2004 | Done |
Add KTR(9) tracing for busdma events. | Robert Watson | 23 October 2004 | Done |
Add KTR(9) tracing for critical sections. | Robert Watson | 07 November 2004 | Done |
Make the kernel fully preemptive. | John Baldwin | 24 November 2004 | Done |
Lock pipe implementation: sigio/fown-related evil. | Alfred Perlstein | 24 November 2004 | Done |
Lock down the SysV IPC code. | Alfred Perlstein | 24 November 2004 | Done |
Lock contention measurement tool to measure heat of various locks, including Giant, and permit more directed performance and locking strategy optimization. | Robert Watson | 24 November 2004 | Done |
Add KTR(9) tracing to scheduler run queues. | Jeff Roberson | 26 December 2004 | Done |
Review locking strategy and correctness of VFS operations and fix up various failure modes associated with enabling VFS locking assertions. | Jeff Roberson | 01 January 2005 | Done |
Document in-vnode locking strategy, clean it up. | Jeff Roberson | 01 January 2005 | Done |
Run cross-file system VFS without Giant, acquiring Giant conditionally based on a file system flag. | Jeff Roberson | 01 January 2005 | Done |
Run UFS file system MPSAFE. | Jeff Roberson | 01 January 2005 | Done |
Add KTR(9) tracing for buffer cache events. | Jeff Roberson | 24 January 2005 | Done |
Break out critical section and spin lock APIs, and re-optimize critical sections to not disable interrupts in hardware due to the high cost on some hardware architectures. | John Baldwin | 04 April 2005 | Done |
Modify uma(9) to use critical sections to protect per-CPU statistics, instead of mutexes, in order to optimize access. | Robert Watson | 29 April 2005 | Done |
Migrate malloc(9) to per-CPU statistics, and use critical sections to optimize access to those statistics. | Robert Watson | 29 May 2005 | Done |
Add KTR(9) support for KTR_VFS to trace additional VFS events, rather than mechanically inserted KTR_VOP events. | Jeff Roberson | 11 June 2005 | Done |
Push the grabbing of Giant into Linux i386 ABI system calls. | John Baldwin | 13 July 2005 | Done |
Push the grabbing of Giant into Linux AXP ABI system calls. | John Baldwin | 13 July 2005 | Done |
Push the grabbing of Giant into SVR4 i386 ABI system calls. | John Baldwin | 13 July 2005 | Done |
Push the grabbing of Giant into OSF/1 AXP ABI system calls. | John Baldwin | 13 July 2005 | Done |
Push the grabbing of Giant into IBCS i386 ABI system calls. | John Baldwin | 13 July 2005 | Done |
Add a new witness check for exiting threads to verify that an exiting thread holds no locks. | John Baldwin | 2 September 2005 | Done |
Implement atomic_fetchadd() for ints. | John Baldwin | 27 September 2005 | Done |
Implement a simple reference count API using atomic operations and use this to replace locks that just protect a reference count. | John Baldwin | 27 September 2005 | Done |
Split the interrupt handler list out of struct ithread into its own structure and only start up kthreads for interrupt vectors that actually have threaded interrupt handlers. | John Baldwin | 25 October 2005 | Done |
Lock aio(4). | David Xu | 22 January 2006 | Done |
Implement reader/writer locks. | John Baldwin | 27 January 2006 | Done |
Lock struct proc. | John Baldwin | 20 February 2001 | In progress |
Lock down the tty subsystem. | Dick Garner, Jeremy Scofield, Thomas Moestl, Poul-Henning Kamp | 24 July 2004 | In progress |
Fix clock locking to be the same on all platforms. | John Baldwin | 16 November 2001 | In progress |
Make use of process locking and process reference counting to protect debugging interfaces (and procfs). | John Baldwin | 27 February 2002 | In progress |
Make use of process locking to protect process monitoring sysctls, including those employed by 'ps' and related tools. | John Baldwin | 27 February 2002 | In progress |
Lock down newbus infrastructure to support driver fine-graining. | Warner Losh | 28 February 2002 | In progress |
Remove the MP safe syscall flag from the syscall table and add explicit mtx_lock/unlock's of Giant to all syscalls. | Matt Dillon, Maxime Henrion, Robert Watson | 24 July 2004 | In progress |
SMPng architecture document. | John Baldwin, Robert Watson | 28 February 2002 | In progress |
Move to shared lock for VOP_GETATTR() to reduce blocking during frequent lightweight VFS operations. Modify namei() to provide a LOOKUP_SHARED flag to indicate when the lock required may be shared instead of exclusive. | Jeff Roberson | 11 March 2002 | In progress |
Document existing vm_map locking and verify its correctness. | Alan L. Cox | 18 May 2002 | In progress |
Document existing vm_object locking and verify its correctness. | Alan L. Cox | 4 May 2002 | In progress |
Implement lazy interrupt thread switching (context stealing) on i386. | Bosko Milekic, Alexander Kabaev | 10 December 2002 | In progress |
Implement lazy interrupt thread switching (context stealing) on sparc64. | Jake Burkholder | 10 December 2002 | In progress |
Switch from using lockmgr in VM to using a mutex or exclusive sxlock. Push down Giant on all VM except for vm_object/VFS and vm_page/pmap components. | Alan L. Cox | 10 December 2002 | In progress |
Modify device driver API to permit drivers to more easily split "in interrupt context" and "in interrupt thread" code so as to acknowledge interrupts faster. This will permit lower latency in interrupt handling. | Peter Wemm, Scott Long | 1 July 2005 | In progress |
Make printf() safe to call in almost any situation to avoid deadlocks. | Chuck Paterson | 15 May 2001 | Stalled |
Conditionalize atomic ops in the SMP code that are used for debugging statistics. | Peter Wemm | 15 March 2001 | Not Started |
Axe schedcpu() in favor of event driven priority updates as much as possible. | | 7 September 2001 | Not Started |
Fix *hold (e.g. crhold) to return reference to object. | | 7 September 2001 | Not Started |
Add witness checking for lockmgr locks. | | 7 September 2001 | Not Started |
Add ICU spin locks on ia64. | | 4 January 2002 | Not Started |
Expand mutex profiling tool to also profile sx locks. | Eivind Eklund, Dag-Erling Smorgrav | 1 April 2002 | Not Started |
Add a witness_sleep() check to copyin/out() and s/fuword(). | John Baldwin | 7 June 2002 | Not Started |
Remove Giant from mi_startup() and push Giant down into individual SYSINIT functions. Many SYSINIT functions probably do not need Giant anyway. | | 21 October 2005 | Not Started |
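
Several entries above concern replacing locks with atomic operations, for example the task to implement a simple reference count API on top of an atomic fetch-and-add. The following userland sketch shows the general technique using C11 atomics; it is not the kernel's interface, and the `obj` structure and the `obj_init()`, `obj_hold()`, and `obj_rele()` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object whose lifetime is governed by a reference count. */
struct obj {
    atomic_uint refs;   /* updated atomically; no mutex required */
};

/* A new object starts with one reference, owned by its creator. */
static void obj_init(struct obj *o)
{
    atomic_init(&o->refs, 1);
}

/* Take an additional reference; the caller must already hold one. */
static void obj_hold(struct obj *o)
{
    unsigned old = atomic_fetch_add_explicit(&o->refs, 1,
        memory_order_relaxed);

    assert(old > 0);    /* holding a dead object would be a bug */
    (void)old;
}

/*
 * Drop a reference. Returns true if this was the last one, in which
 * case the caller is responsible for destroying the object.
 */
static bool obj_rele(struct obj *o)
{
    unsigned old = atomic_fetch_sub_explicit(&o->refs, 1,
        memory_order_acq_rel);

    assert(old > 0);
    return old == 1;
}
```

Because the fetch-and-add returns the previous value, exactly one caller observes the transition to zero, so no separate lock is needed just to decide who frees the object.
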
This table lists the todo subtasks for multithreading the network stack.
Task | Responsible | Last updated | Status |
---|---|---|---|
Protect network interface queues. | Jonathan Lemon | 24 November 2000 | Done |
Lock up IP. | Jennifer Yang, Jeffrey Hsu | 10 June 2002 | Done |
Lock up TCP. | Jennifer Yang, Jeffrey Hsu, Sam Leffler, Robert Watson | 24 November 2004 | Done |
Lock up UDP. | Jennifer Yang, Jeffrey Hsu, Robert Watson | 24 November 2004 | Done |
Lock ifaddr reference counts. | Jeffrey Hsu | 18 December 2002 | Done |
Lock up ifnet list. | Jeffrey Hsu | 21 December 2002 | Done |
Lock radix trees. | Jeffrey Hsu | 23 December 2002 | Done |
Lock up ARP. | Jeffrey Hsu | 16 January 2003 | Done |
Lock up raw IP. | Sam Leffler, Robert Watson | 24 July 2004 | Done |
Lock divert sockets. | Sam Leffler | 4 October 2003 | Done |
Lock ipfw2. | Sam Leffler | 4 October 2003 | Done |
Lock DUMMYNET. | Sam Leffler | 4 October 2003 | Done |
Lock ethernet bridge. | Sam Leffler | 4 October 2003 | Done |
Lock IP fragment queues. | Robert Watson | 4 October 2003 | Done |
Lock routing entries. | Sam Leffler | 4 October 2003 | Done |
Lock FAST_IPSEC. | Sam Leffler | 4 October 2003 | Done |
Permit parallel entry into isr processing. | Robert Watson, Sam Leffler | 11 October 2003 | Done |
Lock if_disc "discard interface". | Robert Watson | 9 March 2004 | Done |
Lock if_faith "IPv6-to-IPv4 TCP relay interface". | Sam Leffler, Robert Watson | 9 March 2004 | Done |
Lock if_gif "generic tunnel interface". | Robert Watson | 9 March 2004 | Done |
Review ECN tunnel support (ip_ecn.c). | Robert Watson | 9 March 2004 | Done |
if_tap global and softc locking. | Robert Watson | 23 April 2004 | Done |
if_tun global and softc locking. | Robert Watson | 23 April 2004 | Done |
netatalk/aarp.c locking. | Robert Watson | 23 April 2004 | Done |
Cache socket MAC label in inpcb label for IPv4 sockets so that the label can be used safely at the inet layer without socket locks. | Robert Watson | 23 April 2004 | Done |
IP encapsulation subroutines (ip_encap.c). | Robert Watson | 23 April 2004 | Done |
Lock globals in loopback interface (if_loop.c). | Robert Watson | 23 April 2004 | Done |
Use m_tags in if_gif to limit looping configurations, rather than a non-MPSAFE static counter. | Ruslan Ermilov | 23 April 2004 | Done |
netatalk DDP PCB locking. | Robert Watson | 24 July 2004 | Done |
Lock up syncache. | Jeffrey Hsu, Sam Leffler | 10 November 2003 | Done |
Permit IP forwarding path to run Giant-free. | Sam Leffler | 1 December 2003 | Done |
Lock UNIX® domain protocols, fifofs. | Sam Leffler, Robert Watson | 24 July 2004 | Done |
Giant lock over NFS server to protect against so_upcall() w/o Giant. | Robert Watson | 24 July 2004 | Done |
Lock interface cloning meta-data. | Brooks Davis | 24 July 2004 | Done |
Apply combination of socket and socket buffer locks, label caching to MAC labels on sockets so that they can be used safely without Giant. | Robert Watson | 24 July 2004 | Done |
Make routing socket message dispatch use a netisr to avoid re-entering the socket code from the routing code, resolving lock order issues. | Robert Watson | 24 July 2004 | Done |
Introduce accept locking to protect accept incomplete and complete queues on listen sockets. | Robert Watson | 24 July 2004 | Done |
Break out socket buffer wakeup, socket buffer append, socket state change, socket buffer reserve, flush, and similar calls into _locked() and unlocked versions, and avoid conditional locking. | Robert Watson | 24 July 2004 | Done |
Lock down AARP, AppleTalk Address Resolution Protocol. | Robert Watson | 24 July 2004 | Done |
Fix pull/push cache data synchronization issues in sosend(), soreceive(), allowing them to run Giant-free. | Robert Watson | 24 July 2004 | Done |
Protect socket global counters/limits and generation number with a mutex. | Robert Watson | 24 July 2004 | Done |
Lock down unit allocation meta-data in interface related netgraph modules. | Robert Watson | 24 July 2004 | Done |
Lock down socket buffer OOB fields across TCP/IP, IPX. | Robert Watson | 24 July 2004 | Done |
Add MSG_NBIO so that fifofs can avoid frobbing SO_NBIO in a manner that risks races. | Don Lewis | 24 July 2004 | Done |
Protect all use of so_count with socket lock. | Robert Watson | 24 July 2004 | Done |
Move socket buffer related state from so_state to sb_state so it can be properly locked by the socket buffer mutex. | Robert Watson | 24 July 2004 | Done |
Introduce a temporary global lock to lock the if_label field used by the MAC Framework. | Robert Watson | 24 July 2004 | Done |
Push VFS-specific behavior out of fdrop_locked() and acquire Giant in the fo_close per-object methods rather than fdrop_locked(), so that pipes and sockets can run fo_close() Giant-free. | Robert Watson | 24 July 2004 | Done |
Push Giant acquisition into fo_stat() file descriptor stat operation, rather than acquiring it in fstat(), so that fstat() on sockets and pipes can run Giant-free. | Robert Watson | 24 July 2004 | Done |
Don't hold socket locks over entry to protocol switch methods, allowing protocol methods to acquire socket locks after protocol locks in the lock order. | Robert Watson | 24 July 2004 | Done |
Port inpcb mutex locking, assertions from IPv4 to IPv6. | Robert Watson | 8 August 2004 | Done |
Add IFF_NEEDSGIANT to allow if_start to run with Giant for specific interfaces. Defer if_start to task queue. | Robert Watson | 8 August 2004 | Done |
Push down Giant in stat(), fo_stat() to allow Giant-free stat of pipes, sockets. | Robert Watson | 8 August 2004 | Done |
Add TCP lock assertions. | Robert Watson | 24 November 2004 | Done |
Lock socket layer. | Sam Leffler, Robert Watson | 24 November 2004 | Done |
Review TCP timer code. | Robert Watson | 24 November 2004 | Done |
Analyze and reduce cost of entropy gathering in network critical paths. | Robert Watson, Mark Murray | 24 November 2004 | Done |
Allow code to declare NET_NEEDS_GIANT(), forcing Giant over the network stack if that code is compiled into the kernel. | Robert Watson | 28 August 2004 | Done |
Disable Giant over the network stack in the default configuration. | Robert Watson | 28 August 2004 | Done |
Additional KTR tracing for UMA, callouts, interrupts, etc. | Robert Watson | 7 November 2004 | Done |
Move to using file descriptor reference counts instead of socket reference counts for socket system calls, avoiding extra reference count operations. | Robert Watson | 24 October 2004 | Done |
Lock IPv6. | Sam Leffler, Robert Watson, Hajimu UMEMOTO, Max Laier | 8 August 2004 | In progress |
if_ppp global, per-softc locking. | Robert Watson, Maurycy Pawlowski-Wieronski <maurycy@fouk.org> | 23 April 2004 | In progress |
Lock struct ifnet. | Max Laier, Luigi Rizzo, Maurycy Pawlowski-Wieronski <maurycy@fouk.org> | 23 April 2004 | In progress |
Lock IPv4, IPv6, atalk interface address lists. | Max Laier, Robert Watson | 8 August 2004 | In progress |
Lock consumers of BSD compress (bsd_comp.c) code to protect compression state. | Robert Watson | 23 April 2004 | In progress |
Lock global and softc state for six-to-four converter (if_stf.c). | Robert Watson | 23 April 2004 | In progress |
Lock down global and softc state for SLIP (if_sl.c). | Robert Watson | 23 April 2004 | In progress |
Lock global and softc state for SPPP (if_spppsubr.c). | Roman Kurakin, Robert Watson | 23 April 2004 | In progress |
IGMP locking. | Robert Watson | 23 April 2004 | In progress |
IP ID locking. | Stephan Uphoff | 24 June 2005 | In progress |
Lock down netnatm. | Robert Watson | 1 July 2005 | In progress |
Research and select options for inbound network stack parallelism. Direct dispatch is one option currently being considered. | Robert Watson | 19 October 2005 | In progress |
Locking for polling(4). | Pawel Jakub Dawidek, Gleb Smirnoff | 19 October 2005 | In progress |
Reduce contention upon locking a socket buffer by replacing tsleep() and wakeup() with a condvar. | Seigo Tanimura | 21 April 2002 | Not Started |
Lock if_ef "ethernet frame" driver. | | 9 March 2004 | Not Started |
Further cleanup of the socket state machine in order to facilitate finishing socket locking of state transitions. | Robert Watson | 19 October 2005 | Not Started |
Lock KAME IPSEC. | | 19 October 2005 | Not Started |
Only one of our ATM stacks is MPSAFE; the other two should be deleted or fixed. | | 19 October 2005 | Not Started |
Lock ND6 (IPv6 Neighbor Discovery). | George V. Neville-Neil | 19 October 2005 | Not Started |
Lock IPv6 multicast address lists. | | 19 October 2005 | Not Started |
Lock IPv4 and IPv6 global address lists. | | 19 October 2005 | Not Started |
Continued cleanup of the stack/device driver ownership and locking for struct ifnet needs to be done. Most fields are now either locked or assigned ownership. Some fields, such as if_flags, need a bit more cleanup due to device drivers modifying stack-owned fields. | Robert Watson | 19 October 2005 | Not Started |
BPF locking needs some cleanup; there are some race conditions there relating to interface removal. | | 19 October 2005 | Not Started |
When interfaces are torn down, there are a number of races (not all associated with SMPng) that need to be thought about. | | 19 October 2005 | Not Started |
Lock if_vlan and inter-layer multicast address manipulation and synchronization in if_vlan. | Gleb Smirnoff, Yar Tikhiy | 19 October 2005 | Not Started |
Further investigate locking in in_gif and in6_gif. | | 19 October 2005 | Not Started |
FAST_IPSEC and KAME IPSEC's PF_KEY support likely needs an asynchronous dispatch to prevent socket lock ordering issues similar to what was done for PF_ROUTE. | Robert Watson | 19 October 2005 | Not Started |
Investigate how to eliminate the use of ACCEPT_LOCK(), which currently prevents races in the tear-down of sockets. | Robert Watson | 19 October 2005 | Not Started |
Fix SMP problems with netgraph restructuring. | Gleb Smirnoff | 19 October 2005 | Not Started |
Verify locking in netgraph nodes and improve where necessary. | Gleb Smirnoff | 19 October 2005 | Not Started |
More finely-grained locking for pf(4). | Gleb Smirnoff | 19 October 2005 | Not Started |
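
One of the completed tasks in the table above splits socket buffer calls into `_locked()` and unlocked versions so that no function has to lock conditionally. The following is a minimal userland sketch of that pattern, assuming POSIX threads in place of kernel mutexes. The names `struct buf`, `buf_append()`, `buf_append_locked()`, and `buf_append_twice()` are hypothetical and are not the kernel's socket buffer API.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a socket buffer; not the kernel's struct sockbuf. */
struct buf {
	pthread_mutex_t	mtx;	/* protects bytes */
	int		bytes;	/* amount of data currently queued */
};

/* _locked variant: the caller must already hold b->mtx. */
static void
buf_append_locked(struct buf *b, int n)
{
	assert(n >= 0);
	b->bytes += n;
}

/* Unlocked variant: takes the mutex, calls the _locked variant, releases. */
static void
buf_append(struct buf *b, int n)
{
	pthread_mutex_lock(&b->mtx);
	buf_append_locked(b, n);
	pthread_mutex_unlock(&b->mtx);
}

/*
 * A caller that needs several updates under a single lock acquisition
 * uses the _locked variant directly, so neither variant ever has to
 * test whether the lock is already held.
 */
static int
buf_append_twice(struct buf *b, int n1, int n2)
{
	int total;

	pthread_mutex_lock(&b->mtx);
	buf_append_locked(b, n1);
	buf_append_locked(b, n2);
	total = b->bytes;
	pthread_mutex_unlock(&b->mtx);
	return (total);
}
```

The point of the split is that the locking requirement is part of each function's name and contract, instead of being decided at run time by a flag or by inspecting lock state.
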
Issue | Last updated | Status |
---|---|---|
Idle processor time is not charged to the idle processes. | 20 September 2000 | Resolved |
microuptime creeps backwards. | 4 October 2000 | Resolved |
microuptime() went backwards | 4 October 2000 | Resolved |
Process accounting is not accurate (the more CPUs, the closer to correct it is). | 5 October 2000 | Resolved |
M_DEVBUF is probably the wrong memory pool for interrupt stuff and we should think about creating a new malloc pool for that stuff. | 9 February 2001 | Resolved |
PC card eject panics due to a race condition in the interrupt thread code. | 15 March 2001 | Resolved |
SMP x86 boxes are seeing NCPU * 100 clk interrupts and NCPU * 128 rtc interrupts. | 15 May 2001 | Resolved |
Witness will infinitely recurse when it acquires Giant after sleeping with a sleepable lock. | 27 June 2001 | Resolved |
Serial gdb does not work if boot_ddb and boot_gdb options are specified. | 14 July 2002 | Resolved |
Serial gdb does not work at 115200 baud. | 14 July 2002 | Resolved |
Serial gdb never regains control once 'cont' has been entered. | 14 July 2002 | Resolved |
Profiling is broken. | 20 February 2001 | Unresolved |
The remainder of this page is structured as a reverse-chronological log.
28 August 2004: Robert Watson threw the switch to change the network stack to run without the Giant lock by default, permitting the network stack to be run on multiple CPUs at a time, as well as to preempt and be preempted by other code.
Greg Lehey submitted a FreeBSD SMPng paper to the Asian Enterprise Open Source Conference in Singapore. The paper presents a historical view of SMPng development through 2001, but omits discussion of more recent progress on the SMPng project, such as substantial performance enhancements resulting from extensive lock pushdown in the storage subsystem, VM subsystem, and major IPC subsystems.
A status report was sent to the -smp mailing list.
Greg Lehey has made his USENIX paper available, which he will present in Boston at the end of June.
A status report was sent to the -smp mailing list.
A status report was sent to the -smp mailing list.
A status report was sent to the -smp mailing list.
A status report was sent to the -smp mailing list.
John Baldwin and Chuck Paterson came up with a preliminary list of rules that should be followed when working on kernel synchronization.
The SMP code has been committed. All further work is being done in cvs rather than with patches.
An updated patch is available for download. This patch is probably what will actually get committed.
An updated patch is available for download. This patch makes rtc a fast interrupt, uses locked instructions for mutexes in MP kernels, and corrects mtx_*() linkage within modules.
The code is working for the most part now on i386 (UP and MP). Some additional coding is still necessary for the alpha, which is being done now.
Updated patches for i386 and alpha are available here.
Updated patches for i386 are available here. Process accounting still doesn't work correctly, but a number of other improvements have been made.
Patches with functional heavy-weight threads for the i386 platform are available here. There are a couple of minor issues with this patch set. Specifically, process accounting doesn't work correctly.
Sheldon Hearn has prepared a mutex(9) man page based on the BSD/OS one, which is available here.
Jake Burkholder put an updated patch here.
Jake Burkholder has the BSD/OS lock code working now, and has incorporated the pertinent portions of Matt Dillon's patches (idle processes, some of the schedlock changes, etc.). His patch set is available here.
Chuck Paterson has provided the PostScript versions of his presentation slides for the first day and second day of the SMP meeting.
Here's a copy of the SMP meeting summary that was posted to the -smp mailing list.
Here's a copy of the SMP project announcement that was posted to the -current mailing list.