Memory allocation performance on FreeBSD and Linux with ebizzy (Mar 2008)

ebizzy is a benchmark intended to simulate the back end of a busy database. It allocates memory, searches it, and copies it, with multiple threads doing this work concurrently.
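The allocate/copy/search pattern described above can be sketched in C as follows. This is a minimal illustration and not ebizzy's actual source: the names run_workload, worker_main, struct worker and MAX_THREADS are hypothetical, and the linear memchr search stands in for whatever search ebizzy really performs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_THREADS 64          /* hypothetical upper bound for this sketch */

struct worker {
    size_t chunk_size;          /* bytes per allocation */
    unsigned iterations;        /* allocate/copy/search rounds to run */
    unsigned long hits;         /* searches that found the sentinel (output) */
};

/* Each thread repeatedly allocates a chunk, copies data into it,
 * searches the copy, and frees it again. */
static void *worker_main(void *arg)
{
    struct worker *w = arg;
    unsigned char *src = malloc(w->chunk_size);
    if (src == NULL)
        return NULL;
    memset(src, 0, w->chunk_size);
    src[w->chunk_size - 1] = 0xFF;      /* sentinel byte to search for */

    for (unsigned i = 0; i < w->iterations; i++) {
        unsigned char *copy = malloc(w->chunk_size);
        if (copy == NULL)
            break;
        memcpy(copy, src, w->chunk_size);
        if (memchr(copy, 0xFF, w->chunk_size) != NULL)
            w->hits++;
        free(copy);
    }
    free(src);
    return NULL;
}

/* Run the workload on nthreads threads and return the total hit count. */
unsigned long run_workload(unsigned nthreads, size_t chunk_size,
                           unsigned iterations)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS && chunk_size > 0);

    pthread_t tids[MAX_THREADS];
    struct worker workers[MAX_THREADS];
    unsigned started = 0;
    unsigned long total = 0;

    for (unsigned i = 0; i < nthreads; i++) {
        workers[i] = (struct worker){ chunk_size, iterations, 0 };
        if (pthread_create(&tids[i], NULL, worker_main, &workers[i]) != 0)
            break;
        started++;
    }
    for (unsigned i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += workers[i].hits;
    }
    return total;
}
```

Calling run_workload(8, 256 * 1024, 1000) would exercise the allocator from 8 threads with 256KB chunks. Timing such a call at different thread counts gives a rough view of allocator scalability, though it is no substitute for ebizzy itself.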

FreeBSD 7.0 includes a new, highly scalable memory allocator, jemalloc. FreeBSD was compared against the Linux 2.6.24 kernel with glibc 2.7, as provided by Fedora 8. On FreeBSD 7.0 and above, ebizzy performs very well on 8-core systems and far outperforms Linux with the glibc memory allocator. The following graph compares FreeBSD 7.0 and Linux 2.6.24+glibc 2.7 on an 8-core Intel Xeon system running in 32-bit mode.

(This graph shows FreeBSD 8.0, but FreeBSD 7.0 does not perform substantially differently.)

By default, ebizzy allocates memory in 256KB chunks. FreeBSD 7.0 has excellent performance out of the box, scaling linearly to 8 CPUs with no degradation at higher load. Linux 2.6.24 scales unevenly on this benchmark: throughput reaches a local maximum at 8 threads, dips substantially just above 8, and then continues to rise towards 20 threads. That continued rise well past the number of cores suggests that the Linux+glibc memory allocator is not maximizing its throughput on this test.

With a 1MB chunk size, both Linux and FreeBSD perform substantially worse by default. On FreeBSD, the reason is that 1MB is a threshold allocation size at and beyond which jemalloc begins to involve the kernel in memory management, which imposes substantial overhead and reduces scalability. In most situations this is acceptable because such large allocations are rare. However, in workloads that routinely perform such large allocations, jemalloc can be tuned at runtime to manage the memory in userland.
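The threshold behaviour can be pictured with a small model. This is a simplified sketch and not jemalloc's code: the function name needs_kernel and the constant DEFAULT_THRESHOLD are hypothetical, and treating the threshold as an exact "greater than or equal" boundary is an assumption, since the real cutoff inside jemalloc may differ slightly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* 1MB, the default threshold named in the text (modelled, not read from jemalloc). */
#define DEFAULT_THRESHOLD ((size_t)1 << 20)

/* Model only: a request at or above the threshold is assumed to be
 * served with kernel involvement; smaller requests stay in userland. */
bool needs_kernel(size_t request, size_t threshold)
{
    assert(threshold > 0);
    return request >= threshold;
}
```

Under this model, ebizzy's default 256KB chunks stay below the 1MB threshold, 1MB chunks cross it, and raising the threshold to 2MB would bring 1MB requests back into userland management.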

In this case, setting the environment variable MALLOC_OPTIONS=K is sufficient to restore performance. See the malloc.conf(5) manpage for more details. It is not known whether glibc malloc can be tuned at runtime to provide better performance.
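One way to apply the setting to a single benchmark run, without changing the environment of the calling shell, is a small launcher such as the one below. It is a sketch under stated assumptions: the helper name run_with_malloc_options is hypothetical, and the command line passed in argv (for example, an ebizzy invocation) is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up on PATH) with MALLOC_OPTIONS set to `opts`
 * in the child process only. Returns the child's exit status, or -1
 * if the child could not be started or did not exit normally. */
int run_with_malloc_options(const char *opts, char *const argv[])
{
    assert(opts != NULL && argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: set the variable, then replace ourselves with the program. */
        if (setenv("MALLOC_OPTIONS", opts, 1) != 0)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The variable is read by the allocator when the child program starts, so it has to be in place before exec. On systems whose allocator does not recognise MALLOC_OPTIONS, the variable is expected to have no effect.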