Snapshot #5, against -CURRENT of 20 Feb 2005
## New snapshot of hardware PMC support code
I am pleased to announce a new snapshot of the hardware performance
counter support code.
Warning: This is pre-alpha code. It may panic and behave nastily.
Please test on a scratch box.
## What's available
You can now answer the question "what are the hardware events
happening on this system?" on the following CPUs:
- AMD Athlon64/Opteron
- AMD Athlon
- Intel P4 and P4/HTT processors
(Support for answering the next question, namely "which parts of the
code are related to these events?", is being worked on.)
## Code components
- A kernel driver pmc(4).
- A userland library ("libpmc", see pmc(3)) to access the driver.
- Userland utilities to use the driver (pmcstat(8) and
- Documentation in the form of manual pages.
## What can it do today?
- Measure a whole bunch of hardware events. See the documentation
- Supported PMC kinds:
(a) Process-virtual PMCs: these PMCs count hardware events only
when their target process is scheduled on a CPU,
(b) System-wide PMCs: these PMCs count hardware events for
the system as a whole.
- Supported PMC modes:
(a) "Counting mode" PMCs: these PMCs only count events, and do not
sample the instruction pointer.
"Sampling mode" PMCs are being worked on.
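The difference between the two PMC kinds can be illustrated with a toy model. The sketch below is not part of the snapshot and does not touch the pmc(4) driver. It assumes a made-up trace format with one scheduling slice per line ("PID EVENTS"), and `count_events` is a hypothetical helper name.

```shell
#!/bin/sh
# Toy model only: reads "PID EVENTS" lines (one per scheduling slice) on
# stdin and prints the total a counter of the given kind would report.
#
# usage: count_events MODE TARGET_PID
#   MODE "system": every slice is counted (system-wide PMC).
#   any other MODE: only slices where TARGET_PID was running are counted
#   (process-virtual PMC).
count_events() {
    mode=$1
    target=$2
    total=0
    while read -r pid events; do
        if [ "$mode" = system ] || [ "$pid" = "$target" ]; then
            total=$((total + events))
        fi
    done
    echo "$total"
}

# Hypothetical trace: PID 1884 runs twice, PID 25 runs once in between.
trace='1884 10
25 5
1884 7'

printf '%s\n' "$trace" | count_events system ""      # prints 22
printf '%s\n' "$trace" | count_events process 1884   # prints 17
```

The system-wide total includes the events from the slice belonging to PID 25, while the process-virtual total for PID 1884 does not.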
## Using the code
- Download the patch.
- Apply it to a freshly checked out -CURRENT source.
# cd /usr/src
# patch -p1 < PATCH-FILE
- Update 'world'.
- Add "options PMC_HOOKS" to your kernel config file, rebuild the
kernel, and reboot into the new kernel.
- Load the new kernel module and start using it.
# kldload pmc
- Example 1: Measure the TLB miss behaviour of 'firefox' on an
AMD Athlon. Print counts every 1 second.
% ps -ax | grep firefox
1884 v0 S 0:04.59 /usr/X11R6/lib/firefox/lib/firefox-0.9.3/firefox-bin
'firefox' is already running so we attach to it using the '-t
TARGET' option. The '-w 1' option specifies the desired interval.
% pmcstat -p k7-l1-dtlb-miss-and-l2-dtlb-hits -p k7-l1-and-l2-dtlb-misses \
-w 1 -t 1884
# p/k7-l1-dtlb-miss-and-l2-dtlb-hits p/k7-l1-and-l2-dtlb-misses
Clearly this program can stress the TLB!
- Example 2: Measure the cycles during which interrupts were masked
while the ATA driver's interrupt handling thread was executing,
over a run of the 'diskinfo' command.
We need to be root to do this:
amd64# ps -ax | grep ata
25 ?? WL 0:00.25 [irq14: ata0]
26 ?? WL 0:00.00 [irq15: ata1]
31 ?? WL 0:00.00 [irq20: atapci0]
We set up pmcstat(8) to count cycles spent with the processor's IF
bit cleared while the ata0 thread (pid 25) is executing.
amd64# diskinfo -c ad0 > /dev/null & \
pmcstat -p k8-fr-interrupts-masked-while-pending-cycles -t 25 -w 1
- Example 3: Measure the total number of interrupts seen by the
system while a particular command was executing. Also count the
number of cycles the CPU's IF bit was zero when the command was
scheduled on a CPU.
amd64# pmcstat -p k8-fr-interrupts-masked-while-pending-cycles \
-s k8-fr-taken-hardware-interrupts -w 1 diskinfo -c ad0 > /dev/null
# p/k8-fr-interrupts-masked-while-pending-cycles s/k8-fr-taken-hardware-interrupts
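The option usage in the examples above follows one pattern: '-p EVENT' for a process-virtual counter, '-s EVENT' for a system-wide counter, '-w SECS' for the print interval and '-t PID' to attach to a running process. The sketch below only assembles and prints such a command line; it does not run pmcstat(8). `build_pmcstat_cmd` and its "p:EVENT" / "s:EVENT" argument syntax are hypothetical conveniences, not part of the snapshot.

```shell
#!/bin/sh
# Sketch: assemble (but do not run) a pmcstat(8) command line using only
# the options that appear in the examples: -p, -s, -w and -t.
#
# usage: build_pmcstat_cmd INTERVAL PID SPEC...
#   SPEC is "p:EVENT" (process-virtual) or "s:EVENT" (system-wide).
build_pmcstat_cmd() {
    interval=$1
    pid=$2
    shift 2
    cmd="pmcstat"
    for spec in "$@"; do
        case $spec in
            p:*) cmd="$cmd -p ${spec#p:}" ;;
            s:*) cmd="$cmd -s ${spec#s:}" ;;
            *)   echo "unknown spec: $spec" >&2; return 1 ;;
        esac
    done
    echo "$cmd -w $interval -t $pid"
}

# Prints the command line used in Example 1 (PID 1884 is taken from there).
build_pmcstat_cmd 1 1884 \
    p:k7-l1-dtlb-miss-and-l2-dtlb-hits \
    p:k7-l1-and-l2-dtlb-misses
```

Printing the command instead of executing it makes it easy to review the event names before running the result as root on a test machine.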
## Known Bugs
- The P4 HTT code is prone to freezing or panicking. If you turn
off HTT, the P4 code works fine.
- Sampling mode support is incomplete. If you allocate and start a
sampling mode PMC, you'll get an NMI (if you are lucky).
## Next Steps (in no particular order)
Please contact me if you would like to take up any of these.
- Implement sampling modes.
- Support Intel P-Pro and Pentium MMX PMC implementations.
- Test suites.
- A number of Intel P4 specific features (precise sampling,
PMC cascading, etc.) remain to be implemented.
- A port of PAPI.
- Userland tools:
  - use PMC based instruction pointer sampling with
  - enhance our profiling support code to use the ability to read
    process-mode PMC counts with the RDPMC instruction.
  - convert sampling mode output to gprof format.
  - create a tool that can correlate measured cache/tlb/etc.
    behaviour with data structure layout and code layout.
- Write documentation suitable for /usr/share/doc/papers/.
Sat Apr 21 22:53:24 2007