Skip to main content

Microkernel Performance Metrics

Traditionally, the performance metric people care most with microkernels is the basic IPC cost, i.e. the cost for sending a single, minimal-length message betwe en two threads. While it can be debated how meaningful such a microbenchmark is, IPC is the most critical operation of a microkernel.

All figures shown here are measured under the following circumstances:

  • hot cache (tight loop)
  • one way (half of round-trip)
  • inter address space (requiring a full switch of addressing context)

Data

Kernel Architecture Processor Clock Latency Time Measured Source
MHz Cycles ns by year
seL4 Intel x86 (32 bit) Core i7 4770 (Haswell) 3,400 301 89 NICTA 2013 SOSP'13
seL4 ARM v7 Cortex A9 1,000 316 316 NICTA 2013 SOSP'13
seL4 ARMv6 ARM11 532 188 353 NICTA 2013 SOSP'13
NOVA(1) x86 (32 bit) Core i7 920 (Bloomfield) 2,667 288 108 TU Dresden 2010 EuroSys'10
okL4 ARMv5 XScale 255 400 151 378 NICTA 2007 NICTA
okL4 ARM v4 StrongARM SA1100 206 131 635 NICTA 2007 NICTA
L4KA:Pistachio IA64 Itanium 2 1,500 36 24 NICTA 2005 Usenix'05
L4KA:Pistachio x86 (64 bit) AMD Opteron 242 1,600 230 144 University of Karlsruhe 2004? L4Ka site
Hazelnut x86 (32 bit) Pentium 4 1,400 1,008 720 University of Karlsruhe 2002 L4Ka site
Hazelnut x86 (32 bit) Pentium II 400 273 683 University of Karlsruhe 2002 L4Ka site
L4/MIPS(2) MIPS64 R4700 100 86 860 UNSW 1997 HotOS'97
L4/Alpha(2) Alpha 21064 433 45 104 TU Dresden 1997 HotOS'97
Original x86 Pentium 166 121 756 Liedtke 1997 HotOS'97
Original x86 i486 50 250 5,000 Liedtke 1993 SOSP'93

Notes:

  1. NOVA does not support the standard GCC version of thread local storage (using segment registers). Instead it supports TLS via a general-purpose register (compiler options -ffixed-rbp and -ftls-model=global-dynamic). This reduces context-switch costs at the expense of slightly reduced application performance.
  2. The MIPS and Alpha figures published in HotOS'97 were for incomplete kernel implementations. Actual costs for the fully functional kernels were higher (100 cycles on MIPS and 70–80 cycles on Alpha).
Served by Apache on Linux on seL4.