Runtime Benchmarks
This page describes the results of benchmarking the performance of code generated by `bon`'s builder macros.
Builder macros generate code that the compiler can optimize easily. This is tested by the benchmarks below, which compare regular positional function call syntax with builder syntax for functions annotated with `#[builder]`.
In many cases `rustc` generates the same assembly code for the builder syntax as it would for a regular function call. Even when the generated assembly differs, the performance differences are negligible.
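To make the two call styles concrete, here is a minimal sketch. It is a hand-written stand-in, not the code that `#[builder]` generates: `volume`, `VolumeBuilder`, and `volume_builder` are hypothetical names, and the builder simply defaults unset arguments to zero, which a real builder would not do. It only illustrates why such a chain is easy for the optimizer to collapse: each setter moves a value into a struct, and the final call forwards the fields to the plain function.

```rust
// Hand-written stand-in for illustration only; this is NOT the code that
// `#[builder]` generates. All names here are hypothetical.

/// Plain function called with positional arguments.
fn volume(width: u32, height: u32, depth: u32) -> u32 {
    width * height * depth
}

/// Minimal builder that collects the same arguments by name.
/// Simplification: unset arguments silently default to 0.
#[derive(Default)]
struct VolumeBuilder {
    width: u32,
    height: u32,
    depth: u32,
}

impl VolumeBuilder {
    #[inline(always)]
    fn width(mut self, value: u32) -> Self {
        self.width = value;
        self
    }

    #[inline(always)]
    fn height(mut self, value: u32) -> Self {
        self.height = value;
        self
    }

    #[inline(always)]
    fn depth(mut self, value: u32) -> Self {
        self.depth = value;
        self
    }

    /// Forwards the collected arguments to the positional function.
    #[inline(always)]
    fn call(self) -> u32 {
        volume(self.width, self.height, self.depth)
    }
}

/// Entry point for the builder syntax.
fn volume_builder() -> VolumeBuilder {
    VolumeBuilder::default()
}

fn main() {
    // Regular positional call.
    let regular = volume(2, 3, 4);
    // Equivalent call written in builder style.
    let built = volume_builder().width(2).height(3).depth(4).call();
    assert_eq!(regular, built);
    println!("regular = {regular}, builder = {built}");
}
```

Because every setter takes and returns the builder by value and is trivially inlinable, an optimizing build has no obstacle to reducing the chain to the same argument passing as the positional call.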
TIP
Don't take these microbenchmarks at face value. Measure performance yourself, in your own application and under real conditions. Feel free to open an issue if you find performance problems in `bon`.
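As a starting point for your own measurements, a rough timing loop can be built from the standard library alone. This is only a sketch, not the harness used for the numbers on this page: `sum3` and `time_it` are hypothetical names, and `sum3` stands in for whatever code you want to measure. A dedicated benchmarking harness will give more reliable statistics.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical function under test; replace it with your own code.
fn sum3(a: u64, b: u64, c: u64) -> u64 {
    a.wrapping_mul(31)
        .wrapping_add(b)
        .wrapping_mul(31)
        .wrapping_add(c)
}

/// Runs `f` exactly `iters` times and returns the total elapsed time.
/// `black_box` discourages the compiler from discarding the results.
fn time_it<F: FnMut() -> u64>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed()
}

fn main() {
    let iters = 1_000_000;
    // Wrap the inputs in `black_box` too, so they are not treated as constants.
    let elapsed = time_it(iters, || sum3(black_box(1), black_box(2), black_box(3)));
    println!("{:?} per call", elapsed / iters);
}
```

Build with optimizations enabled (for example `cargo run --release`) and repeat the run several times, since a single measurement of a nanosecond-scale function is dominated by noise.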
Wallclock Statistics
Benchmark | Description | Assembly output | Run time (regular) | Run time (builder) |
---|---|---|---|---|
args_3 | 3 args of primitive types | Equal | 6.2751 ns | 6.3021 ns |
args_5 | 5 args of primitive types | Equal | 7.8298 ns | 7.8321 ns |
args_10 | 10 args of primitive types | Ordering diff | 17.322 ns | 17.178 ns |
args_10_structs | 10 args of primitive types and structs | Instructions diff | 2.7477 ns | 2.7311 ns |
args_10_alloc | 10 args of primitive and heap-allocated types | Instructions diff | 91.666 ns | 84.818 ns (*) |
args_20 | 20 args of primitive types | Equal | 36.467 ns | 36.786 ns |
(*) Interestingly, the builder version performed even better in this case. If you don't believe this, you can run the benchmarks yourself. Maybe some ASM expert could explain this 😳?
High-Precision Statistics
Benchmark | Instructions count (regular / builder) | L1 accesses (regular / builder) | L2 accesses (regular / builder) | RAM accesses (regular / builder) |
---|---|---|---|---|
args_3 | 71 / 71 | 81 / 81 | 1 / 1 | 10 / 9 |
args_5 | 89 / 89 | 111 / 111 | 0 / 0 | 10 / 10 |
args_10 | 206 / 206 | 269 / 268 | 0 / 0 | 20 / 21 |
args_10_structs | 20 / 20 | 29 / 28 | 0 / 0 | 5 / 6 |
args_10_alloc | 1830 / 1829 | 2555 / 2554 | 1 / 1 | 36 / 36 |
args_20 | 414 / 414 | 548 / 547 | 0 / 0 | 46 / 47 |
Conditions
The code was compiled with `opt-level = 3` and `debug = 0`.
Hardware
The benchmarks were run on a dedicated root server AX51-NVMe on Hetzner.
- OS: Ubuntu 22.04.4 (Linux 5.15.0-76-generic)
- CPU: AMD Ryzen 7 3700X 8-Core Processor (x86_64)
- RAM: 62.8 GiB
References
The source code of the benchmarks is available here.