Runtime Benchmarks
This page describes the results of benchmarking the performance of code generated by bon's builder macros.
Builder macros generate code that the compiler can optimize easily. The benchmarks below test this claim by comparing regular positional function call syntax with builder syntax for functions annotated with #[builder].
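To make the comparison concrete, here is a minimal sketch of the two call styles. The builder is written by hand so that the example compiles without any dependencies; it is a simplified stand-in, not the code that bon's macro actually expands to, and the names `area_sum`, `AreaSumBuilder`, and `area_sum_builder` are made up for illustration. With bon itself, annotating `area_sum` with `#[builder]` would let you write `area_sum().width(2).height(3).depth(4).call()` instead.

```rust
/// Regular function with positional arguments (hypothetical example).
fn area_sum(width: u32, height: u32, depth: u32) -> u32 {
    width * height + depth
}

/// Hand-written, simplified stand-in for a generated builder.
/// A type parameter is `()` while the field is unset and `u32` once it is set.
struct AreaSumBuilder<W, H, D> {
    width: W,
    height: H,
    depth: D,
}

/// Entry point of the builder syntax: nothing is set yet.
fn area_sum_builder() -> AreaSumBuilder<(), (), ()> {
    AreaSumBuilder { width: (), height: (), depth: () }
}

// Each setter consumes the builder by value and returns a new one
// with the corresponding type parameter switched from `()` to `u32`.
impl<H, D> AreaSumBuilder<(), H, D> {
    fn width(self, width: u32) -> AreaSumBuilder<u32, H, D> {
        AreaSumBuilder { width, height: self.height, depth: self.depth }
    }
}

impl<W, D> AreaSumBuilder<W, (), D> {
    fn height(self, height: u32) -> AreaSumBuilder<W, u32, D> {
        AreaSumBuilder { width: self.width, height, depth: self.depth }
    }
}

impl<W, H> AreaSumBuilder<W, H, ()> {
    fn depth(self, depth: u32) -> AreaSumBuilder<W, H, u32> {
        AreaSumBuilder { width: self.width, height: self.height, depth }
    }
}

// `call` exists only once all three fields are set, so a missing
// argument is a compile error rather than a runtime check.
impl AreaSumBuilder<u32, u32, u32> {
    fn call(self) -> u32 {
        area_sum(self.width, self.height, self.depth)
    }
}

fn main() {
    // Regular positional call.
    let regular = area_sum(2, 3, 4);
    // Builder-style call with named setters.
    let built = area_sum_builder().width(2).height(3).depth(4).call();
    assert_eq!(regular, built);
    println!("regular = {regular}, builder = {built}");
}
```

Every setter here only moves values from one struct into another, which is the kind of code an optimizer can usually inline away until just the final function call remains.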
In many cases rustc generates the same assembly code for the builder syntax as it would for a regular function call. Even when the generated assembly differs, the performance differences are negligible.
TIP
Don't take these microbenchmark results for granted. Measure performance in your own application under real conditions. Feel free to open an issue if you find performance problems in bon.
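As a starting point for your own measurements, the following is a rough sketch of a wall-clock timing loop that uses only the standard library. The function `sum_of_squares` and the helper `time_it` are hypothetical placeholders; substitute the code path you care about. It is not the harness used for the numbers on this page, and a dedicated benchmarking tool will give more reliable statistics.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical function under test; replace it with your own code.
fn sum_of_squares(a: u64, b: u64, c: u64) -> u64 {
    a * a + b * b + c * c
}

/// Runs `f` `iters` times and returns the total elapsed wall-clock time.
/// `black_box` discourages the optimizer from discarding the result.
fn time_it<F: FnMut() -> u64>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed()
}

fn main() {
    let iters = 1_000_000;
    // `black_box` on the inputs keeps the call from being constant-folded.
    let elapsed = time_it(iters, || {
        sum_of_squares(black_box(3), black_box(4), black_box(5))
    });
    println!("average per call: {:?}", elapsed / iters);
}
```

Build with optimizations enabled (for example `cargo run --release`) and run the loop several times; a single run of a loop like this is sensitive to noise from the rest of the system.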
Wallclock Statistics
| Benchmark | Description | Assembly output | Run time (regular) | Run time (builder) |
|---|---|---|---|---|
| args_3 | 3 args of primitive types | Equal | 6.2751 ns | 6.3021 ns |
| args_5 | 5 args of primitive types | Equal | 7.8298 ns | 7.8321 ns |
| args_10 | 10 args of primitive types | Ordering diff | 17.322 ns | 17.178 ns |
| args_10_structs | 10 args of primitive types and structs | Instructions diff | 2.7477 ns | 2.7311 ns |
| args_10_alloc | 10 args of primitive and heap-allocated types | Instructions diff | 91.666 ns | 84.818 ns (*) |
| args_20 | 20 args of primitive types | Equal | 36.467 ns | 36.786 ns |
(*) Interestingly, in this case the builder version performed even better. If you don't believe this, you can run these benchmarks yourself. Maybe some ASM expert could explain this 😳?
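To put the run times in perspective, the relative difference between the two variants can be computed directly from the table above. The sketch below does this arithmetic; the helper name `relative_diff_percent` is made up for illustration. Worked by hand, every benchmark lands within about 1% in either direction except args_10_alloc, where the builder version comes out roughly 7.5% faster.

```rust
/// Relative difference of the builder run time versus the regular one, in percent.
/// Negative values mean the builder version was faster.
fn relative_diff_percent(regular_ns: f64, builder_ns: f64) -> f64 {
    (builder_ns - regular_ns) / regular_ns * 100.0
}

fn main() {
    // (benchmark, regular ns, builder ns), copied from the wallclock table.
    let rows = [
        ("args_3", 6.2751, 6.3021),
        ("args_5", 7.8298, 7.8321),
        ("args_10", 17.322, 17.178),
        ("args_10_structs", 2.7477, 2.7311),
        ("args_10_alloc", 91.666, 84.818),
        ("args_20", 36.467, 36.786),
    ];

    for (name, regular, builder) in rows {
        // Print the signed percentage with two decimal places.
        println!("{name:<16} {:+.2}%", relative_diff_percent(regular, builder));
    }
}
```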
High-Precision Statistics
| Benchmark | Instructions count (regular / builder) | L1 accesses (regular / builder) | L2 accesses (regular / builder) | RAM accesses (regular / builder) |
|---|---|---|---|---|
| args_3 | 107 / 107 | 134 / 134 | 1 / 1 | 8 / 8 |
| args_5 | 125 / 125 | 164 / 164 | 1 / 1 | 7 / 7 |
| args_10 | 283 / 283 | 382 / 383 | 4 / 2 | 18 / 19 |
| args_10_structs | 22 / 22 | 30 / 31 | 2 / 1 | 5 / 5 |
| args_10_alloc | 2038 / 2037 | 2839 / 2837 | 1 / 1 | 33 / 34 |
| args_20 | 557 / 557 | 775 / 775 | 1 / 1 | 32 / 32 |
Conditions
The code was compiled with opt-level = 3 and debug = 0.
Hardware
The benchmarks were run on a dedicated AX51-NVMe root server at Hetzner.
- OS: Ubuntu 22.04.4 (Linux 5.15.0-76-generic)
- CPU: AMD Ryzen 7 3700X 8-Core Processor (x86_64)
- RAM: 62.8 GiB
References
The source code of the benchmarks is available here.