Runtime Benchmarks
This page presents benchmark results for the code generated by bon's builder macros.
Builder macros generate code that the compiler can optimize easily, as the benchmarks below confirm. The benchmarks compare regular positional function call syntax with builder syntax for functions annotated with #[builder].
In many cases rustc generates the same assembly code for the builder syntax as it would for a regular function call. Even when the generated assembly differs, the performance differences are negligible.
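To make the comparison concrete, here is a minimal sketch of the two call styles. It uses only the standard library: `regular`, `RegularBuilder`, and `builder_style` are hypothetical names, and the builder is a hand-written stand-in, not the code that #[builder] actually generates. It only illustrates why a chain of trivial setters that forwards to the original function leaves the optimizer little extra work.

```rust
// Hypothetical function under test; not taken from bon's benchmark suite.
fn regular(x1: u32, x2: u64, x3: bool) -> u64 {
    let base = u64::from(x1).wrapping_mul(31).wrapping_add(x2);
    if x3 { base.wrapping_add(1) } else { base }
}

// Hand-written stand-in for a builder (an assumption for illustration).
// bon's real generated code differs; this only shows the general shape:
// each setter stores one argument and the finishing call forwards them.
#[derive(Default)]
struct RegularBuilder {
    x1: Option<u32>,
    x2: Option<u64>,
    x3: Option<bool>,
}

impl RegularBuilder {
    #[inline(always)]
    fn x1(mut self, value: u32) -> Self {
        self.x1 = Some(value);
        self
    }

    #[inline(always)]
    fn x2(mut self, value: u64) -> Self {
        self.x2 = Some(value);
        self
    }

    #[inline(always)]
    fn x3(mut self, value: bool) -> Self {
        self.x3 = Some(value);
        self
    }

    // Finishing call: forwards the collected arguments to `regular`.
    #[inline(always)]
    fn call(self) -> u64 {
        regular(
            self.x1.expect("x1 was not set"),
            self.x2.expect("x2 was not set"),
            self.x3.expect("x3 was not set"),
        )
    }
}

// Builder-style call with named setters instead of positional arguments.
fn builder_style(x1: u32, x2: u64, x3: bool) -> u64 {
    RegularBuilder::default().x1(x1).x2(x2).x3(x3).call()
}

fn main() {
    let positional = regular(4, 5, true);
    let built = builder_style(4, 5, true);
    // Both call styles must compute the same value.
    assert_eq!(positional, built);
    println!("positional = {positional}, builder = {built}");
}
```

Because every setter is a small inlinable function, the whole chain can in principle be folded into a direct call. Whether that happens in a given program depends on the compiler and the surrounding code, which is what the benchmarks below measure.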
TIP
Don't rely on these microbenchmarks alone. Measure performance in your own application under real conditions. Feel free to open an issue if you find performance problems in bon.
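As a starting point for your own measurements, the sketch below times a closure with the standard library only. The names `mean_time` and `sum_args` are hypothetical, and this is not the harness used to produce the numbers on this page. It is a rough wall-clock estimate under the assumption that the workload is deterministic; a dedicated benchmarking harness will give more reliable statistics.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `iters` times and returns the mean wall-clock time of one call.
/// Hypothetical helper for illustration only.
fn mean_time<T>(iters: u32, mut f: impl FnMut() -> T) -> Duration {
    assert!(iters > 0, "iters must be non-zero");

    // Short warm-up so caches and branch predictors settle.
    for _ in 0..iters / 10 {
        black_box(f());
    }

    let start = Instant::now();
    for _ in 0..iters {
        // `black_box` keeps the compiler from discarding the result.
        black_box(f());
    }
    start.elapsed() / iters
}

// Hypothetical workload; replace it with the call you want to measure,
// once with positional syntax and once with builder syntax.
fn sum_args(a: u32, b: u32, c: u32) -> u32 {
    a.wrapping_add(b).wrapping_add(c)
}

fn main() {
    let iters = 1_000_000;
    // `black_box` on the inputs prevents constant folding of the call.
    let per_call = mean_time(iters, || {
        sum_args(black_box(1), black_box(2), black_box(3))
    });
    println!("mean time per call: {per_call:?}");
}
```

Build with optimizations enabled (for example `cargo run --release`) before reading anything into the numbers, and run each variant several times, since single runs are noisy.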
Wallclock Statistics
| Benchmark | Description | Assembly output | Run time |
|---|---|---|---|
| args_3 | 3 args of primitive types | Equal | regular: 6.2751 ns<br>builder: 6.3021 ns |
| args_5 | 5 args of primitive types | Equal | regular: 7.8298 ns<br>builder: 7.8321 ns |
| args_10 | 10 args of primitive types | Ordering diff | regular: 17.322 ns<br>builder: 17.178 ns |
| args_10_structs | 10 args of primitive types and structs | Instructions diff | regular: 2.7477 ns<br>builder: 2.7311 ns |
| args_10_alloc | 10 args of primitive and heap-allocated types | Instructions diff | regular: 91.666 ns<br>builder: 84.818 ns (*) |
| args_20 | 20 args of primitive types | Equal | regular: 36.467 ns<br>builder: 36.786 ns |
(*) Interestingly, in this case the builder version performed even better. If you don't believe this, you can run these benchmarks yourself. Maybe some ASM expert could explain this 😳?
High-Precision Statistics
| Benchmark | Instructions count | L1 accesses | L2 accesses | RAM accesses |
|---|---|---|---|---|
| args_3 | regular: 71<br>builder: 71 | regular: 81<br>builder: 81 | regular: 1<br>builder: 1 | regular: 10<br>builder: 9 |
| args_5 | regular: 89<br>builder: 89 | regular: 111<br>builder: 111 | regular: 0<br>builder: 0 | regular: 10<br>builder: 10 |
| args_10 | regular: 206<br>builder: 206 | regular: 269<br>builder: 268 | regular: 0<br>builder: 0 | regular: 20<br>builder: 21 |
| args_10_structs | regular: 20<br>builder: 20 | regular: 29<br>builder: 28 | regular: 0<br>builder: 0 | regular: 5<br>builder: 6 |
| args_10_alloc | regular: 1830<br>builder: 1829 | regular: 2555<br>builder: 2554 | regular: 1<br>builder: 1 | regular: 36<br>builder: 36 |
| args_20 | regular: 414<br>builder: 414 | regular: 548<br>builder: 547 | regular: 0<br>builder: 0 | regular: 46<br>builder: 47 |
Conditions
The code was compiled with opt-level = 3 and debug = 0.
Hardware
The benchmarks were run on a Hetzner AX51-NVMe dedicated root server.
- OS: Ubuntu 22.04.4 (Linux 5.15.0-76-generic)
- CPU: AMD Ryzen 7 3700X 8-Core Processor (x86_64)
- RAM: 62.8 GiB
References
The source code of the benchmarks is available here.