Bench
The Bench module provides statistical benchmarking capabilities for measuring code performance. It supports warmup iterations and multiple measured runs, and reports comprehensive statistics including mean, median, percentiles, and standard deviation.
Running Benchmarks
Parameters
- name - A descriptive name for the benchmark
- warmup - Number of warmup iterations (not measured)
- iterations - Number of measured iterations
- thunk - A zero-argument function to benchmark
# Basic benchmark with 10 warmup and 100 measured iterations (fib assumed defined elsewhere)
result = Bench.run "fibonacci" 10 100 (fn => fib 30)
println "Mean: ${result.mean_ns} ns"
println "Median: ${result.median_ns} ns"
println "Ops/sec: ${result.ops_per_sec}"
Result Fields
Bench.run returns a BenchResult record with the following fields:
| Field | Type | Description |
|---|---|---|
| name | String | The benchmark name passed to Bench.run |
| iterations | Int | Number of measured iterations performed |
| mean_ns | Int | Mean (average) execution time in nanoseconds |
| median_ns | Int | Median execution time in nanoseconds |
| min_ns | Int | Minimum execution time in nanoseconds |
| max_ns | Int | Maximum execution time in nanoseconds |
| p99_ns | Int | 99th percentile execution time in nanoseconds |
| std_dev | Float | Standard deviation of execution times |
| ops_per_sec | Float | Operations per second based on mean time |
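The fields can also be combined to derive additional figures. A minimal sketch (assuming integer subtraction with - and a fib function defined elsewhere, as in the earlier example):
result = Bench.run "fibonacci" 10 100 (fn => fib 30)
spread = result.max_ns - result.min_ns   # difference between slowest and fastest run
println "${result.name}: ${spread} ns spread across ${result.iterations} runs"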
Examples
Comparing Algorithms
# Define two sorting implementations
bubble-sort = fn(list) =>
  # ... bubble sort implementation
  list

quick-sort = fn(list) =>
  # ... quick sort implementation
  list
# Create test data
test-data = [5, 2, 8, 1, 9, 3, 7, 4, 6]
# Benchmark both implementations
bubble-result = Bench.run "bubble-sort" 5 50 (fn => bubble-sort test-data)
quick-result = Bench.run "quick-sort" 5 50 (fn => quick-sort test-data)
println "Bubble Sort: ${bubble-result.mean_ns} ns"
println "Quick Sort: ${quick-result.mean_ns} ns"
Benchmarking with Different Input Sizes
benchmark-size = fn(n) =>
  data = range 1 n
  Bench.run "sum-${n}" 10 100 (fn => fold (+) 0 data)
# Test with increasing sizes
results = [100, 1000, 10000] |> map benchmark-size
each results (fn(r) =>
  println "${r.name}: ${r.mean_ns} ns (${r.ops_per_sec} ops/sec)"
)
Detailed Statistics Report
result = Bench.run "complex-operation" 20 200 (fn =>
# Some complex computation
[1, 2, 3, 4, 5]
|> map (fn(x) => x * x)
|> filter (fn(x) => x > 5)
|> fold (+) 0
)
println "Benchmark: ${result.name}"
println "Iterations: ${result.iterations}"
println "---"
println "Mean: ${result.mean_ns} ns"
println "Median: ${result.median_ns} ns"
println "Min: ${result.min_ns} ns"
println "Max: ${result.max_ns} ns"
println "P99: ${result.p99_ns} ns"
println "Std Dev: ${result.std_dev}"
println "---"
println "Throughput: ${result.ops_per_sec} ops/sec"
Best Practices
Use warmup iterations to ensure the code being benchmarked has been optimized and caches are warm. A typical starting point is 5-20 warmup iterations depending on code complexity.
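As a sketch of how to gauge whether warmup matters for a given workload, the same thunk can be benchmarked with and without warmup (this assumes a warmup count of 0 is accepted, and reuses the fib function from the earlier example):
cold = Bench.run "fib-cold" 0 100 (fn => fib 30)    # no warmup iterations
warm = Bench.run "fib-warm" 20 100 (fn => fib 30)   # 20 warmup iterations
println "Cold mean: ${cold.mean_ns} ns, warm mean: ${warm.mean_ns} ns"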
More iterations provide more accurate statistics. For stable results, use at least 50-100 measured iterations. For micro-benchmarks of very fast operations, consider 1000+ iterations.
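For example, a micro-benchmark of a trivial expression might use a much higher iteration count than the earlier examples; a small sketch:
# Very fast operation, so use many measured iterations for stable statistics
micro = Bench.run "tiny-add" 100 1000 (fn => 1 + 2)
println "${micro.name}: ${micro.mean_ns} ns mean over ${micro.iterations} iterations"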
The benchmarked function should be pure when possible. Side effects like I/O operations can add significant variance to timing measurements.
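One way to keep side effects out of the measurement is to prepare all input before calling Bench.run and time only the pure computation, as in this sketch built from the range and fold functions used above:
data = range 1 10000                                                # input built once, outside the thunk
pure-result = Bench.run "sum-pure" 10 100 (fn => fold (+) 0 data)   # only the pure fold is timed
println "sum-pure mean: ${pure-result.mean_ns} ns"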
Use median_ns for a robust central tendency measure that is less affected by outliers. Compare p99_ns to understand worst-case performance. A high std_dev indicates variable performance that may need investigation.
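A simple way to act on this guidance is to report the robust and worst-case figures side by side; a sketch reusing the fib example:
tail = Bench.run "fib-tail" 10 200 (fn => fib 25)
println "Typical (median): ${tail.median_ns} ns"
println "Worst case (p99): ${tail.p99_ns} ns"
println "Spread (std_dev): ${tail.std_dev}"
# A p99 far above the median, or a std_dev comparable to the mean, signals unstable timing.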