It's the most wonderful time of the year... for benchmarking!
I've been brushing up on Rust, doing some of this year's Advent of Code for fun.
I'm somewhat comfortable in Rust, but wanted to take a look at the performance optimization tooling it offers, with a simple comparison of a naive solution to an optimized solution.
Rust has benchmarking tooling with the cargo bench
command, but it wasn't clear to me how to use it at first.
I started by looking at the built-in benchmark tests, but quickly realized I would need to be on the latest Rust nightly, which I didn't really feel like fussing with.
criterion
was the next thing I found, which looked perfect... and indeed it was!
If you're not familiar with Advent of Code, there's a new coding challenge every day in December. It's fun and creative, weaving a story arc into the problems every year. Today I was working on day 6, where the elves need you to fix their communication device by identifying the first set of 14 unique characters in a sequence. A classic sliding window problem! You're given a string of 4096 characters, and you have to find the character that marks the end of the first window of 14 unique characters. Let's see how it went, comparing the time complexity of each step and seeing how that equates to actual time.
My naive solution:

1. Iterate over the input string - O(n).
2. At each position, fill a HashSet (a data structure containing only unique values, similar to Set in JavaScript) with the 14 characters - O(m) time each iteration, m being 14 in this case.
3. Check the length of the HashSet - O(1) time.

You'll notice we have a nested loop - we iterate over the input string, and at each iteration we iterate 14 times to fill the HashSet.
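In code, that looks roughly like the sketch below. The names (solve_naive, the WINDOW constant) and the input handling are my own here, not necessarily what my actual solution looked like:

```rust
use std::collections::HashSet;

const WINDOW: usize = 14;

// Returns the 1-based position of the character that ends the first
// window of 14 unique characters, or None if there isn't one.
fn solve_naive(input: &[char]) -> Option<usize> {
    // Outer loop: O(n) over every 14-character window.
    for (start, window) in input.windows(WINDOW).enumerate() {
        // Inner work: O(m) to fill a HashSet with the 14 characters.
        let unique: HashSet<char> = window.iter().copied().collect();
        // O(1) length check: 14 unique characters means we found the marker.
        if unique.len() == WINDOW {
            return Some(start + WINDOW);
        }
    }
    None
}
```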
In this case O(n * m) isn't as bad as O(n²), and it isn't as naive as it could be, but it can still be optimized.
The question is, by how much?
I set to work adding criterion
to the project.
It was surprisingly easy; within a few minutes I was up and running.
As you'll soon see, criterion
runs your code many, many times collecting performance metrics, and then gives you a breakdown.
I used the default options, but it's also very configurable.
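Setup amounts to adding criterion under [dev-dependencies] in Cargo.toml, plus a [[bench]] entry with harness = false, and then a file in the benches/ directory. A minimal sketch of what mine looked like - the file path is a placeholder, and solve here is a stand-in for the real day 6 solver:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Stand-in for the day 6 solver; in the real project this is the
// actual function under test.
fn solve(input: &str) -> usize {
    input.len()
}

fn bench_day06(c: &mut Criterion) {
    // Hypothetical input path - adjust to wherever the puzzle input lives.
    let input = std::fs::read_to_string("input/06.txt").expect("puzzle input");
    // criterion runs this closure many times and reports timing statistics.
    c.bench_function("advent 06", |b| b.iter(|| solve(black_box(&input))));
}

criterion_group!(benches, bench_day06);
criterion_main!(benches);
```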
Benchmarking advent 06: Collecting 100 samples in estimated 5.1632 s (2000 iterations)
advent 06 time: [... 2.3460 ms ...]
OK, about 2.3 milliseconds. Not bad, I guess? Without anything to compare it to, I honestly couldn't say whether this is bad or not. This is an important point when talking about performance optimization: measure a baseline before you do anything else.

I often see people spending time optimizing their code before measuring a baseline. Is this bad on its own? No. But nothing is ever on its own, is it? Time spent optimizing is time that could be spent on other things. Also, optimized code is in some cases less readable than non-optimized code. If the optimization is worth it, then by all means! But you can't make an educated decision on the tradeoff until you measure.
OK, let me just get down off this soapbox and we can move on to the optimized solution.
The optimized solution:

1. Fill a HashMap (similar to a JavaScript Map) with the first 14 characters - O(m) time complexity, but we only do it once.
2. As the window slides, insert the character entering the window into the HashMap with a value of 1, or add 1 to the value if it already exists.
3. Subtract 1 from the HashMap entry of the character leaving the window. If the entry is 0, remove it from the map.
4. Check the length of the HashMap - O(1) time complexity.

We still have to iterate over the string input, but we've removed the extra inner loop. This brings the whole thing down to O(n)! Well, technically O(n + m), but equivalent to O(n). JUST LET ME HAVE THIS MOMENT. Here's a rough sketch of that sliding window, and then: what does that look like by the numbers?
Benchmarking advent 06: Collecting 100 samples in estimated 5.8134 s (15k iterations)
advent 06 time: [... 386.54 µs ...]
change: [... -83.341% ...] (p = 0.00 < 0.05)
Performance has improved.
0.38ms, an 83% improvement... this code is running 6X faster than the previous solution!
I was curious whether using fs::read_to_string()
and storing the characters as a vector of char
s was slower than using fs::read()
and iterating over the bytes directly, but further benchmarking didn't show a meaningful difference.
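For the curious, the comparison was roughly between these two approaches (the input path here is a placeholder):

```rust
use std::fs;

fn main() -> std::io::Result<()> {
    // Variant 1: read the file as UTF-8 text and work with a Vec<char>.
    let chars: Vec<char> = fs::read_to_string("input/06.txt")?.chars().collect();

    // Variant 2: read the raw bytes and iterate over them directly.
    let bytes: Vec<u8> = fs::read("input/06.txt")?;

    println!("{} chars, {} bytes", chars.len(), bytes.len());
    Ok(())
}
```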
This whole exercise was less about optimizing the code than it was about understanding the tooling. This is of course a very basic benchmarking example, but good enough to get a better understanding of the tooling in the Rust ecosystem. I always learn best by getting my hands dirty, and this was a fun way to do it!