It's the most wonderful time of the year... for benchmarking!
I've been brushing up on Rust, doing some of this year's Advent of Code for fun.
I'm somewhat comfortable in Rust, but wanted to take a look at the performance optimization tooling it offers, with a simple comparison of a naive solution to an optimized solution.
Rust has benchmarking tooling with the cargo bench
command, but it wasn't clear to me how to use it at first.
I started by looking at the built-in benchmark tests, but quickly realized I would need to be on the latest Rust nightly, which I didn't really feel like fussing with.
criterion
was the next thing I found, which looked perfect... and indeed it was!
If you're not familiar with Advent of Code, there's a new coding challenge every day in December. It's fun and creative, weaving a story arc into the problems every year. Today I was working on day 6, where the elves need you to fix their communication device by identifying the first set of 14 unique characters in a sequence. A classic sliding window problem! You're given a string of 4096 characters, and you have to find the character that marks the end of the first window of 14 unique characters. Let's see how it went, comparing the time complexity of each step and seeing how that equates to actual time.
My naive solution:

1. Iterate over the input string - O(n).
2. At each position, fill a HashSet (a data structure containing only unique values, similar to Set in JavaScript) with the 14 characters - O(m) time each iteration, m being 14 in this case.
3. Check the length of the HashSet - O(1) time.

You'll notice we have a nested loop - we iterate over the input string, and at each iteration we iterate 14 times to fill the HashSet.
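In code, that looks roughly like the sketch below. The names (solve_naive, the WINDOW constant) and the input handling are my own here, not necessarily what my actual solution looked like:

```rust
use std::collections::HashSet;

const WINDOW: usize = 14;

// Returns the 1-based position of the character that ends the first
// window of 14 unique characters, or None if there isn't one.
fn solve_naive(input: &[char]) -> Option<usize> {
    // Outer loop: O(n) over every 14-character window.
    for (start, window) in input.windows(WINDOW).enumerate() {
        // Inner work: O(m) to fill a HashSet with the 14 characters.
        let unique: HashSet<char> = window.iter().copied().collect();
        // O(1) length check: 14 unique characters means we found the marker.
        if unique.len() == WINDOW {
            return Some(start + WINDOW);
        }
    }
    None
}
```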
In this case O(n * m) isn't as bad as O(n²), and it isn't as naive as it could be, but it can still be optimized.
The question is, by how much?
I set to work adding criterion
to the project.
It was surprisingly easy; within a few minutes I was up and running.
As you'll soon see, criterion
runs your code many, many times collecting performance metrics, and then gives you a breakdown.
I used the default options, but it's also very configurable.
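Setup amounts to adding criterion under [dev-dependencies] in Cargo.toml, plus a [[bench]] entry with harness = false, and then a file in the benches/ directory. A minimal sketch of what mine looked like - the file path is a placeholder, and solve here is a stand-in for the real day 6 solver:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Stand-in for the day 6 solver; in the real project this is the
// actual function under test.
fn solve(input: &str) -> usize {
    input.len()
}

fn bench_day06(c: &mut Criterion) {
    // Hypothetical input path - adjust to wherever the puzzle input lives.
    let input = std::fs::read_to_string("input/06.txt").expect("puzzle input");
    // criterion runs this closure many times and reports timing statistics.
    c.bench_function("advent 06", |b| b.iter(|| solve(black_box(&input))));
}

criterion_group!(benches, bench_day06);
criterion_main!(benches);
```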
Benchmarking advent 06: Collecting 100 samples in estimated 5.1632 s (2000 iterations)
advent 06 time: [... 2.3460 ms ...]
OK, about 2.3 milliseconds. Not bad, I guess? Without anything to compare it to, I honestly couldn't say whether this is bad or not. This is an important point when talking about performance optimization: measure a baseline before you do anything else.

I often see people spending time optimizing their code before measuring a baseline. Is this bad on its own? No. But nothing is ever on its own, is it? Time spent optimizing is time that could be spent on other things. Also, optimized code is in some cases less readable than non-optimized code. If the optimization is worth it, then by all means! But you can't make an educated decision on the tradeoff until you measure.
OK, let me just get down off this soapbox and we can move on to the optimized solution.
The optimized solution:

1. Fill a HashMap (similar to a JavaScript Map) with the first 14 characters - O(m) time complexity, but we only do it once.
2. As the window slides, insert the character entering the window into the HashMap with a value of 1, or add 1 to the value if it already exists.
3. Subtract 1 from the HashMap entry of the character leaving the window. If the entry is 0, remove it from the map.
4. Check the length of the HashMap - O(1) time complexity.

We still have to iterate over the string input, but we've removed the extra inner loop. This brings the whole thing down to O(n)! Well, technically O(n + m), but equivalent to O(n). JUST LET ME HAVE THIS MOMENT. Here's a rough sketch of that sliding window, and then: what does that look like by the numbers?
Benchmarking advent 06: Collecting 100 samples in estimated 5.8134 s (15k iterations)
advent 06 time: [... 386.54 µs ...]
change: [... -83.341% ...] (p = 0.00 < 0.05)
Performance has improved.
0.38ms, an 83% improvement... this code is running 6X faster than the previous solution!
I was curious whether using fs::read_to_string()
and storing the characters as a vector of char
s was slower than using fs::read()
and iterating over the bytes directly, but further benchmarking didn't show a meaningful difference.
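For the curious, the comparison was roughly between these two approaches (the input path here is a placeholder):

```rust
use std::fs;

fn main() -> std::io::Result<()> {
    // Variant 1: read the file as UTF-8 text and work with a Vec<char>.
    let chars: Vec<char> = fs::read_to_string("input/06.txt")?.chars().collect();

    // Variant 2: read the raw bytes and iterate over them directly.
    let bytes: Vec<u8> = fs::read("input/06.txt")?;

    println!("{} chars, {} bytes", chars.len(), bytes.len());
    Ok(())
}
```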
This whole exercise was less about optimizing the code than it was about understanding the tooling. This is of course a very basic benchmarking example, but good enough to get a better understanding of the tooling in the Rust ecosystem. I always learn best by getting my hands dirty, and this was a fun way to do it!