Fork me on GitHub

on algorithms and boundaries

I'm taking the Algorithms I class from Coursera that just started up again. (I'm going to finish it this time.) (No really!) The first programming assignment has students building a Monte Carlo simulation for the percolation problem, using a weighted quick-union algorithm. We'd already covered union finds and the internals of a WQU during the lectures and exercises, so the assignment focused on applying it efficiently.

I built a slow prototype that mostly worked, then made it a decent amount faster, then fixed an edge case that I'd broken while making it faster, then got stuck. My solution still mostly worked--the gist was there--but it was partially wrong, and I couldn't even conceptualize a fix without at least doubling the memory or runtime. Which, obviously, would be wrong. Right? I mean, I spent six years in C. Performance is everything.

I even slept on it. Nothing.

Feeling fairly stupid, I started procrastinating on the class forums and stumbled over a thread that directly addressed the problem I was trying to solve. Most of the comments went something like this:

Yeah, I couldn't even get close to 100% [grade] until I doubled the memory used.

Wait, what? They doubled the memory..and it was ok? WAIT A SECOND.

I went back to the automated feedback from my best attempt (a mere 87%) and checked the details of the memory usage tests. Sure enough, I was using less than half what the assignment allowed. I re-submitted my solution with the doubling-memory, solving-the-whole-problem tweak: 100%.

Lesson learned: Don't start rejecting solutions until they overflow the actual boundaries of the problem. Speed for speed's sake, or memory for memory's sake, or LOC for LOC's sake, or whatever imaginary boundary you've constructed for yourself ends up being just a waste of time if you can't get the rest of it done.