The Interview Question I Failed

Everyone has that interview they messed up. This one was mine.

If you have been in an industry long enough, applied to enough jobs, you will inevitably have an interview that you just... bomb. I tell myself a lot of stories about my own experience - I didn't realize it was meant to be a technical interview, it was over the phone, I really hadn't had a minute away from my day job to study - stories that are meant to ease the pain of having clearly failed. But the fact remains I did fail, and that sticks in my craw.

Even more frustratingly, after years of thinking about it and asking other people, I still don't know how I could have avoided failing.

The situation was this: I'd snagged an opportunistic interview with a fairly explosive health tech company. The VP of Engineering called me personally to chat about the job. It's normal to have a 'what this job is going to be' chat before having a 'do you have the technical chops' chat. I thought it was going to be the former, but realized after about five minutes it was meant to be the latter.

Maybe that was a smart move on the VP's part. Maybe catching me flat footed was going to give him a better sense of my technical chops, or lack thereof. I don't know, can't say. What I can say is the question he asked me, which was (paraphrased) this:

Suppose you have a function that is meant to simulate the rolling of a six-sided die. That is, you will call it, with no parameters, and it will return a perfectly random integer between 1 and 6. How do you unit test this function for correctness?

Right off the bat, what I should have noted is that testing randomization is an incredibly sticky morass. Even the question, 'What does random mean?' is hard to answer - and for that reason, a question I should have asked in the moment. At the simplest, consider the difference between 'random card draws' and 'random rolls of a die'. Intuitively, cards drawn without replacement never repeat, while die rolls can. But it gets more complicated than that - three rolls of a fair die will produce three distinct results a little more than half the time (5 times in 9), and at least two matching results the rest of the time (4 times in 9), so neither outcome tells you anything on its own. How many rolls of the die do you need to prove that the resultant distribution is truly random?
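To put numbers on the three-roll case, here is a small sketch that simply enumerates every ordered outcome of three fair rolls and counts how many contain a repeat. It assumes nothing beyond a fair six-sided die; the variable names are mine, for illustration only.

```python
from itertools import product

# Every ordered outcome of three rolls of a fair six-sided die.
outcomes = list(product(range(1, 7), repeat=3))

# Outcomes where all three rolls show different faces.
all_distinct = sum(1 for rolls in outcomes if len(set(rolls)) == 3)

# Outcomes where at least two rolls show the same face.
some_match = len(outcomes) - all_distinct

print(len(outcomes))  # 216
print(all_distinct)   # 120
print(some_match)     # 96
```

So a perfectly fair die shows a repeat in 96 of 216 cases, a bit over 44% of the time. A short run of rolls can't distinguish 'random' from 'broken'.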

One of the reasons I know I failed (apart from never hearing from the company again) is that after the fact I knew that, were I asking the question, that is the first thing I would have wanted to hear in reply.

Instead, I dove in. I picked the first solution that came to mind and tried to implement it. This was a n00b mistake, of course, but I was panicking. (If you panicked, chances are you failed the interview. This sucks, but is basically true.) I decided that a stochastic method was all I could manage. Roll the die (call the function) a thousand times and look at the distribution of results.

Of course, this violates a major principle of unit tests, which is that they should be deterministic. That is, if the code is correct they should pass and if it is incorrect they should fail. They shouldn't probably pass or probably fail - a real possibility if you're dealing with random numbers. It is also pretty inefficient - most unit tests call the function under test a handful of times to cover various cases. Calling it a thousand times, more than you call anything else in your test suite, is poor from the get-go.
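For the record, the roll-it-a-thousand-times approach looks roughly like the sketch below. Everything here is an assumption for illustration: `roll_die` is a stand-in for the function under test, and the 25% tolerance is a number I picked out of the air, which is exactly the problem.

```python
import random
from collections import Counter

def roll_die():
    """Hypothetical function under test: a stand-in for the interview's die."""
    return random.randint(1, 6)

def looks_uniform(roll, rolls=1000, tolerance=0.25):
    """Return True if every face lands within `tolerance` of its expected count.

    `roll` is any zero-argument callable returning an integer face.
    """
    counts = Counter(roll() for _ in range(rolls))
    expected = rolls / 6
    return all(
        abs(counts[face] - expected) <= tolerance * expected
        for face in range(1, 7)
    )

# A die that always rolls 1 is caught every time.
print(looks_uniform(lambda: 1))  # False
```

A broken die fails reliably, but a correct die only passes most of the time: tighten the tolerance and it fails more often, loosen it and subtly biased dice slip through. That trade-off is what makes the test probabilistic rather than deterministic.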

So, I totally understand why I failed.

Yet, after the fact I hounded myself with the question: what was the correct response? I looked everywhere on the internet, asked a bunch of engineers who are way smarter than me, but I couldn't find a satisfactory response. I quickly arrived at the response I believe is my best effort solution, but (spoiler) never found one that I counted as convincingly correct.

Here is what I concluded is the best I could do:

Any function that produces a random result must be using an underlying mechanism to achieve a random seed. The first step is to mock out this seed, and then test the function across a range of seed values, to ensure the expected outcome exists.

Alright, that's a bit short on detail, but the key notion is that you have to mock out the random part. Most randomizers will do some variation of giving you a random number between zero and one. In that case, you should look to see that a bunch of numbers between 0.0 and 0.1666... give you a '1' result, numbers between 0.1666... and 0.3333... give you an output of '2' and so forth. Of course, it's hairy because the unit interval does not divide cleanly into sixths: one sixth has no exact decimal (or binary floating point) representation, so you can't point at a number that is the exact upper or lower bound of any given range.
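Here is a minimal sketch of that idea in Python, under some loud assumptions: `roll_die` is a hypothetical implementation that scales `random.random()` onto the six faces, and `roll_with_mocked_source` is a helper name I made up. The probe values sit well inside each sixth, which sidesteps the boundary problem rather than solving it.

```python
import random
from unittest.mock import patch

def roll_die():
    """Hypothetical implementation: scale a float in [0, 1) onto faces 1..6."""
    return int(random.random() * 6) + 1

def roll_with_mocked_source(value):
    """Call roll_die() with random.random patched to return `value`."""
    with patch("random.random", return_value=value):
        return roll_die()

# One probe value from the interior of each sixth of [0, 1).
samples = [0.05, 0.20, 0.40, 0.60, 0.70, 0.90]
results = [roll_with_mocked_source(v) for v in samples]
print(results)  # [1, 2, 3, 4, 5, 6]
```

With the randomizer pinned, the test is deterministic again: the same mocked value always produces the same face.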

Even worse, you don't want to dictate how the function does its work! While mocking out the underlying randomizer effectively takes control of the 'input' to the function under test, you don't want to, in your unit test, imply that, say, 0 to 0.1666... maps to '1'. It could as easily map to '6' or '3', as long as each output number has an equal range. Or, perhaps the function takes the first digit and if it is between 1 and 6 returns that result, and if not it moves to the next digit, looks for a value between 1 and 6 and so on. That is still probably (stochastically) random, but also the sort of thing that if you test for it, you are testing how the function works and not whether it works. Bad unit test design!
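One way to avoid pinning down the mapping is to sweep the controlled input evenly across [0, 1) and check only that every face gets an equal share, without caring which range maps to which face. The sketch below assumes a range-based die that receives its random float as a parameter `u` (a stand-in for the mocked randomizer); both implementations and all the names are hypothetical. It would not suit the digit-picking variant, which is part of why this still isn't a complete answer.

```python
from collections import Counter

def die_ascending(u):
    """Hypothetical die: the lowest sixth of [0, 1) maps to 1."""
    return int(u * 6) + 1

def die_descending(u):
    """Hypothetical die: the lowest sixth of [0, 1) maps to 6."""
    return 6 - int(u * 6)

def face_shares(die, steps=6000):
    """Feed `die` evenly spaced midpoints of [0, 1) and count each face.

    Midpoints keep every probe away from the awkward sixth boundaries.
    """
    return Counter(die((k + 0.5) / steps) for k in range(steps))

def covers_faces_equally(die, steps=6000):
    """True if exactly the faces 1..6 appear, each the same number of times."""
    shares = face_shares(die, steps)
    return set(shares) == {1, 2, 3, 4, 5, 6} and len(set(shares.values())) == 1

print(covers_faces_equally(die_ascending))   # True
print(covers_faces_equally(die_descending))  # True
```

Both mappings pass the same test, so the test checks the property (equal ranges) rather than the mechanism.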

Still, by mocking out the seed you can test certain other properties. You can test that giving it the same seed multiple times returns the same result. Maybe you don't want to - maybe your function works in such a novel way that it can be perfectly random while returning inconsistent results for the same seed. I think, from an engineering perspective, it's valid to say 'such a function is out of bounds', because a big chunk of your job is simplifying things.
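The same-seed property is easy to sketch if you assume the die accepts its random source as an argument, which the original question (a function with no parameters) did not allow. Here `roll_die` and `rolls_for_seed` are hypothetical names, and the source is Python's standard `random.Random`.

```python
import random

def roll_die(rng):
    """Hypothetical die that takes its random source as an argument."""
    return rng.randint(1, 6)

def rolls_for_seed(seed, count=20):
    """Roll `count` times using a fresh generator seeded with `seed`."""
    rng = random.Random(seed)
    return [roll_die(rng) for _ in range(count)]

first = rolls_for_seed(12345)
second = rolls_for_seed(12345)

print(first == second)                  # True
print(all(1 <= r <= 6 for r in first))  # True
```

This checks repeatability and range, both deterministically. It says nothing about whether the distribution is fair, which is the part I never found a clean answer for.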

In short, I could think of a lot of reasons this is an interesting question to ask:

  • Does the candidate know to ask good questions, like 'what does random mean'?
  • Can the candidate identify that the function has a 'hidden input'? Normally unit tests provide an input and expect a certain output - here, the 'input' is hidden from the caller.
  • Does the candidate realize that manipulating this particular input carries certain challenges? Do they know not to test how the function works, but rather that it works?
  • Are there choices the candidate makes to limit the problem that serve the business case while still simplifying their task?

All told, I can imagine myself posing this same question to a candidate and getting the sort of differentiating signal I would need to decide if they are a decent engineer, or if they aren't. (I guess I'd decide I was not, at least in that moment!) For a low-to-mid level engineer I think I could get some insight out of asking this question - maybe not as much as other questions, but enough to convince myself that this VP knew what he was doing.

But I still don't know what the right answer is, and probably never will.
