Back in 2005 I proposed an experiment in automation of job candidate screening at Exoweb. I came up with a list of programming problems to be used on candidates, then Ken and Bjorn (Exoweb CTO and CEO, respectively, and fellow recruiters) picked one. The problem went something like this:
The sequence of integers is given in an array. Write a program that finds such an index in the array, that the sum of preceding elements (the prefix sum) is equal to the sum of succeeding elements (the suffix sum). The program should be capable of handling the input data of hundreds of megabytes. (Here followed an an example and few lines of more precise specs in terms of math, sigmas, etc., but let's skip it.)
I wrote a set of scripts called Exobench to run the automated assessments and we started processing the masses of candidates. The outcomes quickly reinforced our expectations that this problem is well suited for assessments thanks to:
- Combinatorial boundary cases. Carelessly written code usually mishandles empty sequence, single-element sequence, cases where the index is the first one or the last one in the array etc.
- Arithmetic boundary cases. Imperfect code in languages with fixed-size data types neglects dangers of arithmetic overflows leading to false positives.
- Multiple scalability choices. The problem can be solved in several ways; each way yields correct results, but the execution times of different solutions vary greatly as the input data gets LARGE.
In our field, there are three types of people. At one end of the scale, there are the unwashed masses lacking even the most basic skills for this job. [...] At the other extreme, are the brilliant superstars who write lisp compilers for fun, in a weekend, in Assembler for the Palm Pilot. And in the middle, you have a large number of "maybes" who seem like they might just be able to contribute something. The trick is telling the difference between the superstars and the maybes, because at Fog Creek Software we only hire the superstars.
This is certainly a strategic choice whether you hire superstars only (I have seen companies that were fine with hiring decent maybes), nevertheless you at least want to filter out the unwashed masses and identify the superstars.
Combinatorial and arithmetic boundary cases are great litmus test to sift off the unwashed masses. Getting all boundary cases right is the bread and butter of an honest journeyman in this profession. Codility tests out all the murky corners of the candidate’s solution to make sure all boundary cases are handled correctly. But this is the easier part of the Codility’s job.
Multiple scalability choices give a great indication of whether a person is a maybe or a superstar. The most common, intuitive solutions given by maybes (we call such solutions naive) work fine for the input data of tens, hundreds or thousands bytes, but as you keep increasing the input data size, the execution time grows disproportionally fast to the amounts of data you pump in (it turns out that in naive solutions the execution time is roughly proportional to the square of the input data size):
- input data of less than 1KB - below a second,
- input data of 10KB - few seconds,
- input data of 100KB - below a minute,
- input data of 1MB - several minutes,
- input data of 10MB - several hours,
- input data of hundreds MB - several years.
The running time of a naive solution as the function of input data size. The blue circles mark measurements taken on my laptop, the curve is an approximation of the running time for larger input data sizes. The shape of the curve (quadratic polynomial) follows from the construction of the algorithm used in the naive solution. Running on different hardware or using a different programming language will only compress or stretch the curve vertically (scaling by a constant factor), but won't affect the overall shape (it will remain quadratic). The actual running time for larger data sets may further exceed the approximation, due to cache effects and limited RAM size.There exist more scalable solutions (we call them golden) capable of processing a gigabyte in less than few seconds. For this particular problem golden solution is not very difficult to code (10-15 lines), but inventing one definitely requires the candidate to think big.
Codility attempts to recognize the scalability of a solution and approximates the asymptotic complexity. The complexity is reflected in the assessment score (a single numeric measure for non-tech HR persons) and also displayed in terms of O/Θ/Ω (for those initiated into the mysteries of asymptotics). For the programming problem mentioned above, the naive solutions have pessimistic time complexity of Θ(n2), whereas golden ones have Θ(n). We have seen solutions having higher complexities, we put them in our collection of programming curiosities. Obviously for this programming problem it is not possible to build a solution with pessimistic time complexity below Ω(n).
Interestingly, for a superstar a golden solution is usually a natural, aesthetic choice. It may not be completely rational, but superstars presented with a suboptimal solution frequently feel discomfort and an urge to fix the ugly thing.

