Unwashed masses, maybes, superstars 
and asymptotic complexity assessment

Tuesday, August 18, 2009

Back in 2005 I proposed an experiment in automation of job candidate screening at Exoweb. I came up with a list of programming problems to be used on candidates, then Ken and Bjorn (Exoweb CTO and CEO, respectively, and fellow recruiters) picked one. The problem went something like this:

The sequence of integers is given in an array. Write a program that finds such an index in the array, that the sum of preceding elements (the prefix sum) is equal to the sum of succeeding elements (the suffix sum). The program should be capable of handling the input data of hundreds of megabytes. (Here followed an an example and few lines of more precise specs in terms of math, sigmas, etc., but let's skip it.)

I wrote a set of scripts called Exobench to run the automated assessments and we started processing the masses of candidates. The outcomes quickly reinforced our expectations that this problem is well suited for assessments thanks to:

  • Combinatorial boundary cases. Carelessly written code usually mishandles empty sequence, single-element sequence, cases where the index is the first one or the last one in the array etc.
  • Arithmetic boundary cases. Imperfect code in languages with fixed-size data types neglects dangers of arithmetic overflows leading to false positives.
  • Multiple scalability choices. The problem can be solved in several ways; each way yields correct results, but the execution times of different solutions vary greatly as the input data gets LARGE.
In his great piece on interviewing Joel Spolsky wrote:

In our field, there are three types of people. At one end of the scale, there are the unwashed masses lacking even the most basic skills for this job. [...] At the other extreme, are the brilliant superstars who write lisp compilers for fun, in a weekend, in Assembler for the Palm Pilot. And in the middle, you have a large number of "maybes" who seem like they might just be able to contribute something. The trick is telling the difference between the superstars and the maybes, because at Fog Creek Software we only hire the superstars.

This is certainly a strategic choice whether you hire superstars only (I have seen companies that were fine with hiring decent maybes), nevertheless you at least want to filter out the unwashed masses and identify the superstars.

Combinatorial and arithmetic boundary cases are great litmus test to sift off the unwashed masses. Getting all boundary cases right is the bread and butter of an honest journeyman in this profession. Codility tests out all the murky corners of the candidate’s solution to make sure all boundary cases are handled correctly. But this is the easier part of the Codility’s job.

Multiple scalability choices give a great indication of whether a person is a maybe or a superstar. The most common, intuitive solutions given by maybes (we call such solutions naive) work fine for the input data of tens, hundreds or thousands bytes, but as you keep increasing the input data size, the execution time grows disproportionally fast to the amounts of data you pump in (it turns out that in naive solutions the execution time is roughly proportional to the square of the input data size):
  • input data of less than 1KB - below a second,
  • input data of 10KB - few seconds,
  • input data of 100KB - below a minute,
  • input data of 1MB - several minutes,
  • input data of 10MB - several hours,
  • input data of hundreds MB - several years.
This is way too slow, especially that we explicitly stated the expectation of several hundreds megabytes in the problem description.

The running time of a naive solution as the function of input data size. The blue circles mark measurements taken on my laptop, the curve is an approximation of the running time for larger input data sizes. The shape of the curve (quadratic polynomial) follows from the construction of the algorithm used in the naive solution. Running on different hardware or using a different programming language will only compress or stretch the curve vertically (scaling by a constant factor), but won't affect the overall shape (it will remain quadratic). The actual running time for larger data sets may further exceed the approximation, due to cache effects and limited RAM size.

There exist more scalable solutions (we call them golden) capable of processing a gigabyte in less than few seconds. For this particular problem golden solution is not very difficult to code (10-15 lines), but inventing one definitely requires the candidate to think big.

Codility attempts to recognize the scalability of a solution and approximates the asymptotic complexity. The complexity is reflected in the assessment score (a single numeric measure for non-tech HR persons) and also displayed in terms of O/Θ/Ω (for those initiated into the mysteries of asymptotics). For the programming problem mentioned above, the naive solutions have pessimistic time complexity of Θ(n2), whereas golden ones have Θ(n). We have seen solutions having higher complexities, we put them in our collection of programming curiosities. Obviously for this programming problem it is not possible to build a solution with pessimistic time complexity below Ω(n).

Interestingly, for a superstar a golden solution is usually a natural, aesthetic choice. It may not be completely rational, but superstars presented with a suboptimal solution frequently feel discomfort and an urge to fix the ugly thing.

Definition of a programmer

Tuesday, August 11, 2009

By definition (New Oxford American Dictionary):

programmer |ˈprōˌgramər|
noun
a person who writes computer programs.

If you want to hire one, you'd better make sure that he/she can write programs. And what is the better way to find out, than putting him/her in front of a computer and asking to write a program? None.

Easier said than done. You need to invent a programming assignment (ideally an extract of daily programming practice). Assignment must be meaningful, yet simple enough to be solved in ten-something minutes. Then you need a qualified specialist to assess the solution. Most likely you will need to screen more than one candidate (usually tens to hundreds per one position), s you need to make sure that assessment made by different specialists are consistent. Moreover, specialists' time is usually in short supply. And good luck motivating them to review hundreds of applications with proper diligence.

Codlity attempts to solve some of the above problems by delivering the set of programming assignments along with an automated on-line tool for assessing solutions. Each assignment is a short program to write. The programs are assessed for their correctness and scalability. In a software development organization, Codility attempts to help the following groups:

HR Specialistst and Non-Technical Recruiters gain a powerful tool to perform initial technical assessment of job candidates. Experience shows that up to 90% of software job ad respondents may not be qualified for a job due to lack of programming skills. Codility is able to filter these candidates out in a very early stage of the recruitment process.

Technical Interviewers and Technical Managers need to spend less time in interviews and meet only qualified candidates (which leads to more interesting interviews!). Moreover, for each Codility test, the system generates detailed technical reports of what was right and wrong in their solutions, giving the interviewer immediate insight in the candidate's strong and weak points.

HR Managers gain insight in valuable statistics showing how the candidates sourced and/or hired compare to the average for the industry or location.

Besides, Codility has a hidden agenda. I was once interviewing a guy for a programmer position. He seemed OK and had several years of experience, but after programming exercises I was sure that he is an "absolute no." He left and I grabbed his CV to have a look again. I was curious how he survived several years in programming jobs and I froze in horror - the guy worked on the software for medical systems. The hidden agenda of Codility is to raise the standards to prevent similar hazards.

Or perhaps he just faked the medical systems job, so that we get scared and hire him to limit the damage?

Codility genesis

Tuesday, August 4, 2009

If you point your browser to websites like zhaopin.com or 51job.cn you will see lots of boxes with logos and links to job offers. In May 2005 one of these boxes had a logo of Exoweb (at that time a small Beijing-based outsourcing business), and at the other end of the hyperlink Ken, Bjørn and myself were waiting for candidates. Several weeks earlier Exoweb landed a promising contract with a Norwegian customer and was hungry for programmers. Hungry, nevertheless picky. Quality mattered.

We decided to adopt quality measures outlined in Joel Spolsky's Guerilla Guide to Inverviewing. In particular, we were running independent interviews (up to 3 tech interviews per candidate) and we were diligently checking whether candidates can actually write correct code. The overwhelming majority could not. It quickly occurred that we were spending long hours in interviews, but the gain was less than modest.

Beijing is a great city and there are more exciting things to do than interviewing programmers who cannot write programs. We had strong motivation to figure out a way to improve the process. The inspiration came from Olympiad in Informatics, the ultimate programming competition for high-school students. I had some experience with Polish chapter of the Olympiad and I was aware that this institution has been employing automated tools for assessing solutions of programming problems since 1990s. I decided to adopt similar means to screening job candidates.

Ken and Bjørn were a bit skeptical about the amount of work that had to be invested, but surely interested in freeing up some of their time on Interview Saturdays. I hacked up an automated evaluator called Exobench and we have set up one machine as a workstation for candidates. From now on, every candidate had to sit in front of a computer and deliver a solution of one simple programming problem using a set of standard programming tools (editor + compiler). Our HR person later ran the automated evaluator to determine whether the candidate goes home or stays for tech interviews.

We quickly noticed that only 1 in 10 candidates was passing the programming test. We were reviewing the solutions of the rejected candidates to make sure we don't throw out the baby with the bath water, but we were increasingly positive that persons unable to get few lines of code straight should never tinker with our complex distributed web application.

When I was leaving Exoweb 2.5 years later, I took a final look at stats. Around that time we have had screened 2500, interviewed 250, hired 50 (I rounded a bit, but the actual numbers were close). Given that we usually organized two to three independent tech interviews per candidate, the automation saved something between 4 and 7 thousands hours of Senior Engineer time.

Based on my experience with Exobench, the team at Enpoka.com created Codility, an on-line system for programming skills assessment. In 2009 Exoweb introduced Codility to its hiring process, becoming our first large customer generating 1200+ evaluations per month.