## Friday, January 4, 2013

### Chi 2012 Codility Programming Certificate Solution

It is time to show how the Codility Certificate codenamed X (Chi) can be solved. You can still give it a try, but no certificate will be granted.

In this task, one has to simulate flying cannonballs and find places where they hit the ground. The height of the ground, extending from the cannon outwards, is stored in array $$A$$, and the levels at which the cannonballs are shot are stored in array $$B$$. For each cannonball shot at level $$H$$, we have to find the smallest index $$I$$ such that $$0 < I < M$$ and $$A[I] \ge H$$. (If there is no such $$I$$ a cannonball can be ignored.) Moreover, values in array $$A$$ change, as $$A[I-1]$$ is increased when the cannonball falls.

We need a data structure that can quickly answer queries of the form Where does the cannonball shot at level $$H$$ hit the ground? and that can be updated by increasing ground levels accordingly. Such a data structure can consist of two arrays: the array $$A$$ representing ground levels and another array $$T$$ such that $$T[H]$$ is the smallest value of $$I$$ to satisfy $$0 < I < M$$ and $$A[I] \ge H$$ (or $$-1$$ if there is no such $$I$$).

Initially, we can calculate array $$T$$ in a linear time. The key observation is that if $$H_1 \le H_2$$ then $$T[H_1] \le T[H_2]$$. In other words, the higher a cannonball flies, the greater the distance it will travel. If $$T[H] = I$$, then $$T[H+1]$$ can be calculated by checking values of array $$A$$ from $$A[I+1]$$ on.

When a cannonball is shot, we can instantly locate the place where it hits the ground by using $$T[H] = I$$. However, after increasing $$A[I-1]$$, we have to decrease values in array $$T$$ accordingly. Luckily, only one element of $$T$$ can change: namely, if $$T[A[I-1]] > I-1$$, then it should be decreased to $$I-1$$. Clearly, one cannonball can be processed in constant time. Hence, the overall time complexity of this algorithm is linear.

Here is a solution in Python, resulting from the above analysis:

def cannonballs(A, B):
M = len(A)
N = len(B)
H = max(B)

T = [-1] * (H+1)
i = 0
for j in xrange(H+1):
while (i < M) and (A[i] < j):
i += 1
T[j] = i

for j in B:
i = T[j]
if 0 < i < M:
A[i-1] += 1
if T[A[i-1]] > i-1:
T[A[i-1]] = i-1

return A


1. How can you store T in the allowed O(M) space? I need at least O(H) space to do so...

2. Come on! I was waiting months to find out how you can fulfill the space requirements and you now show me you didn't!

3. Oooops!
You are right.
It can be solved in O(M) memory, but we finally decided to aim at simpler solutions.
I must have forgot to update the expected memory complexity.
My fault.

So how can the memory complexity be improved?
The key observation is that we only use values of T[B[i]].
So we don't have to store the whole array T.
Instead we can use a hash table.

We need a sorted copy of array B to generate the hash table T, because we have to process all indices of T in the increasing order.
(Sorting of integers from a bounded range can be done in linear time.)

In the second part of the algorithm, we update T only for those indices, that already exist.

4. You can use simple lookup table with values of T[B[i]], but accessing it will not be possible in fixed time.
Hash table on the other hand is an easy solution if you use for example C++, where it is a part of standard library. But I program in C, so you either require me to implement one or to use non-portable POSIX hsearch (which I might not know). Both ways are not fair.
But as a codility fan I tend to forgive you :-) Especially that the next challenge was excellent to solve in C.

5. For the record, in my submitted solution I used a red-black tree (Java TreeMap) instead of the hashtable, and only stored the values where "i" changes, as it allows me to lookup the next smaller/larger present key+value in (worst-case) O(log(n)) time. That will add an extra logarithmic factor to the required time, but it fulfilled the space requirements and Codility did not detect that I cheated on the time requirement :)

BTW hashtables are not worst-case O(1) access time, only in typical/average case; but unless you run a webserver that was DoSed with requests with lots of request parameters all hashing to the same value, it does not matter in practice.