Thursday, August 16, 2012

Upsilon 2012 Codility Programming Certificate Solution

A new Codility Certificate codenamed Φ (Phi) has been posted and awaits you, so it is time to show how the previous one, codenamed Y (Upsilon), can be solved. You can still give it a try, but no certificate will be granted.

It is easier to describe the solution if we first introduce the notion of a Cartesian tree. We are given an array of \(N\) different integers. A Cartesian tree is a binary tree whose nodes are elements of the array. The root is the maximal element in the array. The left and right sub-trees are constructed recursively. The left sub-tree is a Cartesian tree built of all the elements to the left of the root, and the right sub-tree is a Cartesian tree built of all the elements to the right of the root. For the example array \([9, 10, 2, -1, 3, -5, 0, -3, 1, 12, 5, 8, -2, 6, 4]\) we obtain the following tree:

The task can be seen as looking for the height of the Cartesian tree. The solution resembles the algorithm used to build the Cartesian tree, but we don't have to store that much information. We scan the given array from left to right. In the algorithm that builds the Cartesian tree, scanned elements are kept on the stack of Cartesian (sub-)trees. Each element of the stack is a node of the resulting Cartesian tree together with its left sub-tree, but with the right sub-tree missing. Trees that are higher on the stack have smaller values as their roots. For each root of a tree on the stack, its right sub-tree will be constructed by merging all trees that are above it on the stack, and possibly some elements of the array that have not been scanned yet.

How should we update the stack, when another element of the array is scanned? If the scanned integer is smaller than the root of the topmost tree on the stack, we can simply push it onto the stack as a single-node tree. If it is larger than the roots of some trees on top of the stack, we have to pop all these trees and merge them to form the left sub-tree of the scanned element. The following figure shows the situation on the stack just before and just after scanning 12 in the example sequence.

What changes if we do not need the Cartesian tree itself, but just its height? Instead of a whole tree, it is sufficient to store its height and its root. Below is a solution in Python. Heights of trees on the stack are kept in h_stack and roots of trees are kept in v_stack. The code can be simplified by adding two guards: initially, if the stack is not empty, but contains \(\infty\) instead, it will never be popped and we don't have to check if the stack is empty. By adding ''\(\infty - 1\)'' at the end of the given sequence, we have a guarantee that, in the last step, all the trees on the stack (except the other guard) are merged into one Cartesian tree.
INFTY = 2000000000

def sequence(A):
    N = len(A)
    if N == 0: return 0 
    A = A + [INFTY-1]
    
    v_stack = [INFTY] * (N+2)
    h_stack = [0] * (N+2)
    sp = 1
    for X in A:
        if X > v_stack[sp]:
            # Elements smaller than X on the stack will form its left sub-tree
            while X > v_stack[sp-1]:
                # merge two elements on top of the stack
                h_stack[sp-1] = max(h_stack[sp-1], h_stack[sp]+1)
                sp = sp-1
            # Element on top of the stack represents left sub-tree of x
            v_stack[sp] = X
            h_stack[sp] = h_stack[sp] + 1
        else:
            # X is a leaf
            sp = sp+1
            v_stack[sp] = X
            h_stack[sp] = 0
    assert sp == 2
    return h_stack[sp]

3 comments:

  1. I didn't find the solution just because at some point you have to check the stack but, by doing this, it becomes O(n*n) and it had to be O(n), so this solutions doesn't seem OK.

    ReplyDelete
    Replies
    1. Every iteration of the inner while loop will remove one item from the stack and there are at most N items pushed to the stack during the lifetime of the algorithm, so the runtime of the inner while loop is O(n) and not O(n²).

      That is one of the most tricky things of this algorithm (at least if you are not very used to amortized time analysis), and I am also disappointed that the blog post did not mention it at all...

      Delete
  2. I don't understand since i read "The task can be seen as looking for the height of the Cartesian tree". But If I have an array 1,2,3,4,5,6,7,8,9,10, and the height of the tree is 10, and the answer is 2. Or i didn't understand something?

    ReplyDelete