# Insertion Sort Advanced Analysis: Augmented Trees

This problem comes from HackerRank.

You can see my full solution in the GitHub repo.

## Problem Summary

Given a list of integers, you must report how many swaps are necessary to sort the list via insertion sort.

Insertion sort is `O(n^2)`, though, and you'll have to be faster than that.

## Naive Approach

You could just run insertion sort on the list and count the swaps as they happen. But again, that's not very clever, and it's too slow.

## Crucial Insight: What Causes a Swap

Clearly we need to be clever and gain some special insight into the nature of the problem. That insight lies in what causes a swap during an insertion sort: an encounter with a previously sorted value larger than the value being processed. So how many swaps are required to sort one value? Exactly the number of larger values to its left in the original, unsorted list.
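As a sanity check, this per-element count can be verified with a quadratic brute force (only usable on small inputs; the function name here is mine):

```cpp
#include <cstddef>
#include <vector>

// Quadratic reference: for each element, count the larger values to its left.
// That is exactly the number of swaps insertion sort performs for it.
std::size_t brute_force_swaps(const std::vector<int> &unsorted) {
    std::size_t swaps{0};
    for (std::size_t i = 0; i < unsorted.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            if (unsorted[j] > unsorted[i])
                ++swaps;
    return swaps;
}
```

On the sample input `{2, 1, 3, 1, 2}` this yields 4, matching the problem's expected answer.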

## Solution: Annotated Binary Search Tree

So how do we retrieve this value? Doing a linear search on the left-hand side for larger values is `O(n^2)`, which gets us nowhere. But what if we recorded the values to the left in some data structure that provides the information we need? That would require fast insertion and lookup, so a binary search tree is a good candidate.

Yet the most useful feature about a BST, for my purposes, isn't the complexity of its methods; it's the information that can be gleaned from subtree sizes. At every node in a BST, every element in the node's left subtree must be less than or equal to the node, and every element in its right subtree must be greater than or equal to it.

If we can keep track of subtree sizes without affecting the complexity of the BST methods, then we can simply perform a lookup on the tree with a little bookkeeping to find the number of items greater than it.

## Implementation: Unbalanced

On an unbalanced BST, lookup and insertion is `O(n)`, but a balanced tree is only `O(log(n))`. Balanced trees are a pain to implement though, so I tried an unbalanced BST first and hoped for the best. Besides, it was sort of an MVP on which to test my subtree-size logic. This strategy only passed 9 of 14 test cases, timing out on all others.

### Execution Flow Outline

```cpp
size_t swap_count{0};
// left to right
for (auto i : unsorted) {
    swap_count += bst.count_greater_than(i);
    bst.insert(i);
}
return swap_count;
```

### Annotating Subtree Sizes

Chapter 14 of my copy of CLRS is called "Augmenting Data Structures", and within is a paragraph called "Maintaining Subtree Sizes". Check that out if my words are unhelpful.

#### Storing

The `Node` class is given a new data member `size_t size` which will hold the number of values in the subtree for which it is the root, including itself. Clearly the default value for this data member should be `1` (a leaf node).
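A minimal sketch of that layout (field names are my own):

```cpp
#include <cstddef>

// BST node annotated with its subtree size.
struct Node {
    int val;
    std::size_t size{1};   // values in this subtree, including this node
    Node *left{nullptr};
    Node *right{nullptr};
    explicit Node(int v) : val{v} {}
};
```

A freshly constructed node is a leaf, so its `size` starts at 1.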

#### Maintaining

This problem only requires maintenance on insertion. Insertion is performed by recursing from the root to a leaf such that inserting the value at the leaf won't violate BST properties. From the subtree size perspective, the new node belongs to the subtree of every node that gets inspected. So that code is simply

```cpp
void insert(int val) {
    Node **p{&root};
    while (*p != nullptr) {
        (*p)->size += 1;
        p = (val < (*p)->val) ? &((*p)->left) : &((*p)->right);
    }
    *p = new Node(val);
}
```

Adding the single addition instruction does not affect the complexity of the method.

#### Retrieving

This is the only really tricky part, but if you're careful it's not so bad. It helps to generate a small list of random numbers, construct the annotated BST for them on paper, and do some examples by hand.

`count_greater_than` should return the number of values in the BST that are greater than the input. To do this, we will start at the root and recurse down, as if we were looking up the value. At every node along the way, the decision between left or right child gives us some valuable information.

- If we are about to go right, then we know the value is greater than the current node and everything in the left subtree, so the count doesn't increase.
- If we are about to recurse down the left subtree, then we know the value is less than the current node and all nodes in the right subtree, so we should increase our count by the size of the right subtree plus 1 for the current node.

```cpp
size_t count_greater_than(int val) {
    size_t count{0};
    Node *p{root};
    while (p != nullptr) {
        if (val < p->val) {
            count += 1 + (p->right ? p->right->size : 0);
            p = p->left;
        } else {
            p = p->right;
        }
    }
    return count;
}
```

This adds one comparison (testing that `p->right` isn't `nullptr`) and conditionally a single addition to the lookup method, which does not affect its complexity.

## Implementation: Red-Black

Obviously they're throwing some worst-case scenarios at us (already in order or in reverse order). So we need to implement a self-balancing binary search tree. I picked a red-black tree as my poison, but they're all tricky. This strategy passes all test cases.

You'll probably need to do some debugging on your tree implementation. I suggest looking into the `column` command's `--tree` flag, and generating output with some colors (red, bright-black).

As far as our annotation maintenance is concerned, the only difference from the plain BST is in the `rotate` method.

1. During a rotation I like to take care of the grandparent links first, but this has no effect on the annotations (i.e., `grandparent->size` doesn't change; the nodes just rotate underneath it).
2. Next I take care of the inner child; subtract its size from the node rotating up, and that's it. We don't have to update the node going down, because the inner-child was already reflected in its size.
3. Subtract the size of the node rotating up from the size of the node rotating down.
4. Add the size of the node rotating down to the node rotating up.
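The steps above can be sketched as a left rotation (a sketch against a `Node` layout with a `size` field; function and variable names are mine):

```cpp
#include <cstddef>

struct Node {
    int val;
    std::size_t size{1};
    Node *left{nullptr};
    Node *right{nullptr};
};

// Left rotation: `up` (x->right) rotates above `x`. The caller re-points the
// grandparent's link at the returned node (step 1: no size change there).
Node *rotate_left(Node *x) {
    Node *up = x->right;
    Node *inner = up->left;  // inner child moves under x
    // Step 2: subtract the inner child's size from the node rotating up.
    up->size -= inner ? inner->size : 0;
    // Step 3: subtract the (adjusted) size of the node rotating up
    // from the node rotating down.
    x->size -= up->size;
    // Relink the inner child and the rotating pair.
    x->right = inner;
    up->left = x;
    // Step 4: add the size of the node rotating down to the node rotating up.
    up->size += x->size;
    return up;  // new subtree root
}
```

Note the ordering: step 2 must happen before step 3 so that `x` ends up accounting for the inner child it just gained, and `up` ends up with the size the whole subtree had before the rotation.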

## Final Complexity

- Annotation maintenance is of constant complexity.
- Every value requires a lookup (`O(log(n))`) and an insertion (`O(log(n))`), yielding `O(n(log(n) + log(n))) = O(2n log(n)) = O(n log(n))`.
- The lookup and insertion steps could be done simultaneously, halving the running time but not affecting the complexity.
- This version did complete all test cases correctly and under the time limit.

## Footnote

The editorial on HackerRank gives a much more elegant solution, though computationally it's just as complex as my solution here. Still, I can't help but feel like an idiot.