To learn how to write complex data structures, and then package them in modules and functors for better reusability.

It has been said that the advantage of using a language like ocaml with a sophisticated type system only really becomes apparent when the data structures you use in your program become significantly complex. This lab will allow you to get familiar with how to write code for data structures that are more complex than simple lists or trees.

- Modules
- Module signatures
- Functors

Chapters 12 and 13 of the textbook.

This week, you will be implementing a data structure called a __priority
queue__. The specific implementation you will create is called a
__leftist heap__. So let's talk about these things now.

**NOTE:** There is a priority queue implementation given in the ocaml
manual as part of the module documentation. Please don't consult that for
this assignment (the implementations are different anyway).

A priority queue is a data structure that is ordered so that it provides fast access to its minimum element, as defined by some ordering relationship. Conceptually, you can think of that element as "the next thing to be processed". It also has to have an operation which deletes the minimum element and returns another priority queue, as well as several other operations, which are summarized here as an ocaml module signature:

module type PriorityQueueSig = sig exception Empty type t (* Abstract type of elements of queue. *) type queue (* Abstract type of queue. *) val empty : queue (* Return empty queue. *) val is_empty : queue -> bool (* Check if queue is empty. *) val insert : queue -> t -> queue (* Insert item into queue. *) val find_min : queue -> t (* Return minimum element. *) val delete_min : queue -> queue (* Delete minimum element. *) val from_list : t list -> queue (* Convert list to queue. *) end

Of these operations, note that the operation `from_list`

is not
fundamental to the definition of a priority queue, but it's extremely
useful.

Also note that we are being loose about the words "insert" and "delete". These operations are purely functional, so that when we say we "insert" an element into a queue, we really mean that we create a new queue which has the extra element in it; the original queue is unchanged. Similarly, when we "delete" an element from a queue, we really mean that we create a new queue without the offending element; the original queue is once again unchanged.

Now that we know how our data structure is supposed to behave, the next
question is: how do we implement it? Naturally, there are lots of choices,
each of which will have different trade-offs. In this lab you're going to
implement priority queues as *leftist heaps*. This is a data structure
that has the following attributes:

It can either be a leaf (representing an empty heap) or a node which contains:

- a stored element
- a non-negative integer value called its "rank"
- a left and right subheap, each of which is also a leftist heap

Thus, a leftist heap is a binary tree with additional rank information stored in the nodes. The tree will in general

**not**be balanced; it will usually have more elements in the left subtree than in the right. We say that the tree "skews to the left".The element stored at the top of the tree is smaller (more precisely, is no larger) than any of the elements stored beneath it in either subtree.

As mentioned, each node has an integer value associated with it, which is called its

*rank*. The rank of a node is equal to the length of its*right spine*, which is the number of nodes in the rightmost path from the node in question to an empty (leaf) node (this will also turn out to be the shortest path to a leaf node). Thus, a leaf node has rank 0, and a heap with a left and/or right subheap which is a leaf has rank 1. However, ranks are only stored in nodes, not in leaves. Having the rank stored in the nodes makes many operations much faster.The rank of any left child is required to be at least as large as the rank of its right sibling. This is why it's called a "leftist" heap.

`insert`

and
`delete_min`

operations in terms of merging, and the other
operations are then trivial to define, except for the list-to-heap conversion
routine. That can be done easily in O(N * log(N)) time, and with more
difficulty in O(N) time. You're not required to find the most efficient
solution, but it's a good exercise.
If either heap is empty, return the other heap.

If the first heap's minimum element is smaller than the second heap's minimum element, make a new heap (*) from the first heap's minimum element, the first heap's left subheap, and the result of merging the first heap's right subheap with the second heap.

Otherwise, make a new heap (*) from the second heap's minimum element, the second heap's left subheap, and the result of merging the first heap with the second heap's right subheap.

(*) Here is how to make a new heap from a minimum element and two heaps: the resulting heap must have:

- the given minimum element
- a rank which is the smaller of the ranks of the original heaps, plus 1
- a left subheap which is the original heap with the larger rank
- a right subheap which is the original heap with the smaller rank

This algorithm will preserve the leftist heap property in the merged heap.

Your implementation should be written as follows (assuming that the priority queue will have integers as elements):

module PriorityQueue : (PriorityQueueSig with type t = int) = struct (* Your code goes here. *) end

Note that since the type `t`

is abstract, if you want to use a
real type as the elements of your priority queue (and it would be pretty
useless otherwise), you have to specify which type you want the
`PriorityQueueSig`

's `t`

value to represent. This is
kind of clunky syntax. Also, you might ask why you can't just make it
parametric, like a type `'a list`

. You actually can in some
cases, but here we need a type with an ordering relation, and there is no way
to guarantee that any arbitrary type will (a) be orderable at all -- what if
it's a function type? or (b) will be orderable using the same function
(*e.g.* the built-in `compare`

function). So it's better to
write the code as it's written above. Also, this will make it easy to
transform this into a functor (see below).

Here's an interesting point to ponder: why don't we also have to specify
what the abstract `queue`

type represents in the code above?

Use your priority queue implementation to write a *heap sort*
function. This will take a list as its argument and will

- first convert the list to a priority queue (implemented as a leftist
heap) using the
`from_list`

function of the module - successively remove the smallest element from the heap and cons it onto a list until there are no more items in the heap
- reverse the list to get a sorted list in ascending order

Only use your module's exported functions (those in the signature) in your solution. Test it by using the heap sort to sort a list of integers.

This code should go into a file called `priority_queue.ml`

.

As written, the code is dependent on the built-in comparison functions. To make this more generic, let's define some types and module signatures:

(* Type for ordered comparisons. *) type comparison = LessThan | Equal | GreaterThan (* Signature for ordered objects. *) module type OrderedSig = sig type t val compare: t -> t -> comparison end (* Signature for priority queues. *) module type PriorityQueueSig = sig exception Empty type t type queue val empty : queue val is_empty : queue -> bool val insert : queue -> t -> queue val find_min : queue -> t val delete_min : queue -> queue val from_list : t list -> queue end

What you have to do now is generalize your priority queue implementation into a functor that takes a module matching the OrderedSig signature as its argument, and produces a priority queue (implemented using a leftist heap) which is specialized for that particular type of data. For instance, you can define a module of ordered strings like this:

module OrderedString = struct type t = string let compare x y = if x = y then Equal else if x < y then LessThan else GreaterThan end

and then define your string priority queue like this:

module StringPQ = MakePriorityQueue(OrderedString)

Once you've done this, redefine your heap sort function using a StringPQ as the heap. Note that this heap sort will only work on strings. Use it to sort a list of strings.

To get you started, here is a skeleton of the code you should use for the functor definition:

module MakePriorityQueue (Elt : OrderedSig) : (PriorityQueueSig with type t = Elt.t) = struct (* Your code goes here... *) end

Note that again you have to specify what the type `t`

in the
`PriorityQueueSig`

represents. Here, it better be the same type
as the type `t`

in the `OrderedSig`

argument (which we
have called `Elt`

).

Your functor code should go into a file called
`make_priority_queue.ml`

.

The files `priority_queue.ml`

and
`make_priority_queue.ml`

. Add some test code to both files to
show that your heap sort works correctly.

- Introduction to the Objective Caml Programming Language by Jason Hickey, chapters 12 and 13.