In the next two assignments you will implement an ordered-set data type that internally uses a binary search tree to store and retrieve its values. A binary search tree is a binary tree where each node stores a value, and a node's left subtree stores values less than the node's value, and the right subtree stores values greater than the node's value. This is an invariant on the data-structure that must always hold.
Because we are implementing an ordered set using a tree, we will call it the TreeSet
, and declare it in a file treeset.h
. If you want to separate the function definitions from the declaration then you can do this in treeset.cpp
. For now, our TreeSet
will only store int
values. We will make the TreeSet
much more generalized in a future assignment.
You do not need to maintain a balanced tree in this assignment. Balanced binary trees impose additional constraints on the distances between the root and the leaves of the tree, so that searching for a particular value is always O(log(N)) worst case. Our tree won't be quite as good, but it will definitely be simpler to implement.
The TreeSet
may only expose operations that are relevant to ordered sets; it must not expose any internal implementation details, like the fact that the set is implemented with a tree. Therefore, you will need to create some internal "tree-node" type that is only visible within your tree-set class, which can hold a value, and pointers to a left and right subtree.
In this implementation you must use C++ smart pointers. Smart pointers implement the "Resource Acquisition Is Initialization" (RAII) pattern for heap-allocated memory, and they make it much easier to avoid leaking memory, because they completely eliminate the need to delete
heap allocations. For this assignment you will want to use the std::shared_ptr<T>
type, declared in the <memory>
C++ standard header.
Note: Your tree nodes should not need to have a pointer to their parent node. C++ smart pointers work through reference-counting, and if nodes point to both their children and their parent then the reference graph will have cycles in it. You can break such cycles by using the
std::weak_ptr<T>
, but you actually won't need for nodes to know their parents for the functionality you will implement in this data structure.
Here are the public operations you must provide on your TreeSet
class:
TreeSet()
default constructor initializes an empty set.
int size()
returns the number of elements in the set. Your implementation should do this in constant time; it should not have to iterate over the internal contents of the set to compute the size.
bool add(int value)
attempts to add value
to the set. If value
does not already appear in the set, return true
. If value
already appears in the set, return false
.
bool del(int value)
attempts to remove value
from the set. If value
appears in the set, it is removed and true
is returned. If value
already does not appear in the set, false
is returned.
bool contains(int value)
returns true
if the value appears in the set, or false
if the value doesn't appear in the set.
Implement support for copy-construction and copy-assignment. This will be a bit involved, because you need to make a deep copy of the set's internal tree structure. This can be greatly simplified by providing a helper function on your tree-node struct that clones the node, and also recursively clones the node's left and right children. (Hint: recursive tree-node copy constructor...)
Implement support for move-construction and move-assignment. This one will be very easy with smart pointers!
As always, use the const
keyword and pass-by-value/pass-by-reference semantics everywhere that is appropriate for this class.
As always, write Doxygen-format comments for all types and all functions in your implementation. We expect your comments to be complete and concise. You are not required to provide structural commands for function arguments, return values, etc., unless you really want to.
As stated earlier, it should not be visible outside of your class that you are building the set with a tree, so the tree-node type should be declared inside the private
section of the class. To reduce typing, you may also want to declare a smart-pointer type with the using
syntax; if you do so, do this inside the private section of your class as well.
You must also provide a private sanity_check
operation that traverses the set's internal tree and verifies that the tree's invariant is maintained: each node's left subtree should only contain values less than the node's value, and the right subtree should only contain values greater than the node's value. It is up to you how you decide to implement this; the reference implementation has a function like this:
bool sanity_check(sp_node n, int minval, int maxval)
The function verifies that the node n holds a value between minval and maxval, and then it recursively checks the children of n with the same function, updating minval and/or maxval appropriately. The function prints all identified issues to cerr
(not stopping after the first issue is found), and it returns false
if any issues are identified along the way. This way the function can be used with assert()
everywhere the tree is changed.
Once you have completed a sanity-check function, call it every time you mutate the contents of the tree, so that you can catch any bugs immediately.
Because you are using smart pointers in your implementation, your tree set implementation should not leak any memory. You can use tools like Valgrind to check your implementation.
Finally, to repeat the earlier requirement, your tree nodes may not have a pointer to their parent node. This will make adding and deleting nodes a bit more complex, but it should not create a substantial increase in complexity if you do it correctly.
The most difficult part of this assignment is implementing deletion of values, because the tree must maintain the binary-search-tree invariant stated earlier. Wikipedia has a helpful article on binary search trees, and the deletion section is particularly helpful.
Recall that in C++, you can create member functions, constructors, destructors, etc. on structs as well as on classes.
As mentioned earlier, you may want to create a custom copy-constructor for your tree-node type that makes a deep copy of the tree node.
You may also want to make a function like "replace_child(old_child, new_child)
", so that it is easy to replace one of the node's children with a new node (or with an empty smart-pointer if you wish to delete the child node), without having to keep track of whether the node being replaced is the left child or the right child.
Finally, your del(int value)
implementation may find it useful to recursively invoke itself in certain cases, but unnecessarily slow implementations will be penalized.
A test suite for your class is provided here. This program does some simple testing, and then it does an exhaustive test of tree-sets up to 6 elements, adding and deleting values in all possible orders. This is not a very large collection, but the act of inserting and deleting in all possible orders should exhaustively exercise your add/delete code. If your code uses your sanity-check function before and after every modification, you can have very high confidence that your code is correct after you have passed the entire suite.
Note that the N=5 and N=6 tests may take some time; the reference implementation requires about 10 seconds for the N=6 test.
Make sure to fix any test failures you encounter.
As usual, you will need to write a Makefile
for your project that builds and tests your TreeSet
code.
The all
target should build the test binary, but not run it.
The test
target should run the test binary.
The clean
target should delete .o
files and binaries. As always, make sure to back up your code before testing your clean
target for the first time!
Make sure all files are compiled with the -Wall
and -Werror
arguments (these will go in your CXXFLAGS
variable) so that any malformed but otherwise legal code is identified by the compiler.
Once you have completed the above tasks, submit your work through csman. Make sure to submit these files:
Makefile
treeset.h
treeset.cpp
(if present)You do not need to submit the test code; we will test your program with a fresh copy of these files.
Please also complete and submit a feedback survey with your submission, telling us about your experience with this assignment. Doing so will help us to improve CS11 Advanced C++ in the future.
Copyright © 2019-2020 by California Institute of Technology. All rights reserved. Generated from treeset-1.md.