CS3134 Homework #4
Due on November 11, 2003 at 11:00am
There are two parts to this homework: a written component worth
5 points, and a programming assignment worth 20 points. See
the homework submission
instructions on how to hand it in and for important notes on
programming style and structure.
Note: parts in red are
revisions/clarifications.
Written questions
- (5 points) You're given the following list of numbers to
insert in a binary search tree.
35, 21, 48, 1, 93, 87,
55, 100, 34, 97
- (1 point) Draw the resulting tree.
- (2 points) Using the book's convention, draw the tree
that results from deleting 93 from the
previous tree.
- (2 points) An alternative convention for handling a
two-child delete would be to take the next-smallest node (the
inorder predecessor) instead of the next-largest
node. If we used this convention when deleting
93, which number would replace it, and what is the
general algorithm (explain in 1-2 sentences) for finding the
inorder predecessor?
Programming problems
Both of these programming problems work with a spelling
dictionary, i.e., you will build data structures and tools to
handle large numbers of words (Strings). You will use
this file as input for both programs;
it's a scrambled version of a "dictionary" of words (without
definitions) as distributed with Debian Linux. (To download it
in IE/Mozilla/Netscape, right-click and choose Save Target As or
Save Link To Disk.) The file only contains words with alphabetic
characters, both upper- and lower-case, one per line. You are
to process the file, ignoring but preserving case, and
support the operations as described below.
- (10 points) Use an array-backed list to handle this "dictionary".
- (7 points) Build an ArrayBackedList class
that uses an array to store the words. The constructor for
the class must take one parameter: the number of words to be
stored, which serves as a capacity property for the array.
You must then implement the following methods in your
ArrayBackedList:
- (1/2 point) public boolean insert(String
s): this takes a String and inserts it at the
bottom of the (occupied part of the) array. It also
updates an object-level variable called
longestWord on each insert, so that by the end
of input, longestWord contains the length of
the longest word. You should return true
unless the array is full, in which case return
false.
- (1/4 point) public String elementAt(int
index): this returns the element at the specified
index, or null if no such index exists.
- (1/4 point) public int size(): this
returns the number of elements in the array.
- (1 point) public boolean binarySearch(String
s): this performs a binary search of the array and
returns true if the element exists, or
false if the element does not exist. You are
to implement this method recursively, i.e.,
binarySearch should call a private
binarySearch that has the necessary parameters
to work recursively. Make sure to ignore the case of
the strings!
- (5 points) public int radixSort(): this does
an (iterative) alphabetic radix sort of the array. The
strategy is similar to, but not the same as, the one used when sorting
numbers. First of all, there will be 27 groups (26
characters plus "too-short" words), not 11.
Second, words aren't "right-aligned", but rather
"left-aligned". In other words, you will start the
radix sort at the last character of the longest
String, but only words that are that long will
be grouped appropriately; all other words will be thrown
into the "zero" group that holds "too-short" words.
Future passes then go through every group in order and sort by
the second-to-last character position of the longest word, then the
third-to-last position, etc., throwing each
word into the appropriate new group (note that
you need a new "set" of groups for every pass!). Once
you get to the "zeroth" character, you will finally have
a configuration where there is no data in the "zero"
group, but data in the remaining groups are in
order. Read the groups starting with the "a" group,
and copy the elements back into the array.
You will use a doubly linked-list structure to
store each individual group, i.e., you'll use an array
of linked lists to store the collection of groups.
Instead of modifying the book's linked list, use the
LinkedList as supplied by Java in the
java.util package. (Note that this is the
only java.util data structure you should be
using!)
Your radix sort method will return an integer: the
number of "operations" in and out of groups. That is,
any element inserted into a linked list counts as one
operation, and any element read out of a linked list
counts as another. Copying from one group into another
group counts as two operations. Add all of these up and
return it from the radix sort.
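The passes described above can be sketched as follows. This is one possible reading of the description, not a reference solution. The class name RadixSortSketch, the helpers groupFor and newGroups, and the choice to pass the array, the element count, and longestWord as parameters are all assumptions made for illustration (inside ArrayBackedList these would be fields). The sketch uses generics, so it assumes a modern Java compiler.

```java
import java.util.LinkedList;

class RadixSortSketch {
    // Group 0 holds "too-short" words; groups 1..26 hold 'a'..'z'.
    static final int GROUPS = 27;

    // Picks the group for a word at a character position, ignoring case.
    // Assumes words contain only alphabetic characters.
    static int groupFor(String word, int pos) {
        if (pos >= word.length()) {
            return 0;
        }
        return Character.toLowerCase(word.charAt(pos)) - 'a' + 1;
    }

    // Builds a fresh "set" of empty groups for one pass.
    @SuppressWarnings("unchecked")
    static LinkedList<String>[] newGroups() {
        LinkedList<String>[] groups = new LinkedList[GROUPS];
        for (int g = 0; g < GROUPS; g++) {
            groups[g] = new LinkedList<>();
        }
        return groups;
    }

    // Sorts words[0..count-1] in place and returns the number of
    // group operations (one per insert into, or read out of, a group).
    static int radixSort(String[] words, int count, int longestWord) {
        if (count == 0 || longestWord == 0) {
            return 0;
        }
        int ops = 0;

        // First pass: array -> groups, keyed on the last character
        // position of the longest word.
        LinkedList<String>[] groups = newGroups();
        for (int i = 0; i < count; i++) {
            groups[groupFor(words[i], longestWord - 1)].addLast(words[i]);
            ops++;
        }

        // Remaining passes: old groups -> new groups, moving left.
        for (int pos = longestWord - 2; pos >= 0; pos--) {
            LinkedList<String>[] next = newGroups();
            for (int g = 0; g < GROUPS; g++) {
                while (!groups[g].isEmpty()) {
                    String word = groups[g].removeFirst();
                    ops++; // read out of the old group
                    next[groupFor(word, pos)].addLast(word);
                    ops++; // insert into the new group
                }
            }
            groups = next;
        }

        // Copy back; group 0 is empty here unless there were empty strings.
        int i = 0;
        for (int g = 0; g < GROUPS; g++) {
            while (!groups[g].isEmpty()) {
                words[i++] = groups[g].removeFirst();
                ops++;
            }
        }
        return ops;
    }
}
```

Under this counting scheme, n words with a longest word of length L cost 2nL operations: n inserts on the first pass, 2n for each of the remaining L-1 passes, and n reads when copying back.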
- (3 points) Implement an ArrayBackedListApp class
with a main() method that does the following:
- Uses a BufferedReader to read the words
from the aforementioned words.txt file into a
new instance of ArrayBackedList;
- Radix sorts this new instance and prints out the
number of operations returned by the sort;
- Presents a small user
interface (at a ">" prompt) with the following
commands (don't worry about invalid input). The file contains 45,372 words, so you can
create a static-sized array for the purposes of this
assignment.
- s word: Searches for the word,
and prints out found or not found
based on whether the binary search finds a
word;
- d count: Dumps the first
count elements to screen. If count is
0, print all the elements to screen;
- i index: Print out the
element referenced by index. 0 would
imply the first element. If no such element
exists, print out not found.
- q: Quit.
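As a rough illustration of the input handling, the sketch below reads one word per line through a BufferedReader and splits a command line such as "s word" into its two parts. The class name WordLoaderSketch and the methods loadWords and parseCommand are hypothetical, and a plain String array stands in for ArrayBackedList. Taking a Reader rather than a file name is an assumption that keeps the sketch easy to try out; the app would pass in a FileReader for words.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class WordLoaderSketch {
    // Reads one word per line into 'words' and returns how many were stored.
    // Stops early if the array fills up; blank lines are skipped.
    static int loadWords(Reader source, String[] words) {
        BufferedReader in = new BufferedReader(source);
        int count = 0;
        try {
            String line;
            while (count < words.length && (line = in.readLine()) != null) {
                line = line.trim();
                if (line.length() > 0) {
                    words[count++] = line; // case is preserved as read
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Splits "s word" into {"s", "word"}; a bare "q" becomes {"q", ""}.
    static String[] parseCommand(String line) {
        String trimmed = line.trim();
        int space = trimmed.indexOf(' ');
        if (space < 0) {
            return new String[] { trimmed, "" };
        }
        return new String[] {
            trimmed.substring(0, space),
            trimmed.substring(space + 1).trim()
        };
    }
}
```

The first element returned by parseCommand can then drive a simple if/else chain over "s", "d", "i", and "q", with Integer.parseInt applied to the second element for the d and i commands.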
- (10 points) Use a tree to handle this "dictionary".
- (8 points) Modify the Tree class in Tree.java
(download here; it's the same as the
book version, with the TreeApp class and traversal methods
thrown out) in the following manner:
- (2 points) Change the tree and its respective
functions to handle String keys, and no
additional data associated with the key. This also
means that the find method just returns a
boolean indicating whether or not the supplied key was
found. Make sure to ignore case when comparing
Strings!
- (6 points) Make the tree an indexed binary search
tree. An indexed BST differs in that it keeps
information that enables it to find any arbitrary
element without having to do a complete inorder
traversal of the tree, e.g., it can find an arbitrary
element corresponding to an index in O(log n) time. The
first element (in sorted order) would have index zero,
and the last element would have index
#elements-1, much like an array.
You are to accomplish this by adding a new field to
every node in the tree, called leftSize.
This field holds the number of nodes in the left subtree
of this node (with a leaf having a
leftSize of zero). Once this is accomplished, the
algorithm to find a node given an index is relatively
simple:
- Set a current reference to be the
root.
- If the index we're looking for equals the
leftSize of the current node, we've found
our node, and we can stop.
- If the index is less than
leftSize, move current to the left child
and repeat.
- If the index is greater than
leftSize, change the index we're looking
for to index-(leftSize+1), change
current to the right child, and repeat.
- If current becomes null, the element
does not exist in the tree.
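The steps above translate almost line for line into a loop. The following is a minimal sketch, assuming a hypothetical IndexedNode class with key, leftSize, left, and right fields; the node class in Tree.java has different field names, so treat every name here as illustrative only.

```java
class IndexedNode {
    String key;
    int leftSize;            // number of nodes in this node's left subtree
    IndexedNode left, right;

    IndexedNode(String key, int leftSize) {
        this.key = key;
        this.leftSize = leftSize;
    }
}

class IndexedLookupSketch {
    // Returns the key at the given sorted position, or null if none exists.
    static String elementAt(IndexedNode root, int index) {
        IndexedNode current = root;
        while (current != null) {
            if (index == current.leftSize) {
                return current.key;           // found our node
            } else if (index < current.leftSize) {
                current = current.left;       // target is in the left subtree
            } else {
                index -= current.leftSize + 1; // skip left subtree and this node
                current = current.right;
            }
        }
        return null; // ran off the tree: no such index
    }
}
```

Each iteration moves one level down the tree, so the lookup takes time proportional to the tree's height, which is O(log n) when the tree is reasonably balanced.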
Given the aforementioned algorithm, you must add the
field to the object, change one method and write one
method.
- (3 points) Modify insert to correctly
update leftSize when nodes are inserted.
You can do this without adding any iterative or
recursive loops. (You don't need to worry about
delete here.)
- (3 points) Write a method called
elementAt that takes one parameter (the
index) and returns a String that
corresponds to the data at that index. Return
null if the index doesn't exist.
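One way to keep leftSize current during insert is to bump the counter on every node where the search path turns left, since the new node will end up somewhere in that node's left subtree. The sketch below assumes an iterative insert in the style of the book's, with equal keys sent to the right; the class name IndexedTreeSketch and its nested Node class are hypothetical and stand in for the classes in Tree.java.

```java
class IndexedTreeSketch {
    static class Node {
        String key;
        int leftSize;      // number of nodes in this node's left subtree
        Node left, right;

        Node(String key) {
            this.key = key;
        }
    }

    Node root;

    // Inserts a key, comparing case-insensitively, and updates leftSize
    // along the way. Always succeeds; equal keys go to the right.
    void insert(String key) {
        Node newNode = new Node(key);
        if (root == null) {
            root = newNode;
            return;
        }
        Node current = root;
        while (true) {
            if (key.compareToIgnoreCase(current.key) < 0) {
                current.leftSize++;           // new node lands in the left subtree
                if (current.left == null) {
                    current.left = newNode;
                    return;
                }
                current = current.left;
            } else {
                if (current.right == null) {  // right turns leave leftSize alone
                    current.right = newNode;
                    return;
                }
                current = current.right;
            }
        }
    }
}
```

This works without any extra loop only because every insert succeeds. If insert could reject a key, the increments made on the way down would have to be undone.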
- (2 points) Create a new class called TreeApp,
based on the code in the previous app but modified to handle
the tree as designed above. The user interface should
remain the same (although we aren't doing radix sort or
counting group operations here).
- (6 points) Extra credit: If you do any of this, make
sure to clearly indicate you've done so in your README.
- (3 points) In programming problem 2(a)(ii), we might
also want to support deletes. Modify delete to
correctly update leftSize. Then, update the
TreeApp to support a delete operation (del
word; if successful, it says nothing, otherwise it
says not found).
- (3 points) You may have observed that, as stated above,
radix sort is rather inefficient -- we've got a few long
words for which we have to keep on scanning through lots and
lots of short words. Radix sort is best when we have words
with similar length, not with such a heterogeneous
collection as you might find in a dictionary. However,
there is a modification that will make radix sort faster
with a spelling dictionary:
- First, create a set of groups that are arranged by
length. You'll have m groups, one for words of
each possible length (where m is bounded by the maximum
length over all the words). In the first pass, you will
walk through the list of words and throw each one into one of
these m groups based on its length.
- Now, as you do the radix sort, start with the
mth character by grabbing all the words from the
group that has words that are m characters long,
and put them into the alphabetically-sorted groups.
(Note that you will no longer need the "0" group,
although you're welcome to leave it alone.) After
that's done, continue looping to the m-1th
character, grab the words from the m-1-length
group, and combine it with the words in the
alphabetically-sorted groups. Repeat this process over
and over until we reach (and finish) length 1 words, at
which point the alphabetic groups will have all the
words sorted. Make sure to update the code that handles
the number of group operations -- a read or write from
any kind of group should add to this total.
If you choose to do this, make sure to implement it in a
separate method (call it smartRadixSort), and
modify your ArrayBackedListApp code to load the
words into two separate arrays, sort each of them, and
display the number of group operations for each. (Remaining
operations can use just the first array as specified
earlier in the homework.)
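The length-grouped variant might look something like the sketch below. It is only one interpretation of the steps above: the class name SmartRadixSortSketch and its helpers are hypothetical, the array, count, and longestWord are passed as parameters rather than read from fields, and every word is assumed to be non-empty and purely alphabetic. Words from the length group are merged in before the words already in the alphabetic groups, which preserves the "shorter word first" ordering that the "zero" group provided in the basic version.

```java
import java.util.LinkedList;

class SmartRadixSortSketch {
    static final int LETTERS = 26;

    // Builds n empty groups.
    @SuppressWarnings("unchecked")
    static LinkedList<String>[] newGroups(int n) {
        LinkedList<String>[] groups = new LinkedList[n];
        for (int g = 0; g < n; g++) {
            groups[g] = new LinkedList<>();
        }
        return groups;
    }

    // Maps the character at pos to 0..25, ignoring case.
    static int letterIndex(String word, int pos) {
        return Character.toLowerCase(word.charAt(pos)) - 'a';
    }

    // Sorts words[0..count-1] in place; returns the number of group operations.
    static int smartRadixSort(String[] words, int count, int longestWord) {
        int ops = 0;

        // First pass: throw every word into the group for its length.
        LinkedList<String>[] byLength = newGroups(longestWord + 1);
        for (int i = 0; i < count; i++) {
            byLength[words[i].length()].addLast(words[i]);
            ops++;
        }

        LinkedList<String>[] alpha = newGroups(LETTERS);
        for (int pos = longestWord - 1; pos >= 0; pos--) {
            LinkedList<String>[] next = newGroups(LETTERS);

            // Words of length pos+1 join the sort on this pass. They go in
            // first so that a shorter word precedes longer ones sharing its prefix.
            LinkedList<String> joining = byLength[pos + 1];
            while (!joining.isEmpty()) {
                String word = joining.removeFirst();
                ops++;
                next[letterIndex(word, pos)].addLast(word);
                ops++;
            }

            // Then redistribute the longer words already being sorted.
            for (int g = 0; g < LETTERS; g++) {
                while (!alpha[g].isEmpty()) {
                    String word = alpha[g].removeFirst();
                    ops++;
                    next[letterIndex(word, pos)].addLast(word);
                    ops++;
                }
            }
            alpha = next;
        }

        // Copy the sorted words back into the array.
        int i = 0;
        for (int g = 0; g < LETTERS; g++) {
            while (!alpha[g].isEmpty()) {
                words[i++] = alpha[g].removeFirst();
                ops++;
            }
        }
        return ops;
    }
}
```

Counted this way, a word of length k costs 2k + 2 operations, so short words no longer pay for every pass over the longest word.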