CS18 Integrated Introduction to Computer Science Fisler

Homework 4: Hash Tables


Due: 5:00 PM, Mar 7, 2020

Contents

1 Searching in Files (Grep)
1.1 The Mystery Continues. . .
1.2 Command-Line (Unix) Grep
1.3 DIY Grep

2 Chaining Hash Tables

3 Hash Table Iterator

Objectives

This homework has you practice

• Building applications with built-in hash tables

• Implementing hashtables from scratch using arrays and lists

• Catching I/O exceptions

• Thinking about how to write iterators for data structures that you implement

You will be in a position to do the DIY Grep problem after lecture on Friday. We will discuss
hashtable implementation in detail on Monday.

How to Hand In

The source code files should comprise the hw04.src package, and your solution code files, the
hw04.sol package.
Begin by copying the source code from the course directory to your own personal directory.
That is, copy the following files from /course/cs0180/src/hw04/src/*.java to
~/course/cs0180/workspace/javaproject/src/hw04/src:

• AbsHashTable.java containing public abstract class AbsHashTable<K, V> implements IDictionary<K, V>

• IDictionary.java containing public interface IDictionary<K, V>

• IGrep.java containing public interface IGrep

• KeyNotFoundException.java containing public class KeyNotFoundException

• KeyAlreadyExistsException.java containing public class KeyAlreadyExistsException

Do not alter these files!


We have also provided you with the following partial testing files:

• ConstructorTest.java containing public class ConstructorTest

Copy them over from /course/cs0180/sol/hw04/sol/*.java to
~/course/cs0180/workspace/javaproject/sol/hw04/sol.

The purpose of these files is mainly to ensure your code compiles with our testsuite. In addition,
they serve as a guide for what test cases should look like. Note that your real testing files (as
required by the homework) should be MUCH more exhaustive.
After completing this assignment, the following solution files should all be part of the hw04.sol
package when you hand in:

• DIY Grep

– Grep.java, containing public class Grep implements IGrep

– GrepTest.java containing public class GrepTest

• Chaining

– Chaining.java, containing public class Chaining<K, V> extends AbsHashTable<K, V>

– HashTableTester.java containing public class HashTableTester

• Hash Table Iterator

– Iterator.tex containing the LaTeX file describing your solution to the Hash Table
Iterator questions. We will also accept Iterator.pdf, but highly encourage trying out
LaTeX.

• Constructor Test Suite

– ConstructorTest.java containing public class ConstructorTest

Only hand in the files that you have under the hw04.sol package, as well as the pdf
file that has your answers to Problem 3. Handing in the src files may break the testsuite.
There is no compatibility check for this assignment. Make a private post on Piazza with a
link to your submission if the autograder on Gradescope fails.
To hand in your files, submit them to Gradescope. Once you have handed in your homework, you
should receive an email, more or less immediately, confirming that fact. If you don’t receive this
email, try handing in again, or ask the TAs what went wrong.


Java’s Built-in Hash Tables

In this homework, you will be writing an application (DIY Grep) using Java’s built-in hashtables,
and completing an implementation of hashtables from scratch (via chaining). Java has two styles of
built-in hash tables: HashMap and HashSet.
These classes differ in that the former uses a hash table to represent a dictionary (a mapping from
keys to values), while the latter uses a hash table to represent a set, which is a special case
of a dictionary: the keys are the elements of the set, and there are no values [1]. Consequently, it is
straightforward to use a hash table to represent a set.
You can use Java’s HashMap class by importing java.util.HashMap. Documentation on how to
use Java’s HashMap can be found here. Likewise, you can use Java’s HashSet class by importing
java.util.HashSet. Documentation on how to use Java’s HashSet can be found here. You may
use these only in the Grep problem.
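
To make the dictionary/set distinction concrete, here is a minimal sketch of both classes in use. The class name BuiltInTablesSketch and its two helper methods are hypothetical names chosen for illustration; only HashMap, HashSet, and their standard methods come from Java itself.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; not part of the assignment's starter code.
class BuiltInTablesSketch {
    // A dictionary: each distinct name (key) maps to how often it occurs (value).
    static Map<String, Integer> countNames(String[] names) {
        Map<String, Integer> counts = new HashMap<>();
        for (String name : names) {
            counts.put(name, counts.getOrDefault(name, 0) + 1);
        }
        return counts;
    }

    // A set: like a dictionary that stores keys only.
    static Set<String> distinctNames(String[] names) {
        Set<String> seen = new HashSet<>();
        for (String name : names) {
            seen.add(name); // adding a duplicate leaves the set unchanged
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] names = {"Evan", "Estelle", "Evan"};
        System.out.println(countNames(names).get("Evan")); // prints 2
        System.out.println(distinctNames(names).size());   // prints 2
    }
}
```

Both structures offer expected constant-time insertion and membership checks, which is what makes them suitable for the preprocessing step in the Grep problem.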

Problems

1 Searching in Files (Grep)

1.1 The Mystery Continues. . .

The lights have only been mysteriously cut out for 30 seconds, but the CIT formal has descended
into disarray. Suddenly, a loud sound detonates above and the lights flicker back to life. Shouts
ring out as everyone looks up, just in time for a huge cloud of glitter to descend in their eyes — a
glitter bomb! Out of the corner of your eye, you spot a distinctly non-glittery figure on the balcony,
reveling in the chaos unfolding downstairs. You sprint to the stairs — it’s them, and you’re finally
going to catch them red-handed (or is it glitter-handed?). When you throw open the doors to the
second floor, though, there’s no one there. The culprit must have already disappeared into the maze
of CIT hallways. Heart pounding, you plunge into the labyrinth, desperate to find any trace of the
mysterious figure.
By the time you re-emerge from the winding halls, you’re exhausted and have nothing. Dejected,
you lean against the wall by CIT 201 to catch your breath — you can’t believe that they slipped
away right through your fingers, leaving not even a glitter trail in their wake. But when you look
up, you see something that definitely wasn’t there before. The painting that usually overlooks the
whole lobby is gone. In its place, there’s a giant, hastily painted symbol: a compass rose, like one
you would see on a nautical map.
The next day, all anyone can talk about is the eventful night and the compass rose that now overlooks
the CIT. Your suspicions have been confirmed once and for all, and you are more determined than
ever to solve this mystery. You survey everyone to find out who was seen on the dance floor right
before the blackout — and even more importantly, who wasn’t. You know your limits as the resident
CIT detective though, and there’s way too many names to check by hand. As you’re working, your
CS 18 HTA Evan sees you and, being the helpful TA he is, suggests you write a program to search
a file for specific guests using special data structures to make it faster (this sort of step is called
“preprocessing”).

[1] Recall the invariant that dictionaries do not allow duplicate keys. That is why it makes sense to
view sets as dictionaries.

For example, you might have a file like this, where each line is a different guest’s list of people that
they saw at the formal:

Evan Velasquez, Estelle Han, Timothy Wang, ...
Jefferson Bernard, Sohum Gupta, ...
Thien Nguyen, Evan Velasquez, ...

Then a search on Evan Velasquez would identify lines 1 and 3 (the lines of the file). A concrete
example of the output format we want is further down in this section.
This same problem arises in other settings, like finding which lines of a play involve a given character,
or generally searching in files for where specific information might lie. In fact, this operation is so
common that operating systems such as Unix have a built-in command for it!

1.2 Command-Line (Unix) Grep

The UNIX command grep is an extremely powerful and useful tool. You can use grep to search a
file for a given pattern, and report where that pattern appears in the file, as follows:

grep -n <pattern> <file>

The -n is an optional parameter that tells grep that we want it to report line numbers. It prints
out each line of text next to the line number. This is but one of many, many grep features, which
are fully documented on its man page (which you can access, if you want to learn more, by typing
‘man grep’ into a terminal).
Here’s an example of how you’d use grep with this line number option:

grep -n mystery myFile

This command would print out something like:

1: Evan likes a good mystery
4: mystery
9: The criminal remains a mystery and Evan is in for a surprise

Your output format will look different (see below), but this is the essence of what you
are trying to replicate with your code.

1.3 DIY Grep

Now we’re going to implement (a limited version of) grep for ourselves!


Task: Explain how a hashtable could be used to make it easy to look up the line numbers associated
with a given word in a multi-line file. Your answer to this part is just in prose. Write your answer to
this question in a comment at the top of the Grep class, which the next task asks you to write.
Task: Write a class Grep with a constructor and a single method, lookup. This class should
implement the IGrep interface in the source files for this assignment. Your constructor should take
as input a filename and perform any necessary preprocessing. The lookup method should take as
input a word and return a set of the line numbers on which that word appears in the file. It should
operate in expected constant time.
As noted above, you can (and should) make use of Java’s built-in hash table data structure, which
is called HashMap, to solve this problem.
Notes:

• If the same word appears more than once on the same line, you should include the line number
only once in your output.

• You should treat words as sequences of characters separated by whitespace; so "glitter" and
"glitter!" are distinct words. Also, you can assume words are case sensitive; so "mystery"
and "Mystery" are distinct words.

Hints:

• As part of the preprocessing step, you may want to use the split method in the String
class, which splits up a string into pieces each time it encounters a given delimiter (specified
as a regular expression), and stores those pieces in an array of strings.

• You might find the LineNumberReader class, which extends BufferedReader, useful. It has
a method getLineNumber that gets the current line number.

• You may need to catch and handle relevant exceptions! Think about exceptions like
FileNotFoundException and/or IOException.

• The Java syntax to create a HashMap that maps, for example, from a String to a Set of
Integers is new HashMap<String, Set<Integer>>().
Likewise, the syntax to create a HashSet of Integers is new HashSet<Integer>().
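
The hints above can be combined into a rough sketch like the following. It is only an illustration under stated assumptions, not the required Grep class: WordIndexSketch and buildIndex are hypothetical names, the input is an in-memory Reader rather than a named file, and an IOException is simply rethrown as an unchecked exception instead of being reported with a message.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; it does not implement the assignment's IGrep interface.
class WordIndexSketch {
    // Maps each whitespace-separated word to the set of 1-based line numbers
    // on which it appears.
    static Map<String, Set<Integer>> buildIndex(Reader source) {
        Map<String, Set<Integer>> index = new HashMap<String, Set<Integer>>();
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // After readLine returns the first line, getLineNumber is 1.
                int lineNumber = reader.getLineNumber();
                for (String word : line.split("\\s+")) {
                    if (word.isEmpty()) {
                        continue; // blank lines and leading whitespace yield ""
                    }
                    // Using a set means a repeated word records its line only once.
                    index.computeIfAbsent(word, w -> new HashSet<Integer>()).add(lineNumber);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return index;
    }

    public static void main(String[] args) {
        String text = "glitter bomb glitter\nno sparkle here\nglitter! everywhere";
        Map<String, Set<Integer>> index = buildIndex(new StringReader(text));
        System.out.println(index.get("glitter"));         // prints [1]
        System.out.println(index.get("glitter!"));        // prints [3]
        System.out.println(index.containsKey("mystery")); // prints false
    }
}
```

Once the index is built, each query is a single HashMap lookup, which is where the expected constant-time bound comes from.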

Task: Write a main method for the Grep class. The String[] args should correspond to a file
name, and then at least 1 other word to look for, in that order.
For example, running:
‘java hw04.sol.Grep /course/cs0180/src/poems/iliad tree water cats’
from the bin directory, should print something like:

tree found on lines: 1353
water found on lines: 686 7260 9731 15877 17749 20584
cats is not found
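
One possible way to produce lines in this shape from a set of line numbers is sketched below. The names GrepOutputSketch and describe are hypothetical, and printing the numbers in ascending order is an assumption based on the sample output; a HashSet on its own does not guarantee any particular iteration order, so the sketch copies the numbers into a TreeSet first.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the handout only says the output should look
// "something like" the sample above.
class GrepOutputSketch {
    // Builds one report line for a word, given the line numbers found for it
    // (null or empty meaning the word does not occur).
    static String describe(String word, Set<Integer> lines) {
        if (lines == null || lines.isEmpty()) {
            return word + " is not found";
        }
        StringBuilder out = new StringBuilder(word + " found on lines:");
        // TreeSet iterates in ascending order regardless of the input set's order.
        for (int line : new TreeSet<Integer>(lines)) {
            out.append(' ').append(line);
        }
        return out.toString();
    }
}
```

A main method could then loop over args[1], args[2], and so on, printing describe(word, result) for each word.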

Note: To take in arguments using IntelliJ, press the drop-down menu near the green “run” button
and select edit configurations. In the “Program Arguments” field, put your program arguments
(space separated). These arguments will appear in the String[] args variable in your main
method. For giving arguments through the command line, you can just list them after calling the
program (space-separated), and they will similarly appear in String[] args. This would look like:
‘java hw04.sol.Grep filename’
Task: Write a class GrepTest (in GrepTest.java) that tests your lookup method. For each test, include a
comment above the test explaining what scenario that particular test is trying to check. Be sure
to test edge cases and a variety of scenarios. You do not need to test for exceptions that might be
thrown on I/O errors (as you are catching those and printing a message, which can’t be tested).
Note: We are testing lookup here, not the main grep program, because the latter simply prints
output to the screen. In such cases, you test the inner logic, then manually inspect output built
from the results of the inner logic.
Hint: We’ve included several test files (thankfully, none of which is the file shown above) in
/course/cs0180/src/poems/. But do not be afraid to create your own text files for testing, especially to
try to catch edge cases! If you do this, make sure to hand in these files with the rest of your code in
your sol/hw04/sol/ directory.
Now that you’ve preprocessed the list of people who were at the party, you can figure out which
TAs were missing! First you try the Head TAs, but then you remember that they were definitely
there, managing the formal’s SignMeUp queue for refreshments the entire night. You then start to
search the guest list for the UTAs, and notice that there’s exactly one who was missing right before
the blackout. . .

2 Chaining Hash Tables

We will finish everything you need for this problem in class on Monday. You should have a sense of
what a hashtable looks like under the hood (an array), but we haven’t yet talked about how to deal
with collisions. For those who want to start thinking ahead, we handle collisions by putting a list in
each cell of the array, and storing all keys/values that map to the same array index in the list at
that index.
In this problem, you will implement a hash table using chaining. Recall that the internal data
structure of a hash table implemented using chaining is an array of lists. You will use Java’s
LinkedList class in particular for the list implementation.
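
To see what a collision looks like in practice, here is a small sketch that only distributes keys into chains. It is not the Chaining class: CollisionSketch and distribute are hypothetical names, values are omitted, and a List of lists stands in for the array so the sketch can avoid the generic-array workaround discussed later in this problem. Math.floorMod is used here as one way to turn a possibly negative hash code into a valid slot index.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; stores keys only, to show how chains form.
class CollisionSketch {
    // Places each key in the chain at the slot its hash code maps to.
    static List<LinkedList<String>> distribute(String[] keys, int size) {
        List<LinkedList<String>> slots = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            slots.add(new LinkedList<>());
        }
        for (String key : keys) {
            int slot = Math.floorMod(key.hashCode(), size); // always in [0, size)
            slots.get(slot).add(key);
        }
        return slots;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" have the same hashCode (2112), so they always collide.
        List<LinkedList<String>> slots = distribute(new String[] {"Aa", "BB", "C"}, 8);
        System.out.println(slots.get(0)); // prints [Aa, BB]
        System.out.println(slots.get(3)); // prints [C]
    }
}
```

Looking up "BB" in such a table means hashing it to slot 0 and then walking that slot's list until the key is found.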
The point of this problem is for you to see how hashtables are built when they don’t already exist
in a language. Therefore, you may not use Java’s hashmaps or hashsets in your implementation.
The starter files contain an abstract class named AbsHashTable (in AbsHashTable.java). It defines
a class for combining keys and values (as shown in Monday’s lecture). Look at the KVPair<K, V> class
stored inside. AbsHashTable implements an interface IDictionary<K, V>, which has the
definitions of the methods we need to implement.
Task: Extend AbsHashTable<K,V> with a class Chaining<K, V>, which will be your hashtable
implementation. The Chaining constructor should take as input a size variable, which specifies
the size (number of slots) of the table.
Your Chaining class needs to implement each of the abstract methods in AbsHashTable (abstract
methods have headers, but not yet bodies). You are welcome to implement additional (helper)
methods if you wish.
Hint: The key method to write in Chaining is findKVPair. Write this method first, and use it as
a helper when you write insert and delete.
Hint: You will see that some of the abstract methods throw exceptions. The names of the exceptions
should make it clear what each exception is used for. You should throw these exceptions where
appropriate. If your code can reasonably handle an exception, you should catch it. But if you think
that an exception should be passed back out to the user of your hashtable class, you may leave it
uncaught.
Hint: Recall that hash tables are backed by a hashing function, which maps a key to an index
in the array. You should use the built-in hashCode method on your keys for hashing. The Java
built-in hash function returns a signed (either positive or negative) integer, which may also be
much larger than the size of your table. Taking the absolute value of the integer, together with
its remainder modulo the number of slots, will let you map it into an array index.
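
One detail is worth care when mapping a hash code to an index: Math.abs(Integer.MIN_VALUE) is still negative in Java, so taking the absolute value first and the remainder second can produce a negative index. The sketch below shows two orderings that stay in range. SlotIndexSketch and its method names are hypothetical, and Math.floorMod is offered as an alternative to the absolute-value approach in the hint, not as the required method.

```java
// Hypothetical helper class illustrating two safe hash-to-index mappings.
class SlotIndexSketch {
    // floorMod returns a result in [0, size) for any int hash and positive size.
    static int slotWithFloorMod(Object key, int size) {
        return Math.floorMod(key.hashCode(), size);
    }

    // Remainder first, absolute value second: the remainder lies strictly
    // between -size and size, so its absolute value is a valid index.
    static int slotWithAbs(Object key, int size) {
        return Math.abs(key.hashCode() % size);
    }

    public static void main(String[] args) {
        Integer worstCase = Integer.MIN_VALUE; // an Integer's hashCode is its own value
        System.out.println(Math.abs(worstCase.hashCode()) % 10); // prints -8 (not a valid index)
        System.out.println(slotWithAbs(worstCase, 10));          // prints 8
        System.out.println(slotWithFloorMod(worstCase, 10));     // prints 2
    }
}
```

The two safe variants can disagree on which slot a negative hash maps to, which is fine as long as a table uses one of them consistently.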
Task: Test your Chaining class thoroughly. Your tests should cover the methods in the IDictionary
interface, including cases that throw exceptions. Since your class is generic, be sure to use multiple
types in your testing. Additionally, since we are working with a mutable data structure, be sure to
have setup methods. Put your tests in the HashTableTester.java file.
Hint: We only expect you to test the methods that are in the IDictionary interface. That
includes the public abstract methods, insert and delete, that are defined in AbsHashTable (and the
constructor). You do not need to test findKVPair, or any helper methods that you write in order
to write insert and delete. With this in mind, consider the access modifiers that you should put
on fields and methods in your Chaining class.

In Java, you cannot create an array of a generic type. In order to circumvent this limitation, you
should do the following to create your hash table:

this.data = (LinkedList<KVPair<K,V>>[]) new LinkedList[size];

This situation is one of a few exceptions (equals is another) when casting is acceptable; in
general, however, it’s still discouraged.
This line of code will generate a warning that there is an unchecked cast. To get rid of the warning,
above any methods that cast like this, you should write:

@SuppressWarnings("unchecked")

Typically, warnings such as these indicate a problem with your code, so you should not suppress
them; you should pay attention to them! In this specific instance, however, we justify its use as
we are trying to get around a Java limitation.
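
For context, here is a minimal compilable sketch of where the cast and the annotation sit. GenericArraySketch and its nested Pair class are hypothetical stand-ins for Chaining and KVPair; the real field and constructor are determined by the starter code, which is not shown here.

```java
import java.util.LinkedList;

// Hypothetical class used only to show the generic-array workaround in context.
class GenericArraySketch<K, V> {
    // Hypothetical stand-in for the KVPair<K, V> class provided in AbsHashTable.
    static class Pair<A, B> {
        final A key;
        B value;

        Pair(A key, B value) {
            this.key = key;
            this.value = value;
        }
    }

    private final LinkedList<Pair<K, V>>[] data;

    @SuppressWarnings("unchecked")
    GenericArraySketch(int size) {
        // "new LinkedList<Pair<K, V>>[size]" is rejected by the compiler,
        // so create a raw array and cast it to the generic array type.
        this.data = (LinkedList<Pair<K, V>>[]) new LinkedList[size];
    }

    int slotCount() {
        return this.data.length;
    }

    // A freshly created array holds null in every slot, not empty lists.
    boolean isUnset(int slot) {
        return this.data[slot] == null;
    }
}
```

Note that the new array starts out full of null references, so each slot needs a list assigned to it (up front or on first use) before anything can be added there.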

3 Hash Table Iterator

Note: For this problem, you need not submit any Java code. A high-level description of each
algorithm is enough. We recommend doing this portion of the homework in LaTeX, although
we would also accept a PDF. You can find a description of LaTeX at the bottom of the previous
homework. And our template is linked here.


Whenever you implement a collection (like a list or a hashtable) you should override iterator. We
did this successfully for doubly linked lists, and in this problem, you will think about how to do this
for hash tables. You are not being asked to implement the iterator, just to explain how
it would work. (You may assume that the hash table does not change while you are iterating; no
items are inserted or deleted.)
Note: While not part of the assignment, take a moment to think about how you would use an
iterator to write equals and toString for hash tables.
First, let’s think about how an iterator for a chaining hash table might work. As a first attempt,
you could try iterating over all the slots in the hash table: if a slot is empty, you can skip right over
it; but if it is not empty, you would then iterate over the bucket stored at that slot.
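
The first attempt just described might look roughly like the sketch below. It covers the naive approach only, under stated assumptions: NaiveSlotIterator is a hypothetical name, the table is passed in as a List of buckets rather than an array, and an unset slot is represented by either null or an empty list.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of the naive iterator: it visits every slot, empty or not.
class NaiveSlotIterator<T> implements Iterator<T> {
    private final Iterator<List<T>> slotIterator;            // walks all slots in order
    private Iterator<T> bucket = Collections.emptyIterator(); // walks the current bucket

    NaiveSlotIterator(List<List<T>> slots) {
        this.slotIterator = slots.iterator();
    }

    @Override
    public boolean hasNext() {
        // Move past exhausted buckets and empty or unset slots.
        while (!this.bucket.hasNext() && this.slotIterator.hasNext()) {
            List<T> next = this.slotIterator.next();
            if (next != null) {
                this.bucket = next.iterator();
            }
        }
        return this.bucket.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return this.bucket.next();
    }
}
```

Even when only a few items are stored, a full traversal with this iterator still touches every slot once, which is the cost the next task asks you to avoid.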
Task: The iterator we just described would examine all n slots in the chaining hash table, even
though there might not be data stored at all, or even most, of them. Explain how to implement a
more efficient chaining iterator that only examines slots which store data.
Your iterator should not affect the run time of the hash table’s basic operations; that is, the run
times of lookup, insert, update, and delete should not change.
Note: You cannot simply store the keys in Java’s HashSet and iterate through this set. Your
solution should not involve iterating through another HashSet or HashTable (note that these two
data structures would have similar iterators) as this would be using the solution to the problem to
solve the problem!
Hint: Consider augmenting the key-value pairs stored in your dictionary with additional fields.
Task: Discuss the trade-offs between the naive iterator we proposed, and your iterator design.

Please let us know if you find any mistakes, inconsistencies, or confusing language in this or any
other CS18 document by filling out the anonymous feedback form:
https://cs.brown.edu/courses/cs018/feedback.
