After thinking about the problem for a bit, you accept the offer. The recent talk about sorting in class gives you an idea: Assuming the words were represented as Strings in an array, you could sort them so that they were in alphabetical order. Once sorted, duplicate words would be adjacent to each other in the array, and you could make a pass through to count how many times each word appeared.
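To sort Strings you need a way to compare them, but in Java you can't rely on operators like < or == on Strings. Instead, Java provides the compareTo and equals methods. When comparing strings, a negative result from compareTo means that the first string is alphabetically before the second, zero means the two strings are equal, and a positive result means the first string comes after the second. Note that both methods are case sensitive, but that's fine for our purposes. For example: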
> "abc".compareTo("xyz")
-23 (int)
> "abc".compareTo("abc")
0 (int)
> "xyz".compareTo("abc")
23 (int)
> "aardvark".compareTo("anteater")
-13 (int)
> "Abc".compareTo("abc")
-32 (int)
> "Java".equals("Java")
true (boolean)
> "Java".equals("java")
false (boolean)
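In your own code you'll usually test the sign of compareTo rather than look at the exact number it returns. Here is a minimal sketch of that idea. The class StringOrderDemo and the helper comesBefore are hypothetical names made up for this illustration, not part of the assignment's project.

```java
// A minimal sketch: StringOrderDemo and comesBefore are hypothetical names
// used only for illustration; they are not part of the assignment's code.
class StringOrderDemo {

    // Returns true when a is alphabetically before b (case sensitive).
    static boolean comesBefore(String a, String b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        String first = "aardvark";
        String second = "anteater";

        // Only the sign of compareTo matters, not its exact value.
        if (comesBefore(first, second)) {
            System.out.println(first + " comes before " + second);
        }

        // equals compares the text; == only asks whether both variables
        // refer to the very same String object.
        String rebuilt = new String("Java");
        System.out.println(rebuilt.equals("Java")); // prints true
        System.out.println(rebuilt == "Java");      // prints false
    }
}
```

The last two lines show why == is a poor fit for Strings: two Strings can hold the same text and still be different objects.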
The project for this assignment includes the WordGetter class mentioned above. You can use the methods in this class to do more extensive testing of your frequency-finding code. The methods in WordGetter will find all of the words in a file or on a web page, and return them as an array of Strings. Once you open the project, you'll need to do the following:
Create a new class called FrequencyFinder to hold the code you write. Your class won't need any state, just the two methods described below for finding word frequencies.
Copy a sorting method for integer arrays into your FrequencyFinder class. (You could also use the version from Lab 7 if you prefer.) Edit the method so that it takes an array of Strings as its argument, then make the necessary changes to the body of the code so that it compares and swaps Strings instead of integers. For full credit, this method should be private, though you may wish to leave it public until you've tested it thoroughly. A test might look like this:
> FrequencyFinder ff = new FrequencyFinder();
> String[] words = {"zoo", "aardvark", "java", "apple"};
> ff.sortStrings(words);
> words[0]
"aardvark" (String)
> words[1]
"apple" (String)
> words[2]
"java" (String)
> words[3]
"zoo" (String)
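To give a feel for the kind of change involved, here is a rough sketch of a String version of a sort. It assumes selection sort, which is only a guess, since your integer-sorting method may use a different algorithm. The class name StringSortSketch is hypothetical, and the method is static here just so the sketch runs on its own; in your FrequencyFinder it would be an ordinary instance method.

```java
// A sketch only: selection sort is an assumption here, since the assignment
// does not say which sorting algorithm the original integer version uses.
class StringSortSketch {

    // Sorts the array in place into ascending (case-sensitive) order.
    static void sortStrings(String[] words) {
        for (int i = 0; i < words.length - 1; i++) {
            int smallest = i;
            for (int j = i + 1; j < words.length; j++) {
                // compareTo replaces the < used when comparing ints
                if (words[j].compareTo(words[smallest]) < 0) {
                    smallest = j;
                }
            }
            // Swap, using a String temporary instead of an int
            String temp = words[i];
            words[i] = words[smallest];
            words[smallest] = temp;
        }
    }

    public static void main(String[] args) {
        String[] words = {"zoo", "aardvark", "java", "apple"};
        sortStrings(words);
        System.out.println(String.join(" ", words)); // aardvark apple java zoo
    }
}
```

Notice that only two things change from an integer version: the comparison (compareTo instead of <) and the type of the temporary variable used in the swap.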
Write a method called printFrequencies that takes an array of Strings as its input. It should call the string-sorting method you modified in the previous step, then traverse the sorted array and print out frequency information as shown below. (In my output, I printed a tab character ("\t") between each count and the corresponding word, to keep the columns nice and tidy.) Hint: On your final pass through the array, look at adjacent items. Each time you find adjacent items that differ, it's time to print a line of output.
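The adjacent-items hint can be sketched roughly as follows. This is one possible shape, not the required solution. The class FrequencySketch and the helper formatFrequencies are hypothetical names, and Arrays.sort stands in for your own string-sorting method so that the sketch runs by itself. It also sorts a copy of the array rather than the array it was given, which is a choice made for this sketch.

```java
import java.util.Arrays;

// A sketch only: FrequencySketch and formatFrequencies are hypothetical names,
// and Arrays.sort stands in for the assignment's own string-sorting method.
class FrequencySketch {

    // Builds one "count<TAB>word" line per distinct word.
    static String formatFrequencies(String[] words) {
        String[] sorted = words.clone();   // leave the caller's array alone
        Arrays.sort(sorted);               // stand-in for sortStrings

        StringBuilder out = new StringBuilder();
        int count = 1;
        for (int i = 0; i < sorted.length; i++) {
            // A run of equal words ends at the last item, or wherever
            // the next item differs from the current one.
            boolean lastOfRun = (i == sorted.length - 1)
                    || !sorted[i].equals(sorted[i + 1]);
            if (lastOfRun) {
                out.append(count).append("\t").append(sorted[i]).append("\n");
                count = 1;                 // start counting the next word
            } else {
                count++;
            }
        }
        return out.toString();
    }

    static void printFrequencies(String[] words) {
        System.out.print(formatFrequencies(words));
    }

    public static void main(String[] args) {
        printFrequencies(new String[] {"hello", "world", "hello"});
    }
}
```

The check for the last item matters: without it, the final word in the array would never get its line of output.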
The examples below illustrate the correct output for various inputs. The final example prints nearly 200 lines of output, but I'm only showing the last 40 or so to keep the assignment page to a manageable length. The full list of output is here. Some of the "words" don't look much like English, but that's because the method that retrieves them from the web page grabs HTML formatting commands along with the page's text.
> FrequencyFinder ff = new FrequencyFinder();
> String[] word = {"hello"};
> ff.printFrequencies(word);
1 hello
> String[] words = {"hello", "world", "hello"};
> ff.printFrequencies(words);
2 hello
1 world
> String[] words2 = {"hello", "world", "Hello"};
> ff.printFrequencies(words2);
1 Hello
1 hello
1 world
> WordGetter g = new WordGetter();
> g.fromURL("http://www.cs.ups.edu").length
198 (int)
> ff.printFrequencies(g.fromURL("http://www.cs.ups.edu"));
[This is just the end of the output]
2 like
1 links
1 looked
1 looks
1 maintained
1 majors
1 math
1 more
1 occupies
7 of
1 offer
1 offered
1 on
1 other
1 our
1 pages
1 photo
1 programming
1 programs
1 renovation
1 right
1 sc_invisible=1;
1 sc_partition=7;
1 sc_project=874859;
1 sc_security="4855c8f0";
1 semester.
1 specifics
1 statistics,
1 students
12 the
1 through
6 to
1 topics.
1 tower,
1 ups
1 use
4 var
2 what
2 with
1 year
1 year,
In the examples above, "Hello" is considered to be different from "hello". One way to fix this would be to convert all strings in the array to lower case before you sort and process them. (See toLowerCase in the String class.) You could also use case-insensitive String methods like compareToIgnoreCase in your code instead of the basic case-sensitive versions.
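As a rough illustration of the first approach, the sketch below lower-cases every word before any sorting or counting happens. The class CaseFoldSketch and the helper toLowerCaseAll are hypothetical names invented for this example.

```java
// A sketch only: CaseFoldSketch and toLowerCaseAll are hypothetical names.
class CaseFoldSketch {

    // Returns a new array holding the lower-case form of each word.
    static String[] toLowerCaseAll(String[] words) {
        String[] lowered = new String[words.length];
        for (int i = 0; i < words.length; i++) {
            lowered[i] = words[i].toLowerCase();
        }
        return lowered;
    }

    public static void main(String[] args) {
        String[] words = toLowerCaseAll(new String[] {"hello", "world", "Hello"});
        System.out.println(String.join(" ", words)); // hello world hello

        // The second approach: compare without regard to case.
        System.out.println("Hello".compareToIgnoreCase("hello")); // 0
    }
}
```

With the first approach the output always shows the lower-case spelling. With compareToIgnoreCase the original spellings stay in the array, so you'd need to decide which spelling to print for each group.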