Assignment III

Due date: Oct. 14, 2003

In this assignment we will parse a string into tokens.

When manipulating text, one of the most common operations is to break it into words. We define words as contiguous sequences of characters separated by blanks and punctuation marks. We consider the period, comma, colon, semicolon, question mark, and exclamation mark to be punctuation marks. For simplicity, you can assume that the text contains only alphabetic characters, blanks, and the marks mentioned above (no digits or other symbols).
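To make the definition concrete, here is a minimal sketch of one way to scan a string character by character and collect the words. The class name WordSplitter and the method split are made up for this illustration, and the sketch gathers the words in a List for brevity; your program still has to produce an array.

```java
import java.util.ArrayList;
import java.util.List;

public class WordSplitter {
    // Separators named in the assignment: the blank plus the six punctuation marks.
    private static final String SEPARATORS = " .,:;?!";

    // Returns the words of text in the order in which they appear.
    public static List<String> split(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SEPARATORS.indexOf(c) >= 0) {
                // A separator ends the current word, if one is in progress.
                if (current.length() > 0) {
                    words.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // The text may end without a trailing separator.
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [Hello, world, is, anyone, there]
        System.out.println(split("Hello, world: is anyone there?"));
    }
}
```

Note that several separators in a row (for example a comma followed by a blank) do not produce empty words, because a word is only stored when at least one character has been collected.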

Input to the program is a single string (one line) consisting of several words (tokens). Your program must create an array of these words, sort the words in alphabetical order, and print the contents of this array. To sort, use the algorithm that we presented in the testArray program, and compare the strings ignoring upper and lower case. If the string contains a repeated word, it should appear only once in the array.
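The following sketch shows the two steps on an array of words that has already been extracted. It rests on two assumptions that you should check against the course material: the testArray algorithm is not reproduced in this handout, so a plain selection sort stands in for it here, and two words that differ only in case (such as "the" and "The") are treated as the same word, keeping the spelling that appears first. The names WordSorter, removeDuplicates, and sort are invented for the example.

```java
import java.util.Arrays;

public class WordSorter {
    // Copies words into a new array, skipping any word already seen (ignoring case).
    public static String[] removeDuplicates(String[] words) {
        String[] unique = new String[words.length];
        int count = 0;
        for (String w : words) {
            boolean seen = false;
            for (int i = 0; i < count; i++) {
                if (unique[i].equalsIgnoreCase(w)) {
                    seen = true;
                    break;
                }
            }
            if (!seen) {
                unique[count++] = w;
            }
        }
        return Arrays.copyOf(unique, count);
    }

    // Selection sort in place (a stand-in for the testArray algorithm),
    // comparing strings without regard to case.
    public static void sort(String[] words) {
        for (int i = 0; i < words.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < words.length; j++) {
                if (words[j].compareToIgnoreCase(words[min]) < 0) {
                    min = j;
                }
            }
            String tmp = words[i];
            words[i] = words[min];
            words[min] = tmp;
        }
    }

    public static void main(String[] args) {
        String[] words = removeDuplicates(new String[] {"the", "Cat", "saw", "The", "apple"});
        sort(words);
        System.out.println(Arrays.toString(words)); // prints [apple, Cat, saw, the]
    }
}
```

Duplicates are removed before sorting so that the spelling kept is the first one in the input; selection sort does not preserve the original order of equal elements.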

You can present the input to the program either as a single command line parameter (in quotes) or using a dialog box, as we showed in class today.
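As a rough sketch of how the two input options can coexist, the method below uses the command line parameter when one is given and falls back to a dialog box otherwise. The dialog call shown is JOptionPane.showInputDialog from javax.swing, which may or may not be the one we used in class; the names InputReader and readInput are made up for the example.

```java
import javax.swing.JOptionPane;

public class InputReader {
    // Returns the first command line parameter if there is one;
    // otherwise asks the user for a line of text in a dialog box.
    public static String readInput(String[] args) {
        if (args.length > 0) {
            return args[0];
        }
        String typed = JOptionPane.showInputDialog("Enter a line of text:");
        // showInputDialog returns null when the user cancels the dialog.
        return typed == null ? "" : typed;
    }

    public static void main(String[] args) {
        System.out.println("Input: " + readInput(args));
    }
}
```

For example, running java InputReader "Hello, world" prints the quoted text, while running java InputReader with no parameter opens the dialog box.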

The task of breaking up a string into tokens is so common that Java provides a class to do exactly that: StringTokenizer (in the java.util package). Of course, the purpose of this assignment is for you to program the functionality of this class yourself.
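For reference only, here is a short sketch of how StringTokenizer can be used with the separators defined above. It may be handy for checking the words your own code produces, but it is not a substitute for writing the tokenizing logic. The class name TokenizerCheck and the method tokens are invented for this example.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

public class TokenizerCheck {
    // Splits text on the blank and the six punctuation marks using the library class.
    public static String[] tokens(String text) {
        StringTokenizer st = new StringTokenizer(text, " .,:;?!");
        String[] result = new String[st.countTokens()];
        for (int i = 0; i < result.length; i++) {
            result[i] = st.nextToken();
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [Hello, world, is, anyone, there]
        System.out.println(Arrays.toString(tokens("Hello, world: is anyone there?")));
    }
}
```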