Computer Science & Software Engineering
CSCI 135
In this assignment, you will be building a program that stores and searches a dictionary of words. The program also speaks words using a library of recorded WAV files (for most of the 5K words anyway). You will gain practice building a common class design pattern in which one class holds a collection of other objects.
The first column is the word. The second column is the probability of the word under a language model trained on a large corpus of English. The third column is optional; when present, it contains the directory and filename of the audio for the word. For example:

    him 5.60991E-04 word_audio/him.wav
    across 5.59660E-04 word_audio/across.wav
    died 5.58570E-04
    often 5.58313E-04 word_audio/often.wav
    problem 5.58185E-04 word_audio/problem.wav
    that's 5.58137E-04 word_audio/that_s.wav
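One way to read such a line is to split it on whitespace and treat a missing third column as an empty filename. The sketch below assumes the columns are whitespace-separated, as they appear in the sample above. DictLineDemo and ParsedLine are hypothetical names used only for illustration; they are not part of the assignment's classes.

```java
final class DictLineDemo {
    /** Hypothetical holder for one parsed line (illustration only). */
    static final class ParsedLine {
        final String word;
        final double prob;
        final String audio; // "" when the line has no third column

        ParsedLine(String word, double prob, String audio) {
            this.word = word;
            this.prob = prob;
            this.audio = audio;
        }
    }

    /** Splits a line into word, probability, and optional audio filename. */
    static ParsedLine parse(String line) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length < 2) {
            throw new IllegalArgumentException("expected at least two columns: " + line);
        }
        // The third column is optional, so fall back to an empty string.
        String audio = (cols.length >= 3) ? cols[2] : "";
        return new ParsedLine(cols[0], Double.parseDouble(cols[1]), audio);
    }

    public static void main(String[] args) {
        ParsedLine withAudio = parse("him 5.60991E-04 word_audio/him.wav");
        ParsedLine noAudio = parse("died 5.58570E-04");
        System.out.println(withAudio.word + " " + withAudio.prob + " " + withAudio.audio);
        System.out.println(noAudio.word + " " + noAudio.prob + " [" + noAudio.audio + "]");
    }
}
```

Using an empty string for a missing filename lines up with the convention in the Dict API, where getAudioFilename returns "" when there is no audio file.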
The program first reads in the data in the dictionary and outputs the total number of entries. It then goes through each command line argument, finding the word that matches the pattern and has the highest probability. The program prints one line of output per command line argument. Each output line has four columns: the pattern, the matching word, its probability, and its audio filename.

    % java SpeakText it w.. t.e .est of ti.+ < dict5k.txt
    Read in 5000 dictionary entries.
    it it 0.00255839 word_audio/it.wav
    w.. was 0.00676185 word_audio/was.wav
    t.e the 0.00534691 word_audio/the.wav
    .est test 3.81697E-4 word_audio/test.wav
    of of 0.00647559 word_audio/of.wav
    ti.+ time 7.04731E-4 word_audio/time.wav
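The search described above can be sketched as a single pass that remembers the most probable matching word seen so far. This sketch assumes the patterns are Java regular expressions that must match the whole word (String.matches behaves that way), which fits the sample run but is not stated outright. BestMatchDemo and its map-based storage are hypothetical; the words and probabilities are copied from the sample runs on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BestMatchDemo {
    /** Returns the highest-probability word fully matching the regex, or "" if none matches. */
    static String bestMatch(Map<String, Double> probs, String pattern) {
        String best = "";
        double bestProb = -1.0; // lower than any real probability
        for (Map.Entry<String, Double> e : probs.entrySet()) {
            if (e.getKey().matches(pattern) && e.getValue() > bestProb) {
                best = e.getKey();
                bestProb = e.getValue();
            }
        }
        return best;
    }

    /** A tiny stand-in dictionary using values from the sample output. */
    static Map<String, Double> sample() {
        Map<String, Double> m = new LinkedHashMap<>();
        m.put("was", 0.00676185);
        m.put("the", 0.00534691);
        m.put("best", 3.60753E-4);
        m.put("test", 3.81697E-4);
        m.put("time", 7.04731E-4);
        m.put("times", 2.42175E-4);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Double> dict = sample();
        for (String pattern : new String[] {"w..", "t.e", ".est", "ti.+"}) {
            System.out.println(pattern + " " + bestMatch(dict, pattern));
        }
    }
}
```

Both "test" and "best" match .est, and "test" wins because its probability is higher, which agrees with the sample run.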
Classes and APIs. You should start by downloading the file speakdict.zip. This file contains recorded audio files for almost four thousand words, so it is a bit big! Unzip the file into the directory where you plan to develop your Java programs. This time, you only get bare-bones versions of the three classes you will be developing.

A run with no command line arguments looks like this:

    % java SpeakText < dict5k.txt
    Read in 5000 dictionary entries.
    deal religion can delivered wouldn't lot ...
Here is the API you should implement in Dict.java:

    public class Dict
    ------------------------------------------------------------------------------------------------------------------------------
    public int size()                                           // how many dictionary entries there are
    public void add(String word, double prob)                   // add a new entry given a word and a probability
    public void add(String word, double prob, String filename)  // add a new entry given a word, probability, and audio filename
    public String getAudioFilename(String word)                 // given a word, return audio filename, "" if word not found or no audio file
    public double getProb(String word)                          // given a word, return probability, 0.0 if word not found
    public String matchPattern(String pattern)                  // find a word matching this pattern, "" if no match
    public String getRandomWord()                               // draw a random word based on the probabilities
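Drawing a random word based on the probabilities, as getRandomWord is described, can be done by walking a running total until it passes a random target. The sketch below is one possible approach, not necessarily the intended one. It normalizes by the sum of the probabilities, since this page does not say whether the dictionary's probabilities add up to 1. WeightedPickDemo and pick are hypothetical names, and the parallel arrays stand in for whatever collection the Dict class actually uses.

```java
import java.util.Random;

final class WeightedPickDemo {
    /**
     * Picks a word in proportion to its probability.
     * u is a uniform random number in [0, 1); passing it in keeps the method testable.
     */
    static String pick(String[] words, double[] probs, double u) {
        double total = 0.0;
        for (double p : probs) {
            total += p;
        }
        double target = u * total; // scale u in case the probabilities do not sum to 1
        double running = 0.0;
        for (int i = 0; i < words.length; i++) {
            running += probs[i];
            if (target < running) {
                return words[i];
            }
        }
        // Guard against floating-point rounding; "" for an empty dictionary.
        return words.length > 0 ? words[words.length - 1] : "";
    }

    public static void main(String[] args) {
        String[] words = {"was", "the", "time"};
        double[] probs = {0.00676185, 0.00534691, 7.04731E-4};
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(pick(words, probs, rng.nextDouble()));
        }
    }
}
```

With this scheme, a word whose probability is twice that of another is drawn about twice as often over many draws.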
The DictEntry class has this API:

    public class DictEntry
    ------------------------------------------------------------------------------------------------------------------------------
    public DictEntry(String word, double prob)                   // create a new entry given a word and probability
    public DictEntry(String word, double prob, String filename)  // create a new entry given a word, probability, and audio filename
    public String getAudio()                                     // getter for audio filename
    public String getWord()                                      // getter for the word
    public double getProb()                                      // getter for the probability
    public boolean matchPattern(String pattern)                  // does this word match the given pattern?

Audio playback. For this program, we need to play a series of audio files, waiting for each to finish before the next is played. StdAudio cannot do this (plus it seems to crash on Linux). We will instead use a new class, AudioFile.java. You'll need to place this file in your working directory. AudioFile has normal instance methods (not static methods like StdAudio). You will need to create an object using the filename as the parameter to the constructor. For this assignment, you want to use the playBlocking() method. Here is an example of using the AudioFile class:
    AudioFile sound = new AudioFile("word_audio/quick.wav");
    sound.playBlocking();

Example run of SpeakT9:

    % java SpeakT9 48 927 843 2378 63 84637 < dict5k.txt
    Read in 5000 dictionary entries.
    48 it 0.00255839 word_audio/it.wav
    927 was 0.00676185 word_audio/was.wav
    843 the 0.00534691 word_audio/the.wav
    2378 best 3.60753E-4 word_audio/best.wav
    63 of 0.00647559 word_audio/of.wav
    84637 times 2.42175E-4 word_audio/times.wav
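The SpeakT9 run suggests that each argument is a sequence of phone keypad digits and that the program finds the most probable word spelled by those keys. This page does not spell out the mapping, so the sketch below assumes the standard keypad layout (2 = abc, 3 = def, and so on), which agrees with every line of the sample output. T9Demo and toDigits are hypothetical names, and skipping characters that are not on the keypad, such as the apostrophe in "that's", is an assumption rather than a stated rule.

```java
final class T9Demo {
    // Standard phone keypad letter groups for digits 2 through 9 (assumed layout).
    private static final String[] KEYS = {
        "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"
    };

    /** Converts a word to its keypad digit string; characters not on the keypad are skipped. */
    static String toDigits(String word) {
        StringBuilder digits = new StringBuilder();
        for (char c : word.toLowerCase().toCharArray()) {
            for (int k = 0; k < KEYS.length; k++) {
                if (KEYS[k].indexOf(c) >= 0) {
                    digits.append((char) ('2' + k)); // KEYS[0] belongs to digit 2
                    break;
                }
            }
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        for (String word : new String[] {"it", "was", "the", "best", "of", "times"}) {
            System.out.println(toDigits(word) + " " + word);
        }
    }
}
```

A word then matches a digit argument when its digit string equals that argument, so the same highest-probability search used for patterns could be reused with this comparison in place of the pattern match.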
Page last updated: August 16, 2012