Processing SMS from the database / Testing
Task 12 – Completed
Next, find synonyms for the selected words. Synonyms are found by posting each word to The Free Dictionary website (http://www.thefreedictionary.com/) and analyzing the HTML response. Again, calculate the tf-idf weight of each word against the database. The synonyms with the highest tf-idf weights are selected. The data is stored in the “sms_synonym” table.
Fig 1. sms_synonym table
Find poetry lines in the database where the selected synonym is used in the same context as in the SMS. Select the final poetry line that maximizes the tf weight and minimizes the emotional weight difference to the user's SMS. The result is stored in the “sms_poem_line” table.
Fig 2. sms_poem_line table
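The final selection step can be sketched as follows. This is only an illustration: the report does not say how the two criteria are combined, so the sketch assumes the tf weight is compared first and the emotional weight difference is used to break ties. The `Candidate` class, its field names, and `select_line` are hypothetical and are not taken from the project code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical record for one candidate poem line."""
    line: str
    tf_weight: float   # tf weight of the selected synonym in this line
    emotion: float     # emotional weight assigned to this line


def select_line(candidates: List[Candidate], sms_emotion: float) -> Optional[Candidate]:
    """Pick the candidate with the highest tf weight.

    Assumption: ties on tf weight are broken by the smallest absolute
    difference between the line's emotional weight and the SMS's.
    Returns None when there are no candidates.
    """
    return min(
        candidates,
        key=lambda c: (-c.tf_weight, abs(c.emotion - sms_emotion)),
        default=None,
    )
```

A weighted sum of the two criteria would be an equally plausible reading; the lexicographic ordering is used here only because it needs no extra tuning parameter.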
Poetry Selection
Given a query of several words, the goal is to calculate the weight $w_{i,d}$ of each query word $i$ in every poem line $d$:
$$w_{i,d} = tf_{i,d} \times \log\left(\frac{n}{df_{i}}\right) \qquad (1)$$
where $tf_{i,d}$ is the term frequency of the $i^{th}$ word in poem line $d$, $n$ is the total number of poem lines, and $df_{i}$ is the document frequency of the $i^{th}$ word, that is, the number of poem lines that contain it. For each word $i$, the system then returns the poem lines such that $\sum w_{i,d}$ is maximized.
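A minimal sketch of this weighting, assuming the standard tf-idf product $tf_{i,d} \times \log(n / df_{i})$ with a natural logarithm and simple whitespace tokenization. The function names `tfidf_scores` and `best_line` are hypothetical, and the real system reads poem lines from the database rather than from a Python list.

```python
import math
from collections import Counter
from typing import List, Tuple


def tfidf_scores(query_words: List[str], poem_lines: List[str]) -> List[float]:
    """Return, for each poem line, the summed tf-idf weight of the query words."""
    tokenized = [line.lower().split() for line in poem_lines]
    n = len(tokenized)

    # df[word] = number of poem lines that contain the word at least once
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)  # term frequencies within this poem line
        score = 0.0
        for word in query_words:
            word = word.lower()
            if df[word] > 0:  # words absent from every line contribute nothing
                score += tf[word] * math.log(n / df[word])
        scores.append(score)
    return scores


def best_line(query_words: List[str], poem_lines: List[str]) -> Tuple[str, float]:
    """Return the poem line with the highest summed weight, and that weight."""
    scores = tfidf_scores(query_words, poem_lines)
    best = max(range(len(poem_lines)), key=scores.__getitem__)
    return poem_lines[best], scores[best]
```

For example, with the lines "the rose is red", "the sky is blue" and "red red wine", the query word "red" scores highest on the third line, because it occurs there twice while its document frequency is the same for every line.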
Fig 3. Poetry selection data flow diagram
Revised Project Plan
Fig 4. New project plan
Changes to be made
- GUI to edit config files
- Options to enable polling results, display polling results, and choose how polling results are displayed (pie chart, bar chart, etc.)
- Back up "sms_log" data to text files
- Config file editor
- Use an offline dictionary to find synonyms instead of connecting to the Internet
- Remove the SMS maximum length restriction; it should be handled by the display application
- Documentation