Date of Submission

Spring 2012

Academic Program

Mathematics

Project Advisor 1

Sam Hsiao

Abstract/Artist's Statement

As an academic field of study, information retrieval is the activity of finding useful information in a collection stored on computers. People engage in information retrieval when they submit a query of one or more keywords to a web search engine, which returns the relevant documents. Several information retrieval systems are currently in use. This senior project explores the mathematical mechanism underlying the retrieval technique known as latent semantic indexing. The data set consists of 530 Twitter tweets containing 282 keywords. I use matrix decomposition to construct a low-rank approximation to the term-document matrix, and then examine how this low-rank approximation method applies to indexing and retrieving documents. I use precision, recall, and F-measure to compare the method in the rank 2 and rank 3 cases.
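The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the matrix decomposition is a truncated singular value decomposition (the usual choice for latent semantic indexing, though the abstract does not name it) and that documents are ranked by cosine similarity in the reduced space. The function names, the toy 4-term by 4-document matrix, and the 0.5 similarity threshold are all hypothetical and are not taken from the project or its tweet data.

```python
import numpy as np

def low_rank_approximation(A, k):
    """Return the rank-k truncated SVD factors (U_k, s_k, Vt_k) of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def rank_documents(A, query, k):
    """Cosine similarity between a query and every document in the
    k-dimensional latent space (assumed scoring rule, not the author's)."""
    U_k, s_k, Vt_k = low_rank_approximation(A, k)
    docs = (np.diag(s_k) @ Vt_k).T      # one row per document
    q = U_k.T @ query                   # query projected into the same space
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q)
    norms[norms == 0] = 1.0             # avoid division by zero
    return docs @ q / norms

def precision_recall_f(retrieved, relevant):
    """Precision, recall, and F-measure (harmonic mean of the two)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Hypothetical toy term-document matrix: rows are terms, columns are documents.
# Terms: cat, dog, stock, market. Only document 0 contains "dog".
A = np.array([[2, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 1, 1]], dtype=float)

query = np.array([0, 1, 0, 0], dtype=float)       # the single keyword "dog"
scores = rank_documents(A, query, k=2)
retrieved = [int(i) for i in np.where(scores > 0.5)[0]]  # hypothetical threshold
print(retrieved, precision_recall_f(retrieved, relevant=[0, 1]))
```

In this toy case the rank 2 space groups the two animal documents together, so a query for "dog" should also retrieve the document that mentions only "cat". Changing `k` to 3 and recomputing the three measures mirrors the rank 2 versus rank 3 comparison described above, under the same assumptions.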

Distribution Options

Access restricted to On-Campus only

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
