Date of Submission

Spring 2012

Academic Program

Mathematics

Project Advisor 1

Sam Hsiao

Abstract/Artist's Statement

As an academic field, information retrieval is the activity of finding useful information in a collection stored on computers. People engage in information retrieval when they submit a query of one or more keywords to a web search engine and receive relevant documents in return. Several information retrieval systems are currently in use. This senior project explores the mathematics underlying the retrieval technique known as latent semantic indexing. It studies a collection of 530 Twitter tweets containing 282 keywords. I use matrix decomposition to construct a low-rank approximation to the term-document matrix, then examine how this approximation can be applied to indexing and retrieving documents. I use precision, recall, and F-measure to compare the method in the rank 2 and rank 3 cases.
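The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the matrix decomposition is the singular value decomposition (the usual choice for latent semantic indexing, though the abstract does not name it) and that documents are ranked by cosine similarity in the reduced space. The function names, the 0.5 similarity threshold, and the 4-term, 5-document toy matrix are hypothetical and are not taken from the project's tweet data.

```python
import numpy as np


def truncated_svd(A, k):
    """Return the factors (U_k, s_k, Vt_k) of the rank-k truncated SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def lsi_similarities(A, query, k):
    """Cosine similarity between a query and every document in the rank-k latent space.

    A has one row per term and one column per document (assumed layout).
    """
    U_k, s_k, Vt_k = truncated_svd(A, k)
    docs = (s_k[:, None] * Vt_k).T      # one row of latent coordinates per document
    q = U_k.T @ query                   # query projected onto the same k axes
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q)
    norms = np.where(norms == 0, 1.0, norms)  # avoid division by zero
    return docs @ q / norms


def precision_recall_f(retrieved, relevant):
    """Precision, recall, and F-measure (harmonic mean) for two sets of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy term-document matrix (hypothetical data): rows are the terms
# "matrix", "rank", "coffee", "tea"; columns are five short documents.
A = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

query = np.array([1.0, 0.0, 0.0, 0.0])   # the single keyword "matrix"
sims = lsi_similarities(A, query, k=2)
retrieved = {j for j, s in enumerate(sims) if s > 0.5}  # threshold is an assumption
relevant = {0, 1, 2}                     # hand-labelled for the toy example
print(retrieved, precision_recall_f(retrieved, relevant))
```

In this toy example, document 2 contains only the term "rank", yet the rank 2 approximation places it alongside the documents that contain "matrix". That is the behavior latent semantic indexing is meant to provide. Repeating the call with `k=3` and comparing the three scores mirrors the rank 2 versus rank 3 comparison described above.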

Distribution Options

Access restricted to On-Campus only

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.

This work is protected by a Creative Commons license. Any use not permitted under that license is prohibited.

