Date of Submission
Spring 2012
Academic Program
Mathematics
Project Advisor 1
Sam Hsiao
Abstract/Artist's Statement
As an academic field, information retrieval is the activity of finding useful information in a collection stored on computers. People engage in information retrieval when they submit a query of one or more keywords to a web search engine, which returns relevant documents. Several information retrieval systems are currently in use. This senior project explores the mathematical mechanism underlying the retrieval technique known as latent semantic indexing. It studies a collection of 530 tweets from Twitter, which together contain 282 keywords. I use matrix decomposition to construct a low-rank approximation to the term-document matrix, and then examine how this low-rank approximation applies to indexing and retrieving documents. I use precision, recall, and F-measure to compare the method in the rank 2 and rank 3 cases.
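The pipeline summarized in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the project: the abstract says only "matrix decomposition", and the sketch uses the singular value decomposition, the usual choice for latent semantic indexing; documents are ranked by cosine similarity in the reduced space; F-measure is taken to be the balanced F1 score; and the 5-term, 4-document matrix, the query, the 0.5 threshold, and the relevance judgments are hypothetical toy values, not the 282-keyword, 530-tweet data.

```python
import numpy as np


def truncated_svd(A, k):
    """Return the rank-k SVD factors (U_k, s_k, Vt_k) of a term-document matrix A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def rank_documents(A, query, k):
    """Cosine similarity between a query and every document in the rank-k space."""
    U_k, s_k, Vt_k = truncated_svd(A, k)
    docs = s_k[:, None] * Vt_k          # column j holds document j's k coordinates
    q = U_k.T @ query                   # project the query onto the same k axes
    denom = np.linalg.norm(docs, axis=0) * np.linalg.norm(q)
    denom = np.where(denom == 0, 1.0, denom)  # guard against zero vectors
    return (q @ docs) / denom


def precision_recall_f(retrieved, relevant):
    """Precision, recall, and balanced F-measure (F1) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy data: rows are terms, columns are documents.
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)
query = np.array([0, 1, 0, 0, 0], dtype=float)  # query contains term 1 only

scores = rank_documents(A, query, k=2)
retrieved = [j for j, score in enumerate(scores) if score > 0.5]  # assumed threshold
relevant = [0, 1]                                # assumed relevance judgments
print(retrieved, precision_recall_f(retrieved, relevant))
```

In this toy example, document 1 is retrieved at rank 2 even though it does not contain the query term, because it shares a term with document 0. Changing `k` and recomputing the three measures gives the kind of rank 2 versus rank 3 comparison the abstract describes.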
Distribution Options
Access restricted to On-Campus only
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
Recommended Citation
Du, Siyao, "A Linear Algebraic Approach to Information Retrieval" (2012). Senior Projects Spring 2012. 232.
https://digitalcommons.bard.edu/senproj_s2012/232
This work is protected by a Creative Commons license. Any use not permitted under that license is prohibited.