Date of Submission
Spring 2011
Academic Program
Computer Science
Advisor
Sven Anderson, Rebecca Thomas
Abstract/Artist's Statement
This project clusters documents drawn from predefined classes. The clustering uses non-negative matrix factorization, computed by an approximation method called rank-one residue iterations. To apply this method, the number of clusters and the cluster sparsity have to be chosen. Normalized mutual information measures how well a clustering represents the original class structure, and this measure is used to find the optimal number of clusters and sparsity.
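To make the evaluation measure concrete, the following is a minimal sketch of normalized mutual information between known class labels and cluster assignments. It is not taken from the project: the function names are hypothetical, and normalizing by the geometric mean of the two entropies is an assumption, since the abstract does not say which normalization is used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(classes, clusters):
    """Mutual information between class labels and cluster assignments."""
    n = len(classes)
    joint = Counter(zip(classes, clusters))
    class_counts = Counter(classes)
    cluster_counts = Counter(clusters)
    mi = 0.0
    for (k, c), n_kc in joint.items():
        # p(k, c) * log( p(k, c) / (p(k) * p(c)) )
        mi += (n_kc / n) * math.log(n_kc * n / (class_counts[k] * cluster_counts[c]))
    return mi


def normalized_mutual_information(classes, clusters):
    """NMI, normalized by sqrt(H(classes) * H(clusters)).

    Assumption: geometric-mean normalization; other normalizations exist.
    Returns 0.0 when either labeling has zero entropy (a single group).
    """
    if len(classes) != len(clusters):
        raise ValueError("classes and clusters must have the same length")
    h_classes = entropy(classes)
    h_clusters = entropy(clusters)
    if h_classes == 0.0 or h_clusters == 0.0:
        return 0.0
    return mutual_information(classes, clusters) / math.sqrt(h_classes * h_clusters)
```

Under this definition the score is 1 when the clusters reproduce the classes up to relabeling and 0 when the two labelings are statistically independent. One way to use it, in the spirit of the abstract, would be to run the clustering for each candidate number of clusters and sparsity level and keep the setting with the highest score.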
Distribution Options
Access restricted to On-Campus only
Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.
Recommended Citation
Tsikhanovich, Maksim, "An information theoretic approach to determining sparsity in clustering classified documents" (2011). Senior Projects Spring 2011. 85.
https://digitalcommons.bard.edu/senproj_s2011/85
This work is protected by a Creative Commons license. Any use not permitted under that license is prohibited.