Date of Submission
Spring 2022
Academic Program
Computer Science
Project Advisor 1
Kerri-Ann Norton
Project Advisor 2
Sven Anderson
Abstract/Artist's Statement
This project measures the similarity between the content of a page and its title, using cosine similarity computed from a word2vec model. Frequent words and randomly chosen words from each article were analyzed and compared against the title using three samples. Frequent words were found to have a higher similarity score with the title than random words, which suggests that word frequency helps identify the most relevant keywords on a page. The broader goal of the project is to develop a keyword suggestion tool; identifying which keywords are most relevant when writing content is the first step.
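The comparison described in the abstract can be illustrated with a minimal sketch. It is not the project's code: the function names, the tiny two-dimensional vectors in TOY_EMBEDDINGS, and the choice to average word vectors before comparing them are all assumptions made for illustration. A real run would look up vectors in a trained word2vec model instead of a hand-written dictionary.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_u * norm_v)


def top_frequent_words(tokens, k):
    """Return the k most frequent tokens, most frequent first."""
    return [word for word, _ in Counter(tokens).most_common(k)]


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding.

    Averaging is an assumption of this sketch, not a detail taken
    from the project. Returns None if no word has an embedding.
    """
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical 2-dimensional embeddings standing in for word2vec vectors.
TOY_EMBEDDINGS = {
    "keyword": [1.0, 0.0],
    "title": [0.8, 0.6],
    "page": [0.0, 1.0],
}

article_tokens = ["keyword", "keyword", "keyword", "title", "title", "page"]
title_tokens = ["keyword", "title"]

frequent = top_frequent_words(article_tokens, 2)
score = cosine_similarity(
    mean_vector(frequent, TOY_EMBEDDINGS),
    mean_vector(title_tokens, TOY_EMBEDDINGS),
)
```

The same score can be computed for a random sample of article words (for example with random.sample) to reproduce the frequent-versus-random comparison on real data.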
Open Access Agreement
Open Access
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Recommended Citation
Detchou, Yabo Ornella, "Identifying The Relationship Between Page Content and Title" (2022). Senior Projects Spring 2022. 126.
https://digitalcommons.bard.edu/senproj_s2022/126
This work is protected by a Creative Commons license. Any use not permitted under that license is prohibited.