Date of Submission
Spring 2022
Academic Program
Computer Science
Project Advisor 1
Sven Anderson
Abstract/Artist's Statement
While great strides have been made in natural language processing (NLP) over the last few decades, there has been a notable lack of research into applying NLP to fiction. This project seeks to address this gap by applying NLP techniques to the summarization of European fairy tales. This subgenre of fiction is an appropriate starting point because of its archetypal characters and relatively simple story arcs. My approach is to extract the main characters of each text, along with key descriptors: the adjectives that modify them and the verbs describing the actions they take part in. Through this method, I suggest how we may sort characters into Proppian archetypes by tracking their probabilistic association with certain linguistic occurrences. This classification schema in turn makes possible the broader classification of fairy tales into types. The model has an overall F1 score of 0.77; its individual parts have F1 scores of 0.89, 0.75, and 0.66 for character retrieval, adjective extraction, and verb extraction, respectively. The project can also be extended, as it lays key groundwork for further automating the categorization of characters and, ultimately, of stories themselves.
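For readers unfamiliar with the metric, the following minimal Python sketch shows how an F1 score is defined and how the three component scores relate to the overall figure. The abstract does not say how the overall score was aggregated; the unweighted mean used here is an assumption that happens to agree with the reported 0.77 after rounding. The names `f1_score`, `component_f1`, and `macro_f1` are illustrative and do not come from the project.

```python
from statistics import mean


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Component F1 scores as reported in the abstract.
component_f1 = {
    "character retrieval": 0.89,
    "adjective extraction": 0.75,
    "verb extraction": 0.66,
}

# Assumption: the overall score is the unweighted (macro) mean of the
# three component scores. The abstract does not confirm this.
macro_f1 = mean(list(component_f1.values()))

print(f"macro-averaged F1: {macro_f1:.2f}")
```

Under this assumption the three components average to about 0.767, which rounds to the overall score given in the abstract. A weighted average or a separately evaluated end-to-end score could give the same rounded value, so this is a consistency check and not a confirmation of the method used.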
Open Access Agreement
Open Access
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Recommended Citation
Ostrow, Ruby Alling, "Heroes, Villains, and the In-Between: A Natural Language Processing Approach to Fairy Tales" (2022). Senior Projects Spring 2022. 275.
https://digitalcommons.bard.edu/senproj_s2022/275
This work is protected by a Creative Commons license. Any use not permitted under that license is prohibited.