Uncovering Fairytales Using A.I.

By: Gavin Butts

In recent years, the rapid development of machine learning and artificial intelligence has led to their widespread adoption in many areas of society. From voice recognition technology to the creation of the image above, A.I. has become an integral part of our daily lives. As A.I. continues to advance, it is natural to consider its potential applications in more fundamental areas of human experience, such as storytelling. Fairy tales, with their fantastical and unpredictable narratives, have long been a beloved part of human culture (Warner 309-311). These tales, which have been passed down through the generations, often feature magical elements and larger-than-life characters, making them a unique and fascinating test case for the capabilities of A.I. (Lüthi 375-379). By analyzing their unique idiomatic patterns, we can attempt to create an A.I. that predicts whether a given text is a fairy tale.

Beyond the technical challenges of word frequency analysis, the analysis of fairy tales also offers the opportunity to explore the cultural significance of these stories. By viewing fairy tales as deeply embedded in the human experience, rather than simply as words on a page or lingual patterns leaving the mouth, we can gain valuable insights into the way that humans think, feel, and relate to one another (Bettelheim 50-53). Overall, the application of A.I. to the study of fairy tales represents a fascinating intersection of technology and culture.

To tackle this problem, I first compiled a list of fairy tales, drawing on a precompiled list created by renowned folklorist Maria Tatar. This list contained over 60 commonly told tales, which I converted into EPUB format for convenient storage on a computer. I then used an I/O stream to identify chapter breaks and separate each tale into its own file. To further streamline the data, I removed the titles from each file, converted the text to lowercase, and used a parsing step to remove any non-alphabetic characters, leaving only the words. This preprocessing simplified the program’s later work and set the foundation for subsequent analysis.
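The cleanup step described above could look something like the following Python sketch. It is only an illustration, not the code used for this project: the function name `normalize_tale` is hypothetical, and it assumes that the title sits on the first line of each file and that apostrophes are dropped so that a word like “dragon’s” stays in one piece.

```python
import re

def normalize_tale(raw_text: str) -> list[str]:
    """Return the lowercase, letters-only words of one tale."""
    # Assumption: the first line of the file is the title, so skip it.
    parts = raw_text.split("\n", 1)
    body = parts[1] if len(parts) > 1 else ""
    lowered = body.lower()
    # Drop apostrophes (straight and curly) so "dragon's" becomes "dragons".
    no_apostrophes = re.sub(r"['’]", "", lowered)
    # Replace every remaining non-letter, non-whitespace character with a space.
    letters_only = re.sub(r"[^a-z\s]", " ", no_apostrophes)
    return letters_only.split()
```

For example, `normalize_tale("A Title\nOnce upon a time, the dragon's lair!")` yields the words `once`, `upon`, `a`, `time`, `the`, `dragons`, `lair`.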

Next, I performed a word frequency analysis on the simplified fairy tale corpus to uncover the most and least commonly used words. This analysis involved examining over 60 tales to identify the most frequently used words in fairy tales and compare them with their frequency in classic English literature. This provided valuable insights into the linguistic characteristics of fairy tales and allowed for a deeper understanding of their unique narrative style. Shown are the most common words in fairy tales, as well as their frequency in traditional English literature, respectively (Norvig).

Word       Fairy tales    English literature
‘the’      5.56%          7.14%
‘and’      3.76%          3.04%
‘a’        3.52%          2.06%
‘to’       2.60%          2.60%
‘was’      2.45%          0.74%
‘her’      2.02%          0.22%
‘she’      2.00%          < 0.20%
‘of’       1.74%          4.16%
‘he’       1.43%          0.55%
‘it’       1.41%          < 0.20%
‘in’       1.19%          2.27%
‘had’      1.14%          0.35%
‘there’    0.98%          0.22%
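Percentages like the fairy tale figures listed here can be produced with a short counting routine. The following Python sketch shows one way to do it, assuming the corpus has already been reduced to a list of lowercase words; the function name `word_percentages` and its `top_n` parameter are hypothetical and not taken from the original program.

```python
from collections import Counter

def word_percentages(words, top_n=10):
    """Return the top_n most common words with their share of all words, in percent."""
    counts = Counter(words)
    total = sum(counts.values())
    # most_common() sorts by count, highest first; an empty corpus gives an empty list.
    return [(word, 100.0 * count / total) for word, count in counts.most_common(top_n)]
```

Running it on the combined words of every tale gives one fairy tale figure per word, which can then be set beside the published literature figures.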

In both fairy tales and traditional literature, the most common word is ‘the’, which leads ‘and’ by 1.8 and 4.1 percentage points respectively. In contrast to the chart above, it is also important to note which words are common in English literature but not in fairy tales. Most notably, the word ‘is’ comprises only 0.37% of words in our text, but 1.13% of words in literature.

Simply looking at the frequency of the words ‘she’ and ‘her’ offers valuable insight into how fairy tales might have originated. Originating as tales predominantly told by feminine figures in society, fairy tales have a long history of empowering, yet fetishizing, women and young girls (Warner 310-311; Rowe 302-305). It is not surprising, then, that ‘she’ and ‘her’ together make up 4.02% of the words in fairy tales. By contrast, the word ‘her’ composes only 0.22% of words in literature, and ‘she’ falls below 0.20%. Because of this high prevalence of femininity and delicacy, the society found in fairy tales bears little resemblance to a patriarchal one.

Another factor that contributes to the distinctive language of fairy tales is the frequent use of the pronoun ‘it’. The third-person perspective and the use of impersonal constructions in fairy tales often require this pronoun to refer to objects, animals, and other entities (Blevins 2003). The extensive use of this word is evident in the data, with a frequency of 1.41% in fairy tales, compared to below 0.20% in traditional literature (Norvig). This difference reflects the unique narrative style of fairy tales and the way that they use language to create an atmosphere of mystery and uncertainty.

Given such unique linguistic differences, I implemented these factors directly into the A.I. to see whether they could determine if a story was a fairy tale. This step relies on a common linear algebra approach to data comparison called the dot product. If two pieces of data (here, word-frequency profiles) are entirely different, then their dot product is 0. On the other hand, due to the underlying construction of the algorithm, the dot product of two similar pieces of data approaches 1 (Ogita et al. 2005). Therefore, I set the algorithm’s threshold to 0.5: a result of 0.5 or above is classified as a fairy tale, and anything below is not.
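A dot product that ranges from 0 to 1 is what results when both vectors are scaled to unit length before they are multiplied, a measure often called cosine similarity. The Python sketch below shows that construction. It is an assumption about how such a comparison might be built rather than the original implementation: the reference profile is copied from the fairy tale figures listed earlier, and the function names are hypothetical.

```python
import math
from collections import Counter

# Fairy tale word percentages copied from the figures listed in this article.
FAIRY_TALE_PROFILE = {
    "the": 5.56, "and": 3.76, "a": 3.52, "to": 2.60, "was": 2.45,
    "her": 2.02, "she": 2.00, "of": 1.74, "he": 1.43, "it": 1.41,
    "in": 1.19, "had": 1.14, "there": 0.98,
}

def frequency_vector(words, vocabulary):
    """Percentage of `words` taken up by each vocabulary word, in order."""
    counts = Counter(words)
    total = len(words)
    if total == 0:
        return [0.0 for _ in vocabulary]
    return [100.0 * counts[word] / total for word in vocabulary]

def normalized_dot(u, v):
    """Dot product of u and v after scaling each to unit length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a text with none of the profile words shares nothing with it
    return dot / (norm_u * norm_v)

def looks_like_fairy_tale(words, threshold=0.5):
    """Classify a list of lowercase words against the fairy tale profile."""
    vocabulary = list(FAIRY_TALE_PROFILE)
    reference = [FAIRY_TALE_PROFILE[word] for word in vocabulary]
    score = normalized_dot(frequency_vector(words, vocabulary), reference)
    return score >= threshold
```

Dividing by the two lengths is what keeps the score between 0 and 1 for non-negative frequencies, so that a fixed threshold such as 0.5 has a consistent meaning regardless of how long the text is.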

To evaluate the effectiveness of the algorithm, I compiled a dataset consisting of a diverse range of texts, including both fairy tales generated using GPT-3.5 by OpenAI and non-fairy-tale literature drawn from various online sources. This allowed me to test the algorithm’s ability to accurately classify fairy tales and non-fairy tales in a variety of contexts. For example, GPT-3.5 was prompted, “Create an original fairy tale.” In response, it produced the following story:

“Once upon a time, in a faraway kingdom, there was a beautiful princess named Rose. She had long, flowing hair and eyes as blue as the ocean. Despite her beauty, Rose was very unhappy, for her kingdom was being threatened by a terrible dragon.

The dragon had been causing destruction and terror throughout the land, and the people were living in fear. The king and his army had tried to defeat the dragon, but it was too powerful.

One day, a brave knight named Sir Lionel rode into the kingdom. He had heard of the dragon and came to offer his help. The king was overjoyed and offered Sir Lionel anything he desired in return for defeating the dragon.

Sir Lionel accepted the challenge and set out to find the dragon. He searched high and low, but no matter where he looked, he could not find it.

Finally, after many days of searching, Sir Lionel came across a wise old woman who lived in the forest. She told him that the dragon lived deep in the mountains, and that the only way to defeat it was to outsmart it.

Sir Lionel thanked the old woman and set off for the mountains. When he finally reached the dragon’s lair, he saw that it was sleeping. Sir Lionel knew that he had to be careful, so he crept up to the dragon and pulled out a magical sword that the old woman had given him.

With one swift stroke, Sir Lionel cut off the dragon’s head, and the kingdom was saved. The king was overjoyed and rewarded Sir Lionel with anything he desired.

But Sir Lionel did not want wealth or power. Instead, he asked for the hand of the beautiful Princess Rose in marriage. She had fallen in love with the brave knight and happily accepted his proposal.

They were married in a grand ceremony and lived happily ever after. And the kingdom lived in peace, thanks to the bravery of Sir Lionel and the magic of the old woman.”

To evaluate how well the algorithm recognized fairy tales as fairy tales, I assessed the aforementioned response, as well as many others. The algorithm responded with 87% accuracy, far from ideal. Using text from Brave New World by Aldous Huxley, Of Mice and Men by John Steinbeck, and Crime and Punishment by Fyodor Dostoevsky, the algorithm responded with 85% accuracy, again far from ideal. Though not entirely inaccurate, the algorithm showed clear flaws.
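An accuracy figure of this kind is simply the share of texts the classifier labels correctly. A minimal Python sketch follows, assuming each prediction and each true label is a boolean where True means “fairy tale”; the helper name `accuracy` is hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

For a test set made up only of fairy tales, every label is True, so the accuracy is just the fraction of stories the algorithm accepted.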

While the idiomatic patterns present in fairy tales may differ from those in other forms of literature, this alone may not be a sufficient basis for accurately classifying a given text as a fairy tale. A quantitative comparison of these linguistic patterns is insightful, but it may not capture the full range of characteristics that differentiate fairy tales from other types of literature. It is therefore important to consider additional factors, such as narrative structure and the frequency of word types such as nouns, verbs, and adjectives. Since fairy tales are so deeply rooted in our culture and history, simply knowing which words make up a text does not offer as much insight as we might originally think.

Works Cited:

Bettelheim, Bruno. The Uses of Enchantment. Vintage Books, 1989. 

Blevins, James P. “Passives and Impersonals.” Journal of Linguistics, vol. 39, no. 3, 2003, pp. 473–520, doi:10.1017/s0022226703002081.

Dostoyevsky, Fyodor. Crime And Punishment. Signet Classics, 2001.

Huxley, Aldous. Brave New World. 11th ed., Vintage, 2010.

Lüthi, Max. “Abstract Style.” The Classic Fairy Tales: Texts, Criticism, edited by Maria Tatar, W.W. Norton & Company, 2017, pp. 375–386.

Ogita, Takeshi, et al. “Accurate Sum and Dot Product.” SIAM Journal on Scientific Computing, vol. 26, no. 6, 2005, pp. 1955–1988, doi:10.1137/030601818.

Rowe, Karen. “To Spin a Yarn: The Female Voice in Folklore and Fairy Tale.” The Classic Fairy Tales: Texts, Criticism, edited by Maria Tatar, W.W. Norton & Company, 2017, pp. 393–404.

Steinbeck, John. Of Mice and Men. Penguin Classics, 2000.

Warner, Marina. “The Old Wives’ Tale.” The Classic Fairy Tales: Texts, Criticism, edited by Maria Tatar, W.W. Norton & Company, 2017, pp. 309–317.
