Tuesday, October 19, 2010

Text Analysis and Data Visualization

For the text analysis and data visualization exercise, I decided to analyze the novel Alice in Wonderland by Lewis Carroll, found in the Project Gutenberg catalogue. I chose this novel because it was one of my childhood favorites, and because of the recent movie premiere featuring one of my favorite actors, Johnny Depp! The story is about a girl named Alice who falls down a rabbit hole into a fantasy world known as Wonderland, a place where talking animals and objects exist. Throughout her journey, Alice encounters conflicts and whimsical songs.

 ♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♠◦♣◦♥◦♦-♠◦♣◦♥◦♦


 
For my first analysis of Alice in Wonderland, I created a Wordle. I customized the colors in the Wordle to reflect how I perceived and interpreted the story. The great thing about Wordle is that it shows word frequency: the more often a word appears in the text, the larger it is drawn. While creating this Wordle I chose the option “Remove common English words,” so the words that appear most often in the result are “little” and “Alice”. Alice is the main character of the story, so I expected her name to appear as a frequent word.
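For anyone curious how this kind of count works, here is a minimal sketch in Python. It is not Wordle's actual code: the function name `top_words`, the tiny stop-word list, and the sample sentence are all made up for illustration, and Wordle's real list of common English words is much longer.

```python
import re
from collections import Counter

# A tiny, made-up stop-word list for illustration only;
# Wordle's real list of "common English words" is far longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "it",
              "was", "for", "too", "she", "her", "that"}

def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, skipping stop words."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

# Hypothetical sample text, not a quotation from the novel.
sample = ("Alice was a little tired. The little door was too small "
          "for Alice, and Alice sighed.")
print(top_words(sample, 2))  # [('alice', 3), ('little', 2)]
```

To try this on the whole book, the plain-text file downloaded from Project Gutenberg could be read into a string and passed to `top_words` in place of `sample`.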

 ♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♠◦♣◦♥◦♦-♠◦♣◦♥◦♦



After creating a Wordle, I chose to analyze the text again using a Word Tree. A word tree shows the phrases and sentences that branch out from a chosen word, along with how often they occur. I entered different words into the search bar to see what each would produce. I first entered the word “Alice” and was content with the results. I then entered a more common word, “the”, and was blown away by the results: the word “the” appears approximately 10 times more frequently than the word “Alice” or “Rabbit”. One attribute that I really love about word trees is that you can focus on specific sentences or phrases just by clicking on a word.
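The basic idea behind a word tree can be sketched in a few lines of Python. This is only a rough approximation under my own assumptions, not the Word Tree tool's code: the function name `branches` and the sample text are hypothetical, and the sketch ignores sentence boundaries, which a real word tree may handle differently.

```python
import re
from collections import Counter

def branches(text, root, length=1):
    """Count the word sequences that immediately follow `root`."""
    words = re.findall(r"[a-z']+", text.lower())
    root = root.lower()
    found = Counter()
    for i, word in enumerate(words):
        if word == root:
            # Take the next `length` words after this occurrence.
            following = words[i + 1 : i + 1 + length]
            if following:
                found[" ".join(following)] += 1
    return found

# Hypothetical sample text, not a quotation from the novel.
sample = ("Alice was beginning to feel tired. Alice was not hurt. "
          "Alice thought she might wait.")
print(branches(sample, "Alice").most_common())
# [('was', 2), ('thought', 1)]
```

Raising `length` follows each branch further out, which is roughly what clicking on a word in the tree does.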

 ♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♦-♠◦♣◦♥◦♠◦♣◦♥◦♦-♠◦♣◦♥◦♦

For my final analysis of this text, I settled on a manual extraction of data, following the procedure from the metadata exercise we did in blog #2. For this exercise I extracted the information given about the book on Project Gutenberg and stated the important facts about the data.
  • Title / Name – Alice’s Adventures in Wonderland
  • Author / Creator – Lewis Carroll (1832-1898)
  • Publisher – Project Gutenberg
  • Date of Creation – June 27, 2008
  • Date viewed – October 18, 2010
  • Language – English
  • Format (.html, .pdf, .avi, etc.) – PDF or HTML
  • Media Type (if applicable) – online book (ebook)
  • Rights - Public domain in the USA.
  • Subject or Topic – Fantasy
  • Category or Categories – Text
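The same metadata could also be stored in a form a program can read. Below is a minimal sketch as a Python dictionary printed as JSON; the key names are my own choice for illustration, not an official Project Gutenberg schema, and the values are copied from the list above.

```python
import json

# Key names are assumptions chosen for this example;
# the values come from the metadata list above.
record = {
    "title": "Alice's Adventures in Wonderland",
    "author": "Lewis Carroll (1832-1898)",
    "publisher": "Project Gutenberg",
    "date_created": "June 27, 2008",
    "date_viewed": "October 18, 2010",
    "language": "English",
    "formats": ["PDF", "HTML"],
    "media_type": "online book (ebook)",
    "rights": "Public domain in the USA.",
    "subject": "Fantasy",
    "categories": ["Text"],
}

print(json.dumps(record, indent=2))
```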
