Kirk vs Spock Linguistics Head-to-Head

Today I harvested the three seasons of Star Trek TOS episode scripts from a friendly website hosting the annotated text. Then I extracted just the lines spoken by Captain Kirk and just those spoken by Mr Spock. So how does Kirk match up to Spock linguistically? First of all, the Lingua::EN::Fathom module that computes text […]

Mock Survey Analysis Example with R

What are the basic techniques used to analyze survey response data? First off, this code generates a random sample of survey responses that we will analyze:
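As a rough illustration of what generating a mock sample can look like, here is a minimal Python sketch. The post itself uses R. The five-point Likert scale, the function names `mock_responses` and `tally`, and the sample sizes are all assumptions made for this example.

```python
import random
from collections import Counter

# Assumed five-point Likert scale; the original post's scale may differ.
LIKERT = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]

def mock_responses(n_respondents, n_questions, seed=None):
    """Return one row per respondent, each holding one random answer per question."""
    rng = random.Random(seed)
    return [
        [rng.choice(LIKERT) for _ in range(n_questions)]
        for _ in range(n_respondents)
    ]

def tally(responses, question):
    """Count how often each Likert level was chosen for one question (0-based index)."""
    counts = Counter(row[question] for row in responses)
    return {level: counts.get(level, 0) for level in LIKERT}

if __name__ == "__main__":
    responses = mock_responses(n_respondents=100, n_questions=5, seed=42)
    print(tally(responses, 0))
```

Seeding the generator keeps the mock sample reproducible, so any tallies computed from it stay the same from run to run.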

Bach Chorale Diversity

What is the relative statistical diversity of Bach chorale harmonisations? Ever since I stumbled upon this Bach Choral Harmony Data Set, I’ve wanted a way to analyze it! So I set out to do just that today, using the Shannon Diversity Index and this handy Perl code:

First off (above) is the standard Perl […]
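For reference, the Shannon Diversity Index is H = -Σ p_i ln(p_i), where p_i is the proportion of observations in category i. A minimal Python sketch of that formula follows. The post itself uses Perl. The function name `shannon_diversity` and the chord labels are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def shannon_diversity(labels):
    """Shannon index H = -sum(p_i * ln(p_i)) over the distinct labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sample has no diversity
    return -sum((n / total) * math.log(n / total) for n in counts.values())

if __name__ == "__main__":
    # Hypothetical chord labels, not taken from the data set.
    chords = ["C_M", "G_M", "C_M", "F_M", "G_M", "C_M", "A_m", "D_m"]
    print(round(shannon_diversity(chords), 4))
```

A sample made up of a single repeated label scores 0, and the index grows as the labels become more numerous and more evenly distributed.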

Traveling Salesman with Perl, R and Google Maps

tl;dr: ggplot-nyc && googlemap-nyc && TSP-Map (the Dancer app) One day I decided to glue together a couple of cool Perl modules and the visualization capabilities of R to generate a map of locations and the computed path of a traveling salesman (TSP) – who in this case is a restaurant critic. The prerequisite is to have MongoDB installed […]

Syuzhet Sentiment of Sacred Texts

These graphs show the “emotional valence,” or sentiment, over the percentage of narrative time elapsed. That is, they chart the positively and negatively valued sentences from beginning to end. A positively valued sentence would be, “I feel good.” A negative one of equal value would be, “I feel bad.” A neutral sentence would be, “I feel okay.” If […]

Visualizing Vocalization with R

One day I decided that I wanted to be able to see a frequency x time “amplitude density” plot of sound – specifically dolphin and bird voices. So the first task was to locate some sounds! Preferably these should be free from all other sounds, including those from the ambient environment – splashing, microphone […]

Inspecting American Inaugural Addresses with Perl and R

Given all the inaugural addresses of American presidents, what are the readability stats? What is the sentiment over time? UPDATE: Results charted for 2017. As usual, I reach for Perl to acquire and format the data for exploration with R. The code below (also available on GitHub) reads and analyzes a collection of text documents. […]

Inspecting the English Premier League Player Stats with R

Being a soccer person and a programmer, I wanted to inspect player statistics for myself. I finally found this excellent site, which covers many leagues and focuses primarily on player stats: whoscored.com. Seeing that there was no download link, I resolved to tediously copy and paste all the records for each player, for defensive, offensive, passing and summary categories, […]

Visualizing Chatter on a Polar Plot

What does IRC chatter look like on a polar plot, with lines linking the conversations?

Time-Successive Old Faithful Eruption Durations

In the observed data, a four-group clustering emerges when you plot successive eruption durations of the famous geyser. That is, the x-axis is the duration of one eruption and the y-axis is the duration of the next.
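The pairing described above, each eruption's duration against the one that follows it, can be sketched in a few lines of Python. The function name `successive_pairs` and the duration values are hypothetical and are not taken from the Old Faithful data.

```python
def successive_pairs(durations):
    """Pair each duration with the next one: (d[0], d[1]), (d[1], d[2]), ..."""
    return list(zip(durations[:-1], durations[1:]))

if __name__ == "__main__":
    # Made-up durations in minutes, for illustration only.
    durations = [3.6, 1.8, 3.3, 2.3, 4.5, 2.9]
    for current, following in successive_pairs(durations):
        # current is the x coordinate, following is the y coordinate
        print(current, following)
```

A series of n durations yields n - 1 points, since the last eruption has no successor to pair with.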