Kirk vs Spock Linguistics Head-to-Head

Today I harvested the three seasons of Star Trek TOS episode scripts from a friendly website hosting the annotated text.  Then I extracted just the lines spoken by Captain Kirk and just the lines of Mr Spock. So how does Kirk match up to Spock linguistically? First of all, the Lingua::EN::Fathom module that computes text […]

Bach Chorale Diversity

What is the relative statistical diversity of Bach chorale harmonisations? Ever since I stumbled upon this Bach Choral Harmony Data Set, I’ve been wanting ways to analyze it! So I set out to do just that today, using the Shannon Diversity Index and this handy Perl code:

First off (above) is the standard Perl […]
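For reference, the Shannon Diversity Index of a set of categories is H = -Σ p_i ln(p_i), where p_i is the proportion of items in category i. What follows is a minimal Python sketch of that formula, not the post's Perl code. The chord labels are hypothetical stand-ins for whatever labels the data set provides.

```python
import math
from collections import Counter

def shannon_diversity(labels):
    """Return H = -sum(p_i * ln(p_i)) over the distinct labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sample has no diversity to measure
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Hypothetical chord labels, for illustration only
chords = ["C_M", "G_M", "C_M", "F_M", "G_M", "C_M", "A_m", "D_m"]
print(round(shannon_diversity(chords), 4))
```

A sample containing a single label gives H = 0, and H grows as the labels become more numerous and more evenly spread.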

Traveling Salesman with Perl, R and Google Maps

tl;dr: ggplot-nyc && googlemap-nyc && TSP-Map (the Dancer app). One day I decided to glue together a couple of cool Perl modules and the visualization capabilities of R to generate a map of locations and the computed path of a traveling salesman (TSP), who in this case is a restaurant critic. The prerequisite is to have MongoDB installed […]

Syuzhet Sentiment of Sacred Texts

These graphs show the “emotional valence”, or sentiment, over the percentage of narrative time elapsed. That is, they chart the positive and negative valued sentences from beginning to end. A positive valued sentence would be, “I feel good.” A negative one of equal value would be, “I feel bad.” A neutral sentence would be, “I feel okay.” If […]

Visualizing Vocalization with R

One day I decided that I wanted to have the ability to see a frequency x time “amplitude density” plot of sound – specifically dolphin and bird voices. So the first task was to locate some sounds!  Preferably these should be free from all other sounds, including those from the ambient environment – splashing, microphone […]

Inspecting American Inaugural Addresses with Perl and R

Given all the inaugural addresses of American presidents, what are the readability stats? What is the sentiment over time? UPDATE: Results charted for 2017. As usual, I reach for Perl to acquire and format the data for exploration with R. The code below (and available on GitHub) reads and analyzes a collection of text documents. […]

Inspecting the English Premier League Player Stats with R

Being a soccer person and a programmer, I wanted to inspect player statistics for myself. I finally found this excellent site covering many leagues, primarily with player stats. So, seeing that there was no download link, I determined to tediously copy/paste all the records for each player, for the defensive, offensive, passing and summary categories, […]

Visualizing Chatter on a Polar Plot

What does IRC chatter look like on a polar plot, with lines linking the conversations?

Time-Successive Old Faithful Eruption Durations

A four-group clustering emerges from the observed data when you plot successive eruption durations of the famous geyser. That is, the x axis is the duration of one eruption and the y axis is the duration of the next.
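The pairing of each eruption with the one that follows it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the durations are made-up values in minutes, and the 3-minute cutoff separating "short" from "long" eruptions is an assumed threshold, not one taken from the post.

```python
from collections import Counter

def successive_pairs(durations):
    """Pair each duration (x) with the one that follows it (y)."""
    return list(zip(durations, durations[1:]))

def label_pair(pair, cutoff=3.0):
    """Classify a pair as (short|long, short|long); the cutoff is an assumption."""
    return tuple("long" if d >= cutoff else "short" for d in pair)

# Made-up durations in minutes, for illustration only
durations = [3.6, 1.8, 3.3, 2.3, 4.5, 2.9, 4.7]

pairs = successive_pairs(durations)
groups = Counter(label_pair(p) for p in pairs)
for group, count in sorted(groups.items()):
    print(group, count)
```

Plotting the pairs as points (x = first eruption, y = next eruption) is where the up-to-four short/long combinations would appear as separate clusters.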

After more than 15 years of solid LAMP development with Perl and friends, I have finally started to use Python, both as a programmer absorbed with the syntactical nuts and bolts and as a scientist manipulating and visualizing data. Frankly, it is exciting.