Kirk vs Spock Linguistics Head-to-Head

Today I harvested the three seasons of Star Trek TOS episode scripts from a friendly website hosting the annotated text.  Then I extracted just the lines spoken by Captain Kirk and just the lines of Mr Spock. So how does Kirk match up to Spock linguistically? First of all, the Lingua::EN::Fathom module that computes text […]

Mock Survey Analysis Example with R

What are the basic techniques used to analyze survey response data? First off, this code generates a random sample of survey responses that we will analyze:

Bach Chorale Diversity

What is the relative statistical diversity of Bach chorale harmonisations? Ever since I stumbled upon this Bach Choral Harmony Data Set I’ve been wanting ways to analyze it! So I set out to do just that today with the Shannon Diversity Index with this handy perl code:

First off (above) is the standard perl […]
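
For readers who want to see the Shannon Diversity Index spelled out, here is a minimal sketch in Python rather than the post's Perl. It is not the post's code: the function name `shannon_diversity` and the sample chord labels are made up for illustration, and the natural logarithm is an assumption (other bases only rescale the result). The index is H = -Σ p_i ln(p_i), where p_i is the share of observations falling in category i.

```python
import math
from collections import Counter

def shannon_diversity(items):
    """Return H = -sum(p_i * ln(p_i)) over the distinct values in items."""
    counts = Counter(items)
    total = sum(counts.values())
    # Each category contributes -(p * ln p), where p is its relative frequency.
    return -sum((n / total) * math.log(n / total) for n in counts.values())

if __name__ == "__main__":
    # Hypothetical chord labels, not taken from the actual data set.
    chords = ["C_M", "G_M", "C_M", "F_M", "G_M", "C_M"]
    print(shannon_diversity(chords))
```

A sequence with a single repeated chord scores 0, and the score rises as the chords become more numerous and more evenly spread.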

Traveling Salesman with Perl, R and Google Maps

tl;dr: ggplot-nyc && googlemap-nyc && TSP-Map (the Dancer app). One day I decided to glue together a couple of cool Perl modules and the visualization capabilities of R to generate a map of locations and the computed path of a traveling salesman (TSP), who in this case is a restaurant critic. The prerequisite is to have MongoDB installed […]

Subsequent Prime Number Distribution

In 2016, two mathematicians, Kannan Soundararajan and Robert Lemke Oliver, found that the final digits of consecutive primes do not occur at random. For instance, a prime ending in 9 is more likely to be followed by a prime ending in 1 than by one ending in any other digit (1, 3, 7 […]
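
The last-digit effect is easy to check on a small scale. The following is a rough sketch in Python, not the code from the post: the helper names `primes_up_to` and `last_digit_transitions` are hypothetical, and the limit of 100,000 is an arbitrary choice. It sieves the primes, then tallies each pair of final digits for consecutive primes.

```python
from collections import Counter

def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            # Cross out every multiple of n starting at n*n.
            sieve[n * n::n] = bytearray(len(range(n * n, limit + 1, n)))
    return [i for i, flag in enumerate(sieve) if flag]

def last_digit_transitions(limit):
    """Count (last digit, next last digit) pairs for consecutive primes > 5."""
    primes = [p for p in primes_up_to(limit) if p > 5]
    return Counter((a % 10, b % 10) for a, b in zip(primes, primes[1:]))

if __name__ == "__main__":
    transitions = last_digit_transitions(100_000)
    for nxt in (1, 3, 7, 9):
        print(f"9 -> {nxt}: {transitions[(9, nxt)]}")
```

Primes up to 5 are skipped because 2 and 5 are the only primes ending in those digits, so they would add transitions that never recur.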

Visualization of Led Zeppelin Lyrics with R

What can be known about Led Zeppelin lyrics from the standpoint of a computer geek? First, I collected and properly named every song with lyrics, with the power of perl and persistence. Then I found/crafted some R code to process these into a few graphics.

Weighted Graph “Music”

Code: synch-weighted-randomized   This piece was constructed in 4-bar increments by running the above code and importing the generated phrases (MIDI file) into my DAW (Apple Logic).  I then gave the patterns “better” patches, used layers and added a drum track for continuity. The code uses randomly generated weighted graphs to build up a melody […]

Syuzhet Sentiment of Sacred Texts

These graphs show the “emotional valence” or sentiment over narrative time, expressed as a percentage. That is, they chart the positively and negatively valued sentences from beginning to end. A positively valued sentence would be, “I feel good.” A negative one of equal value would be, “I feel bad.” A neutral sentence would be, “I feel okay.” If […]

Visualizing Vocalization with R

One day I decided that I wanted to have the ability to see a frequency x time “amplitude density” plot of sound – specifically dolphin and bird voices. So the first task was to locate some sounds!  Preferably these should be free from all other sounds, including those from the ambient environment – splashing, microphone […]

Inspecting American Inaugural Addresses with Perl and R

Given all the inaugural addresses of American presidents, what are the readability stats? What is the sentiment over time? UPDATE: Results charted for 2017. As usual, I reach for perl to acquire and format the data for exploration with R. The code below (and available on github) reads and analyzes a collection of text documents. […]