Today I harvested the three seasons of Star Trek TOS episode scripts from a friendly website hosting the annotated text. Then I extracted just the lines spoken by Captain Kirk and just the lines of Mr Spock. So how does Kirk match up to Spock linguistically? First of all, the Lingua::EN::Fathom module that computes text […]

## Bach Chorale Diversity

What is the relative statistical diversity of Bach chorale harmonisations? Ever since I stumbled upon this Bach Choral Harmony Data Set, I’ve been wanting ways to analyze it! So today I set out to do just that, using the Shannon Diversity Index and this handy Perl code:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Text::CSV;
use Statistics::Diversity::Shannon;
```

First off (above), is the standard perl […]
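For reference, the Shannon diversity index itself is easy to state. Assuming the categories being counted are chord labels (the excerpt is cut off before saying so, so treat that mapping as a guess), with $S$ distinct categories and relative frequencies $p_i$:

$$
H = -\sum_{i=1}^{S} p_i \ln p_i
$$

As a quick sanity check, two categories that occur equally often ($p_1 = p_2 = 1/2$) give $H = \ln 2 \approx 0.693$, and a sample containing only one category gives $H = 0$.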

## Traveling Salesman with Perl, R and Google Maps

tl;dr: ggplot-nyc && googlemap-nyc && TSP-Map (the Dancer app) One day I decided to glue together a couple of cool Perl modules and the visualization capabilities of R to generate a map of locations and the computed path of a traveling salesman (TSP), who in this case is a restaurant critic. The prerequisite is to have MongoDB installed […]

## Visualization of Led Zeppelin Lyrics with R

What can be known about Led Zeppelin lyrics from the standpoint of a computer geek? First, I collected and properly named every song with lyrics, with the power of perl and persistence. Then I found/crafted some R code to process these into a few graphics.

## Syuzhet Sentiment of Sacred Texts

These graphs show the “emotional valence” or sentiment over the percentage of narrative time. That is, they chart the positive- and negative-valued sentences from beginning to end. A positive-valued sentence would be, “I feel good.” A negative one of equal value would be, “I feel bad.” A neutral sentence would be, “I feel okay.” If […]

## Visualizing Vocalization with R

One day I decided that I wanted to have the ability to see a frequency x time “amplitude density” plot of sound – specifically dolphin and bird voices. So the first task was to locate some sounds! Preferably these should be free from all other sounds, including those from the ambient environment – splashing, microphone […]

## Inspecting American Inaugural Addresses with Perl and R

Given all the inaugural addresses of American presidents, what are the readability stats? What is the sentiment over time? UPDATE: Results charted for 2017. As usual, I reach for Perl to acquire and format the data for exploration with R. The code below (and available on github) reads and analyzes a collection of text documents. […]

## Inspecting the English Premier League Player Stats with R

Being a soccer person and a programmer, I wanted to inspect player statistics for myself. I finally found this excellent site, which covers many leagues and focuses on player stats: whoscored.com. Seeing that there was no download link, I resolved to tediously copy/paste all the records for each player, for defensive, offensive, passing and summary categories, […]

## The Density Plot of the Prime Gaps is a Fractal

As you look at the density plots of increasing numbers of prime gaps (the distances between consecutive primes), a fractal emerges. Just get the gaps and graph the densities with this simple R code:

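```r
library(primes)

max <- 100

# All primes up to max
p <- generate_primes( min = 0, max = max )

# Gap between each prime and the one before it.
# Note the parentheses: 1 : length(p) - 1 parses as (1 : length(p)) - 1.
gaps <- p[ 2 : length(p) ] - p[ 1 : (length(p) - 1) ]

plot( density(gaps), xlab = 'prime gaps', main = 'Below 100' )
```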

For increasing numbers of gaps (shown to 100_000_000), this results in the following graphs. You can see the self-similar, fractal […]

## Alternatives to the Logistic Equation

tl;dr: bifurcation.R Yesterday, I decided to plot the bifurcation diagram of the logistic equation. This is a famous plot from the 70s that many geeks will be familiar with (left). It shows that simple systems can switch into “chaos mode” and begin to bifurcate wildly. To produce the graph, we use code in the R […]
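As a rough illustration of the idea (not the contents of bifurcation.R, which is not reproduced here), a minimal bifurcation sketch for the logistic map x[n+1] = r * x[n] * (1 - x[n]) might look like the following. The function name `logistic_orbit`, the starting value, the range of r, and the iteration counts are all assumptions chosen for this example.

```r
# Hypothetical helper: iterate the logistic map for one growth rate r,
# throw away the transient, and return the values the orbit settles on.
logistic_orbit <- function(r, x0 = 0.2, n_skip = 500, n_keep = 100) {
  x <- x0
  for (i in seq_len(n_skip)) {
    x <- r * x * (1 - x)          # discard transient behaviour
  }
  out <- numeric(n_keep)
  for (i in seq_len(n_keep)) {
    x <- r * x * (1 - x)
    out[i] <- x                   # keep the long-run values
  }
  out
}

n_keep   <- 100
r_values <- seq(2.5, 4, length.out = 600)   # assumed range of r
orbits   <- lapply(r_values, logistic_orbit, n_keep = n_keep)

# One column of points per r value: a single point where the orbit is a
# fixed point, two where it has period 2, and a smear where it is chaotic.
plot(
  rep(r_values, each = n_keep), unlist(orbits),
  pch = '.', xlab = 'r', ylab = 'x', main = 'Logistic map'
)
```

For small r every kept value coincides (a fixed point), so the plot shows a single curve. Past r = 3 the curve splits in two, and further period doublings follow until the chaotic regime.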