Musical Ngrams

What are the most repeated phrases in a musical composition? Naturally, I wrote a program to tell me!

Here is the first part that declares the preamble and modules to use:

This uses my simple MIDIUtil module to set things up.

Next up, we take input from the command-line user:

(Sadly, the ngram rhythms are not preserved – only the pitches.  Instead, we give the program a set of durations to choose from at random.)
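To illustrate the idea of picking durations at random, here is a minimal Python sketch (not the program's own code). The function name `assign_durations` and the duration labels are hypothetical; any set of durations supplied by the user would work the same way.

```python
import random

def assign_durations(phrase, durations, rng=random):
    """Pair each pitch in a phrase with a duration picked at random.

    `phrase` is a list of MIDI note numbers and `durations` is the
    user-supplied pool to draw from. Both names are illustrative.
    """
    return [(pitch, rng.choice(durations)) for pitch in phrase]

if __name__ == "__main__":
    # Hypothetical duration labels; the pitches stay in their original order.
    print(assign_durations([71, 69, 71, 55], ["quarter", "eighth"]))
```

Passing a seeded `random.Random` instance as `rng` makes the result repeatable, which is handy when comparing runs.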

This is followed by some variables that will be used:

Now for the first procedure of the program: turn the MIDI note information into strings of phrases by mapping each digit to a letter (e.g. “71 69 71 55” -> “hb gj hb ff”), and then into ngram chunks:
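As a rough illustration of this step, here is a Python sketch (not the program's own code). It assumes the mapping implied by the example, where the digits 0 through 9 become the letters a through j; the function names are hypothetical.

```python
# Digits 0-9 map to letters a-j, so the note number 71 becomes "hb".
DIGIT_TO_LETTER = str.maketrans("0123456789", "abcdefghij")
LETTER_TO_DIGIT = str.maketrans("abcdefghij", "0123456789")

def encode(notes):
    """Turn MIDI note numbers into a space-separated string of letter words."""
    return " ".join(str(n).translate(DIGIT_TO_LETTER) for n in notes)

def decode(text):
    """Turn a string of letter words back into MIDI note numbers."""
    return [int(word.translate(LETTER_TO_DIGIT)) for word in text.split()]

def chunk(text, size):
    """Slide a window of `size` words over the text, one word at a time."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]

if __name__ == "__main__":
    encoded = encode([71, 69, 71, 55])
    print(encoded)            # the letter form of the four notes
    print(chunk(encoded, 2))  # overlapping 2-note phrases
```

Encoding the notes as words means any text-oriented ngram tool can be applied to them, and `decode` recovers the note numbers afterwards.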

Next up is to actually construct a MIDI file from these phrases:
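For readers curious about what writing the file involves, here is a Python sketch that builds a bare-bones Standard MIDI File by hand using only the standard library. It is not the program's own code (which relies on a MIDI module to do this work), and the function names, the default velocity, and the tick resolution are all assumptions made for the example.

```python
import struct

def varlen(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))

def midi_bytes(notes, ticks_per_quarter=96, velocity=96):
    """Build a single-track (format 0) MIDI file from (pitch, ticks) pairs."""
    track = bytearray()
    for pitch, ticks in notes:
        # Note-on immediately, then note-off after `ticks` have elapsed.
        track += varlen(0) + bytes([0x90, pitch, velocity])
        track += varlen(ticks) + bytes([0x80, pitch, 0])
    track += b"\x00\xFF\x2F\x00"  # end-of-track meta event
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_quarter)
    return header + b"MTrk" + struct.pack(">I", len(track)) + bytes(track)

if __name__ == "__main__":
    # One 4-note phrase, each note a quarter note (96 ticks) long.
    data = midi_bytes([(71, 96), (69, 96), (71, 96), (55, 96)])
    print(len(data), "bytes")
```

The resulting bytes can be written to disk with a file opened in binary mode. In practice a MIDI library is the saner route, since it also handles tempo, patches, and multiple tracks.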

I ran Bach’s “Jesu, Joy of Man’s Desiring” (bach_jesu_joy_with_piano) through this program and generated this file: note-ngram-play-JESU.

Here is the text output for the first 20 phrases, showing the index, the number of repetitions, and the 4-note phrase itself (in MIDI note number notation):
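A listing like this can be produced by counting the phrases and printing the most frequent ones. The Python sketch below shows one way to do that; the function name `report` and the exact line layout are assumptions for the example, not the program's real output format.

```python
from collections import Counter

def report(phrases, limit=20):
    """Return one line per phrase: index, repetition count, and the phrase.

    `phrases` is a list of strings such as "71 69 71 55". Only the `limit`
    most repeated phrases are listed, most frequent first.
    """
    counts = Counter(phrases)
    return [
        f"{index}. {count} times: {phrase}"
        for index, (phrase, count) in enumerate(counts.most_common(limit), start=1)
    ]

if __name__ == "__main__":
    sample = ["71 69 71 55", "69 71 55 57", "71 69 71 55"]
    for line in report(sample):
        print(line)
```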

And here is what it sounds like after re-assigning the piano patches:


OK. How about some Beethoven? Here is the Moonlight Sonata in 8-note phrases with different durations given: note-ngram-play-MOONLIGHT

And here is what that sounds like when put through my DAW:


Here is Peter Gabriel’s “Games Without Frontiers” in chunks of 4-note phrases, made with this command, and then re-orchestrated with my DAW:


How about “Big Time” from the same site?  This actually sounds like old school Peter Gabriel when re-orchestrated with flutes and mallets.


UPDATE: A vastly superior version of this program, which includes documentation and uses “weighted choice” to determine the phrases to play, is ngram-play. An even better module is now on CPAN. Woo! And here is a Web GUI app that I made to make this analysis easy: App-MIDI-Ngram
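For the curious, “weighted choice” generally means that a phrase is picked with probability proportional to how often it occurs. The Python sketch below shows that general idea only; it is an assumption about the technique, not a description of how ngram-play or the CPAN module implements it, and the function name is hypothetical.

```python
import random

def weighted_phrase(counts, rng=random):
    """Pick one phrase, favoring those with higher repetition counts.

    `counts` maps each phrase to the number of times it was seen.
    """
    phrases = list(counts)
    weights = [counts[p] for p in phrases]
    return rng.choices(phrases, weights=weights, k=1)[0]

if __name__ == "__main__":
    # A phrase seen 9 times is picked far more often than one seen once.
    counts = {"71 69 71 55": 9, "69 71 55 57": 1}
    print([weighted_phrase(counts) for _ in range(5)])
```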