Visiting with The Echo Nest

June 13, 2009 at 11:45 am | In Music, Programming

Yesterday I paid a head-spinning visit to The Echo Nest, a small software company in nearby Somerville, at the invitation of their CTO Brian Whitman. You might not have heard of The Echo Nest, but their products power an increasing number of music recommendation engines on sites around the world.

You heard it from me: these folks are writing some of the most badass music-related code on the planet.

The Echo Nest are experts in “machine listening”: they have developed a set of algorithms that crunch through raw audio media and extract a set of distinctive musical features. These features roughly describe what is happening in the music at a hierarchy of time scales (beat, measure, section), and from the features they can compute a notion of similarity between different pieces of music. This similarity metric drives the recommendation aspect of their business.

Naturally enough, Noteflight and The Echo Nest have some mutual interests, hence my visit. Audio media and music notation are both descriptions of music, so our companies both think a lot about how those descriptions are related. It’s a tough problem to go from either description to the other, and no algorithm can perform either task anywhere near as well as a human musician.

Anyway, while I was over there they showed off a very cool music hacking tool called Remix, which you can grab from Google Code. It’s basically a Python library that takes an audio file, analyzes it using Echo Nest wizardry, and then returns a data structure describing the audio down to the beat level. You can then mess with these beat-length samples based on their descriptive data, and reassemble them in bizarre and unexpectedly musical ways.

As an example, they played me a strangely filtered version of “Here Comes The Sun” in which only beats in the same key as the opening intro had been retained. The result was an odd, drone-like modification of the song in which the intro itself was intact, but then unfolded into a sequence of snippets from the song that were completely familiar but from which all harmonic motion had been precisely excised.
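To make the idea concrete, here is a minimal sketch in plain Python. It does not use the real Remix API: the Beat class, its key field (an estimated pitch class per beat), and the intro_beats parameter are all hypothetical stand-ins I made up for illustration, and how the actual demo decided what counted as "the same key" is an assumption on my part.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Beat:
    """Hypothetical stand-in for one analyzed beat (not the Remix API)."""
    start: float  # seconds from the start of the track
    key: int      # assumed estimated pitch class, 0 = C ... 11 = B


def keep_beats_in_intro_key(beats, intro_beats=8):
    """Keep the intro intact, then only later beats matching the intro's key.

    The intro's key is taken to be the most common key among the first
    `intro_beats` beats (an assumption made for this sketch).
    """
    if not beats:
        return []
    intro = beats[:intro_beats]
    intro_key = Counter(b.key for b in intro).most_common(1)[0][0]
    rest = [b for b in beats[intro_beats:] if b.key == intro_key]
    return intro + rest
```

With a four-beat intro mostly in key 9, for example, every later beat labeled with any other key simply disappears, which is roughly where the drone-like effect would come from.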

I then heard a Hall and Oates song that had had beats 2 and 4 surgically removed from every measure. The result? A weird double-time version in which the song form progressed at twice the normal speed, the lyrics were mostly unintelligible but with many recognizable syllables, and the entire song’s length was chopped in half. That last aspect could be viewed as an improvement on the original.
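The beat-removal trick is easy to sketch in plain Python. Again, this is not the Remix API: the Beat class and its position field (a 1-based index within the measure) are hypothetical, and I am assuming a steady 4/4 meter.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    """Hypothetical stand-in for one analyzed beat (not the Remix API)."""
    start: float     # seconds from the start of the track
    duration: float  # length of the beat in seconds
    position: int    # 1-based position within its measure (4/4 assumed)


def drop_beats(beats, positions_to_drop=(2, 4)):
    """Return the beats whose position in the measure is not being dropped."""
    return [b for b in beats if b.position not in positions_to_drop]


def total_duration(beats):
    """Total playing time of a sequence of beats, in seconds."""
    return sum(b.duration for b in beats)
```

Dropping two of every four equal-length beats leaves half the playing time, which matches the halved song length I heard.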

As a code-on-the-spot challenge, I asked if they could put together a version that sorted all the beats by amplitude, putting the softest ones first and the loudest ones last. Five minutes of Python hacking later, we were listening to a bizarre, long crescendo of segments from the song, seamlessly reassembled into a whole. The beginning consisted mostly of quieter instrumental chords or the unaccented syllables of words, while the end was a kind of synopsis of all the climactic moments in the song with kick drum or vocal accents. The whole song turned into a single musical gesture, reassembled from its fragments into something completely different but still wholly familiar sounding.
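I did not save the code they wrote, but the heart of it is a one-line sort. Here is a sketch in plain Python under my own assumptions: the Beat class and its loudness field are hypothetical, not the Remix API, and I am treating "amplitude" as a single loudness number per beat.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    """Hypothetical stand-in for one analyzed beat (not the Remix API)."""
    start: float     # seconds from the start of the original track
    loudness: float  # assumed per-beat loudness; larger means louder


def sort_softest_first(beats):
    """Return a new list of beats ordered from softest to loudest.

    sorted() is stable, so beats with equal loudness keep their
    original relative order.
    """
    return sorted(beats, key=lambda b: b.loudness)
```

Playing the beats back in the returned order, rather than by start time, is what would turn the song into one long crescendo.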

All content copyright (c) 2006-2007 Joseph Berkovitz. All Rights Reserved.