Drawing soundscapes

The making of Vuzak 101

Ernesto Peña (he/him)
May 3, 2020

An article authored by me and Dr. Kedrick James was recently published in Leonardo. In fact, I wrote a short piece about this publication a couple of years ago, when the article was accepted. Both the article and the follow-up piece discuss how to create digital sound files from raster images at the raw data level by identifying some of the patterns that allow translation between the two media. Those pieces explain that if, for example, you want to create a 1-second monaural square wave at roughly A440, you would have to create a greyscale image file of 44100 pixels (210 × 210 px, for instance) in a raster image editor (such as GIMP or Photoshop), and alternate runs of 50 black pixels and 50 white pixels in a left-to-right, top-to-bottom distribution. Each black-and-white pair is one 100-pixel cycle, which at 44100 samples per second gives 441 Hz. You would then have to save the image file in a raw data format (either .raw or .data, depending on the software used), and it will be ready to import into an audio editor (let’s say Audacity).
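For readers who would rather script this than paint it, here is a minimal sketch of the same square wave in Python. It is not the workflow from the article, which uses an image editor. The function names and file paths are illustrative assumptions, and it assumes one unsigned 8-bit sample per greyscale pixel.

```python
import wave

SAMPLE_RATE = 44100   # samples per second; each pixel becomes one sample
HALF_PERIOD = 50      # pixels in each black run and in each white run


def square_wave_pixels(n_pixels=SAMPLE_RATE, half_period=HALF_PERIOD):
    """Return n_pixels greyscale bytes: a black run (0), then a white run (255), repeated."""
    period = 2 * half_period
    return bytes(0 if (i % period) < half_period else 255 for i in range(n_pixels))


def write_raw(path, pixels):
    """Write headerless bytes, the kind of file a raw import dialog expects."""
    with open(path, "wb") as f:
        f.write(pixels)


def write_wav(path, pixels, sample_rate=SAMPLE_RATE):
    """Write the same bytes as a mono 8-bit WAV (8-bit WAV samples are unsigned)."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)
        w.setframerate(sample_rate)
        w.writeframes(pixels)


pixels = square_wave_pixels()
frequency = SAMPLE_RATE / (2 * HALF_PERIOD)  # 441.0 Hz, slightly sharp of A440
```

Calling `write_raw("square.raw", pixels)` should give a file that can be opened as a 210 × 210 greyscale image or imported as unsigned 8-bit mono audio at 44100 Hz. `write_wav("square.wav", pixels)` skips the import step.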

The implications of the method outlined in these pieces depend on personal interests. Some might use it to understand the process behind some forms of databending, some might be interested in using transmediation as a new approach to found sound-based works, and some might find the entire exercise pointless. Maybe because I am an amateur musician interested in noise-oriented genres, at the time of writing the article I found myself either assessing the aesthetic qualities of the sounds or trying to reproduce canonical features of Western music. I was also struggling with something that many academics experience: the notion that the work that went into that article would never be explored beyond a theoretical construct. Initially, I thought about publishing a version of the article somewhere oriented to practitioners (maybe musicians or sound artists), but I have no visibility into that world nor the right to ask for a platform. So, I decided to do something with the patterns we found.

In the last few months before publishing these lines, I have been creating images intended to be transformed into sound. However, instead of trying to embed the sound qualities in the image (as when I was co-writing the article), this time I limited the creative process to the geometry of the image. It would be mostly the image, and not the resulting sound, that provided the parameters for the pieces. That being said, I made a few decisions with the attributes of both formats in mind, to prevent discrepancies. For example:

  • I decided to create all the pieces in a red and blue duotone, which would make it easier for me to produce a stereo effect but would also be reminiscent of the traditional colours in anaglyphs (stereoscopic images).
  • I decided to work with a square image of 2520 × 2520 pixels. I chose this particular resolution for two reasons: (a) the number of pixels had to result in a whole number of seconds, which meant that it had to be divisible by 44100 (Hz, or samples per second), and (b) the number needed enough factors to provide a variety of “tones”. The resulting tones are, understandably, unconventional.
  • I decided to avoid editing or mastering the resulting audio file. As a result, the audio is quite rough.
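The arithmetic behind the 2520 × 2520 choice can be checked with a short sketch. It assumes one sample per pixel, so the duration applies to a single 8-bit channel. It also treats every divisor of the row width as a candidate wavelength, which is my reading of "factors" and not something the pieces themselves state.

```python
SAMPLE_RATE = 44100          # samples per second
WIDTH = HEIGHT = 2520        # image size in pixels

total_pixels = WIDTH * HEIGHT
# Whole seconds of audio and leftover samples, at one sample per pixel.
seconds, remainder = divmod(total_pixels, SAMPLE_RATE)


def divisors(n):
    """All positive divisors of n, in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


# A wavelength that divides the row width tiles each row exactly,
# so the pattern lines up from one row to the next.
wavelengths = [d for d in divisors(WIDTH) if d >= 2]
tones_hz = {w: SAMPLE_RATE / w for w in wavelengths}

print(f"{total_pixels} pixels -> {seconds} s, remainder {remainder}")
print(f"{len(wavelengths)} candidate wavelengths, e.g. 210 px -> {tones_hz[210]} Hz")
```

Most of these frequencies fall between the notes of equal temperament, which is consistent with the tones sounding unconventional.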

The compilation of pieces, titled Vuzak 101, is, in essence, an early exploration of the affordances of transmediation from raster images to sound. I did not intend to create something pleasant (either visually or aurally) or artistic, only something potentially interesting. Vuzak 101 will not be my last attempt at this; I plan to release other records of my experimentation with these methods.

Now, here it is, Vuzak 101. Mind the volume!

The making of Soundflake

I realize that this explanation, or the whole idea of visual synthesis, may seem pointless when the final product sounds like a poorly edited non-musical piece. I now offer some insight into the process and its technical side, not to prove its feasibility but to invite others to attempt it themselves.

This is Soundflake, the fourth piece in the playlist:

This image is a .jpg version of the .raw file. The .raw file itself can be imported into some photo editing software and some audio editing software.

Soundflake is composed of 8 different layers of square waves in two groups of four, overlapped using an overlay filter at 50% opacity.

Composite production layers of Soundflake. Top: Red channel (right). Bottom: Blue channel (left)

Each of these groups constitutes one channel: red for the right channel and blue for the left, as in the caption above. The channels can be seen in the photo editing software (let’s say Photoshop) if the image is opened as two interleaved channels, and heard in the audio editing software (let’s say Audacity) if the file is imported as two channels, unsigned 8-bit, with no endianness.
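As a rough sketch of what that import does, the snippet below treats a raw file as interleaved pairs of unsigned 8-bit samples and wraps them in a stereo WAV header. The exact byte layout of the Soundflake file is an assumption here, including which byte of each pair ends up on which side. The function names are illustrative.

```python
import io
import wave


def split_channels(raw_bytes):
    """Separate interleaved bytes into (first-of-pair, second-of-pair) channels."""
    return raw_bytes[0::2], raw_bytes[1::2]


def stereo_wav_bytes(raw_bytes, sample_rate=44100):
    """Wrap interleaved unsigned 8-bit samples in a 2-channel WAV container.

    WAV stores stereo frames interleaved too, so the bytes pass through
    unchanged; the first byte of each pair becomes the left channel.
    """
    if len(raw_bytes) % 2:
        raise ValueError("expected an even number of bytes (two per frame)")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(1)   # 1 byte per sample, so byte order does not apply
        w.setframerate(sample_rate)
        w.writeframes(raw_bytes)
    return buffer.getvalue()
```

With a real file, something like `open("soundflake.wav", "wb").write(stereo_wav_bytes(open("soundflake.raw", "rb").read()))` would produce a playable file (both paths are placeholders). Single-byte samples have no byte order, which is why the import needs no endianness setting.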

Now, what’s in these images? Let’s take one of the composites of the 8 layers of Soundflake:

The grey band at the top is just silence (baseline), as are all the bands of the same tone of grey between the checkered patterns. The checkered patterns are the tone. In this case, each one of the 12 contrasting pairs (light grey and dark grey) constitutes a wavelength, which defines the frequency of the tone. Each pair is 210 pixels long, which gives a tone of 210 Hz, between G#3 and A3. Based on the file size, these constant tones play for 6 seconds between 6 seconds of silence.
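The step from wavelength to pitch can be worked through in a few lines. This sketch assumes one sample per pixel within a channel, a 44100 Hz sample rate, and equal temperament with A4 = 440 Hz. The helper names are mine.

```python
import math

SAMPLE_RATE = 44100   # samples per second
ROW_WIDTH = 2520      # pixels per image row
WAVELENGTH = 210      # pixels in one light/dark pair

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(midi_number):
    """Name of a MIDI note number, with A4 = 69 and C4 = 60."""
    return f"{NOTE_NAMES[midi_number % 12]}{midi_number // 12 - 1}"


def note_frequency(midi_number):
    """Equal-tempered frequency in Hz of a MIDI note number (A4 = 440 Hz)."""
    return 440.0 * 2 ** ((midi_number - 69) / 12)


frequency = SAMPLE_RATE / WAVELENGTH        # 210.0 Hz
pairs_per_row = ROW_WIDTH // WAVELENGTH     # 12 pairs fit in one row

# Fractional position on the MIDI scale, then the two notes around it.
position = 69 + 12 * math.log2(frequency / 440.0)
lower, upper = math.floor(position), math.ceil(position)

print(f"{frequency} Hz sits between {note_name(lower)} and {note_name(upper)}")
```

The 12 pairs per row match the 12 contrasting pairs visible in the composite.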

If you have a raw image editor and Audacity on your computer and you are curious enough, you can try playing with the .raw file of Soundflake, which you can download here. If you are interested in more information about the other files in the collection, you can always send me an email.