Imagine that you somehow could absorb all of the sound patches created for a specific instrument, say the Yamaha DX7, and your brain was wired in a way that would make it possible for you to see common patterns in sounds of a certain type and the respective settings in the patch data.
Also, imagine your roommate being an FM sound aficionado, to the extent that she spends entire days just memorizing DX7 patch sheets, classifying them according to timbre, and finding deep joy in reading particularly beautiful combinations of frequencies, output levels, and envelope curves.
She also made you a bet, that no matter how hard you study, you won’t be able to fool her by slipping a fake patch sheet by her, something you just made up that might even sound horrible. The bet is on!
This is pretty much how a GAN, or Generative Adversarial Network, works: you are the generator, trying to produce convincing fakes, and your roommate is the discriminator, trying to tell them apart from the real thing. Deep DX is such a network, trained on thousands and thousands of sysex files for the Yamaha DX7, and initial results were good enough to warrant further research.
When we delve deeper into the actual code, the data representation, and the trials and tribulations of taming algorithms that tend to collapse into shrieking noise or to create almost identical versions of the same sound over and over again, I promise to share my experiences with GAN training, bugs, and less-than-pleasing sonic results. For now, let’s jump forward to the present and a few examples of what Deep DX can produce in its current state.
Well, what are the sounds like?
Cliffhangers are fun and all that, but I think it would be a better service to you, dear reader, to have an opportunity to already hear for yourself some of the results, before we jump into the deep end and describe how we got here.
As a teaser, here are some demos recorded with sounds generated through a couple of different parameter combinations. If you happen to have access to a DX7, TX7 or compatible unit (Dexed is a wonderful emulation that was used heavily in the early phases of trial and error with Deep DX), you can also try the sounds out yourself.
All sound banks demoed below are provided here as sysex files. To test the patches, simply download and unpack the ZIP archive below and use your favourite tool to send them to your synthesizer (or load them directly into Dexed or another virtual representation of the DX7).
These banks are provided as-is. They have not been curated, and there are anomalies. Expect a few duds in every bank, which I think is a perfectly fine tradeoff for a larger span of sounds in the end. Also, make sure to back up any sounds you want to keep before you load these into your DX7/TX7, as each bank will overwrite all 32 sound slots.
The banks provided have been tested on a real DX7 and TX7, and also on both the Arturia and Dexed emulations of the real thing. But no guarantees are provided.
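If you want to peek inside a bank before sending it anywhere, the sketch below reads the 32 voice names from a file. It assumes the standard DX7 32-voice bulk dump layout (a 6-byte header, 4096 bytes of packed voice data, a checksum byte and a closing 0xF7) and is not taken from the Deep DX code. The function names are my own, and some sysex files found in the wild carry a wrong checksum, hence the `strict` switch.

```python
# Minimal sketch: list the voice names in a DX7 32-voice bank dump.
# Layout assumed here: F0 43 0n 09 20 00 <4096 data bytes> <checksum> F7

VOICES_PER_BANK = 32
PACKED_VOICE_SIZE = 128          # bytes per voice in the packed bank format
NAME_OFFSET = 118                # the voice name occupies the last 10 bytes
PAYLOAD_SIZE = VOICES_PER_BANK * PACKED_VOICE_SIZE
BANK_SIZE = 6 + PAYLOAD_SIZE + 2


def dx7_checksum(payload):
    """Two's complement of the payload sum, masked to 7 bits."""
    return (-sum(payload)) & 0x7F


def read_bank_voice_names(data, strict=True):
    """Return the 32 voice names stored in a bank dump (hypothetical helper)."""
    if len(data) != BANK_SIZE:
        raise ValueError(f"expected {BANK_SIZE} bytes, got {len(data)}")
    header_ok = (
        data[0] == 0xF0              # start of sysex
        and data[1] == 0x43          # Yamaha manufacturer ID
        and (data[2] & 0xF0) == 0x00 # sub-status 0, low nibble is the channel
        and data[3] == 0x09          # format 9: 32 voices
        and data[4] == 0x20          # byte count MSB
        and data[5] == 0x00          # byte count LSB
    )
    if not header_ok or data[-1] != 0xF7:
        raise ValueError("not a DX7 32-voice bank dump")
    payload = data[6:6 + PAYLOAD_SIZE]
    if strict and dx7_checksum(payload) != data[-2]:
        raise ValueError("checksum mismatch")
    names = []
    for i in range(VOICES_PER_BANK):
        voice = payload[i * PACKED_VOICE_SIZE:(i + 1) * PACKED_VOICE_SIZE]
        raw_name = voice[NAME_OFFSET:PACKED_VOICE_SIZE]
        names.append(raw_name.decode("ascii", errors="replace").rstrip())
    return names
```

To use it, read the file in binary mode, for example `read_bank_voice_names(open(path, "rb").read())`, where `path` points at one of the unpacked sysex files.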
If you don’t own a DX7 (poor soul!) or just can’t be bothered to dig it out of the attic and hook it up to your computer, I also recorded a few example demos that you can listen to right away, to get a feel for the results. They will not win any prizes for originality but I made sure to make them slightly more enjoyable than just plodding through sound after sound.
Yes, they are recorded with effects. The sounds have been subjected to a bit of chorus, a dash of Blackhole delay/reverb and gentle compression to take the sting out of the worst volume jumps that may occur when changing sounds – you have been warned. Why effects, you may ask? Well, this is the way I would record DX sounds myself in context, and after subjecting myself to listening to raw FM sounds of varying quality for hours on end over the past months, I would not want to expose you, dear reader, to the same experience. After all, if you really want the original experience, the patches are there for you to download and do whatever you wish with. (Except sell them, I guess.)
FM Pianos, of course!
This is basically me changing patches manually on a Yamaha TX7 while playing back a MIDI loop (and some drums, to make it more listenable – hopefully). These demos are curated in the sense that I skip a couple of patches per bank to keep it short. You can definitely hear a few variations of that Rhodes sound in here, but also other interesting variations on the theme. I apologize for the slightly jarring changes at a few points, which is an unfortunate (or sometimes, interesting) effect when changing sounds abruptly on the DX synthesizers.
Care for some pads? Here we go:
DX7 Pads anyone?
What did I define as “pads” by the way? Well, as I will discuss in a later post, classifying the training data turned out to be one of the major challenges, and as I figured out early on that I wanted to be able to generate sounds from a given category, “Pads” would be the obvious next choice after “Pianos”. But what is a pad?
In short, I defined a “pad” as any sound with a slow-to-medium attack and an “interesting envelope”. As with many things in this endeavour, a better definition could probably be found, but I still think it generates mainly sounds that have a “pad character”.
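To make the first half of that definition concrete, here is a rough sketch of an attack check. It is not the rule Deep DX uses: the function name, the idea of looking only at the carriers' first envelope rate, and the threshold of 60 are all assumptions of mine. The one hard fact it relies on is that DX7 envelope rates run from 0 to 99, with higher values meaning faster segments. The "interesting envelope" part is left out, since the text does not say how it was measured.

```python
# Hypothetical heuristic, NOT the classifier used for Deep DX:
# treat a patch as "pad-like" when even its fastest carrier attack is
# slow to medium. DX7 EG rates range from 0 (slowest) to 99 (fastest).

def has_slow_to_medium_attack(carrier_attack_rates, max_rate=60):
    """Check EG rate 1 of every carrier operator against a threshold.

    carrier_attack_rates: EG rate 1 for each carrier in the patch.
    max_rate: assumed cut-off between "medium" and "fast" attacks.
    """
    if not carrier_attack_rates:
        raise ValueError("need at least one carrier attack rate")
    for rate in carrier_attack_rates:
        if not 0 <= rate <= 99:
            raise ValueError(f"EG rate out of range: {rate}")
    # The fastest carrier decides: one snappy carrier makes the whole
    # patch speak immediately, however slow the others are.
    return max(carrier_attack_rates) <= max_rate
```

With this threshold, a patch whose carriers have attack rates 35 and 50 passes, while one with rates 99 and 40 does not.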
An interesting thing with GANs is that you can explore the latent parameter space of different sounds. We will go through this in more detail eventually, but what it means is that we can mix the internal parameter representations of two sounds and come up with an intermediary sound, if that makes any sense. To be more concrete, we can actually instruct Deep DX to generate sounds that have, for instance, the characteristics of both a piano and a pad.
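The simplest way to picture this mixing is linear interpolation between two latent vectors, sketched below. This is a generic illustration under my own assumptions, not the Deep DX implementation: the text does not say whether Deep DX blends latent vectors, category labels, or both, and the function names are made up. Each vector produced here would be fed to the generator to get one in-between patch.

```python
# Generic sketch of latent-space interpolation (assumed technique,
# not confirmed as the method Deep DX uses).

def lerp_latent(z_a, z_b, t):
    """Blend two latent vectors: t=0 gives z_a, t=1 gives z_b."""
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


def interpolation_path(z_a, z_b, steps):
    """Return `steps` evenly spaced vectors from z_a to z_b, inclusive."""
    if steps < 2:
        raise ValueError("need at least two steps")
    return [lerp_latent(z_a, z_b, i / (steps - 1)) for i in range(steps)]
```

For example, `interpolation_path(z_piano, z_pad, 5)` would give five vectors moving from a hypothetical piano latent `z_piano` to a hypothetical pad latent `z_pad`, with the middle one being an even mix of the two.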
Here is a demo of what Deep DX thinks they should sound like:
Pianos – and Pads, combined
What would be another logical category for FM sounds? Why, plucked sounds of course. If there ever was a synthesis method that could have had “Pluck” as its second name, FM would be it. The category of Pluck offers everything between bells, xylophones and almost-piano sounds which gives it a slightly larger span than the previous ones:
After this, the categories are somewhat less obvious. Here are a few demos of other types of sounds created by Deep DX and fed into a TX7 with a bit of chorus and reverb added.
It’s time to sign off for this time, but to put the sounds in even more context, here is a demo song using only sounds generated from Deep DX (apart from the drums) as generously presented by bitley!
And that's it for this time. Next time, we will look at how the patch data was represented for the convolutional networks to chew on, and the arduous process of categorizing training data appropriately!