Luka and Marko: MetalGAN
Somewhat atypically, we started our project by picking the dataset we wanted to use: a collection of 140k metal album covers with various metadata (artist, release year, genre, etc.). With both of us being metal fans, that choice was easy; the harder part was deciding which of the many topics we’d learned about at PSIML seemed the most interesting.
We settled on trying to generate (new) covers for a given subgenre using a type of Generative Adversarial Network (GAN) that can do both generation and classification, and that estimates how closely a generated image resembles a real album cover of that subgenre. After filtering the data to remove the bizarre subgenre entries and cleaning up the rest (which was both a harder and funnier task than we’d anticipated - Doom Metal with Middle Eastern Folk influences, anyone?), we took a random sample and were finally ready to experiment with machine learning techniques.
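To give a flavor of that cleanup step, here is a minimal sketch in plain Python. It is not the code used in the project: the record layout (a dict with a "subgenre" key), the separator rules, and the names normalize_subgenre and clean_and_sample are all assumptions made for illustration.

```python
import random
from collections import Counter


def normalize_subgenre(raw):
    """Lowercase, trim, and keep only the first listed subgenre.

    Hypothetical rule: a label such as "Doom Metal with Middle Eastern
    Folk influences" is cut down to the part before " with ".
    """
    label = raw.strip().lower()
    for sep in (" with ", "/", ",", ";"):
        label = label.split(sep)[0]
    return label.strip()


def clean_and_sample(records, min_count, sample_size, seed=0):
    """Normalize labels, drop rare subgenres, then draw a random sample."""
    cleaned = [dict(r, subgenre=normalize_subgenre(r["subgenre"])) for r in records]
    counts = Counter(r["subgenre"] for r in cleaned)
    # Keep only non-empty labels that occur often enough to learn from.
    kept = [r for r in cleaned if r["subgenre"] and counts[r["subgenre"]] >= min_count]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(kept, min(sample_size, len(kept)))
```

Calling clean_and_sample(records, min_count=2, sample_size=1000) on a list of such dicts would return at most 1000 records whose normalized subgenre appears at least twice. The threshold and sample size here are placeholders, not the values used in the project.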
The generative network worked hard to produce an image based on the subgenre, while the classifier did its best to judge what had been generated and provide feedback to the generator. For days, all we had were different kinds of low-res psychedelic and abstract art that changed with the network architecture and our tweaks but made no sense whatsoever.
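The tug-of-war between the two networks can be written down as a pair of losses. The sketch below follows the general shape of an auxiliary-classifier GAN objective, which is only an assumption about the variant: the post does not name the exact architecture or loss, and every function name here is hypothetical. It works on plain probabilities for a single image instead of real network outputs.

```python
import math

EPS = 1e-12  # keeps log() finite when a probability is exactly 0


def bce(p, target):
    """Binary cross-entropy of probability p against a 0/1 target."""
    return -(target * math.log(p + EPS) + (1 - target) * math.log(1 - p + EPS))


def class_nll(class_probs, true_class):
    """Negative log-likelihood of the true subgenre index."""
    return -math.log(class_probs[true_class] + EPS)


def discriminator_loss(p_real_on_real, p_real_on_fake,
                       probs_real, probs_fake, y_real, y_fake):
    """Penalize wrong real/fake calls and wrong subgenre predictions."""
    source = bce(p_real_on_real, 1) + bce(p_real_on_fake, 0)
    classes = class_nll(probs_real, y_real) + class_nll(probs_fake, y_fake)
    return source + classes


def generator_loss(p_real_on_fake, probs_fake, y_fake):
    """The generator wants its image judged real AND assigned the
    subgenre it was asked to produce."""
    return bce(p_real_on_fake, 1) + class_nll(probs_fake, y_fake)
```

With these definitions, a generated cover that the classifier rates as probably real and of the requested subgenre yields a lower generator_loss than one rated as fake or as the wrong subgenre. That difference is the feedback signal the generator trains on.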
It was not until we started training on a larger sample (d’oh!) that we finally saw something discernible. The fact that it was a fairly raw dataset, combined with the nature of the images (it was art, after all), meant that variability was quite large; even judging what constituted a “good result” was nearly impossible to do objectively. Until the penultimate day we were experimenting with different network architectures and parameters, attempting to subdue the white noise patterns that emerged periodically, and we left our “best guess” to train over the final night.
In the morning, we had something to see: patterns were there, classifications made sense (at least to us and our mentor) and in some cases even letter-like squiggles surfaced where one would expect to find a band logo on an album cover. After some finishing touches, we had results to show!
And, more importantly, we were miles ahead of where we’d started. In those several days we managed to come up with an original project, wrangle the data, fail (multiple times), realize what we’d been doing wrong and correct it. The best part, of course, is being able to look back at some truly kvlt album art after the happy ending.