Fast and Flexible Neural Audio Synthesis

Online Supplement

Lamtharn Hantrakul, Jesse Engel, Adam Roberts, Chenjie Gu

Main Paper

Appendix

Contents

Resynthesis with Extracted Features
Pitch Shifting
Interpolating Loudness
Adding Tremolo
Adding Vibrato

Resynthesis with Extracted Features

Spectrograms of examples from the NSynth dataset and of audio resynthesized from features (loudness and fundamental frequency) extracted from the original audio. The bottom row shows the loudness features of the original and resynthesized audio. The resynthesized audio largely reproduces the spectral and loudness contours of the original.

Resynthesis with extracted features
Original
Resynthesis
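The supplement does not say how the loudness feature is computed. As a rough illustration of what extracting a loudness contour can look like, the sketch below computes a frame-wise RMS level in dB with NumPy. The function name `rms_loudness_db`, the frame and hop sizes, and the use of plain RMS (rather than a perceptually weighted loudness) are all assumptions made for this example, not the authors' method.

```python
import numpy as np

def rms_loudness_db(audio, frame_size=1024, hop_size=256, eps=1e-7):
    """Frame-wise RMS level in dB (hypothetical stand-in for a loudness feature)."""
    audio = np.asarray(audio, dtype=np.float64)
    # Number of full frames that fit in the signal (at least one frame).
    n_frames = 1 + max(0, (len(audio) - frame_size) // hop_size)
    levels = np.empty(n_frames)
    for i in range(n_frames):
        frame = audio[i * hop_size : i * hop_size + frame_size]
        rms = np.sqrt(np.mean(frame ** 2))
        # Clamp to eps so that silent frames do not produce -inf.
        levels[i] = 20.0 * np.log10(max(rms, eps))
    return levels

# Usage: one second of a 440 Hz sine sampled at 16 kHz (assumed values).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
sine = np.sin(2.0 * np.pi * 440.0 * t)
loudness = rms_loudness_db(sine)
```

A full-scale sine has an RMS of about 0.707, so each frame of `loudness` sits close to -3 dB. Comparing such contours for the original and the resynthesized audio is one way to read the bottom row of the figure.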

Pitch Shifting

Synthesizing audio at different pitches with the same loudness envelope.

Pitch shifting
Pitch 24 Pitch 36 Pitch 43
Pitch 48 Pitch 55 Pitch 60
Pitch 67 Pitch 72 Pitch 79
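Assuming the pitch labels above are MIDI note numbers (the convention used by NSynth), the fundamental frequency for each example can be derived with the standard equal-temperament formula. The sketch below is illustrative only; `midi_to_hz`, `constant_f0_contour` and the frame count are hypothetical names and values, and the supplement does not describe how the conditioning is actually built.

```python
import numpy as np

def midi_to_hz(pitch):
    """Convert MIDI note numbers to frequency in Hz (A4 = MIDI 69 = 440 Hz)."""
    pitch = np.asarray(pitch, dtype=np.float64)
    return 440.0 * 2.0 ** ((pitch - 69.0) / 12.0)

def constant_f0_contour(pitch, n_frames):
    """Flat fundamental-frequency contour for one pitch (assumed conditioning shape)."""
    return np.full(n_frames, float(midi_to_hz(pitch)))

# The nine pitches listed in the table above.
pitches = [24, 36, 43, 48, 55, 60, 67, 72, 79]
f0_hz = midi_to_hz(pitches)

# Same (assumed) frame count for every pitch, so that one loudness
# envelope can be paired with each frequency contour.
contours = [constant_f0_contour(p, n_frames=250) for p in pitches]
```

Under this assumption, pitch 60 corresponds to middle C at roughly 261.6 Hz, and each step of 12 doubles the frequency.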

Interpolating Loudness

Interpolating in loudness conditioning.

Interpolating loudness
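One simple way to interpolate a loudness conditioning signal is a linear blend between two contours of equal length. The sketch below assumes the contours are expressed in dB and that the blend is linear in that domain; neither detail is stated in the supplement, and `interpolate_loudness` is a hypothetical helper.

```python
import numpy as np

def interpolate_loudness(loudness_a, loudness_b, alpha):
    """Linear blend of two loudness contours; alpha=0 gives A, alpha=1 gives B."""
    a = np.asarray(loudness_a, dtype=np.float64)
    b = np.asarray(loudness_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("loudness contours must have the same shape")
    return (1.0 - alpha) * a + alpha * b

# Usage: a crescendo and a decrescendo (made-up values in dB).
rising = np.array([-60.0, -40.0, -20.0])
falling = np.array([-20.0, -40.0, -60.0])
steps = [interpolate_loudness(rising, falling, a) for a in (0.0, 0.5, 1.0)]
```

With these example contours the midpoint (`alpha=0.5`) is a flat envelope at -40 dB, since the rising and falling shapes cancel.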

Adding Tremolo

Applying tremolo to loudness conditioning.

Adding tremolo
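Tremolo is a periodic variation in loudness, so it can be expressed as a low-frequency sinusoid added to the loudness contour. The following is a minimal sketch under assumed parameters: the frame rate, modulation rate and depth are illustrative values, the contour is assumed to be in dB, and `add_tremolo` is a hypothetical helper rather than the authors' code.

```python
import numpy as np

def add_tremolo(loudness_db, frame_rate, rate_hz=5.0, depth_db=3.0):
    """Add a sinusoidal modulation of +/- depth_db to a loudness contour."""
    loudness_db = np.asarray(loudness_db, dtype=np.float64)
    # Time in seconds of each conditioning frame.
    t = np.arange(len(loudness_db)) / frame_rate
    return loudness_db + depth_db * np.sin(2.0 * np.pi * rate_hz * t)

# Usage: a flat -20 dB envelope, 250 frames per second (assumed values).
flat = np.full(100, -20.0)
with_tremolo = add_tremolo(flat, frame_rate=250.0, rate_hz=5.0, depth_db=3.0)
```

At 250 frames per second, a 5 Hz tremolo repeats every 50 frames, and the modulated envelope stays within 3 dB of the original.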

Adding Vibrato

Applying vibrato to frequency conditioning. F0 extracted with CREPE.

Adding vibrato
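Vibrato is a periodic variation in pitch, so it can be sketched as a sinusoidal deviation applied to the fundamental-frequency contour. The example below expresses the deviation in cents so that it is symmetric in pitch rather than in Hz. The frame rate, rate and depth are assumed values, `add_vibrato` is a hypothetical helper, and the F0 contour here is synthetic rather than extracted with CREPE.

```python
import numpy as np

def add_vibrato(f0_hz, frame_rate, rate_hz=5.0, depth_cents=50.0):
    """Modulate an F0 contour by +/- depth_cents at rate_hz."""
    f0_hz = np.asarray(f0_hz, dtype=np.float64)
    # Time in seconds of each conditioning frame.
    t = np.arange(len(f0_hz)) / frame_rate
    cents = depth_cents * np.sin(2.0 * np.pi * rate_hz * t)
    # 1200 cents per octave, so a deviation of c cents scales F0 by 2^(c/1200).
    return f0_hz * 2.0 ** (cents / 1200.0)

# Usage: a steady A4 at 440 Hz, 250 frames per second (assumed values).
steady = np.full(100, 440.0)
with_vibrato = add_vibrato(steady, frame_rate=250.0, rate_hz=5.0, depth_cents=50.0)
```

A depth of 50 cents is half a semitone, which keeps the modulated contour between roughly 427 Hz and 453 Hz for a 440 Hz note.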