Spectrograms of examples from the NSynth dataset and of audio resynthesized from audio features (loudness and fundamental frequency) extracted from the original audio. The bottom row shows the loudness features of the original audio and resynthesized audio. Spectral and loudness contours are largely reproduced by the resynthesized audio.
Original |
Resynthesis |
Synthesizing audio at different pitches with the same loudness envelope.
Pitch 24 | Pitch 36 | Pitch 43 |
Pitch 48 | Pitch 55 | Pitch 60 |
Pitch 67 | Pitch 72 | Pitch 79 |
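As a rough illustration of how the pitch conditioning for the grid above could be built, the sketch below assumes the pitch values are MIDI note numbers (as in NSynth) and converts each one to a constant fundamental-frequency contour that is paired with one shared loudness envelope. The function names, the placeholder envelope, and the dictionary layout are hypothetical and are not taken from the original system.

```python
def midi_to_hz(midi_pitch: float, a4_hz: float = 440.0) -> float:
    """Convert a MIDI note number to Hz (equal temperament, MIDI 69 = A4)."""
    return a4_hz * 2.0 ** ((midi_pitch - 69) / 12.0)


def constant_f0_contour(midi_pitch: float, n_frames: int) -> list:
    """Return a flat f0 contour (Hz) with one value per conditioning frame."""
    return [midi_to_hz(midi_pitch)] * n_frames


# Pitches shown in the grid above (assumed to be MIDI note numbers).
pitches = [24, 36, 43, 48, 55, 60, 67, 72, 79]

# Hypothetical loudness envelope in dB, reused unchanged for every pitch.
loudness_db = [-60.0, -30.0, -20.0, -25.0, -40.0, -60.0]

# One (f0 contour, loudness envelope) pair per target pitch.
conditioning = {
    p: (constant_f0_contour(p, len(loudness_db)), loudness_db) for p in pitches
}
```

Only the f0 contour changes between entries; the loudness envelope is the same object throughout, which mirrors the setup described in the caption.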
Interpolating in loudness conditioning.
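One simple way to interpolate loudness conditioning is a frame-wise linear blend between two loudness contours. The sketch below assumes both contours are in dB and have the same number of frames; whether the original examples blend in dB or in another domain is not stated here, so treat `interpolate_loudness` as a hypothetical helper.

```python
def interpolate_loudness(loud_a, loud_b, alpha):
    """Blend two loudness contours frame by frame.

    alpha = 0.0 returns loud_a, alpha = 1.0 returns loud_b.
    Assumes both contours are in dB and frame-aligned.
    """
    if len(loud_a) != len(loud_b):
        raise ValueError("loudness contours must have the same length")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(loud_a, loud_b)]


# Halfway between a quiet and a loud contour (hypothetical values).
midpoint = interpolate_loudness([-60.0, -30.0], [-20.0, -10.0], 0.5)
```

Sweeping `alpha` from 0 to 1 in small steps gives a sequence of conditioning contours that move gradually from the first envelope to the second.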
Applying tremolo to loudness conditioning.
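Tremolo is a periodic variation in loudness, so one way to apply it to the conditioning is to add a low-frequency sinusoid to the loudness contour. The sketch below assumes loudness in dB and a fixed conditioning frame rate; the parameter names and default values (`rate_hz`, `depth_db`, `frame_rate`) are illustrative assumptions rather than the settings used for the examples.

```python
import math


def apply_tremolo(loudness_db, rate_hz=5.0, depth_db=3.0, frame_rate=250.0):
    """Add a sinusoidal modulation (in dB) to a loudness contour.

    rate_hz    -- tremolo rate in cycles per second
    depth_db   -- peak deviation from the original loudness, in dB
    frame_rate -- conditioning frames per second (assumed value)
    """
    return [
        level + depth_db * math.sin(2.0 * math.pi * rate_hz * i / frame_rate)
        for i, level in enumerate(loudness_db)
    ]


# One second of flat loudness with a 5 Hz, +/-3 dB tremolo.
flat = [-30.0] * 250
modulated = apply_tremolo(flat)
```

The f0 conditioning is left untouched, so only the amplitude envelope of the synthesized audio would be expected to fluctuate.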
Applying vibrato to frequency conditioning. F0 extracted with CREPE.
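Vibrato is a periodic variation in pitch, so it can be applied by modulating the extracted f0 contour before it is passed to the synthesizer. The sketch below takes an f0 contour in Hz as a plain list (the extraction step itself is not shown) and applies a sinusoidal deviation measured in cents, which keeps the modulation symmetric in pitch. The function name, defaults, and frame rate are assumptions made for illustration.

```python
import math


def apply_vibrato(f0_hz, rate_hz=5.0, depth_cents=50.0, frame_rate=250.0):
    """Modulate an f0 contour with a sinusoidal pitch deviation.

    rate_hz     -- vibrato rate in cycles per second
    depth_cents -- peak deviation in cents (100 cents = 1 semitone)
    frame_rate  -- conditioning frames per second (assumed value)
    """
    out = []
    for i, f0 in enumerate(f0_hz):
        cents = depth_cents * math.sin(2.0 * math.pi * rate_hz * i / frame_rate)
        out.append(f0 * 2.0 ** (cents / 1200.0))
    return out


# One second of a steady A4 with a 5 Hz, +/-50 cent vibrato.
steady = [440.0] * 250
with_vibrato = apply_vibrato(steady)
```

Working in cents rather than Hz means the same depth setting produces the same perceived pitch excursion for low and high notes.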