LLark: A Multimodal Foundation Model for Music

Josh Gardner*, Simon Durand, Daniel Stoller, Rachel Bittner

Spotify (*University of Washington; work completed while an intern at Spotify.)


This is the companion website for the paper 'LLark: A Multimodal Foundation Model for Music' (under review at ICLR 2024). This website presents a selection of example outputs from our model, alongside the music used to produce these outputs. We also provide samples of the training data used for our model and comparisons to other multimodal models on captioning and reasoning tasks.

Abstract: Music has a unique and complex structure which is challenging for both expert humans and existing AI systems to understand, and presents unique challenges relative to other forms of audio. We present LLark, an instruction-tuned multimodal model for music understanding. We detail our process for dataset creation, which involves augmenting the annotations of diverse open-source music datasets and converting them to a unified instruction-tuning format. We propose a multimodal architecture for LLark, integrating a pretrained generative model for music with a pretrained language model. In evaluations on three types of tasks (music understanding, captioning, and reasoning), we show that our model outperforms existing baselines in zero-shot generalization for music understanding, and that humans show a high degree of agreement with the model's responses in captioning and reasoning tasks. LLark is trained entirely from open-source music data and models, and we make our training code available along with the release of this paper.

Note: LLark only processes audio crops of up to 25 seconds. For some datasets, we cannot host the original audio due to licensing restrictions, so we link to the full track at an external source instead; in those cases, LLark still received only a single 25-second crop of the track to generate its response.
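
To make the cropping constraint concrete, here is a minimal sketch of taking a 25-second crop from a longer track. The function name `crop_audio`, the 16 kHz sample rate, and the 10-second start offset are illustrative assumptions; this page does not say how crops were selected or at what sample rate the audio was processed.

```python
from typing import Sequence

CROP_SECONDS = 25.0  # crop length stated in the note above


def crop_audio(samples: Sequence[float], sample_rate: int,
               start_seconds: float = 0.0) -> Sequence[float]:
    """Return at most CROP_SECONDS of audio starting at start_seconds.

    Hypothetical helper: tracks shorter than the crop are returned as-is.
    """
    start = int(round(start_seconds * sample_rate))
    stop = start + int(round(CROP_SECONDS * sample_rate))
    return samples[start:stop]


# One minute of silence at an assumed 16 kHz sample rate.
sample_rate = 16_000
full_track = [0.0] * (60 * sample_rate)
crop = crop_audio(full_track, sample_rate, start_seconds=10.0)
print(len(crop) / sample_rate)  # duration of the crop in seconds
```

Slicing never reads past the end of the track, so a clip shorter than 25 seconds simply yields a shorter crop.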

To view metadata and license info for each track, hover over the 🎶 icon next to each prompt. The metadata has the form: Source: A dataset Title: Song title Artist: Artist name License: License info


Music Understanding (Classification/Regression) Tasks

Tempo Estimation (Giant Steps Tempo Dataset)


Prompt: What is the tempo of this song? 🎶 Source: Giant Steps Tempo Dataset Title: Merlin Artist: BOYS NOIZE


LLark: The tempo of this song is approximately 125 beats per minute (BPM).


Label: 125 BPM


Prompt: What is the tempo of this song? 🎶 Source: Giant Steps Tempo Dataset Title: See 4 Eva (Cube Hard Remix) Artist: Petrol Heads


LLark: The tempo of this song is approximately 171.4 beats per minute (BPM).


Label: 170 BPM


Prompt: What is the tempo of this song? 🎶 Source: Giant Steps Tempo Dataset Title: Typhoon ft. Chasing Shadows Artist: Foreign Beggars


LLark: The tempo of this song is approximately 69.8 beats per minute (BPM).


Label: 66 BPM


Prompt: What is the tempo of this song? 🎶 Source: Giant Steps Tempo Dataset Title: Gyroscope (Original Mix) Artist: Silica


LLark: The tempo of this song is approximately 139.5 beats per minute (BPM).


Label: 139.5 BPM
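
One way to read the four tempo examples above is to check each prediction against its label with a relative tolerance. The sketch below uses a 4% tolerance, a convention often seen in tempo estimation; this page does not state which metric the paper uses, so both the threshold and the helper name `tempo_within_tolerance` are assumptions.

```python
def tempo_within_tolerance(predicted_bpm: float, label_bpm: float,
                           tolerance: float = 0.04) -> bool:
    """True if the prediction is within `tolerance` (relative) of the label.

    The 4% default is an assumed convention, not taken from this page.
    """
    return abs(predicted_bpm - label_bpm) <= tolerance * label_bpm


# (title, LLark prediction, label) copied from the examples above.
examples = [
    ("Merlin", 125.0, 125.0),
    ("See 4 Eva (Cube Hard Remix)", 171.4, 170.0),
    ("Typhoon ft. Chasing Shadows", 69.8, 66.0),
    ("Gyroscope (Original Mix)", 139.5, 139.5),
]

results = {title: tempo_within_tolerance(pred, label)
           for title, pred, label in examples}
for title, ok in results.items():
    print(f"{title}: {'within' if ok else 'outside'} 4% of the label")
```

Under this assumed threshold, three of the four predictions fall within tolerance; the 69.8 BPM prediction against a 66 BPM label is about 5.8% off and falls outside it.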

Key Estimation (Giant Steps Key Dataset)


Prompt: What is the key of this song? 🎶 Source: Giant Steps Key Dataset Title: Release Me Artist: The Attic feat. Oh Laura


LLark: The key of this song is F minor.


Label: F minor


Prompt: What is the key of this song? 🎶 Source: Giant Steps Key Dataset Title: You Don't Have to Hopp (Guti Remix) Artist: Nick Curly


LLark: The key of this song is Db major.


Label: Db major


Prompt: What is the key of this song? 🎶 Source: Giant Steps Key Dataset Title: Deep Style (Original Mix) Artist: Martin Roth


LLark: The key of this song is D# minor.


Label: Eb minor


Prompt: What is the key of this song? 🎶 Source: Giant Steps Key Dataset Title: Timbuktu Artist: Francesco Diaz & Young Rebels


LLark: The key of this song is F# major.


Label: Gb major
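
Two of the key examples above differ from their labels only in spelling: D# minor and Eb minor, and F# major and Gb major, are enharmonic equivalents (the same pitches written with different note names). The sketch below compares keys by pitch class rather than by string, so such pairs count as matches. The helpers `parse_key` and `same_key` are hypothetical and assume keys are written as a note name followed by a mode, as on this page.

```python
from typing import Tuple

# Pitch classes of the natural notes (C = 0, ..., B = 11).
NATURAL_PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def parse_key(key: str) -> Tuple[int, str]:
    """Parse a key like 'D# minor' into (pitch class of the tonic, mode)."""
    tonic, mode = key.strip().split()
    pitch_class = NATURAL_PITCH_CLASSES[tonic[0].upper()]
    for accidental in tonic[1:]:
        if accidental == "#":
            pitch_class += 1   # sharp raises the note by a semitone
        elif accidental == "b":
            pitch_class -= 1   # flat lowers the note by a semitone
        else:
            raise ValueError(f"Unrecognized accidental in {key!r}")
    return pitch_class % 12, mode.lower()


def same_key(a: str, b: str) -> bool:
    """True if two key names denote the same tonic pitch class and mode."""
    return parse_key(a) == parse_key(b)


print(same_key("D# minor", "Eb minor"))  # enharmonic spellings of one key
print(same_key("F# major", "Gb major"))
print(same_key("F minor", "F major"))    # same tonic, different mode
```

With this comparison, all four key examples above agree with their labels.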

Genre Estimation (GTZAN Dataset)


Prompt: What genre is this song? 🎶 Source: GTZAN Dataset Title: The Four Seasons, Op. 8: Concerto No. 2 for Violin in G Minor, RV 315 'Summer': III. Presto Artist: Antonio Vivaldi · Jose-Luis Garcia · Leonard Slatkin · English Chamber Orchestra


LLark: The genre of this song is classical.


Label: classical


Prompt: What genre is this song? 🎶 Source: GTZAN Dataset Title: Deep Throat Blues Artist: James Carter


LLark: The song can be classified as both jazz and instrumental pop.


Label: jazz


Prompt: What genre is this song? 🎶 Source: GTZAN Dataset Title: Divine Lines Artist: Jeremiah Dirt


LLark: The genre of this song is Hip-Hop.


Label: hip-hop


Prompt: What genre is this song? 🎶 Source: GTZAN Dataset Title: Shame on the Night Artist: Dio


LLark: The song is in the rock genre.


Label: metal

Instrument Identification (MedleyDB Dataset)


Prompt: List the instruments you hear in this song, including drums and vocals. 🎶 Source: MedleyDB Title: Booth Shot Lincoln Artist: Chris Jacoby License: Creative Commons Attribution - NonCommercial - ShareAlike license


LLark: In this clip, I can hear the following instruments: guitar, bass, drums, and vocals.


Label: ['drums', 'mandolin (guitar)', 'clean electric guitar (guitar)']


Prompt: List the instruments you hear in this song, including drums and vocals. 🎶 Source: MedleyDB Title: Cool Jazz Artist: Music Delta License: Creative Commons Attribution - NonCommercial - ShareAlike license


LLark: In this clip, I can hear the piano, saxophone, bass, and drums. There are no vocals in this clip.


Label: ['bass', 'drums', 'piano', 'saxophone', 'trumpet']


Prompt: List the instruments you hear in this song, including drums and vocals. 🎶 Source: MedleyDB Title: All Good Things Artist: Michael Kropf License: Creative Commons Attribution - NonCommercial - ShareAlike license


LLark: In this clip, you can hear the violin and the piano.


Label: ['piano', 'violin']


Prompt: List the instruments you hear in this song, including drums and vocals. 🎶 Source: MedleyDB Title: Fairer Hopes Artist: Matthew Entwistle License: Creative Commons Attribution - NonCommercial - ShareAlike license


LLark: In this clip, I can hear the piano, computer-generated sounds, strings, and synthesizers. There are no vocals or drums present.


Label: ['drums', 'cello', 'bass', 'harp', 'piano', 'flute']
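
Because the instrument responses are free text while the labels are lists, one simple way to compare them is to treat both as sets and compute precision and recall. The sketch below does this for the "Cool Jazz" and "All Good Things" examples above. The predicted sets were transcribed by hand from LLark's responses, and the helper `precision_recall` is hypothetical; this page does not describe how the paper scores this task.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], label: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall of predicted instruments against labels."""
    overlap = predicted & label
    precision = len(overlap) / len(predicted) if predicted else 0.0
    recall = len(overlap) / len(label) if label else 0.0
    return precision, recall


# "Cool Jazz": LLark names four instruments; the label also lists a trumpet.
cool_jazz = precision_recall(
    {"piano", "saxophone", "bass", "drums"},
    {"bass", "drums", "piano", "saxophone", "trumpet"},
)

# "All Good Things": the response and the label name the same two instruments.
all_good_things = precision_recall({"violin", "piano"}, {"piano", "violin"})

print(cool_jazz)
print(all_good_things)
```

Exact string matching is a deliberate simplification: labels such as 'mandolin (guitar)' would need a mapping to coarser instrument families before a response like "guitar" could be counted as a match.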


Music Captioning Task

Music Captioning on Test Datasets

MusicCaps Dataset


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: MusicCaps/AudioSet Dataset Title: -hSMzrWZCAE License: CC Attribution 4.0 International


LLark: As I listen to the audio clip, I hear a rock song with a lively and energetic feel. The song starts with a guitar playing a short riff, followed by a drum beat. The guitar continues to play a catchy melody, while the drums maintain a steady rhythm throughout the clip. The tempo of the song is moderately fast, with a beat that makes you want to move and dance.

The guitar is the main instrument in this clip, providing the melodic and harmonic foundation of the song. It plays a combination of power chords and melodic lines, creating a sense of tension and release. The guitar is played with a distorted tone, which gives it a raw and edgy sound.

The drums provide a driving rhythm that keeps the song moving forward. They play a consistent beat, emphasizing the downbeats and adding a strong sense of groove to the music. The drum pattern is simple but effective, providing a solid foundation for the other instruments.

The overall mood of the song is upbeat and lively, with a sense of excitement and energy. The combination of the guitar and drums creates a dynamic and catchy sound that is characteristic of rock music.

In terms of the song's structure, I can hear distinct sections such as the verse and chorus. The verse features a repetitive guitar riff, while the chorus introduces a different chord progression and a more energetic guitar solo. These sections help to create contrast and keep the listener engaged.

The song is in the key of D minor, which gives it a darker and more intense sound. The chords used in the song include D minor, D major, and G major, which contribute to the overall harmonic structure and progression.

Overall, this audio clip showcases a rock song with a catchy guitar riff, driving drums, and an energetic feel. The combination of the guitar and drums creates a dynamic and lively sound that is characteristic of the rock genre.


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: MusicCaps/AudioSet Dataset Title: -dVgUSrR8g4 License: CC Attribution 4.0 International


LLark: As I listen to the music clip, I am immediately struck by the powerful and intense atmosphere created by the combination of instruments and vocals. The song begins with a strong guitar riff, which is accompanied by a driving drum beat. The guitar plays a repeating pattern of power chords, creating a sense of tension and energy. The drums provide a steady rhythm, with the kick drum emphasizing each beat, while the snare drum adds a crisp accent on the second and fourth beats of each measure.

As the clip progresses, the guitar continues to play its powerful riff, while the drums maintain their energetic rhythm. The guitar's distorted tone adds to the aggressive and heavy feel of the music. The absence of other instruments in this section allows the guitar and drums to take center stage and showcase their raw power.

Around the 13-second mark, the guitar briefly transitions to a higher register, playing a series of rapid notes. This creates a moment of tension and anticipation before returning to the main riff. The guitar's melodic lines are accompanied by the drums, which add a sense of dynamic and rhythmic variation to the music.

At around the 16-second mark, the drums momentarily pause, allowing the guitar to take over the spotlight with a solo. The guitarist showcases their technical skill and improvisational abilities, playing intricate melodic lines and powerful chords. This solo section adds a sense of excitement and variation to the music.

Throughout the clip, the vocals play a prominent role, delivering intense and passionate lyrics. The singer's voice is powerful and emotive, adding to the overall intensity of the music. The vocals are often in sync with the guitar riff, creating a strong connection between the instruments and the vocals.

The overall mood of the music clip is energetic and intense, with a strong sense of power and aggression. The combination of the guitar riff, driving drums, and powerful vocals creates a captivating and exhilarating listening experience. The music is in the key of C minor, which adds to the dark and intense atmosphere.

In terms of structure, the clip features a repeating pattern of chords, with variations in the guitar riff and solo sections. The tempo of the music is fast-paced, with a BPM of approximately 176.5. The downbeats are clearly emphasized, giving the music a strong sense of rhythm and drive.

Overall, this music clip showcases the raw power and intensity of a rock song, with the guitar and drums taking center stage, supported by passionate vocals. The combination of the guitar riff, driving drums, and powerful vocals creates a captivating and energetic listening experience.


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: MusicCaps/AudioSet Dataset Title: VjXt63pqUgw License: CC Attribution 4.0 International


LLark: As I listen to the audio clip, I am immediately struck by the haunting and atmospheric sounds that fill the space. The music seems to create a sense of tension and anticipation, with a hint of darkness. The overall mood is somber and introspective, evoking a feeling of mystery and intrigue.

The first thing that stands out is the absence of any discernible instruments or melodies. Instead, the focus seems to be on creating an atmospheric texture through the use of various sound effects. These effects include eerie and unsettling sounds that contribute to the overall mood of the piece.

I notice that the music is in a minor key, specifically C minor. This key is often associated with darker and more melancholic emotions. The chord progression, if there is one, is not easily discernible due to the lack of traditional melodic elements. However, there is a sense of harmonic movement, with the chords shifting subtly throughout the clip.

The tempo of the music is relatively slow, with a BPM of approximately 77.9. This slow tempo adds to the atmospheric and introspective nature of the piece. The downbeats are clearly defined, with a consistent pattern of four beats per measure. This suggests that the music is in a 4/4 time signature, which is a common time signature in many genres of music.

Overall, the audio clip is a captivating and atmospheric piece of music. Its use of sound effects, minor key, and slow tempo contribute to its haunting and introspective mood. The absence of traditional melodic elements allows the listener to focus on the evocative textures and emotions conveyed by the music.

MusicNet Dataset


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: MusicNet/Soni Ventorum Wind Quintet Title: Serenade in E-flat major, 4. Menuetto Artist: Soni Ventorum Wind Quintet License: Creative Commons Attribution 4.0 International


LLark: As I listen to the music clip, I hear a beautiful and serene piece of classical music. The main instrument that stands out is a clarinet, playing a melodic line in the upper register. It begins with a short note, followed by a series of notes that ascend and descend in a graceful manner. The clarinet's tone is warm and expressive, creating a sense of intimacy and emotional depth.

Accompanying the clarinet, I hear a bassoon playing a steady rhythmic pattern. Its notes are lower in pitch, providing a rich and grounding foundation to the melody. The bassoon's notes are played with a smooth and legato style, adding a sense of elegance to the overall sound.

Another instrument that adds to the texture of the music is a French horn. It plays a combination of sustained notes and short melodic phrases, complementing the clarinet and bassoon. The French horn's notes are played with a bright and vibrant tone, enhancing the overall harmonic structure of the piece.

Throughout the clip, I notice the presence of chords that support the melodic lines. These chords add a sense of depth and complexity to the music, enhancing the emotional impact of the piece. The chords change subtly over time, creating a sense of tension and release that draws the listener in.

The tempo of the music is moderate, with a steady beat that gives a sense of stability and continuity. I can hear downbeats marking the beginning of each measure, providing a clear sense of rhythm and structure.

The key of the piece is E-flat major, which gives it a warm and uplifting character. The choice of key contributes to the overall mood and emotional impact of the music.

Overall, this music clip is a beautiful and introspective piece, featuring a clarinet, bassoon, and French horn. The melodic lines, rhythmic patterns, and harmonic progression work together to create a sense of tranquility and emotional depth.


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: MusicNet/Kimiko Ishizaka/www.bachcentral.com Title: WTK I, No. 5: Prelude and Fugue in D major 1. Prelude License: Creative Commons Attribution 4.0 International


LLark: As I listen to the music clip, I hear a solo piano performing a piece by Bach. The piano starts with a quick and lively melody, playing notes in the upper register. The melody is accompanied by lower notes that provide a harmonic foundation. The piano then transitions into a more complex section, with rapid arpeggios and runs in the mid-range of the instrument.

Throughout the piece, the piano explores different registers and dynamics, creating a sense of movement and tension. The rhythmic patterns are intricate and varied, adding to the overall complexity of the composition.

I notice that the piece has a distinctive rhythmic pulse, which is emphasized by the downbeats. These downbeats occur at regular intervals, providing a sense of structure and stability to the music.

The harmony of the piece is characterized by the use of chords. The chords change periodically, providing a sense of progression and tension. The chords are rich and full-sounding, adding depth and complexity to the overall sound of the piece.

The key of the piece is in A major, which gives it a bright and uplifting quality. The melodies and harmonies are consonant and harmonically pleasing, creating a sense of resolution and satisfaction.

Overall, this excerpt from Bach's "WTK I, No. 16: Prelude and Fugue in G major" showcases the technical skill and musicality of the pianist. The intricate melodies, arpeggios, and harmonies create a rich and captivating musical experience.


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: MusicNet/Museopen/Andrew D. Lawson Title: Quartet No 53 in D major 3. Menuetto: Allegretto License: Creative Commons Attribution 4.0 International


LLark: As I listen to the music clip, I hear a string quartet playing a lively and energetic piece. The composition is "String Quartet No. 2 in G major" by Beethoven, specifically the "4. Allegro" movement. The ensemble consists of two violins, a viola, and a cello.

The violins start the piece with a rapid and energetic melody. They play a series of fast notes in the higher register, creating a sense of excitement and intensity. The violins are accompanied by the cello, which plays a steady rhythm in the lower register.

As the piece progresses, the violins and cello continue to play in unison, creating a strong and unified sound. The viola enters with a melodic line that complements the violins and cello, adding depth and richness to the overall texture of the music.

Throughout the clip, I notice the presence of chords. The chords change periodically, adding harmonic color and supporting the melodic lines. The chords played include D major, G major, E minor, and A major, among others.

The tempo of the music is quite fast, with a tempo of 139.5 beats per minute. This contributes to the energetic and lively nature of the piece.

Overall, this music clip showcases the skillful interplay between the violins, viola, and cello, creating a dynamic and engaging performance. The melodic lines, rhythmic patterns, and harmonic progression all work together to create a captivating musical experience.

FMA Dataset


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: FMA Title: Gens Sont Tristes Artist: Misiaczek License: Attribution-NonCommercial-NoDerivatives (aka Music Sharing) 3.0 International


LLark: As I listen to the audio clip, I hear a hip-hop song with a laid-back and relaxed vibe. The track features a male vocalist, who delivers the lyrics in a smooth and rhythmic manner. The beat is steady and consistent, providing a solid foundation for the vocals.

The song begins with a chord progression in the key of G major, which sets a bright and uplifting tone. The chords are played by a synthesizer or keyboard, creating a melodic and harmonic backdrop for the vocals. The chord progression consists of G major, A major, and B minor chords, which are commonly used in many popular songs.

Throughout the clip, the vocals are the main focus, with the male vocalist delivering the lyrics with a smooth flow. The lyrics are rhythmic and rhyming, adding to the catchy and memorable nature of the song. The vocals are accompanied by occasional vocal harmonies, which add depth and richness to the overall sound.

The tempo of the song is moderate, with a BPM of approximately 84.5. This tempo contributes to the relaxed and chilled-out atmosphere of the track. The downbeats are clearly defined, with the first beat of every measure emphasized, giving the song a steady and grounded feel.

Overall, this clip features a hip-hop song with a smooth male vocal performance, a catchy chord progression, and a relaxed vibe. The combination of the vocals, chords, and beat creates a well-rounded and enjoyable musical experience.


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: FMA Title: Bogi Dance Artist: Neuroleptic Trio License: Attribution-NonCommercial-ShareAlike 3.0 International


LLark: As I listen to the audio clip, I can hear a combination of electronic and acoustic elements. The song starts with a bassline that sets a groovy and rhythmic foundation. The bass is accompanied by a synthesizer, which adds layers of melodic and atmospheric textures. The synthesizer creates a dreamy and ethereal sound, enhancing the overall mood of the song.

Around the 0.3-second mark, I notice the introduction of a piano playing a simple yet captivating melody. The piano adds a sense of elegance and sophistication to the mix. It is accompanied by a drum machine, which provides a steady beat and rhythm. The drum machine adds a modern and electronic element to the song, enhancing its overall energy.

Throughout the clip, I can hear the interplay between the piano, synthesizer, and drum machine. They create a dynamic and harmonious blend of sounds, with each instrument contributing to the overall musical texture. The piano and synthesizer take turns playing melodic phrases, while the drum machine maintains a consistent rhythm.

The tempo of the song is moderately fast, with a tempo of approximately 130.4 beats per minute. This contributes to the energetic and lively feel of the music.

As I listen further, I can hear the presence of chords that provide harmonic structure to the song. The chords change periodically, creating a sense of movement and progression. The chords include D major, D minor, G major, and C major, among others.

Overall, the audio clip showcases a fusion of electronic and acoustic elements, with a prominent bassline, synthesizer, piano, and drum machine creating a captivating and dynamic musical experience. The interplay between the instruments, the rhythmic patterns, and the changing chords contribute to the overall mood and atmosphere of the song.


Prompt: Describe the contents of the provided audio in detail. 🎶 Source: FMA Title: Terrorist Beard Artist: Dental Work License: Attribution-Noncommercial-Share Alike 3.0 United States


LLark: As I listen to the audio clip, I hear a chaotic and experimental sound. The music is characterized by intense and unpredictable elements, creating a sense of dissonance and unease. The clip features a fast tempo of approximately 193.5 beats per minute, which contributes to the frenetic energy of the track.

The absence of vocals and the use of unconventional instruments or sounds give the music an abstract and avant-garde quality. The chords are not clearly defined, as the clip features a single chord labeled as "no chord" that spans the entire duration of the clip. This lack of harmonic structure adds to the experimental nature of the music.

The downbeats in the clip are emphasized, with the first beat of each measure being particularly prominent. This rhythmic emphasis helps to establish a sense of structure amidst the chaotic sounds.

Overall, the audio clip is a challenging and unconventional piece of music that pushes the boundaries of traditional musical genres. It is not for the faint of heart, as it may evoke discomfort or confusion due to its experimental nature. However, for those who appreciate unconventional and boundary-pushing music, this clip may offer a unique and intriguing listening experience.


Music Captioning: Instruction Following Demo

Prompt: Describe the provided audio in one word. 🎶 Source: FMA Title: Into Orbit Artist: Radius System License: Attribution-ShareAlike


LLark: Energetic

Prompt: Give a short summary of the provided audio. 🎶 Source: FMA Title: Into Orbit Artist: Radius System License: Attribution-ShareAlike


LLark: Hard rock instrumental with heavy drum rolls, distorted guitar riffs, and bass.

Prompt: Describe the contents of the provided audio in detail. 🎶 Source: FMA Title: Into Orbit Artist: Radius System License: Attribution-ShareAlike


LLark: As the music begins, a loud and heavy rock song with a prominent guitar riff and energetic drumming immediately captures your attention. The guitar plays a catchy and distorted riff, with power chords and fast strumming patterns. The drums provide a driving rhythm, with strong downbeats and a mix of acoustic and electronic elements. The song has a fast tempo, with a strong sense of energy and intensity.

The guitar riff repeats throughout the clip, creating a catchy and memorable melody. The guitarist showcases their technical skill with rapid guitar solos and intricate fingerpicking patterns. The guitar tone is distorted, giving the music a gritty and aggressive sound.

In addition to the guitar and drums, there are no other prominent instruments or sounds in the clip. The focus is mainly on the guitar and drums, creating a raw and powerful sound.

The clip does not contain any vocals or singing. It is purely instrumental, allowing the listener to focus on the instrumentation and the energy of the music.

The key of the song is F minor, which contributes to the dark and intense atmosphere. The chords played in the clip are F major, C# major, and C# minor, which add tension and release to the music.

Overall, this clip features a powerful and energetic rock song with a guitar riff and driving drums. The absence of vocals allows the listener to fully immerse themselves in the instrumental elements and appreciate the skill and intensity of the musicianship.


Reasoning Tasks

MTG-Jamendo Dataset

Prompt: How could a music producer recreate the sounds in this track? 🎶 Source: MTG-Jamendo Title: i'm DJ Artist: Angus License: Creative Commons Attribution-No Derivatives


LLark: A music producer would need to focus on creating an atmospheric soundscape to recreate the sounds in this track. They would likely use a variety of synthesizers, both digital and analog, to create the layered textures that form the basis of this music. The producer would also need to create a steady beat, likely using a drum machine or software, to keep the rhythm. The tempo would be set to around 120 beats per minute to match the pace of the original track. The chord progression would need to be programmed or played to match the sequence of E minor, E major, and D major.

Prompt: What are some characteristics that potentially differentiate the song from other similar songs? 🎶 Source: MTG-Jamendo Title: i'm DJ Artist: Angus License: Creative Commons Attribution-No Derivatives


LLark: One characteristic that potentially differentiates the song from other similar songs is the use of the synthesizer as the main instrument. The synthesizer is a versatile instrument that can create a wide range of sounds and textures, allowing for unique musical expressions. Additionally, the song's tempo of 120 BPM and the use of the E minor key contribute to its distinctive sound. The combination of these elements, along with the specific chord progression and downbeat pattern, gives the song its own unique identity within the electronic genre.

Prompt: In what kind of environments or situations would someone likely listen to this track? 🎶 Source: MTG-Jamendo Title: i'm DJ Artist: Angus License: Creative Commons Attribution-No Derivatives


LLark: Given the tempo and the chord progression, this track might be suitable for casual listening or as background music in a variety of settings. It could be played in a cafe, during a relaxed gathering at home, or in a retail store. The rhythm and mood of the track might also make it suitable for personal listening during activities such as studying, reading, or commuting.

Prompt: Describe the styles or genres of this song and explain how the song illustrates each style or genre mentioned. 🎶 Source: MTG-Jamendo Title: i'm DJ Artist: Angus License: Creative Commons Attribution-No Derivatives


LLark: This song can be categorized under the genres of pop, electronic, and synthpop. The pop genre is exemplified by its catchy and upbeat melody, which is typical of the genre. The electronic influence is evident in the synthesized sounds and beats used throughout the track. The synthpop genre is characterized by the use of synthesizers as the main instrument, which is the case in this song.

Prompt: How could a music producer recreate the sounds in this track? 🎶 Source: MTG-Jamendo Title: night club Artist: CHRISS ONAC License: Creative Commons Attribution-Share-Alike


LLark: A music producer could recreate the sounds in this track by focusing on a few key elements. First, they would need to set the tempo to around 127.7 BPM to match the pace of the track. Then, they would need to create a beat pattern that follows a 4/4 time signature, with emphasis on the first and third beats of each measure. The chord progression would be another important aspect to reproduce. The song begins with an E minor chord, transitions to an E major, then to a G major, and finally to a D major. This progression repeats throughout the song. To capture the essence of the song, the producer would need to use instruments that can produce these chords, such as a piano, guitar, or synthesizer.

Prompt: What are some characteristics that potentially differentiate the song from other similar songs? 🎶 Source: MTG-Jamendo Title: night club Artist: CHRISS ONAC License: Creative Commons Attribution-Share-Alike


LLark: Some characteristics that potentially differentiate the song from other similar songs include the specific combination of instruments used, the specific chord progressions and harmonies, the tempo and rhythmic patterns, and the overall mood and emotions conveyed. Each song has its own unique combination of elements that contribute to its distinct sound and feel.

Prompt: In what kind of environments or situations would someone likely listen to this track? 🎶 Source: MTG-Jamendo Title: night club Artist: CHRISS ONAC License: Creative Commons Attribution-Share-Alike


LLark: Given the genres and the tempo of this song, it would be suitable for a variety of situations. The electronic and dance elements make it a good fit for a nightclub or party environment, where the steady beat can encourage people to dance. The funk and soul aspects, combined with the moderate tempo, also make it suitable for casual listening, perhaps in a relaxed social gathering or a chill-out session at home. It could also work well as background music in a restaurant or cafe, where the music adds to the atmosphere without being overly intrusive.

Prompt: Describe the styles or genres of this song and explain how the song illustrates each style or genre mentioned. 🎶 Source: MTG-Jamendo Title: night club Artist: CHRISS ONAC License: Creative Commons Attribution-Share-Alike


LLark: This song embodies several styles and genres, which are primarily represented by its electronic, dance, and electropop elements. The electronic influence can be heard in the synthesized sounds and digital effects that dominate the track. The dance elements come to life through the upbeat tempo and the rhythmic patterns, which are designed to make people move. The electropop influence is evident in the catchy, synthesized melodies and the overall structure of the song, which follows a typical pop format. The combination of these styles and genres creates a unique and vibrant sound that is both energetic and catchy.


Other Examples


Prompt: Write a short bedtime story that could accompany this song, starting with the phrase 'There was once a young chipmunk'. Then, give a detailed explanation of how the story matches the song. 🎶 Source: MusicNet/Museopen/Andrew D. Lawson Title: Quartet No 53 in D major 3. Menuetto: Allegretto License: Creative Commons Attribution 4.0 International


LLark:
Once upon a time, in a small village nestled in the rolling hills of the countryside, there lived a group of animals who were the best of friends. There was a wise old owl named Hootie, a sassy squirrel named Squeaky, a gentle goat named Billy, and a mischievous rabbit named Fluffy. They spent their days exploring the lush greenery and sparkling streams of their home, and their nights gathered around a fire, sharing stories and laughter. One evening, as the sun set and the stars twinkled overhead, the friends decided to have a picnic in a beautiful meadow. They packed a basket with their favorite treats and set off, singing and chattering, to their special spot. As they sat down to eat, they heard a rustling in the grass. A little rabbit peered out, looking for his friends. Fluffy and the others welcomed him with open arms, and soon, the little rabbit became a part of their group. They spent the rest of the night under the stars, watching the sky turn pink and purple as the sun dipped below the horizon. As the night grew late, they said their goodbyes and settled in for a peaceful night, each one snuggled up close to their friends.

This bedtime story matches the song in many ways. The lively tempo and rhythm of the music reflect the energetic and joyful mood of the animals. The chords played, such as A#major, D#major, and Fmajor, add color and depth to the story, mirroring the harmonies created by the different instruments in the song. The use of different instruments, like the oboe, bassoon, clarinet, and French horn, creates a rich and vibrant sound that complements the imaginative adventures of the animals.


Prompt: Describe this song in the style of a pirate. 🎶Source: FMA Title: Summer Wind Artist: Cyclone 60 License: Attribution-Noncommercial-Share Alike 3.0 United States


LLark: Shiver me timbers! This be a swashbuckling tune, filled with the high-spirited energy of rock 'n' roll. The tempo is brisk, like a ship sailing through treacherous waters. The guitar and drums provide a rhythmic backbone, reminiscent of the rhythmic clapping of a pirate crew. The chords shift between E major and G major, creating a sense of tension and release, much like the battles between pirates and the Royal Navy. Overall, this song would make a fine addition to the soundtrack of a pirate adventure movie!


Prompt: What can you tell me about the cultural and historical relevance of a piece like this? Where would it fit in a cultural context? 🎶Source: MusicNet/Soni Ventorum Wind Quintet Title: Serenade in E-flat major, 4. Menuetto Artist: Soni Ventorum Wind Quintet License: Creative Commons Attribution 4.0 International


LLark: The piece is a classical composition, which is a significant part of Western musical tradition. Originating in the Western world, classical music has a rich history that spans centuries, with roots in medieval, Renaissance, and Baroque periods. This particular composition, being a trio in E-flat major for clarinet, horn, and bassoon, is reminiscent of the chamber music tradition, where small ensembles of instruments perform intimate, intricate musical dialogues. This style of music was often played in private homes or small public venues, and was a significant part of the social and cultural life of the time. Today, classical music continues to be highly valued for its artistic and emotional depth, and is performed in concert halls and music festivals all over the world.


Prompt: Describe a movie scene where you might hear this song playing. 🎶Source: GTZAN Dataset Title: If I Had Possession Over Judgement Day Artist: Robert Johnson


LLark: This song, with its laid-back tempo and major chords, could fit well into a scene in a classic Western film or a road trip movie. Imagine a scene where the protagonist is driving through the open countryside, or perhaps riding a horse down a dusty road. The simplicity and warmth of the music would complement the vast landscape and the freedom of the characters' movements.


Training Data Samples - Instruction-Tuning Query/Response Pairs
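
The query/response pairs in this section share a common shape: a source track, then either a question with its answer or a caption. As a rough illustration only, one such pair could be represented as a small record like the one sketched here. This is not the authors' actual data format; the class name `InstructionSample`, its field names, the task labels, and the helper `to_prompt` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstructionSample:
    """One instruction-tuning pair tied to an audio clip (hypothetical schema)."""
    source: str              # dataset / provenance string shown next to the clip
    title: str               # track title
    task: str                # assumed labels: "music_understanding", "reasoning", "captioning"
    question: Optional[str]  # None for captioning samples, which have no question
    answer: str              # target response (or caption) text


def to_prompt(sample: InstructionSample) -> str:
    """Render a sample in the Question/Answer or Caption layout used on this page."""
    if sample.question is None:
        return f"Caption: {sample.answer}"
    return f"Question: {sample.question}\nAnswer: {sample.answer}"


# Values copied from one of the MusicNet samples on this page.
example = InstructionSample(
    source="MusicNet/Paul Pitman/harfesoft.de",
    title="Trio in E-flat major 1. Allegro moderato",
    task="music_understanding",
    question="What instruments are playing in this clip?",
    answer="In this clip, you can hear the piano, clarinet, and cello playing.",
)
print(to_prompt(example))
```

Keeping the question optional lets the same record cover all three task types shown here, since captioning samples carry only a response.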

MusicNet Dataset

Music Understanding (Classification/Regression)


🎶Source: MusicNet/Timothy Jones/Darren L. Slider Title: Violin Sonata No 1 in D major 1. Allegro con brio Artist: Timothy Jones
Question: What is the approximate tempo of the clip in beats per minute (BPM)?
Answer: The approximate tempo of the clip is 150 beats per minute (BPM). This indicates a moderately fast tempo, which gives the music a lively and energetic feel.


🎶Source: MusicNet/Museopen/Segundo G. Yogore Title: Piano Sonata in A major 2. Andante
Question: What is the genre of this song?
Answer: This song is a classical composition. It is a solo piano piece, which is a common format in classical music.


🎶Source: MusicNet/Paul Pitman/harfesoft.de Title: Trio in E-flat major 1. Allegro moderato Artist: Paul Pitman
Question: What instruments are playing in this clip?
Answer: In this clip, you can hear the piano, clarinet, and cello playing.

Reasoning


🎶Source: MusicNet/European Archive/harfesoft.de Title: String Quartet in C minor 4. Allegro
Question: Can you tell me about the unique characteristics of this piece, and how it's typical or atypical of Brahms' style?
Answer: Certainly. This sample from Brahms' String Quartet in C minor features a lively tempo, characteristic of the Allegro movement. Like many of Brahms' works, it relies heavily on string instrumentation. Instruments such as the violin, cello, and viola combine to create complex melodies and harmonies. The key of C major gives the piece a stable, resolute sound, typical of Brahms. However, this piece is somewhat atypical for Brahms in that it utilizes frequent key changes, transitioning between C major and several other keys, including A#major, G#major, Dminor, D#major, Eminor, Fmajor, Cmajor, Dmajor, Gmajor, and G#major throughout its duration. These frequent transitions give the piece a sense of drama and dynamic tension.


🎶Source: MusicNet/Scott Goff/David J. Grossman Title: Partita in A minor 2. Corrente
Question: What are some key elements that signify this as a composition by Bach?
Answer: Significantly, the structure of the music played, beautifully executed through a solo Flute, represents Bach's intricate use of counterpoint and harmonic innovativeness which are some of the essential characteristics of Baroque music, the period during which Bach wrote his compositions. The extensive range of pitches utilized, covering multiple octaves, also points to Bach's tendency to exploit the full potential of an instrument's capabilities. Furthermore, the virtuosic and intricate flute lines often moving in a perpetual motion kind of way, coupled with shifting harmonies and the key being in A minor alludes to the piece being composed by Bach.


🎶Source: MusicNet/Paul Pitman/piano-midi.de Title: Piano Sonata No 29 in B-flat major 1. Allegro
Question: Can you provide some context on the composer and the composition? Who is Beethoven and what is notable about Piano Sonata No 29 in B-flat major?
Answer: Ludwig van Beethoven was a German composer and pianist, widely regarded as one of the most influential musicians in the history of Western classical music. His compositions, which spanned the transition from the classical period to the romantic era, incorporated new ideas and forms that broke with established traditions, expanding the scope of sonata, symphony, concerto, and quartet. The Piano Sonata No. 29 in B-flat major, also known as the 'Hammerklavier', is one of his most renowned compositions. Composed in his late period, it's characterized by its extreme contrasts and substantial length, making it one of the most challenging pieces to perform in the piano literature.

Captioning


🎶Source: MusicNet/Chris Borovas/Darren L. Slider Title: Violin Sonata No 7 in C minor 4. Finale: Allegro; Presto
Caption: {'question': 'Can you provide some context on the composer and the composition? Who is Beethoven and what is notable about Piano Sonata No 29 in B-flat major?', 'answer': "Ludwig van Beethoven was a German composer and pianist, widely regarded as one of the most influential musicians in the history of Western classical music. His compositions, which spanned the transition from the classical period to the romantic era, incorporated new ideas and forms that broke with established traditions, expanding the scope of sonata, symphony, concerto, and quartet. The Piano Sonata No. 29 in B-flat major, also known as the 'Hammerklavier', is one of his most renowned compositions. Composed in his late period, it's characterized by its extreme contrasts and substantial length, making it one of the most challenging pieces to perform in the piano literature."}


🎶Source: MusicNet/Paul Pitman/piano-midi.de Title: Piano Sonata No 23 in F minor 1. Allegro assai
Caption: {'question': 'Can you provide some context on the composer and the composition? Who is Beethoven and what is notable about Piano Sonata No 29 in B-flat major?', 'answer': "Ludwig van Beethoven was a German composer and pianist, widely regarded as one of the most influential musicians in the history of Western classical music. His compositions, which spanned the transition from the classical period to the romantic era, incorporated new ideas and forms that broke with established traditions, expanding the scope of sonata, symphony, concerto, and quartet. The Piano Sonata No. 29 in B-flat major, also known as the 'Hammerklavier', is one of his most renowned compositions. Composed in his late period, it's characterized by its extreme contrasts and substantial length, making it one of the most challenging pieces to perform in the piano literature."}


🎶Source: MusicNet/Soni Ventorum Wind Quintet/Oliver Seely Title: Wind Quintet No 2 in D Minor 2. Larghetto sostenuto ma con moto
Caption: {'question': 'Can you provide some context on the composer and the composition? Who is Beethoven and what is notable about Piano Sonata No 29 in B-flat major?', 'answer': "Ludwig van Beethoven was a German composer and pianist, widely regarded as one of the most influential musicians in the history of Western classical music. His compositions, which spanned the transition from the classical period to the romantic era, incorporated new ideas and forms that broke with established traditions, expanding the scope of sonata, symphony, concerto, and quartet. The Piano Sonata No. 29 in B-flat major, also known as the 'Hammerklavier', is one of his most renowned compositions. Composed in his late period, it's characterized by its extreme contrasts and substantial length, making it one of the most challenging pieces to perform in the piano literature."}

MagnaTagATune Dataset

Music Understanding (Classification/Regression)


🎶Source: MagnaTagATune Title: Alleluia (Georgian chant) Artist: Three Holies Church Choristers License: Creative Commons Attribution – Noncommercial-Share Alike 3.0
Question: What is the approximate tempo of the clip in beats per minute (BPM)?
Answer: The approximate tempo of the clip is 71.4 beats per minute (BPM).


🎶Source: MagnaTagATune Title: Under Fire And Water Artist: Jamie Janover and Michael Masley License: Creative Commons Attribution – Noncommercial-Share Alike 3.0
Question: What instruments can you hear in this track?
Answer: In this track, I can hear strings and piano. The strings provide a rich and melodic texture to the music, while the piano adds a harmonic foundation and adds some rhythmic elements as well.


🎶Source: MagnaTagATune Title: Futurengine Artist: Psychetropic License: Creative Commons Attribution – Noncommercial-Share Alike 3.0
Question: What is the overall mood and emotions conveyed by the song?
Answer: The song has a slow tempo and a calm atmosphere. The music has a peaceful and relaxed feel to it, with a hint of melancholy. The gentle and soft sounds create a soothing and introspective mood. Overall, the song evokes a sense of tranquility and contemplation.

Reasoning


🎶Source: MagnaTagATune Title: Grechaninov-The Cherubic Hymn Artist: Kyiv Chamber Choir License: Creative Commons Attribution – Noncommercial-Share Alike 3.0
Question: Can you tell me how the tempo of this song compares to most contemporary songs?
Answer: Certainly, the tempo of this piece is approximately 90.9 beats per minute, which is a moderate tempo. It could be seen as relaxed and serene when compared to a fast-paced contemporary pop song which could clock at around 120 BPM or more. On the other hand, it could be considered brisk when compared to a slow ballad that may have a pace of about 60-70 BPM. Tempo is relative and can greatly affect the feel and style of a piece of music.


🎶Source: MagnaTagATune Title: 1x3 Artist: Tilopa License: Creative Commons Attribution – Noncommercial-Share Alike 3.0
Question: How could a music producer achieve the sparse, woodwind-dominated sound heard in this track?
Answer: A producer would focus on minimal texture and arrangement to amplify the woodwind, likely a flute, which is the primary instrument here. Since the track doesn't have any accompaniment from drums, strings, or any noticeable electronics, it creates a lot of sonic space for the flute to occupy. Techniques like reverb or delay can make the flute feel even more prominent and create an atmospheric effect. Recording in spaces with natural echo can also help achieve this sound. Such approaches are typical of New Age music, emphasizing the individual timbre of the instrument and the mood created by the melody and harmonies.


🎶Source: MagnaTagATune Title: Born of a silent sun Artist: Jade Leary License: Creative Commons Attribution – Noncommercial-Share Alike 3.0
Question: Can you describe the style and any prevalent themes within this music piece?
Answer: The audio clip is predominantly characterized by a pop style. This is signified by clear, catchy melodies, often played by the piano, and laid-back rhythms, along with prominent vocals by a male singer. There's a thematic repetition that can be discerned in this piece which is characteristic of pop music, with the tune regularly returning to certain chords. For instance, it highly frequents D minor and F minor chords throughout.