Music & Language

Music and Language – not as much overlap as we thought?

I spent my PhD investigating the similarities and differences between the ways that verbal and musical sounds are processed in short-term memory. So it should come as no surprise that I am always partial to a good music/language study, and I have just read a cracker! A new paper challenges a tried and (well) tested method and suggests an exciting new paradigm for the future. In sum, the authors argue that there may not be much overlap in higher-order music/language processing after all…

At the time of my PhD, findings were beginning to emerge in large numbers that suggested a great deal of overlap in the way that music and language were processed. These included well-designed behavioural studies of self-paced reading and complex language comprehension with background music (Fedorenko, 2009; Slevc, Rosenberg, & Patel, 2009), ERP studies of syntax violations (Patel et al., 1998; Koelsch, 2005), MRI studies of structure processing (Tillmann et al., 2003) and various neuroimaging studies of memory (Koelsch et al., 2008; Vines et al., 2006).

What about my PhD work? My experiments seemed to suggest that there was indeed a degree of overlap in the way that sounds were rehearsed ‘online’ in short-term memory (while you are trying to do other things). However, I did find evidence to suggest that musical and verbal sounds might be stored in different ways, especially in nonmusicians.

The new paper looks at the way that the brain processes both language and music sounds using fMRI (and a large variety of analysis methods including – for those who are interested – whole-brain conjunction, region of interest, and multivariate pattern classification analysis). The authors also tried a neat manipulation – speeding up and slowing down the speech and melodies (30% each way). The idea was to see which regions of the brain respond to changes in the two stimuli over time.


The authors tested 20 people, 12 of whom had some musical training. The participants had to listen to blocks of meaningless ‘jabberwocky’ sentences (“It was the glander in my nederop” – a good sentence structure but meaningless), scrambled ‘jabberwocky’ sentences (where the sentence structure no longer made sense) and specially composed simple melodies (average of 8 tones per melody).


Although both language and music sounds activated the superior temporal lobe bilaterally, the distribution of activity in response to the two stimuli was very different:

Melodies: Dorsomedial temporal lobe

Sentences: Ventrolateral regions

The authors went further and showed that even in the regions where there appeared to be overlap, the two types of stimuli actually activated non-identical neural assemblies, or activated the same neural assemblies to different degrees.

The authors then looked at which brain regions responded selectively to the speed manipulation. This analysis, they argue, serves to identify regions that are selective for one stimulus relative to the other. They found:

Melodies: Medial anterior regions on the supratemporal plane, extending into the insula, and prefrontal regions

Sentences: Lateral regions of superior temporal lobe, with a small focus in the posterior temporal lobe (the sentences with proper structure also selectively activated anterior temporal lobe).


The present study found a degree of overlap in music/language processing, although this was restricted to relatively early stages of auditory analysis. In the higher regions of the brain, where hierarchical analysis of structure is thought to take place, sentences and melodies showed quite distinct signatures of activity – and Broca’s area, an old favourite region which used to show overlapping activation for music and language all the time, did not activate to either stimulus type!

The authors conclude that the distinguishable patterns of activity likely reflect the different auditory features present in speech and music, as well as the fact that while language is largely dependent on grammar, music is more dependent on pitch and rhythmic contours. Finally, the ‘end games’ of language and music are fundamentally different; the former is to derive combinatorial semantic representations and the latter is to drive acoustic recognition and perhaps emotional modulation.

But, like me, you may be wondering, “Why is there so little overlap in this paper when it has been shown so often in the past?” The authors make the intriguing argument that the overlap shown by previous studies was driven by task demands and not stimulus demands.

By this they mean that previous studies often used the ‘structure violation’ paradigm – where you mess with structure in music or language and see how that affects processing of the other stimulus, or see how similarly the brain reacts in both cases. However, ‘structure violation’ paradigms activate working memory and cognitive control for both language and music – and the authors suggest that this was the source of the previously reported overlapping activity.

The new ‘temporal manipulation’ task débuted in this paper may not load so heavily on higher abstract processing systems and therefore may be more likely to reveal stimulus-specific networks.

So what now? Firstly, the temporal manipulation task needs more study. At this stage the authors are only reasoning that the pattern of activity revealed by their analysis is consistent with the idea that rate modulation can isolate higher-order aspects of music and language processing – they haven’t demonstrated it yet. Secondly, there is now clearer scope to investigate aspects of language and music that are more likely to share representation, such as prosody and melody.

In sum, there is nothing like a clear and well-designed hit to a well-established theory to force it to reform and come back more advanced, polished and exacting. The language/music overlap hypothesis is set for a new wave of discovery; exciting stuff!


Paper: Rogalsky et al. (2011) Functional anatomy of language and music perception: Temporal and structural factors investigated using functional magnetic resonance imaging. The Journal of Neuroscience, 31(10), 3843-3852.


  • Bob Woody

    Vicky – This study fascinates me. Thank you for covering it in your blog. I have a question, though. You wrote, “Finally, the ‘end game’ of music and language are fundamentally different; the former is to derive combinatorial semantic representations and the latter is to drive acoustic recognition and perhaps emotional modulation.” Do you have your former and latter reversed here? Wouldn’t language be linked to semantic representations, and music to acoustic recognition and emotional modulation?

  • Elisa

    Thanks for covering this article. Very interesting. I’m not sure how much the stimuli used in this study (melodies) were tapping into the same processes as the ones involved in the previous studies, which used chord sequences. But it’s really exciting to see lots of new studies looking at music and language! Here’s another one I’ve just come across Enjoy 🙂

  • vicky

    Indeed the words were inadvertently reversed! Many thanks for spotting the error, which I have now corrected.