Original Research Article

Emotional Responses to Music: Shifts in Frontal Brain Asymmetry Mark Periods of Musical Change

  • 1 School of Psychological Sciences, Monash University, Melbourne, VIC, Australia
  • 2 Institute for Systematic Musicology, University of Hamburg, Hamburg, Germany
  • 3 Monash Biomedical Imaging, Monash University, University of Newcastle, Newcastle, NSW, Australia
  • 4 Centre for Positive Psychology, Graduate School of Education, University of Melbourne, Melbourne, VIC, Australia

Recent studies have demonstrated increased activity in brain regions associated with emotion and reward when listening to pleasurable music. Unexpected change in musical features such as intensity and tempo (and thereby enhanced tension and anticipation) is proposed to be one of the primary mechanisms by which music induces a strong emotional response in listeners. Whether such musical features coincide with central measures of emotional response has not, however, been extensively examined. In this study, subjective and physiological measures of experienced emotion were obtained continuously from 18 participants (12 females, 6 males; 18–38 years) who listened to four stimuli in a counterbalanced order: pleasant music, unpleasant music (dissonant manipulations of their own music), neutral music, and no music. Each stimulus was presented twice: electroencephalograph (EEG) data were collected during the first presentation, while participants continuously rated the stimuli during the second presentation. Frontal asymmetry (FA) indices from frontal and temporal sites were calculated, and peak periods of bias toward the left (indicating a shift toward positive affect) were identified across the sample. The music pieces were also examined to define the temporal onset of key musical features. Subjective reports of emotional experience averaged across the condition confirmed participants rated their music selection as very positive, the scrambled music as negative, and the neutral music and silence as neither positive nor negative. Significant effects in FA were observed in the frontal electrode pair FC3–FC4, and the greatest increase in left bias from baseline was observed in response to pleasurable music. These results are consistent with findings from previous research. Peak FA responses at this site were also found to co-occur with key musical events relating to change, for instance, the introduction of a new motif, an instrument change, or a change in low level acoustic factors such as pitch, dynamics or texture. These findings provide empirical support for the proposal that change in basic musical features is a fundamental trigger of emotional responses in listeners.


One of the most intriguing debates in music psychology research is whether the emotions people report when listening to music are ‘real.’ Various authorities have argued that music is one of the most powerful means of inducing emotions, from Tolstoy’s mantra that “music is the shorthand of emotion,” to the deeply researched and influential reference texts of Leonard Meyer (“Emotion and meaning in music”; Meyer, 1956 ) and Juslin and Sloboda (“The Handbook of music and emotion”; Juslin and Sloboda, 2010 ). Emotions evolved as a response to events in the environment which are potentially significant for the organism’s survival. Key features of these ‘utilitarian’ emotions include goal relevance, action readiness and multicomponentiality ( Frijda and Scherer, 2009 ). Emotions are therefore triggered by events that are appraised as relevant to one’s survival, and help prepare us to respond, for instance via fight or flight. In addition to the cognitive appraisal, emotions are also widely acknowledged to be multidimensional, yielding changes in subjective feeling, physiological arousal, and behavioral response ( Scherer, 2009 ). The absence of clear goal implications of music listening, or any need to become ‘action ready,’ however, challenges the claim that music-induced emotions are real ( Kivy, 1990 ; Konecni, 2013 ).

A growing body of ‘emotivist’ music psychology research has nonetheless demonstrated that music does elicit a response in multiple components, as observed with non-aesthetic (or ‘utilitarian’) emotions. The generation of an emotion in subcortical regions of the brain (such as the amygdala) leads to hypothalamic and autonomic nervous system activation and release of arousal hormones, such as noradrenaline and cortisol. Sympathetic nervous system changes associated with physiological arousal, such as increased heart rate and increased skin conductance, are most commonly measured as peripheral indices of emotion. A large body of work now illustrates, under a range of conditions and with a variety of music genres, that emotionally exciting or powerful music impacts on these autonomic measures of emotion (see Bartlett, 1996 ; Panksepp and Bernatzky, 2002 ; Hodges, 2010 ; Rickard, 2012 for reviews). For example, Krumhansl (1997) recorded physiological (heart rate, blood pressure, transit time and amplitude, respiration, skin conductance, and skin temperature) and subjective measures of emotion in real time while participants listened to music. The observed changes in these measures differed according to the emotion category of the music, and were similar (although not identical) to those observed for non-musical stimuli. Rickard (2004) also observed coherent subjective and physiological (chills and skin conductance) responses to music selected by participants as emotionally powerful, which was interpreted as support for the emotivist perspective on music-induced emotions.

It appears then that the evidence supporting music evoked emotions being ‘real’ is substantive, despite no obvious goal implications, or need for action, of this primarily aesthetic stimulus. Scherer and Coutinho (2013) have argued that music may induce a particular ‘kind’ of emotion – aesthetic emotions – that are triggered by novelty and complexity, rather than direct relevance to one’s survival. Novelty and complexity are nonetheless features of goal relevant stimuli, even though in the case of music, there is no significance to the listener’s survival. In the same way that secondary reinforcers appropriate the physiological systems of primary reinforcers via association, it is possible then that music may also hijack the emotion system by sharing some key features of goal relevant stimuli.

Multiple mechanisms have been proposed to explain how music is capable of inducing emotions (e.g., Juslin et al., 2010 ; Scherer and Coutinho, 2013 ). Common to most theories is an almost primal response elicited by psychoacoustic features of music (but also shared by other auditory stimuli). Juslin et al. (2010) describe how the ‘brain stem reflex’ (from their ‘BRECVEMA’ theory) is activated by changes in basic acoustic events – such as sudden loudness or fast rhythms – by tapping into an evolutionarily ancient survival system. This is because these acoustic events are associated with events that do in fact signal relevance for survival for real events (such as a nearby loud noise, or a rapidly approaching predator). Any unexpected change in acoustic feature, whether it be in pitch, timbre, loudness, or tempo, in music could therefore fundamentally be worthy of special attention, and therefore trigger an arousal response ( Gabrielsson and Lindstrom, 2010 ; Juslin et al., 2010 ). Huron (2006) has elaborated on how music exploits this response by using extended anticipation and violation of expectations to intensify an emotional response. Higher level music events – such as motifs, or instrumental changes – may therefore also induce emotions via expectancy. In seminal work in this field, Sloboda (1991) asked participants to identify music passages which evoked strong, physical emotional responses in them, such as tears or chills. The most frequent musical events coded within these passages were new or unexpected harmonies, or appoggiaturas (which delay an expected principal note), supporting the proposal that unexpected musical events, or substantial changes in music features, were associated with physiological responses. Interestingly, a survey by Scherer et al. (2002) rated musical structure and acoustic features as more important in determining emotional reactions than the listener’s mood, affective involvement, personality or contextual factors. 
Importantly, because music events can elicit emotions via both expectation of an upcoming event and experience of that event, physiological markers of peak emotional responses may occur prior to, during or after a music event.

This proposal has received some empirical support via research demonstrating physiological peak responses to psychoacoustic ‘events’ in music (see Table 1 ). On the whole, changes in physiological arousal – primarily, chills, heart rate or skin conductance changes – coincided with sudden changes in acoustic features (such as changes in volume or tempo), or novel musical events (such as entry of new voices, or harmonic changes).


TABLE 1. Music features identified in the literature to be associated with various physiological markers of emotion.

Supporting evidence for the similarity between music-evoked emotions and ‘real’ emotions has also emerged from research using central measures of emotional response. Importantly, brain regions associated with emotion and reward have been shown to also respond to emotionally powerful music. For instance, Blood and Zatorre (2001) found that pleasant music activated the dorsal amygdala (which connects to the ‘positive emotion’ network comprising the ventral striatum and orbitofrontal cortex), while reducing activity in central regions of the amygdala (which appear to be associated with unpleasant or aversive stimuli). Listening to pleasant music was also found to release dopamine in the striatum ( Salimpoor et al., 2011 , 2013 ). Further, the release was higher in the dorsal striatum during the anticipation of the peak emotional period of the music, but higher in the ventral striatum during the actual peak experience of the music. This is entirely consistent with the differentiated pattern of dopamine release during craving and consummation of other rewarding stimuli, e.g., amphetamines. Only one group to date has, however, attempted to identify musical features associated with central measures of emotional response. Koelsch et al. (2008a) performed a functional MRI study with musicians and non-musicians. While musicians tended to perceive syntactically irregular music events (single irregular chords) as slightly more pleasant than non-musicians, these generally perceived unpleasant events induced increased blood oxygen levels in the emotion-related brain region, the amygdala. Unexpected chords were also found to elicit specific event related potentials (ERAN and N5) as well as changes in skin conductance ( Koelsch et al., 2008b ). Specific music events associated with pleasurable emotions have not yet been examined using central measures of emotion.

Davidson and Irwin (1999) , Davidson (2000 , 2004 ), and Davidson et al. (2000) , have demonstrated that a left bias in frontal cortical activity is associated with positive affect. Broadly, a left bias frontal asymmetry (FA) in the alpha band (8–13 Hz) has been associated with a positive affective style, higher levels of wellbeing and effective emotion regulation ( Tomarken et al., 1992 ; Jackson et al., 2000 ). Interventions have been demonstrated to shift frontal electroencephalograph (EEG) activity to the left. An 8-week meditation training program significantly increased left sided FA when compared to wait list controls ( Davidson et al., 2003 ). Blood et al. (1999) observed that left frontal brain areas were more likely to be activated by pleasant music than by unpleasant music. The amygdala appears to demonstrate valence-specific lateralization with pleasant music increasing responses in the left amygdala and unpleasant music increasing responses in the right amygdala ( Brattico, 2015 ; Bogert et al., 2016 ). Positively valenced music has also been found to elicit greater frontal EEG activity in the left hemisphere, while negatively valenced music elicits greater frontal activity in the right hemisphere ( Schmidt and Trainor, 2001 ; Altenmüller et al., 2002 ; Flores-Gutierrez et al., 2007 ). The pattern of data in these studies suggests that this frontal lateralization is mediated by the emotions induced by the music, rather than just the emotional valence they perceive in the music. Hausmann et al. (2013) provided support for this conclusion via mood induction through a musical procedure using happy or sad music, which reduced the right lateralization bias typically observed for emotional faces and visual tasks, and increased the left lateralization bias typically observed for language tasks. 
A right FA pattern associated with depression was found to be shifted by a music intervention (listening to 15 min of ‘uplifting’ popular music previously selected by another group of adolescents) in a group of adolescents ( Jones and Field, 1999 ). This measure therefore provides a useful objective marker of emotional response to further identify whether specific music events are associated with physiological measures of emotion.

The aim in this study was to examine whether: (1) music perceived as ‘emotionally powerful’ and pleasant by listeners also elicited a response in a central marker of emotional response (frontal alpha asymmetry), as found in previous research; and (2) peaks in frontal alpha asymmetry were associated with changes in key musical or psychoacoustic events associated with emotion. To optimize the likelihood that emotions were induced (that is, felt rather than just perceived), participants listened to their own selections of highly pleasurable music. Two validation hypotheses were proposed to confirm the methodology was consistent with previous research. It was hypothesized that: (1) emotionally powerful and pleasant music selected by participants would be rated as more positive than silence, neutral music or a dissonant (unpleasant) version of their music; and (2) emotionally powerful pleasant music would elicit greater shifts in frontal alpha asymmetry than control auditory stimuli or silence. The primary novel hypothesis was that peak alpha periods would coincide with changes in basic psychoacoustic features, reflecting unexpected or anticipatory musical events. Since music-induced emotions can occur both before and after key music events, FA peaks were considered associated with music events if the music event occurred within 5 s before to 5 s after the FA event. Music background and affective style were also taken into account as potential confounds.
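The association rule stated above (a music event counts if it falls within 5 s before to 5 s after an FA peak) can be sketched in a few lines of Python. This is only an illustration: the function name `match_peaks_to_events` and the onset times are hypothetical, and treating the window boundaries as inclusive is an assumption rather than something the study specifies.

```python
from typing import List, Tuple

WINDOW_S = 5.0  # tolerance stated in the text: 5 s before to 5 s after the FA peak


def match_peaks_to_events(
    peak_times: List[float],
    event_times: List[float],
    window_s: float = WINDOW_S,
) -> List[Tuple[float, List[float]]]:
    """Pair each FA peak time (s) with the music events within +/- window_s.

    Boundaries are treated as inclusive (an assumption).
    """
    matches = []
    for peak in peak_times:
        nearby = [ev for ev in event_times if abs(ev - peak) <= window_s]
        matches.append((peak, nearby))
    return matches


# Hypothetical onset times in seconds, for illustration only.
peaks = [12.0, 47.5, 90.0]
events = [8.0, 15.5, 60.0, 95.0]
result = match_peaks_to_events(peaks, events)
for peak, nearby in result:
    print(peak, nearby)
```

With these made-up times, the first peak is associated with two events, the second with none, and the third with one event that lies exactly on the window boundary.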

Materials and Methods


The sample for this study consisted of 18 participants (6 males, 12 females) recruited from tertiary institutions located in Melbourne, Australia. Participants’ ages ranged between 18 and 38 years ( M = 22.22, SD = 5.00). Participants were excluded if they were younger than 17 years of age, had an uncorrected hearing loss, were taking medication that may impact on mood or concentration, were left-handed, or had a history of severe head injuries or seizure-related disorder. Despite clearly stated exclusion criteria, two left-handed participants attended the lab; because the pattern of their hemispheric activity did not appear to differ from that of right-handed participants, their data were retained. Informed consent was obtained through an online questionnaire that participants completed prior to the laboratory session.

Online Survey

The online survey consisted of questions pertaining to demographic information (gender, age, a left-handedness question, education, employment status and income), music background (MUSE questionnaire; Chin and Rickard, 2012 ) and affective style (PANAS; Watson and Tellegen, 1988 ). The survey also provided an anonymous code to allow matching with laboratory data, instructions for attending the laboratory and music choices, and explanatory information about the study and a consent form.

Peak Frontal Asymmetry in Alpha EEG Frequency Band

The physiological index of emotion was measured using electroencephalography (EEG). EEG data were recorded using a 64-electrode silver-silver chloride (Ag-AgCl) EEG elastic Quik-cap (Compumedics) in accordance with the international 10–20 system. Data are, however, analyzed and reported from midfrontal sites (F3/F4 and FC3/FC4) only, as hemispheric asymmetry associated with positive and negative affect has been observed primarily in frontal cortex ( Davidson et al., 1990 ; Tomarken et al., 1992 ; Dennis and Solomon, 2010 ). Further spatial exploration of data for structural mapping purposes was beyond the scope of this paper. In addition, analyses were performed for the P3–P4 sites as a negative control ( Schmidt and Trainor, 2001 ; Dennis and Solomon, 2010 ). All channels were referenced to the mastoid electrodes (M1–M2). The ground electrode was situated between FPZ and FZ and impedances were kept below 10 kOhms. Data were collected and analyzed offline using the Compumedics Neuroscan 4.5 software.

Subjective Emotional Response

The subjective feeling component of emotion was measured using ‘EmuJoy’ software ( Nagel et al., 2007 ). This software allows participants to indicate how they feel in real time as they listen to the stimulus by moving the cursor along the screen. The EmuJoy program utilizes the circumplex model of affect ( Russell, 1980 ) where emotion is measured in a two dimensional affective space, with axes of arousal and valence. Previous studies have shown that valence and arousal account for a large portion of the variation observed in the emotional labeling of musical (e.g., Thayer, 1986 ), as well as linguistic ( Russell, 1980 ) and picture-oriented ( Bradley and Lang, 1994 ) experimental stimuli. The sampling rate was 20 Hz (one sample every 50 ms), which is consistent with recommendations for continuous monitoring of subjective ratings of emotion ( Schubert, 2010 ). Consistent with Nagel et al. (2007) , the visual scale was quantified as an interval scale from -10 to +10.

Music Stimuli

Four music stimuli (practice, pleasant, unpleasant, and neutral) were presented throughout the experiment. Each stimulus lasted between 3 and 5 min. The practice stimulus was presented to familiarize participants with the EmuJoy program and to acclimatize participants to the sound and the onset and offset of the music stimulus (fading in at the start and fading out at the end). The practice song was The Lion King musical theme song, “The Circle of Life,” selected on the basis that it was likely to be familiar to participants, positive in affective valence, and contained segments of both arousing and calming music.

The pleasant music stimulus was participant-selected. This option was preferred over experimenter-selected music as participant-selected music was considered more likely to induce robust emotions ( Thaut and Davis, 1993 ; Panksepp, 1995 ; Blood and Zatorre, 2001 ; Rickard, 2004 ). Participants were instructed to select a music piece that made them, “experience positive emotions (happy, joyful, excited, etc.) – like those songs you absolutely love or make you get goose bumps.” This song selection also had to be one that would be considered a happy song by the general public. That is, it could not be sad music that participants enjoyed. While previous research has used both positively and negatively valenced music to elicit strong experiences with music, in the current study, we limited the music choices to those that expressed positive emotions. This decision was made to reduce variability in EEG responses arising from perception of negative emotions and experience of positive emotions, as EEG can be sensitive to differences in both perception and experience of emotional valence. The music also had to be alyrical (music with unintelligible words, for example in another language or scat singing, was permitted), as language processing might conceivably elicit different patterns of hemisphere activation solely as a function of the processing of vocabulary included in the song. [It should be noted that there are numerous mechanisms by which a piece of music might induce an emotion (see Juslin and Vastfjall, 2008 ), including associations with autobiographical events, visual imagery and brain stem reflexes. Differentiating between these various causes of emotion was, however, beyond the scope of the current study.]

The unpleasant music stimulus was intended to induce negative emotions. This was a dissonant piece produced by manipulating the participant’s pleasant music stimulus using Sony Sound Forge© 8 software. The stimulus consisted of three versions of the song played simultaneously: one pitch-shifted a tritone down, one pitch-shifted a whole tone up, and one played in reverse (adapted from Koelsch et al., 2006 ). The neutral condition was an operatic track, La Traviata, chosen based upon its neutrality observed in previous research ( Mitterschiffthaler et al., 2007 ).
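To give a sense of the size of the two pitch shifts used in the dissonant stimulus, the short Python sketch below converts them to frequency ratios. It assumes 12-tone equal temperament, in which a tritone is six semitones and a whole tone is two; the article does not state how the editing software realised the shifts, and the 440 Hz reference tone is purely illustrative.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of the given number of equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


TRITONE_DOWN = semitone_ratio(-6)   # six semitones down
WHOLE_TONE_UP = semitone_ratio(2)   # two semitones up

reference_hz = 440.0  # illustrative reference tone, not taken from the study
print(round(TRITONE_DOWN, 4), round(reference_hz * TRITONE_DOWN, 2))    # 0.7071 311.13
print(round(WHOLE_TONE_UP, 4), round(reference_hz * WHOLE_TONE_UP, 2))  # 1.1225 493.88
```

Under this assumption, a 440 Hz tone in the original would be heard together with copies near 311 Hz and 494 Hz, intervals that do not fit the harmony of the original recording.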

The presentation of music stimuli was controlled by the experimenter via the EmuJoy program. The music volume was set to a comfortable listening level, and participants listened to all stimuli via bud earphones (to avoid interference with the EEG cap).

Prior to attending the laboratory session, participants completed the anonymously coded online survey. Within 2 weeks, participants attended the EEG laboratory at the Monash Biomedical Imaging Centre. Participants were tested individually during a 3 h session. An identification code was requested in order to match questionnaire data with laboratory session data.

Participants were seated in a comfortable chair and were prepared for fitting of the EEG cap. The participant’s forehead was cleaned using medical grade alcohol swabs and exfoliated using NuPrep exfoliant gel. Participants were fitted with the EEG cap according to the standardized international 10/20 system ( Jasper, 1958 ). Blinks and vertical/horizontal movements were recorded by attaching loose electrodes from the cap above and below the left eye, as well as laterally on the outer canthi of each eye. The structure of the testing was explained to participants and was as follows (see Figure 1 ):


FIGURE 1. Example of testing structure with conditions ordered: pleasant, unpleasant, neutral, and control. B, baseline; P, physiological recording; and S, subjective rating. ∗These stimuli were presented to participants in a counterbalanced order.

The testing comprised four within-subjects conditions: pleasant, unpleasant, neutral, and control. Differing only in the type of auditory stimulus presented, each condition consisted of:

(a) Baseline recording (B)—no stimulus was presented during the baseline recordings. These lasted 3 min in duration and participants were asked to close their eyes and relax.

(b) Physiological recording (P)—the stimulus (depending on what condition) was played and participants were asked to have their eyes closed and to just experience the sounds.

(c) Subjective rating (S)—the stimulus was repeated, however, this time participants were asked to indicate, with eyes open, how they felt as they listened to the same music on the computer screen using the cursor and the EmuJoy software.

At every step of each condition, participants were guided by the experimenter (e.g., “I’m going to present a stimulus to you now, it may be music, something that sounds like music, or it could be nothing at all. All I would like you to do is to close your eyes and just experience the sounds”). Before the official testing began, the participant was asked to practice using the EmuJoy program in response to the practice stimulus. Participants were asked about their level of comfort and understanding with regards to using the EmuJoy software; experimentation did not begin until participants felt comfortable and understood the use of EmuJoy. Participants were reminded of the distinction between rating emotions felt vs. emotions perceived in the music; the former was encouraged throughout relevant sections of the experiment. After this, the experimental procedure began with each condition being presented to participants in a counterbalanced fashion. All procedures in this study were approved by the Monash University Human Research Ethics Committee.

EEG Data Analysis for Frontal Asymmetry

Electroencephalograph data from each participant was visually inspected for artifacts (eye movements and muscle artifacts were manually removed prior to any analyses). EEG data were also digitally filtered with a low-pass zero phase-shift filter set to 30 Hz and 96 dB/oct. All data were re-referenced to mastoid processes. The sampling rate was 1250 Hz and eye movements were controlled for with automatic artifact rejection of >50 μV in reference to VEO. Data were baseline corrected to 100 ms pre-stimulus period. EEG data were aggregated for all artifact-free periods within a condition to form a set of data for the positive music, negative music, neutral, and the control.

Chunks of 1024 ms were extracted for analyses using a Cosine window. A Fast Fourier Transform (FFT) was applied to each chunk of EEG, permitting the computation of the amount of power at different frequencies. Power values from all chunks within an epoch were averaged (see Dumermuth and Molinari, 1987 ). The dependent measure that was extracted from this analysis was power density (μV²/Hz) in the alpha band (8–13 Hz). The data were log transformed to normalize their distribution because power values are positively skewed ( Davidson, 1988 ). Power in the alpha band is inversely related to activation (e.g., Lindsley and Wicke, 1974 ) and has been the measure most consistently obtained in studies of EEG asymmetry ( Davidson, 1988 ). Cortical asymmetry [ln(right) – ln(left)] was computed for the alpha band. This FA score provides a simple unidimensional scale representing relative activity of the right and left hemispheres for an electrode pair (e.g., F3 [left]/F4 [right]). FA scores of 0 indicate no asymmetry, while scores greater than 0 are putatively indicative of greater left frontal activity (positive affective response) and scores below 0 are indicative of greater right frontal activity (negative affective response), assuming that alpha is inversely related to activity ( Allen et al., 2004 ). Peak FA periods at the FC3/FC4 site were also identified across each participant’s pleasant music piece for purposes of music event analysis. FA (difference between left and right power densities) values were ranked from highest (most asymmetric, left biased) to lowest using spectrograms (see Figure 2 for an example). Due to considerable inter-individual variability in asymmetry ranges, descriptive ranking was used as a selection criterion instead of an absolute threshold or statistical difference criterion. The ranked FA differences were inspected and those that were clearly separated from the others (on average six peaks were clearly more asymmetric than the rest of the record) were selected for each individual as their greatest moments of FA. This process was performed by two raters (authors H-AA and NR), with 100% interrater reliability, so no further analysis was considered necessary to rank the FA peaks.
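The asymmetry computation described above can be illustrated with a small, standard-library Python sketch. It is not the Neuroscan pipeline used in the study: the function names are hypothetical, a Hann (raised-cosine) taper is assumed as one reading of "Cosine window", the signals are synthetic, and power is left unnormalised because any scale factor common to both channels cancels in ln(right) − ln(left).

```python
import cmath
import math

FS_HZ = 1250.0          # sampling rate reported in the text
CHUNK_S = 1.024         # chunk length reported in the text (1024 ms)
ALPHA_BAND = (8.0, 13.0)


def hann_window(n):
    """Raised-cosine (Hann) taper; an assumed reading of 'Cosine window'."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def alpha_power(chunk, fs=FS_HZ, band=ALPHA_BAND):
    """Unnormalised spectral power summed over DFT bins inside the band."""
    n = len(chunk)
    tapered = [x * w for x, w in zip(chunk, hann_window(n))]
    bin_hz = fs / n
    k_lo = math.ceil(band[0] / bin_hz)
    k_hi = math.floor(band[1] / bin_hz)
    power = 0.0
    for k in range(k_lo, k_hi + 1):
        # Direct DFT of one bin; only the few alpha-band bins are needed.
        coef = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                   for i, x in enumerate(tapered))
        power += abs(coef) ** 2
    return power


def fa_score(alpha_left, alpha_right):
    """ln(right) - ln(left); > 0 is read as relatively greater left activity."""
    return math.log(alpha_right) - math.log(alpha_left)


# Synthetic 10 Hz signals (illustrative only): the right channel has twice
# the amplitude, hence four times the alpha power, of the left channel.
n_samples = round(FS_HZ * CHUNK_S)  # 1280 samples per chunk
left = [1.0 * math.sin(2.0 * math.pi * 10.0 * i / FS_HZ) for i in range(n_samples)]
right = [2.0 * math.sin(2.0 * math.pi * 10.0 * i / FS_HZ) for i in range(n_samples)]

fa = fa_score(alpha_power(left), alpha_power(right))
print(round(fa, 4))  # ln(4), about 1.3863
```

Because alpha power is taken as inversely related to activation, the positive score here (more alpha on the right) would be read as relatively greater left-hemisphere activity.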


FIGURE 2. Sample data for participant 4 (music selection: The Four Seasons: Spring; Antonio Vivaldi. Recording: Karoly Botvay, Budapest Strings, Cobra Entertainment). (A) EEG alpha band spectrogram; (B) subjective valence and arousal ratings; and (C) music feature analysis.

Music Event Data Coding

A subjective method of annotating each pleasant music piece with temporal onsets and types of all notable changes in musical features was utilized in this study. Coding was performed by a music performer and producer with postgraduate qualifications in systematic musicology. A decision was made to use subjective coding as it has been successfully used previously to identify significant changes in a broad range of music features associated with emotional induction by music ( Sloboda, 1991 ). This method was framed within a hierarchical category model which contained both low-level and high-level factors of important changes. First, each participant’s music piece was described by annotating the audiogram, noting the types of music changes at respective times. Secondly, the low-level factor model utilized by Coutinho and Cangelosi (2011) was applied to assign the identified music features deductively to changes within six low-level factors: loudness, pitch level, pitch contour, tempo, texture, and sharpness. Each low-level factor change was coded as a change toward one of the two anchors of the feature. For example, if a modification was marked in terms of loudness with ‘loud,’ it described an increase in loudness of the current part compared to the part before (see Table 2 ).


TABLE 2. Operational definitions of high and low level musical features investigated in the current study.

Due to the high variability of the analyzed musical pieces from a musicological perspective (including the genre, which ranged from classical and jazz to pop and electronica), every song had a different frequency of changes in terms of these six factors. Hence, we applied a third step of categorization which led to a more abstract layer of changes in musical features that included two higher-level factors: motif changes and instrument changes. A time point in the music is marked with ‘motif change’ if the theme, movement or motif of the leading melody changes from one part to the next. The factor ‘instrument change’ can be defined as an increase or decrease in the number of playing instruments or as a change of instruments used within the current part.

Data were scored and entered into PASW Statistics 18 for analyses. No missing data or outliers were observed in the survey data. Bivariate correlations were run between potential confounding variables – Positive affect negative affect schedule (PANAS), and the Music use questionnaire (MUSE) – and FA to determine if they were potential confounds, but no correlations were observed.

A sample of data obtained for each participant is shown in Figure 2 . For this participant, five peak alpha periods were identified (shown in blue arrows at top). Changes in subjective valence and arousal across the piece are shown in the second panel, and then the musicological analysis in the final section of the figure.

Subjective Ratings of Emotion – Averaged Emotional Responses

A one-way analysis of variance (ANOVA) was conducted to compare mean subjective ratings of emotional valence. Kolmogorov–Smirnov tests of normality indicated that distributions were normal for each condition except the subjective ratings of the control condition, D(9) = 0.35, p < 0.001. Nonetheless, as ANOVAs are robust to violations of normality when group sizes are equal ( Howell, 2002 ), parametric tests were retained. No missing data or outliers were observed in the subjective rating data. Figure 3 below shows the mean ratings of each condition.


FIGURE 3. Mean subjective emotion ratings (valence and arousal) for control (silence), unpleasant (dissonant), neutral, and pleasant (self-selected) music conditions.

Figure 3 shows that both the direction and magnitude of subjective emotional valence differed across conditions, with the pleasant condition rated very positively, the unpleasant condition rated negatively, and the control and neutral conditions rated as neutral. Arousal ratings appeared to be reduced in response to unpleasant and pleasant music. (Anecdotal reports from participants indicated that in addition to being very familiar with their own music, participants recognized the unpleasant piece as a dissonant manipulation of their own music selection, and were therefore familiar with it also. Several participants noted that this made the piece even more unpleasant to listen to for them.)

Sphericity was met for the arousal ratings, but not for valence ratings, so a Greenhouse-Geisser correction was made for analyses on valence ratings. A one-way repeated measures ANOVA revealed a significant effect of stimulus condition on valence ratings, F (1.6,27.07) = 23.442, p < 0.001, ηp² = 0.58. Post hoc contrasts revealed that the mean subjective valence rating for the unpleasant music was significantly lower than for the control, F (1,17) = 5.59, p = 0.030, ηp² = 0.25, and the mean subjective valence rating for the pleasant music was significantly higher than for the control condition, F (1,17) = 112.42, p < 0.001, ηp² = 0.87. The one-way repeated measures ANOVA for arousal ratings also showed a significant effect for stimulus condition, F (3,51) = 5.20, p = 0.003, ηp² = 0.23. Post hoc contrasts revealed that arousal ratings were significantly reduced by both unpleasant, F (1,17) = 10.11, p = 0.005, ηp² = 0.37, and pleasant music, F (1,17) = 6.88, p = 0.018, ηp² = 0.29, when compared with ratings for the control.
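As a consistency check on the effect sizes reported above, partial eta squared can be recovered from an F ratio and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch of that arithmetic, not the PASW procedure used in the study; the function name and the labels are illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the subjective ratings.
reported = [
    ("valence, omnibus", 23.442, 1.6, 27.07),
    ("valence, unpleasant vs control", 5.59, 1, 17),
    ("valence, pleasant vs control", 112.42, 1, 17),
    ("arousal, omnibus", 5.20, 3, 51),
    ("arousal, unpleasant vs control", 10.11, 1, 17),
    ("arousal, pleasant vs control", 6.88, 1, 17),
]

for label, f_value, df_effect, df_error in reported:
    print(f"{label}: {partial_eta_squared(f_value, df_effect, df_error):.2f}")
```

Rounded to two decimals, these values agree with the effect sizes reported for the valence and arousal analyses.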

Aim 1: Can Emotionally Pleasant Music Be Detected by a Central Marker of Emotion (FA)?

Two-way repeated measures ANOVAs were conducted on the FA scores (averaged across the baseline period, and averaged across the condition) for each of the two frontal electrode pairs and for the control parietal pair. The within-subjects factors were music condition (positive, negative, neutral, and control) and time (baseline and stimulus). Despite the robustness of ANOVA to violations of its assumptions, results should be interpreted with caution because both the normality and sphericity assumptions were violated for every electrode pair. Where sphericity was violated, a Greenhouse–Geisser correction was applied. Asymmetry scores above two were considered likely to result from noisy or damaged electrodes (62 points out of 864); these were treated as missing data and excluded pairwise. Two outliers were identified in the data and were replaced with a score ±3.29 standard deviations from the mean.
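The two data-cleaning rules described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run in PASW: the function name is hypothetical, the "above two" rule is assumed to apply to the absolute value of the score, and the artifact rule is assumed to be applied before the mean and standard deviation are computed.

```python
from statistics import mean, stdev


def clean_fa_scores(scores, artifact_limit=2.0, z_limit=3.29):
    """Apply both cleaning rules to a list of asymmetry scores (None = missing)."""
    # Rule 1 (assumed to act on absolute values): scores beyond the artifact
    # limit are treated as missing.
    kept = [s if s is not None and abs(s) <= artifact_limit else None
            for s in scores]
    valid = [s for s in kept if s is not None]
    # Rule 2: remaining outliers are pulled back to mean +/- z_limit SDs.
    centre, spread = mean(valid), stdev(valid)
    lower, upper = centre - z_limit * spread, centre + z_limit * spread
    return [None if s is None else min(max(s, lower), upper) for s in kept]


# Hypothetical data: 29 scores of 0.0, one outlier (1.0), one artifact (5.0).
example = [0.0] * 29 + [1.0, 5.0]
cleaned = clean_fa_scores(example)
print(cleaned[-2:])
```

In this made-up example the score of 5.0 becomes missing, and the score of 1.0 is replaced by the value lying 3.29 standard deviations above the mean of the remaining scores.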

A significant time by condition interaction effect was observed at the FC3/FC4 site, F (2.09,27.17) = 3.45, p = 0.045, ηp² = 0.210, and a significant condition main effect was observed at the F3/F4 site, F (2.58,21.59) = 3.22, p = 0.039, ηp² = 0.168. No significant effects were observed at the P3/P4 site [time by condition effect, F (1.98,23.76) = 2.27, p = 0.126]. The significant interaction at FC3/FC4 is shown in Figure 4.


FIGURE 4. FC3/FC4 (A) and P3/P4 (B) (control) asymmetry score at baseline and during condition, for each condition. Asymmetry scores of 0 indicate no asymmetry. Scores >0 indicate left bias asymmetry (and positive affect), while scores <0 indicate right bias asymmetry (and negative affect). ∗ p < 0.05.

The greatest difference between baseline and during condition FA scores was for the pleasant music, representative of a positive shift in asymmetry from the right hemisphere to the left when comparing the baseline period to the stimulus period. Planned simple contrasts revealed that when compared with the unpleasant music condition, only the pleasant music condition showed a significant positive shift in FA score, F (1,13) = 6.27, p = 0.026. Positive shifts in FA were also apparent for control and neutral music conditions, although not significantly greater than for the unpleasant music condition [ F (1,13) = 2.60, p = 0.131, and F (1,13) = 3.28, p = 0.093], respectively.

Aim 2: Are Peak FA Periods Associated with Particular Musical Events?

Peak periods of FA were identified for each participant, and the number per participant varied between 2 and 9 ( M = 6.5, SD = 2.0). The music event description was then examined for the presence or absence of coded musical events within a 10 s time window (5 s before to 5 s after) around the peak FA time-points. Across all participants, 106 peak alpha periods were identified, 78 of which (74%) were associated with particular music events. The type of music event coinciding with peak alpha periods is shown in Table 3 . A two-step cluster analysis was also performed to explore natural groupings of peak alpha asymmetry events that coincided with distinct combinations (2 or more) of musical features. A musical feature was deemed a salient characteristic of a cluster if it was present in at least 70% of the peak alpha events within the same cluster.
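The coincidence rule above (a coded music event falling within 5 s before to 5 s after a peak FA time-point) can be expressed as a short sketch. The coding in this study was done manually, so the function below is only an illustration of the window logic; the function name and the example time-points are hypothetical.

```python
def peaks_with_events(peak_times, event_times, half_window=5.0):
    """Return the peaks with at least one coded event within +/- half_window seconds."""
    return [p for p in peak_times
            if any(abs(p - e) <= half_window for e in event_times)]


# Hypothetical time-points in seconds, for illustration only.
peaks = [12.0, 47.5, 90.0, 133.0]
events = [10.0, 52.0, 120.0]

matched = peaks_with_events(peaks, events)
print(matched, f"{len(matched) / len(peaks):.0%}")

# The proportion reported for the sample: 78 of 106 peaks.
print(f"{78 / 106:.0%}")
```

Widening or narrowing `half_window` is one way to separate anticipation of an event from the response to it, at the cost of more or fewer chance coincidences.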


TABLE 3. Frequency and percentages of musical features associated with a physiological marker of emotion (peak alpha FA). High level, low level, and clusters of music features are distinguished.

Table 3 shows that, considered independently, the most frequent music features associated with peak alpha periods were primarily high level factors (changes in motif and instruments), with the addition of one low level factor (pitch). In exploring the data for clusters of peak alpha events associated with combinations of musical features, a four cluster solution was found to successfully group approximately half (53%) of the events into groups with identifiable patterns. This equated to 3 separate clusters characterized by distinct combinations of musical features, with the remaining half (47%) deemed unclassifiable as higher factor solutions provided no further differentiation.

In the current study, a central physiological marker (alpha FA) was used to investigate the emotional response to music selected by participants to be ‘emotionally powerful’ and pleasant. Musical features of these pieces were also examined to explore associations between key musical events and central physiological markers of emotional responding. The first aim of this study was to examine whether pleasant music elicited physiological reactions in this central marker of emotional responding. As hypothesized, pleasant musical stimuli elicited greater shifts in FA than did the control auditory stimulus, silence or an unpleasant dissonant version of each participant’s music. This finding confirmed previous research findings and demonstrated that the methodology was robust and appropriate for further investigation. The second aim was to examine associations between key musical features (affiliated with emotion), contained within participant-selected musical pieces, and peaks in FA. FA peaks were commonly associated with changes in both high and low level music features, including changes in motif, instrument, loudness and pitch, supporting the hypothesis that key events in music are marked by significant physiological changes in the listener. Further, specific combinations of individual musical features were identified that tended to predict FA peaks.

Central Physiological Measures of Responding to Musical Stimuli

Participants’ subjective valence ratings of music were consistent with expectations; control and neutral music were both rated neutrally, while unpleasant music was rated negatively and pleasant music was rated very positively. These findings are consistent with previous research indicating that music is capable of eliciting strong felt positive affective reports ( Panksepp, 1995 ; Rickard, 2004 ; Juslin et al., 2008 ; Zentner et al., 2008 ; Eerola and Vuoskoski, 2011 ). The current findings were also consistent with previous negative subjective ratings (unpleasantness) by participants listening to the dissonant manipulation of musical stimuli ( Koelsch et al., 2006 ). It is not entirely clear why arousal ratings were reduced by both the unpleasant and pleasant music. The variety of pieces selected by participants means that both relaxing and stimulating pieces were present in these conditions, although overall, the findings suggest that listening to music (regardless of whether pleasant or unpleasant) was more calming than silence for this sample. In addition, as both pieces were likely to be familiar (as participants reported that they recognized the dissonant manipulations of their own piece), familiarity could have reduced the arousal response expected for unpleasant music.

As hypothesized, FA responses from the FC3/FC4 site were consistent with subjective valence ratings, with the largest shift to the left hemisphere observed for the pleasant music condition. While not statistically significant, the small shifts to the left hemisphere during both control and neutral music conditions, and the small shift to the right hemisphere during the unpleasant music condition, indicate the trends in FA were also consistent with subjective valence reports. These findings support previous research findings on the involvement of the left frontal lobe in positive emotional experiences, and the right frontal lobe in negative emotional experiences ( Davidson et al., 1979 , 1990 ; Fox and Davidson, 1986 ; Davidson and Fox, 1989 ; Tomarken et al., 1990 ). The demonstration of these effects in the FC3/FC4 site is consistent with previous findings ( Davidson et al., 1990 ; Jackson et al., 2003 ; Travis and Arenander, 2006 ; Kline and Allen, 2008 ; Dennis and Solomon, 2010 ), although meaningful findings are also commonly obtained from data collected from the F3/F4 site (see Schmidt and Trainor, 2001 ; Thibodeau et al., 2006 ), which was not observed in the current study. The asymmetry findings also verify findings observed in response to positive and negative emotion induction by music ( Schmidt and Trainor, 2001 ; Altenmüller et al., 2002 ; Flores-Gutierrez et al., 2007 ; Hausmann et al., 2013 ). Importantly, no significant FA effect was observed in the control P3/P4 sites, which is an area not implicated in emotional responding.

Associations between Musical Features and Peak Periods of Frontal Asymmetry

Individual musical features.

Several individual musical features coincided with peak FA events. Each of these musical features occurred in over 40% of the total peak alpha asymmetry events identified throughout the sample and appears to be closely related to changes in musical structure. These included changes in motif and instruments (high level factors), as well as pitch (low level factor). Such findings are in line with previous studies measuring non-central physiological measures of affective responding. For example, high level musical features such as instrument change, specifically changes and alternations between orchestra and solo instruments, have been reported to coincide with chill responses ( Grewe et al., 2007b ; Guhn et al., 2007 ). Similarly, pitch events have been observed in previous research to coincide with various physiological measures of emotional responding including skin conductance and heart rate ( Coutinho and Cangelosi, 2011 ; Egermann et al., 2013 ). In the current study, instances of high pitch were most closely associated with physiological reactions. These findings can be explained through Juslin and Sloboda’s (2010) description of the activation of a ‘brain stem reflex’ in response to changes in basic acoustic events. Changes in loudness and high pitch levels may trigger physiological reactions on account of being psychoacoustic features of music that are shared with more primitive auditory stimuli that signal relevance for survival to real events.

Changes in instruments and motif, however, may be less related to primitive auditory stimuli and stimulate physiological reactions differently. Motif changes have not been observed in previous studies yet appeared most frequently throughout the peak alpha asymmetry events identified in the sample. In music, motif has been described as “...the smallest structural unit possessing thematic identity” ( White, 1976 , p. 26–27) and exists as a salient and recurring characteristic musical fragment throughout a musical piece. Within the descriptive analysis of the current study, however, a motif can be understood in a much broader sense (see definitions in Table 2 ). Due to the broad musical diversity of the songs selected by participants, the term motif change emerged as the most appropriate description to cover high level structural changes in all the different musical pieces (e.g., changes from one small unit to another in a classical piece and changes from a long repetitive pattern to a chorus in an electronic dance piece). Changes in such a fundamental musical feature, as well as changes in instrument, are likely to stimulate a sense of novelty and add complexity, and possibly unexpectedness (i.e., features of goal oriented stimuli), to a musical piece. This may therefore also recruit the same neural system which has evolved to yield an emotional response, which in this study is manifest in the greater activation in the left frontal hemisphere compared to the right frontal hemisphere. Many of the other musical features identified, however, did not coincide frequently with peak FA events. While peripheral markers of emotion, such as skin conductance and heart rate changes, are likely to respond strongly to basic psychoacoustic events associated with arousal, it is likely that central markers such as FA are more sensitive to higher level musical events associated with positive affect. This may explain why motif changes were a particularly frequent event associated with FA peaks. Alternatively, some musical features may evoke emotional and physiological reactions only when present in conjunction with other musical features. It is recognized that an objective method of low level music feature identification would also be useful in future research to validate the current findings relating to low level psychoacoustic events. A limitation of the current study, however, was that the coding of both peak FA events and music events was subjective, which limits both replicability and objectivity. It is recommended that future research utilize more objective coding techniques, including statistical identification of peak FA events and formal psychoacoustic analysis (for example, using software tools such as MIR Toolbox or PsySound). While an objective method of detecting FA events occurring within a specific time period after a music event is also appealing, the current methodology operationalized synchrony of FA and music events within a 10 s time window to include mechanisms of anticipation as well as experience of the event. Future research may be able to provide further distinction between these emotion induction mechanisms by applying different time windows to such analyses.

Feature Clusters of Musical Feature Combinations

Several clusters comprising combinations of musical features were identified in the current study. A number of musical events which on their own did not coincide with FA peaks did nonetheless appear in music event clusters that were associated with FA peaks. For example, feature cluster 1 consists of motif and instrument changes—both individually considered to coincide frequently with peak alpha asymmetry events—as well as texture (multi) and sharpness (dull). Changes in texture and sharpness, individually, were observed to occur in only 24.3 and 19.2% of the total peak alpha asymmetry events, respectively. After exploring the data for natural groupings of musical events that occurred during peak alpha asymmetry scores, however, texture and sharpness changes appeared to occur frequently in conjunction with motif changes and instrument changes. Within feature cluster 1, texture and sharpness occurred in 86 and 93% of the peak alpha asymmetry periods. This suggests that certain musical features, like texture and sharpness, may lead to stronger emotional responses in central markers of physiological functioning when presented concurrently with specific musical events as compared to instances where they are present in isolation.
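The salience criterion used for the clusters (a feature is a salient characteristic when it is present in at least 70% of the peak alpha events in that cluster) can be illustrated with a small sketch. It shows only the thresholding step, not the two-step cluster analysis itself, and the function name, feature labels and counts are hypothetical.

```python
def salient_features(cluster_events, threshold=0.70):
    """Map each feature to its within-cluster proportion, keeping those at or above threshold.

    cluster_events holds one set of coded feature names per peak event.
    """
    total = len(cluster_events)
    counts = {}
    for features in cluster_events:
        for name in features:
            counts[name] = counts.get(name, 0) + 1
    return {name: n / total for name, n in counts.items() if n / total >= threshold}


# Hypothetical cluster of 10 peak events (not the study's data).
cluster = (
    [{"motif", "instrument", "texture", "pitch"}]
    + [{"motif", "instrument", "texture"}] * 7
    + [{"motif", "instrument", "pitch"}]
    + [{"motif", "pitch"}]
)

print(sorted(salient_features(cluster).items()))
```

In this made-up cluster, motif, instrument and texture pass the 70% criterion while pitch (3 of 10 events) does not. The same pattern is described above for texture and sharpness: a feature that is infrequent across all peak events can still be salient within one cluster.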

An interesting related observation is the specificity with which these musical events can combine to form a cluster. While motif and instrument changes occurred often in conjunction with texture (multi) and sharpness (dull) during peak alpha asymmetry events, both also occurred distinctly in conjunction with dynamic changes in volume (high level factor) and softness (low level factor) in a separate feature cluster. While both the texture/sharpness and loudness change/softness combinations frequently occur with motif and instrument changes, they appear to do so in a mutually exclusive manner. This suggests a high level of complexity and specificity with which musical features may complement one another to stimulate physiological reactions during musical pieces.

The current findings extend previous research which has demonstrated that emotionally powerful music elicits changes in physiological, as well as subjective, measures of emotion. This study provides further empirical support for the emotivist theory of music and emotion which proposes that if emotional responses to music are ‘real,’ then they should be observable in physiological indices of emotion ( Krumhansl, 1997 ; Rickard, 2004 ). The pattern of FA observed in this study is consistent with that observed in previous research in response to positive and negative music ( Blood et al., 1999 ; Schmidt and Trainor, 2001 ), and non-musical stimuli ( Fox, 1991 ; Davidson, 1993 , 2000 ). However, the current study utilized music which expressed and induced positive emotions only, whereas previous research has also included powerful emotions induced by music expressing negative emotions. It would be of interest to replicate the current study with a broader range of powerful music to determine whether FA is indeed a marker of emotional experience, or a mixture of emotion perception and experience.

The findings also extend those obtained in studies which have examined musical features associated with strong emotional responses. Consistent with the broad consensus in this research, strong emotional responses often coincide with music events that signal change, novelty or violated expectations ( Sloboda, 1991 ; Huron, 2006 ; Steinbeis et al., 2006 ; Egermann et al., 2013 ). In particular, FA peaks were found to be associated in the current sample’s music selections with motif changes, instrument changes, dynamic changes in volume, and pitch, or specific clusters of music events. Importantly, however, these conclusions are limited by the modest sample size, and consequently by the music pieces selected. Further research utilizing a different set of music pieces may identify a quite distinct pattern of music features associated with FA peaks. In sum, these findings provide empirical support for anticipation/expectation as a fundamental mechanism underlying music’s capacity to evoke strong emotional responses in listeners.

Ethics Statement

This study was carried out in accordance with the recommendations of the National Statement on Ethical Conduct in Human Research, National Health and Medical Research Council, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Monash University Standing Committee for Ethical Research on Humans.

Author Contributions

H-AA conducted the experiments, contributed to the design and methods of the study, analysis of data and preparation of all sections of the manuscript. NR contributed to the design and methods of the study, analysis of data and preparation of all sections of the manuscript, and provided oversight of this study. JH conducted the musicological analyses of the music selections, and contributed to the methods and results sections of the manuscript. BP performed the analyses of the EEG recordings and contributed to the methods and results sections of the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

  • One participant only chose music with lyrical content; the experimenter confirmed with this participant that the language (Italian) was unknown to them.

Allen, J., Coan, J., and Nazarian, M. (2004). Issues and assumptions on the road from raw signals to metrics of frontal EEG asymmetry in emotion. Biol. Psychol. 67, 183–218. doi: 10.1016/j.biopsycho.2004.03.007

Altenmüller, E., Schürmann, K., Lim, V. K., and Parlitz, D. (2002). Hits to the left, flops to the right: different emotions during listening to music are reflected in cortical lateralisation patterns. Neuropsychologia 40, 2242–2256. doi: 10.1016/S0028-3932(02)00107-0

Bartlett, D. L. (1996). “Physiological reactions to music and acoustic stimuli,” in Handbook of Music Psychology , 2nd Edn, ed. D. A. Hodges (San Antonio, TX: IMR Press), 343–385.

Blood, A. J., and Zatorre, R. J. (2001). Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proc. Natl. Acad. Sci. U.S.A. 98, 11818–11823. doi: 10.1073/pnas.191355898

Blood, A. J., Zatorre, R. J., Bermudez, P., and Evans, A. C. (1999). Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nat. Neurosci. 2, 382–387. doi: 10.1038/7299

Bogert, B., Numminen-Kontti, T., Gold, B., Sams, M., Numminen, J., Burunat, I., et al. (2016). Hidden sources of joy, fear, and sadness: explicit versus implicit neural processing of musical emotions. Neuropsychologia 89, 393–402. doi: 10.1016/j.neuropsychologia.2016.07.005

Bradley, M. M., and Lang, P. J. (1994). Measuring emotion: the self-assessment manikin and the semantic differential. J. Behav. Ther. Exp. Psychiatry 25, 49–59. doi: 10.1016/0005-7916(94)90063-9

Brattico, E. (2015). “From pleasure to liking and back: bottom-up and top-down neural routes to the aesthetic enjoyment of music,” in Art, Aesthetics and the Brain , eds M. Nadal, J. P. Houston, L. Agnati, F. Mora, and C. J. Cela-Conde (Oxford, NY: Oxford University Press), 303–318. doi: 10.1093/acprof:oso/9780199670000.003.0015

Chin, T. C., and Rickard, N. S. (2012). The Music USE (MUSE) questionnaire; an instrument to measure engagement in music. Music Percept. 29, 429–446. doi: 10.1525/mp.2012.29.4.429

Coutinho, E., and Cangelosi, A. (2011). Musical emotions: predicting second-by-second subjective feelings of emotion from low-level psychoacoustic features and physiological measurements. Emotion 11, 921–937. doi: 10.1037/a0024700

Davidson, R. J. (1988). EEG measures of cerebral asymmetry: conceptual and methodological issues. Int. J. Neurosci. 39, 71–89. doi: 10.3109/00207458808985694

Davidson, R. J. (1993). “The neuropsychology of emotion and affective style,” in Handbook of Emotion , eds M. Lewis and J. M. Haviland (New York, NY: The Guilford Press), 143–154.

Davidson, R. J. (2000). Affective style, psychopathology, and resilience. Brain mechanisms and plasticity. Am. Psychol. 55, 1196–1214. doi: 10.1037/0003-066X.55.11.1196

Davidson, R. J. (2004). Well-being and affective style: neural substrates and biobehavioural correlates. Philos. Trans. R. Soc. 359, 1395–1411. doi: 10.1098/rstb.2004.1510

Davidson, R. J., Ekman, P., Saron, C. D., Senulis, J. A., and Friesen, W. V. (1990). Approach-withdrawal and cerebral asymmetry: emotional expression and brain physiology: I. J. Pers. Soc. Psychol. 58, 330–341. doi: 10.1037/0022-3514.58.2.330

Davidson, R. J., and Fox, N. A. (1989). Frontal brain asymmetry predicts infants’ response to maternal separation. J. Abnorm. Psychol. 98, 127–131. doi: 10.1037/0021-843X.98.2.127

Davidson, R. J., and Irwin, W. (1999). The functional neuroanatomy of emotion and affective style. Trends Cogn. Sci. 3, 11–21. doi: 10.1016/S1364-6613(98)01265-0

Davidson, R. J., Jackson, D. C., and Kalin, N. H. (2000). Emotion, plasticity, context, and regulation: perspectives from affective neuroscience. Psychol. Bull. 126, 890–909. doi: 10.1037/0033-2909.126.6.890

Davidson, R. J., Kabat-Zinn, J., Schumacher, J., Rosenkranz, M., Muller, D., Santorelli, S. F., et al. (2003). Alterations in brain and immune function produced by mindfulness meditation. Psychosom. Med. 65, 564–570. doi: 10.1097/01.PSY.0000077505.67574.E3

Davidson, R. J., Schwartz, G. E., Saron, C., Bennett, J., and Goleman, D. J. (1979). Frontal versus parietal EEG asymmetry during positive and negative affect. Psychophysiology 16, 202–203.

Dennis, T. A., and Solomon, B. (2010). Frontal EEG and emotion regulation: electrocortical activity in response to emotional film clips is associated with reduced mood induction and attention interference effects. Biol. Psychol. 85, 456–464. doi: 10.1016/j.biopsycho.2010.09.008

Dumermuth, G., and Molinari, L. (1987). “Spectral analysis of EEG background activity,” in Handbook of Electroencephalography and Clinical Neurophysiology: Methods of Analysis of Brain Electrical and Magnetic Signals , Vol. 1, eds A. S. Gevins and A. Remond (Amsterdam: Elsevier), 85–130.

Eerola, T., and Vuoskoski, J. K. (2011). A comparison of the discrete and dimensional models of emotion in music. Psychol. Music 39, 18–49. doi: 10.1177/0305735610362821

Egermann, H., Pearce, M. T., Wiggins, G. A., and McAdams, S. (2013). Probabilistic models of expectation violation predict psychophysiological emotional responses to live concert music. Cogn. Affect. Behav. Neurosci. 13, 533–553. doi: 10.3758/s13415-013-0161-y

Flores-Gutierrez, E. O., Diaz, J.-L., Barrios, F. A., Favila-Humara, R., Guevara, M. A., del Rio-Portilla, Y., et al. (2007). Metabolic and electric brain patterns during pleasant and unpleasant emotions induced by music masterpieces. Int. J. Psychophysiol. 65, 69–84. doi: 10.1016/j.ijpsycho.2007.03.004

Fox, N. A. (1991). If it’s not left, it’s right: electroencephalogram asymmetry and the development of emotion. Am. Psychol. 46, 863–872. doi: 10.1037/0003-066X.46.8.863

Fox, N. A., and Davidson, R. J. (1986). Taste-elicited changes in facial signs of emotion and the asymmetry of brain electrical activity in human newborns. Neuropsychologia 24, 417–422. doi: 10.1016/0028-3932(86)90028-X

Frijda, N. H., and Scherer, K. R. (2009). “Emotion definition (psychological perspectives),” in Oxford Companion to Emotion and the Affective Sciences , eds D. Sander and K. R. Scherer (Oxford: Oxford University Press), 142–143.

Gabrielsson, A., and Lindström, E. (2010). “The role of structure in the musical expression of emotions,” in Handbook of Music and Emotion: Theory, Research, Applications , eds P. N. Juslin and J. A. Sloboda (New York, NY: Oxford University Press), 367–400.

Gomez, P., and Danuser, B. (2007). Relationships between musical structure and psychophysiological measures of emotion. Emotion 7, 377–387. doi: 10.1037/1528-3542.7.2.377

Grewe, O., Nagel, F., Kopiez, R., and Altenmüller, E. (2007a). Emotions over time: synchronicity and development of subjective, physiological, and facial affective reactions to music. Emotion 7, 774–788.

Grewe, O., Nagel, F., Kopiez, R., and Altenmüller, E. (2007b). Listening to music as a re-creative process: physiological, psychological, and psychoacoustical correlates of chills and strong emotions. Music Percept. 24, 297–314. doi: 10.1525/mp.2007.24.3.297

Guhn, M., Hamm, A., and Zentner, M. (2007). Physiological and musico-acoustic correlates of the chill response. Music Percept. 24, 473–484. doi: 10.1525/mp.2007.24.5.473

Hausmann, M., Hodgetts, S., and Eerola, T. (2013). Music-induced changes in functional cerebral asymmetries. Brain Cogn. 104, 58–71. doi: 10.1016/j.bandc.2016.03.001

Hodges, D. (2010). “Psychophysiological measures,” in Handbook of Music and Emotion: Theory, Research and Applications , eds P. N. Juslin and J. A. Sloboda (New York, NY: Oxford University Press), 279–312.

Howell, D. C. (2002). Statistical Methods for Psychology , 5th Edn. Belmont, CA: Duxbury.

Huron, D. (2006). Sweet Anticipation: Music and the Psychology of Expectation. Cambridge, MA: MIT Press.

Jackson, D. C., Malmstadt, J. R., Larson, C. L., and Davidson, R. J. (2000). Suppression and enhancement of emotional responses to unpleasant pictures. Psychophysiology 37, 515–522. doi: 10.1111/1469-8986.3740515

Jackson, D. C., Mueller, C. J., Dolski, I., Dalton, K. M., Nitschke, J. B., Urry, H. L., et al. (2003). Now you feel it now you don’t: frontal brain electrical asymmetry and individual differences in emotion regulation. Psychol. Sci. 14, 612–617. doi: 10.1046/j.0956-7976.2003.psci_1473.x

Jasper, H. H. (1958). Report of the committee on methods of clinical examination in electroencephalography. Electroencephalogr. Clin. Neurophysiol. 10, 370–375. doi: 10.1016/0013-4694(58)90053-1

Jones, N. A., and Field, T. (1999). Massage and music therapies attenuate frontal EEG asymmetry in depressed adolescents. Adolescence 34, 529–534.

Juslin, P. N., Liljeström, S., Västfjäll, D., Barradas, G., and Silva, A. (2008). An experience sampling study of emotional reactions to music: listener, music, and situation. Emotion 8, 668–683. doi: 10.1037/a0013505

Juslin, P. N., Liljeström, S., Västfjäll, D., and Lundqvist, L. (2010). “How does music evoke emotions? Exploring the underlying mechanisms,” in Music and Emotion: Theory, Research and Applications , eds P. N. Juslin and J. A. Sloboda (Oxford: Oxford University Press), 605–642.

Juslin, P. N., and Sloboda, J. A. (eds) (2010). Handbook of Music and Emotion: Theory, Research and Applications. New York, NY: Oxford University Press.

Juslin, P. N., and Västfjäll, D. (2008). Emotional responses to music: the need to consider underlying mechanisms. Behav. Brain Sci. 31, 559–621. doi: 10.1017/S0140525X08005293

Kivy, P. (1990). Music Alone; Philosophical Reflections on the Purely Musical Experience. London: Cornell University Press.

Kline, J. P., and Allen, S. (2008). The failed repressor: EEG asymmetry as a moderator of the relation between defensiveness and depressive symptoms. Int. J. Psychophysiol. 68, 228–234. doi: 10.1016/j.ijpsycho.2008.02.002

Koelsch, S., Fritz, T., and Schlaug, G. (2008a). Amygdala activity can be modulated by unexpected chord functions during music listening. Neuroreport 19, 1815–1819. doi: 10.1097/WNR.0b013e32831a8722

Koelsch, S., Fritz, T., von Cramon, Y., Muller, K., and Friederici, A. D. (2006). Investigating emotion with music: an fMRI study. Hum. Brain Mapp. 27, 239–250. doi: 10.1002/hbm.20180

Koelsch, S., Kilches, S., Steinbeis, N., and Schelinski, S. (2008b). Effects of unexpected chords and of performer’s expression on brain responses and electrodermal activity. PLOS ONE 3:e2631. doi: 10.1371/journal.pone.0002631

Konecni, V. (2013). Music, affect, method, data: reflections on the Carroll versus Kivy debate. Am. J. Psychol. 126, 179–195. doi: 10.5406/amerjpsyc.126.2.0179

Krumhansl, C. L. (1997). An exploratory study of musical emotions and psychophysiology. Can. J. Exp. Psychol. 51, 336–352. doi: 10.1037/1196-1961.51.4.336

Lindsley, D. B., and Wicke, J. D. (1974). “The electroencephalogram: autonomous electrical activity in man and animals,” in Bioelectric Recording Techniques , eds R. Thompson and M. N. Patterson (New York, NY: Academic Press), 3–79.

Meyer, L. B. (1956). “Emotion and meaning in music,” in Handbook of Music and Emotion: Theory, Research and Applications , eds P. N. Juslin and J. A. Sloboda (Oxford: Oxford University Press), 279–312.

Mitterschiffthaler, M. T., Fu, C. H. Y., Dalton, J. A., Andrew, C. M., and Williams, S. C. R. (2007). A functional MRI study of happy and sad affective states induced by classical music. Hum. Brain Mapp. 28, 1150–1162. doi: 10.1002/hbm.20337

Nagel, F., Kopiez, R., Grewe, O., and Altenmüller, E. (2007). EMuJoy: software for continuous measurement of perceived emotions in music. Behav. Res. Methods 39, 283–290. doi: 10.3758/BF03193159

Panksepp, J. (1995). The emotional sources of ‘chills’ induced by music. Music Percept. 13, 171–207. doi: 10.2307/40285693

Panksepp, J., and Bernatzky, G. (2002). Emotional sounds and the brain: the neuro-affective foundations of musical appreciation. Behav. Process. 60, 133–155. doi: 10.1016/S0376-6357(02)00080-3

Rickard, N. S. (2004). Intense emotional responses to music: a test of the physiological arousal hypothesis. Psychol. Music 32, 371–388. doi: 10.1177/0305735604046096

Rickard, N. S. (2012). “Music listening and emotional well-being,” in Lifelong Engagement with Music: Benefits for Mental Health and Well-Being , eds N. S. Rickard and K. McFerran (Hauppauge, NY: de Sitter), 207–238.

Russell, J. A. (1980). A circumplex model of affect. J. Pers. Soc. Psychol. 39, 1161–1178. doi: 10.1037/h0077714

Salimpoor, V. N., Benovoy, M., Larcher, K., Dagher, A., and Zatorre, R. J. (2011). Anatomically distinct dopamine release during anticipation and experience of peak emotion to music. Nat. Neurosci. 14, 257–264. doi: 10.1038/nn.2726

Salimpoor, V. N., van den Bosch, I., Kovacevic, N., McIntosh, A. R., Dagher, A., and Zatorre, R. J. (2013). Interactions between the nucleus accumbens and auditory cortices predict music reward value. Science 340, 216–219. doi: 10.1126/science.1231059

Scherer, K. R. (2009). Emotions are emergent processes: they require a dynamic computational architecture. Philos. Trans. R. Soc. Ser. B 364, 3459–3474. doi: 10.1098/rstb.2009.0141

Scherer, K. R., and Coutinho, E. (2013). “How music creates emotion: a multifactorial process approach,” in The Emotional Power of Music , eds T. Cochrane, B. Fantini, and K. R. Scherer (Oxford: Oxford University Press). doi: 10.1093/acprof:oso/9780199654888.003.0010

Scherer, K. R., Zentner, M. R., and Schacht, A. (2002). Emotional states generated by music: an exploratory study of music experts. Music. Sci. 5, 149–171. doi: 10.1177/10298649020050S106

Schmidt, L. A., and Trainor, L. J. (2001). Frontal brain electrical activity (EEG) distinguishes valence and intensity of musical emotions. Cogn. Emot. 15, 487–500. doi: 10.1080/02699930126048

Schubert, E. (2010). “Continuous self-report methods,” in Handbook of Music and Emotion: Theory, Research and Applications , eds P. N. Juslin and J. A. Sloboda (Oxford: Oxford University Press), 223–224.

Sloboda, J. (1991). Music structure and emotional response: some empirical findings. Psychol. Music 19, 110–120. doi: 10.1177/0305735691192002

Steinbeis, N., Koelsch, S., and Sloboda, J. (2006). The role of harmonic expectancy violations in musical emotions: evidence from subjective, physiological, and neural responses. J. Cogn. Neurosci. 18, 1380–1393. doi: 10.1162/jocn.2006.18.8.1380

Thaut, M. H., and Davis, W. B. (1993). The influence of subject-selected versus experimenter-chosen music on affect, anxiety, and relaxation. J. Music Ther. 30, 210–233. doi: 10.1093/jmt/30.4.210

Thayer, J. F. (1986). Multiple Indicators of Affective Response to Music. Doctoral Dissertation, New York University, New York, NY.

Thibodeau, R., Jorgensen, R. S., and Kim, S. (2006). Depression, anxiety, and resting frontal EEG asymmetry: a meta-analytic review. J. Abnorm. Psychol. 115, 715–729. doi: 10.1037/0021-843X.115.4.715

Tomarken, A. J., Davidson, R. J., and Henriques, J. B. (1990). Resting frontal brain asymmetry predicts affective responses to films. J. Pers. Soc. Psychol. 59, 791–801. doi: 10.1037/0022-3514.59.4.791

Tomarken, A. J., Davidson, R. J., Wheeler, R. E., and Doss, R. C. (1992). Individual differences in anterior brain asymmetry and fundamental dimensions of emotion. J. Pers. Soc. Psychol. 62, 676–687. doi: 10.1037/0022-3514.62.4.676

Travis, F., and Arenander, A. (2006). Cross-sectional and longitudinal study of effects of transcendental meditation practice on interhemispheric frontal asymmetry and frontal coherence. Int. J. Neurosci. 116, 1519–1538. doi: 10.1080/00207450600575482

Watson, D., and Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: the PANAS scales. J. Pers. Soc. Psychol. 54, 1063–1070. doi: 10.1037/0022-3514.54.6.1063

White, J. D. (1976). The Analysis of Music. Duke, NC: Duke University Press.

Zentner, M., Grandjean, D., and Scherer, K. R. (2008). Emotions evoked by the sound of music: characterization, classification, and measurement. Emotion 8, 494–521. doi: 10.1037/1528-3542.8.4.494

Keywords: frontal asymmetry, subjective emotions, pleasurable music, musicology, positive and negative affect

Citation: Arjmand H-A, Hohagen J, Paton B and Rickard NS (2017) Emotional Responses to Music: Shifts in Frontal Brain Asymmetry Mark Periods of Musical Change. Front. Psychol. 8:2044. doi: 10.3389/fpsyg.2017.02044

Received: 08 November 2016; Accepted: 08 November 2017; Published: 04 December 2017.

Copyright © 2017 Arjmand, Hohagen, Paton and Rickard. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Nikki S. Rickard

This article is part of the Research Topic

Music and the Functions of the Brain: Arousal, Emotions, and Pleasure


Music-Evoked Emotions—Current Studies

Hans-Eckhardt Schaefer

1 Tübingen University, Institute of Musicology, Tübingen, Germany

2 Institute of Functional Matter and Quantum Technology, Stuttgart University, Stuttgart, Germany

The present study reviews the current state of research on music-evoked emotions experimentally, theoretically, and with respect to their therapeutic potential. After a concise historical overview and a schematic of the hearing mechanisms, experimental studies on music listeners and on music performers are discussed, starting with the presentation of characteristic musical stimuli and the basic features of tomographic imaging of emotional activation in the brain, such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET), which offer high spatial resolution in the millimeter range. The progress in correlating activation imaging in the brain with the psychological understanding of music-evoked emotion is demonstrated and some prospects for future research are outlined. Research in psychoneuroendocrinology and molecular markers is reviewed in the context of music-evoked emotions, and the results indicate that research in this area should be intensified. An assessment of studies involving measuring techniques with high temporal resolution down to the 10 ms range, e.g., electroencephalography (EEG), event-related brain potentials (ERP), magnetoencephalography (MEG), skin conductance response (SCR), finger temperature, and goose bump development (piloerection), can yield information on the dynamics and kinetics of emotion. The genetic investigations reviewed suggest the hereditary transmission of a predilection for music. Theoretical approaches to musical emotion are directed toward a unified model for experimental neurological evidence and aesthetic judgment. Finally, reports on musical therapy are briefly outlined. The study concludes with an outlook on emerging technologies and future research fields.


Basic discussions of music center on questions such as: What actually is music? How can we understand music? What is the effect of music on human beings? Music is described as multidimensional, and researchers have categorized it by its arousal properties (relaxing/calming vs. stimulating), emotional quality (happy, sad, peaceful), and structural features (e.g., tempo, tonality, pitch range, timbre, rhythmic structure) (Chanda and Levitin, 2013). One can ask how to recognize and describe what is concretely beautiful in music. Efforts have been undertaken to answer this question (Eggebrecht, 1991), e.g., by discussing the beauty of the opening theme of the second movement of Mozart's piano concerto in D minor (KV 466). In this formal attempt to transform music into a descriptive language, particular sequences of tones and rhythmical structures have been tentatively ascribed to notions such as “flattering” or “steady-firm” (Eggebrecht, 1991). From the viewpoint of a composer, Mozart himself was evidently aware of the attractiveness of this beauty component in music, stating that his compositions should be “…angenehm für die Ohren…” of the audience “…natürlich ohne in das Leere zu fallen…” (…pleasing for the ear… (of the audience) …naturally without falling into the shallow…) (see Eggebrecht, 1991). In modern and contemporary music, however, formal attempts at understanding are useless because form and self-containedness are missing (Zender, 2014). Thus, in atonality and in the emancipation of noise, a tonal center is absent; through the simultaneous appearance of different rhythmic sequences the regular meter is demolished; and in aleatory music the linear order of musical events is left open.

A few earlier comments on the understanding of the interplay between music and man may be quoted here: “…there is little to be gained by investigation of emotion in music when we have little idea about the true fundamental qualities of emotion” (Meyer, 1956 ). “…music is so individual that attempts to provide a systematic explanation of the interaction might well be ultimately fruitless—there may be no systematic explanation of what happens when individuals interact with music” (Waterman, 1996 ). “Die Qualitäten und die Inhalte ihrer (der Komponisten) Musik zu beschreiben ist unmöglich. Eben deshalb werden sie in Klang gefasst, weil sie sonst nicht erfahrbar sind” (To describe the qualities and content of their (of the composers) music is impossible. Exactly for this reason they are expressed in musical sound, otherwise they are not communicable) (Maurer, 2014 ). Some historical comments on music-evoked emotions are compiled in section Historical Comments on the Impact of Music on People of this study.

The advent of brain-imaging technology with high spatial resolution (see principles section Experimental Procedures for Tomographic Imaging of Emotion in the Brain) gave new impact to interdisciplinary experimental research in the field of music-evoked emotions from the physiological and molecular point of view. With the broader availability of magnetic resonance imaging (MRI, first demonstrated in 1973; Lauterbur, 1973 ) and positron emission tomography (PET, first demonstrated 1975; Ter-Pogossian, 1975 ) since about two decades for studying both music listeners and performing musicians, a wealth of music-evoked brain activation data has been accomplished which is discussed in section Experimental Results of Functional (tomographic) Brain Imaging (fMRI, PET) together with psychoendocrinological and molecular markers. Due to the refinement of the more phenomenological measuring techniques, such as electroencephalography (EEG) and magnetoencephalography [MEG, section Electro- and Magnetoencephalography (EEG, MEG)], skin conductance response and finger temperature measurements (section Skin Conductance Response (SCR) and Finger Temperature) as well as goose bump development (section Goose Bumps—Piloerection), emotions can be measured with high temporal resolution. Genetic studies of musical heredity are reported in section Is There a Biological Background for the Attractiveness of Music?—Genomic Studies and recent theoretical approaches of musical emotions in section Towards a Theory of Musical Emotions. Some therapeutic issues of music are discussed in section Musical Therapy for Psychiatric or Neurologic Impairments and Deficiencies in Music Perception prior to the remarks concluding this study with an outlook. A brief outline of the psychological discussion of music-evoked emotion is given in the online Supplementary Material section.

Historical comments on the impact of music on people

From antiquity to the nineteenth century, the effects of music on man were considered phenomenologically, mainly from the medical point of view, according to Kümmel (1977), who is the main source for the brief historical comments of the present section.

The only biblical example of a healing power of music refers to King Saul (~1,000 BC), who was tormented by an evil spirit and to whom relief came when David played the lyre (1 Sam. 16, 14-23). In antiquity, Pythagoras (~570-507 BC) was said to substantially affect the souls of people by diatonic, chromatic, or enharmonic tunes (see Kümmel, 1977). Plato (428-348 BC) in his Timaeus suggested for the structure of the soul the same proportions of the musical intervals that are characteristic of the trajectories of the celestial bodies (see Kümmel, 1977). This concept of a numerical order of music and its effect on man was transferred to the Middle Ages, e.g., by Boethius (480-525). The Greek physician Asklepiades (124-60 BC) was said to have used music as a remedy for mental illness, where the application of the Phrygian mode was considered to be particularly adequate for brightening up depressive patients. Boethius emphasized that music has to be correlated to the category of “moralitas” because of its strong effect on individuals. In his treatise De institutione musica he stated that “…music is so naturally united with us that we cannot be free from it even if we so desired….” From the ninth century onward, music held a strong position in the medicine of the Arabic world, and the musician was an assisting professional of the physician. According to Arabic physicians, music for therapeutic purposes should be “pleasant,” “dulcet,” “mild,” “lovely,” “charming,” and in the course of the assimilation of Arabic medicine, the Latin West took over the medical application of music. Johannes Tinctoris (1435-1511) listed 20 effects of music, e.g., that music banishes unhappiness, contributes to a cheerful mood, and cures diseases. In addition, music was supposed to delay aging processes. Agrippa von Nettesheim (1486-1535) was convinced that music can maintain physical health and instill moral behavior.
In his treatise De occulta philosophia (Agrippa von Nettesheim, 1992) he discusses the powerful and prodigious effects of music. From his list of 20 different musical effects, adapted to the sequence of effects established by Johannes Tinctoris (1435-1511) (Schipperges, 2003), a brief selection is presented here:

  • (1) Musica Deum delectat
  • (7) Musica tristitiam repellit
  • (13) Musica homines laetificat
  • (14) Musica aegrotos sanat
  • (17) Musica amorem allicit etc.

These effects can be rendered in present-day terms as religiosity (1), depression (7), joy (13), therapy (14), and sexuality (17).

Agrippa points out the alluring effects of music on unreasoning beasts: “…ipsas quoque bestias, serpentes, volucres, delphines, ad auditum suae modulationis provocat…magna vis est musica” (It stirs the very beasts, even serpents, birds and dolphins, to want to hear its melody…great is the power of music).

The physician of Arnstadt, Johann Wittich (1537-1598) summarized the requirement for good health concisely: “Das Hertz zu erfrewen/und allen Unmuht zu wenden/haben sonderliche große Krafft diese fünff Stück (To rejoice the heart/ and reverse all discontent/five things have particularly great power):

  • Gottes Wort (The word of God).
  • Ein gutes Gewissen (A clear conscience).
  • Die Musica (Music).
  • Ein guter Wein (good wine).
  • Ein vernünftig Weib (A sensible wife).”

René Descartes (1596-1650) formulated a fairly detailed view of the effects of music: the same music that stimulates some people to dance may move others to tears, depending exclusively on the thoughts that are aroused in our memory. In the medical encyclopedia of Bartolomeo Castelli of 1682 it is stated that music is effective both for curing diseases and for maintaining health. A famous historical example of a positive impact of music on mental disorders is the Spanish King Philip V (1683-1746) who, due to his severe depressions, stopped signing official documents and got up from his bed only briefly and only by night. In 1737, his wife Elisabeth Farnese (1692-1766, incidentally a descendant of Pope Paul III and Emperor Charles V) summoned the famous Italian castrato singer Carlo Broschi Farinelli (1705-1782) to Madrid. Over 10 years, Farinelli performed four arias every night (in total 3,600 times) in order to banish the black melancholia from the king's mind until the king himself “…die Musik lernet…” (…learns music…) (see Kümmel, 1977). With his singing, Farinelli succeeded in moving the king to partial fulfillment of his governmental duties and an occasional appearance in the governmental council. The king's favorite aria was Quell' usignolo, with a difficult coloratura part (see Figure 1), from Geminiano Giacomelli's (1692-1740) opera Merope (1734).

Figure 1. Extract from the aria Quell' usignolo of Geminiano Giacomelli's (1692-1740) opera Merope (1734) sung by Carlo Broschi Farinelli (1705-1782) for Philip V (1683-1746), king of Spain (Haböck, 1923). Reprinted with permission from Haböck (1923) © 1923 Universal Edition.

The widely known Goldberg Variations composed by J. S. Bach in 1740 may be considered therapeutic music, as reported by the Bach biographer J. N. Forkel (1749-1818). H. C. von Keyserlingk, a Russian diplomat, asked Bach for “…einige Clavierstücke für seinen Adlatus Johann Gottlieb Goldberg,…die so sanften und etwas munteren Charakters wären, daß er dadurch in seinen schlaflosen Nächten ein wenig aufgeheitert werden könnte…” (… a number of clavier pieces for his personal assistant J. G. Goldberg…which should be of such gentle and happy character that he be somewhat cheered in his sleepless nights…). Bach chose a set of variations because of the unchanged basic harmony, although he had initially regarded a piece of this kind as a thankless task (see Kümmel, 1977).

In 1745 the medicine professor E. A. Nicolai (1722-1802) of Jena University started to report on more physical observations: “… wenn man Musik höre richten sich die Haare …in die Höhe, das Blut bewegt sich von aussen nach innen, die äusseren Teile fangen an kalt zu werden, das Herz klopft geschwinder und man hohlt etwas langsamer und tiefer Athem” (…when one hears music the hair stands on end (see section Goose Bumps—Piloerection), the blood is withdrawn from the surface, the outer parts begin to cool, the heart beats faster, and one breathes somewhat slower and more deeply). The French Encyclopédie of 1765 listed the diseases for which music was to be employed therapeutically: Pathological anxieties, the bluster of mental patients, gout pain, melancholia, epilepsy, fever, and plague. The physician and composer F. A. Weber (1753-1806) of Heilbronn, Germany assessed in 1802 the health effects of music more reluctantly: “Nur in Übeln aus der Klasse der Nervenkrankheiten läßt sich von…der Musik etwas Gedeihliches erhoffen. Vollständige Impotenz ist durch Musik nicht heilbar…Allein als Erwärmungsmittel erkaltender ehelicher Zärtlichkeit mag Musik vieles leisten” (Only in afflictions of the class of nervous diseases can …something profitable be expected from music. Complete impotence is not curable by music. …But as a means of rekindling marital tenderness music may achieve considerable results). The French psychiatrist J. E. D. Esquirol (1772-1840, see Charland, 2010 ) started to perform numerous experiments with the application of music to single patients or to groups. He, however, stated that the effect of music was transient and disappeared when the music ended. 
This change of thinking is also visible in the essay by Eduard Hanslick (1825-1904) Vom musikalisch Schönen (1854): “Die körperliche Wirkung der Musik ist weder an sich so stark, noch so sicher, noch von psychischen und ästhetischen Voraussetzungen so unabhängig, noch endlich so willkürlich behandelbar, dass sie als wirkliches Heilmittel in Betracht kommen könnte” (The physical effect of music is as such neither sufficiently strong, consistent, free from psychic and aesthetic preconditions nor freely usable as to allow its use as a real medical treatment).

With the rise of the experimental techniques of natural sciences in the medicine of the late nineteenth century, the views, patterns, and notions as determined by musical harmony began to take a backseat. It should be mentioned here that skepticism with regard to the effects of music arose in early times. In the third century Quintus Serenus declared the banishing of fever by means of vocals as pure superstition. In 1650 Athanasius Kircher wrote: “Denn dass durch (die Musik) ein Schwindsüchtiger, ein Epileptiker oder ein Gicht-Fall…geheilt werden können, halte ich für unmöglich.” (For I hold it for impossible that a consumptive, an epileptic or a gout sufferer …could be cured by music).

The mechanisms of hearing

Sound waves are detected by the ear and converted into neural signals which are sent to the brain. The ear has three divisions: the external, the middle, and the inner ear (see Figure 2A). The sound waves vibrate the ear drum, which is connected to the ear bones (malleus, incus, and stapes) in the middle ear that mechanically carry the sound waves to the frequency-sensitive cochlea (35 mm in length, Figure 2B) with the basilar membrane in the inner ear. Here, making use of the cochlear hair cells (organ of Corti), the sound waves are converted into neural signals which are passed to the brain via the auditory nerve (Zenner, 1994). For each frequency, there is a region of maximum stimulation, or resonance region, on the basilar membrane. The spatial position x along the basilar membrane of the responding hair cells and the associated neurons determines the primary sensation of pitch. A change in frequency of a pure tone causes a shift of the position of the activated region. This shift is then interpreted as a change in pitch (see Roederer, 2008). Laser studies, among other techniques, allowed for a precise measurement of the movement of the basilar membrane (see Roederer, 2008).

Figure 2. (A) Anatomy of the ear. Reprinted with permission from William E. Brownell © 2016. (B) Components of the inner ear. Reprinted with permission from © 2016 Encyclopedia Britannica. (C) Confocal micrographs of rat auditory hair cells. Scale bar: 1 μm. The protein myosin XVa is localized to the stereocilia tips (Rzadzinska et al., 2004). Reprinted with permission from Rzadzinska et al. (2004) © 2016 Bechara Kachar.

The cochlear hair cells assist in relaying sound to the brain. The roughly 20,000 hair cells in the human ear are covered by stereocilia (see Figure 2C), giving them a hairy look. The stereocilia of the hair cell, which sits on the basilar membrane, are the primary structures used in sound transduction. With acoustic stimulation, the stereocilia bend, which causes a signal that goes to the auditory nerve (see Figure 2A) and eventually to the auditory cortex, allowing sound to be processed by the brain.

At the loudest sounds the bending amplitude of the stereocilia is about their diameter of 200 nm (a nanometer, nm, is a millionth of a mm), and at the auditory threshold the movement is about 1 nm, i.e., on the order of the diameter of small molecules (Fettiplace and Hackney, 2006) and close to the thermal equilibrium fluctuations of the Brownian motion in the surrounding lymphatic liquid (Roederer, 2008).

The bending of the stereocilia initiates an uptake of potassium ions (K⁺), which in turn opens voltage-dependent calcium ion (Ca²⁺) channels. This causes neurotransmitter release at the basal end of the hair cell, eliciting an action potential in the dendrites of the auditory nerve (Gray, 0000).

The action speed of the hair cells is incredibly high to satisfy the amazing demands for speed in the auditory system. Signal detection and amplification must be preferentially handled by processes occurring within one hair cell. The acoustic apparatus cannot afford the “leisurely pace” of the nervous system that works on a time scale of several milliseconds or more.

Specific experimental techniques for studying musical emotion and discussion of the results

Emotionally relevant musical stimuli.

Emotional relevance of music is ascribed, e.g., to enharmonic interchange, the entry of a singing voice, the climax of a crescendo, a descending fifth, or in general to musically unexpected material (Spitzer, 2003, 2014). Four musical parameters for the activation of emotions appear to be particularly prominent in the literature (Kreutz et al., 2012): musical tempo, consonance, timbre, and loudness. Musical tempo could influence cardiovascular dynamics. The category of consonance could be associated with activation in the paralimbic and cortical brain areas (Blood and Zatorre, 2001), whereas dissonances containing partials with non-integer (irrational) frequency ratios may give rise to a sensation of roughness. The loudness or the physical sound pressure seems to be of relevance to psychoneuroendocrinological responses to music. Thus, crescendo leads to specific modulation of cardiovascular activity (see Kreutz et al., 2012), such as musical expectancy and tension (Koelsch, 2014). Musical sounds are often structured in time, space, and intensity. Several structural factors in music give rise to musical tension: consonance or dissonance, loudness, pitch, and timbre can modulate tension. Sensory consonance and dissonance are already represented in the brainstem (Tramo et al., 2001) and modulate activity in the amygdala.

The stability of a musical structure also contributes to tension, such as a stable beat or its perturbation (for example, by an accelerando or a ritardando, syncopations, off-beat phrasings, etc.) (Koelsch, 2014). The stability of a tonal structure in tonal music also contributes to tension. Moving away from the tonal center creates tension and returning to it evokes relaxation. Figure 3 illustrates how the entropy of the frequency of occurrence of tones and chords determines the stability of a tonal structure and thus the ease, or the difficulty, of establishing a tonal center. Additionally, the extent of a structural context contributes to tension. Figure 3 shows the probabilities of certain chords following other chords in Bach chorales. The red bars indicate that after a dominant the next chord is most likely to be a tonic. The uncertainty of the predictions for the next chord (and thus the entropy of the probability distribution for the next chord) is low during the dominant, intermediate during the tonic, and relatively high during the submediant. Progressive tones and harmonies thus create an entropic flux that gives rise to constantly changing (un)certainties of predictions. The increasing complexity of regularities, and thus the increase of entropic flux, requires an increasing amount of knowledge about the musical regularities to make precise predictions about upcoming events. Tensions emerge from the suspense about whether a prediction proves true (Koelsch, 2014). Tension and release may be important for a religious chorale as metaphors for sin and redemption (Koelsch, 2014).
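The link between next-chord probabilities and predictive uncertainty can be made concrete with the Shannon entropy of each distribution. The following is a minimal sketch only: the probability values are hypothetical illustrations chosen to mimic the qualitative ordering described in the text (dominant low, tonic intermediate, submediant high) and are not read off Figure 3 or taken from Koelsch (2014); the function and variable names are likewise assumptions.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical distributions over five possible next chord functions,
# keyed by the current chord function (illustrative values only).
next_chord = {
    "V": [0.70, 0.10, 0.10, 0.05, 0.05],   # dominant: one strongly expected successor
    "I": [0.35, 0.25, 0.20, 0.10, 0.10],   # tonic: several plausible successors
    "vi": [0.20, 0.20, 0.20, 0.20, 0.20],  # submediant: maximally uncertain here
}

entropies = {chord: shannon_entropy(p) for chord, p in next_chord.items()}
for chord, h in entropies.items():
    print(f"after {chord}: {h:.2f} bits")
```

A peaked distribution, where one successor dominates, yields a low entropy and hence a confident prediction; a flat distribution yields the maximum entropy for the given number of alternatives.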

Figure 3. Context-dependent bigram probabilities for the corpus of Bach chorales. Blue bars show probabilities of chord functions following the tonic (I), green bars following the submediant (vi), and red bars following a dominant (V). The probability for, e.g., a tonic (I) following a dominant (V) is high, and the entropy is low (Koelsch, 2014). Reprinted with permission from Koelsch (2014) © 2014 Nature Publishing Group.

Tension can be further modulated by a structural breach. The emotional effects of the violations of predictions, which can be treated in analogy to the free energy of a system (Friston and Friston, 2013), include surprise. Irregular, unexpected chord functions evoke, along with ratings of felt tension, skin conductance responses and activity changes in the amygdala and the orbitofrontal cortex while listening to a piece of classical piano music (see Koelsch, 2014).

Anticipatory processes can also be evoked by structural cues, for example by a dominant in a Bach chorale, which has a high probability of being followed by a tonic (see Figure 3), or a dominant seventh chord, which has a high probability of being followed by a tonic, thus evoking the anticipation of release. Such anticipation of relaxation might involve dopaminergic activity in the dorsal striatum (Koelsch, 2014).

Another effect arising from music is emotional contagion. Music can trigger physiological processes that reflect emotion: “happy” music triggers the zygomatic muscle for smiling, together with an increase in skin conductance and breathing rate, whereas “sad” music activates the corrugator muscle. Interestingly, there seems to be an acoustic similarity between expression of emotion in Western music and affective prosody (see Koelsch, 2014).

Experimental procedures for tomographic imaging of emotion in the brain

Magnetic resonance imaging (MRI) and functional magnetic resonance imaging (fMRI)

Magnetic resonance imaging (see Reiser et al., 2008) can show anatomy and in some cases function (fMRI). Studies on the molecular level have been reported recently (Xue et al., 2013; Liu et al., 2014). In a magnetic resonance scanner (Figure 4A) the magnetic moments of the hydrogen nuclei (protons) are aligned (Figure 4A) by a strong external magnetic field (usually 1.5 Tesla) that is generated in a superconducting coil cooled by liquid helium. Magnetic resonance of the proton magnetic moments, a quantum mechanical phenomenon, can be initiated by exciting the proton spin system to precession resonance (Figure 4A) by means of radio-frequency (RF) pulses of some milliseconds duration. This gives rise to a voltage signal with the resonance frequency ω₀ (Larmor frequency) which decays with the relaxation times T1 (longitudinal or spin-lattice relaxation time) and T2 (transversal or spin-spin relaxation time), which are characteristic for different chemical surroundings (see Figure 4B).
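The Larmor frequency scales linearly with the applied field, ω₀ = γB₀, where γ is the gyromagnetic ratio of the nucleus. As a rough numerical illustration, the sketch below evaluates this relation for protons. The value of γ/2π (about 42.577 MHz per tesla) and the field strengths other than 1.5 T are not given in the text and are supplied here as assumptions.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per tesla
# (standard literature value, not stated in the text).
GAMMA_BAR_PROTON_MHZ_PER_T = 42.577


def larmor_frequency_mhz(b0_tesla):
    """Proton resonance frequency f0 = (gamma / 2 pi) * B0, in MHz."""
    return GAMMA_BAR_PROTON_MHZ_PER_T * b0_tesla


# 1.5 T is the field strength quoted in the text; 3 T and 7 T are
# added only for comparison.
for b0 in (1.5, 3.0, 7.0):
    print(f"B0 = {b0:.1f} T -> f0 = {larmor_frequency_mhz(b0):.2f} MHz")
```

At 1.5 T the proton resonance lies in the tens of MHz, which is why the excitation pulses are radio-frequency pulses.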

Figure 4. (A) Principles of magnetic resonance tomography (Birbaumer and Schmidt, 2010). (a) The patient is moved into the center of the MRI scanner. (b) A strong homogeneous magnetic field aligns the magnetic moments of the protons in the patient's body. (c) An RF pulse excites the proton magnetic moments to precession, which gives rise to an alternating voltage signal in the detector. (d) After the RF pulse is switched off, the proton magnetic moments relax to the initial orientation. The relaxation times (see B) are measured. Reprinted with permission from Birbaumer and Schmidt (2010) © 2010 Springer. (B) Nuclear magnetic relaxation times T1 (top) and T2 (bottom) of hydrogen nuclei for various biological materials (Schnier and Mehlhorn, 2013). Reprinted with permission from Schnier and Mehlhorn (2013) © 2013 Phywe Systeme. (C) Spatial encoding of the local magnetic resonance information (Birbaumer and Schmidt, 2010). Due to a slicing (left) and finally a three-dimensional structuring (right) by means of gradient fields, the resonance frequency and the relaxation times can be assigned to a particular pixel. Reprinted with permission from Birbaumer and Schmidt (2010) © 2010 Springer.

A necessary condition for image generation is exact information about the spatial origin of the magnetic resonance signal. This spatial information is generated by additional site-dependent magnetic fields, called magnetic field gradients, along the three spatial axes. Due to these field gradients, which are much smaller in magnitude than the homogeneous main field, the magnetic field differs slightly, in a grid-like pattern (see Figure 4C), from one volume element (voxel) to the next. As a consequence, the application of an RF pulse with the frequency ω' excites only the nuclear magnetic moment ensemble in voxels where the Larmor frequency ω0, given by the local magnetic field strength, matches the resonance condition. The signal intensity, which is determined by the number of nuclear spins and the relaxation times characteristic for the particular tissue (Figure 4B), is assigned in this spatial encoding procedure to an element (pixel) in the three-dimensional image. The MRI scanner (Figure 4A), comprising the homogeneous magnetic field, the RF systems, and the gradient fields, is controlled by a computer that includes fast Fourier-transform algorithms for frequency analysis.
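How a gradient field and a Fourier analysis together assign signal to position can be illustrated with a strongly simplified sketch. It models a one-dimensional frequency-encoded readout (a common textbook simplification, not the voxel-selective excitation described above, and not the scanner's actual reconstruction). The gradient strength, field of view, spin densities, and all function names are assumptions chosen for illustration.

```python
import cmath
import math

GAMMA_BAR_HZ_PER_T = 42.577e6  # proton gyromagnetic ratio / (2*pi), Hz per tesla


def offset_frequency_hz(gradient_t_per_m: float, x_m: float) -> float:
    """Shift of the Larmor frequency at position x caused by a linear gradient."""
    return GAMMA_BAR_HZ_PER_T * gradient_t_per_m * x_m


def simulate_readout(densities, positions_m, gradient_t_per_m, n_samples, dt_s):
    """Sum of the signals of all spin ensembles, seen in the rotating frame."""
    samples = []
    for n in range(n_samples):
        t = n * dt_s
        samples.append(sum(
            rho * cmath.exp(2j * math.pi * offset_frequency_hz(gradient_t_per_m, x) * t)
            for rho, x in zip(densities, positions_m)
        ))
    return samples


def dft_magnitudes(samples):
    """Plain discrete Fourier transform; bin m corresponds to position m * pixel."""
    n = len(samples)
    return [
        abs(sum(samples[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))) / n
        for m in range(n)
    ]


# Hypothetical acquisition parameters, chosen so positions fall on whole pixels.
GRADIENT = 0.01        # T/m
N_SAMPLES = 64
FIELD_OF_VIEW = 0.16   # m, so one pixel is 2.5 mm wide
DT = 1.0 / (GAMMA_BAR_HZ_PER_T * GRADIENT * FIELD_OF_VIEW)  # sampling interval, s

# Two spin ensembles with different densities at x = 1 cm and x = 3 cm.
signal = simulate_readout([1.0, 0.5], [0.01, 0.03], GRADIENT, N_SAMPLES, DT)
profile = dft_magnitudes(signal)
brightest = sorted(range(N_SAMPLES), key=lambda m: profile[m], reverse=True)[:2]
print(brightest)  # pixel indices of the two ensembles
```

Because the resonance frequency depends linearly on position, each frequency bin of the transform corresponds to one pixel, and the magnitude in that bin reflects the local spin density.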

Functional magnetic resonance imaging (fMRI) is based on the effect that activation of neurons, e.g., by musical stimuli, leads to an enrichment of oxygen (O2) in the form of oxyhemoglobin. This gives rise to an increase of the relaxation time T2 of the protons of this molecule (Birbaumer and Schmidt, 2010) and thereby to an enhancement of the magnetic resonance signal. This effect, which enables active brain areas to be imaged, is called the BOLD (blood oxygen level dependent) effect.
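Why a longer T2 increases the measured signal can be sketched under the assumption of a simple mono-exponential echo signal S = S0·exp(−TE/T2). The echo time and the two T2 values below are hypothetical placeholders, and the function names are illustrative only.

```python
import math


def echo_signal(s0: float, te_s: float, t2_s: float) -> float:
    """Signal remaining at echo time TE for a given transverse relaxation time."""
    return s0 * math.exp(-te_s / t2_s)


def relative_signal_change(te_s: float, t2_rest_s: float, t2_active_s: float) -> float:
    """Fractional signal change when T2 lengthens from rest to activation."""
    rest = echo_signal(1.0, te_s, t2_rest_s)
    active = echo_signal(1.0, te_s, t2_active_s)
    return (active - rest) / rest


# Assumed values: TE = 30 ms, T2 rising from 50 ms to 51 ms during activation.
change = relative_signal_change(0.030, 0.050, 0.051)
print(f"relative signal change: {100 * change:.1f} %")
```

With these assumed numbers a 2% increase in T2 yields a signal change of roughly one percent, which illustrates why BOLD effects are small and require statistical analysis over repeated measurements.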

By an increase of the magnetic field strength, the signal-to-noise ratio and thereby the spatial resolution can be enhanced.

Positron emission tomography (PET)

PET imaging is based on the annihilation of positrons with electrons of the body. The positrons are emitted from proton-rich radioactive atomic nuclei (see Table 1) which are embedded in specific biomolecules (Figure 5A). The positron-electron annihilation process gives rise to two high-energy (0.511 MeV) annihilation photons (Figure 5B) which can be monitored by radiation detectors around the body of the patient and thereby identify the site of the radioactive element. In a PET camera or PET scanner many detectors are implemented (Figure 5B), allowing for tomographic imaging with good spatial resolution of about 4 mm.
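The photon energy of 0.511 MeV follows from energy conservation: when a positron and an electron annihilate essentially at rest, their combined rest energy 2·m·c² is shared equally between the two photons, so each carries the rest energy of one electron. A minimal sketch using standard physical constants (the helper name is hypothetical):

```python
ELECTRON_MASS_KG = 9.1093837015e-31   # electron (and positron) rest mass
SPEED_OF_LIGHT_M_S = 299_792_458.0    # exact by definition
JOULE_PER_EV = 1.602176634e-19        # exact by definition


def rest_energy_mev(mass_kg: float) -> float:
    """Rest energy E = m * c^2, converted from joule to MeV."""
    return mass_kg * SPEED_OF_LIGHT_M_S ** 2 / JOULE_PER_EV / 1e6


# Each of the two annihilation photons carries one electron rest energy.
photon_energy_mev = rest_energy_mev(ELECTRON_MASS_KG)
print(f"{photon_energy_mev:.3f} MeV")
```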

Table 1. PET isotopes produced by high-energy protons in a cyclotron accelerator (see http://en.wikipedia.org/wiki/Positron_emission_tomography ; downloaded 22.12.14).

Figure 5. (A) Chemical formulae of two compounds doped with the positron emitters 18F (left; http://de.wikipedia.org/wiki/Fluordesoxyglucose ; 19.12.14) and 11C (right; http://www.ncbi.nlm.nih.gov/books/NBK23614/ ; 19.12.14) for PET scans. (B) Principles of positron emission tomography (PET). Left: A positron is emitted from a radioactive nucleus and annihilated with an electron of the tissue, emitting two collinear annihilation photons which are monitored by radiation detectors and checked for coincidence. Right: Multi-detector PET scanner taking images (slices) of the concentration of positron-emitting isotopes in the brain and thereby measuring the emotional activity of brain sections (Birbaumer and Schmidt, 2010). Reprinted with permission from Birbaumer and Schmidt (2010) © 2010 Springer.

Making use of fluorodeoxyglucose (18F-FDG) doped with the radioactive fluorine isotope 18F (Figure 5A), the local sugar metabolism in neurologically activated areas of the brain can be monitored (Figure 5B). After injection of 18F-FDG into a patient, a PET scanner (Figure 5B) can form a three-dimensional image of the 18F-FDG concentration in the body. For specifically probing molecular changes in postsynaptic monoamine receptors such as the dopamine receptor D2 and the serotonin receptor 5-HT2A, 11C-N-methyl-spiperone (11C-NMSP, Figure 5A) doped with the positron-emitting carbon isotope 11C can be used. It should be pointed out here that the combination of MRI and PET (Bailey et al., 2014) represents an innovative imaging modality.

Experimental results of functional (tomographic) brain imaging (fMRI, PET)

Movements during listening to music.

Music is a universal feature of human societies, partly owing to its power to evoke strong emotions and influence moods. Understanding of neural correlates of music-evoked emotions has been invaluable for the understanding of human emotions (Koelsch, 2014 ).

Functional neuroimaging studies on music and emotion, using fMRI and PET (see Figure 6A), show that music can modulate the activity in brain structures that are known to be crucially involved in emotion, such as the amygdala and the nucleus accumbens (NAc). The nucleus accumbens plays an important role in the mesolimbic system, generating pleasure, laughter, and reward but also fear, aggression, impulsivity, and addiction. The mesolimbic system is additionally intensely involved in emotional learning processes. In this system, drugs can effectuate the release of the neurotransmitter dopamine (Figure 6B). Neurotransmitters such as dopamine, serotonin, adrenaline, noradrenaline, or acetylcholine are biochemicals (see Figure 6B) which diffuse across a chemical synapse and bind to a postsynaptic receptor, opening a sodium ion (Na+) channel to transfer the excitation of a neuron to the neighboring neuron.

Figure 6. (A) Neural correlates of music-evoked emotions, shown by a meta-analysis of brain-imaging studies. A meta-analysis is a statistical analysis of a larger set of analyses of earlier data. The meta-analysis indicates clusters of activity derived from numerous studies (for references see Koelsch, 2014) in the amygdala (SF, LB), the hippocampal formation (a), the left caudate nucleus with a maximum in the nucleus accumbens (NAc, b), the pre-supplementary motor area (SMA), rostral cingulate zone (RCZ), orbitofrontal cortex (OFC), and mediodorsal thalamus (MD, c), as well as in auditory regions (Heschl's gyrus, HG) and the anterior superior temporal gyrus (aSTG, d). Additional limbic and paralimbic brain areas may contribute to music-evoked emotions. For details see Koelsch (2014). Reprinted with permission from Koelsch (2014) © 2014 Nature Publishing Group. (B) Structural formula of dopamine ( http://de.wikipedia.org/wiki/Dopamin ; downloaded 19.12.14).

A meta-analysis of functional neuroimaging studies (fMRI, PET) of music-evoked emotions is shown in Figure 6A, including studies of music of intense pleasure, consonant or dissonant music, happy or sad music, joy- or fear-evoking music, muzak, expectancy violations, and music-evoked tension (for references see Koelsch, 2014).

In response to music, changes of the activity of the amygdala, the hippocampus, the right central striatum, the auditory cortex, the pre-supplementary motor area, the cingulate cortex, and the orbitofrontal cortex are observed (Figure 6A). In the following, the role of the amygdala, the nucleus accumbens, and the hippocampus in music-evoked emotion is discussed in more detail.

The amygdala is central in the emotion network and can regulate and modulate this network. It processes emotions such as happiness, anxiety, anger, and annoyance and, additionally, assesses the impression of facial expressions and thereby contributes to communication, social behavior, and memory (Kraus and Canlon, 2012). Moreover, it releases a number of neurotransmitters such as dopamine and serotonin, and effectuates reflexes such as being scared (Kraus and Canlon, 2012). The amygdala receives input from the central auditory system (Kraus and Canlon, 2012) and the sensory systems, and its pathways to the hypothalamus affect the sympathetic neuronal system for the release of hormones via the hypothalamus-pituitary-adrenal (HPA) axis but also the parasympathetic neuronal system (Kraus and Canlon, 2012). The hormone cortisol and the neuropeptide endorphin were observed in musical tasks as early as 20 years ago (see Kreutz et al., 2012).

Fear conditioning is mediated by synaptic plasticity in the amygdala (Koelsch et al., 2006). It may affect the auditory cortex and its plasticity (learning) by a thalamus-amygdala-collicular feedback circuit (Figure 7A). Neuronal pathways between the hippocampus and the amygdala allow for a direct interaction of emotion and declarative (verbally describable) memory and vice versa (Koelsch et al., 2006).

Figure 7. (A) Main pathways underlying autonomic and muscular responses to music. The auditory cortex (AC) also projects to the orbitofrontal cortex (OFC) and the cingulate cortex (projections not shown). Moreover, the amygdala (AMYG), the OFC, and the cingulate cortex send numerous projections to the hypothalamus (not shown) and thus also exert influence on the endocrine system. ACC, anterior cingulate cortex; CN, cochlear nuclei; IC, inferior colliculus; M1, primary motor cortex; MCC, middle cingulate cortex; MGB, medial geniculate body; NAc, nucleus accumbens; PMC, premotor cortex; RCZ, rostral cingulate zone; VN, vestibular nuclei (Koelsch, 2014). Reprinted with permission from Koelsch (2014) © 2014 Nature Publishing Group. (B) Hippocampus. Reprinted with permission from Annie Krusznis © 2016.

The superficial amygdala is sensitive to faces, sounds, and music that is perceived as pleasant or joyful. Functional connections between the superficial amygdala, the nucleus accumbens (Figure 7A), and the mediodorsal thalamus are stronger during joy-evoking music than during fear-evoking music. The laterobasal amygdala shows activity changes during joyful or sad music. The connection of the amygdala to the hypothalamus affects the sympathetic neuronal system for the release of corticosteroid hormones via the HPA axis and also affects the parasympathetic neural system (Kraus and Canlon, 2012). An fMRI study (Koelsch et al., 2006) revealed music-induced activity changes in the amygdala, the ventral striatum, and the hippocampal formation without the experience of “chills.” The study compared the brain responses to joyful dance tunes by A. Dvorak and J. S. Bach (Figure 8) played by professional musicians with the responses to electronically manipulated dissonant (unpleasant) variations of these tunes. Unpleasant music induced increases of the blood-oxygen-level dependent (BOLD) signals in the amygdala and the hippocampus, whereas pleasant music gave rise to BOLD decreases in these structures. In a PET experiment (Blood and Zatorre, 2001) the participants' favorite CD music was used in order to induce “chills” or “shivers down the spine.” With increasing chill intensity, increases of blood flow were observed in brain regions ascribed to reward and emotion such as the nucleus accumbens (NAc), the anterior cingulate cortex (ACC), and the orbitofrontal cortex (see Figure 7A). Decreases of the blood flow with increasing chill intensity were observed in the amygdala and the anterior hippocampal formation.

Figure 8. Joyful instrumental dance tunes of major-minor tonal music by Dvorak (1955) and Bach (1967), taken from commercially available CDs and used as pleasant stimuli in Koelsch et al. (2006). Reprinted with permission from Bach (1967) © 1967 Bärenreiter.

These observations demonstrated the modulation of the activities of the brain core structures ascribed to emotion processing by music. Furthermore, they gave direct support to the phenomenological efforts in music-therapeutic approaches for the treatment of disorders such as depression and anxiety because these disorders are partly ascribed to dysfunctions of the amygdala and presumably of the hippocampus (Koelsch and Stegemann, 2012 ) (see section Musical Therapy for Psychiatric or Neurologic Impairments and Deficiencies in Music Perception).

Nucleus accumbens (NAc)

The activities observed by functional neuroimaging in this brain section (see Figure 7A) are initiated by “musical frissons,” involving experiences of shivers or goose bumps. This brain section is sensitive to primary rewards (food, drinks, or sex), to consuming the rewards, and to addiction. This shows that music-evoked pleasure is associated with the activation of a phylogenetically old reward network that functions to ensure the survival of the individual and the species. The network seems to be functionally connected with the auditory cortex: while listening to music, the functional connectivity between the nucleus accumbens and the auditory cortex predicts whether individuals will decide to buy a song (Salimpoor et al., 2013).

A PET study on musical frissons (Blood and Zatorre, 2001 ) making use of the radioactive marker 11 C-raclopride to measure the release of the neurotransmitter dopamine at synapses indicated that neural activity in the ventral and dorsal striatum involves increased dopamine availability, probably released by dopaminergic neurons in the ventral tegmental area (VTA). This indicates that music-evoked pleasure is associated with activation of the mesolimbic dopaminergic reward pathway.


A number of studies on music-evoked emotions have reported activity changes in the hippocampus (see Figure 7B), in striking contrast to monetary or erotic rewards, which do not activate the hippocampus (see Koelsch, 2014). This suggests that music-evoked emotions are not related to reward alone. Hippocampal activity was associated in some studies with music-evoked tenderness, peacefulness, joy, frissons, or sadness, i.e., with both positive and negative emotions (for references see Koelsch, 2014). There is mounting evidence that the hippocampus is involved in emotion due to its role in the hypothalamus-pituitary-adrenal (HPA) axis stress response. The hippocampus appears to be involved in music-evoked positive emotions that have endocrine effects (see section Psychoneuroendocrinology—Neuroendocrine and Immunological Markers) associated with a reduction of emotional stress, effectuated by a lowering of the level of cortisol (C21H30O5), which controls the carbohydrate, fat, and protein metabolisms.

Another emotional function of the hippocampus in humans, beyond stress regulation, is the formation and maintenance of social attachments such as love. The evocation of attachment-related neurological activities by music appears to confirm the phenomenologically observed social functions of music, namely establishing, maintaining, and strengthening social attachments. In this sense, music is directly related to the fulfillment of basic human needs, such as contact and communication, social cohesion, and attachment (Koelsch, 2014). Some researchers even speculate that the strengthening of inter-individual attachments could have been an important adaptive function of music in the evolution of humans (Koelsch, 2014).

The prominent task of the hippocampal-auditory system is long-term auditory memory. Retrieval from musical memory activates the hippocampus predominantly in the right hemisphere (Watanabe et al., 2008). Due to its projections to the amygdala, the hippocampus is also involved in the emotional processing of music (Mitterschiffthaler et al., 2007). fMRI studies show an activation of the right hippocampus and the amygdala by sad music but not by happy or neutral music (Koelsch et al., 2006). Functional neuroimaging studies have investigated how music influences and interacts with the processing of visual information (see Koelsch, 2014). These studies show that a combination of films or images with music expressing joy, fear, or surprise increases BOLD responses in the amygdala or the hippocampus (see Koelsch, 2014).

The hippocampus receives projections from the frontal, temporal, and parietal lobes, as well as from the parahippocampal and the perirhinal cortices. The amygdala can modify the information storage processes of the hippocampus but, inversely, the reactions generated in the amygdala by external stimuli can be influenced by the hippocampus. These synergetic effects can contribute to the long-term storage of emotional events which is supported by the plasticity of the two units, enabling the acquisition of experience.

The degree of overlap between music-evoked emotions and so-called everyday emotions remains to be specified. Some musical emotions, such as surprise or joy, may also appear in everyday life. Some emotions, such as transcendence or wonder, are sought in music because they might be rare in everyday life, and some so-called moral emotions of everyday life, such as shame or guilt, are lacking in music (Koelsch, 2014).

The molecular level of music-evoked neural processes can be accessed by making use of PET scans employing biomolecules doped with radioactive positron emitters. By using 11C-N-methyl-spiperone (11C-NMSP, see Figure 5A) as an antagonist binding the postsynaptic dopamine receptor 2 (D2) and the serotonin receptor 5-hydroxytryptamine 2A (5-HT2A, see Figure 9A), acute changes of these neurotransmitter receptors in response to frightening music could be demonstrated (Zhang et al., 2012). Thus, the binding of 11C-NMSP directly reflects the postsynaptic receptor level. Because the antagonist 11C-NMSP binds predominantly D2 in the striatum and 5-HT2A in the cortex, it can be used to map these receptors directly and simultaneously in the same individual (Watanabe, 2012). It is hypothesized (Zhang et al., 2012) that emotional processing of fear is mediated by the D2 and the 5-HT2A receptors. Frightening music is reported (Zhang et al., 2012) to rapidly arouse emotions in listeners that mimic those from actual life-threatening experiences.

Figure 9. (A) 5-hydroxytryptamine (serotonin) receptor 2A (5-HT2A), G protein coupled; diameter of the protein alpha-helix ~0.5 nm ( https://en.wikipedia.org/wiki/5-HT2A_receptor ; downloaded 4.10.2016). (B) PET images showing decrease in 11C-NMSP binding clusters (arrows) in a subject listening to frightening music: right caudate head, right frontal subgyral region, and right anterior cingulate (A); left lateral globus pallidus and left caudate body (B); right anterior cingulate (C); and right superior temporal gyrus, right claustrum, and right amygdala (D) (Zhang et al., 2012). Reprinted with permission from Zhang et al. (2012) © 2012 SNMMI. (C) PET images showing increase in 11C-NMSP binding clusters (arrows) in a subject listening to frightening music: right frontal lobe and middle frontal gyrus (A); right fusiform gyrus and right middle occipital gyrus (B); right superior occipital gyrus, right middle occipital gyrus (C); and left middle temporal gyrus (D) (Zhang et al., 2012). Reprinted with permission from Zhang et al. (2012) © 2012 SNMMI.

However, studies of the underlying mechanisms for perceiving danger created by music are limited. The musical stimulus in the investigations on frightening music (Zhang et al., 2012) discussed here was selected from the Japanese horror film Ju-On, which is widely accepted as one of the scariest and most influential movies ever made (Shimizu, 2004). The film music (see The Grudge theme song https://www.youtube.com/watch?v=1dqjXyIu02s ) was composed by Shiro Sato.

For the PET scans (see Figures 9B,C) 11C-NMSP activities of 740 MBq (20 mCi) were used. During frightening music, significant decreases in 11C-NMSP binding were observed in the limbic and paralimbic brain regions in four clusters (Figure 9B): in the right caudate head, the right frontal subgyral region, and the right anterior cingulate region (A); the left lateral globus pallidus and left caudate body (B); the right anterior cingulate region (C); and the right superior temporal gyrus, right claustrum, and right amygdala (D). Increased 11C-NMSP accumulation (Figure 9C) was found in the cerebral cortex: in the right frontal lobe and the middle frontal gyrus (A); the right fusiform gyrus and the right middle occipital gyrus (B); the right superior occipital gyrus, the right middle occipital gyrus, and the superior occipital gyrus (C); and the left middle temporal gyrus (D).
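The quoted activity can be checked with a short sketch: 1 Ci is defined as 3.7 × 10^10 Bq, so 740 MBq corresponds to 20 mCi. The sketch also shows how quickly such an activity decays. The half-lives used are approximate standard literature values for 11C and 18F (an assumption here, since the table of isotopes is not reproduced in this text), and the function names and elapsed times are hypothetical.

```python
BQ_PER_CI = 3.7e10  # definition of the curie

# Approximate half-lives in minutes (standard literature values, assumed here).
HALF_LIFE_MIN = {"11C": 20.4, "18F": 109.8}


def mbq_to_mci(activity_mbq: float) -> float:
    """Convert an activity from megabecquerel to millicurie."""
    return activity_mbq * 1e6 / BQ_PER_CI * 1e3


def remaining_activity_mbq(initial_mbq: float, isotope: str, elapsed_min: float) -> float:
    """Activity left after elapsed_min, from the exponential decay law."""
    return initial_mbq * 0.5 ** (elapsed_min / HALF_LIFE_MIN[isotope])


print(f"{mbq_to_mci(740):.1f} mCi")
# Hypothetical example: activity of an 11C tracer one hour after preparation.
print(f"{remaining_activity_mbq(740, '11C', 60):.0f} MBq")
```

The short half-life of 11C is the reason such tracers have to be produced in a cyclotron close to the scanner and used promptly.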

The decrease of 11C-NMSP binding in the caudate nucleus in response to frightening music indicates that frightening music triggers a downregulation of postsynaptic D2 receptors. This suggests that the caudate nucleus is involved in a wide range of emotional processes evoked by music (Zhang et al., 2012). The finding that the 11C-NMSP binding decreases significantly (Figure 9B) during frightening music demonstrates the musical triggering of the monoamine receptors in the amygdala. It is assumed (Zhang et al., 2012) that changes of 11C-NMSP binding (Figures 9B,C) mainly reflect 5-HT2A levels in the cortex, where 5-HT2A overdensity is thought to be involved in the pathogenesis of depression (Eison and Mullins, 1996).

It should be additionally pointed out that the 11 C-NMSP PET study (Zhang et al., 2012 ) found the right hemisphere to have superiority in the processing of auditory stimuli and the defense reaction.

Movements of performing musicians

Brain activation of professional classical singers has been monitored by fMRI during overt singing and imagined singing of an Italian aria (Kleber et al., 2007). Overt singing (Figure 10A) involved bilateral primary (A1) and secondary sensorimotor areas (SMA) and auditory cortices with Broca's and Wernicke's areas but also areas associated with speech and language.

Figure 10. (A) Overt singing. The activation maps show activations of the bilateral sensorimotor cortex and the cerebellum, the bilateral auditory cortex, Broca's and Wernicke's areas, medulla, thalamus, and ventral striatum; the ACC and the insula were also activated. Coordinates of cuts are given above each slice (Kleber et al., 2007). Reprinted with permission from Kleber et al. (2007) © 2007 Elsevier. (B) Mental rehearsal of singing (imaginary singing). Activation of typical imagery regions such as sensorimotor areas (SMA), premotor cortex areas, thalamus, basal ganglia, and cerebellum. Areas processing emotions showed intense activation (ACC and insula, hippocampus, amygdala, and ventrolateral prefrontal cortex). Coordinates of cuts are given above each slice (Kleber et al., 2007). Reprinted with permission from Kleber et al. (2007) © 2007 Elsevier.

Activation in the gyri of Heschl occurred in both hemispheres, together with the subcortical motor areas (cerebellum, thalamus, medulla and basal ganglia) and slight activation in areas of emotional processing (anterior cingulate cortex, anterior insula). Imagined singing (Figure 10B ) effectuated cerebral activation centered in fronto-parietal areas and bilateral primary and secondary sensorimotor areas. No activation was found in the primary auditory cortex or in the auditory belt area. Regions processing emotion showed intense activation (anterior cingulate cortex—ACC, insula, hippocampus, and amygdala).

Performing music in one's mind is a technique commonly used by professional musicians to rehearse. Composers write music regardless of the presence of a musical instrument, as, e.g., Mozart or Schubert did (see Kleber et al., 2007). Singing of classical music involves technical-motor and emotional engagement in order to communicate artistic, emotional, and semantic aspects of the song. A tight regulation of pitch, meter, and rhythm as well as an increased sound intensity and vocal range, vibrato, and a dramatic expression of emotion are indispensable. Motor aspects of these requirements are reflected in a fine laryngeal motor control and a high involvement of the thoracic muscles during singing. The aria used in this study (Kleber et al., 2007) comprises text, rhythm, and melody, which makes the bilateral activation of A1 plausible.

For the study of music-evoked emotions during performing in the fMRI scanner, the bel canto aria Caro mio ben by Tommaso Giordani (1730-1806) was used (Kleber et al., 2007).

Interestingly, most areas involved in motor processing were activated both during overt singing and imaginary singing, a finding that may demonstrate the significance of imagined rehearsal. The basal ganglia, which were active in both overt and imaginary singing, may be involved in the modulation of the voice. Of the areas processing emotions, the overt singing task activated only the ACC and the insula, both of which were also activated during imaginary singing. The ACC is involved in the recall of emotions (Kleber et al., 2007), a capability which is important for both overt and imaginary performance. The activation of the insula seems to reflect the intensity of the emotion. The amygdala, which was only activated by imagined singing, is known to be involved in passive avoidance or approach tasks. This is reported (Kleber et al., 2007) to be consistent with the observation that the amygdala was not active during overt singing. Imagined singing activated a large fronto-parietal network, indicating increased involvement of working memory processes during mental imagery, which in turn may indicate that imagined singing is less automatized than overt singing (Kleber et al., 2007). Areas processing emotions also showed enhanced activation during imagined singing, which may reflect increased emotional recall during this task.

An overview of the sensory-motor control of the singing voice has been given based on fMRI research of somatosensory and auditory feedback processing during singing in comparison to theoretical models (Zarate, 2013 ).

Movement organization that enables skilled piano performance has been recently reviewed, including the advances in diagnosis and therapy of movement disorders (Furuya and Altenmüller, 2013 ).

Psychoneuroendocrinology—neuroendocrine and immunological markers

Psychoneuroendocrinology (PNE) aims at the study of musical experiences leading to hormonal changes in the brain and the body. These effects may be similar to those effectuated by pharmacological substances. In addition to investigating psychiatric illnesses and syndromes, PNE investigates more positive experiences such as the neurobiology of love (see Kreutz et al., 2012). In contrast to the neuronal system, which transmits its messages by electrical signals, the endocrine system makes use of biomolecules such as hormones in order to communicate with the target organs, which are equipped with specific receptors for these hormones (see Birbaumer and Schmidt, 2010).

For considering the neuroendocrine and immunological molecular markers which could be released during music-evoked emotion, the three interrelated systems regulating hormonal stress responses should be briefly introduced:

The hypothalamic-pituitary-adrenocortical axis (HPA). This axis is initiated by a stimulus in the brain area of the hypothalamus giving rise to the release of the corticotropin releasing factor (CRF) which in turn leads to the release of adrenocorticotropic hormone (ACTH) and beta-endorphin from the pituitary into the circulation. ACTH then stimulates the synthesis and release of cortisol and of testosterone from the adrenal cortex.

Beta-endorphin (see Figure 11) is a hormone for which increased concentration levels are associated with situational stress. Delivering special relaxation music to coronary patients leads to a significant decrease of the beta-endorphin concentration with a simultaneous reduction of blood pressure, anxiety, and worry. Music therapy can also be effective before and during surgeries in operating theaters, again due to a reduction of the beta-endorphin level (see Kreutz et al., 2012).

Figure 11. Neuroendocrine and immunological molecular markers released during music-evoked emotion (see Kreutz et al., 2012). The molecular masses are given in kDa (1 kDa = 1.66 × 10^−24 kg). http://en.wikipedia.org/wiki/Beta-endorphin#mediaviewer/File:Betaendorphin.png ; http://de.wikipedia.org/wiki/Cortisol ; http://de.wikipedia.org/wiki/Testosteron ; http://de.wikipedia.org/wiki/Prolaktin ; http://de.wikipedia.org/wiki/Oxytocin ; http://en.wikipedia.org/wiki/Immunoglobulin_A ; downloaded 20.12.2014.

Cortisol (see Figure 11) is a hormone for which high concentration levels are associated with psychological and physiological stress. Listening to classical choral, meditative, or folk music significantly reduces the cortisol level; increases, however, have been detected for listeners exposed to Techno (see Kreutz et al., 2012). Individual differences were evidenced in listening experiments in which music students responded with increases and biology students with decreases of the cortisol level. Changes of the cortisol concentration can also be induced by active singing. In a clinical context, exposure to music has been shown to reduce cortisol levels during medical treatment. In gender studies, cortisol reductions were found in females, in contrast to males, who exhibited increases. Little is known about the sustainability of these effects over a longer period of time (see Kreutz et al., 2012).

Testosterone (see Figure 11), a sex hormone, appears to be of particular relevance to music. Darwin (1871; see Kreutz et al., 2012) suggested that music originated from sexual selection. Female composers showed above-average and male composers below-average testosterone levels, which has initiated discussions as to whether physiologically androgynous individuals show a higher level of creativity.

Secretory immunoglobulin A (sIgA; see Figure 11) is an antibody considered as a molecular marker of the local immune system in the respiratory tract and as a first line of defense against bacterial and viral infections. High levels of sIgA may exert positive effects and low levels may be characteristic for chronic stress. Significant increases of sIgA concentrations were observed in response to listening to relaxation music or muzak. Increases of the sIgA concentration were observed from rehearsal to public performance of choral singers (Kreutz et al., 2012).

Another study investigated the concentration of prolactin (see Figure 11) while listening to music of Hans-Werner Henze. The concentration of prolactin, which is a hormone with important regulatory functions during pregnancy, decreased in response to Henze (Kreutz et al., 2012).

In summary, the neuroendocrine changes reflecting the psychophysiological processes in response to music appear to be complex but might promise favorable effects with respect to health implications, deserving enhanced research activities.

The sympatho-adrenomedullary system is part of the sympathetic nervous system executing fight-or-flight responses. By, e.g., stress activation, norepinephrine is released. Sympathetic innervation of the medulla of the adrenal glands gives rise to the secretion of the catecholamines (dopamine, epinephrine, norepinephrine). Since this works by nervous operation of the adrenal gland, it responds much faster than the HPA axis, which is regulated by hormonal processes.

The endogenous opioid system is related to the HPA axis and can influence the ACTH and cortisol levels in the blood (see Kreutz et al., 2012). None of these three responses is specific to one kind of challenge and the response delays vary a great deal.

There is an increasing interest in PNE research for studying musical behavior due to the increasing specificity of neuroendocrinological research technologies. It is likely that musical behaviors significantly influence neurotransmitter processes.

Whether music processing can be associated with the processing of, e.g., linguistic sound is a matter of debate (Kreutz et al., 2012). However, functional brain imaging studies suggest that the perception of singing differs from the perception of speech, since singing evokes stronger activation in subcortical regions associated with emotional processing (see Kreutz et al., 2012).

Experiments are suggested (Chanda and Levitin, 2013) that aim to uncover the connection between music and the neurochemical changes in the following health domains

  • Reward, motivation, and pleasure,
  • Stress and arousal,
  • Immunity, and
  • Social affiliation,

and the neurochemical systems

  • Dopamine and opioids,
  • Cortisol and adrenocorticotropic hormone (ACTH),
  • Serotonin, and
  • Oxytocin, the “love” drug (see Figure 11).

Electro- and magnetoencephalography (EEG, MEG)

Electroencephalography (EEG) and event-related brain potentials (ERP)

This technique yields valuable information on the brain-behavior relationship on much shorter time scales (ms) than tomography, although with limited spatial information.

Electrical potentials are measured with an array of voltage probes on the scalp. The EEG arises from electrical potential oscillations in the brain, i.e., from excitatory postsynaptic potentials. Cortical afferents from the thalamus activate the apical dendrites (see Figure 12). Compensating extracellular electrical currents (Figure 12) generate measurable potentials on the scalp with characteristic oscillations in the frequency range of about 4–15 Hz (Birbaumer and Schmidt, 2010). Event-related brain potentials (ERPs) are of particular interest in the present context of music-evoked emotions (Neuhaus, 2013). By synchronized averaging of many measurements, the ERPs are extracted from noise and show a sequence of characteristic components that can be ascribed to separate phases of cognitive processing. Slow negative potentials (100–600 ms) are thought to be generated by cortical cholinergic synapses with high synchronization of pulses at the apical dendrites (see Figure 12). Positive potentials may be due to a decrease in the synchronization of the thalamic activity (Birbaumer and Schmidt, 2010).
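The synchronized averaging described above can be illustrated with a minimal sketch that uses simulated data rather than real EEG recordings. The function names, the Gaussian-shaped negative deflection, and the noise level are hypothetical choices made for illustration only; the sketch merely shows how averaging time-locked epochs suppresses noise that is uncorrelated with the stimulus.

```python
import math
import random


def average_epochs(epochs):
    """Average stimulus-locked epochs sample by sample."""
    n_epochs = len(epochs)
    n_samples = len(epochs[0])
    return [sum(epoch[i] for epoch in epochs) / n_epochs for i in range(n_samples)]


def simulate_epoch(rng, n_samples=100, noise_sd=5.0):
    """One simulated trial: a small evoked deflection buried in noise.

    Assumption: the evoked component is a negative Gaussian bump that
    peaks at sample 40 with amplitude -2 (arbitrary units).
    """
    evoked = [-2.0 * math.exp(-((i - 40) ** 2) / (2 * 8.0 ** 2)) for i in range(n_samples)]
    return [value + rng.gauss(0.0, noise_sd) for value in evoked]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
epochs = [simulate_epoch(rng) for _ in range(400)]

erp = average_epochs(epochs)
# Sample index of the most negative point of the averaged waveform.
peak_index = min(range(len(erp)), key=lambda i: erp[i])
```

In a single simulated trial the deflection is invisible under the noise, whereas in the average of 400 trials the residual noise is reduced by roughly a factor of 1/√400 = 1/20, so the negative component emerges near its true latency.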

Figure 12. Negative surface slow brain potentials on the scalp are generated by extracellular currents (red dashed arrows) that arise from the electrical activation of apical dendrites by thalamocortical afferents (Birbaumer and Schmidt, 2010). Reprinted with permission from Birbaumer and Schmidt (2010) © 2010 Springer.

The interpretation of single ERP components as correlates of the processing of specific information is still at a phenomenological stage. Components up to 300 ms are ascribed to unconscious (autonomous) processing, whereas changes of consciousness can be attributed to components at 300 ms and later (Birbaumer and Schmidt, 2010).

An impressive neurocognitive approach to musical form perception has recently been presented in an ERP study (Neuhaus, 2013). The study investigates listeners' chunking abilities for two eight-measure theme types, AABB and ABAB, with respect to pattern similarity (AA) and pattern contrast (AB). The experiments used a theme type of eight measures in length (2+2+2+2) that is often found in the Classical and Romantic periods. In addition to behavioral ratings, ERP measurements were performed while non-musicians listened. The advantage of ERP over the more direct neuroimaging techniques such as PET and fMRI is its good time resolution, in the range of about 10 ms.

The experiments were performed on 20 students without musical training. The tunes were presented in various transpositions so that tonality does not have to be considered as an independent parameter. Each melody of the AABB or ABAB form types used the harmonic scheme tonic–dominant–tonic. The melodies, with an average duration of 10.8 s and a form-part length of 2.7 s, were presented from a programmable keyboard at a tempo of 102.4 BPM. Brain activity was measured with 59 Ag/AgCl electrodes with an impedance below 5 Ω.

In the behavioral studies, the sequence ABAB was more often assessed as non-sequential than the sequence AABB. The tendency to recognize form parts as chunks was high when two aspects coincided: rhythmic contrast between A and B and an upward-downward melodic contour.

In the grand-average ERPs, an anterior negative shift (N300) was observed for immediate AA sequences as well as for non-immediate repetitions (ABA or ABAB) of similar form parts, suggesting pattern matching at phrase onsets based on rhythmical similarity. The most interesting feature of the grand average is the negative shift in the time range 300–600 ms with a maximum in the fronto-central brain. This is ascribed to the recognition of pattern similarity at phrase onsets with exactly the same rhythmical structure. The maximum amplitudes measured in the frontal parts of the brain suggest that non-expert listeners use frontal working memory for musical pattern recognition.

Magnetoencephalography (MEG)

Weak magnetic fields that can be detected on the scalp are generated by the electrical currents in the brain (Figure 13A). By measuring these magnetic fields with a highly sensitive detector (Figure 13B), a tomographic image (MEG) of the brain activity can be reconstructed. The brain comprises about 2 × 10¹⁰ cells and about 10¹⁴ synapses. The dendritic current in a cell generally flows perpendicular to the cortex (see Figure 13A). In a sulcus, this gives rise to a magnetic field parallel to the scalp, which is suggested to be detectable outside the head when about 100,000 cells contribute, e.g., in the auditory cortex, with a spatial resolution of about 2–3 mm (Vrba and Robinson, 2001).

Figure 13. (A) Origin of the MEG signal. (a) Coronal section of the human brain with the cortex in dark color. The electrical currents flow roughly perpendicular to the cortex. (b) In the convoluted cortex with its sulci and gyri, the currents flow either tangentially (c) or radially (d) in the head. (e) The magnetic fields generated by the tangential currents can be detected outside the head (Vrba and Robinson, 2001). Reprinted with permission from Vrba and Robinson (2001) © 2001 Elsevier. (B) (a) Magnetoencephalography facility containing 150 magnetic field sensors. (b) SQUIDs (superconducting quantum interference devices) and sensors immersed for cooling in liquid helium contained in a Dewar vessel (cross section) (Birbaumer and Schmidt, 2010). Reprinted with permission from Birbaumer and Schmidt (2010) © 2010 Springer. (C) Cortical stimulation by pure and piano tones. Left: Medial–lateral coordinates are shown for single equivalent current dipoles fitted to the field patterns evoked by pure sine tones and piano tones in control subjects. The inset defines the coordinate system of the head. Right: Equivalent current dipoles (ECD) shift toward the sagittal midline along the medial–lateral coordinate as a function of the frequency of the tone. Ant–post, anterior–posterior; med–lat, medial–lateral; inf–sup, inferior–superior (Pantev et al., 1998). Reprinted with permission from Pantev et al. (1998) © 2001 Nature Publishing Group.

The brain's magnetic fields (10⁻¹³ T) are much smaller than the earth's magnetic field (6.5 × 10⁻⁵ T) and much smaller than the urban magnetic noise (10⁻⁶ T) (Vrba and Robinson, 2001). The only detectors resolving these small fields are superconducting quantum interference devices (SQUIDs) based on the Josephson effect (see Figure 13B). The SQUIDs are coupled to the brain's magnetic fields using combinations of superconducting coils called flux transformers (primary sensors, see Figure 13B).
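To make the size of this gap concrete, the following short calculation expresses the field strengths quoted above as orders of magnitude. It is only a worked piece of arithmetic on the three values given in the text; the variable and function names are illustrative.

```python
import math

# Field strengths in tesla, as quoted in the text (Vrba and Robinson, 2001).
brain_field_t = 1e-13
earth_field_t = 6.5e-5
urban_noise_t = 1e-6


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio between two field strengths."""
    return math.log10(larger / smaller)


earth_vs_brain = orders_of_magnitude(earth_field_t, brain_field_t)
noise_vs_brain = orders_of_magnitude(urban_noise_t, brain_field_t)
```

With these figures, urban magnetic noise exceeds the brain signal by seven orders of magnitude and the earth's field exceeds it by almost nine, which is why both extremely sensitive detectors and effective noise elimination are required.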

One of the most successful methods for noise elimination is the use of synthetic higher-order gradiometers. A number of approaches are available for reconstructing images from the MEG signals. Present MEG systems incorporate several hundred sensors in a liquid helium helmet array (see Figure 13B).

By MEG scanning, neuronal activation in the brain can be monitored locally (Vrba and Robinson, 2001). Acoustic stimuli are processed in the auditory cortex by neurons that are aggregated into “tonotopic” maps according to their specific frequency tunings (see Pantev et al., 1998). In the auditory cortex, the tonotopic representation of the cortical sources corresponding to tones with different spectral content is distributed along the medial-lateral axis of the supratemporal plane (see Figure 13C, left), with the medial-lateral center of the cortical activation shifting toward the sagittal midline with increasing frequency (see Figure 13C, right). This shift is less pronounced for a piano tone than for a pure sine tone. The same study additionally showed that dipole moments for piano tones are enhanced by about 25% in musicians compared with control subjects who had never played an instrument (Pantev et al., 1998). In the evaluation of the MEG data, a single equivalent current dipole (ECD) of about 50 nA was derived by a fit for each evoked magnetic field. From that, a contribution of ~150,000 dendrites to this magnetic field can be estimated (Pantev et al., 1998). The coordinates of the dipole location were calculated subject to the requirements of an anatomical distance of the ECD to the midsagittal plane of >2 cm and an inferior-superior value of >2 cm.

Skin conductance response (SCR) and finger temperature

In a study of the relationship between the temporal dynamics of emotion and the verse-chorus form of five popular “heartbreak” songs, the listeners' skin conductance responses (SCR; Figure 14A) and finger temperatures (Figure 14B) were used to infer levels of arousal and relaxation, respectively (Tsai et al., 2014). The passage preceding the chorus and the entrance of the chorus evoked two significant skin conductance responses (see Figure 14A). These two responses may reflect the arousal associated with the feelings of “wanting” and “liking,” respectively. Brain-imaging studies have shown that pleasurable music activates the listeners' reward system and serves as an abstract reward (Blood and Zatorre, 2001). The decrease of the finger temperature (Figure 14B) within the first part of the songs indicated negative emotions in the listeners, whereas the increase of the finger temperature within the second part may reflect a release of negative emotions. These findings may demonstrate the rewarding nature of the chorus and the cathartic effects associated with the verse-chorus form of heartbreak songs.

Figure 14. (A) The median curve of the skin conductance response (SCR) amplitude around the entrance of the chorus. The first downbeat was set to t = 0 s (Tsai et al., 2014). The two peaks are ascribed to the two closely related phases of listening experience: anticipatory “wanting” and hedonic “liking” of rewards. The symbols *** and * indicate that the two peaks are significantly larger than the control data. Reprinted with permission from Tsai et al. (2014) © 2014 Sage. (B) The U-shaped time dependence of the finger temperatures of the listeners during presentation of the five songs. The end of the first chorus (see full dots) divides each song into two parts, with a decrease of the finger temperature in the first part and an increase in the second part (Tsai et al., 2014). Reprinted with permission from Tsai et al. (2014) © 2014 Sage.

Goose bumps—piloerection

The most common psychological elicitors of piloerection or chills are moving musical passages or scenes in movies, plays, or books (see Benedek and Kaernbach, 2011). Other elicitors may be heroic or nostalgic moments, or physical contact with other persons. In his seminal work The Expression of the Emotions in Man and Animals (1872), Charles Darwin already acknowledged that “…hardly any expressive movement is so general as the involuntary erection of the hairs…” (Darwin, 1872). Musical structures considered to trigger goose bumps or chills are crescendos, unexpected harmonies, or the entry of a solo voice, a choir, or an additional instrument. It was thus concluded that piloerection may be a useful indicator that marks individual peaks in emotional arousal. Recently, optical measuring techniques have been developed for monitoring and analyzing chills by means of piloerection (Benedek et al., 2010).

Additional experimental studies have shown that chills give rise to higher skin conductance, increased heart and respiratory rates, and an increase in skin temperature (see Benedek and Kaernbach, 2011). Positron emission tomography correlated with musical chills showed a pattern typical of processes involved in reward, euphoria, and arousal, including the ventral striatum, midbrain, amygdala, orbitofrontal cortex, and ventral medial prefrontal cortex (see Benedek and Kaernbach, 2011).

In the studies of piloerection as an objective and direct means of monitoring music-evoked emotion, music pieces ranging from 90 s (theme of Pirates of the Caribbean) to 300 s (The Scientist) and film audio tracks (Knocking on Heavens Door, Dead Poets Society) ranging from 141 to 148 s were employed. All musical stimuli were normalized to the same root mean square (RMS) power, so that they featured equal average power.
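Equalizing the RMS level of stimuli amounts to scaling each signal by a single gain factor. The sketch below shows this under stated assumptions: the two sine waves stand in for audio excerpts, and the target level of 0.2 and all function names are hypothetical, since the source does not report how the normalization was carried out.

```python
import math


def rms(samples):
    """Root mean square of a sequence of audio samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def normalize_rms(samples, target_rms):
    """Scale a signal by one gain factor so that its RMS equals target_rms."""
    current = rms(samples)
    if current == 0.0:
        raise ValueError("cannot normalize a silent signal")
    gain = target_rms / current
    return [x * gain for x in samples]


# Two synthetic stand-ins for stimuli with very different loudness.
quiet = [0.1 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
loud = [0.8 * math.sin(2 * math.pi * 3 * n / 100) for n in range(100)]

target = 0.2  # hypothetical common RMS level
quiet_normalized = normalize_rms(quiet, target)
loud_normalized = normalize_rms(loud, target)
```

After scaling, both signals have the same RMS and therefore the same average power, while their waveform shapes are unchanged.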

Half of the musical stimuli (My Heart will go on by Celine Dion, Only Time by Enya, and film tracks of Armageddon and Braveheart) were pre-selected by the experimenter, and half, with stronger stimulation, were self-selected by the 50 participants. The stimuli were presented via closed Beyerdynamic DT 770 PRO headphones (Heilbronn, Germany) at an average sound pressure level of 63 dB. The procedure was approved by the Ethics Committee of the German Psychological Society (Benedek and Kaernbach, 2011). The sequence of a measurement is depicted in Figure 15A.

Figure 15. (A) Time dependence of the relative piloerection intensity of a single experiment, including a baseline period (30 s), stimulus description (20 s), and stimulus presentation (variable duration). The initial stable level of piloerection intensity indicates no visible piloerection. In this experiment, piloerection occurs shortly after the onset of stimulus presentation; after some time it fades away. The asterisk marks the first detected onset of piloerection. This time is used for the short-term physiological response (Benedek and Kaernbach, 2011). Reprinted with permission from Benedek and Kaernbach (2011) © 2011 Elsevier. (B) Procedure of piloerection quantification without (top row) and with visible piloerection (bottom row). From B (bottom) a two-dimensional spatial Fourier transform is computed (C, shown for the frequency range ±1.13 mm⁻¹), which is converted to a one-dimensional spectrum of spatial frequency. The maximum spectral power in the 0.23–0.75 mm⁻¹ range (D) is considered a correlate of the piloerection intensity (Benedek et al., 2010). Reprinted with permission from Benedek et al. (2010) © 2010 Wiley. (C) Time dependence of the short-term response of physiological measurements for a time slot of −15 s to +15 s around the first onset of piloerection. Dark bars indicate significant deviations from zero, white bars indicate non-significant deviations. ISCR, integrated skin conductance response; SCL, skin conductance level; HR, heart rate; PVA, pulse volume amplitude; RR, respiration rate; RD, respiration depth (Benedek and Kaernbach, 2011). Reprinted with permission from Benedek and Kaernbach (2011) © 2011 Elsevier.

The formation of piloerection on the forearm was monitored by a video scanner with a sampling rate of 10 Hz, with simultaneous measurements of the skin conductance response and of heart and respiratory rates. By means of the Gooselab software, the spatial Fourier transform (Figure 15B) of a video scan is derived, which serves as a measure of the intensity of piloerection.
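The quantification idea (a two-dimensional spatial Fourier transform reduced to a one-dimensional spectrum, with the maximum power in the 0.23–0.75 mm⁻¹ band taken as the intensity measure) can be sketched as follows. This is not the Gooselab implementation: the naive transform, the summation of power over all directions, the pixel size of 0.25 mm, and the synthetic test images are all assumptions made for illustration.

```python
import cmath
import math


def dft(values):
    """Naive 1D discrete Fourier transform (adequate for small inputs)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * f * k / n) for k, v in enumerate(values))
        for f in range(n)
    ]


def dft2(image):
    """2D DFT of a square image: transform the rows, then the columns."""
    n = len(image)
    rows = [dft(row) for row in image]
    cols = [dft([rows[r][c] for r in range(n)]) for c in range(n)]
    return [[cols[c][r] for c in range(n)] for r in range(n)]


def band_peak_power(image, pixel_mm, f_lo=0.23, f_hi=0.75):
    """Maximum radial spectral power between f_lo and f_hi (in 1/mm).

    Assumptions: square image, mean removed before the transform, and
    power summed over all directions within each radial frequency bin.
    """
    n = len(image)
    mean = sum(sum(row) for row in image) / (n * n)
    spectrum = dft2([[v - mean for v in row] for row in image])
    df = 1.0 / (n * pixel_mm)  # frequency resolution in 1/mm
    radial = {}
    for r in range(n):
        fy = (r if r <= n // 2 else r - n) * df
        for c in range(n):
            fx = (c if c <= n // 2 else c - n) * df
            bin_index = round(math.hypot(fx, fy) / df)
            power = abs(spectrum[r][c]) ** 2 / (n * n) ** 2
            radial[bin_index] = radial.get(bin_index, 0.0) + power
    in_band = [p for b, p in radial.items() if f_lo <= b * df <= f_hi]
    return max(in_band, default=0.0)


# Synthetic 8 mm x 8 mm patches sampled at 0.25 mm per pixel (hypothetical).
n, pixel_mm = 32, 0.25
# Texture repeating every 2 mm (0.5 per mm), inside the analysis band.
bumpy = [[math.cos(2 * math.pi * 0.5 * c * pixel_mm) for c in range(n)] for _ in range(n)]
# Slow variation repeating every 8 mm (0.125 per mm), outside the band.
smooth = [[math.cos(2 * math.pi * 0.125 * c * pixel_mm) for c in range(n)] for _ in range(n)]

bumpy_power = band_peak_power(bumpy, pixel_mm)
smooth_power = band_peak_power(smooth, pixel_mm)
```

The periodic texture yields a clear in-band peak, whereas the slowly varying patch yields essentially none, which is the contrast such a measure relies on. For real video frames a fast Fourier transform from a numerical library would replace the naive transform used here.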

Piloerection could not always be detected objectively when indicated by the participant and was sometimes detected without an indication by the participant.

Piloerection starts with the onset of music (Figure 15A), increases with a time constant of ~20 s, and then fades (time constant about 10 s). An analysis of the time constants of piloerection and of the kinetics of the simultaneously monitored physiological reactions (Figure 15C) should provide specific information on the contributing neuronal and muscular processes. This has not been discussed up to now. Among the physiological quantities studied simultaneously with piloerection (Figure 15C), a significant increase in skin conductance response, heart rate, and respiration depth was observed. This demonstrates that a number of subsystems of the sympathetic nervous system can be activated by music and that listening to film sound tracks in particular initiates a physiological state of intense arousal (Benedek and Kaernbach, 2011). Based on these experimental studies of piloerection and physiological quantities, two models of piloerection are discussed (Benedek and Kaernbach, 2011). On the one hand, it has been argued that the appearance of piloerection may mark a peak in emotional arousal (see Grewe et al., 2009). On the other hand, the psychobiological model (Panksepp, 1995) conceives emotional piloerection as an evolutionary relic of the thermoregulatory response to an induced sensation of coldness and links it with the emotional quality of sadness (separation call hypothesis). By comparing the physiological patterns of the two approaches with the experimental results, the authors (Benedek and Kaernbach, 2011) favor the separation call hypothesis (Panksepp, 1995) over the hypothesis of peak arousal (Grewe et al., 2009).

Is there a biological background for the attractiveness of music?—genomic studies

In a recent genomic study, the correlation between the frequency of listening to music and the availability of the arginine vasopressin receptor 1A (AVPR1A) gene or haplotype (with a length of 1,472 base pairs) was investigated. A haplotype is a collection of particular deoxyribonucleic acid (DNA) sequences in a cluster of tightly linked genes on a chromosome that are likely to be inherited together. In this sense, a haplotype is a group of genes that a progeny inherits from one parent [ http://en.wikipedia.org/wiki/Haplotype ]. The AVPR1A gene encodes a receptor molecule that mediates the influence of the arginine vasopressin (AVP) hormone in the brain, which plays an important role in memory and learning [ http://en.wikipedia.org/wiki/Haplotype ]. AVPR1A has been shown to modulate social cognition and behavior, including social bonding and altruism in humans (Wallum et al., 2008). In contrast, however, the AVPR1A gene has also been referred to as the “ruthlessness gene” (Hopkin, 2008).

Recently, an association of the AVPR1A gene with musical aptitude and with creativity in music, e.g., composing and arranging music, has been reported (see Ukkola-Vuoti et al., 2011). In this study (Ukkola-Vuoti et al., 2011), a total of 31 Finnish families with 437 family members (mean age 43 years) participated. The musical aptitude of the individuals was tested by means of the Karma test. In this test, which does not depend on training in music, musical aptitude is defined as the ability of auditory structuring (Karma, 2007). In addition, the individual frequency of music listening was registered. Genomic DNA was extracted from the peripheral blood of the individuals for the determination of the AVPR1A gene. The AVPR1A gene showed the strongest association with current active music listening, which is defined as attentive listening to music, including attending concerts. No dependence on musical aptitude was found. These results appear to indicate a biological background for the attractiveness of music. The association with the AVPR1A gene suggests that listening to music is related to the neural pathways affecting attachment behavior and social communication (Ukkola-Vuoti et al., 2011).

Towards a theory of musical emotions

In a recent overview (Juslin, 2013) aimed at a unified theory of musical emotions, a framework is suggested that tries to explain both everyday emotions and aesthetic emotions and yields some outlines for future research. This model comprises eight mechanisms for the induction of emotion by music, referred to as BRECVEMA: Brain stem reflexes, Rhythmic entrainment, Evaluative conditioning, Contagion, Visual imagery, Episodic memory, Musical expectancy, and Aesthetic judgment. The first seven mechanisms (BRECVEM), which arouse the everyday emotions, are each correlated (see Juslin, 2013) with the evolutionary order, the survival value of the brain functions, the information focus, the mental representation, the key brain regions identified experimentally, the cultural impact, the ontogenetic development, the induced effect, the temporal focus of the effect, the induction speed, the degree of volitional influence, the availability to consciousness, and the dependence on musical structure.

Of particular significance is the addition of a mechanism corresponding to aesthetic judgments of music, in order to better account for typical appreciation emotions such as admiration and awe.

Aesthetic judgments have not received much attention in psychological research to date (Juslin, 2013), since aesthetic and stylistic norms and ideas change over time in society. Though it may be difficult to characterize aesthetic judgments, some preliminaries are offered (Juslin, 2013) as to what a psychological theory of aesthetic judgment in music experience might look like.

Some pieces of music will invite an aesthetic attitude of the listener due to perceptual inputs by sensory impressions, due to more knowledge-based cognitive inputs, or due to emotional inputs. Some criteria that may underlie listeners' aesthetic judgments of music are suggested (Juslin, 2013 ) such as beauty, wittiness, originality, taste, sublimity, expression, complexity, use as art, artistic skill, emotion arousal, message, representation, and artistic intention. Certain criteria such as expression, emotional arousal, originality, skill, message, or beauty were considered as more important than others (see Figure 16A ) and different listeners tend to focus on different criteria (see Figure 16B ). With its multi-level framework of everyday emotions and aesthetic judgment, the study (Juslin, 2013 ) might help to explain the occurrence of mixed emotions such as bitter-sweet combinations of joy and melancholy.

Figure 16. (A) Mean values and standard errors for listeners' ratings of criteria for aesthetic value of music. (B) Individual ratings of criteria for aesthetic value of music by four subjects (see Juslin, 2013). Reprinted with permission from Juslin (2013) © 2013 Elsevier.

This discussion suggests (Juslin, 2013 ) that researchers have to elaborate specific experimental paradigms that reliably arouse specific emotions in listeners through each of the mechanisms mentioned, including the empirical exploration of candidate-criteria for aesthetic value, similarly to what has been performed for various BRECVEM mechanisms. Empirical research so far has primarily focused on the beauty criterion (see Juslin, 2013 ). Developments of hypotheses for the criteria such as style appreciation, neural correlates of perceived expressivity in music performances, or perceptual correlates of novelty appear feasible (Juslin, 2013 ). An additional possibility could be the use of a neurochemical interference strategy (Chanda and Levitin, 2013 ; Juslin, 2013 ). It has been shown that blocking of a specific class of amino acid receptors in the amygdala can interfere with the acquisition of evaluative conditioning (see Juslin, 2013 ) discussed within BRECVEM. Interactions between BRECVEM mechanisms and aesthetic judgments have yet to be investigated.

Musical therapy for psychiatric or neurologic impairments and deficiencies in music perception

Mounting evidence indicates that making music or listening to music activates a multitude of brain structures involved in cognitive, sensorimotor, and emotional processing (see Koelsch and Stegemann, 2012 ). The present knowledge on the neural correlates of music-evoked emotions and their health-related autonomic, endocrinological, and immunological effects could be used as a starting point for high-quality investigations of the beneficial effects of music on psychological and physiological health (Koelsch and Stegemann, 2012 ).

Music-evoked emotions can give rise to autonomic and endocrine responses as well as to motor expression of emotion (facial expression). The evidence that music improves health and well-being through the engagement of neurochemical systems for (i) reward, motivation, and pleasure; (ii) stress and arousal; (iii) immunity; and (iv) social affiliation has been reviewed (Chanda and Levitin, 2013). From these observations, criteria for the potential use of music in therapy should be derived.

Dysfunctions and structural abnormalities in, e.g., the amygdala, hippocampus, thalamus, nucleus accumbens, caudate, and cingulate cortex are characteristic of psychiatric and neurological disorders, such as depression, anxiety, stress disorder, Parkinson's disease, schizophrenia, and neurodegenerative diseases. The findings that music can change the activity in these structures should encourage high-quality studies (see Koelsch, 2014) of the neural correlates of the therapeutic effects of music in order to provide convincing evidence for these effects (Drevets et al., 2008; Maratos et al., 2008; Omar et al., 2011). The activation of the amygdala and the hippocampal formation by musical chills, as demonstrated in PET scans (Blood and Zatorre, 2001), may give direct support to the phenomenological efforts in music-therapeutic approaches for the treatment of disorders such as depression and anxiety, because these disorders are partly ascribed to dysfunctions of the amygdala and presumably of the hippocampus (Koelsch and Stegemann, 2012).

Another condition in which music should have therapeutic effects is autism spectrum disorder (ASD). Functional MRI studies show (Caria et al., 2011 ) that individuals with ASD exhibit relatively intact perception and processing of music-evoked emotions despite their deficit in the ability to understand emotions in non-musical social communication (Lai et al., 2012 ). Active music therapy can be used to develop communication skills since music involves communication capabilities (Koelsch, 2014 ).

With regard to neurodegenerative disorders, some patients with Alzheimer's disease (AD) have largely preserved memory of musical information, e.g., for familiar or popular tunes. Learning of sung lyrics might lead to better retention of words in AD patients, and the anxiety levels of these patients can be reduced with the aid of music. Because of the colocalization of memory functions and emotion in the hippocampus, future studies are suggested to investigate more specifically how music is preserved in AD patients and how it can ameliorate the effects of AD (Cuddy et al., 2012) and of other neurodegenerative diseases such as Parkinson's disease (Nombela et al., 2013). In addition, music-therapeutic efforts for cancer (Archie et al., 2013) or stroke (Johansson, 2012) have been reported.

Music has been shown to be effective for the reduction of worries and anxiety (Koelsch and Stegemann, 2012 ) as well as for pain relief in clinical settings with, however, minor effects compared to analgesic drugs (see Koelsch, 2014 ). Deficiencies in music perception are reported for patients with cerebral degeneration or damage (Koelsch, 2014 ). Recognition of music expressing joy, sadness, anger, or fear is impaired in patients with frontotemporal lobar degeneration or damage of the amygdala (Koelsch, 2014 ). Patients with lesions in the hippocampus find dissonant music pleasant in contrast to healthy controls who find dissonance unpleasant. The degree of overlap between music-evoked emotions and so-called everyday emotions remains to be specified.

Conclusions and outlook

As shown by tomographic imaging (fMRI, PET), which exhibits a high spatial resolution, activation of various brain areas can be initiated by musical stimuli. Some of these areas can be correlated with particular functions, such as motor or auditory functions, that are also activated by non-musical stimuli. In the case of fMRI, emotion processing is identified by the more general feature of local energy consumption. Imaging of emotional processing on a molecular level can be achieved by PET, where specific molecules such as ¹¹C-NMSP have been employed for a targeted investigation of synaptic activity (Zhang et al., 2012). A powerful combination of specific detection of molecules and tomographic imaging of the brain could arise from a future development of Raman tomography (Demers et al., 2012). Raman scattering provides specific information on the characteristic properties of molecules, such as vibrational or rotational modes.

Development of the technically demanding tomographic methods (fMRI, PET, MEG) for easy use would be highly desirable for the investigation of the emotions of performing musicians or even the astounding sensations of composers while composing, as expressed, e.g., by Ennio Morricone, composer of the music of the film Once Upon a Time in the West (Spiel mir das Lied vom Tod, 1968): “Vermutlich hat der Komponist, während er ein Stück schreibt, nicht mal die Kontrolle über seine eigenen Emotionen” (Morricone, 2014, Jun 1). (The composer, when writing a piece, is probably not even in control of his own emotions.) Jörg Widmann, composer of the contemporary opera Babylon (2012), formulates: “Man gerät beim Schreiben in extreme Zustände, kann nicht schlafen, macht weiter in einer Art Rausch – und Rausch ist womöglich der klarste Zustand überhaupt.” (Widmann, 2014, August 20) (When composing one gets into extreme states, cannot sleep, continues in a sort of drunkenness, and drunkenness is perhaps the clearest possible state.)

Future studies on a targeted molecular level may deepen the understanding of music-evoked emotion. Novel microscopy technologies for investigating single molecules are emerging. The rapid fusion of synaptic vesicles for neurotransmission after optical stimulation has been observed on a time scale of 15 ms by cryo electron microscopy (Chemistry Nobel Prize 2017) with an electron energy of 200 keV, at which radiation damage appears tolerable (Watanabe et al., 2013) (see Figure 17A). Radiation damage can be entirely suppressed by combining electron holography and coherent electron diffraction imaging in a low-energy (50–250 eV) lens-less electron microscope with a spatial resolution of 0.2 nm (Latychevskaia et al., 2015). Of particular interest is the in vivo optical imaging of neurons (see Figure 17B) in the brain by STED (stimulated emission depletion) optical microscopy techniques (Chemistry Nobel Prize 2014) with a lateral resolution of 67 nm (Berning et al., 2012). The dynamics of the neuron spine morphology on a 7-min time scale (Figure 17B) potentially reflect alterations in the connectivity of the neural network that are characteristic of learning processes, even in the adult brain.

Figure 17. (A) Representative cryo electron micrographs of fusing vesicles (see arrows) in mouse hippocampal synapses at 15 ms (c) and 30 ms (d) after light onset (Watanabe et al., 2013). Reprinted with permission from Watanabe et al. (2013) © 2013 Nature Publishing Group. (B) STED (stimulated emission depletion) microscopy in the molecular layer of the somatosensory cortex of a mouse with EYFP-labeled neurons. (A) Anesthetized mouse under the objective lens. (B) Projected volumes of dendritic and axonal structures reveal (C) temporal dynamics of spine morphology with (D) an approximately four-fold improved spatial resolution compared with diffraction-limited imaging. The curve is a three-pixel-wide line profile fitted to the raw data with a Gaussian. Scale bars, 1 μm (Berning et al., 2012). Reprinted with permission from Berning et al. (2012) © 2012 AAAS.

In addition, neurochemical interference strategies could be promising for future research as discussed in section Musical Therapy for Psychiatric or Neurologic Impairments and Deficiencies in Music Perception. For example, blocking of a specific class of amino acid receptors in the amygdala can interfere with the acquisition of evaluative conditioning (Juslin, 2013 ). In fact, studies of the neurochemistry of music may be the next great frontier (Chanda and Levitin, 2013 ), particularly as researchers try to investigate claims about the effects of music on health, where neurochemical studies are thought to be more appropriate than neuroanatomical studies (Chanda and Levitin, 2013 ).

The number of reports on beneficial effects of music on reward, motivation, pleasure, stress, arousal, immunity, and social affiliation is mounting, and the following issues could have future impact (Chanda and Levitin, 2013): (i) Rigorously matched control conditions in postoperative or chronic pain trials, including controls such as speeches, TV, comedy recordings, etc. (ii) Experiments to uncover the neurochemical basis of pleasure and reward, such as through the use of the opioid antagonist naloxone in order to discover whether musical pleasure is subserved by the same chemical system as other forms of pleasure (Chanda and Levitin, 2013). (iii) Experiments to uncover the connection between oxytocin (see Figure 11), group affiliation, and music (Chanda and Levitin, 2013). (iv) Investigation of the contribution of stress hormones, vasopressin, dopamine, and opioids in biological assays and pharmacological interventions together with neuroimaging (Chanda and Levitin, 2013).

The investigation of particular BRECVEM mechanisms (see section Musical Therapy for Psychiatric or Neurologic Impairments and Deficiencies in Music Perception) could be intensified through specific experiments. The interaction between BRECVEM mechanisms and aesthetic judgments has yet to be explored (Juslin, 2013). For an empirical exploration of candidate criteria for aesthetic judgment, one has to map the characteristics of separate aesthetic criteria, as has been done with various BRECVEM mechanisms. Empirical research so far has focused on the beauty criterion (see Juslin, 2013). The more phenomenological measuring techniques, such as encephalographic methods (EEG, MEG), skin conductance, and finger temperature or goose bump development, are characterized by a high time resolution of 10 ms to 1 s and are powerful tools for future observation of the dynamics and kinetics of emotional processing; MEG in particular can provide good time resolution together with moderate spatial resolution (Vrba and Robinson, 2001).

In addition to short-term studies, high-quality long-term studies would be desirable for assessing therapeutic efficacy over months, in analogy to the year-long efforts of Carlo Farinelli for King Philip V of Spain (see section Historical Comments on the Impact of Music on People).

Author contributions

H-ES selected the topic, performed the literature retrieval, and wrote the manuscript.

Conflict of interest statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The reviewer AF declared a shared affiliation, with no collaboration, with the author HS to the handling Editor.


The present study has been stimulated by a discussion with Hans-Christoph Rademann, Internationale Bachakademie Stuttgart. Continuous support of Thomas Schipperges, University of Tübingen is highly appreciated. The author is indebted to Christiane Neuhaus, University of Hamburg; Hans-Peter Zenner, University of Tübingen; Klaus Scheffler, Max Planck Institute of Biological Cybernetics and University of Tübingen; Hubert Preissl, Helmholtz Center Munich at the University of Tübingen; Boris Kleber, Sunjung Kim, and Julian Malcolm Clarke, University of Tübingen; and Bernd-Christoph Kämper and Ulrike Mergenthaler, University of Stuttgart for most competent discussions. Bettina Dietrich carefully read the manuscript.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2017.00600/full#supplementary-material

  • Agrippa von Nettesheim H. C. (1992). De Occulta Philosophia, ed P. Compagni, Leiden: Vittoria.
  • Archie P., Bruera E., Cohen L. (2013). Music-based intervention in palliative cancer care: a review of quantitative studies and neurobiological literature. Support. Care Cancer 21, 2609–2624. doi: 10.1007/s00520-013-1841-4
  • Bach J. S. (1967). Neue Ausgabe sämtlicher Werke, Serie VII: Orchesterwerke Band 1. Kassel: Bärenreiter.
  • Bailey D. L., Barthel H., Beuthin-Baumann B., Beyer T., Bisdas S., Boellaard R., et al. (2014). Combined PET/MR: where are we now? Summary report of the second international workshop on PET/MR imaging April 8-12, 2013, Tübingen, Germany. Mol. Imaging Biol. 16, 295–310. doi: 10.1007/s11307-014-0725-4
  • Benedek M., Kaernbach C. (2011). Physiological correlates and emotional specificity of human piloerection. Biol. Psychol. 86, 320–329. doi: 10.1016/j.biopsycho.2010.12.012
  • Benedek M., Wilfling B., Lukas-Wolfbauer R., Katzur B. H., Kaernbach C. (2010). Objective and continuous measurement of piloerection. Psychophysiology 47, 989–993. doi: 10.1111/j.1469-8986.2010.01003.x
  • Berning S., Willig K. I., Steffens H., Dibay P., Hell S. W. (2012). Nanoscopy in a living mouse brain. Science 335, 551–551. doi: 10.1126/science.1215369
  • Birbaumer N., Schmidt R. F. (2010). Biologische Psychologie. Heidelberg: Springer-Verlag.
  • Blood A. J., Zatorre R. J. (2001). Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proc. Nat. Acad. Sci. U.S.A. 98, 11818–11823. doi: 10.1073/pnas.191355898
  • Caria A., Venuti P., de Falco S. (2011). Functional and dysfunctional brain circuits underlying emotional processing of music in autism spectrum disorders. Cereb. Cortex 21, 2838–2849. doi: 10.1093/cercor/bhr084
  • Chanda M. L., Levitin D. J. (2013). The neurochemistry of music. Trends Cogn. Sci. 17, 179–193. doi: 10.1016/j.tics.2013.02.007
  • Charland L. C. (2010). Reinstating the passions: arguments from the history of psychopathology, in The Oxford Handbook of Philosophy of Emotion, ed Goldie P. (Oxford: Oxford University Press), 237–259.
  • Cuddy L. L., Duffin J. M., Gill S. S., Brown C. L., Sikka R., Vanstone A. D. (2012). Memories for melodies and lyrics in Alzheimer's disease. Music Percept. 29, 479–491. doi: 10.1525/mp.2012.29.5.479
  • Darwin C. (1871). The Descent of Man and Selection in Relation to Sex. London: John Murray.
  • Darwin C. (1872). The Expression of Emotions in Man and Animals. London: John Murray.
  • Demers J. L. H., Davis S. C., Pogue B. W., Morris M. D. (2012). Multichannel diffuse optical Raman tomography for bone characterization in vivo: a phantom study. Biomed. Optics Express 3, 2299–2305. doi: 10.1364/BOE.3.002299
  • Drevets W. C., Price J. L., Furey M. L. (2008). Brain structure and functional abnormalities in mood disorders: implications for neurocircuitry models of depression. Brain Struct. Funct. 213, 93–118. doi: 10.1007/s00429-008-0189-x
  • Dvorak A. (1955). Slavonic Dances, Edition Based on the Composers Manuscript. Prag: Artia Prag.
  • Eggebrecht H. H. (1991). Musik im Abendland – Prozesse und Stationen vom Mittelalter bis zur Gegenwart. München: Piper.
  • Eison A. S., Mullins U. L. (1996). Regulation of central 5-HT2A receptors: a review of in vivo studies. Behav. Brain Res. 73, 177–181. doi: 10.1016/0166-4328(96)00092-7
  • Fettiplace R., Hackney C. M. (2006). The sensory and motor roles of auditory hair cells. Nat. Rev. Neurosci. 7, 19–29. doi: 10.1038/nrn1828
  • Friston K. J., Friston D. A. (2013). A free energy formulation of music performance and perception - Helmholtz revisited, in Sound-Perception-Performance, ed Bader R. (Heidelberg: Springer), 43–69.
  • Furuya S., Altenmüller E. (2013). Flexibility of movement organization in piano performance. Front. Hum. Neurosci. 7:173. doi: 10.3389/fnhum.2013.00173
  • Gray L. Auditory System: Structure and Function in Neuroscience. Online-Electronic Textbook for the Neurosciences. The University of Texas Medical School. Available online at: http://neuroscience.uth.tmc.edu/s2/chapter12.html
  • Grewe O., Kopiez R., Altenmüller E. (2009). The chill parameter: goose bumps and shivers as promising measures in emotion research. Music Percept. 27, 61–74. doi: 10.1525/mp.2009.27.1.61
  • Haböck F. (1923). Die Gesangskunst der Kastraten. Erster Notenband: A. Die Kunst des Cavaliere Carlo Broschi Farinelli. B. Farinellis berühmte Arien. Wien: Universal Edition.
  • Hopkin M. (2008). 'Ruthlessness gene' discovered. Nature News. [Epub ahead of print]. doi: 10.1038/news.2008.738
  • Johansson B. B. (2012). Multisensory stimulation in stroke rehabilitation. Front. Hum. Neurosci. 6:60. doi: 10.3389/fnhum.2012.00060
  • Juslin P. N. (2013). From everyday emotions to aesthetic emotions: towards a unified theory of musical emotions. Phys. Life Rev. 10, 235–266. doi: 10.1016/j.plrev.2013.05.008
  • Karma K. (2007). Musical aptitude definition and measure validation: ecological validity can endanger the construct validity of musical aptitude tests. Psychomusicology 19, 79–90. doi: 10.1037/h0094033
  • Kleber B., Birbaumer N., Veit R., Trevorrow T., Lotze M. (2007). Overt and imagined singing of an Italian aria. Neuroimage 36, 889–900. doi: 10.1016/j.neuroimage.2007.02.053
  • Koelsch S. (2014). Brain correlates of music-evoked emotion. Nat. Rev. Neurosci. 15, 170–180. doi: 10.1038/nrn3666
  • Koelsch S., Fritz T., Cramon D. Y. V., Müller K., Friederici A. D. (2006). Investigating emotion with music: an fMRI study. Hum. Brain Mapp. 27, 239–250. doi: 10.1002/hbm.20180
  • Koelsch S., Stegemann T. (2012). The brain and positive biological effects in healthy and clinical populations, in Music, Health and Wellbeing, eds MacDonald R., Kreutz D., Mitchell L. (Oxford: Oxford University Press), 436–456.
  • Kraus K. S., Canlon S. (2012). Neuronal connectivity and interactions between the auditory and the limbic systems. Hear. Res. 288, 34–46. doi: 10.1016/j.heares.2012.02.009
  • Kreutz G., Murcia C. Q., Bongard S. (2012). Psychoneuroendocrine research on music and health: an overview, in Music, Health and Wellbeing, eds MacDonald R., Kreutz D., Mitchell L. (Oxford: Oxford University Press), 457–476.
  • Kümmel W. F. (1977). Musik und Medizin – Ihre Wechselbeziehung in Theorie und Praxis von 800 bis 1800. Freiburg: Verlag Alber.
  • Lai G., Pantazatos S. P., Schneider H., Hirsch J. (2012). Neural systems for speech and song in autism. Brain 135, 961–975. doi: 10.1093/brain/awr335
  • Latychevskaia T., Longchamp J.-N., Escher C., Fink H.-W. (2015). Holography and coherent diffraction with low-energy electrons: a route towards structural biology at the single molecule level. Ultramicroscopy 159, 395–402. doi: 10.1016/j.ultramic.2014.11.024
  • Lauterbur P. C. (1973). Image formation by induced local interactions: examples employing nuclear magnetic resonance. Nature 242, 190–191. doi: 10.1038/242190a0
  • Liu C. H., Ren J., Liu C.-M., Liu P. K. (2014). Intracellular gene transcription factor protein-guided MRI by DNA aptamers in vivo. FASEB J. 28, 464–473. doi: 10.1096/fj.13-234229
  • Maratos A., Gold C., Wang X., Crawford M. (2008). Music therapy for depression. Cochrane Database Syst. Rev. 1:CD004517. doi: 10.1002/14651858.CD004517.pub2
  • Maurer B. (2014). Saitenweise – Neue Klangphänomene auf Streichinstrumenten und ihre Notation. Wiesbaden: Breitkopf and Härtel.
  • Meyer L. B. (1956). Emotion and Meaning in Music. Chicago: The University of Chicago Press.
  • Mitterschiffthaler M. T., Fu C. H., Dalton J. A., Andrew C. M., Williams S. C. (2007). A functional MRI study of happy and sad affective states evoked by classical music. Hum. Brain Mapp. 28, 1150–1162. doi: 10.1002/hbm.20337
  • Morricone E. (2014, Jun 1). Besser werden. Sonntag Aktuell, p. 12.
  • Neuhaus C. (2013). Processing musical form: behavioural and neurocognitive approaches. Mus. Sci. 17, 109–127. doi: 10.1177/1029864912468998
  • Nombela C., Hughes L. E., Owen A. M., Grahn J. A. (2013). Into the groove: can rhythm influence Parkinson's disease? Neurosci. Biobehav. Rev. 37, 2564–2570. doi: 10.1016/j.neubiorev.2013.08.003
  • Omar R., Henley S. M. D., Bartlett J. W., Hailstone J. C., Gordon E., Sauter D. A., et al. (2011). The structural neuroanatomy of music emotion recognition: evidence from frontotemporal lobar degeneration. Neuroimage 56, 1814–1861. doi: 10.1016/j.neuroimage.2011.03.002
  • Panksepp J. (1995). The emotional sources of 'chills' induced by music. Music Percept. 13, 171–207. doi: 10.2307/40285693
  • Pantev C., Osterveld R., Engelien A., Ross B., Roberts L. E., Hoke M. (1998). Increased auditory cortical representation in musicians. Nature 392, 811–813. doi: 10.1038/33918
  • Reiser M. F., Semmler W., Hricak H. (eds.). (2008). Magnetic Resonance Tomography. Berlin; Heidelberg: Springer-Verlag.
  • Roederer J. G. (2008). The Physics and Psychophysics of Music. An Introduction. New York, NY: Springer Science and Business.
  • Rzadzinska A. K., Schneider M. E., Davis C., Riordan G. P., Kachar B. (2004). An actin molecular treadmill and myosins maintain stereocilia functional architecture and self-renewal. J. Cell Biol. 164, 887–897. doi: 10.1083/jcb.200310055
  • Salimpoor V. N., van den Bosch I., Kovacevic N., McIntosh A. R., Dagher A., Zatorre R. J. (2013). Interactions between the nucleus accumbens and auditory cortices predict music reward value. Science 340, 216–219. doi: 10.1126/science.1231059
  • Schipperges T. (2003). Wider die Musik. Untersuchungen zur Entdeckung der Musikfeindschaft als Idee im sechzehnten bis achtzehnten Jahrhundert mit Rückblicken auf die Tradition der effectus musicae und Ausblicken zu ihrem Weiterwirken, Habilitationsschrift 2000; Separatdruck. Zeitschrift für Religions- und Geistesgeschichte 55, 205–226. doi: 10.1163/157007303322146529
  • Schnier F., Mehlhorn M. (2013). Magnetic Resonance Tomography. Göttingen: Phywe Systeme.
  • Shimizu T. (2004). Ju-On (DVD). Santa Monica, CA: Lionsgate Entertainment Corp. The film music (see The Grudge theme song, available online at: https://www.youtube.com/watch?v=1dqjXyIu02s).
  • Spitzer M. (2003, 2014). Musik im Kopf. Stuttgart: Schattauer.
  • Ter-Pogossian M. M., Phelps M. E., Hoffman E. J., Mullani N. A. (1975). A positron-emission transaxial tomograph for nuclear imaging (PETT). Radiology 114, 89–98. doi: 10.1148/114.1.89
  • Tramo M. J., Cariani P. A., Delgutte B., Braida L. D. (2001). Neurobiological foundations for the theory of harmony in Western tonal music, in The Biological Foundations of Music, Vol. 930, eds Zatorre R. J., Peretz I. (New York, NY: Academy of Sciences), 92–116.
  • Tsai C.-G., Chen R.-S., Tsai T.-S. (2014). The arousing and cathartic effects of popular heartbreak songs as revealed in the physiological responses of the listeners. Musicae Sci. 18, 410–422. doi: 10.1177/1029864914542671
  • Ukkola-Vuoti L., Oikkonen J., Onkamo P., Karma K., Raijas P., Järvelä I. (2011). Association of the arginine vasopressin receptor 1A (AVPR1A) haplotypes with listening to music. J. Hum. Genet. 56, 324–329. doi: 10.1038/jhg.2011.13
  • Vrba J., Robinson S. E. (2001). Signal processing in magnetoencephalography. Methods 25, 249–271. doi: 10.1006/meth.2001.1238
  • Wallum K., Westberg L., Henningsson S., Neiderhiser J. M., Reiss D., Igl W., et al. (2008). Genetic variation in the vasopressin receptor 1A gene (AVPR1A) associates with pair bonding in humans. Proc. Nat. Acad. Sci. U.S.A. 105, 14153–14156. doi: 10.1073/pnas.0803081105
  • Watanabe S., Rost B. R., Camacho-Perez M., Davis M. W., Söhl-Kielczynski B., Rosenmund C., et al. (2013). Ultrafast endocytosis at mouse hippocampal synapses. Nature 504, 242–247. doi: 10.1038/nature12809
  • Watanabe T., Yagishita S., Kikyo H. (2008). Memory of music: roles of right hippocampus and left inferior frontal gyrus. Neuroimage 39, 483–491. doi: 10.1016/j.neuroimage.2007.08.024
  • Watanabe Y. (2012). New findings on the underlying neural mechanism of emotion induced by frightening music. J. Nucl. Med. 53, 1497–1498. doi: 10.2967/jnumed.112.109447
  • Waterman M. (1996). Emotional responses to music: implicit and explicit effects in listeners and performers. Psychol. Music 24, 53–64. doi: 10.1177/0305735696241006
  • Widmann J. (2014, August 20). Der Rausch ist womöglich überhaupt der klarste Zustand. Der Standard, p. 24.
  • Xue S., Qiao J., Pu F., Cameron M., Yang J. J. (2013). Design of a novel class of protein-based magnetic resonance imaging contrast agents for the molecular imaging of cancer biomarkers. Wiley Interdiscip. Rev. Nanomed. Nanobiotechnol. 5, 163–179. doi: 10.1002/wnan.1205
  • Zarate S. M. (2013). The neural control of singing. Front. Hum. Neurosci. 7:237. doi: 10.3389/fnhum.2013.00237
  • Zender H. (2014). Waches Hören – Über Musik. München: Carl Hanser Verlag.
  • Zenner H.-P. (1994). Hören – Physiologie, Biochemie, Zell- und Neurobiologie. Stuttgart: Georg Thieme Verlag.
  • Zhang Y., Chen Q. Z., Du F. L., Hu Y. N., Chao F. F., Tian M., et al. (2012). Frightening music triggers rapid changes in brain monoamine receptors: a pilot PET study. J. Nucl. Med. 53, 1573–1578. doi: 10.2967/jnumed.112.106690

  • Open access
  • Published: 22 June 2023

How have music emotions been described in Google books? Historical trends and corpus differences

  • Liang Xu   ORCID: orcid.org/0000-0003-3889-927X 1 , 2 ,
  • Zehua Jiang 2 ,
  • Xin Wen 2 ,
  • Yishan Liu 2 ,
  • Zaoyi Sun 1 ,
  • Hongting Li 1 &
  • Xiuying Qian 2  

Humanities and Social Sciences Communications volume 10, Article number: 346 (2023)

  • Cultural and media studies
  • Language and linguistics

Human records can assist us in understanding real descriptions and expected ideals of music. The present work examined how music emotions have been described in millions of Google books. In general, positive adjectives were used more regularly than negative adjectives to describe music, demonstrating a positivity bias in music. The emotional depiction of music has shifted over time, including a decrease in the frequency of emotional adjectives used in English books over the past two centuries, and a sudden surge in the usage of positive adjectives in simplified Chinese books during China’s Cultural Revolution. Negative adjectives were used substantially less to describe music in simplified Chinese books than in English books, reflecting cultural differences. Finally, a comparison of different corpora showed that emotion-related adjectives were more frequently used to describe music in fictional literature.


Music moves us by conveying and evoking anything from arousal and basic emotions (e.g., happiness and sadness) to complex emotions (e.g., love and nostalgia). Thus, the perceived emotions (emotions expressed by music) and the felt emotions (emotions aroused by music) have attracted increasing academic attention in recent decades (e.g., Juslin et al., 2014 ; Kallinen and Ravaja, 2006 ; Schubert, 2013 ; Xu et al., 2021 ). Compared to the felt emotions of music, the perceived emotions are regarded as the “objective” aspects of music-elicited emotion (Gabrielsson, 2001 ). People may describe and share the emotional information of music that they listen to, and show a preference for music expressing specific emotions such as sadness (Xu et al., 2021 ; Yoon et al., 2020 ) and happiness (Schellenberg et al., 2008 ). Furthermore, the emotion information of music has been widely used in various application fields including music recommendation (Han et al., 2010 ), music therapy (Bernatzky et al., 2011 ), and music information retrieval (Downie, 2008 ).

Since music plays an important role in human life, how music was described by people has been recorded in different corpora, such as books (Michel et al., 2011 ), music reviews (Vannini, 2004 ), and texts in social media platforms (Dewan and Ramaprasad, 2014 ). By mining the music-related texts, researchers have investigated the meanings of a singer (Vannini, 2004 ), the contributions of music journalism (Fürsich and Avant-Mier, 2013 ), music-related metaphors (Šeškauskienė and Levandauskaitė, 2013 ), and so forth. However, the emotional description of music in texts has not been systematically investigated. As the soul of music, music’s emotional information often appears in various human records. Listeners may share their emotional states in music reviews after listening to music, the protagonists in books may describe the perceived emotions of music they heard, and people may recommend music that expresses specific emotions on social media. Thus, can we also obtain people’s attitudes towards music emotions from corpora? Similar to human description studies (Leising et al., 2014 ; Ye et al., 2018 ; Wen et al., 2023 ), we believe that the use of music descriptors can reflect either real descriptions (how music actually is) or expected ideals (how music should be). Therefore, mining the emotional description of music may help us know the emotional roles of music in human life.

In addition to the studies focusing on the short-term scale of corpora, examining the dynamics at longer time scales has attracted increasing academic attention. Berger and Luckmann ( 1991 ) noted that social change is interconnected with language. As a remarkably long-lived phenomenon, language containing a large number of common words has been passed down through multiple generations for centuries (Lieberman et al., 2007 ; Pagel et al., 2007 ). In fact, music, much like language, can also serve as a valuable research model to comprehend the evolution of cultural traits across time and space, thereby enriching our comprehension of cultural variation and transformation (Savage, 2019 ; Youngblood et al., 2023 ). In addition, by analyzing humanity’s written records, researchers have investigated the historical trends of personality description (Leising et al., 2014 ; Roivainen, 2013 ; Ye et al., 2018 ), emotion expression (Acerbi et al., 2013 ), women’s status (Twenge et al., 2012 ), cultural values (Greenfield, 2013 ; Zeng and Greenfield, 2015 ), morality (Kesebir and Kesebir, 2012 ), and so forth. For example, Roivainen’s series of works calculated the usage frequencies of personality adjectives in books to test the generational changes in personality (Roivainen, 2020 ), gender differences in personality (Motschenbacher and Roivainen, 2020 ), and the suitability of personality adjectives (Roivainen, 2015a ). With reference to previous research methods, can we learn about the changes in the emotional description of music from books or other corpora? Studies have also indicated that the distribution of word frequency can be influenced by social and historical events (Bochkarev et al., 2014 ). For instance, during World War II, the Great Depression, and the Baby Boom period (Acerbi et al., 2013 ; Bochkarev et al., 2014 ), there was an increase in the use of emotional vocabulary in literature, reflecting a trend toward more emotional expression. 
Considering that music is inseparable from human society, we believe that historical changes in music emotions (perceived or expected emotions) may be reflected in the way music has been described. Therefore, the present work investigates how emotion-related words have been used to describe music in books, and how these descriptions have changed over time.

Cultural difference, reflected by language difference in texts, is an inevitable issue in text analysis. Music’s descriptions in books can be regarded as cultural artifacts whose meanings are “symbolically constructed, historically transmitted, and expressed by individuals in instances of situated communication” (Wilkins and Gareis, 2006 ). The study of music’s description is also a study of culture, investigating the anthropological knowledge about cultural values that are reproduced in the communication of music emotions. Cultural differences embodied in language have been observed in many fields. For example, the work of Besemeres ( 2004 ) found that the expression and description of emotion may differ in different languages; categorization, which organizes and classifies objects together, can be affected by different testing languages (Ji et al., 2004 ); by comparing English, French, and Dutch listeners’ speech segmentation, Tyler and Cutler ( 2009 ) found cross-language differences in cue use for continuous-speech segmentation; and previous studies have shown that “each written language has its own unique rhetorical patterns in terms of style, structure, and content” (Almuhailib, 2019 ; Leki, 1991 ). Therefore, it is necessary to compare musical emotion descriptions in different language corpora.

In fact, even with the same language, there may still be differences in different types of corpora. Roivainen ( 2015b ) found that variation exists in adjectives’ usage across corpora (Twitter tweets and Google Books). Ye et al. ( 2018 ) noted that person-descriptive adjectives were more frequently used by fictional literature than by non-fictional literature. Underwood and Sellers ( 2012 ) conducted a study on literary diction and found that fictional literature uses more archaic words than non-fictional literature. Similarly, Heuser and Le-Khac ( 2011 ) examined the phrasing in 19th-century novels and found a decline in the usage of abstract vocabulary (such as integrity, modesty, sensitivity, and reason) and an increase in the usage of concrete vocabulary (such as action verbs, body parts, colors, and numbers) in literary language. By comparing Facebook, Twitter, Instagram, and WhatsApp, previous work found that the norms of expressing emotions on social media are different (Waterloo et al., 2018 ). The above findings reveal the differences in many aspects of different corpora of the same language, which reminds us to investigate the variation in music’s emotional descriptions across corpora. Thus, the corpus differences were also taken into account in this study.

In sum, with the accessibility of various online Big Data, we conducted an exploratory study to investigate the emotional descriptions of music in books. Our research question was three-fold. First, we sought to investigate the historical trends of music’s emotional descriptions. To this end, we utilized Google Books N-gram (GBN), the largest digital corpus of written records in human history (Michel et al., 2011 ), to analyze the frequency of different emotional words used to describe music over the past two centuries. Second, we aimed to examine whether the aforementioned trends were consistent across books in different languages. We achieved this by investigating the variation in music’s emotional descriptions across books written in American English, British English, and Simplified Chinese. Lastly, we explored the variation in music’s emotional descriptions across different corpora by comparing various types of books. By conducting these studies, we sought to understand how emotion-related words have been used to describe music in books, how these descriptions have evolved over time, and how they vary across different corpora.

Calculation and presentation of historical trends

The present work used different corpora of the GBN database (available online at: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html) to achieve our research goals. First, we used the English corpus from the GBN database, which contains approximately 1,510,000 English-language books drawn from 100 sources (Lin et al., 2012; Michel et al., 2011), to examine the historical trends of music’s emotional description. The time span of 1800–2000 was investigated because few books were published before 1800 and the selection criteria for books changed after 2000 (Greenfield, 2013; Ye et al., 2018).

Leising et al. ( 2014 ) suggested that terms describing more significant things are used more frequently in a large corpus. Therefore, this study used usage frequency, a measure widely adopted in previous works (Moon, 2014 ; Roivainen, 2013 ; Ye et al., 2018 ), to evaluate the importance of music’s emotional descriptions. Following the methods of Roivainen ( 2013 ), we first searched for the usage frequencies of emotion-related adjectives when they were used to describe music, that is, in combinations of the form “adjective music”. This study considered 320 emotion-related adjectives, most of which were extracted from the “affect” category of the Linguistic Inquiry and Word Count (LIWC) dictionary (Pennebaker et al., 2015 ), and a small number of which were extracted from previous music emotion studies (Juslin et al., 2014 ; Kallinen and Ravaja, 2006 ; Zentner et al., 2008 ). The LIWC is a typical text mining tool that consists of a number of taxonomies of semantically affine words that are evaluated by human judges (Kahn et al., 2007 ; Schwartz et al., 2013 ). Notably, these 320 adjectives contained not only words that describe emotions (e.g., happy , sad , and relaxing ), but also words associated with emotions (e.g., sweet , terrible , and rude ). In addition, each word was classified as a positive adjective, a negative adjective, or other (neutral, or of different polarity in different contexts) according to the LIWC categories (Pennebaker et al., 2015 ). The list of adjectives is presented in Supplemental Materials Table S1 .

Second, since the usage frequency of music may differ from year to year, we calculated the adjusted frequencies of the combinations (“adjective” + “music”) as follows: \(AF_i^{\mathrm{combi}} = F_i^{\mathrm{combi}}/F_i^{\mathrm{music}}\), where \(AF_i^{\mathrm{combi}}\) is the adjusted frequency of each combination in year i , and \(F_i^{\mathrm{combi}}\) and \(F_i^{\mathrm{music}}\) are the usage frequencies of each combination and of music in year i . The adjusted frequencies give the usage of each combination in each year relative to the usage of music , which can be used to present the historical trend of each adjective when describing music. To show the time trends, we used three-year smoothing, that is, a 7-year centered moving average that includes three years on either side; for instance, the adjusted frequency of 1950 is an average of 1947–1953. Third, we summed the adjusted frequencies of all adjectives for each polarity (positive or negative) as the adjusted frequencies of that category. Finally, the mean adjusted frequency and the historical trend of each adjective or each polarity were compared respectively.
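The three calculation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the toy frequencies, and the handling of years near the ends of the series (a shortened window) are all assumptions made for the example.

```python
from collections import defaultdict


def adjusted_frequencies(combi_freq, music_freq):
    """Return AF_i = F_i(combi) / F_i(music) for each year i with a nonzero music frequency."""
    return {
        year: combi_freq[year] / music_freq[year]
        for year in combi_freq
        if music_freq.get(year)
    }


def centered_moving_average(series, half_window=3):
    """Average each year with up to `half_window` neighbours on either side.

    Assumption: years near the ends of the series use only the neighbours that exist.
    """
    smoothed = {}
    for year in series:
        window = [
            series[y]
            for y in range(year - half_window, year + half_window + 1)
            if y in series
        ]
        smoothed[year] = sum(window) / len(window)
    return smoothed


def sum_by_polarity(adjusted_by_adjective, polarity_of):
    """Sum the adjusted frequencies of all adjectives sharing a polarity, year by year."""
    totals = defaultdict(lambda: defaultdict(float))
    for adjective, series in adjusted_by_adjective.items():
        for year, value in series.items():
            totals[polarity_of[adjective]][year] += value
    return totals


# Hypothetical toy frequencies, not values from the GBN corpus.
years = range(1947, 1954)
music = {y: 100.0 for y in years}
sweet_music = {y: float(n) for n, y in enumerate(years, start=1)}  # 1.0 ... 7.0
sad_music = {y: 1.0 for y in years}

adjusted = {
    "sweet": adjusted_frequencies(sweet_music, music),
    "sad": adjusted_frequencies(sad_music, music),
}
smoothed_sweet = centered_moving_average(adjusted["sweet"])
totals = sum_by_polarity(adjusted, {"sweet": "positive", "sad": "negative"})

print(round(smoothed_sweet[1950], 4))  # smoothed value for 1950: mean of 1947-1953
```

With `half_window=3` the value for 1950 averages the seven years 1947–1953, which matches the worked example in the text. Whether smoothing is applied before or after summing by polarity is not stated in the source, so the sketch keeps the two steps as separate functions.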

Comparison of books in different languages

We then investigated the variation in music’s emotional descriptions across different language books. In this study, American English (AE) books, British English (BE) books, and Simplified Chinese (SC) books were compared. SC appeared in the 1950s (Zeng and Greenfield, 2015 ), so we compared the differences in books from 1960 to 2000.

The corpora from the GBN database, including American English-language books (about 1,160,000 books), British English-language books (about 370,000 books), and Simplified Chinese-language books (about 300,000 books), were used here. For the analysis of AE and BE books, the same 320 emotion-related adjectives presented in the Supplemental Materials Table S1 were applied; while for the analysis of SC books, 585 emotion-related adjectives extracted from the “affect” category of Simplified Chinese LIWC dictionary (Huang et al., 2015 ) were used (see Supplemental Materials Table S2 ). Then, using the same method as in the section “Calculation and presentation of historical trends”, the mean adjusted frequency and the historical trend of each adjective or each polarity in different language books were compared respectively.

Comparison of different corpora

Finally, we investigated the variation in music’s emotional descriptions across different corpora. We investigated the differences between music’s emotional descriptions in English fictional books and in the overall English books of the GBN database. We utilized the Google English Fiction Corpus, which contains books mostly in the English language that a library or publisher recognized as fiction (Michel et al., 2011 ). The Google Ngram database does not divide fiction and nonfiction data to allow for direct comparisons, so we compared the Google English Fiction Corpus (including about 330,000 English fictional books) with the overall English Corpus. We used the same 320 emotion-related adjectives presented in the Supplemental Materials Table S1 here. Using the same method as in the section “Calculation and presentation of historical trends”, we compared the mean adjusted frequency and the historical trend of each polarity in different corpora.

Trend correlation check

To begin our data exploration, we examined whether the patterns in adjective frequency we observed were unique to descriptions of music , or if they followed broader trends in the corpus and language over time. To accomplish this, we calculated the Pearson correlation between the usage frequency of each adjective and the frequency of its usage when describing music over the past two centuries. The results, shown in Supplemental Materials Table S3 , revealed a mean Pearson correlation coefficient of 0.172 ± 0.316 for all adjectives, indicating that the use of adjectives to describe music is not closely tied to general usage trends. While some words, such as solemn , serious , and delicious , displayed a strong positive correlation with overall frequency, most words did not exhibit such a trend. In fact, some words, including beautiful , light , and soothing , showed a negative correlation with overall frequency, meaning that as their frequency in the corpus increased, their frequency in descriptions of music decreased. Therefore, we believe that while some words may be influenced by general trends in language usage, most of the trend results we observed are specific to music .
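As a rough illustration of this check, the sketch below computes a Pearson coefficient per adjective between its overall yearly frequency and its adjusted frequency in descriptions of music, and then summarizes the coefficients as mean ± standard deviation. The helper name `pearson_r` and the toy series are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which form was reported.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_x * sd_y)


# Hypothetical yearly series for two made-up adjectives (not corpus values).
overall = {
    "adjective_a": [1.0, 2.0, 3.0, 4.0],
    "adjective_b": [1.0, 2.0, 3.0, 4.0],
}
in_music = {
    "adjective_a": [2.0, 4.0, 6.0, 8.0],  # rises with overall usage
    "adjective_b": [4.0, 3.0, 2.0, 1.0],  # falls while overall usage rises
}

coefficients = {
    word: pearson_r(overall[word], in_music[word]) for word in overall
}
mean_r = statistics.mean(coefficients.values())
sd_r = statistics.stdev(coefficients.values())  # sample SD, an assumption
print(coefficients, mean_r, sd_r)
```

A coefficient near zero for a given adjective suggests that its trend in descriptions of music is not simply tracking its general usage, which is how the mean of 0.172 is interpreted in the text.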

Historical trends in English books

Figure 1 shows the mean adjusted frequency of emotion-related adjectives when describing music in English books from 1800 to 2000. We noticed that sweet was most frequently used to describe music , followed by solemn , beautiful , fine , best , excellent , sad , and melancholy . Comparing the mean adjusted frequencies of each emotion category, we found that, in general, positive adjectives were more frequently used to describe music than negative adjectives ( Z  = 12.293, p  < 0.001, effect size r  = 1.000). This phenomenon conforms to the Pollyanna hypothesis that human language has a positivity bias (Boucher and Osgood, 1969 ). Positive words are used more frequently than negative words or words judged as less likeable (Dodds et al., 2015 ). For positive adjectives (see Supplemental Materials Table S3 ), words that express positive evaluation (e.g., fine , best , and excellent ) were used more frequently to describe music than positive emotion words (e.g., happy , joyful , and relaxing ). On the contrary, for negative adjectives, negative emotion adjectives ( sad and melancholy ) were the most frequently used to describe music . This reminds us that the difference between the usage of positive and negative adjectives mainly depends on the adjectives that express evaluation (e.g., best , bad , and fine ), but not emotion words (e.g., sad , happy , and relaxing ).
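The source does not name the test behind the Z value and effect size at this point. The sketch below assumes a paired Wilcoxon signed-rank test over yearly values (the test named later for the post hoc comparisons), a normal approximation without continuity or tie correction, and a matched-pairs rank-biserial correlation as the effect size r. Function names and the toy series are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(xs, ys):
    """Return (Z, rank-biserial r) for paired samples, dropping zero differences.

    Assumptions: normal approximation, no continuity correction, no tie correction.
    """
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    expected = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - expected) / sd
    r_rank_biserial = (w_plus - w_minus) / (w_plus + w_minus)
    return z, r_rank_biserial


# Hypothetical series for 201 years (1800-2000) in which the positive value
# exceeds the negative value in every single year.
positive = [2 + i for i in range(201)]
negative = [1] * 201
z, r = wilcoxon_signed_rank(positive, negative)
print(round(z, 3), r)
```

Under these assumptions, any 201-year series in which positive adjectives exceed negative adjectives in every year yields Z ≈ 12.293 and r = 1, the same values as reported above; this is consistent with, but does not prove, the assumed procedure.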

figure 1

Source: Own elaboration.

In our trend analysis, we began by examining the usage of the word music . As depicted in Fig. 2 , the usage of the term music has consistently demonstrated an upward trend over the past two centuries, whether measured in terms of an absolute number of uses or frequency of use. Then, the historical trends of emotion-related adjectives when describing music are presented in Fig. 3 . In general, positive adjectives were more frequently used to describe music than negative adjectives every year, but the difference exhibited a reduction from 1862 to 1975 (see Fig. 3a ). Since words may not be equally used in books (Roivainen, 2013 ), analyzing the top used words can help us better understand how the trends were formed. Figure 3 b and c show the historical trends of the top five positive and negative adjectives, respectively. We noticed that the trend of sweet is very similar to the trend of the sum of positive adjectives, and Pearson correlation analysis shows that the adjusted frequency of sweet is positively correlated with the sum of the adjusted frequency of positive adjectives ( r (200) = 0.945, p  < 0.001). The usage of other positive adjectives is relatively stable from 1800 to 2000. Therefore, is the dynamic of positive adjectives primarily driven by the most frequent adjective, sweet ? To explore this question, we compared the overall trend in the use of positive words with and without the word sweet (see Supplemental Materials Fig. S3 ). Surprisingly, we found that although the overall frequency of positive word usage decreased, the trends in both cases remained similar, thus refuting the aforementioned speculation.

figure 2

figure 3

a The adjusted frequencies of positive and negative adjectives when describing music from 1800 to 2000. Error bands indicate the standard error (3-year smoothing). b and c are the adjusted frequencies of the top five positive and negative adjectives when describing music from 1800 to 2000 (loess smoothing). Source: Own elaboration.

For negative adjectives, the trends of sad , melancholy , strange , and bad are similar, showing slight declines after 1900. By contrast, the usage frequency of serious increased suddenly in the early 20th century and declined in the late 20th century. In sum, the usage trends of many top words appear to be consistent with the overall trends of their respective categories (both positive and negative), which could be attributed to Zipf’s law for word frequencies (Ferrer-i-Cancho, 2016 ). According to this law, a small number of high-frequency words make up a significant portion of the whole, and therefore the dynamics of these individual words can predict the overall dynamics. However, we also observed different trends for some words, and their underlying causes warrant further investigation and discussion.

In combining the results of the section “Trend correlation check”, we have also discovered some interesting phenomena related to the use of specific words in relation to music. For instance, we observed that the word solemn is used less frequently than beautiful in books (see Supplemental Materials Fig. S1 ), yet the mean frequency of beautiful music and solemn music is similar (Fig. 1 ), indicating that the word solemn is more commonly used to describe music in literature. This finding aligns with the work of Knoop et al. ( 2016 ), who found that certain terms are more commonly associated with specific genres of literature (e.g., funny and sad for drama; suspenseful , interesting , and romantic for novels). Furthermore, we observed a significant decline in the frequency of solemn in describing music after 1854 (see Supplemental Materials Fig. S2 ). This trend may be linked to the secularization of music, where the influence of religion on society gradually weakened due to industrialization and urbanization (Lombaard et al., 2019 ). This led to the emergence of new forms of music that may have been less associated with religious solemnity and more reflective of the changing societal values and beliefs.

Language differences

First, we identified similarities across the books in all three languages. As shown in Fig. 4a , whether in AE, BE, or SC books, the positive adjectives were more frequently used to describe music than negative adjectives from 1960 to 2000 (for each type of books, Z  = 5.579, p  < 0.001, effect size r  = 1.000).

figure 4

a The adjusted frequencies of positive and negative adjectives when describing music from 1960 to 2000 (3-year smoothing). AE indicates American English, BE indicates British English, SC indicates Simplified Chinese, POS indicates positive adjectives, and NEG indicates negative adjectives. b and c are the adjusted frequencies of the top five positive and negative adjectives when describing music in AE books. d and e are the results of BE books, and f and g are the results of SC books (loess smoothing). Source: Own elaboration.

Then, to examine potential language differences, we first conducted a Friedman test to assess whether there were any significant differences among the three languages, and then performed pairwise comparisons using post hoc Wilcoxon tests. The Friedman test showed no difference in the use of positive adjectives among AE, BE, and SC books ( χ 2 (2) = 0.146, p = 0.929), but indicated a significant effect for negative adjectives ( χ 2 (2) = 58.390, p < 0.001). The post hoc Wilcoxon tests showed that negative adjectives were used less often to describe music in SC books than in AE and BE books (all p < 0.001, all effect size r = 1.000).
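For readers unfamiliar with the Friedman test, the following sketch computes its chi-square statistic from yearly values for three corpora, treating each year as a block. It is an illustration under stated assumptions rather than the authors' procedure: the function names and toy data are hypothetical, no tie correction is applied, and the p-value helper is valid only for three conditions, where the statistic has 2 degrees of freedom and the chi-square tail probability reduces to exp(−χ²/2).

```python
import math


def rank_within_block(values):
    """Rank one block (year) from 1..k, averaging the ranks of tied values."""
    ranks = []
    for v in values:
        smaller = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(smaller + (equal + 1) / 2)
    return ranks


def friedman_chi_square(blocks):
    """Friedman statistic for n blocks (rows) by k conditions (columns), no tie correction."""
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, rank in enumerate(rank_within_block(row)):
            rank_sums[j] += rank
    return 12 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3 * n * (k + 1)


def p_value_three_conditions(chi_square):
    """Upper-tail chi-square probability for df = 2 (three conditions only)."""
    return math.exp(-chi_square / 2)


# Hypothetical adjusted frequencies for four years; columns are AE, BE, SC.
toy_blocks = [
    [0.30, 0.20, 0.10],
    [0.31, 0.22, 0.09],
    [0.29, 0.21, 0.11],
    [0.28, 0.19, 0.08],
]
chi_square = friedman_chi_square(toy_blocks)
p_value = p_value_three_conditions(chi_square)
print(chi_square, p_value)
```

Applied to the reported statistic χ²(2) = 0.146, the df = 2 formula gives p ≈ 0.93, in line with the reported p = 0.929 once rounding of the statistic is taken into account.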

In addition, the usage frequency of positive adjectives when describing music in SC books increased suddenly around 1966 and then declined around 1976. This period closely matches China’s Cultural Revolution (Yao, 2000 ), and positive adjectives were more frequently used to describe music in SC books than in AE and BE books in that period (see Table 1 ; χ 2 (2) = 7.091, p = 0.029; post hoc Wilcoxon test of SC and AE: p < 0.05, effect size r = 0.455; SC and BE: p = 0.182, r = 0.284). As shown in Table 2 , we also observed a decline in the usage of positive adjectives in BE language books ( Z = 3.920, p < 0.001, r = 0.620), whereas the decline in AE language books was smaller ( Z = 2.091, p < 0.05, r = 0.331).

For negative adjectives, the usage frequencies of negative adjectives in AE and BE language books declined from 1960 to 2000 (AE: Z  = 3.920, p  < 0.001, r  = 0.488; BE: Z  = 3.845, p  < 0.001, r  = 0.439), but the usage frequency of negative adjectives in SC language books has not changed significantly over time ( Z  = 0.597, p  = 0.550, r  = 0.000). Additionally, we computed the correlation between the adjusted frequency of adjectives in the AE and BE corpora (see Supplemental Materials Table S4 ). The results revealed that the average Pearson correlation coefficient is only 0.141 ± 0.225. Notably, words such as sweet ( r  = 0.839), fine ( r  = 0.762), and solemn ( r  = 0.750) displayed higher correlation coefficients. However, several words exhibited negative correlations, including beautiful ( r  = −0.375), inspiring ( r  = −0.296), and lost ( r  = −0.295).

The historical trends of top-used positive and negative adjectives when describing music in AE, BE, and SC language books are, respectively, presented in Fig. 4 b– g . For positive adjectives in AE language books, the word beautiful was most frequently used to describe music , showing an upward trend; whereas the usage frequencies of the words sweet , fine , and special slightly declined from 1960 to 2000. For negative adjectives in AE language books, the word serious was more frequently used when describing music than the other top words ( χ 2 (4) = 136.669, p  < 0.001). The most frequently used adjectives when describing music in BE books were basically the same as those in AE books, including beautiful , sweet , fine , special , serious , sad , strange , bad , and different . For SC language books, the trends of the top-used positive adjectives are in line with the overall trend, that is, the usage frequency increased sharply from 1966 to 1976. We also observed that negative adjectives were rarely used to describe music in SC books, although the most frequently used negative adjectives were similar to the previous two types of languages, such as serious , sad , and strange . However, it is important to note that the results obtained from Chinese corpora, especially prior to 1978, may be unreliable due to the limited number of simplified Chinese books included in Google Books during that time (see Supplemental Materials Figs. S4 and S5 ), resulting in highly unstable results.

English fictional corpus vs. the overall English corpus

As shown in Fig. 5 , we observed that the positive adjectives were more frequently used to describe music than negative adjectives in English Fictional books ( Z  = 13.394, p  < 0.001, effect size r  = 0.940). Both positive and negative adjectives were more frequently used to describe music in English fictional books than in the overall English books (positive words: Z  = 11.003, p  < 0.001, effect size r  = 0.781; negative words: Z  = 10.862, p  < 0.001, effect size r  = 0.771). These results indicated that emotion-related adjectives were more frequently used in fictional literature.

figure 5

ALL indicates English books, FIC indicates English fictional books, POS indicates positive adjectives, and NEG indicates negative adjectives. Error bands indicate the standard errors. Source: Own elaboration.

The present work examined how music emotions have been described by analyzing the usage frequencies of emotion-related adjectives in Google Books. We observed similarities and differences in the usage frequencies of emotion-related adjectives when describing music across historical periods, languages, and corpora. In terms of similarities, we found that positive adjectives were more frequently used to describe music than negative adjectives in all five corpora: the Google English Books Ngram database, the Google American English Books Ngram database, the Google British English Books Ngram database, the Google Simplified Chinese Books Ngram database, and the Google English Fiction Books Ngram database. This finding supports the Pollyanna hypothesis that human language has a positivity bias (Boucher and Osgood, 1969 ; Dodds et al., 2015 ), which is also reflected in people’s descriptions of music. This result is predictable because positive events have been shown to be more prevalent than negative events in daily life (Rozin et al., 2010 ). In addition, words describing more significant items are employed more frequently in a large corpus (Leising et al., 2014 ). Thus, the similarity we found may also indirectly reflect that people generally prefer positive music, a conclusion also reached by a study using a different methodology (Xu et al., 2021 ).

Of course, not all research supports the aforementioned positivity bias. Brand et al.’s ( 2019 ) study of popular music lyrics over the past 50 years found that the use of negative words in lyrics is increasing while that of positive words is decreasing. This may reflect a negativity bias in the transmission of social information, meaning that negative information is shared more frequently. Therefore, it remains unsettled whether language is predominantly positive or negative. Furthermore, the words in LIWC are not exhaustive, meaning that they cannot represent all the vocabulary used in human languages. More research is therefore needed to interpret the observed positive preference in music description.

In terms of the time trends in English books, we observed that, for both positive and negative adjectives, the usage frequency when describing music was high in the middle of the 19th century and then showed a steady decline. Previous work has found that the use of words related to positive emotions decreased in song lyrics over time (DeWall et al., 2011 ), which is consistent with part of our findings. DeWall et al. ( 2011 ) explained their findings by the rise of psychopathology (Twenge et al., 2010 ), but they did not evaluate the words related to negative emotions. Our findings revealed a general decrease in the emotional description of music over time, regardless of emotional polarity. In fact, a genuine decrease in the literary expression of emotion has been found in the work of Acerbi et al. ( 2013 ). We think these trends may be explained by the cultural shift from collectivism towards individualism (Greenfield, 2013 ). Previous research has shown that music arousing specific emotions, including happiness, surprise, interest, nostalgia, anxiety, love, and spirituality, was more frequent in collectivist cultures than in individualist cultures (Juslin et al., 2016 ). Decreased pursuit of in-group emotional experience may lead to a reduction in the emotional description of music. Another possibility is that the usage of emotional words has changed over the centuries, rather than decreased. However, this appears unlikely to explain the observed decline because we utilized contemporary word lists.

In addition, we still agree with the opinion of DeWall et al. ( 2011 ) that “shifts in song lyrics reflect cultural changes.” Therefore, cultural shifts may also be reflected in changes in music’s emotional description. Indirectly supporting the above viewpoint is the co-occurrence phenomena in simplified Chinese books. We observed that the usage frequency of positive adjectives when describing music in simplified Chinese books suddenly increased in about 1966, and then declined in about 1976. This time period is consistent with China’s Cultural Revolution (Donnithorne, 1972 ). During the above time period, many Chinese people were actively seeking out “positive” objects, which had a noticeable impact on the emotional descriptions of music in Simplified Chinese books. We believe that this phenomenon aligns with the broader perspective on cultural evolution in music (Savage, 2019 ; Youngblood et al., 2023 ). By exploring the relationship between musical characteristics and specific cultural environments, music can serve as an important research tool to understand how cultural features change over time and across different geographic locations.

Changes in the emotional description of music can also be linked to significant social and historical events. Bochkarev et al. ( 2014 ) have argued that the distribution of word frequencies is sensitive to social and historical changes. From 1820 to 1850, we observed that both positive and negative adjectives were used with greater frequency. This suggests an increase in emotional expression during this period, with the upward trend being more pronounced for positive words, which began around 1800 and ended around 1860 (see Fig. 3b ). In the history of music, the early 19th century marked a significant shift from classical music to the early romantic period (Erfurth and Hoff, 2000 ). By 1820, the romantic period was officially underway (Hansen et al., 2016 ). Artistic works from this period were distinguished by their romanticism, which was characterized by a strong subjective flavor, a focus on the expression of personal emotions, and the use of passionate language to convey those emotions. This style may have shaped not only the form of music but also literary works. However, the increase in emotional expression during the romantic period was followed by a prolonged period of decline after 1860. This trend is particularly evident in the long-term and significant decrease in the use of positive emotional words. Morin and Acerbi ( 2017 ) suggest that following the excessive emotional expression of the romantic period, emotional expression returned to average levels. Furthermore, Fig. 3c reveals that the frequency of the negative word serious shows relatively inconsistent changes compared to other negative words. Its usage frequency began to rise rapidly after 1925 and remained at a high level until around 1980, when it started to decline. This period of high usage frequency coincides with the Great Depression, World War II, and the Baby Boom, which were significant historical events (Bochkarev et al., 2014 ). These changes in vocabulary usage appear to be driven by specific major historical events (Acerbi et al., 2013 ).

The main finding in terms of language differences concerns the usage frequency of negative adjectives. We noticed that negative adjectives were used significantly less often to describe music in Simplified Chinese books than in English books. This may be explained by the differences in display rules between Western and Eastern cultures. Eastern cultures usually place a lower value on the display of an individual’s emotions (Matsumoto et al., 2008 ), particularly anger and grief expressions (Matsumoto, 1990 ; Safdar et al., 2009 ; Song et al., 2021 ), and encourage hiding negative emotions (Gross, 2001 ; Rychlowska et al., 2017 ); whereas Western cultures are more focused on the development of the self (Markus and Kitayama, 1991 ) and the expression of emotion (Butler et al., 2007 ). As members of a representative Eastern culture, Chinese people may express fewer negative feelings and describe music with fewer negative words. In fact, except during China’s Cultural Revolution, fewer positive adjectives were also used to describe music in Simplified Chinese books than in English books (see Fig. 4a ). This further suggests that Chinese culture is more reserved when it comes to expressing and describing emotions.

Of course, the observed language differences may also stem from the ideology of music. Sorce Keller ( 2007 ) argues that the narrative content of music inevitably carries ideology, and these underlying ideologies can elicit responses from people and achieve a certain purpose by shaping and conveying the ideology. Therefore, there may be differences in the words used to describe musical emotions under different ideologies. Additionally, the author’s literary style is also influenced by the cultural era in which they live, and this influence is often reflected in the words they use (Knight and Tabrizi, 2016 ). Hence, distinct cultural backgrounds may result in differences in the vocabulary used to describe music. Unfortunately, this study only compares the differences between the three languages, which makes it challenging to determine whether the observed variations are due to cultural differences between the East and West, the impact of the ideology of the country and society, or other reasons (e.g., the differences between the original LIWC lexicon and the SC-LIWC lexicon).

Corpora differences were also observed. We found that emotion-related adjectives were more frequently used to describe music in the fictional English books than the overall English books. Previous research has shown that, in comparison to non-fictional literature, fictional writing tends to utilize more person-descriptive adjectives (Ye et al., 2018 ), and fiction books were more biased toward intuition words (Scheffer et al., 2021 ). Thus, stronger emotive descriptions of music in fictional books are to be expected. These findings support the substantial disparities in words used in various corpora, which affect people’s descriptions of musical emotions.

Notably, this study has several limitations. First, the current study solely examined bigrams composed of an adjective and a target word (e.g., sad music). Other ways of describing music are not covered: for example, single nouns (music is art) and single verbs (the music moved me). Our results also excluded expressions such as “that music is so sad”. Second, this study initially focused on the word “music” as our sole target. However, it’s worth noting that there are other semantically related concepts that could be considered, such as song , melody , and fantasia . While concentrating on a single seed word can help us avoid bias introduced by extraneous concepts, it also limits the scope of our results. Additionally, by only examining a subset of the corpus, we may have encountered unforeseen deviations. Thus, in future research, if we can remove the noise caused by additional seed words, more adjacent concepts should be integrated. Third, the corpora employed in this study were primarily from the previous two centuries, so they can only be used for historical analysis and cannot reflect the current state of society. More corpora, expressions, and languages should be investigated in future studies to better understand people’s attitudes toward music.

Furthermore, we must acknowledge the limitations of the Google Books corpus. First, while GBN is a free database that provides vast amounts of data, there are concerns regarding its reliability and representativeness (Solovyev et al., 2020 ). For instance, recognition errors may occur due to book printing quality and scanning problems, and there is a lack of metadata which makes it difficult to determine whether the content in the database truly belongs to the category it is labeled as. Studies have shown that even datasets labeled as fiction are heavily populated with scientific literature (Pechenick et al., 2015 ). Second, although the number of books included is substantial, it only represents digital scan samples of 6% of the world’s published books and does not encompass all languages, genres, and types of text data (Solovyev et al., 2020 ), therefore it cannot fully represent all published books. Our research is based on text content obtained from this database, and as such, incomplete samples are an inevitable issue. Third, another issue with GBN is the unstable composition of the corpus over time, which introduces bias in diachronic comparisons. GBN encompasses a mixture of different genres, with fluctuating proportions over time. The uncontrolled composition of the GBN corpus leads to an apparent increase in cognitive distortions (Schmidt et al., 2021 ). Therefore, if Google books were constructed using a representative sample of existing books, it would significantly improve its reliability.

It should also be noted that even though Google Books provides some insight into the possible impact of key historical events on vocabulary use, we examined material from only a single database, so our results cannot be considered to demonstrate historical evolution. The use of text in published books is likely to be influenced by the censorship system and the willingness of authors, editors, and publishing houses to produce and distribute certain types of content (Pechenick et al., 2015 ; Salganik et al., 2006 ). Therefore, the materials analyzed in this study may not necessarily reflect true cultural evolution. Furthermore, the language used in books tends to be more conservative than spoken language (Bochkarev et al., 2014 ), meaning that our findings only reflect trends in published materials and may not fully represent changes in human language. Despite these limitations, we believe that the Google Books corpus can still serve as an important basis for scientific research, helping us identify connections between the use of literary terms and social events or behaviors (Müller-Spitzer et al., 2015 ).

The present work examined how music emotions have been described in books and music reviews by analyzing the usage frequencies of emotion-related adjectives when describing music . Positive adjectives were more commonly employed to describe music than negative adjectives in all corpora studied, indicating a positivity bias in descriptions of music. The emotional description of music also changed over time. For example, fewer and fewer emotion-related adjectives were used to describe music in English books over the past two centuries (1800–2000), and the usage frequency of positive adjectives suddenly increased in Simplified Chinese books during China’s Cultural Revolution. Language differences were also observed. For instance, negative adjectives were used significantly less often to describe music in Simplified Chinese books than in English books, reflecting cultural differences. Finally, we found significant disparities in music’s emotional descriptions between the English Fiction Corpus and the overall English Corpus. Of course, more research is needed to understand the reasons behind the historical changes and cultural differences.

Data availability

The data are accessible at http://storage.googleapis.com/books/ngrams/books/datasetsv2.html.

References

Acerbi A, Lampos V, Garnett P, Bentley RA (2013) The expression of emotions in 20th century books. PLoS ONE 8(3):e59030. https://doi.org/10.1371/journal.pone.0059030

Almuhailib B (2019) Analyzing cross-cultural writing differences using contrastive rhetoric: a critical review. Adv Language Lit Stud 10(2):102–106

Berger PL, Luckmann T (1991) The social construction of reality: a treatise in the sociology of knowledge (no. 10). Penguin, UK

Bernatzky G, Presch M, Anderson M, Panksepp J (2011) Emotional foundations of music as a nonpharmacological pain management tool in modern medicine. Neurosci Biobehav Rev 35:1989–1999

Besemeres M (2004) Different languages, different emotions? Perspectives from autobiographical literature. J Multiling Multicult Dev 25(2-3):140–158

Bochkarev V, Solovyev V, Wichmann S (2014) Universals versus historical contingencies in lexical evolution. J R Soc Interface 11(101):20140841

Boucher J, Osgood CE (1969) The Pollyanna hypothesis. J Verbal Learn Verbal Behav 8(1):1–8

Brand C, Acerbi A, Mesoudi A (2019) Cultural evolution of emotional expression in 50 years of song lyrics. Evol Hum Sci 1:e11. https://doi.org/10.1017/ehs.2019.11

Butler EA, Lee TL, Gross JJ (2007) Emotion regulation and culture: Are the social consequences of emotion suppression culture-specific? Emotion 7(1):30

DeWall CN, Pond RS, Campbell Jr WK, Twenge JM (2011) Tuning in to psychological change: linguistic markers of psychological traits and emotions over time in popular US song lyrics. Psychol Aesthet Creat Arts 5(3):200

Dewan S, Ramaprasad J (2014) Social media, traditional media, and music sales. MIS Q 38(1):101–122

Dodds PS, Clark EM, Desu S, Frank MR, Reagan AJ, Williams JR, Danforth CM (2015) Human language reveals a universal positivity bias. Proc Natl Acad Sci USA 112(8):2389–2394

Donnithorne A (1972) China’s cellular economy: some economic trends since the Cultural Revolution. China Q 52:605–619

Downie JS (2008) The music information retrieval evaluation exchange (2005–2007): a window into music information retrieval research. Acoust Sci Technol 29(4):247–255

Erfurth A, Hoff P (2000) Mad scenes in early 19th-century opera. Acta Psychiatr Scand 102:310–313

Ferrer-i-Cancho R (2016) Compression and the origins of Zipf’s law for word frequencies. Complexity 21:409–411

Fürsich E, Avant-Mier R (2013) Popular journalism and cultural change: the discourse of globalization in world music reviews. Int J Cult Stud 16(2):101–118

Gabrielsson A (2001) Emotion perceived and emotion felt: same or different? Music Sci 5(1_suppl):123–147

Greenfield PM (2013) The changing psychology of culture from 1800 through 2000. Psychol Sci 24(9):1722–1731. https://doi.org/10.1177/0956797613479387

Gross JJ (2001) Emotion regulation in adulthood: timing is everything. Curr Dir Psychol Sci 10(6):214–219

Han BJ, Rho S, Jun S, Hwang E (2010) Music emotion classification and context-based music recommendation. Multimed Tools Appl 47(3):433–460

Hansen NC, Sadakata M, Pearce M (2016) Nonlinear changes in the rhythm of European art music. Music Percept 33(4):414–431

Heuser R, Le-Khac L (2011) Learning to read data: bringing out the humanistic in the digital humanities. Vic Stud 54(1):79–86

Huang CL, Lin WF, Seih YT, Lin CT, Lee CL (2015) Simplified Chinese LIWC2015 dictionary. http://www.liwc.net/dictionaries/index.php/liwcdic/indexofsubordinatedocument

Ji LJ, Zhang Z, Nisbett RE (2004) Is it culture or is it language? Examination of language effects in cross-cultural research on categorization. J Pers Soc Psychol 87(1):57–65. https://doi.org/10.1037/0022-3514.87.1.57

Juslin PN, Harmat L, Eerola T (2014) What makes music emotionally significant? Exploring the underlying mechanisms. Psychol Music 42(4):599–623

Juslin PN, Barradas GT, Ovsiannikow M, Limmo J, Thompson WF (2016) Prevalence of emotions, mechanisms, and motives in music listening: a comparison of individualist and collectivist cultures. Psychomusicology 26(4):293–326. https://doi.org/10.1037/pmu0000161

Kahn JH, Tobin RM, Massey AE, Anderson JA (2007) Measuring emotional expression with the linguistic inquiry and word count. Am J Psychol 120(2):263–286

Kallinen K, Ravaja N (2006) Emotion perceived and emotion felt: same and different. Music Sci 10(2):191–213

Kesebir P, Kesebir S (2012) The cultural salience of moral character and virtue declined in twentieth century America. J Posit Psychol 7(6):471–480

Knight GP, Tabrizi N (2016) Using n-grams to identify time periods of cultural influence. J Comput Cult Herit 9(3):1–19

Knoop CA, Wagner V, Jacobsen T, Menninghaus W (2016) Mapping the aesthetic space of literature “from below”. Poetics 56:35–49

Leising D, Scharloth J, Lohse O, Wood D (2014) What types of terms do people use when describing an individual’s personality? Psychol Sci 25(9):1787–1794. https://doi.org/10.1177/0956797614541285

Leki I (1991) Twenty-five years of contrastive rhetoric: text analysis and writing pedagogies. Tesol Q 25(1):123–143

Lieberman E, Michel JB, Jackson J, Tang T, Nowak MA (2007) Quantifying the evolutionary dynamics of language. Nature 449(7163):713–716

Lin Y, Michel J-B, Aiden EL, Orwant J, Brockman W, Petrov S (2012) Syntactic annotations for the Google Books Ngram corpus. In: Li H, Lin C, Osborne M, Lee GG, Park JC (eds) Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics 2012 system demonstrations, Jeju Island, Korea, July, 2012, p 169–174

Lombaard C, Benson IT, Otto E (2019) Faith, society and the post-secular: private and public religion in law and theology. HTS Teologiese Studies/Theological Studies 75(3):1–12

Markus HR, Kitayama S (1991) Culture and the self: Implications for cognition, emotion, and motivation. Psychol Rev 98(2):224

Matsumoto D (1990) Cultural similarities and differences in display rules. Motiv Emot 14(3):195–214

Matsumoto D, Yoo SH, Fontaine J (2008) Mapping expressive differences around the world: the relationship between emotional display rules and individualism versus collectivism. J Cross Cult Psychol 39(1):55–74

Michel JB, Shen YK, Aiden AP et al. (2011) Quantitative analysis of culture using millions of digitized books. Science 331(6014):176–182. https://doi.org/10.1126/science.1199644

Moon R (2014) From gorgeous to grumpy: adjectives, age and gender. Gender Lang 8(1):5–41

Morin O, Acerbi A (2017) Birth of the cool: a two-centuries decline in emotional expression in Anglophone fiction. Cogn Emot 31(8):1663–1675

Motschenbacher H, Roivainen E (2020) Personality traits, adjectives and gender: integrating corpus linguistic and psychological approaches. J Language Discrim 4(1):16–50. https://doi.org/10.1558/jld.40370

Müller-Spitzer C, Wolfer S, Koplenig A (2015) Observing online dictionary users: Studies using Wiktionary log files. Int J Lexicogr 28(1):1–26

Pagel M, Atkinson QD, Meade A (2007) Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature 449(7163):717–720

Pechenick EA, Danforth CM, Dodds PS (2015) Characterizing the Google Books corpus: strong limits to inferences of socio-cultural and linguistic evolution. PLoS ONE 10(10):e0137041

Pennebaker JW, Boyd RL, Jordan K, Blackburn K (2015) The development and psychometric properties of LIWC. University of Texas at Austin, Austin, TX

Roivainen E (2013) Frequency of the use of English personality adjectives: Implications for personality theory. J Res Pers 47(4):417–420. https://doi.org/10.1016/j.jrp.2013.04.004

Roivainen E (2015b) Personality adjectives in Twitter tweets and in the Google books corpus. An analysis of the facet structure of the openness factor of personality. Curr Psychol 34(4):621–625

Roivainen E (2015a) The big five factor marker adjectives are not especially popular words. Are they superior descriptors? Integr Psychol Behav Sci 49(4):590–599. https://doi.org/10.1007/s12124-015-9311-9

Roivainen E (2020) Generational changes in personality: the evidence from corpus linguistics. Psychol Rep 123(2):325–340. https://doi.org/10.1177/0033294118805937

Rozin P, Berman L, Royzman E (2010) Biases in use of positive and negative words across twenty natural languages. Cogn Emot 24(3):536–548

Rychlowska M, Jack RE, Garrod OG, Schyns PG, Martin JD, Niedenthal PM (2017) Functional smiles: Tools for love, sympathy, and war. Psychol Sci 28(9):1259–1270

Safdar S, Friedlmeier W, Matsumoto D, Yoo SH, Kwantes CT, Kakai H, Shigemasu E (2009) Variations of emotional display rules within and across cultures: a comparison between Canada, USA, and Japan. Can J Behav Sci 41(1):1

Salganik MJ, Dodds PS, Watts DJ (2006) Experimental study of inequality and unpredictability in an artificial cultural market. Science 311(5762):854–856

Savage PE (2019) Cultural evolution of music. Palgrave Commun 5(1):1–12

Scheffer M, van de Leemput I, Weinans E, Bollen J (2021) The rise and fall of rationality in language. Proc Natl Acad Sci USA 118:51

Schellenberg EG, Peretz I, Vieillard S (2008) Liking for happy- and sad-sounding music: effects of exposure. Cogn Emot 22(2):218–237

Schmidt B, Piantadosi ST, Mahowald K (2021) Uncontrolled corpus composition drives an apparent surge in cognitive distortions. Proc Natl Acad Sci USA 118(45):e2115010118

Schubert E (2013) Emotion felt by the listener and expressed by the music: literature review and theoretical perspectives. Front Psychol 4:837

Schwartz HA, Eichstaedt JC, Kern ML et al. (2013) Personality, gender, and age in the language of social media: the open-vocabulary approach. PLoS ONE 8(9):e73791

Šeškauskienė I, Levandauskaitė T (2013) Conceptualising music: metaphors of classical music reviews. Stud About Lang 23:78–88

Solovyev VD, Bochkarev VV, Akhtyamova SS (2020) Google books Ngram: problems of representativeness and data reliability. In: Elizarov A, Novikov B, Stupnikov S (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2019. Communications in computer and information science, vol 1223. Springer, Cham

Song SY, Curtis AM, Aragón OR (2021) Anger and sadness expressions situated in both positive and negative contexts: an investigation in South Korea and the United States. Front Psychol 11:579509

Sorce Keller M (2007) Why is music so ideological, and why do totalitarian states take it so seriously? A personal view from history and the social sciences. J Musicol Res 26(2–3):91–122. https://doi.org/10.1080/01411890701361086

Twenge JM, Campbell WK, Gentile B (2012) Male and female pronoun use in US books reflects women’s status, 1900–2008. Sex Roles 67(9–10):488–493. https://doi.org/10.1007/s11199-012-0194-7

Twenge JM, Gentile B, DeWall CN, Ma D, Lacefield K, Schurtz DR (2010) Increases in psychopathology among young Americans, 1938–2007: a cross-temporal meta-analysis of the MMPI. Clin Psychol Rev 30:145–154

Tyler MD, Cutler A (2009) Cross-language differences in cue use for speech segmentation. J Acoust Soc Am 126(1):367–376

Underwood T, Sellers J (2012) The emergence of literary diction. J Digit Humanit 1(2):1–2

Vannini P (2004) The meanings of a star: interpreting music fans’ reviews. Symb Interact 27(1):47–69

Waterloo SF, Baumgartner SE, Peter J, Valkenburg PM (2018) Norms of online expressions of emotion: comparing Facebook, Twitter, Instagram, and WhatsApp. New Media & Society 20(5):1813–1831

Wen X, Xu L, Ye S, Sun Z, Huang P, Qian X (2023) Personality differences between children and adults over the past two centuries: evidence from corpus linguistics. J Res Pers 102:104336

Wilkins R, Gareis E (2006) Emotion expression and the locution “I love you”: a cross-cultural study. Int J Intercult Relat 30(1):51–75

Xu L, Zheng Y, Xu D, Xu L (2021) Predicting the preference for sad music: the role of gender, personality, and audio features. IEEE Access 9:92952–92963

Xu L, Wen X, Shi J, Li S, Xiao Y, Wan Q, Qian X (2021) Effects of individual factors on perceived emotion and felt emotion of music: based on machine learning methods. Psychol Music 49(5):1069–1087

Yao S (2000) Economic development and poverty reduction in China over 20 years of reforms. Econ Dev Cult Change 48(3):447–474

Ye S, Cai S, Chen C, Wan Q, Qian X (2018) How have males and females been described over the past two centuries? An analysis of Big-Five personality-related adjectives in the Google English Books. J Res Pers 76:6–16

Yoon S, Verona E, Schlauch R, Schneider S, Rottenberg J (2020) Why do depressed people prefer sad music? Emotion 20(4):613

Youngblood M, Ozaki Y, Savage PE (2023) Cultural evolution and music. In: Tehrani JJ, Kendal J, Kendal R (eds) The Oxford handbook of cultural evolution. Oxford University Press, p C42S1–C42N14

Zeng R, Greenfield PM (2015) Cultural evolution over the last 40 years in China: using the Google Ngram Viewer to study implications of social and political change for cultural values. Int J Psychol 50(1):47–55

Zentner M, Grandjean D, Scherer KR (2008) Emotions evoked by the sound of music: characterization, classification, and measurement. Emotion 8(4):494

Acknowledgements

The authors use this opportunity to thank the Humanities and Social Sciences Youth Foundation, the Ministry of Education of the People’s Republic of China (Grant number 22YJC840026), and the Start-up Foundation of Zhejiang University of Technology (Grant number 2022161080009) for the financial support of this paper.

Author information

Authors and affiliations.

Department of Psychology, College of Education, Zhejiang University of Technology, 310023, Hangzhou, China

Liang Xu, Min Xu, Zaoyi Sun & Hongting Li

Department of Psychology and Behavioral Sciences, Zhejiang University, 310030, Hangzhou, China

Liang Xu, Zehua Jiang, Xin Wen, Yishan Liu & Xiuying Qian

Contributions

LX: conceived and designed the experiment; methodology; analyzed and interpreted the data; writing-original draft preparation; contributed analysis tools and data. MX: formal analysis; writing-original draft preparation. ZJ: writing-original draft preparation. XW: conceived and designed the experiment; analyzed and interpreted the data. YL: analyzed and interpreted the data; validation. ZS: resources; writing-review and editing. HL: supervision; writing-review and editing. XQ: supervision; writing-review and editing.

Corresponding authors

Correspondence to Liang Xu, Hongting Li or Xiuying Qian.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

Ethical approval did not apply in this study as the research did not include any human or animal participants.

Informed consent

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article.

Xu, L., Xu, M., Jiang, Z. et al. How have music emotions been described in Google books? Historical trends and corpus differences. Humanit Soc Sci Commun 10, 346 (2023). https://doi.org/10.1057/s41599-023-01853-1

Received: 28 January 2023

Accepted: 12 June 2023

Published: 22 June 2023

DOI: https://doi.org/10.1057/s41599-023-01853-1
