Cortex

Volume 69, August 2015, Pages 131-140

Research report
Investigating the brain basis of facial expression perception using multi-voxel pattern analysis

https://doi.org/10.1016/j.cortex.2015.05.003

Abstract

Humans can readily decode emotion expressions from faces and perceive them in a categorical manner. The model by Haxby and colleagues proposes a set of brain regions, each of which takes on a specific role in face processing. One key question is how these regions directly compare to one another in successfully discriminating between various emotional facial expressions.

To address this issue, we compared the predictive accuracy of all key regions from the Haxby model using multi-voxel pattern analysis (MVPA) of functional magnetic resonance imaging (fMRI) data. Regions of interest were extracted using independent meta-analytical data. Participants viewed four classes of facial expressions (happy, angry, fearful and neutral) in an event-related fMRI design, while performing an orthogonal gender recognition task.

Activity in all regions allowed for robust above-chance predictions. When directly comparing the regions to one another, the fusiform gyrus and the superior temporal sulcus (STS) showed the highest accuracies.

These results underscore the role of the fusiform gyrus as a key region in perception of facial expressions, alongside STS. The study suggests the need for further specification of the relative role of the various brain areas involved in the perception of facial expression. Face processing appears to rely on more interactive and functionally overlapping neural mechanisms than previously conceptualised.

Introduction

Human faces are among the most complex and at the same time most frequently encountered stimuli in our daily life, carrying information important for survival and gene propagation (Little, Jones, & DeBruine, 2011). The human expertise in reading faces allows us to decode the static as well as changeable information they convey (Bruce and Young, 1986, Haxby et al., 2000). While static features include identity, gender, attractiveness or age, changeable features include gaze direction, utterance of speech and emotional expressions. Since humans can readily decode emotion expressions from faces, it should in turn be possible to decode how this ability is represented in the brain.

A seminal model of face perception, based on the distinction of invariant and changeable features, proposes distinct modules in the brain, each in charge of carrying out different tasks when observers perceive faces (Haxby et al., 2000). According to the Haxby model, the fusiform gyrus is most important for processing the invariant aspects of the face, such as identity or gender, while the superior temporal sulcus (STS) is held to be in charge of processing changeable features, like gaze or emotion expression (Hoffman & Haxby, 2000). This changeable information may be conveyed not only by dynamic stimuli like video clips, but crucially also by static configurations of muscle movement as represented by pictures of expressive faces (Haxby et al., 2000). Together with the occipital face area, from which they receive input signals, the fusiform gyrus and the STS constitute the core face network. These core regions are supported by other areas located throughout the brain, which constitute the extended face system and contribute to the more specific demands of face processing in a task-dependent manner. For example, the amygdala and the insula are thought to be recruited when processing emotion expressions (Haxby et al., 2000, Hoffman and Haxby, 2000).

This model and the dissociations it predicts are grounded in cognitive theories (Bruce and Young, 1986, Bruce and Young, 2012) and are supported by a number of neurological (Bruyer et al., 1983, Duchaine et al., 2003) and neuroimaging (Hoffman and Haxby, 2000, Winston et al., 2004) studies.

However, it remains an open question to what degree the neural basis of expression processing is restricted to STS, amygdala and insula (Calder & Young, 2005). While the Haxby model itself states that interactions do take place between regions, its emphasis is on functional dissociations.

That substantial overlap of function may indeed be present, at least at the macroscopic level, is reflected in the neuroimaging literature, which often reports pronounced emotion effects in the fusiform gyrus (Fox et al., 2009, Ganel et al., 2005, Kawasaki et al., 2012; see Fusar-Poli et al., 2009 and Sabatinelli et al., 2011 for meta-analyses). For example, functional magnetic resonance imaging (fMRI) studies using adaptation designs found that both the fusiform gyrus and the STS are responsive to changes in identity and expression (Fox et al., 2009, Ganel et al., 2005).

Fox et al. (2009) investigated the occipital and fusiform face areas and the STS regarding their responsiveness to changes in expression versus identity of faces, using an adaptation design. Participants were presented with morphed stimuli changing either along an identity or an expression dimension. The occipital face area showed release from adaptation whenever any structural change occurred in a face, indicating that this region codes for low-level features. Both the fusiform face area and the STS showed release from adaptation when participants experienced a change of either identity or expression (e.g., the face switched from angry to happy); that is, these regions reacted only when a categorical boundary in subjective experience was crossed. Since both regions were responsive to both feature dimensions, the results suggest that the dissociation of expression and identity processing may not be as pronounced as previously thought.

Furthermore, re-entrant models of emotion processing state that areas in the ventral stream, including the fusiform gyrus, receive top-down input from regions like the amygdala to allow further detailed processing (Adolphs, 2002, Vuilleumier, 2005). This is often reflected in parallel activity in the amygdala and the fusiform gyrus during processing of emotional material (Sabatinelli, Bradley, Fitzsimmons, & Lang, 2005) and in the fact that both areas show higher activity during perception of emotional than of neutral expressions (Fusar-Poli et al., 2009, Sabatinelli et al., 2011). While re-entrant models add to our understanding of how facial expressions are processed in the brain, they leave open the question of how different brain areas, such as fusiform gyrus, STS or amygdala, directly compare to each other regarding their relative roles in expression perception.

Most past fMRI studies addressing face processing in the fusiform gyrus and the STS have analysed task-induced increments of activity using univariate analyses. This approach assumes that differences in information content between experimental conditions are coded in linear increments or decrements in most or all voxels in the brain areas under investigation. However, advances in the analysis of fMRI data (Haxby, 2001) have made it possible to investigate the multi-voxel patterns associated with perceiving emotional material in complex scenes (Baucom, Wedell, Wang, Blitzer, & Shinkareva, 2012), voice prosody (Ethofer, van de Ville, Scherer, & Vuilleumier, 2009), facial expressions (Harry et al., 2013, Petro et al., 2013, Said et al., 2010) or modality-independent emotion representations (Peelen, Atkinson, & Vuilleumier, 2010).
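
To make the contrast concrete, the following minimal Python sketch (illustrative only, not the analysis code of this study) simulates a region of interest in which two conditions share the same average activity level but differ in their voxel-wise response pattern; a univariate test on the region mean and a cross-validated pattern classifier are then applied to the same simulated data. All dimensions, seeds and values are invented for the example.

    import numpy as np
    from scipy import stats
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    n_per_cond, n_vox = 40, 200  # invented trial and voxel counts

    # Two conditions with identical region-average activity but distinct
    # spatial response patterns across voxels.
    pattern_a = rng.standard_normal(n_vox)
    pattern_b = rng.standard_normal(n_vox)
    pattern_a -= pattern_a.mean()  # match the region-average response exactly
    pattern_b -= pattern_b.mean()
    cond_a = pattern_a + rng.standard_normal((n_per_cond, n_vox))
    cond_b = pattern_b + rng.standard_normal((n_per_cond, n_vox))

    # Univariate logic: compare the region-average signal between conditions.
    t, p = stats.ttest_ind(cond_a.mean(axis=1), cond_b.mean(axis=1))
    print(f"region-mean t-test: p = {p:.2f}")

    # Multivariate logic: ask whether the full voxel pattern separates the
    # conditions, using cross-validated classification.
    X = np.vstack([cond_a, cond_b])
    y = np.array([0] * n_per_cond + [1] * n_per_cond)
    acc = cross_val_score(LinearSVC(), X, y, cv=5).mean()
    print(f"pattern-classification accuracy: {acc:.2f} (chance = 0.50)")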

Multi-voxel pattern analysis (MVPA) may help to compare regions in regard to how much information they carry about expressions of emotion. While MVPA has previously been used to study facial expression perception, these studies focused only on single anatomical regions from the Haxby model (STS: Said et al., 2010; fusiform gyrus: Harry et al., 2013) or investigated the role of V1, an early visual area not traditionally incorporated in face processing models (Petro et al., 2013).
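
As a further illustration of how such region-wise information content is typically quantified (again, not the authors' code), the sketch below runs a cross-validated four-way classification on simulated trial-by-voxel patterns from a single, hypothetical region of interest; with purely random data the accuracy is expected to sit near the 25% chance level.

    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    rng = np.random.default_rng(0)
    n_trials, n_voxels = 160, 300                    # hypothetical ROI size
    X = rng.standard_normal((n_trials, n_voxels))    # simulated trial x voxel patterns
    y = np.repeat(["happy", "angry", "fearful", "neutral"], n_trials // 4)

    # Standardise features and train a linear classifier within each fold.
    clf = make_pipeline(StandardScaler(), LinearSVC())
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    acc = cross_val_score(clf, X, y, cv=cv)
    print(f"mean decoding accuracy: {acc.mean():.2f} (chance = 0.25)")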

The study by Said et al. (2010) focused on face processing in the STS, using short videos depicting expressions of seven basic emotions. The authors showed that activity in both anterior and posterior parts of the STS can be used to successfully predict which expression a participant has seen. However, the study did not compare the STS with other regions from the Haxby model, because its high-resolution acquisition protocol prohibited full-brain coverage. The role of the fusiform gyrus was targeted by Harry et al. (2013). The authors used pictures of six facial expressions and were able to successfully predict the perception of most of them from activity in both the left and the right fusiform face area. Together with previous univariate analyses (Fox et al., 2009, Ganel et al., 2005, Kawasaki et al., 2012), this MVPA study provides evidence that the fusiform gyrus is indeed sensitive to information about emotional expressions.

We aim to investigate whether the claim made by the Haxby model, namely that STS, amygdala and insula predominantly code for expressions of emotion, can be corroborated using MVPA by directly comparing classification success across all key areas specified by the model. If so, STS, amygdala and insula should show better classification performance than the other regions specified by the model, such as the inferior occipital gyrus, the fusiform gyrus, the intraparietal sulcus and anterior temporal regions. On the other hand, a more interactive model might suggest the contribution of other regions to the perception of emotional expressions, in particular the fusiform gyrus (Calder and Young, 2005, Fox et al., 2009). Comparison of classification accuracies across regions allows for a comprehensive test of these alternatives.
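
The logic of such a comparison can be sketched as follows: per-participant decoding accuracies for each region are first tested against the chance level and then compared between regions. The Python sketch below uses invented accuracy values for 14 hypothetical participants and only a subset of region labels, purely for illustration; it is not the statistical procedure reported in the Methods.

    import numpy as np
    from scipy import stats

    regions = ["inferior occipital gyrus", "fusiform gyrus", "STS",
               "amygdala", "insula"]
    rng = np.random.default_rng(3)
    # 14 participants x regions; accuracies are invented for illustration only.
    acc = 0.25 + np.abs(rng.normal(0.05, 0.03, size=(14, len(regions))))

    # One-sample t-tests against the 25% chance level, per region.
    for name, a in zip(regions, acc.T):
        t, p = stats.ttest_1samp(a, 0.25)
        print(f"{name:>26s}: M = {a.mean():.3f}, t(13) = {t:.2f}, p = {p:.3g}")

    # Paired comparison between two regions of theoretical interest.
    t, p = stats.ttest_rel(acc[:, regions.index("fusiform gyrus")],
                           acc[:, regions.index("STS")])
    print(f"fusiform gyrus vs STS: t(13) = {t:.2f}, p = {p:.3g}")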

The present study builds on this previous work and employs an event-related fMRI design with happy, angry, fearful and neutral facial expressions. This selection was made to keep the duration of the experiment within reasonable limits; these expressions are also among those most often used in studies of emotional face perception. We directly compare the response patterns in all key brain regions of the Haxby model and quantify how well their activity patterns can be used to predict the presence of a facial expression. This provides a measure of the representational content of each region with regard to facial expression. Independent meta-analytical data from the Neurosynth database (Yarkoni, Poldrack, Nichols, van Essen, & Wager, 2011), which aggregates activation data from thousands of previous fMRI studies, were used to define the respective regions of interest.
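
A minimal sketch of this region-of-interest step, assuming a recent nilearn installation and a meta-analytic association map that has already been obtained: the map is thresholded into a binary mask, and a masker then extracts trial-by-voxel matrices for decoding. The synthetic volume, the z-threshold of 6 and the commented variable names below are placeholders, not values from this study.

    import numpy as np
    import nibabel as nib
    from nilearn import image
    from nilearn.maskers import NiftiMasker

    # A synthetic volume stands in for a downloaded meta-analytic z-map here.
    z_data = np.random.default_rng(2).normal(scale=4.0, size=(20, 20, 20))
    meta_zmap = nib.Nifti1Image(z_data, affine=np.eye(4))

    # Threshold the map into a binary region-of-interest mask.
    roi_mask = image.math_img("img > 6", img=meta_zmap)
    masker = NiftiMasker(mask_img=roi_mask, standardize=True)
    print(f"ROI size: {int(image.get_data(roi_mask).sum())} voxels")

    # In a real pipeline, trial-wise beta images would be transformed here to
    # obtain the trial-by-voxel matrix fed into the classifiers sketched above:
    # X = masker.fit_transform(trialwise_beta_images)   # hypothetical input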

Section snippets

Participants

Fourteen healthy participants (7 female; age: M = 25.4, SD = 2.6) took part in the study. Participants had normal or corrected-to-normal vision and reported no history of psychiatric or neurologic illness. All participants gave written informed consent prior to data acquisition. The study was approved by the Ethics Board of the Department of Psychology, University of Bielefeld, and was conducted in accordance with the Declaration of Helsinki.

Stimuli

Faces were derived from the MPI Faces database (

Behavioural data

Overall, mean accuracy in the gender discrimination task was 89% (SD = 4.0). There were significant effects for expression category [F(3,39) = 13.07; p < .001; ηp2 = .50] and the face × gender interaction [F(3,39) = 19.12; p < .001; ηp2 = .60]. Fearful and angry expressions were responded to less accurately than happy and neutral faces (all p < .008; all d > .81). Only within angry expressions was there a difference in accuracy due to stimulus gender, male faces being recognized

Discussion

This study compared the relative importance of seven brain regions from the Haxby model for discriminating facial expressions, using MVPA of fMRI data. As could be expected from their theoretically postulated general involvement in face processing, consistent above-chance accuracies were found for all target regions, as well as for a whole-brain mask. When directly comparing regions, the fusiform gyrus, the STS and anterior temporal regions showed the highest accuracy values. Of note, the fusiform gyrus

Acknowledgements

Research was funded by DFG grant KI1286/4-1 and the DFG, Cluster of Excellence 277 “Cognitive Interaction Technology”. MRI facilities used for the project are supported by the “Gesellschaft für Epilepsieforschung e.V.”. We would like to thank Sebastian Schindler for helpful discussion.

References (42)

  • D. Sabatinelli et al.

    Parallel amygdala and inferotemporal activation reflect emotional intensity and fear relevance

    NeuroImage

    (2005)
  • D. Sabatinelli et al.

    Emotional perception: meta-analyses of face and natural scene processing

    NeuroImage

    (2011)
  • P. Vuilleumier

    How brains beware: neural mechanisms of emotional attention

    Trends in Cognitive Sciences

    (2005)
  • V. Bruce et al.

    Understanding face recognition

    British Journal of Psychology

    (1986)
  • V. Bruce et al.

    Face perception

    (2012)
  • A.J. Calder et al.

    Understanding the recognition of facial identity and facial expression

    Nature Reviews Neuroscience

    (2005)
  • B.C. Duchaine et al.

    Normal recognition of emotion in a prosopagnosic

    Perception

    (2003)
  • N.C. Ebner et al.

    FACES-a database of facial expressions in young, middle-aged, and older women and men: development and validation

    Behavior Research Methods

    (2010)
  • P. Fusar-Poli et al.

    Functional atlas of emotional faces processing: a voxel-based meta-analysis of 105 functional magnetic resonance imaging studies

    Journal of Psychiatry & Neuroscience

    (2009)
  • C. van der Gaag et al.

    The BOLD signal in the amygdala does not differentiate between dynamic facial expressions

    Social Cognitive and Affective Neuroscience

    (2007)
  • K.M. Gothard et al.

    Neural responses to facial expression and face identity in the monkey amygdala

    Journal of Neurophysiology

    (2007)