Questioning the Hypothetical Persona in Music

May 30, 2026
25 min read

When listeners experience expressiveness in music, the encounter feels strikingly immediate, like observing someone openly displaying an emotion rather than hearing a clinical description of one. The expressiveness seems to reside within the sounds themselves, not beyond them. This raises a puzzle: if the emotion appears to belong to someone, yet the music itself is not a sentient being, whose emotion is it?

Traditional theories have tried to pin the emotion on the composer, the performer, or the listener, but all face well-known problems. The expression theory, which claims that the composer (or performer) pours personal feelings into the work through the act of creation, runs into difficulty. Composers do sometimes match their materials to their moods, but they do so by drawing on the music's inherent expressive resources, not by infecting the notes with raw emotion. Arousalism, or emotivism, fares no better. It holds that a piece's expressiveness is a power to stir feelings in the audience. Listeners do feel sad in response to sad music, but the music is not sad because it calls forth that reaction; the response is to a sadness the music already possesses, so the response cannot be what makes the music sad.

Other candidates are the characters represented in a work, such as Rodolfo, Mimi, and the rest in Puccini's La Bohème. But we routinely distinguish what a work expresses from what its characters feel. Moreover, purely instrumental music presents no characters at all, yet such pieces can brim with expression.

By this process of elimination, one arrives at a different proposition: the listener imagines a person who undergoes the emotions expressed in the music. Since the owner of those feelings cannot be the composer, performer, listener, or a represented character, the listener must project that owner through an act of imagination. In hearing sounds as emotionally expressive, we animate the music and hypothesize an abstract or virtual persona. The tensions, movements, and resolutions we hear map onto this persona's actions and sensations. This position is known as "hypothetical emotionalism."

Does this reduce absolute music to program music? Not necessarily. It would if the listener merely used the music to spark private daydreams. That is a common reaction, one that often leads to inattention, not insight. Hypothetical emotionalism does not endorse such wandering. The listener hypothesizes a persona but must then track the music's unfolding structure closely, because the persona's experiences are revealed precisely through that structure. The "dramatic narrative" must be discovered in the music and must respond to every subtle detail of its articulation. The story the listener constructs should map directly onto every part of the piece; it is the story of the work's formal and expressive progression, which are inseparable. This hypothesizing, far from pulling the listener away, opens a path to the fullest understanding of the piece's character and, if present, its unity and closure.

This approach emerges as better than crude formalism, which would dismiss expressiveness altogether. Steering clear of the old pitfalls about composer, performer, and listener feelings, it aligns with the common impression that expressiveness resides directly in the music itself.

How does hypothetical emotionalism compare with its main rival, "appearance emotionalism"? The latter holds that musical material can be literally expressive by presenting sounds with emotion-characteristic qualities: music is sad-sounding in the way a basset hound looks sad or a tragic mask appears sorrowful. Appearance emotionalism denies that music involves occurrent feelings needing an owner. If imagination is needed to hear expressiveness, it requires no more than is involved in seeing a mask as bearing a human expression or a willow tree as drooping sadly. This natural "animating tendency," as Peter Kivy calls it, is just our everyday mode of experiencing the world.

Four aspects deserve attention when comparing these theories, though only the last is decisive:

First, hypothetical emotionalists rightly note that a listener's grasp of overall structure depends both on formal features and on the expressive pattern developed throughout a piece. This observation does not, however, depend on hypothetical emotionalism; appearance emotionalism can endorse it just as easily.
Second, expressiveness often stirs an emotional reaction. It is straightforward to react to a person's feelings, even a hypothesized person's. But why react to mere appearances? The appearance emotionalist replies that expressive appearances are themselves evocative. People can find them moving, though the reaction tends to lack the force of responses to real or fictional persons because the beliefs and desires relevant in those contexts are absent here. This is consistent with the way listeners actually respond to sad music and keep returning to works that induce sadness.
Third, musical works often present contrasting emotions in succession. Listeners expect development, connection, and integration. Hypothetical emotionalism explains this: one looks for pattern and order, much as one would in a person's unfolding actions or feelings. Appealing to a persona gives a narrative thread tying together the expressive sequence. So how does appearance emotionalism avoid treating these as a mere list of disconnected moods? Two responses apply, depending on the work itself. When expressive progression is central, the composer shapes and orders the material deliberately, making it reasonable to look for connection and reference to human feeling, with no fictional narrative required. When expressiveness is not the main concern, attention to formal features alone suffices.
Fourth, only emotions with distinctive behavioural displays can be captured in appearances. Those that require particular cognitive content (shame, pride, envy, patriotism, hope) cannot. Sadness and happiness, though, can be portrayed. Appearance emotionalism therefore restricts musical expression to a narrower range of emotions. Certain pieces might hint at higher emotions by setting expressive contexts where those emotions naturally arise. But if such expression largely depends on musical presentation of cognitive attitudes, hypothetical emotionalism can better explain it. Imagining a persona involves attributing beliefs, desires, intentions, and attitudes, even if the specific content is indistinct. This makes it possible for music to express richer, cognitively complex emotions. Hypothetical emotionalism thus wins the debate if music often conveys these "higher" feelings.

One's view of the two theories likely depends on how frequently and explicitly music expresses complex emotional states. I am skeptical about hypothetical emotionalists' claims for the centrality and objectivity of such expressions.

Emotional Listening and Imagination

Claims made on behalf of hypothetical emotionalism for instrumental music come in several versions. The weakest: listeners can indeed imagine a persona whose emotional story unfolds through the music. This is true but trivial. Hardly stronger is the observation that some listeners take this approach. A stronger empirical claim holds that listeners typically hear music as the expression of a persona, a claim I believe to be false. More interesting is the prescriptive form: listeners should hypothesize a persona to understand the music properly. This version does not deny that other styles of listening are viable. The strongest version permits no exception: listeners must hear the music this way, and only this route leads to full comprehension. Advocates who say music expresses complex, cognitively demanding emotions typically endorse this position.

Scope matters, too: do the claims apply to all instrumental works, or only to some? And for adherents prescribing the approach, do they insist on it for all relevant pieces, or only for those especially dramatic or expressive?

Proponents may differ on both questions, because they come to the theory from different backgrounds. Musicologists adopt it to counter formalism and excessive technical focus; they want to humanize criticism by stressing the link between formal and expressive elements. Philosophers are often more concerned with opposing appearance emotionalism so as to give a central place to higher emotions in musical expression.

Distinguishing Hypothetical Author from Hypothetical Persona

Hypothetical emotionalism has become especially prominent. It should be distinguished from analogous approaches used with other art forms. In literary interpretation, readers sometimes speculate about an implied, apparent, or hypothetical author. Such theories focus on a creator outside the work, whereas musical hypothetical emotionalism imagines a persona inside the work's world. This difference matters: what we infer about the narrator of a novel, for example, cannot automatically be applied to the implied author, who may hold broader moral or aesthetic views lost on the narrator.

Fred Maus's version of musical hypothetical emotionalism is clear on this point. Certain qualities attributed to the musical persona, such as surprise, could not reasonably apply to the work's composer, real or hypothetical. Moreover, what we hear in the music is indefinite in ways that speculating about the composer is not. We might imagine a persona without attributing clear generative actions, but when we consider the composer we think of a single individual whose actions produce the entire structure. These topics are raised not to accuse proponents of confusion, but for the sake of clarity — it is easy to lump all forms of hypotheticism together and miss the distinction.

A Closer Look at Hypothetical Emotionalism

The version I want to criticize is the strongest: to fully understand and appreciate some expressive works of absolute music, the argument goes, the listener must imagine a persona and hear the music as the actions, feelings, and experiences of that persona. In other words, such musical works must be heard as being about the emotional life of an imagined presence inside them.

Thousands of words have been written about what it might mean for a musical work to be about something. Four principal answers present themselves: (a) the composer intended it; (b) conventional practice in artistic appreciation demands such an approach; (c) enough suitably acculturated listeners would perceive that subject upon reflection; or (d) full understanding of the work is impossible without invoking that subject.

I will focus on the last condition and treat the first three briefly. There is scant evidence that composers intend listeners to hypothesize a persona in instrumental works. In any case, (a) depends on the viability of (b) or (c). A robust notion of intention is required here. A composer can possess such an intention only if she can embody it publicly in the music for the listener to acknowledge. Merely entertaining the thought that the work is about a persona and hoping it will be recognized is insufficient. Achieving the relevant intention presupposes either public conventions enabling its communication through the musical work, or widespread recognition of the intention. Thus (a) presupposes the possibility of (b)—practices calling on the listener to hypothesize a persona—or (c)—general listener agreement that such a persona is imagined, with coinciding descriptions of the persona’s emotional life. Yet it appears straightforwardly false that public conventions or consensus exist on these matters. Proponents of hypothetical emotionalism typically offer their analyses as novel, revealing expressive subtleties that have generally been overlooked.

Condition (d) is crucial. In relation to hypothetical emotionalism, it holds that to comprehend the music fully, the listener must imagine a persona within it and, to follow the music with understanding, must hear the actions, experiences, and sensations of that persona. This corresponds to what I earlier called the strong version of hypothetical emotionalism: one must hypothesize a persona to grasp the subtle expressiveness of at least some musical works. As already noted, (d) can plausibly be met if many works express higher emotions, because such expression depends on a cognitively rich context that might only be supplied by making-believe that the music unfolds parallel to the emotional life of an agent with beliefs, attitudes, and desires. The issue, as indicated before, is not whether the listener can invent a story about a persona’s actions or experiences that matches the music—that can be done easily. The question is whether this mode of listening provides access to an understanding that is genuinely of the music and unobtainable by any other approach. To show (d) is satisfied, one must argue that invoking a persona is essentially implicated in an understanding reaction to the music. This involves establishing that what is imagined is neither idiosyncratic nor irrelevant to musical understanding, and that the work itself invites, controls, and limits what might be hypothesized, so this approach leads to a revelatory experience no alternative can match.

Turning to (d), I concentrate on recent work by Jenefer Robinson. She outlines a version of hypothetical emotionalism in a paper coauthored with Gregory Karl (1995). They argue “that the expressive structure of some pieces of music can be interpreted as an unfolding of the psychological experience of [a] musical persona over time. … [The] formal coherence of the music often consists precisely in its embodying a coherent unfolding of psychological states in a musical persona” (405). The idea is developed in a detailed discussion of Shostakovitch’s Tenth Symphony, which they hear as expressing false hope: “The plot archetype to which Shostakovitch’s Tenth Symphony conforms is conventionally interpreted as a progression from dark to light or struggle to victory (adversity to salvation, illness to health, etc.)” (406). They continue:

“[V]ery often the formal and expressive threads of a work’s structure are so finely interwoven as to be inextricable. Thus, in establishing our case for the musical expression of hope, we had to discuss not only the contours and conventional associations of our focal passage, but also its role in patterns of thematic transformation and quotation spanning the entire symphony. To demonstrate that our focal passage expresses hope we had to engage in a formal analysis of the work as a whole. Conversely, we suggest that in a complexly integrated work like Shostakovitch’s Tenth, formal and expressive elements of musical structure are so thoroughly interdependent that the formal function of particular passages can often only be accurately described in expressive terms. Thus there is no ‘strictly formal’ or purely musical explanation for why our focal passage unfolds as it does in the central section of the third movement; its formal function just is to express the cognitively complex emotion of hope.” (412-3)

Thus far the account is familiar, though it blends the philosopher’s preoccupation with higher emotions and the musicologist’s concern with the intimate bond between the work’s expressive character and its large-scale structure. Attention to the structure of an entire work, interpreted as the emotional experience of a persona through time, provides enough cognitive content for the musical expression of complex emotions such as false hope. Formal coherence, it is suggested, often consists precisely in the work’s embodying a succession of connected psychological states attributed to a persona.

Now, hypothetical emotionalism faces the problem of establishing that the listener’s making-believe a persona and a cognitive context arises directly from an appropriate experience of the work’s properties.

Without this, the listener’s imaginative contribution becomes gratuitous and likely idiosyncratic. In “The Expression and Arousal of Emotion in Music” (1994), Robinson proposes a solution to this challenge. She argues for an intimate connection between primitive, largely noncognitive responses aroused in the listener by the music and the process of imaginative engagement that leads the listener to construct a narrative about the experiences of a persona residing in the music. While she allows that sad music might make listeners feel sadness, Robinson does not believe that music expresses cognitively complex emotions simply by evoking such responses. Yet she maintains that some “primitive” feelings are predicable of music because they are evoked by it. She also holds that the thoughtless reactions kindled by music feed and direct the hypothesizing that reveals within the music a persona experiencing cognitively complex emotions like “cheerful confidence turning to despair.”

According to Robinson, qualities attributed to music because of its power to arouse corresponding feelings include tension, nervousness, uncertainty, relief, disturbance, unease, surprise, reassurance, and relaxation. Music is tense just insofar as it tends to awaken that response in a listener familiar with the musical idiom. Whereas emotions are usually rich in cognitive content—involving beliefs, desires, and attitudes—the evocation of unease or relief by music requires little cognitive involvement, so the response is triggered more or less automatically. The listener must listen with expectations tailored to the style, but these are not typically called to mind. The response that concerns Robinson is usually an unthinking reaction—a somatic feeling. She writes:

“Music that disturbs and unsettles us is disturbing, unsettling music. Modulations that surprise us are surprising. Melodies that soothe us are soothing. … [I]t seems to me that the expression of a feeling by music can sometimes be explained straightforwardly in terms of the arousal of that feeling. However, the feelings aroused ‘directly’ by music are not stabs of pain or feelings of unrequited passion, but more ‘primitive’ feelings of tension, relaxation, surprise, and so on.” (19)

What interests her, Robinson notes, is

“the way in which the simple feelings ‘directly’ aroused by music can contribute to the imaginative expression of more complex emotions. … Now, just as the formal structure of a piece of music can be understood in terms of the arousal of such feelings as uncertainty, uneasiness, relaxation, tension, relief, etc., so too can we understand the expressiveness of that piece of music in terms of the arousal of those and similar feelings. … If a piece of music is heard as successively disturbing and reassuring, or as meandering uncertainly before moving forward confidently, or as full of obstacles, this is at least in part because of the way the music makes us feel. Disturbing passages disturb us; reassuring ones reassure. Passages that meander uncertainly make us feel uneasy: it is not clear where the music is going. Passages that move forward confidently make us feel satisfied: we know what is happening and seem to be able to predict what will happen next. Passages that are full of obstacles make us feel tense and when the obstacles are overcome, we feel relieved. It is important to notice that the feeling expressed is not always the feeling aroused: an uncertain, diffident passage may make me uneasy; a confident passage may make me feel reassured or relaxed. … As I listen to a piece which expresses serenity tinged with doubt, I myself do not have to feel serenity tinged with doubt, but the feelings I do experience, such as relaxation or reassurance, interspersed with uneasiness, alert me to the nature of the overall emotional expressiveness in the piece of music as a whole. … [T]he emotional experience aroused by the music is essential to the detection of the emotional expressiveness in the music itself.

At the same time, the emotions aroused in me are not the emotions expressed by the music.” (19-20)

Robinson clearly realizes that if listening to an extended piece reveals only a pattern of tensings and relaxings, hers is no advance over Kivy’s “contour” theory—a version of appearance emotionalism. In her view, that theory cannot account for the expression of cognitively complex emotions, since none of them has a distinctively articulated contour. Her criticisms of Jerrold Levinson (1982 and 1990) and Kendall L. Walton (1988) reveal what she thinks is needed for a more adequate account. She faults these authors, who agree that music expresses cognitively complex emotions, for failing to explain adequately how music could contain or convey the cognitive content required for such expression (and imaginative evocation). Robinson apparently believes that the largely noncognitive feelings aroused by music—or, rather, their accumulation and interrelation as generated by the detail of an extended work—suggest cognitive complexes and contents to be attributed to a persona hypothesized as the subject of this musical narrative. It is the succession of thoughtlessly automatic reactions that first animates and then controls the imaginative involvement revealing higher forms of musical expression to the listener.

I begin criticizing Robinson’s position by reviewing her suggestion that musical tension and similar properties consist in the music’s power to arouse a corresponding automatic response in the listener. It can be argued that the relevant properties are intrinsic, not causal powers. The succession of discords and concords is the pattern of harmonically generated tensings and relaxings. The initial use of terms like “tense,” “uncertain,” and “relaxed” for music might have been suggested by the sensational character of our reactions, but I doubt that current usage presupposes those responses. If these properties belong to the music, we should be able to observe and recognize them without experiencing them. Indeed, this is often the case. One can correctly attribute a pattern of tension and relaxation without undergoing an echoing feeling. I hear discordant major thirds in medieval music as high points of tension, yet I doubt I feel tense in that awareness. Where a musical style is boringly predictable, I might be quite indifferent while still being aware of the tension of, say, a prolonged dominant seventh leading to a tonic triad. And when I listen again to a well-known work and gain a better understanding of its tensing and relaxing pattern, I do not necessarily experience feelings mirroring that pattern. I accept that we must observe the flux of tensings and releasings in the musical fabric to recognize its expressive and formal character. I reject Robinson’s stronger claims: for instance, “[T]he emotional experience aroused by the music is essential to the detection of the emotional expressiveness in the music itself” (1994, 20) and “[T]he expressiveness of the piece as a whole can only be grasped if the listener’s feelings are aroused in such a way that they provide a clue to both the formal and the expressive structure of the piece as it develops through time” (1994, 21).

Moving on, I examine the connection Robinson finds between the arousal of primitive automatic reactions in the listener and the perception of higher emotions expressed by the music. As noted earlier, Robinson writes of our feeling nervousness, relief, disturbance, and reassurance, as well as tension, relaxation, and surprise, as unthinking responses to music. By its power to produce such reactions, music is properly described as tense, surprising, disturbing, reassuring, unnerving, and so on. Hearing an appropriate succession of these qualities leads us to find, for example, bold progress checked by obstacles. Hypothesizing a persona, we recognize in this the expression of, say, cheerful confidence turning to despair.

I think nervousness, relief, disturbance, and reassurance typically come surrounded by an atmosphere of propositional attitudes, even when initiated automatically. An overdose of caffeine might put me on edge, but if my state is one of nervousness, that is because my sensations become located within a wider cognitive context where I contemplate some future state or action with apprehension. If music triggers reactions of these kinds—and thereby becomes unnerving, relieving, disturbing, and reassuring—it is far from evident that these qualities connect with a cognitive content delivered or directed by the music rather than one created by and imported from the listener. Because I believe the listener interjects rather than uncovers the ideas fueling her imagination, I doubt Robinson shows that music controls the listener’s imaginative involvement via the automatic reactions it arouses. Yet even if I am mistaken in this final claim, moving from a succession of musical qualities—nervousness, hesitation, reassurance—to the expression of higher emotions and further to the unfolding life of a persona requires more imaginative input than following music with understanding demands. The given pattern likely is consistent with many states of affairs where higher emotions are not expressed, as well as with expressions of many different higher emotions.

As I see it, Robinson is no nearer than her rivals to establishing that higher emotions are expressed in musical works as a result of features that both require the listener to make believe a persona and control the cognitive contents fed into a narrative about this persona’s experiences. If this is correct, she does not establish the strong version of hypothetical emotionalism, according to which some pieces cannot be understood and appreciated without such making believe.

My comments on Robinson’s view are necessarily particular. I conclude by raising a more general objection to hypothetical emotionalism.

So far I have implied that according to hypothetical emotionalism the listener entertains the existence of a persona whose tale is revealed in the music’s progress. Yet advocates often suggest that multiple personas might be identified within a work. Cone (1974) sometimes discusses different instruments and individual themes as distinct personas in a single piece. Newcomb (1984b) treats thematic units in Schumann’s Second Symphony as distinguishable personas. Callen (1982) suggests a work should be thought of as presenting the emotional life of a single organism—or perhaps of several agents. This would pose no problem if the relevant distinctions of number could be preserved in musical works; a story can contain more than one character. Things are not so simple. Maus (1988) allows that there is no basis for hearing different agents as opposed to hearing various parts or elements as different limbs of a single agent. He concludes that regarding how many personas are involved—one, several, or many—music is irredeemably indefinite. I believe he is correct. Walton (1994) makes a similar point and plainly regards it as a problem for hypothetical emotionalism.

The difficulty is this: if invoking a persona is essentially tied to understanding a work, it is likely for the reason Robinson and Karl suggest—the story told explains the work’s structure and coherence where a purely technical account falls short. But if any number of personas can be imagined, then the same number of stories can be told, each fitting the music. If these stories differ markedly in content and form, it becomes doubtful that any single one accounts for the work’s coherence (unless all do, which is highly improbable). A piece that could be heard as tracing the deepening gloom of a depressed persona might just as convincingly be experienced as revealing the disconnected moods of several personas, each being more depressed than the last. The music’s structure and coherence cannot be explained by reference to one narrative if others, equally consistent with what is audible in the music, fail in this respect.

One reply suggests that the hypothesizing strategy should be understood as a form of inference to the best explanation. Where the available evidence does not settle the choice between theories, we may still favor some over others, discriminating among them through predictive power, economy of elements, elegance of structure, and similar criteria.

Likewise, while more than one narrative might be hypothesized to account for a work’s expressiveness, not all are equally acceptable. If one narrative provides the unity and closure we experience in the music and another does not, the former is to be preferred. The indefiniteness of the music, noted earlier, need not hinder our judgment between competing narratives or our comparison of those involving several personas with those relying on only one. The preferred narrative, besides matching the music’s structure, also encompasses other artistically significant properties, such as unity and closure.

This view would be appropriate if the credentials of hypothetical emotionalism were established, but I doubt it can be used to defend the theory. First, the “evidence” is disputed. It is not agreed that music commonly expresses higher emotions, nor that a persona must be hypothesized to understand and appreciate such expressiveness in those cases where it might occur. Second, hypothesizing occurs after the recognition of musical unity and thus does not account for that experience. We can explain why we would prefer one narrative over another—for instance, one displays the kind of form and unity also present in the work, while another does not, despite matching elements at the local level—but the features of the preferred narrative do not themselves justify the experience of the music. The integration of disparate elements achieved in the work and experienced by the listener is independent of what is hypothesized. If the work strikes us as episodic and disjointed, it is not clear we should prefer a coherent narrative over one that is less so. Our preferring one narrative presupposes, without explaining, a high level of musical understanding.

If the strong version of hypothetical emotionalism is to justify the force of the prescription that we must listen to some works as presenting dramatic narratives, it must show that forming such narratives is essentially involved in the listener’s understanding and appreciation of relevant works. I doubt this has been demonstrated. If the listener’s narrative is to be importantly revealing of the music rather than merely of herself, we should be able to explain why others who wish to understand and appreciate the music must listen in terms of that narrative. Music is too indefinite to constrain the contents of such narratives to the required extent. While there are grounds for discriminating among various narratives that all match the music in their details, these criteria do not support strong hypothetical emotionalism. They presume that the listener can grasp the music’s nature (its unity, integrity, symmetry, and so forth) independently of the hypothesizing process. This contradicts the claim that it is only by developing a narrative about the actions, experiences, and feelings of a hypothesized persona that she can fully appreciate the music in question. It is not the case, I argue, that hypothetically invoking a persona is essentially involved in understanding musical works that display formal and expressive interrelation.