Conceptual Blending and Its Role in Musical Creativity

May 29, 2026

Creativity poses one of the most challenging subjects in music studies, particularly when grappling with something as fleeting and immaterial as musical sound. In recent years, research on conceptual blending has emerged as a powerful resource for investigating musical creativity, offering tools that work equally well for ordinary musical experiences and for exceptional ones. The basic premise—that humans can combine concepts from two separate domains into entirely new ones—has proven highly productive in language research and has been proposed as a core mechanism of human cognition generally.

Early work applying conceptual blending to music examined how concepts triggered by nineteenth-century Lieder combined those activated by music and by words. That research focused less on creativity itself than on how knowledge from music and language could merge into unified understanding. More recently, blending theory has served as the foundation for computational models of creative processes that extend into musical practice, suggesting both the approach's suitability for creativity studies and its potential for computational implementation.

As one might expect from a theory about a process considered fundamental to human thought, the claims are sweeping. Mark Turner has argued that blending underlies the cognitive capacities that separate humans from other species. This discussion takes a more focused approach, examining those features of the theory that may interest researchers studying musical creativity. I will begin with an outline of blending theory centered on its basic model, using an example where music appears only briefly. This outline will reveal two aspects of the theory that could reshape how we think about creative processes in music. The second section will analyze the lament from Purcell's Dido and Aeneas, building on earlier work while extending that approach in ways relevant to recent blending research in musical contexts.

Conceptual blending, conceptual integration networks, and creativity

The Lieberson-as-Dido blend

To introduce conceptual blending, imagine an American mezzo-soprano, Lorraine Hunt Lieberson, performing the role of Dido in Purcell's seventeenth-century opera. Lieberson did sing the part for Mark Morris's danced version in 1998—though singers were in the orchestra pit and did not act on stage—and she recorded the role in 1993 under Nicholas McGegan, but that production may never have been staged. Whether a fully staged performance occurred matters less here than the act of imagining her in the role, because that act exemplifies conceptual blending.

This process begins with two temporary cognitive structures linguist Gilles Fauconnier termed mental spaces. The first space is set up by "the American mezzo-soprano Lorraine Hunt Lieberson." If you know her well, this space may contain her face, voice, and recollections of memorable performances. If you have never heard of her, the space would be sparser, built around generic ideas about "American," "mezzo-soprano," and women singers. The second space arises from "the role of Dido in Purcell's Dido and Aeneas." This might also contain detailed knowledge—facts about the opera and Dido's role, perhaps fragments of Purcell's music—or only general knowledge about seventeenth-century opera with singers performing classically inspired roles in English.

Connecting these spaces through imagining Lieberson as Dido creates a third space where concepts from both combine. Within this blend, Lieberson becomes a singer performing Dido's role, and Dido acquires Lieberson's face and voice.

To study such blends, Fauconnier worked with rhetorician Mark Turner to develop conceptual integration networks (CINs), which involve four interconnected mental spaces. For the Lieberson-as-Dido blend, the first two spaces (the inputs) are those for Lieberson and Dido. Connecting them establishes a third space, the blend, but also activates a fourth space, the generic space. This concept draws on analogy research by Mary Gick and Keith Holyoak, who observed that individuals using analogies succeed more often if they can find an abstract schema common to the source and target domains. The generic space captures essential cross-space mappings that define the network's basic structure. For this blend, I propose the generic space involves attributes of "woman."
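The four-space structure just described can be sketched as a small data structure. The Python below is purely illustrative and is not drawn from Fauconnier and Turner or from any published model: the class names `MentalSpace` and `IntegrationNetwork`, the `project` method, and the attribute labels are hypothetical, chosen only to show two inputs contributing selectively to a blend while a generic space records what they share.

```python
from dataclasses import dataclass, field


@dataclass
class MentalSpace:
    """A small, temporary bundle of concepts (hypothetical representation)."""
    name: str
    elements: dict = field(default_factory=dict)


@dataclass
class IntegrationNetwork:
    """Four connected spaces: the generic space, two inputs, and the blend."""
    generic: MentalSpace
    input1: MentalSpace
    input2: MentalSpace
    blend: MentalSpace

    def project(self, source, keys):
        """Selectively copy the named elements from an input space into the blend."""
        for key in keys:
            self.blend.elements[key] = source.elements[key]


# Illustrative contents only; a real mental space would be far richer.
lieberson = MentalSpace(
    "Lorraine Hunt Lieberson",
    {"face": "Lieberson's face", "voice": "Lieberson's voice", "nationality": "American"},
)
dido = MentalSpace("Dido", {"role": "Queen of Carthage", "music": "Purcell's music"})
generic = MentalSpace("generic", {"category": "woman"})

network = IntegrationNetwork(generic, lieberson, dido, MentalSpace("Lieberson-as-Dido"))
network.project(lieberson, ["face", "voice"])  # "nationality" is left behind
network.project(dido, ["role", "music"])

print(network.blend.elements)
```

Only the elements named in each call reach the blend, which is one simple way to picture selective projection. The sketch projects in one direction only and does not attempt the backward flow of structure that the theory also allows.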

I consider the concept "woman" equivalent to a basic-level category. Such categories balance highly specific classifications (like "Dido, Queen of Carthage") with broad categories (like "primate"), making core features readily available without unnecessary details. "Woman" captures features of both "Lorraine Hunt Lieberson" and "Dido" without predetermining how they will combine in the blend. The category is indeed generic—hence its usefulness—but not abstract in the strongest sense. It is shaped by countless encounters with real exemplars from early life onward.

The apparent simplicity of the concept "woman" highlights an important property of mental spaces: given only a few prompts, we construct them with remarkable ease, accounting for both the speed and complexity of human cognition.

In the diagram of this CIN (Figure 1), dashed arrows indicate that structure is projected from the generic space to the inputs and from the inputs to the blended space. The arrows have two heads because structure may also flow backward (from the blend to the inputs and from the inputs to the generic space), a feature absent from other accounts of cross-domain mapping. Thus the Lieberson-as-Dido blend may reshape our views of both the singer and the character, letting us attribute imperiousness and romantic melancholy to Lieberson and hear the classical Dido in her voice. Similarly, the concept "woman" may be temporarily reshaped: its representative may wear Lieberson's face or share Dido's concerns. These two-way arrows also remind us that diagrams of CINs are inherently limited. Mental spaces and the networks built from them are dynamic; Figure 1 captures only a snapshot of the network.


Fauconnier and Turner identified three operations producing new structure unique to the blend: composition, completion, and elaboration. Composition fuses elements from the input spaces into new entities, giving us Lieberson-as-Dido, whose every utterance is accompanied by Purcell's music. Completion extends the image suggested by the initial mapping, drawing on background knowledge about the situation. We can imagine Lieberson's movements once she has taken on the tragic persona and her interactions with other characters. Elaboration develops the blend further, building on its principles and logic as input spaces recede in importance. For instance, one could imagine how Lieberson would perform Dido in a staging by director Peter Sellars set in present-day New York—even knowing Lieberson died in 2006.

All the blends discussed here are what Fauconnier and Turner call double-scope blends, where each input space contributes roughly equally to the blended space. (Other types include single-scope blends where one input dominates, and networks with more than two inputs.) Double-scope blends, in the authors' view, demonstrate the distinctive creativity that sets our species apart.

This creative process has constraints. The systematic relationships within the CIN limit how concepts develop. Consider the original productions of Mark Morris's danced version, in which Morris himself danced Dido. Replacing Lieberson with the dancer has interesting repercussions because of gender's role in the CIN. Generic features of a woman project onto Morris, a projection he encouraged through movement and flowing hair. In that blend, Morris-as-Dido has Morris's own physical features and movement style but not his voice, since trained singers supplied the music offstage. The concept "woman" would, accordingly, be shaped by Morris's portrayal. Another constraint concerns what is not specified. This CIN focuses on attributes of personages, not temporal frameworks. Dido's time, whether classical antiquity or the era of Purcell's original production, is not part of what projects into the blend. That is not to say temporal frameworks are irrelevant to blending; they are vital to the classic "ghost ship" blend.

More exists to blending theory than this brief outline suggests, but these elements provide a framework for considering two aspects that pose challenges for studying creativity: how knowledge projects across integrated mental spaces and what kind of knowledge musical utterances convey.

Conceptual blending and the study of creativity

Before addressing those two aspects, working definitions of two key terms would be helpful: "creativity" and "concept."

Following Margaret Boden, creativity involves the capacity to generate novel and valuable ideas. In the Lieberson-as-Dido blend, the novelty of placing her in that role is unremarkable on its own—we expect singers to assume different characters—yet we may never before have considered Lieberson or Morris as Dido. Novelty always depends on what has come before. Value, too, depends on context: were we casting Purcell's opera, imagining Lieberson-as-Dido could be quite useful as a benchmark. A definition of creativity that omitted both novelty and value would be unusual.

Defining "concept" proves more challenging. My own work adopts a pragmatic approach rooted in categorization research. Categorization starts the process, allowing concepts to exist independently of language. Thus pre-linguistic children have concepts, as do other species. I propose three characteristics of concepts. First, they are cognitive constructs stable enough to be stored. Second, they serve as resources for present and future action. Third, concepts of one sort (visual or linguistic) can relate to concepts of another sort (musical sequences or physical movements). The mental spaces in a CIN are built from such conceptual material.

When people learn about conceptual blending, they typically start with cognitive categories and the relations connecting them. These categories sometimes contain just one member, such as “Lorraine Hunt Lieberson,” but they can also hold many members — the category “Dido” being one that appears in countless guises. Knowledge underlying such a category can prove largely independent of language, as it is with the category “movements Mark Morris makes in the role of Dido.”

A successful conceptual blend, broadly speaking, generates a new category that merges aspects of established categories (including their interrelations) and does so in a way useful for the task at hand. With that perspective, I now turn to two facets of blending theory that present interesting puzzles about creativity, both inside and outside music.

The first facet concerns knowledge projection across the constituent mental spaces of a conceptual integration network. Knowledge flows not only from the input spaces into the blend, but also from the blended space back toward the input spaces. This suggests a fresh lens for creativity: being creative involves not just producing new, valuable ideas, but also redefining the ideas foundational to a particular domain of inquiry. These possibilities, and the complications they raise, become especially clear when we consider the challenges facing computational models of blending. Most such models rely on the reasonable assumption that information (whatever its formalization within the computational framework) is selectively projected from discrete domains into the new domain of the conceptual blend (Pereira, 2007). The idea that blended concepts could reshape concepts from the input spaces implies that a computational model based on neural networks would need to incorporate some form of recurrence, so that new information generated inside the network could modify information at the input nodes. Actually building such a system, which demands not merely recurrence but also decisions about which blend information gets fed back to the inputs, is hardly straightforward. Nevertheless, it would lead toward a highly dynamic model of creativity that seems to approach what human intelligence achieves.
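The two decisions at issue here (what is projected into the blend, and what is fed back to the inputs) can be made concrete with a toy sketch. It is emphatically not a neural network and not Pereira's model: it uses plain Python dictionaries, and the function names `project_forward` and `project_backward`, the attribute labels, and the choice of what to feed back are all assumptions made for illustration.

```python
def project_forward(inputs, selected):
    """Build a blend from the (space name, attribute) pairs listed in `selected`."""
    blend = {}
    for space_name, attribute in selected:
        blend[attribute] = inputs[space_name][attribute]
    return blend


def project_backward(inputs, blend, feedback):
    """Return updated copies of the input spaces.

    `feedback` maps a space name to the blend attributes fed back into it.
    Choosing its contents is the hard decision the text describes; here it
    is simply supplied by hand.
    """
    updated = {name: dict(space) for name, space in inputs.items()}
    for space_name, attributes in feedback.items():
        for attribute in attributes:
            updated[space_name][attribute] = blend[attribute]
    return updated


# Illustrative contents only.
inputs = {
    "Lieberson": {"voice": "Lieberson's voice", "face": "Lieberson's face"},
    "Dido": {"temperament": "imperious, melancholy", "role": "Queen of Carthage"},
}

blend = project_forward(
    inputs,
    [("Lieberson", "voice"), ("Lieberson", "face"),
     ("Dido", "temperament"), ("Dido", "role")],
)

# Backward projection: the singer takes on Dido's temperament,
# and the character takes on the singer's voice.
reshaped = project_backward(
    inputs, blend, {"Lieberson": ["temperament"], "Dido": ["voice"]}
)

print(reshaped["Lieberson"])
print(reshaped["Dido"])
```

Because `project_backward` returns copies, the original inputs remain available for comparison with their reshaped versions. A dynamic model would instead have to decide how long such reshaping persists and how it interacts with later blends, which this sketch leaves open.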

The second facet of blending theory that challenges creativity — as we usually conceive it — involves the nature of musical utterances and particularly how they differ from linguistic ones. Let me use a practical exercise to illustrate: to the extent you can, call up in your aural imagination the musical events shown in Figure 2a, performed by your favorite violinist (or, if you are a violinist, imagine playing them yourself). The notated events, when performed, produce a distinctive gesture with certain intensity (a sustained forte E6 falling to mezzopiano F5), drama (the glissando leading to an abrupt staccato), and suspense (the dissonance between E and F, implying a need for some riposte). Language can, of course, capture aspects of this sequence — that, after all, is how I just directed your attention to details of Figure 2a. We could also replace the notation with written instructions for the violinist: “In a moderate tempo, play a glissando from a forte E6 to a mezzopiano F5, with the first note and glissando lasting slightly more than three times the duration of the second, using an accented staccato articulation for the second note.” Yet I propose that neither kind of linguistic description can replace the visceral effect of this simple figure, an effect crucial to its substance as a musical utterance.


In my work on how humans conceptualize music, I have argued that a sequence of musical sounds like the one in Figure 2a can be thought of as a member of a category of musical events: in other words, a musical concept. Other members of this category might include replications of Figure 2a made by different violinists or by the same violinist at various times, but could easily also include figures like the one in Figure 2b. If Figure 2a is taken as most typical of the category (typicality effects being characteristic of cognitive categories; see Zbikowski, 2002, pp. 41–42), then Figure 2b could be seen as less typical, and the events in Figure 2c even less so. Whether the musical events pictured in Figure 2d would be less typical still depends in part on what drives our evaluation of different category members; after all, Figure 2d keeps the same pitch classes, dynamics, and articulations as Figure 2a. Yet perhaps the most important point for my purposes is that, as you considered these figures, your evaluation might well have been guided by using each bit of notation to activate your aural imagination and then comparing those activations. If that happened (and even simple figures like those in Figure 2 can be challenging to simulate aurally), I would propose that you were dealing with musical concepts, concepts we can certainly describe with language but whose cognitive substance extends beyond linguistic description.

One distinctive hallmark of musical concepts is their temporal specificity: performing Figure 2a (whether real or simulated in the mind) unfolds across a time span coextensive with the dynamic shape comprising this musical utterance. Indeed, in my recent work on musical grammar’s foundations, I have claimed that, at the most basic level, musical utterances serve as sonic analogs for a range of dynamic processes (Zbikowski, 2015, 2017). So the music of Figure 2a might be an analog for an object falling from a shelf (itself largely silent except for the sound when it hits the floor). Yet it could equally stand as an analog for becoming aware of some foreboding event. In either case, the meaning of this musical utterance must start from its actuality — that is, the effect it creates when we attend to the sonic arc it carves through time.

This perspective on musical concepts produces two entailments for applying blending theory to music. First, because language plays such a large role in developing and articulating blending theory, concepts used in a musical analysis may well be linguistic ones instead of musical ones — they will be accounts, rendered through language, of salient features of musical events (rather like my original description of Figure 2a). We can, of course, learn much about music through language’s tools, but we should note that the conception of music developed this way stays beholden to language. In other words, we would not want to mistake the kind of communication fostered by the linguistic description of sequences of musical events for the kind fostered by performing or mentally simulating those sequences.

The second entailment, connected to the first, concerns bringing musical concepts into a double-scope network. As I mentioned, what makes blending possible is a uniform topography across the mental spaces of a CIN, summarized in the generic space. According to the perspective I have developed, however, at the most fundamental level the conceptual knowledge tied to language differs from the conceptual knowledge tied to music. When both input spaces come from language — as in the Lieberson-as-Dido blend I discussed — or from music (as in the intra-musical structural blend explored by Tsougras & Stefanou, 2015, and in this issue), no barriers block the establishment of a uniform topography across the network. But when language concepts and music concepts combine, as they do in nineteenth-century Lieder, something must give: either the organization of the mental space set up by music must bend toward that of the language space, or that of the language space must bend toward the music space.

In what follows, I would like to explore the latter situation in more detail (a case in which our understanding of language gets shaped by music) by applying blending theory to the lament that closes Purcell’s Dido and Aeneas. My goal is to offer a practical demonstration of how blending theory can be applied to music and to expand on the idea that music delivers resources for meaning construction distinct from those of language.

Conceptual blending and music: Dido’s lament

Dido’s lament became justifiably famous during the twentieth century, offering scholars and teachers a compelling, compact illustration of operatic music’s power and, for Anglophone audiences, carrying the added benefit of being sung in English. More recently, Janet Schmalfeldt published a thoughtful, wide-ranging analysis of the aria, examining its harmonic structure closely in order to understand its impact on listeners (Schmalfeldt, 2001). My account is narrower in scope and focuses mainly on the distinct resources that words and music provide, separately and combined, for meaning construction.

Nahum Tate’s text for this section of the opera is spare and direct (Price, 1986, p. 75). In the recitative, Dido — speaking to her lady-in-waiting Belinda — accepts death as the price for her infidelity to the memory of her deceased husband Sychaeus:

Recit. Thy hand, Belinda, darkness shades me, On thy bosom let me rest, More I would, but Death invades me; Death is now a welcome guest.

The aria then looks to a future in which Dido has become only a memory:

Aria When I am laid in earth, May my wrongs create No trouble in thy breast; Remember me, but ah! forget my fate.

Three guiding ideas emerge from the words Tate gives Dido for this final aria. First, straightforward acceptance of death (“When I am laid in earth”). Second, a hope that her “wrongs” — broadly understood as Fate’s entailments — may be buried with her so that they trouble no one. Third, a plea to be remembered not for the path her life took but for who she fundamentally was.

Setting this text, Purcell made use of one of his favorite compositional devices: a ground bass over which varied melodic lines can be spun. The ground he chose opens with a minor-mode descending tetrachord. As Ellen Rosand noted in her discussion of such grounds, these ostinato patterns formed the basis of many seventeenth-century operatic laments (Rosand, 1979). In Purcell’s ground, the tetrachord’s four notes descend G–F–E♭–D, followed by a cadential sequence B♭–C–D–G. Chromatic embellishment was common in such patterns, and Purcell’s ground likewise fills in the descent from G to D chromatically, as G–F♯–F–E–E♭–D. A noteworthy feature is Purcell’s rhythmic design: F♯ and E♮, the nondiatonic ornaments, fall on the beat and receive agogic accents, which project them prominently. After the protracted descent to D that this creates, the ground’s concluding cadence seems to come almost too fast, pressing back to D and then directly to the low G that completes the pattern. The sound becomes a sonic analog to a relentless descent, fraught with palpable effort and uneasy steps toward some terrible end. Yet the ground might also represent the gradual ebbing of Dido’s life force as Death overtakes her, or an encroaching shadow as the light fails. Whichever image we choose, the ground’s cyclicity (the same pattern repeating over and over) implies that the accumulation of that slow, uncomfortable advance carries the most meaning.
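The relation between the diatonic tetrachord and its chromatic filling can be checked with a little pitch-class arithmetic. The sketch below assumes the standard convention of numbering pitch classes in semitones above C; the function name `descending_steps` and the ASCII spellings (`Eb` for E♭, `F#` for F♯) are my own choices for illustration, and the code says nothing about rhythm or accent.

```python
# Pitch classes numbered in semitones above C (standard convention).
PITCH_CLASS = {"D": 2, "Eb": 3, "E": 4, "F": 5, "F#": 6, "G": 7}

diatonic_tetrachord = ["G", "F", "Eb", "D"]
chromatic_descent = ["G", "F#", "F", "E", "Eb", "D"]


def descending_steps(notes):
    """Semitones descended between consecutive notes.

    Assumes every step falls by less than an octave, so the result
    can be taken modulo 12.
    """
    return [(PITCH_CLASS[a] - PITCH_CLASS[b]) % 12 for a, b in zip(notes, notes[1:])]


# Notes present only in the chromatic version of the descent.
added = [note for note in chromatic_descent if note not in diatonic_tetrachord]

print(descending_steps(diatonic_tetrachord))  # [2, 2, 1]
print(descending_steps(chromatic_descent))    # [1, 1, 1, 1, 1]
print(added)                                  # ['F#', 'E']
```

Both versions span the same five semitones (a perfect fourth from G down to D). The chromatic version divides that span into five equal steps, and the two notes it adds are exactly the F♯ and E♮ singled out above as nondiatonic ornaments.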

The melody for the aria unfolds across eight repetitions of the ground, falling into two roughly equal halves. The words, however, are not distributed equally across the halves. Instead, Purcell assigns the first three of Tate’s lines to the first half (bars 6–25) and the fourth line — the plea for remembrance — to the second half (bars 26–46).

One vehicle through which composers interpret texts is repetition, whether of phrases or individual words. In his treatment of the first three lines Purcell uses both approaches. He sets the entire sequence over two statements of the ground (bars 6–15) and then repeats it (bars 16–25). He also singles out word pairs for emphasis: “am laid” (bars 7–8 and 17–18) from the first line, and “no trouble” (bars 11–13 and 21–23) from later on. The intensity of these strategies increases in the passage for “remember me” / “forget my fate”: the final line is repeated four times overall (twice each in bars 26–36 and bars 36–46), and on its first and third iterations “remember me” is pulled out for extra emphasis through repetition. Together these strategies slow down the text’s delivery, as does simply having a singer present it (who must negotiate pitch, rhythm, timbre, and dynamics while giving voice to the words). Composer Martin Boykan noted, “a text is sung far more slowly than it is spoken, and even where the musical tempo is fast and we have the impression of speed, the words move at a rate that we would find intolerable in conversation” (Boykan, 2000, p. 133). Where uttering the spoken text of Dido’s aria would take slightly under ten seconds, a sung performance typically lasts about two and a half minutes. That expansion alters the way the words contribute to meaning, moving them toward ritual speech and away from everyday discourse. This is obviously important for considering how words and music combine to create meaning, but for now I will keep focus on the basic communicative resources each medium provides.

Interactions between music and words in Dido’s lament

The most conspicuous resource Purcell leverages in his setting is the ground bass conceit itself, whose regular statements steer the aria’s overall shape. In the first half, Dido’s melody generally conforms to the ground. Figure 3 offers the score for Purcell’s setting of the three opening lines. While her melody spans two statements of the ground, it reaches a comfortable midpoint on the last syllable of “create” (which closes one text line), set with B♭4 against G2 in the bass, harmonized as a G‑minor chord. Setting “no” with E♭5 then shifts both register and harmony, launching the next line. Although the melody occasionally breaks free — a leap to D5 in bar 9, another to E♭5 in bar 11 — with respect to rhythmic figuration, its intervals with the bass, and its overall direction, the melody prefers alignment with the ground’s pattern.

In the second half, Purcell changes approach. Figure 4 shows bars 25–39. He sets the first statements of “remember me” with a repeated D5, which refuses the bass’s downward trajectory and clashes in bar 28 with the E♮3 and E♭3 from the ground. The voice keeps its independence through the ground’s next repetition, the completion of “but ah! forget my fate” stretching into bar 32. The last word, “fate,” returns to D5 — reinforcing the resistance to the bass — before springing up to a climactic G5 in bar 33 as the line is reprised. This reprise receives a new melody that now does align with the ground. That realignment also adjusts the second pass through melody and text, which starts one bar later. The consequence: conflict between D5 and the ground is largely eliminated, and, while the melody still reaches across the ground’s reiteration, it sounds more subdued — the listener now knows the climactic G5 will only briefly postpone the melody’s yielding to the bass.


The melody that Dido sings, and its interplay with the lament ground, points to yet another difference between musical and linguistic utterances. Purcell clearly expects the ground to be a distinct aural strand: one full statement of the ground precedes the voice’s entry, and the pattern never changes through all eleven statements (including introduction and postlude). The melody equally functions as a separate element, one that can either align with the ground or stand opposed to it. The aria thus consists of two distinct but related syntactic streams whose interplay shapes the musical utterance’s structure. And most crucially for conceptual blends involving music, that structure emerges step by step across time: the interaction in the first half is plainly not the same as in the second, and that difference is central to the whole utterance’s significance. While words have difficulty capturing it, this meaning becomes rather clearer when we examine how the concepts prompted by the music and by the text combine in a conceptual integration network.


The features of this blend can be explored through three basic processes that generate emergent structure unique to the blend: composition, completion, and elaboration. For composition, fitting Tate’s “laid in earth” with Purcell’s ground is quite natural, since the lamento bass is strongly tied to physical descent (which here connects to death); this pairing is borne out in the aria’s first half. However, linking “remember me” with the more resistant second half of the aria presents greater complexity. It points toward the reason for that plea (anxiety before oblivion), which finds an analog in the resistance Dido’s melody offers against the bass’s unyielding repetitions and descent. Completing the blend, we might imagine a singer caught up in this struggle: her overall posture, the way she may move, and her interactions with the singer performing Belinda. Elaboration of the blend might extend to what we envision happening after the aria ends: how Dido’s death is depicted and how the remaining characters respond to the opera’s conclusion. What I wish to emphasize, and what I hope a look at Figure 5 clarifies, is that given only Tate’s text or only Purcell’s music there would be no reason to imagine an enactment of a futile struggle against oblivion. Tate’s text alone might invite us to envision a rather ordinary scene: Dido, leaning on Belinda, making a simple request that her lady-in-waiting remember her queen after death. Purcell’s music alone would certainly suit a melancholy scene, but it might be quite different from Dido and Belinda’s tableau; for instance, the melody could be sung by Aeneas (portrayed by a countertenor), his words describing his reaction to losing a fallen comrade and recalling his courage. Only when the two combine, as they do in Purcell’s masterful aria, with the concrete anchors for thought from Tate’s text brought together with the highly dynamic structures Purcell’s music activates, do Dido’s anguished emotions come alive.

As revealed by thorough analyses such as Schmalfeldt’s, the lament’s success is no accident but rather a tribute to Purcell’s musical artistry. However, a detailed grasp of all the compositional nuances is not necessary for deriving meaning from the aria. Part of that meaning comes, of course, from the tragic dramatic situation culminating in Dido’s death. Yet I would argue that the bulk of meaning stems from how Purcell shapes the listener’s experience using sonic models that first portray Dido’s acceptance of her sad fate and then a flash of resistance through which she would claim a form of immortality. My account here, drawing on conceptual blending theory, offers only an introduction to the means by which Purcell achieves these extraordinary ends. I believe it nevertheless highlights the differing resources music and words offer for constructing meaning. It also provides a way to study the relationship between these two communicative media by exploring how specific concepts from each can be combined to create a rich imaginative world.

Creativity remains elusive. Our fascination with the notion may be partly due to its mysterious aura: it can seem like an alchemical process that transforms ordinary thought into glittering gold. Yet it now appears that creativity is shown not only by bold innovators and revolutionary artists but also by anyone who utters something like “If I were you, I wouldn’t do it that way.” Such an utterance produces a blend of concepts—the speaker combining his wisdom with the actions of the person addressed—that opens a highly productive imaginative realm. As Fauconnier, Turner, and others have shown, these blends can be studied through conceptual integration networks, which specify how concepts from two clearly different domains are combined to create a novel and useful imaginative domain.

One remarkable aspect of blending concepts is the potential for the new ideas produced through the blend to inform older ideas anchoring the process. When I say “If I were you, I wouldn’t do it that way,” I will not truly become you. That said, by putting myself in your shoes I have subtly altered my self-conception. Similarly, imagining Lorraine Hunt Lieberson in the role of Dido changes how we think about Lieberson, about Dido, and perhaps even about the basic-level category “woman.” And hearing Dido’s struggle against fate, as Purcell’s setting of Tate’s words invites us to do, leads us to think about a sequence of musical sounds that could have served other purposes, and to uncover resonances in words like “remember me” that might have initially escaped us.

It is worth stressing that conceptual blending is considered a cognitive process related to cross-domain mapping, which also yields analogy and metaphor. We still have much to learn about these processes, but two things are certain. First, cross-domain mapping contributes greatly to the distinctive character of human thought. Second, since music is a product of human cognitive processes, cross-domain mapping—expressed through analogies, metaphors, and conceptual blends—will inform our musical understanding.

Applying conceptual blending theory to music presents several challenges. Some, such as studying blending through computational models, are general in nature, concerned with the highly dynamic quality of human thought. Others, like the status of conceptual knowledge, directly affect musical understanding. If knowledge fundamental to producing and receiving musical utterances is of a different kind from that associated with language, analyzing blends drawing on concepts from both music and language could lead to a thorough reconsideration of blending theory’s core assumptions. Such reconsideration could deepen our grasp of conceptual blending and help us better understand the unique resources musical utterances offer for human cultural interaction.

No specific grant from any funding agency in the public, commercial, or not-for-profit sectors supported this research.

Notes

1. Recent debates and discussions about gender reveal the complex subtleties a basic-level category like “woman” may conceal and the potential for it to change as we gain more knowledge.

2. I use “musical utterance” as an inclusive term for “a sequence of musical sounds produced in service of musical communication.” I do not specify the modality through which it is produced (voice, instrument, or computer program), nor the number of performers or sound sources involved. I also adopt the basic framework of interpersonal communication, where one person produces an utterance for another to apprehend, while accepting that in some cases both producer and receiver may be virtual rather than actual.

3. An alternative definition is offered by Kaufman and Sternberg: creative ideas possess three features: they must be new or innovative, high quality, and appropriate. Since Boden’s “value” encompasses both quality and appropriateness, I prefer her more concise formulation.

4. In previous work (Zbikowski, 2002, pp. 59–61), I noted that I was not the first to draw a strong connection between categories and concepts, or to assert their equivalence. See, for instance, Hampton and Dubois (1993), Barsalou (1993), Barsalou et al. (1993), Smith and Medin (1981), and Murphy and Medin (1985).

5. Previously I proposed that cognitive categories are based on knowledge structures similar to frames (Minsky, 1975; Barsalou, 1992), which I called conceptual models, and which stabilize category-related information while allowing it to change over time. See Zbikowski (2002, pp. 41–48 and Chapter 3).

6. It bears mention that a performer producing sounds like those in Figure 2 may have an experience quite different from that of a passive listener, because her focus is first on the motor movements needed. To simplify, I concentrate on active audition of musical sounds by performer, listener, or composer imagining those sounds (which I take as part of “active audition”), as distinct from the embodied processes that may be its source or correlate. While composers or performers may be the “first listeners” to musical sounds, I assume that the functional basis of their audition does not differ notably from that of non-producing listeners.

7. I emphasize that my notion of “sonic images” and ideas about musical imagination are not tied to the visual modality. “Image” acts in a broad sense as a cognitive construct, and “imagination” as the activation of such constructs.

8. The aria opens with one statement of the ground and, after Dido’s final words, ends with two more. In the final statements, the viol choir that accompanied Dido adds its own melodic strands that reflect on and complete Dido’s melodies.

9. These word pairs point to two of the aria’s themes: acceptance of death and the hope that Dido’s fateful collision will be buried with her.