How computational modeling sheds new light on free improvisation in music therapy
Free improvisation serves as a common tool in music therapy, offering patients a non-verbal medium for expressing thoughts and emotions. More broadly, music therapy aims to produce therapeutic and psychosocial benefits by alleviating symptoms in serious and chronic diseases and by enhancing well-being and quality of life for both healthy individuals and those with medical conditions. However, considerable research still needs to be done to understand exactly how music therapy works and to boost its effectiveness.
To address this, researchers have adapted a broad computational paradigm that allows for rigorous, quantitative tracking, analysis, and documentation of the dynamic expressive processes involved. Originally developed for the modalities of art and music, the method has now been applied in real-world music therapy experimentation. The study examines how the expressive behaviors of clients emerge under the direction of a therapist across a series of sessions, with the goal of developing and enhancing expressivity through free improvisation. The empirical insights gained are detailed, and their implications for therapy and scientific research are discussed.
The clinical challenge and computational response
The therapeutic setting brings together the musical work, the therapist, and the patient in a rich, dynamic environment that is difficult to capture. Complex, simultaneous, and interwoven expressive behavioral processes often elude human observers. As a result, these processes are often perceived and interpreted subjectively, most commonly in verbal descriptions, which can compromise subsequent analysis. The computational paradigm overcomes significant barriers in the arts-based fields by enabling rigorous quantitative tracking, analysis, and documentation of these dynamic processes. It allows for exploratory research, hypothesis testing and generation, and knowledge discovery that is empirically grounded.
The empirical infrastructure supports both intra-level analysis, focused on specific moments within the dynamics of an arts-based session, and inter-level analysis, which looks at broader perspectives across sessions, individuals, and groups. For example, this approach has been used to uncover demographic variation factors in artistic production. Past computational attempts to analyze music making have been limited. Some previous studies recorded only particular parameters based on predetermined hypotheses; others demonstrated tools on just two single test cases; in another study, extracted musical features from improvisations were used to predict the type of mental disorder in music therapy clients. The current paradigm is considerably more comprehensive.
The paradigm's components
The computational paradigm captures emergent behaviors, meaning the properties and patterns that arise from the behavioral processes themselves. Key elements include:
- Time measurement: Calculating exact time durations within a session, such as net idle time when the patient or client is not pressing a key, total playing time, and concurrent playing time from notes pressed in parallel.
- Note tracking: Recording per-time and per-press note use, including net number of notes used, total presses, time durations and density of notes, cluster formations, and note color preference (e.g., black versus white keys on a piano keyboard).
- Octave and intensity profiling: Capturing and analyzing preference profiles for octave use and note intensity — for instance, whether playing is confined to specific registers and dynamics.
- Transition calculation: Measuring crescendo, diminuendo, accelerando, ritardando, and note color transitions (e.g., black to white, white to white).
- Pitch class profiling: Distributing note use collapsed onto a single octave — C, C#, D, and so on up to A#, B.
- Pedal use: Recording the number of pedal presses and their duration.
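As a rough illustration of the time measurement and pitch class profiling elements above, the following Python sketch derives them from a list of note events. The event format `(onset_s, offset_s, midi_note, velocity)`, the function names, and the dictionary keys are assumptions made for this example; they are not taken from the study's Statecharts model.

```python
from collections import Counter

# Hypothetical event format: (onset_s, offset_s, midi_note, velocity).
Note = tuple[float, float, int, int]

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def time_measures(notes: list[Note], session_end: float) -> dict[str, float]:
    """Return playing, idle, and summed note time in seconds."""
    playing = 0.0
    cur_start, cur_end = None, None
    # Merge overlapping note intervals to find when at least one key is down.
    for onset, offset, _, _ in sorted(notes):
        if cur_end is None or onset > cur_end:
            if cur_end is not None:
                playing += cur_end - cur_start
            cur_start, cur_end = onset, offset
        else:
            cur_end = max(cur_end, offset)
    if cur_end is not None:
        playing += cur_end - cur_start
    # Summing every note's own duration counts parallel notes more than once.
    summed = sum(offset - onset for onset, offset, _, _ in notes)
    return {
        "playing_time": playing,
        "idle_time": session_end - playing,
        "summed_note_time": summed,
    }


def pitch_class_profile(notes: list[Note]) -> dict[str, int]:
    """Count presses per pitch class, collapsing all octaves onto one."""
    counts = Counter(PITCH_CLASSES[note % 12] for _, _, note, _ in notes)
    return {pc: counts.get(pc, 0) for pc in PITCH_CLASSES}
```

When notes overlap, `summed_note_time` exceeds `playing_time`, which is one plausible way a concurrent playing figure could rise above 100%. That reading is an assumption here, not the study's stated definition.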
In the experimental study, human subjects participated in a series of sessions with a music therapist. The dynamics of emergent behaviors during free improvisation were analyzed according to the parameters listed above.
Modeling the music room
The digital observations of the system under study (the musical work) are fed into a Modeled Tracking module. This module captures the events that occur, resulting in emergent expressive behaviors. These are then sent to Analysis and Documentation modules. The Analysis module yields empirical insights for the field of music therapy, while the Documentation module transforms the behavioral dynamics into usable descriptions.
The Modeled Tracking module houses a music room model based on Statecharts, a visual formalism that extends the basic state/event approach. Statecharts enables representation of hierarchy through nested states, multi-level transitions, and orthogonality through concurrent states. Three major entities make up the music room model: the musical work, the patient or client, and the music therapist. The musical work is driven by events that transition the system between states — starting a note, stopping a note, pressing the pedal, or entering an idle state. The Music_Work subsystem, for example, has exclusive substates: Idle, Selecting, and Playing. The Playing state includes complex, rich dynamics with orthogonal states for Timbre, Duration, Tempo, Cluster_size, Key_n, Max_metrics, and Min_metrics.
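To make the state/event idea concrete, here is a deliberately flattened Python sketch of the Music_Work subsystem. It keeps only the Idle and Playing states and the note start/stop events described above. The Selecting substate, the pedal, and the orthogonal regions inside Playing are omitted, and the class, method, and attribute names are hypothetical; the actual model is built with Statecharts, not with this code.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    PLAYING = "Playing"


class MusicWork:
    """Two-state sketch: Playing while any key is held, Idle otherwise."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.held: set[int] = set()               # MIDI notes currently down
        self.max_cluster = 0                      # largest number of keys held at once
        self.log: list[tuple[float, str]] = []    # (time, state entered)

    def _enter(self, t: float, new_state: State) -> None:
        # Record a transition only when the state actually changes.
        if new_state is not self.state:
            self.state = new_state
            self.log.append((t, new_state.value))

    def note_on(self, t: float, note: int) -> None:
        self.held.add(note)
        self.max_cluster = max(self.max_cluster, len(self.held))
        self._enter(t, State.PLAYING)

    def note_off(self, t: float, note: int) -> None:
        self.held.discard(note)
        if not self.held:
            self._enter(t, State.IDLE)
```

Feeding recorded note events through such a machine yields a timestamped log of state entries, from which durations and cluster sizes can be read off. A full statechart would add the nesting and concurrency that this flat version leaves out.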
Experiment setup
The study featured a music therapist and four participants, identified as subjects A, B, C, and D. Each subject took part in six 50-minute therapy sessions. Every session began with a free improvisation and ended with one, yielding a total of 12 free improvisation recordings per participant. Between the improvisations, the therapist assigned exercises and tasks for the subjects to complete alone or together with the therapist. All participants were healthy adults between the ages of 22 and 35 who had college-level musical education and several years of piano training, mostly during childhood. They had improvised only minimally before. The primary objective was to develop the subjects' expressive abilities.
The musical instrument was a Casio MIDI piano keyboard controller with a connected pedal. Digital data collection used the MIDI protocol. Improvisation data was recorded in Cubase9 and then processed with Max/MSP to create script files, which were read into the Statecharts model for computational analysis.
For each subject, the first improvisation (number 1) and the last (number 12) were extracted from the MIDI recordings and analyzed. Improvements in expressiveness were expected to appear in sound attributes such as intensity and pedal use, physical attributes such as key color and octave range, and temporal attributes like total improvisation time.
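A comparison of this kind can be sketched in Python by computing the same features for both improvisations and setting them side by side. The input format (a list of `(midi_note, velocity)` presses in playing order), the function names, and the octave convention (MIDI note 60 treated as C4) are assumptions for this example; the octave "levels" reported in the study may use a different numbering.

```python
from collections import Counter

BLACK_PITCH_CLASSES = {1, 3, 6, 8, 10}  # C#, D#, F#, G#, A#


def octave(midi_note: int) -> int:
    # Assumed convention: MIDI note 60 (middle C) is C4.
    return midi_note // 12 - 1


def color(midi_note: int) -> str:
    return "black" if midi_note % 12 in BLACK_PITCH_CLASSES else "white"


def expressive_features(presses: list[tuple[int, int]]) -> dict:
    """Summarize octave, intensity, and key color use for one improvisation."""
    octaves = [octave(note) for note, _ in presses]
    velocities = [velocity for _, velocity in presses]
    colors = [color(note) for note, _ in presses]
    # Count color transitions between consecutive presses.
    transitions = Counter(f"{a} to {b}" for a, b in zip(colors, colors[1:]))
    return {
        "octave_min": min(octaves),
        "octave_max": max(octaves),
        "most_used_octave": Counter(octaves).most_common(1)[0][0],
        "velocity_range": max(velocities) - min(velocities),
        "black_share": colors.count("black") / len(colors),
        "transitions": dict(transitions),
    }


# Made-up example data, not taken from the study.
first = expressive_features([(60, 64), (62, 64), (64, 70)])
last = expressive_features([(49, 30), (61, 100), (63, 90), (72, 60)])
```

MIDI velocity stands in for note intensity here. Comparing `first` and `last` key by key gives the kind of before-and-after contrast reported in the results below.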
Results: subject B and subject A
Subject B
The comparison between B's first and last improvisations revealed several markers of enhanced expressiveness:
- Octave range: The minimum octave used remained at level 2, while the maximum octave extended from level 4 to level 5.
- Most-used octave: Shifted from octave 4 in the first improvisation to octave 3 in the last.
- Intensity range (dynamics): Expanded notably.
- Note color: Black key use increased both in the distribution of presses and in the transitions between keys, while white key use shifted.
- Pitch class preference: Subject B added the notes E and F# to the repertoire used in the final improvisation, while "letting go" of the note C and moving generally toward more chromatic playing.
- Improvisation duration: Increased dramatically from 0.6 minutes to 2.7 minutes.
- Pedal use: In the first improvisation, the foot rested statically on the pedal, even before playing began. In the last improvisation, the pedal was freely operated with 19 presses averaging 8.5 seconds each across 91.6% of the improvisation time.
- Concurrent playing time: Increased from 214% to 264%, reflecting a growth in simultaneous key presses. The percentage of keys used concurrently rose from 29% to 50%, and the size of note clusters tapped together grew larger.
The therapist's written summary of the first improvisation described it merely as "a short improvisation" and offered no documentation about the final performance — a gap that the computational paradigm precisely addresses.
Subject A
Results for Subject A further underscored the paradigm's value. In the first improvisation, playing was nearly all white keys; the final improvisation showed striking expansion into black keys, in the percentage of presses, in transitions seen as a higher "percent black to black", and in the distribution of playing time across pitch classes. The computational measurement also captured a crucial detail — the use of the 'white' A note — entirely missed by the human observer. The therapist's summary never noted improvisation durations or pedal use. In contrast, precise data showed that duration almost doubled from 1.5 minutes to 3.6 minutes, while average pedal press duration grew from 3.9 seconds to 4.6 seconds.

General empirical evidence of change
Across subjects, the data highlighted several consistent changes that reflect mastery in expressive development:
- Enhanced octave use: suggests the client can draw on more notes to express themselves.
- Broader intensity range: indicates greater ability to express varied emotional states.
- Chromatic key transitions: signal more expressive pathways; early-stage improvisers tend to stay glued to the white keys, and black keys and more chromatic alternatives appear as broader possibilities after further intervention.
- Frequent pedal usage: introduces more expressive colors to convey feelings and provides mood flexibility.
- Increased concurrent notes: playing becomes richer when keys are pressed together, opening a broader expressive landscape. This points to emerging comfort.
The method produces exact tracking, compared with what a human observer can perceive unaided.
Discussion: implications and future directions
By precisely tracking, analyzing, and documenting the dynamics of emergent behaviors, the computational paradigm offers concrete contributions: it complements the therapist's potentially sparse written notes; enables improvisations and entire sessions to be reliably compared; and makes the data retrievable and reusable for validation. Clinicians could:
- Provide empirical evidence of what escaped the narrative session notes, such as the finely tracked use of the A note in subject A's improvisation, bridging what a human observer misses and supporting communication between groups.
- Support assessment: compare recorded improvisations to follow a client's progression across the therapy period, and across individuals and demographic groups, so that adjustments between sessions rest on the recorded data rather than on recollection.
Planned advances to the computational framework may proceed across several areas:
- Behavioral markers: determining which of the measured parameters (see the list of components above) can serve as validated heuristics or "behavioral markers" to track, repeat, and interpret across a patient's sessions.
- Linking to the therapist's descriptions: checking therapist descriptions methodically against the numbers (for example, an improvisation described as "short" against its measured duration), which would allow fine-grained linking of the caregiver's qualitative summary to granular statistics such as octave, note, and intensity use.
- Modeling the therapist and the patient or client: the music room model has so far been developed in depth for the musical work, while the therapist and the patient or client remain at an initial, conceptual stage; future work could model their states as well, including body gestures, movement, facial expressions, and conversation, alongside the other information streams.
- Musical syntax: capturing harmony and other aspects of musical structure that are not yet tracked, to move beyond surface-level key analysis in line with specialists' goals.
Broader uptake of this work could strengthen collaboration and systematic progress across scientific research, education, psychology, and medicine. Researchers, caregivers, and training services would share a reviewable baseline and clear, precise feedback built on real recorded data, whereas with logbooks alone it is hard to maintain continuity. Measuring every session in the same way reduces subjective bias and supports comparison across clinical, educational, and recreational settings. The vision is one in which the therapist's skill is complemented by the machine's efficiency and scale.