What is music? An exploration of Indian art music and AUTRIM transcription
What is music?
Music can be understood as sound we take pleasure in hearing.
Which of these counts as music? The song of birds, a soft breeze, a flowing spring, temple bells—a mother's lullaby, a nursery rhyme—a poem set to a memorable tune—the collective sound of musicians playing together—and whatever comes out of speakers when we play a CD or switch on the radio.
Music as art
The core aesthetic goal separates music from other activities. Music operates both as an art form and as an industry. Every tradition rests on ten integrated aspects: composing, performing, receiving, perceiving, teaching, learning, preserving, accessing, disseminating, and sharing. These aspects are interdependent and influence one another.
This presentation
The author speaks from the dual perspective of practitioner and musicologist. The discussion concentrates on five areas: listening, intonation, improvisation, notation, and a brief overview of AUTRIM—an automated transcription system for Indian music developed with the University of Amsterdam.
Human beings and music
People are innately musical. As John Blacking put it, "Music is humanly organised sound & product of behaviour of groups." Numerous cultures produce many musics. The creation, performance, meaning, and definition of music shift according to cultural and social context.
Assessing or studying a musical tradition demands attention to its tonal and rhythmic structures, grammar and aesthetics, along with extra-musical domains—history, sociology, psychology, philosophy, economics, physics, technology—that all affect the deep structures of music.
Indian music identity
The term Indian music refers to the music of the entire subcontinent, covering seven nations: India, Pakistan, Bangladesh, Afghanistan, Tibet, Nepal, and Bhutan. Today's classical or art music traces back to the Samaveda, the lyrical hymns of the Rigveda composed between 1500 and 900 BCE. Unlike traditions from ancient Greece, Egypt, Sumeria, Israel, or the Middle East, elements of ancient and medieval Indian music survive in contemporary practice and appear in treatises dating to the pre-Christian era.
Contemporary art music is a confluence arising from centuries of cultural exchange among Greek, Arabic, Iranian, and Indian peoples. Those civilizations share—or shared—common features in various degrees: oral tradition, primacy of the voice, and microtonality. Music in the subcontinent reflects the racial, linguistic, and cultural diversity of its population. The variety of musical types is unmatched elsewhere in the world. Music plays a central part in people's religious, social, and artistic lives.
Six categories flourish side by side: primitive, folk, religious, art, popular, and confluence. The Sanskrit term Sangit—cognate with the Latin concentus, meaning "sung together"—captures the core ancient conception of music. The English word "music" misses that sense, just as Greek mousike does. Understanding religion, philosophy, aesthetics, history, and culture is essential for full comprehension.
The immediate aim of music is sensory pleasure, but its ultimate goal is spiritual release. The tradition is preeminently vocal; instruments are considered secondary. Based on melody and rhythm, it has no place for harmony or polyphony. It is modal and typically accompanied by a drone that establishes a fixed reference and avoids key changes.
Art music
A clearly aesthetic intention separates art music from other categories. It is regulated by two main elements: raga (a tonal matrix) and tala (a rhythmic framework). Unlike the rhythmic frameworks of many other traditions, tala is cyclic rather than linear in nature. The practice has two streams, performance and scholarship, with the latter following the former and leading to the codification of rules, methods, and techniques.
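The cyclic nature of tala can be sketched in a few lines of code. This is only an illustration: the 16-beat cycle divided 4+4+4+4 (as in Teental) is an assumed example, and the function name and return format are hypothetical.

```python
def cycle_position(beat_index, sections=(4, 4, 4, 4)):
    """Locate a beat inside a cyclic tala.

    beat_index counts beats from 0 at the start of the performance.
    sections is an assumed division of one cycle (here 16 beats as 4+4+4+4).
    Returns (cycle_number, beat_in_cycle, section_number), all 1-based.
    """
    cycle_length = sum(sections)
    cycle_number = beat_index // cycle_length + 1
    beat_in_cycle = beat_index % cycle_length  # 0-based; wraps around each cycle

    # Walk through the sections to find the one containing this beat.
    section_number = 1
    remaining = beat_in_cycle
    for length in sections:
        if remaining < length:
            break
        remaining -= length
        section_number += 1
    return cycle_number, beat_in_cycle + 1, section_number
```

Because the beat count wraps around with the modulo operation, beat 17 of a performance is again beat 1 of the cycle, which is the sense in which the framework is cyclic rather than linear.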
Indian art music is primarily a solo performance tradition, allowing room for innovation and interpretation. Methods and techniques support that goal, giving rise to varied musical ideologies and family traditions (gharana or bani). An abundance of musical forms exists, each with structures based on patterns of notes, rhythms, and tempi.
Modes of expression are deliberately cultivated and therefore require a highly structured teaching-learning process. Audiences are expected to be educated about the art form and to take part in the music making. The quality of the audience and its response can bring about qualitative differences.
Listening and pitch identification
Listening operates on multiple layers, because the soundscape is complex: voice, tanpura drone, and melodic and rhythmic accompaniment. The tanpura provides a pitch reference: four or six strings tuned to the tonic, the fifth below, and an octave below. Its rich envelope of overtones and harmonics adds complexity. Other string instruments, such as the sitar, make pitch detection harder because of their multiple main strings and sympathetic strings.

Notes have never been standardized in exact frequencies or ratios. The positions of semitones other than the tonic and fifth may shift; flat notes can be lowered by around 20 cents. Shruti is the concept describing subtle divisions of the octave. Problems arise when melody is thought of as a series of fixed pitch points: experimental studies show flexible intonation, which rules out the notion of fixed points.
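For readers unfamiliar with the unit, a cent is one hundredth of an equal-tempered semitone, so an octave spans 1200 cents. The sketch below shows the standard conversion between frequency and cents relative to a tonic. The function names are illustrative, and any tonic frequency used with them is an arbitrary choice, not a value from the studies mentioned.

```python
import math


def cents_above_tonic(freq_hz, tonic_hz):
    """Interval from the tonic to freq_hz, in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(freq_hz / tonic_hz)


def shift_by_cents(freq_hz, cents):
    """Frequency obtained by raising freq_hz by the given number of cents.

    A negative value lowers the pitch, e.g. -20 for a note flattened by 20 cents.
    """
    return freq_hz * 2.0 ** (cents / 1200.0)
```

With these definitions a pure fifth (frequency ratio 3:2) lies about 702 cents above the tonic, and a flat note lowered by 20 cents is simply `shift_by_cents(freq_hz, -20)`.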
Modern scholars see intonation as a statistical phenomenon: note densities occur not at exact positions but within limited ranges in a tonal area. The influence of melodic context on pitch is also clear from studies.
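The statistical view can be made concrete with a small sketch: pitch samples are grouped into tonal areas, and each area is summarized by how many samples fall in it and how far they sit from its nominal centre. This is not the method of any particular study. The 100-cent area width, the folding of all octaves into one, and the function name are assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def tonal_area_stats(freqs_hz, tonic_hz, area_width_cents=100):
    """Summarize pitch samples per tonal area.

    Assumptions: areas are centred on multiples of area_width_cents above the
    tonic, area_width_cents divides 1200, and all octaves are folded into one.
    Returns {area_index: (sample_count, mean_deviation, spread)}, where the
    deviations are measured in cents from the centre of the area.
    """
    areas_per_octave = 1200 // area_width_cents
    deviations = defaultdict(list)
    for freq in freqs_hz:
        if freq <= 0:
            continue  # skip unvoiced or silent frames
        cents = (1200.0 * math.log2(freq / tonic_hz)) % 1200.0
        area = int(round(cents / area_width_cents)) % areas_per_octave
        # Signed distance from the area centre, wrapped into [-600, 600).
        deviation = (cents - area * area_width_cents + 600.0) % 1200.0 - 600.0
        deviations[area].append(deviation)
    return {
        area: (len(values), mean(values), pstdev(values))
        for area, values in sorted(deviations.items())
    }
```

A note that is consistently sung flat would show up as a negative mean deviation for its area, while the spread indicates how wide the range of intonation is within that area.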
Note connections, the "music between the notes", are significant. Certain intonations and ornamentations become highly characteristic in some ragas. Microtonality in Indian music is real, not a myth, but it is better understood as melodic shape or contour than as discrete points. A model describing contemporary intonation also needs to include volume and timbre along a temporal axis.
Improvisation
Improvisation does not mean random expression or arbitrary ordering of notes and patterns. It permits creativity within the constraints of raga grammar and aesthetic norms. It is based on permutation and combination of notes, varying accent and volume, and use of ornaments—both in matter (what to play) and manner (how to play).
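The idea of permutation under constraint can be illustrated with a toy sketch. The rule used here (a phrase must begin on Sa) is invented purely for the example and is not a rule of any raga; real raga grammar is far richer. The function names are likewise hypothetical.

```python
from itertools import permutations


def allowed_orderings(notes, is_allowed):
    """Return every ordering of the given notes that satisfies a constraint."""
    return [phrase for phrase in permutations(notes) if is_allowed(phrase)]


def starts_on_sa(phrase):
    """Toy constraint (hypothetical, not a real raga rule): begin on Sa."""
    return phrase[0] == "Sa"


# Four notes have 24 possible orderings; the toy rule keeps those starting on Sa.
phrases = allowed_orderings(("Sa", "Re", "Ga", "Pa"), starts_on_sa)
```

Even this crude filter shows the principle: the constraint narrows the field of possibilities without fixing the result, and the performer chooses among what remains.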
Speed and time are crucial to studying melodic shapes. A specific raga or a well-structured composition forms the foundation for improvisation. A story-telling logic leads to the raga-specific atmosphere and the aesthetic emotion.
Despite the primacy of voice, instruments abound: for solo performance, as drone support, or for melodic and rhythmic accompaniment. The first classification came from Bharata (200 BCE–200 CE), based on the sound-producing agent: strings, winds, solid body, and membrane. That system was the basis for the modern Sachs–Hornbostel classification of 1914.
C. V. Raman discovered in the 1920s the unique properties of Indian string and percussion instruments—owing to the peculiar bridge surface and the loaded membrane. Performance techniques enhance the acoustic quality even further.
Research includes spectral analysis, identification, and synthesis of specific instrument sounds. Work on stringed instruments' bridge surfaces—especially tanpura and sitar—aims to automate the manufacturing and maintenance process. Standardized instrument manufacturing is also being developed. Studies of string wear on given surfaces may lead to alternative bridge materials. Development of electro-acoustic and electronic instruments is ongoing.
Notation and its role
The relationship between notation and performance differs in Indian tradition compared to the West. Indian notations are oral in origin and mnemonic in function, whereas Western staff notation is graphic in origin and prescriptive. The system uses mnemonic syllables—naming sounds to help people talk about, think of, discuss, and transmit both melodic and rhythmic music. Independence from written notation allows, or reflects, the high degree of variation, embellishment, and improvisation found in performance.
Mnemonics give musicians a direct link between sound and symbol. They use sargam or bol for teaching, composing, and musical thinking. These sketchy notations act as an aide-memoire, particularly for preserving traditional compositions. From the late 19th century onward, printed compositions with notation began to be published for instruction, dissemination, and preservation of the repertoire.
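As a small illustration of the link between sound and symbol, the seven sargam syllables can be treated as names for the seven scale degrees. The sketch below maps a written phrase to degree numbers. The space-separated input format and the function name are assumptions, and the degrees are labels only: they say nothing about exact pitch, octave, or ornamentation.

```python
# The seven sargam syllables, numbered as scale degrees 1 to 7.
SARGAM_DEGREES = {"Sa": 1, "Re": 2, "Ga": 3, "Ma": 4, "Pa": 5, "Dha": 6, "Ni": 7}


def parse_sargam(phrase):
    """Convert a space-separated sargam phrase into a list of scale degrees.

    Assumed input format: syllables spelled as in SARGAM_DEGREES, separated
    by whitespace, with no octave or ornament markings.
    """
    degrees = []
    for syllable in phrase.split():
        if syllable not in SARGAM_DEGREES:
            raise ValueError(f"unknown sargam syllable: {syllable!r}")
        degrees.append(SARGAM_DEGREES[syllable])
    return degrees
```

The output of such a mapping is as sketchy as the notation itself: two very different renderings of the same phrase produce the same list of degrees.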
Although a direct connection between sound and mnemonic exists, diverse interpretations of those mnemonics are possible. In writing, the additional information (inflections and ornamentations) is never recorded. This makes the system inadequate for visual representation of music.
From notation to transcription
Notation is prescriptive; transcription is descriptive. Transcription provides a graphical interpretation of essential concepts and logical principles in a musical system. Manual transcription has a limitation: the coder is a black box—the inscrutable human brain. Understanding how that black box works would make the decoder at the other end function reliably.
Computer-aided transcription makes the coder-decoder system transparent, reliable, objective, and consistent.
AUTRIM: Automated Transcription for Indian Music
The project rests on a premise: sound and sight constitute a major synesthetic pair of senses. Combining auditory perception with simultaneous images of melodic shapes proved more effective. Audiences can "see" notes along with their intricate movements. Graphic contours help in understanding the music's "sound," something otherwise absorbed only through repeated learning and practice. Contours reveal what we do not hear, what we alter while hearing, or what we take for granted. They also offer insights into extremely subtle elements that we cannot distinguish aurally, though these may influence our perception subconsciously.
AUTRIM is an ongoing effort at the National Centre for the Performing Arts, Mumbai, in collaboration with the University of Amsterdam (Prof. Wim van der Meer). The team adapted PRAAT (created by Boersma and Weenink) into a full-fledged music analysis program for Indian music and processed a substantial body of audio. The final output is a 720p HD video showing melodic graphs synchronized to a mini raga performance of 10–12 minutes. The graphs are placed over a tonal grid, supplemented with rhythmic and poetic information, all displayed simultaneously with the corresponding audio.
A vertical cursor aligns the visual and audio information. The current database holds 110 compositions in 85 ragas. Videos for 25 ragas are already available, showing complete details about the raga, composition, performer, and analysis of the performance. The project can be found online.
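The geometry behind a melodic graph on a tonal grid with a moving cursor can be sketched as follows. This is not AUTRIM's implementation. The 1280x720 frame matches the 720p output mentioned above, but the visible pitch range, the 10-second scrolling window, and all function names are assumptions chosen for illustration.

```python
import math

FRAME_WIDTH, FRAME_HEIGHT = 1280, 720  # pixel size of a 720p frame


def pitch_to_y(freq_hz, tonic_hz, low_cents=-1200.0, high_cents=2400.0):
    """Vertical pixel for a pitch; the visible range in cents is assumed."""
    cents = 1200.0 * math.log2(freq_hz / tonic_hz)
    fraction = (cents - low_cents) / (high_cents - low_cents)
    return round((1.0 - fraction) * (FRAME_HEIGHT - 1))  # row 0 is the top


def time_to_x(t_seconds, window_start, window_seconds=10.0):
    """Horizontal pixel for a time inside the currently visible window."""
    fraction = (t_seconds - window_start) / window_seconds
    return round(fraction * (FRAME_WIDTH - 1))


def grid_rows(tonic_hz, step_cents=100, low_cents=-1200, high_cents=2400):
    """Pixel rows of the horizontal tonal grid lines, one per step_cents."""
    rows = []
    for cents in range(low_cents, high_cents + 1, step_cents):
        freq = tonic_hz * 2.0 ** (cents / 1200.0)
        rows.append(pitch_to_y(freq, tonic_hz, low_cents, high_cents))
    return rows
```

In such a layout the vertical cursor is drawn at `time_to_x(playback_time, window_start)`, so the point where the melodic graph crosses the cursor is the pitch being heard at that moment.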
Conclusion
Several components of art music are rule-based and model-based, so technology can help in understanding, analyzing, documenting, and developing those aspects. Involving musicians and musicologists is critical to ensure aesthetically meaningful and culturally viable work. Music remains an enigma: on one hand, as organized sound it is intentional and rule-bound; on the other, it is governed by culture-specific philosophical tenets rather than universally standardized measurable parameters.
Software needs to serve the "mindware" of music makers. As John Myhill put it, "Trying to characterize all the musical cognition in terms of computations alone is a bit like trying to paint all the landscapes without using green."
Acknowledgments
The author is grateful to Dr. Jamshed J. Bhabha, Founder Chairman of NCPA, and the Sir Dorabji Tata Trust.