© Robert Del Tredici
Sandeep Bhagwati, composer and professor at Concordia University in Montreal, proposes in this article to replace the musical score with audio instructions (hence the term “audio score”). The traditional signs of music notation thus give way to acoustic stimuli delivered to the performers through headphones. Such a practice redefines the traditional role of the composer and the composer's relationship to the performer, offering an expressive palette quite different from that of visual (or classical) scores. In the piece Villanelles de Voyelles, for example (https://vimeo.com/250181664, 2017), Bhagwati offers a resolutely modern approach to musical composition, one that invites us to understand, compose, perform and/or listen in a new way. First, the rendering of the work takes shape within a context: here one does not listen to a concert piece but attends a performance, a happening, in the lineage of John Cage. This also raises questions about the surrounding sound world, insofar as the music produced by the singers mingles with the urban sound environment. Finally, the score does of course seek a faithful rendering of its audio instructions, but above all it stimulates and provokes the performer's creativity, in a hybrid form between improvisation and composition: comprovisation. This article thus offers numerous leads to any young composer wishing to ask an important question: how do we make another musician make music?



Over the past 18 years, I have repeatedly worked with auditive tools and audio scores that completely replaced any written score. This paper examines the characteristics of the type of elaborate, autonomous audio score that I developed during this time, and attempts a preliminary classification of the compositional affordances that differentiate audio scores from visual scores. It describes the conveyance modes unique to audio scores; it touches on questions of control and context in elaborate audio scores, including the question of whether such audio scores must necessarily be comprovisation scores; it details how, in the context of elaborate audio scores, the terms “practicing” and “rehearsal” describe other kinds of activities than they do in the context of visual scores; and it discusses unique problems of timing in the performance and composition of elaborate audio scores.


1.1 Conveying Music through Sound

How do we make another musician make music - not any music, but a very specific musical gestalt, music that conveys a specific meaning, an adequate sensibility, an intentional emotion? In all musical cultures, this is a key question for music performance pedagogy. Not surprisingly, the answer usually is: anything that works – gestures, images, symbols, verbal descriptions. But most music performance teaching, even today, uses our ears: the teacher plays, the students imitate the teacher. Musical precision is conveyed most effectively through music itself.
The European practice of music notation, introduced into teaching as a mnemonic device among many others, a device initially well-suited to encode sonically abstracted pitch sequences but not much more, gradually evolved over a millennium to become the dominant channel for conveying eurological music from musician to musician. Over centuries, its always wildly heterogeneous catalogue of signs and symbols expanded to encode many, but never all, of the gestures, images and auditory information that previously had to be conveyed by personal contact.
But how many, precisely? No method transmits musical information free of loss or noise, especially complex niceties such as precise timing, dynamics or timbre. But music conveyance is not simply the transmission of information: each loss or misinterpretation significantly alters the aesthetic meaning conveyed. And musicking, while it may gainfully employ acoustic noise, is inimical to informational and structural noise.
European music notation has thus always relied on parallel, complementary channels of music conveyance: in teaching, the score is used as a support for the sonic and verbal conversations between students and teachers. In chamber music rehearsals, the score as a scaffolding saves time better used for discussions on finer points between musicians (and, if available, the composer), while in larger ensembles the role of the conductor has specifically evolved as a centralized music conveyor.
Conductors in performance, of course, exclusively use gestures and facial expressions to convey musical niceties, but in rehearsal they still often sing: the premise being that even a conductor’s usually quite inadequate acoustic rendering of a musical passage can convey more specific musical information than a gesture, let alone words, could. Again, music itself, even a whiff of it, is experienced to be the best conveyance for music.1

1.2 Acoustical Cues

Acoustical cues, a feature of many musical practices around the world (e.g. colotomic gongs in gamelan, shouts in many African and afrological musics, cadential rhythms such as tihais in Hindustani art music), often function as mid-level temporal indices that shape structural features within a musical flow or coordinate ensemble phrasing. A special case of such acoustical cueing can be seen in click-tracks2: conceived initially to sync the inflexible time structure of tape(d) music with the unavoidably flexible timings of human performers, they quickly came to be used by composers who desired precisely coordinated control over the speed and the extent of tempo changes in an ensemble – or who wanted the musicians of one ensemble to pursue individual tempo trajectories that would meet at specific moments: thus parametrizing time, as it were, both in its flux and in its synchronicities. It must, however, be pointed out that while most acoustical cues in other practices are used as the best available solution to a problem of coordination, click-tracks need not actually be acoustical, and probably are not even an optimal solution: visual time cues would work as well (and might even be less disturbing to musicians). In live performance, the click-track was most likely adopted only because paper scores already hogged the visual channel.
Nevertheless, click-tracks - their technical infrastructure as well as many musicians’ familiarity with them - opened a window for the previously unknown type of score discussed in this paper: the elaborate audio score.

1.3 What is an Elaborate Audio Score (EAS)?

For the purposes of this paper, this term denotes a type of score that uses headphones as its interface to the musician and conveys musical information primarily via acoustical messages. If we accept the definition of a score as the collection of all composer3-defined, non-contingent aspects of a performance, audio scores, then, are scores that primarily use auditory communication to convey such composer-defined aspects to the performers.
These aspects can be conveyed in different modes and exercise various functions: information, instruction, imitation, inspiration, and instance (more on these terms below). These aspects will usually be conveyed in real time, i.e. during the performance, although the last mode, instance, can be and has been used to complement a visual score.
In spite of their real-time bias, such elaborate audio scores need not necessarily be situative – they can be as fixed, and thus practice-able, as a written score. And yet, what is - and how it is - practiced will not be the same as in a written score: practicing such a score will tend more towards creative response than towards faithful execution, more towards exercising the imagination than exercising the fingers or the instrument.
Indeed, elaborate audio scores afford composers registers and opportunities of musical conveyance different from those possible in visual scores. They also exempt musicians from looking at a score, and thus free them to move around, and to use their eyes to take in other relevant information or to communicate, much as they do in improvisation or when music is played by heart.
Together with the possibility of conveying other registers of composerly intention to a musician, this unfettering of the musician’s body and gaze may be the strongest motivation for composers to choose the audio score as their primary communication channel for their compositional ideas.
These ideas, based on a different interface and sensory mode, must therefore be different from those underlying a written or graphic score – it is my experience that composition for elaborate audio scores, especially for ensemble music, most likely will employ the compositional stance called “comprovisation”, a complex intertwining of composition, structured improvisation and contextual improvisation – this, at least, has been the case in my compositions and comprovisations that use audio scores.

1.4 Developing an Elaborate Audio Score

My interest in audio scores began as early as a score called “Music for the Deaf and Blind” (1985), written in my first year of composition studies at Salzburg’s Mozarteum. In this piece, I had planned to let each musician in a classical piano trio play within a different sonic context – each would have a closed-concept headphone with different music, and they would be asked to play their written part along with the music in their headphones, not with their fellow musicians. This piece was never performed. Since 1999, however, I have been working with increasing frequency on progressively complex types of audio score. In l’essence de l’insensible [3] I used variable radio click-tracks enhanced with audio instructions to guide and coordinate 12 musicians through the sonically convoluted spaces of Richard Meier’s Stadthaus in Ulm (Germany), and to explore the aesthetic potential afforded by the difference between synchronicity and simultaneity. In Nexus [4] I used a continually reconfiguring live transmission network between five isolated musicians wandering in a cityspace to coordinate their musicking. In Alien Lands [5] I used a combination of animated score and audio score to enable the comprovisations of a spatially dispersed percussion quartet. In Iterations [6], I worked with live generated diverging and converging pulse paths, as well as with the “inspiration” mode detailed below that encouraged musicians to comprovise to a live DJ mix that the audience could not hear. During the gradual unfolding of a work cycle around a poem by Kabir, “I am a Bird from an Alien Land, my friend” (Oiseaux d’ailleurs [7], Ham Pardesi [8], Fremde Vögel [9], On Nostalgia [10], all for ensembles of 7-11 musicians), I finally developed elaborate audio scores that use all the conveyance modes listed below.
Work on this elaborate audio score continued with Villanelles de Voyelles [11] for four singers a cappella, and, at the time of writing, with “Ephémerides”, a new project for large, distributed ensemble, to be premiered in 2019.
The work on all these projects is the primary source for the analysis outlined below. This paper, like my previous work on the scores themselves, does not refer to, rely on or relate in any decisive way to the work of other composers. While I was distantly aware of and sometimes, in media reviews, read about works such as Alvin Lucier’s “Vespers” from 1968, which asks blindfolded performers to move in a space guided only by echolocation [12], Elisabeth Schimana’s works that rely on what she calls “sounding scores” [13,14], or the audio pitch and rhythm prompts for lay singers in Jonathan Bell’s compositions [15], I never actually encountered these works live or studied them in detail during the years (1999-2015) that I developed my elaborate version of an autonomous audio score.
If anything, I was more influenced and inspired by certain works of installation and performance artists such as Sophie Castonguay, whose audio instruction score patch for “Le souffleur” (2010) [16] was developed by the same programmer who designed the audio score patch for my work Oiseaux d’ailleurs; by TC McCormack’s performance project “Team Taxi” (2005) [17], where musicians sit in taxis that move around the city of Umeå, Sweden, and create live music by emulating the sounds and events they hear on this trip; by Tino Sehgal’s “This variation” (2012), where singers in a dark room at an exhibition take their cues and sonic material from the audience members coming to see the exhibition [18]; or by choreographers such as Xavier Le Roy, who upended the relationship between sound and the body in his “Mouvements für Lachenmann” (2005), in which he asks the musician to execute only the movements that would be required to make Lachenmann’s musique concrète instrumentale, but without any instruments – thus creating an inaudible, but mental music [18]; and finally Jérôme Bel, whose “The Show Must Go On” (2001) [19] asks performers to move only in response to the different musics they can hear in their headphones.4
The reason, however, that none of these works had any real bearing on my research-creation towards an elaborate audio score is simple: with the possible exception of Castonguay, none of these projects was interested in repeatable, precise instructions – they all aimed to create ephemeral, improvisatory situations rather than the kind of repeatable and coherent constellations of sonic events that characterize polyphonic and multilayered music scores. These projects did not really care about any specific dramaturgical shape and/or sound of the resulting music, whereas my intention was to develop a conceptual tool that could precisely convey musical ideas, sonic materials and complex cochlear and temporal dramaturgies to musicians while they perform – albeit in a less abstract mode of representation than that of a traditional ink-on-paper score.


As mentioned above, in an elaborate audio score the composer’s intentions may be conveyed to the musicians via different modes. It should be noted that all these conveyance modes are applicable to both real-time scores (when the audio messages are positioned, sequenced or even generated live) and offline scores (when audio tracks (i.e. parts) are prepared beforehand).
The difference between these score types will mainly impact production modalities, such as the nature of practicing and rehearsing (see section 4), or the preparation and integration of live vs. pre-recorded sonic materials. The sole difference they make to conveyance is quantitative: each score type will need a different set of conveyance modes and will weigh their importance differently.

2.1 Conveyance Mode A: Information

Cues are the most basic of auditory signals. They usually inform the musician about their spatial or temporal embedment or their place within the dramaturgy of an evolving performance. They assume that the musician knows what to do with this information and do not usually offer specifics.
Cues can take the form of a variable/intermittent/continuous click-track, a count-down to the next change, or a kairotic cue-list (“Cue for your Solo: start NOW!”). Cues could also inform the performer about aspects of a performance that require no immediate action or reaction (“next pitch set in 10 sec”, “spatialisation mode 3 is now active”) or connect the performer to other participants (“Singer expects your cue”, “Next cue from trombone”).
A special kind of cue is the pitch cue: a musical pitch (played as a tone, not verbally denoted) which the musician does not imitate, but which informs the performance. The most obvious of pitch cues would be a drone. Another example could be an upper pitch limit that the improvising musician should not surpass, or a pitch attractor, around which an improvisation should weave itself. While these tones themselves are purely informational, they must, of course, be prefaced with an instruction that tells the musician how to extract this information from them.
Cues, while basic, can nevertheless decisively shape the music: most dramatically in the case of a click-track with varying speeds, or one in which individual tempi diverge and then re-unite again. They also can be essential for the performance of a live-generated auditory score, where a performer needs to be prepared in advance in order to be able to act on upcoming messages.

2.2 Conveyance Mode B: Instruction

Instruction messages, for a composer, will feel like the closest analogy to a visual score: they actually tell a musician what to do at a given moment. Nevertheless, the types of instruction that are possible in an audio score are quite different from those in a visual score. Visual notation affords the composer detailed control over fast-moving structural detail, especially with regard to pitch sequence and duration. Audio scores, mainly because they inhabit the time of performance itself and cannot be previewed, cannot specify temporal structure in similarly fine detail: hence, their instruction set will always be limited to comparatively broad strokes.
Instructions come in several types: musical, interactional, para-musical and indexical. Musical instructions provoke musical structures that concern only the musician receiving the instruction; interactional instructions concern the musical relations between two or more musicians; para-musical instructions direct the performers to enact non-sonic behaviours; and indexical instructions point to, explain, and set up other conveyance modes.

2.2.1 Musical Instructions

While musical instructions in audio scores cannot shape musical structure in deep detail, they can provoke a more or less creative enactment of such structures. Such enactments can take different forms:
a) recall: instructions refer to material previously committed to memory (“Play Melody X”, “Play Rhythm Y”)
b) adapt: memorized musical fragments are used as material to be transformed into the current context (“Play Melody X to fit/counter the current tempo/time signature/register”, “Play Rhythm Y in triple time” etc.)
c) create: instructions describe the music to be played in a rather comprehensive fashion (“Play a sad / upward moving / triadic etc. melody”, “Play a jerky / groovy / rigid beat” etc.). Performers must then invent a music that fits these descriptions.
d) tune: musicians can be given precise pitches to play. This can be especially useful in microtonal contexts, and indeed seems one of the more practicable and reliable scoring solutions for precise microtonal tunings. It, of course, will work only with slow-moving pitch material. In live-generated scores, this format can also help tune the musicians to other sound sources, such as an environmental sound.
e) conduct: each musician can be given precise cues for starting and stopping, for the precise evolution of dynamics and pulse, and for the coordination with other musicians. These are tasks that usually are relegated to conductors. Audio scores, however, are a unique tool that can be used by composers to shape each of these musical parameters as they happen, and to do so separately for each musician or sub-ensemble.

2.2.2 Interaction Instructions

These instructions ask the performers to connect with other performers or with their environment – sonic or otherwise - in various ways. Such instructions can range from “Imitate performer x” to “Accompany performer Y” or even “Disturb performer z”, or other interactional behaviours. And they can focus the interaction on specific elements of another’s performance: “Follow the pitches of Z but in another rhythm” or “Match timbre with Y” or “Create a rhythmical dialogue with X”.
Similar interactions with the environment fall into this category, if they do not only reflect the sonic landscape (that would be more an imitative behaviour, see 2.3) but imply an interaction with it (“Trumpet: make the piano strings resonate” or “Accentuate/Satirize a conversation happening nearby”).

2.2.3 Para-musical instructions

Freeing the performer’s body and gaze implies new compositional parameters: directionality of body and gaze, body posture, the musician’s position and trajectory, etc. These can be integrated into a score in flexible ways previously difficult to define (“During the next 6 seconds: On a high pitch, quickly turn 180° while singing” or “When you hear a mordent from someone, slowly walk towards this performer”, “Turn away from the loudest among you.”).5 Such parametrizations can be used musically (mainly for flexible, improvisable, emergent types of sonic spatialisation as well as for re-configurations of the ensemble) as well as theatrically or choreographically.

2.2.4 Indexical Instructions

These are instructions that set up other conveyance modes: after all, the sound examples that are used as reference in the Imitation, Inspiration, and Instance modes (see below) are not self-explanatory – they need to be framed and defined by an instruction (“Mimic the following sound”, “Accompany the following sound”, “Improvise like in the following sound”). Similarly, such instructions can set up and define a cue (see 2.1) (“On next three cues: change timbre”).
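The framing requirement described here can be sketched as a small consistency check on an offline part. This is only an illustrative sketch, not the patch used in any of the works discussed: the names `Mode`, `Message` and `unframed_examples`, the sound file labels, and the rule that the framing instruction must be the message immediately preceding the sound example are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Mode(Enum):
    """The five conveyance modes named in the text."""
    INFORMATION = "A"
    INSTRUCTION = "B"
    IMITATION = "C"
    INSPIRATION = "D"
    INSTANCE = "E"


# Modes that rely on a sound example and therefore need framing.
EXAMPLE_MODES = {Mode.IMITATION, Mode.INSPIRATION, Mode.INSTANCE}


@dataclass
class Message:
    time_s: float                  # onset in the part, in seconds
    mode: Mode
    content: str                   # spoken text or a (hypothetical) sound file label
    frames: Optional[Mode] = None  # for indexical instructions: the mode they set up


def unframed_examples(track: List[Message]) -> List[Message]:
    """Return sound examples not directly preceded by a matching indexical instruction.

    Assumption: 'framed' means the previous message in time order is an
    instruction whose `frames` field names the example's mode.
    """
    problems = []
    previous = None
    for msg in sorted(track, key=lambda m: m.time_s):
        if msg.mode in EXAMPLE_MODES:
            framed = (
                previous is not None
                and previous.mode is Mode.INSTRUCTION
                and previous.frames is msg.mode
            )
            if not framed:
                problems.append(msg)
        previous = msg
    return problems


track = [
    Message(0.0, Mode.INSTRUCTION, "Mimic the following sound", frames=Mode.IMITATION),
    Message(2.0, Mode.IMITATION, "waterfall.wav"),
    Message(20.0, Mode.INSPIRATION, "street_noise.wav"),  # no framing instruction
]
for msg in unframed_examples(track):
    print(f"{msg.time_s:5.1f}s  {msg.content} lacks an indexical instruction")
```

A real score might allow one instruction to frame several consecutive examples; the strict one-to-one pairing is chosen here only to keep the check short.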

2.2.5 Wording

A final remark on the wording of instructions: there is a musical necessity to be as precise, unambiguous and concise as possible. Musical time is so much more finely grained than verbal time - and the longer or more complex a message is, the more musical time it consumes, both while it is heard and while it is processed by the performer. In addition, the longer an instruction, the greater the risk that it is not fully retained or understood by the performer (who, after all, is usually playing while listening to the instruction). At the same time, in a comprovisation context, instructions do not really work effectively if they are commands that must be followed blindly – they need to be experienced as hints that open possibilities rather than constraints that close down options.
I have frequently found the wording of instructions to be an aesthetic/creative act in itself, not unlike writing poetry.
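The cost of a spoken instruction in musical time can be made tangible with a rough calculation. The sketch below is illustrative only: the function name `instruction_cost` and the speech rate of 150 words per minute are assumptions, not values taken from the scores discussed, and the estimate ignores the additional time a performer needs to process the message.

```python
def instruction_cost(text, tempo_bpm, words_per_minute=150.0):
    """Estimate how long a spoken instruction lasts, in seconds and in beats.

    Assumption: speech proceeds at a constant rate of `words_per_minute`.
    """
    words = len(text.split())
    seconds = words * 60.0 / words_per_minute
    beats = seconds * tempo_bpm / 60.0
    return seconds, beats


short = instruction_cost("Play Melody X", tempo_bpm=120)
longer = instruction_cost(
    "Play Melody X to fit the current tempo and register", tempo_bpm=120
)
print(f"short:  {short[0]:.1f} s, {short[1]:.1f} beats")
print(f"longer: {longer[0]:.1f} s, {longer[1]:.1f} beats")
```

Under these assumptions, the three-word instruction occupies about two and a half beats at 120 bpm, while the ten-word version occupies eight: two full bars of 4/4 pass before the performer has even heard the whole message.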

2.3 Conveyance Mode C: Imitation

Set up by the indexical instruction “Mimic the following sound” the performer aims to closely lock into a synchronized (or, if possible, responsive echoing) imitation of a sound example heard in the headphone. The composer is completely free to use any sounds as sound examples6 – a part of the interest in this feature will be the actual, physical inability to exactly imitate the sounds presented on one’s instrument: e.g. when a flutist hears a waterfall’s bass rumble, or a keyboard player hears a microtonal glissando. The strain to imitate the impossible will produce music that the performer would not have used in the course of their usual idiosyncratic improvisations.
An interesting aspect of this approach to imitation is the insight that the sound example will never be imitated perfectly – and that embracing this impossibility opens another door: just as Chinese script characters enable the same thought to be communicated and spoken in widely different dialects and languages, the imitation mode enables musicians of widely different traditions and instruments to create the same sonic dramaturgy within their own sonic reference frame, even though their individual realizations of the sound to be imitated might differ wildly. 7
A special case of this (and the two following modes) would be the invitation to mimic sounds and sonic structures outside the performer’s headphones, in the immediate or mediated environment. This introduces even more contextual chance elements into the score, and seems to require a kind of default instruction that kicks in when, for any reason, the environment does not afford anything that the performer could use as a sound example.

2.4 Conveyance Mode D: Inspiration

Set up by an indexical instruction that specifies an interactional relationship with a sound example, such as “Accompany/accentuate/satirize/simplify etc. the following sound”, the performer uses the sound in the headphones (or outside) to orient her/his playing in the interaction mode defined by the instruction. This orientation is not mimicry in the sense of the previous mode, but rather a way of playing that takes off from the example, expands it, comments on it, counterpoints it. This includes the possibility that the musician will play something that is not similar to the sound of the example, but emerges from a musical dialogue with it.
Interestingly, these interaction modes usually describe social or structural relationships rather than musical ones. In effect, the player treats the sound example in the headphone as if it emanated from another performer or other performers - and plays with these “other performers” according to their mutual musical and social positioning.8
One can, of course, ask a performer to be inspired by the sound example in a strictly musical, compositional manner (e.g. “play a floridus counterpoint to the example”, “play the example as a New Orleans jazz phrase”, “only play spectral overtones of this sound” etc.). This, obviously, will limit the choice of performers to those able to easily navigate such technical or stylistic constraints. But such a musical constraint can also be productive if used against the grain.
For example, I have found it musically interesting, in working with ensembles consisting of musicians from different traditions, to generalize such instructions to e.g. “play this example as it would be played in your tradition”. In this way, aesthetical choices (here, an interest in composing with the differences between musical mannerisms) can determine and redefine the function of particular modes of conveyance.

2.5 Conveyance Mode E: Instance

In this mode, the sound example the musician hears in the headphone9 is used indeed as an example, one instance of a particular style of musicking that the performer is expected to realize. These examples are, in a sense, seeds for a specific music to come: everything about them can be important and become a guide to improvisation.
As a composer, one can either rely on the performer’s ability to both intellectually and intuitively grasp the specifics of this particular instance of possible musicking – or one can specify those aspects of the sound example that could become generative in the context of the current performance: “Take the rhythms and improvise with them”, “Develop the example’s melodic movement”, “Like in the example, play with timbral changes” or a similar focus on other parameters.
Instances can be used as examples in the legends of visual scores, too (I have, for example, used them to specify and differentiate different types of glissando, or to show a specific desired voice quality). In an audio score, they become a powerful and enabling live comprovisation tool.

The last three approaches delineate three different interactions with any given sound example: imitation engages in sonic mimicry, inspiration engages in musical elaboration, while instantiation is a process of analysis and continual re-construction.


3.1 Comprovisation

Most music traditions arise from the fact that those aspects of a performance that need to remain coherent from one performance to the next and those that can be left to contingency, context and improvisation tend to converge on a stable, praxis-based mix: each tradition ‘selects’ a unique constellation from among all the possible permutations of performance parameters10. Further musicking in such a tradition is then determined by this constellation.
For musicians within a specific tradition, its axiomatic constellation of performance parameters will over time become unquestioned and invisible. For example, western classical musicians usually do not ask themselves why composers in their tradition (who mostly do not play with them) have readily provided them with pitches, rhythms and articulations - but often have left performers to figure out vibrato, portamenti, rubati or the kind of reed they use etc. They do not question this particular choice of parameters, but rather accept it as their baseline - and focus their creative energy on shaping those “surplus” parameters that their tradition leaves undefined.
Comprovisation, in contrast, is a creative mode in which composers, for each new piece, must decide the specific constellation of parameters that are to remain unchanged from one performance of the piece to the next, as well as those that are to be decided in the performance context [21]. Such decisions are often guided by several categories of constraints - cognitive (how many different and separate parameters can a musician control while playing), social (how much minute aesthetic control over a performer is socially acceptable, to what degree is a score perceived as an invitation for co-creation rather than as one where performers ‘execute’ the directions of an author) and, for a large part, technical/ structural/ organisational (available instruments and technology, players’ abilities and preferences, acoustics of available venues, can players hear/see each other, etc.).
The elaborate audio score, initially defined primarily as a specific interface and mode of conveyance, has already been shown to afford and privilege certain modes in which aesthetic or pragmatic information can be conveyed to the musician. There is, however, and for now, no particular school or aesthetic tradition based on audio scores, i.e. there is no “conventional” set of performance parameters, conveyance modes and sonic behaviours that performers and composers can regard as given when they embark on musicking with an audio score. This situation thus requires composers to constantly think about defining their own selection of performance parameters, almost anew for each artistic project: their creative mode for using audio scores thus must be comprovisation.
As mentioned above, audio scores are not ideally suited to prescribe, describe or control fast-moving, non-repetitive details of pitch sequences, durations or articulation. Instead they allow composers to inspire ensemble musicians to realize sonic behaviours that transcend the limits of written notation – and to coordinate them in ways impossible for improvisers. Many of the sounds and sonic behaviours resulting from audio scores will, of course, be familiar both from improvised and from composed music. But in an audio score, they can be sequenced and arranged in conceptually and/or dramaturgically elaborate musical relationships and ensemble constellations that transcend both the barely situative written score and the bare scaffoldings or the entirely emergent dramaturgies of improvised music – they enable complex architectures of ensemble comprovisation.
Moreover, audio scoring enables a composer to devise performances on the basis of any sonic behaviour whatsoever – including those that in the normal course of improvisation or sound production would require lengthy emotional/musical build-ups or that musicians would never use instinctively in their improvisations.11 Such extra-traditional sonic behaviours can be coordinated and sequenced in utterly non-improvisational ways, while retaining their ontological openness for improvised sonic realization. As such, audio scoring is a creative mode that straddles both composing with conventional and graphic visual notation (imagining sounds, providing prompts to realize an imagined sound) and composing electroacoustic music (working with each sound as it is, without considering its reproducibility or re-creation).

3.2 Timing

3.2.1 Precise timing

Tracing the advanced audio score back to click-tracks as one of its forerunners highlights one of the most obvious affordances of audio scores to the composer: perfect control over timing. Not only is it possible to enable groups of musicians to play in precisely coordinated variable tempi (rubati, accelerandi, ritardandi etc.), but such variable tempi can also be composed polyphonically, allowing a different temporal evolution for each musician while ensuring that all converge on a new common tempo at a later moment.
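How such polyphonic tempi can be made to converge may be illustrated with a small calculation. The following is a minimal sketch, assuming each musician's click-track follows a linear tempo ramp; the function name `click_times` and the chosen tempi and duration are hypothetical. For a ramp from tempo b0 to b1 (in bpm) over T seconds, the number of beats elapsed at time t is (b0·t + a·t²)/60 with a = (b1 − b0)/(2T), so the onset of beat n is the positive root of a·t² + b0·t − 60n = 0.

```python
import math


def click_times(start_bpm, end_bpm, duration_s):
    """Click onsets (in seconds) for a linear tempo ramp over `duration_s`.

    Assumption: tempo changes linearly in time from start_bpm to end_bpm.
    """
    a = (end_bpm - start_bpm) / (2.0 * duration_s)
    # Beats covered by the ramp: average tempo times duration.
    total_beats = (start_bpm + end_bpm) / 2.0 * duration_s / 60.0
    times = []
    for n in range(int(math.floor(total_beats + 1e-9)) + 1):
        if abs(a) < 1e-12:
            t = 60.0 * n / start_bpm          # constant tempo
        else:
            disc = start_bpm ** 2 + 240.0 * a * n
            t = (-start_bpm + math.sqrt(max(disc, 0.0))) / (2.0 * a)
        times.append(t)
    return times


# Two musicians, one accelerating and one slowing down, both reaching
# 120 bpm after 12 seconds.
slow = click_times(60, 120, 12.0)
fast = click_times(180, 120, 12.0)
print(len(slow) - 1, "beats, last click at", round(slow[-1], 3), "s")
print(len(fast) - 1, "beats, last click at", round(fast[-1], 3), "s")
```

With these values the accelerating part covers 18 beats and the decelerating part 30 beats, and both land a click exactly at the 12-second mark at the shared tempo. Other ramp shapes, or ramps whose beat totals are not whole numbers, would need a different treatment of the meeting point.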
While such advantages certainly are useful, they are not applicable to all musical situations: Accelerandi and ritardandi often are more expressive when they are not precise, and rendered ad-hoc to fit the dramaturgical context. Diverging and converging tempi or polytemporal rhythms, in order to become aesthetically perceptible, usually require the musical material itself to be restrained and concise – and such restraint may well run counter to stylistic or improvisatory affordances.
In many cases, click-tracks, whether pre-recorded or live-generated, simply are not optimal solutions for a desired outcome. For example, musicians in imitation mode will often be attracted to or perturbed by any timings in the sound example, and many instructions effectively generate their own temporal structure which may clash with the abstract pulsations of a click-track.
Lastly, audio scores, unlike visual scores, confront musicians with a score element, a message or instruction in real time. One cannot, in an audio score, glance ahead towards things to come – rather, each instruction and example in the score arrives in the actual present, and must be processed (i.e. understood and musically realized) immediately. But this moment of immediacy has an indeterminate duration – each musician will react more or less promptly to an instruction, and may take a different amount of time to process it into actual sound. Often, especially in live-generated scores, such instructions may arrive at any moment in a musical flow, and in certain stylistic or musical contexts, the musician may need to “wind down” the current utterance before taking up the new instruction. In music, however, aesthetically relevant coherence and coordination are a matter of split seconds – and the slightest of such hesitations could thus destabilize a music that relies on precise click-track compliance for its aesthetical import.
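The polyphonically composed tempi described at the start of this subsection can be illustrated with a small calculation. The following is a minimal sketch, not the author's software: it assumes that each musician's click-track follows a linear tempo ramp over a fixed duration, and the function name click_times and its parameters are hypothetical. It computes the click onsets for one such ramp, so that several musicians with different starting tempi can be given tracks that all arrive at the same tempo at the same moment. It makes no attempt to align the phase of the beats at that moment.

```python
import math


def click_times(start_bpm, end_bpm, ramp_s):
    """Return click onsets in seconds for a linear tempo ramp lasting ramp_s seconds.

    Assumption: tempo(t) = start_bpm + (end_bpm - start_bpm) * t / ramp_s.
    The number of beats elapsed at time t is then a * t**2 + b * t.
    """
    a = (end_bpm - start_bpm) / (120.0 * ramp_s)  # quadratic coefficient
    b = start_bpm / 60.0                          # linear coefficient
    total_beats = (start_bpm + end_bpm) / 2.0 * ramp_s / 60.0

    times = []
    for k in range(int(math.floor(total_beats)) + 1):
        if a == 0.0:
            t = k / b  # constant tempo: evenly spaced clicks
        else:
            # Solve a * t**2 + b * t = k for the root inside the ramp.
            t = (-b + math.sqrt(b * b + 4.0 * a * k)) / (2.0 * a)
        times.append(t)
    return times


# Two hypothetical musicians: one accelerates from 60 to 90 bpm,
# the other slows from 120 to 90 bpm, both over 20 seconds.
musician_a = click_times(60, 90, 20)
musician_b = click_times(120, 90, 20)
print(len(musician_a), len(musician_b))
```

In this sketch both tracks reach 90 bpm after 20 seconds, although each musician has followed a different temporal evolution and played a different number of beats on the way there.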

3.2.2 Heterophonic Elastic timing

Audio scores are a tool suited particularly well to what I call ‘heterophonic elastic timing’, i.e. a mode of temporal ensemble coherence that is neither rubato (localized pulse variance) nor swing (localized variance in pulse/attack couplings) nor, of course, straight “playing-on-the-beat”. It also is different from kairotic, inner timing in solo improvisations, because, although it may appear similar, heterophonic elastic timing can only really apply in an ensemble setting: the term describes a particular type of coherence between different musicians.
Heterophonic elastic timing occurs when a score is not only tolerant of the minute differences between individual performers in reaction time, processing time, and individually felt fit to the current musical activities – but when it actually embraces and expects such individual aberrations within the ensemble, usually in the interest of a larger goal: this could be maintaining an emotionally/kinesthetically convincing flow, or an interest in perturbances and their effects on musical dramaturgy etc.12
Performers in my audio score pieces have likened the experience of playing in heterophonic elastic timing to the coordination of fish or birds in a swarm: a common trajectory is followed, but nevertheless each participant in this swarm has a certain leeway in seeking their way – for example if one encounters an obstacle, or if winds or currents require adaptation. In an audio score with heterophonic elastic timing, performers are effectively asked to coordinate dramaturgically (i.e. by ear), while the temporal flow weaves in and out of synchronicity.
A special case where precise and heterophonic elastic timing are both applicable in audio scoring is the situation in spatially dispersed, and maybe even spatially mobile ensembles: here, a precisely synchronized audio score can serve as the rigid conceptual scaffolding for a music that will sound quite elastically timed, simply because each listener will be at a unique location that is defined by a specific set of time lags for each musician, depending on the distance of the musicians. A composer could make use of this effect by writing exactly the same rhythm for all musicians, and then let the position and movement of the listener ‘compose’ a flexible spatial canon.
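The listener-dependent time lags in the dispersed-ensemble case follow directly from the speed of sound. The sketch below is an illustration under stated assumptions: a speed of sound of roughly 343 m/s, musicians and listener as points on a flat plane, and hypothetical names (arrival_lags, heard_onsets) and positions that do not come from any of the pieces discussed. It shows how one identical, precisely synchronized rhythm reaches a given listener position as staggered copies.

```python
import math

# Assumption: approximate speed of sound in air at about 20 °C.
SPEED_OF_SOUND_M_PER_S = 343.0


def arrival_lags(musicians, listener):
    """Map each musician's name to the acoustic delay in seconds at the listener."""
    return {
        name: math.dist(position, listener) / SPEED_OF_SOUND_M_PER_S
        for name, position in musicians.items()
    }


def heard_onsets(rhythm_s, musicians, listener):
    """Return, per musician, when each onset of the shared rhythm reaches the listener."""
    lags = arrival_lags(musicians, listener)
    return {
        name: [onset + lag for onset in rhythm_s]
        for name, lag in lags.items()
    }


# Hypothetical layout in metres; every musician plays the same rhythm.
musicians = {"A": (0.0, 0.0), "B": (60.0, 0.0), "C": (0.0, 100.0)}
rhythm = [0.0, 0.5, 1.0, 1.5]

for listener in [(0.0, 0.0), (30.0, 0.0), (60.0, 0.0)]:
    lags_ms = {
        name: round(lag * 1000.0)
        for name, lag in arrival_lags(musicians, listener).items()
    }
    print(listener, lags_ms)
```

At this assumed speed of sound, every 34.3 metres of distance adds about 100 milliseconds of delay, so a listener walking from musician A towards musician B hears the two copies of the rhythm slide against each other, which is the flexible spatial canon described above.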

3.3 Situative and Fixed Audio Scores

Audio scores occupy a curious middle ground between situative and fixed scores. If we follow the definition of situative scores as “scores that do not build on linear, pre-existing information structures. Information in these scores is only available ephemerally, i.e. while it is displayed or accessed in a particular context” [24], then audio scores are situative scores – during performance, every instruction or example is only ephemerally available to the performer around the time of its realization. And in the case of live-generated audio scores, this assumption holds water.
However, both in my work and that of others, the audio score has also been used in a fixed format – the individual performers’ tracks, like orchestral parts of a written score, remain the same for any performance, and can even be played on mp3 devices, their start synced by gestures. In this case, the individual part itself is no more situative than a written score – each performer can play it back to themselves and, if it helps, even learn it by heart. The audio comprovisation score is fixed and repeatable – which means it can be rehearsed, much like any visual score.


In elaborate audio scores, rehearsal is an important facet that guides their implementation and even the composer’s choices.
The performance of audio scores usually requires fewer ensemble rehearsals than a complicated chamber music composition and more than a free improvisation concert. And it usually requires more individual practice and exploration than both the chamber music concert and, most likely, also a free improv concert. What are the demands on a musician performing the kind of elaborate audio score discussed in this paper?
In any audio score comprising more than the most basic of elements (durations and pitches), the particular set of instructions first needs to be learned and understood. As mentioned in 2.2.5, the constraints on the wording of instructions are intense, and almost always will require the composer to use shorthand terms for more complicated ones, and explain them in the legend. In this, the first approach to an audio score is very similar to that needed for a conventional new music score that uses many non-standard symbols.
Once the musicians understand all the instructions, they might need to practice particularly demanding passages, just like in any other score. The difference, however, is that these passages will only rarely be demanding for their fingers or larynx – rather, the difficulty in these passages will mostly pose a conceptual or creative challenge: How to create engaging and convincing music in imitation, inspiration or instantiation of a given sound example – especially when the score affords only a fairly short window of a few seconds to make such a musical statement? In my experience, the only truly virtuosic challenge in practicing an audio score tends to arise with complicated click-track-led tempo changes and pulse-based improvisation.
The main questions that need to be addressed in subsequent ensemble rehearsals usually are again very different from usual orchestra, chamber ensemble or band rehearsals. Coordination in time and pitch, in phrasing and in musical inflexion, the great time devourers in usual rehearsals, are almost absent from the audio score rehearsal process – as delivering exactly these parameters to the musicians is the great forte of such scores. Most rehearsals I have witnessed tend to use the available time to focus on the musical interaction between the musicians, on understanding one’s role in a larger context and, as a consequence of this understanding, on exploring one’s responses to the instructions and sound examples. In rehearsing an audio score, musicians, much like theatre actors, need to understand the musical persona their engagement with the audio score brings forth from inside themselves.


5.1 Interface and Infrastructure

Audio scores, while using a comparatively recent technological interface, are not currently in dire need of ongoing technological development – they rely on existing technologies. In fact, today’s audio and wireless technologies require no more than very minor tweaks in order to be appropriate for all kinds of audio scores for the foreseeable future.
All an audio score requires are interfaces to the musician’s ear(s) (typically: open-concept headphones), a device providing the sequence of acoustic conveyances that make up the audio score, and, for some uses, a centralized, multi-channel audio dispatching system. If musicians are expected to move through space freely (after all, one of the primary motivations for using an audio score) then this dispatching system must be wireless. All these technologies have for some time already attained commercial viability and reliability, and are commonly used in commercial branches of the live entertainment industry as well as in a variety of non-artistic professions such as the military, police, or large construction sites.
Likewise, any software that would control the score or the multi-channel dispatcher is comparatively easy to come by: in many cases, the basic functions of studio sequencing software are largely sufficient, and if not, multi-channel real-time composition software frameworks are comparatively easy to program. While it is conceivable that a specialized audio score composition software might emerge, there currently seems to be no need for one.
The only remaining source of technological uncertainty concerns the synchronization problems that may emerge in future, more evolved and data network-centric instantiations of the audio score13 when many wireless data channels within close range must be kept in sync with one another. Interference, critical dropouts and unpredictable variations in latency can be assumed to remain vexing nuisances. Should the realization of an audio score therefore require split second coordination, analog radio transmission has so far proven to be the more reliable option.

5.2 Ensembles

As already mentioned above, the most obvious use for audio scores in music is an ensemble – in principle, of any size.14 For the audience, the interplay of synchronicity and diversity, the joys of co-incidence and divergence, the seemingly unconducted and unexpected kairotic moment as well as the richness and tangibility of quickly changing, observable spatialisation through moving musicians are essential aesthetic assets of performances using an audio score, as can be the more choreographic or theatric possibilities such a score affords the composer. All these would obviously remain absent in a solo score – the one exception being: a solo musician performing to a live-generated audio score that in a specific, artistically insightful and perceptible way connects the comprovisational solo to the audible or visible, but ostentatiously non-composed, contingent context, environment or situation of the performance.


As we have seen, audio scores, at first blush merely a new type of interface, create new affordances for composers, require new approaches to playing with a score for performers and afford new aesthetic experiences for audiences. A widespread use of this interface would thus likely lead to new aesthetics of musicking. Competent and insightful reflections on such a sea change, however, would require detailed musical and theoretical analyses of actual comprovisation works that use audio scores.
This paper intends to provide some tools for such analyses, and for the ensuing aesthetic discussion. But most of all, it is a composer’s invitation to other composers, a little manual of how to approach and think through composing with this relatively new and, as far as I can see, not yet intensively explored score interface for novel types of communications between composers and performers.


Research-Creation leading to this paper was financially supported by the Social Sciences and Humanities Research Council, the Canada Research Chairs Program, the Fonds Québécois de Recherche – Société et Culture, the Canadian Congress for the Humanities, the Conseil des Arts et Lettres de Québec, the Hauptstadtkulturfonds Berlin, and the Société de Musique Contemporaine de Québec. It was artistically supported by matralab (Concordia University), Stadthaus Ulm, Radialsystem V Berlin, Ensemble Supermusique Montréal, Bye Bye Butterfly Percussion Quartet, and Ensemble Extrakte Berlin. The author would also like to thank: Dr. Martin Scherzinger, my audio score software developers Matthieu Marcoux and Joseph A. Browne, as well as all the many musicians who tested the different versions of the Elaborate Audio Score for valuable insights and hints.


[1] G. L. Duerksen. Teaching Instrumental Music. Music Educators National Conference, Ann Arbor: University of Michigan Press, 1972.

[2] G. Ligeti, “Poème Symphonique” for 100 metronomes. London: Boosey & Hawkes, 1962.

[3] S. Bhagwati, “L’essence de l’insensible” for dispersed ensemble with clicktrack, unpublished score manuscript. Berlin, 1999.

[4] S. Bhagwati, “Nexus” for 5 itinerant musicians with musical audio scores, unpublished score and software, Montréal 2010. [Online]. Available: http://matralab.hexagram.ca/projects/nexus/

[5] S. Bhagwati, “Alien Lands” for 4 distributed percussionists with animated and live-generated scores, unpublished software score, Montréal 2011. [Online]. Available: http://matralab.hexagram.ca/projects/alien-lands/

[6] S. Bhagwati et al., “Iterations” for speaker, 8 musicians, 2 silent DJs, headphone installation and interactive clicktrack, unpublished score & software, Berlin, 2014. [Online]. Available: https://vimeo.com/120307891

[7] S. Bhagwati, “Oiseaux d’ailleurs” for 11 musicians with written score and live-performed audio score, unpublished, Montréal, 2011. [Online]. Available: http://matralab.hexagram.ca/projects/oiseauxdailleurs/

[8] S. Bhagwati, “Ham Pardesi” for 8 itinerant musicians with pre-recorded audio scores, unpublished audio score, Montréal, 2014.

[9] S. Bhagwati, “Fremde Vögel” for 7 itinerant musicians with pre-recorded audio scores, unpublished, Berlin, 2015.

[10] S. Bhagwati, “On Nostalgia” for 9 musicians with pre-recorded audio score, in: Ensemble Extrakte, Treatises on Trans-Traditional Musicking [CD]. Berlin, 2017. Track 13.

[11] S. Bhagwati, “Villanelles de Voyelles” for 4 singers a cappella with pre-recorded audio scores, unpublished, Montréal, 2017. [Online]. Available: http://matralab.hexagram.ca/projects/villanelles-devoyelles/

[12] A. Lucier, “Vespers” (1968) for blindfolded performers with mobile echolocation devices. In: A. Lucier and D. Simon, Chambers: Scores by Alvin Lucier. Middletown: Wesleyan University Press, 2012. p.15–27.

[13] E. Schimana. “Virus #1.0-#1.7” for live generated electronic resonating body and diverse instruments. [Online]. Available: http://elise.at/project/Virus_1

[14] E. Schimana, “Vast Territory. Episode 1 Lily Pond” for violin, viola, cello, bass clarinet, flute and sounding score. [Online]. Available: http://elise.at/project/Vast%20Territory

[15] J. Bell and B. Matuszewski, “SMARTVOX. A web-based distributed media player as notation tool for choral practices,” in Proceedings of the International Conference on Technologies for Music Notation and Representation (TENOR’17), A Coruña, Spain, 2017, pp. 99–104.

[16] S. Castonguay, “Le Souffleur” performance for six performers with headphones. [Online]. Available: http://www.sophiecastonguay.ca/index.php?p=lireprojets&idprojet=15

[17] TC McCormack, “Team Taxi” performance for 6 taxi drivers and 6 musicians. [Online]. Available: http://www.tcmccormack.co.uk/work/team-taxi.php

[18] C. Bishop and T. Griffin, “No pictures, please: Claire Bishop on the art of Tino Sehgal,” Art Forum, vol. 43, no. 9, 2005.

[19] X. Le Roy, Mouvement für Lachenmann. Staging of an evening concert. [Online]. Available: http://www.xavierleroy.com/page.php?sp=e347f884fa37480bd0bd5dff79104483a8e284b5&lg=en

[20] J. Bel, The Show Must Go On, 2001. [Online]. Available: http://www.jeromebel.fr/index.php?p=2&s=6&ctid=1

[21] S. Bhagwati, “Comprovisation – Concepts and Techniques” in H. Frisk, S. Östersjö (Eds.), (Re)Thinking Improvisation. Malmö: Lund University Press, 2013.

[22] R. Polak, “Rhythmic Feel as Meter: Non-Isochronous Beat Subdivision in Jembe Music from Mali,” in Music Theory Online – A Journal of the Society for Music Theory, vol. 16, no. 4, 2010.

[23] M. Scherzinger, “Temporalities,” in A. Rehding, S. Rings, The Oxford Handbook of Critical Concepts in Music Theory. New York: Oxford University Press, 2019. (In press)

[24] S. Bhagwati et al., “Musicking the Body Electric. The “body:suit:score” as a polyvalent score interface for situational scores,” in Proceedings of the International Conference on Technologies for Music Notation and Representation (TENOR’16), Cambridge, UK, 2016

1 “Auditory models provide the only known method to develop an idea of how a specific instrument or passage should sound.” [1]
2 Mechanical Maelzel-type metronomes are a special case here: They indeed are acoustical prompts – but until Ligeti’s ‘Poème Symphonique for 100 metronomes’ (1962), [2] they were primarily a rehearsal tool, not intended for actual performance. Also, unlike the examples mentioned above, metronomes, with their inflexible, non-resettable pulse rate, do not offer kairotic cues: they offer a chronological framework. Click-tracks, initially used as metronomes for multi-track recordings, were much more flexible – they could be used in performance, and their pulse rate could be made to change over time.

3 Throughout this paper, the terms ‘composer’ and ‘performer’ signify roles, not persons. The role of the ‘composer’ can be filled by an individual or a collective, by software or by a traditional method of intergenerational creation. The role of the ‘performer’ can be filled by a human instrumentalist, a singer, a programmer, a dancer or actor, and any combination thereof. Non-human sound producers, while sometimes regarded as performers in a wider sense of the word, either are usually not conditioned [animals] or not required or able [machines, natural phenomena] to parse and interpret verbal instructions conveyed by audio in a presentational performance context.
4 Indeed, while watching this show in Berlin in 2005, I was strongly reminded of my own abandoned “Music for the Deaf and Blind”.
5 While musicians do move about in other types of music, such movements are either memorized (marching bands, choreographed performances) and optimized for the audience – or spontaneous and optimized for the performers. Audio scores allow musician movement to be developed further, into very specific configurations between choreography and spontaneity.
6 As far as I can discern, there are no limiting constraints for the kind of sound example that can be used, beyond the insight that the more complex a sound example is, the shorter it should be for imitation, inspiration and instance to work at all: the musician must, after all, get a fair chance to absorb the example in its entirety and in its details before reacting to it.

7 Of course, it is possible that a composer really intends to have a performer imitate a sound example perfectly, down to the inflections and microtimings – then the performer should have the opportunity to practice this imitation beforehand – it effectively becomes a sonic objet trouvé.
8 This reminds us that all sound examples could, in principle, also come live from other performers – whether they are in the same space or are telematically connected. Indeed, elaborate audio scores, and the elastic timing discussed in this paper, could be used as a powerful scoring tool in telematic performances.

9 The sound examples used for imitation, inspiration and instantiation can be, of course, taken from existing music / field recordings – but they also can be newly composed and recorded specifically for the sonic context of this piece. This would mean that a significant part of the composer’s sonic creation may be inaudible to the audience – if the composer does not decide to use this material in the performance, too – either as memorized performer scores or as part of an audio track played back in the space.
10 i.e. pitch, duration, timbre, acoustics, spatialisation, but also conventions of the performing body (posture, dress, movement), social relationships between performers, signalling between performers and many more. Each of these performance parameters, in most musicking traditions, is set within such narrow ranges of acceptability that even minute deviations or tweaks can have huge aesthetical import – a fact often and strategically exploited, for example, by the avantgarde movements in eurological art music over the course of the 20th century.

11 Another important affordance of the audio score is that it can be scored in ways that are culture/tradition-agnostic: Precisely because aesthetic intent is conveyed by a combination of natural language and recorded sounds, and not by culturally specific notation conventions, musicians from different traditions will, for the most part, understand and work with the audio score more quickly, more reliably and with less stress than with other kinds of notation. The score itself requires no cultural adaptation or learning, once its basic functioning is understood. This, of course, is not to say that what musicians from different contexts will hear and how they interpret it will be identical – the sonic realization of an elaborate audio score may vary not only from one tradition/culture to another, but also from individual musician to individual musician.
12 The concept of elastic timing itself is, of course, no invention of the author [22], [23]: several Asian traditions, such as sanjo and p’ansori, the music of gagaku, gugak, or jingju orchestras, as well as heterophonic chanting practices from Vietnam to Georgia are built on elastic timing as described above, as are drumming traditions in sub-Saharan Africa. The unique contribution of the elaborate audio score to elastic timing is the fact that each voice can be elastically timed in a different way, not only in a single temporal flow.
13 e.g. ones using sensor data and/or individual score processors on each musician’s body etc.

Copyright: © 2018 Sandeep Bhagwati. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Article published in the proceedings of the TENOR 2018 conference (an international conference on technologies for music notation and representation), under a free Creative Commons license, reproduced with the kind permission of the author.


Sandeep Bhagwati,
Concordia University Montréal