8. Percussion Dialogues – text without a voice

The previous four chapters have dealt with various forms of vocal delivery of text, whether live or pre-recorded. This chapter deals with visual presentation of text without accompanying vocals – a form of ‘creative surtitling’ – that was applied to one particular type of scene in Kane’s text, the Doctor–Patient dialogue scenes.

There are four extended scenes in Kane’s text, scenes 6, 10, 12 and 23, which use character-to-character dialogue in the style of a traditional character–narrative stageplay. The two speakers in these scenes are clearly marked by dashes in the margin, although no character names are given. From the text of these scenes we assume that one person is a doctor/therapist, the other a patient. Only in these scenes is Kane’s language colloquial, conversational, prosaic, direct question-and-answer. There is also humour in these scenes – wry, witty retorts by the patient to the doctor’s blunt questions. (Scenes 1, 16 and 18 also use this dash convention to indicate dialogue. These are much shorter, and in 16 and 18 consist mainly of monologue by the patient followed by a very brief exchange with the doctor. These scenes were subjected to different musical treatments from that described in this chapter.)

An excerpt from the opening of Scene 6 is here (Kane, 2000: 8–9):

– Have you made any plans?

– Take an overdose, slash my wrists then hang myself.

– All those things together?

– It couldn’t possibly be misconstrued as a cry for help.


– It wouldn’t work.

– Of course it would.

– It wouldn’t work. You’d start to feel sleepy from the overdose and wouldn’t have the energy to cut your wrists.


– I’d be standing on a chair with a noose around my neck.


The dry humour and prosaic tone of this dialogue are clear from the example above, as is Kane’s intention that these scenes follow a natural conversational rhythm (implied also by her stipulation of where silences should fall). The emotional intensity of her writing in these scenes depends on an absolute dryness of delivery. There is little distress written into the text, even though the subject of discussion is very distressing; in other words the content (profound) is distanced from the style (banal), and therein lies the success of these scenes. This line from Scene 12 sums up that dark humour: “I dreamt I went to the doctor’s and she gave me eight minutes to live. I’d been sitting in the fucking waiting room half an hour.” (Kane, 2000: 19). Spoken-word theatre productions of 4.48 Psychosis often struggle to deliver dry performances of these scenes. Acting can get in the way.

In dealing with this text, the aim was to produce a very distanced, abstracted setting that allowed Kane’s emotional dryness and dark, laconic wit to speak for itself. A narrative-realist, spoken-word or sung treatment was avoided because it would too easily allow acting and/or music to get in the way. Rather, the chosen concept stripped away pitch/harmony, orchestration, aria and recitative. The driest delivery of this text would be the text alone, without a human voice. The inspiration for this distanced, abstract concept came from Lithuanian-Norwegian artist Ignas Krunglevičius – particularly his work Skinner Box.


8.1. Krunglevičius’ Skinner Box

Ignas Krunglevičius makes work involving visually presented text. His concept is simple: a combination of crotchet beats articulated by instruments and synchronised to projections of words. That principle underpins many of his works, including Skinner Box (2010), Deviance (2011) and Gradients (2012). Skinner Box takes a verbatim transcript of a group therapy session, documented by the Scottish psychiatrist R.D. Laing, involving the therapist, a teenage girl and her mother and father. The text is projected onto four screens, triggered by four instrumentalists, each player-projector pair corresponding to one of the four characters. Each percussion strike or instrumental beat triggers a projection cue, with text projected either word-by-word or line-by-line. The score specifies only the number of crotchet beats, the text for each performer and occasional single-colour frames. Additional musical layers were developed in workshops with the composer and performers, such as live sound processing controlled by effects pedals, tempo, rhythm, the lengths of pauses between ‘bars’, dynamics, the choice of instruments and pitches; there is also a drone tape part not featured in the score. Figure 34 shows an excerpt of the score and Figure 35 a photo of the corresponding excerpt with performance annotations.

Figure 34: An excerpt (page 12) of the score for Skinner Box (Krunglevičius, 2010)

Figure 35: A photo showing annotations on the score of Skinner Box (Krunglevičius, 2010)

A full score of Skinner Box is available here and two videos below.

An excerpt (beginning bar 67) from a performance (unknown ensemble and location).

A complete performance by the NING ensemble, Bergen, 2011

The success of Krunglevičius’ technique is the distance it creates between the mode of presentation and the emotional content of the text. The conversation is profound, psychological, full of pain, but the way the text is presented, stripped of a human voice, in a removed, almost robotic rhythm and sound, allows the emotions in the text to come to the fore. No acting, no music, no singing, no speaking gets in between the audience and the text.

That distancing, and the resultant directness and simplicity, appealed as an excellent solution to the similar problems posed by the Doctor–Patient scenes in 4.48 Psychosis. Using Krunglevičius as the model, I will discuss the methodology of the making of these dialogue scenes, which involved workshops, transcriptions of audio recordings, and a subsequent compositional phase. Finally, I will illustrate how this process worked using Scene 6 as a case study. But first, a brief explanation of the intended stage set-up, instrumentation and background music used for these scenes.


8.2. Stage set-up, instrumentation and framing

These scenes were set up in a similar way to the Krunglevičius pieces: one percussionist stage right played the Doctor and one percussionist stage left played the Patient. The text for each role was projected onto the upstage white wall of the set stage right and left respectively, close to the location of the performers, who were in the corresponding left and right positions on the raised gantry. The left-right spatialisation helped make the dialogue aspect clearer, and highlighted the adversarial nature of the characters’ relationship in these scenes. Figure 36 shows this layout in use in Scene 10, with a line from the Doctor (stage right) projected.

Figure 36: A photo from Scene 10 showing the stage right (doctor) and left (patient) positions of the two percussionists and the stage right projection location in use. Photo © Stephen Cummiskey

Several technological solutions for linking sound and projection were tested during instrumental workshops in January 2016 (see Chapter 3.1 – Process and Timeline). The three options were:

  1. A noise-gate triggering system similar to that used by Krunglevičius, taking a signal from the microphone on each instrument. When a strike opens the gate, a signal is sent to the projection computer to advance to the next image.
  2. Another performer, e.g. the synth player, triggering the projections manually, and rehearsing with the percussionists and conductor to get rhythmic accuracy.
  3. The projection for each scene synchronised to a click track and exported as a video file, with the percussionists playing along to the same click track.
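The gating principle behind option one can be illustrated with a minimal sketch. This is not the system used by Krunglevičius, nor anything built for this production: the function name, the threshold and the hold length are all hypothetical, and the input is assumed to be a simple list of signal levels between 0 and 1. The sketch shows only the core idea, that a cue fires when the level first rises above a threshold and that the decaying tail of the same strike must not fire a second cue.

```python
def gate_triggers(levels, threshold=0.5, hold=3):
    """Return the indices at which the gate opens and a cue would fire.

    levels    -- sequence of signal levels (assumed range 0.0 to 1.0)
    threshold -- level at or above which the gate opens (hypothetical value)
    hold      -- minimum number of samples between two cues, so that the
                 resonance of one strike does not fire a second cue
    """
    triggers = []
    gate_open = False
    last_trigger = None
    for i, level in enumerate(levels):
        if level >= threshold:
            # Fire only on the transition from closed to open,
            # and only if the hold period has elapsed.
            if not gate_open and (last_trigger is None or i - last_trigger >= hold):
                triggers.append(i)
                last_trigger = i
            gate_open = True
        else:
            gate_open = False
    return triggers


# Two strikes, each followed by a decaying tail (illustrative data only).
levels = [0.0, 0.9, 0.6, 0.2, 0.0, 0.0, 0.8, 0.4, 0.1]
cue_indices = gate_triggers(levels)
```

In a real set-up each index would correspond to a message sent to the projection computer; tuning the threshold and hold for resonant instruments such as a bass drum is part of the design, build and test effort that such a system requires.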

Solutions two and three were tested in the workshops; we did not have the resources to design, build and test option one. The synchronised click-track solution was chosen, as it was the simplest and quickest to set up, test and rehearse during the production process.

Therefore click tracks were made for Scenes 1, 6, 10, 12 and 23, and the percussionists had these in-ear, so that these scenes could be performed without a conductor. The scenes were performed from memory, and, as each had an independent click-track feed, the performers were able to record verbal notes, memory aids, performance notes, count-ins and excerpts of the script on top of the metronome click. The click-track solution worked effectively in providing a perfect synchronisation between performance and projection. An example of one of these annotated click tracks is given here.

The click track for Sarah Hatch (Percussion L: Doctor) for Scene 6, showing the audio annotations / memory aids.
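The synchronisation in option three rests on simple arithmetic: at a fixed tempo, every beat of the click track corresponds to a known time in the exported video. The following is a minimal sketch of that conversion, not the tool actually used in the production. The function name, the tempo of 120, the four-beat count-in and the cue positions are all hypothetical values chosen for illustration.

```python
def cue_times(cue_beats, bpm, count_in_beats=0):
    """Convert cue positions in crotchet beats to times in seconds.

    cue_beats      -- beat positions at which a new projection appears
    bpm            -- click-track tempo in beats per minute (fixed tempo assumed)
    count_in_beats -- beats of count-in heard before the first bar
    """
    seconds_per_beat = 60.0 / bpm
    return [round((count_in_beats + beat) * seconds_per_beat, 3)
            for beat in cue_beats]


# Hypothetical phrase: four projected words on beats 0, 1, 2 and 3.5.
times = cue_times([0, 1, 2, 3.5], bpm=120, count_in_beats=4)
```

Because the video and the click track share these timings, a percussionist who plays with the click is necessarily aligned with the projection. A scene with tempo changes would need the conversion applied section by section.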


The instrumentation for all of these scenes is shown in Figure 37.

Figure 37: A table showing the instrumentation for the Percussion Dialogue scenes

These were chosen to reflect the roles and personalities of the two characters in these scenes. The Patient’s dialogue is relatively unchanging through these scenes: an honest and resonant voice of humanity, albeit in deep pain, remaining steadfast and unprovoked by the Doctor’s questions. They are represented therefore by a large orchestral bass drum, an instrument with depth, resonance and a large variety of sounds. In contrast, in scenes 6, 10 and 12 the doctor is mainly an antagonistic force, whose comments are essentially banal and inconsequential to the patient. The doctor interferes, annoys, niggles, often with trite or ridiculous questions. Instruments were chosen to reflect this, for example with the penetrating, invasive sound of metal scaffolding hit with metal-headed hammers (Scene 6), or the comic effect of sawing a piece of wood with heavy amplification applied (Scene 12). Only in Scene 23, when the Doctor finally shows her own humanity, flaws, emotions and vulnerabilities, does the Doctor also play the orchestral bass drum. The Doctor is now on the same emotional (and instrumental) plane as the patient; for the first time they seem to understand each other.

The Percussion Dialogue scenes were all accompanied by very quiet ‘elevator muzak’ emanating from the on-stage speaker. The muzak was different for each scene, but in general the tracks were synthesised bossa nova taken from rights-free sample libraries. This muzak provided a framing device for the scenes, suggesting a doctor’s waiting room, an element of banality juxtaposed against the profound content of the dialogue. It constituted another tool to distance the presentation of these scenes from the meaning of the text, the same distancing that I had so admired in Skinner Box. Here is an example of the elevator muzak used in Scene 6 (this is the original; I slowed it down for the opera).


8.3. Transcription process

Krunglevičius’ concept was taken as a model for the treatment of Scenes 1, 6, 10, 12 and 23, with one major difference: I would retain some element of natural speech rhythms in the projection and percussion rhythms. This decision was made partly to help Kane’s jokes land with good timing and partly to speed up the rate of text flow in the scenes to keep each scene under three minutes. (It was an important target that the whole opera not last more than 90 minutes, so every scene was scrutinised for ‘spare time’.)

Incorporating speech rhythms required a rigorous process to generate, record, and then compose these speech rhythms and implement them in the scenes. This process centred on the Year Two Theatre Workshops (see Chapter 3.1 – Process and Timeline), and involved:

  • Workshop pre-planning with Jo McInnes and Philip Venables
  • Two days of workshops in which spoken-theatre readings of Scenes 6, 10, 12 and 23 were explored, performed and audio recorded.
  • Selecting one recording of each scene, and transcribing into rhythmic notation the speech rhythms of the performance from the audio recording.
  • Using the transcriptions as a basis for the rhythmical composition of each scene.

Jo McInnes’ approach to performing 4.48 Psychosis is to ‘stop acting’ – to avoid the controlled, performative skill of acting. She used a range of techniques with the actors to try to prevent them from ‘acting’ and instead try to deliver the text more honestly, from the gut. Her philosophy is that the brutal honesty and emotional depth of Kane’s text is lost when it is ‘performed’ demonstratively, in the way that actors are usually trained to do. In 4.48 Psychosis the actor is not playing a character, she argued, but simply any human being. Her approach – to remove performative layers and allow the text to ‘speak’ directly – therefore aligns with the Krunglevičius model of musical distancing from the text, ensuring some consistency between our approaches in our planned process of workshop performance to transcription to composition.

From the numerous recorded takes of each scene from the workshops, I alone selected the take to use for transcription. Transcriptions of these four audio recordings were then made, choosing tempi that best fit the pace of the speech. These audio recordings can be heard here.  The transcriptions, both handwritten and in rough typeset versions, are available in Supporting Materials SM.22.

Audio of theatre workshop performance of Scene 6:


Audio of theatre workshop performance of Scene 10:


Audio of theatre workshop performance of Scene 12:


Audio of theatre workshop performance of Scene 23:


8.4. From transcription to composition

The speech transcriptions provided a loose framework and reference point for the composition of the scene, but questions still remained that needed to be solved in the compositional process. The following discussion outlines these questions and makes references to the score to illustrate the principles applied to their solution.

  1. Would dialogue be projected syllable by syllable, word by word, or phrase by phrase?
  2. How to balance percussion writing between abstract rhythm and natural speech rhythm?
  3. How to balance percussion writing between musical phrasing and syntactical phrasing?
  4. How to pace tempo and pauses in the translation from a speech recording to a visual format?

Of course, all of these questions are interdependent and the solutions to them relied heavily on compositional intuition, a gut instinct. The approach to these issues in one phrase could be the opposite of the approach taken in another phrase. Always, decisions about these issues had to balance musical and theatrical/semantic concerns, and the aim was to marry the two as much as possible, so that no conflict between music and text was felt by the audience. The percussionist and the projected text must appear to come from one and the same source, as ‘naturally’ as possible, since the percussion instrument is a metaphor for the voice of the character.

Broadly the following principles were adopted to solve these problems (with references to the score).

Question 1

Question 1 was addressed differently for each phrase, and a wide range of possibilities was used across the four main scenes (6, 10, 12, 23). Some phrases were articulated syllable by syllable (e.g. Scene 10, entire doctor’s part), some word by word (e.g. Scene 6, b.1–10) and some as whole phrases appearing at once with a single percussion strike (e.g. Scene 10, b.59–67; Scene 12, b.61–63).

Question 2

Question 2 was likewise addressed phrase by phrase. Some phrases were pulled away from natural speech rhythms to a more abstracted rhythm (e.g. Scene 10, b.71–73 and b.80–83; Scene 12, b.39–41), and some phrases mirrored the transcribed speech rhythms more accurately (e.g. Scene 10, b.1–9).

Question 3

Question 3 in a sense is another way of thinking about the two problems above, and here the emotional content of each phrase usually played a role. Phrases with emotional or dramatic importance in the scene tended to be written with more emphasis on musical articulation (e.g. “I KNOW” Scene 12, b.42–43; “Because it feels fucking great, because it feels fucking amazing” Scene 10, b.90–96). Sentences with less emotional weight – a laconic, wry quality – tended to prioritise syntactic articulation (e.g. “I thought you might do this, lots of people do, it relieves the tension” Scene 10, b.47–56; “Take an overdose, slash my wrists then hang myself. It couldn’t possibly be misconstrued as a cry for help” Scene 6, b.6–14).

Question 4

Question 4, the issue of pacing, changed radically through the process of writing and staging the opera. The long silences in the audio recordings from the theatre workshops were pregnant with anticipation, meaning, and the collective feeling in the room. In live theatre, the tension was held through these pauses. However, when the human body and voice and their inherent emotion are removed, when the delivery of lines is abstracted to a drum and some projected text, when the feeling of the scenes is more that of wry humour than deep pain, then those long pauses could not sustain the tension through to the next phrase. Conversely, phrases articulated syllable by syllable often resulted in very quick musical rhythms which, as spoken rhythms, sound normal, often relaxed, but which as percussion playing seemed quick, frantic, rushed; not necessarily in keeping with the sentiments of the text, but just re-contextualising an artefact of speech.

Given these issues with pacing and pauses, there were good reasons to elongate the rhythms of some of the percussion phrases and shorten the lengths of most of the pauses between phrases. This shortening was done iteratively: first after the Year Three Workshops, where we presented Scene 10 to an audience in a Critical Response Process, and then again during the production process, when pauses were shortened even further to keep the drama and suspense moving and the conversations feeling lively. The effects of this shortening are clearly seen by comparing the durations of each scene. In the production, Scenes 6, 10 and 12 were all under three minutes long, but the audio recordings of the theatre workshops ranged from 3 minutes to over 7 minutes.

Having outlined the universal concerns that applied to the transcription & composition process for all four Percussion Dialogue scenes, I will move on to a case study in closer detail: Scene 6.


8.5. Case study: Scene 6

The composition of Scene 6 provides a good example of the process outlined above that was followed for scenes 6, 10 and 12. To illustrate this in more detail, we can compare and contrast bars from Scene 6 with the corresponding transcription to see what changes were made in the composition process. The following Figures 38 to 41 show the transcription and the final score, side by side (click on them for larger versions in new tabs).

Figure 38 (Transcription, left) and Figure 39 (Final score, right): Scene 6, side-by-side comparison of transcription and score.

Figure 40 (Transcription, left) and Figure 41 (Final score, right): Scene 6, side-by-side comparison of transcription and score (page 2)

  1. Rhythms have been made more regular. For example, the triplet crotchets in the transcription at “Have you made any plans?” (transcription b.2), “then hang myself” (b.9) and “cry for help” (b.16) were replaced by quavers or crotchets (Scene 6, b.3, 7 and 13 respectively).
  2. Missing words were re-inserted. Some words were inaudible in the recordings and therefore absent from the transcriptions, but were re-inserted in the score. For example, “it couldn’t possibly be” (Scene 6, b.12) and “of course it would” (Scene 6, b.17).
  3. Many speech rhythms were kept intact. For example, “I’d be standing on a chair with a noose around my neck” (Scene 6, b.25) is very similar to the transcription, except that the two-syllable words “standing” and “around” were given only one strike each, and “noose” was elongated from a quaver in the transcription to a crotchet in the score.
  4. Pauses have been shortened. For example, in “If you were alone [pause] do you think you might harm yourself” (Scene 6, b.29–32) the pause of 7.5 beats in the transcription has been completely removed, forming one single run-on sentence. The five-bar pauses at b.17–21 and b.32–37 of the transcription were reduced to one bar (Scene 6, b.11 and b.24).
  5. Tempi have been changed. Generally, tempi in the score are considerably slower than in the transcription, for the reasons described in the response to Question 4 above. The tempo of the opening of Scene 6 is half the tempo of the transcription. The tempi of the Doctor and Patient were sometimes made independent of one another, which kept a free feeling to the scene, obscuring any sense of regular pulse. For example, at the opening of the scene the Patient is at 132 and the Doctor at 104. The huge rallentando (Scene 6, b.43–49) has a significant musical effect, underpinning the drama of the line ending with “I’m tired of life and my mind wants to die”. Clearly this rallentando is entirely artificial and is not present in the transcription.
  6. Some punctuation is articulated separately. Punctuation, such as a question mark, is not articulated in speech in a rhythmic sense but rather with tone of voice and inflection. However, with such melodic devices stripped away from this setting, there was an opportunity to articulate punctuation rhythmically. For example, in b.58–65, “You are not eighty years old. Are you? Are you? Or are you?”, the final question mark is articulated with a counter-top bell (Scene 6, b.65) – a traditionally comic instrument that highlights the banality of the Doctor’s question and her mocking of the Patient.

This case study illustrates the compositional process used to adapt spoken-word transcriptions into musical percussion-projection dialogues. Many examples of similar adaptations can be found in Scenes 6, 10, 12 and 23; see the Supporting Materials SM.22 for the transcriptions, for comparison with the corresponding scenes in the full score. An evaluation of the success of these scenes, and of the other text–composition techniques discussed in the previous four chapters, is given in the next chapter.

Scene 6 from 4.48 Psychosis (28th May 2016): (video available on request)

