6. Mid-Phrase Switching

I use the term ‘mid-phrase switching’ to refer to a performer in live performance switching or alternating between spoken and sung voice mid-way through a phrase. This technique was used in a number of scenes, but I will look at two examples: Scene 3, featuring a simple pattern of switching, and Scene 9 featuring a more complex pattern and layering of these phrases in the vocal ensemble. A number of potential problems with mid-phrase switching will be discussed, such as vocal projection/clarity, dealing with amplification and textural balance in an ensemble.

6.1. Scene 3

Kane’s text for Scene 3 is a list of negative statements in the first person. They are taken from, or inspired by, the Beck Depression Inventory, which is a widely-used, self-reporting, multiple-choice questionnaire used to diagnose clinical depression in adults. It was designed by Aaron Beck in the 1961 and subsequently updated in 1978 (BDI-I) and 1996 (BDI-II)1. It is likely that Sarah Kane had direct experience of BDI-I and/or BDI-II. An excerpt of Scene 3 is given here (Kane, 2000: 4):

I am sad


I feel that the future is hopeless and that things cannot improve


I am bored and dissatisfied with everything


I am a complete failure as a person


I am guilty, I am being punished


I would like to kill myself


I used to be able to cry but now I am beyond tears


I have lost interest in other people


I can’t make decisions

The scene contains 33 statements beginning with the word ‘I’, mostly as isolated statements, sometimes with two statements joined in one sentence by a conjunctive. The musical treatment involved singing the initial word ‘I’ and speaking the rest of the phrase in free-rhythm, indicated by a crossed notehead. Strings, tubular bells and synthesiser doubled and articulated only the sung notes in unison with the singers. The music was barred, in a consistent tempo of crotchet 92; the time allowed for the spoken portions was estimated in this tempo and indicated with blank staves. An example of the notation is given in Figure 24.

Figure 24: An excerpt of vocal writing in Scene 3, showing sung ‘I’ and spoken phrases

Sung notes were kept in a middle tessitura. Most of the phrases were for single voice, but some were for combinations of two, three or four singers, for example I am being punished in Figure 24 for Jen and Clare together. Some rehearsal was needed to keep the spoken words together in these moments, but the notation was found to be straightforward.

Once this pattern had been established, some other sentences were sung in a musical metre for dramatic effect. For example, I cannot love (see Figure 25). Other, subsequent sentences were layered, overlapping each other between the four singers in the scene. As phrases were predominantly spoken rather than sung, this overlapping did not significantly reduce the clarity and audience comprehension. Figure 26 shows the ends of spoken phrases (eat, sleep, think) overlapping with the sung entries.

Figure 25: An excerpt of vocal writing in Scene 3 showing rhythmically-notated sung phrases.

Figure 26: An excerpt from Scene 3 showing overlapping vocal phrases and instrumental scoringOne significant problem was encountered with mid-phrase switching – the natural volume difference between the trained sung voice and the relatively untrained spoken voice. The cast were amplified and a light digital compression was applied to the signal, giving a volume boost to quieter sounds, but this can compensate only minimally for natural volume differences. There is the danger with too much digital correction that sung notes also become amplified, thereby negating any benefit in balance between singing and speaking. In this scene, the solution to this problem was to enable the singers to have maximum control over the volume of their own sung voices, therefore allowing them to balance spoken and sung volumes themselves. Factors at play include:

  • Sung notes in mid-range tessitura allow singers to sing softly with maximum control at a volume that matches their spoken voice.
  • Minimal instrumental scoring and no instrumental articulation during spoken passages allows words to project through the texture easily. See Figure 26 for an example of this.
  • Rehearsal to help singers project their spoken voices, a skill which not all opera singers often have the opportunity to develop.

These solutions worked by allowing the spoken text to be heard easily in a simple musical texture, with relatively low-level, natural-sounding amplification. However, these solutions limited the possibilities of both instrumental and vocal writing. In contrast, Scene 9 featured mid-phrase switching in a much more complex texture of six voices and full instrumental ensemble, which presented similar problems of balance and clarity, but which were solved differently.

Scene 3 from 4.48 Psychosis (28th May 2016):  (video available on request)


6.2. Scene 9

Scene 9 is a scene about sex, rejection and an impossible search for love, in the first person. Kane tells this narrative in a series of short tableaux, separated on the page by (Silence.). This scene was drafted with a more complex layering of vocal lines containing mid-phrase switching. A workshop process exposed problems of balance and clarity in this draft that were addressed by compositional changes in a final draft, and which are discussed here.

The musical setting of Scene 9 used the full cast of singers and tutti ensemble, with the short tableaux separated by short bursts of white noise and pseudo-blackouts. The musical concept of each tableau involved clustered harmony, singers’ pitches sliding over and against one another on overlapping long notes forming a cloud of creeping dissonance interrupted by bursts of spoken text. My original staging concept involved the singers in close physical contact with each other, as a writhing mass of bodies – the harmony was to be a mirror of the physical action.  (This staging concept was not adopted in the production, but may be explored in the forthcoming revival.)

As in Scene 3, most phrases in this scene begin with ‘I’, although in this case many are joined with conjunctives to form compound sentences. The first tableau is (Kane, 2000: 12):

Sometimes I turn around and catch the smell of you and I cannot go on I cannot fucking go on without expressing this terrible so fucking awful physical aching fucking longing I have for you. And I cannot believe that I can feel this for you and you feel nothing. Do you feel nothing?

The principle of Scene 3 was applied here also: ‘I’ was sung, and the rest of the phrase after that was spoken. Auxiliary words before ‘I’ were sung as an anacrusis. Sung words are indicated here in bold.

I turn around

and catch the smell of you

and I cannot go on

I cannot fucking go on without expressing this terrible so fucking awful physical aching fucking longing I have for you

And I cannot believe

that I can feel this for you

and you feel nothing

The archetypal musical phrase for this setting can be illustrated by Clare’s first entry in the scene (see Scene 9 draft version, bar 9), shown here in Figure 27.

Figure 27: Excerpt from Scene 9 (draft version, b.9, Clare) showing the archetypal phrase structure for this scene.

The duration of spoken words was indicated with an extended beam, and the spoken text was parlando, on a pitch, with short vowels and normal (speech) elocution. The short text phrases were all set according to this archetype, although there was much variation and elaboration on this construction, particularly later in the scene. Figure 28 shows the phrases layered across the vocal ensemble, starting and ending at different points to create complete, compound sentences emerging from the cloud of cluster harmony. The phrases were arranged carefully to create a continuous flow of spoken text, hocketing between singers so that the complete text became apparent to the audience from the amalgamated ensemble. Some of this spoken text was for solo performers (Figure 28: turn around; catch the smell of you), some of it was doubled. Some of the spoken text ‘echoed’ around the ensemble (Figure 28, cannot go on).

Figure 28: Excerpt from Scene 9, first draft, showing the overlapping layering of phrases.

As discussed earlier, the volume balance between sung and spoken words in Scene 3 was easy to manage because of light scoring and a simple texture. However, in Scene 9, with a tutti ensemble and a thicker, more complex vocal texture, there was a greater risk that spoken words would not be heard clearly. As in Scene 3, amplification would provide only minimal help.

The original version of Scene 9 was tested in the Year 3 Workshops with a full cast and ensemble, although without amplification.  The full score of this version is here, and the workshop recording is here:

As suspected, these workshops established that both the vocal cluster chords and the instrumental texture were too thick, overpowering the spoken words. The singers verged on shouting the words to be heard, with limited success. Re-scoring a new version was required, with the following aims:

  • To thin out the number of long held notes in the vocal ensemble
  • To thin out some of the saxophone orchestration
  • To delay all crescendi in vocal and instrumental parts to just the ends of long notes, so that the overall texture of long notes was kept at a lower dynamic.
  • To double more of the spoken words across multiple singers and to thin out some of the echoing / non-unison spoken text.
  • To sit the long notes in a more central tessitura, giving singers more dynamic control over these notes
  • To set spoken text pitches lower, towards the natural speaking range, allowing better spoken projection.
  • To reassign parts so that, for each tableau, one singer had majority spoken text and no long held notes. i.e. to remove mid-phrase switching for one singer.

The final point is the crucial one. By re-allocating lines and words, one singer in each tableau could take almost all of the spoken text as a ‘thread’ woven through the texture, onto which other singers could double spoken text. This performer could be amplified louder than the others because there would no longer be the problem of balancing their spoken text against their singing voice. As the short tableaux were separated by white noise and blackouts, there was ample opportunity for amplification settings to be pre-programmed and quickly switched in blackouts, so that each tableau could have a different singer providing the ‘spoken thread’.

Figure 29: An excerpt of the final version of Scene 9, b.11–18, showing amendments to the original version that improve comprehension of spoken text.

To illustrate these changes, Figure 29 shows the final version of the score for the corresponding section of the first version of the score shown in Figure 28. Comparing them:

  • Gwen has been added as the ‘spoken thread’ in this tableau, with only spoken words. Her amplification is higher.
  • This section has been transposed down a minor third. Singers’ long, held notes are in a more comfortable range; spoken text now lies on pitches closer to a natural speaking pitch.
  • All spoken words are now doubled in at least two voices.
  • Crescendo hairpins have been delayed to the ends of notes.

The ‘before and after’ examples in Figures 28 and 29 do not show a changed number or density of long notes, but such changes were made at other points in the scene when textures became thicker and more complex. Similar changes were also made in the instrumental writing: care was taken to thin out the texture without changing the quality and harmonic density of the cluster chords.

One may ask why the quiet dynamics of the spoken words was unchanged between these two versions, given that spoken-word projection was a problem. To answer that, I argue that a key concern was the quality of spoken voice used in this scene: the spoken voice should always be intimate, sensual, befitting the subject matter of the scene – hence the performance instruction “pillow talk, sexy”. Declamatory spoken text would not feel right here. Therefore the dynamic indications are qualitative rather than quantitative. The rehearsal process managed to find a balance between a well-projected, louder, spoken voice but which stayed low in the chest register, maintaining a sultry rather than shouted quality.

All of these solutions together achieved spoken text that was audible and comprehensible in performance. There remained a few passages that were still a little overpowered by the musical texture, but these tended to be climactic moments in which I felt musical concerns outweighed text concerns and where any more thinning of the texture would have compromised the musical shape. Therefore no further changes to the score will be made for future performances.

Scene 9 from 4.48 Psychosis (28th May 2016):  (video available on request)


Go to next chapter >