15 Vocal Text

This chapter describes how to encode words and syllables in vocal notation. Such text is typically written under a staff and indicates what is to be performed vocally. It should not be confused with other text on the score, for which see 1.3 Shared Textual Elements and 21 Text in MEI.

These guidelines suggest two methods for encoding text in vocal notation: encoding syllables under each note, and encoding the performed text after the notes (and other staff events), either within layer elements or, where available (for example in a Common Music Notation context), within measure elements. Which method is more convenient depends on the source text and on the textual phenomena that the encoding intends to record.

Both methods ultimately rely on the syl element, which is part of the ‘shared’ module and is therefore available in all MEI files. The following sections first introduce the general use of syl and then describe the two encoding methods in detail.

15.1 Lyric Syllables

By ‘lyric syllable’, these guidelines mean a word or portion of a word that is to be performed vocally. Each syllable is encoded with the syl element, with which it is also possible to specify the position of the syllable in a word, the type of connectors between syllables, alignment adjustments, and the formatting for each syllable. These are the key components:

  • syl (syllable) – Individual lyric syllable.
    wordpos Records the position of a syllable within a word.
    con Describes the symbols typically used to indicate breaks between syllables and their functions.
    halign Records horizontal alignment.

The attribute wordpos is used to specify the position of the marked-up lyric syllable in a word. It allows the following values:

i
- Indicates that the current syllable's position is initial; that is, at the beginning of a word;
m
- Indicates that the current syllable is in the middle of a word;
t
- Indicates that the syllable's position is terminal; that is, at the end of a word.

When a syllable is at the beginning or in the middle of a word (in which case it will have the wordpos attribute set to ‘i’ or ‘m’), it is recommended to specify the type of connector written between the current and the following syllable. This is expressed with the con attribute, which takes the following values:

s
- A space is used as a connector between syllables;
d
- A dash is used as a connector between syllables;
u
- An underscore sign (indicating prolongation of the syllable) is used as a connector between syllables;
t
- A tilde is used to indicate elision with the following syllable. This is typically rendered as a small curved line between the syllables.

Occasionally, a word or a final syllable needs to be extended across multiple notes. In this case an ‘extender’ is provided. An extender is a continuous line drawn at the text's baseline from the end of the syllable associated with the first note until the last note to be sung with the syllable.
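
As an illustration of the extender, the following sketch sets the final syllable of a word to three notes, using the verse element described in 15.2. The pitches, durations, and text are placeholders rather than a transcription of any source, and whether a processor draws the extender line from con="u" alone is an assumption that depends on the rendering application:

<layer>
  <!-- first note carries the syllable; con="u" marks its prolongation -->
  <note dur="4" oct="4" pname="g">
    <verse n="1">
      <syl con="u" wordpos="t">men</syl>
    </verse>
  </note>
  <!-- the syllable is held across these notes, which carry no syl of their own -->
  <note dur="4" oct="4" pname="a"/>
  <note dur="4" oct="4" pname="g"/>
</layer>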

The use of syl described in this section is common to CMN and other notation systems, such as mensural notation. Other uses specific to certain types of notation and repertoires are addressed in other chapters. See for example 6 Neume Notation.

15.2 Vocally Performed Text Encoded Within Notes

Each lyric syllable can be encoded directly within its associated note, either with the syl attribute on note or with the verse element.

Using the syl attribute on notes is the simplest way of encoding vocally performed text and is recommended only for simple situations or for those encodings which do not focus on vocally performed text.

The following example from Handel's Messiah (HWV 56) shows the use of syl:

Figure 50. Handel, Messiah HWV 56, Halleluja
<measure>
  <staff>
    <layer>
      <note dots="1" dur="4" oct="5" pname="c" syl="Hal-"/>
      <note dur="8" oct="4" pname="g" syl="le-"/>
      <beam>
        <note dur="8" oct="4" pname="a" syl="lu-"/>
        <note dur="8" oct="4" pname="g" syl="jah,"/>
      </beam>
      <rest dur="4"/>
    </layer>
  </staff>
</measure>

When there are multiple lines of vocally performed text, or the encoder wishes to be more specific about connectors, etc., the use of verse and syl is recommended.

  • verse – Lyric verse.
    rhythm Used to specify a rhythm for the lyric syllables that differs from that of the notes on the staff, e.g. '4,4,4,4' when the rhythm of the notes is '4.,8,4.,8'.

The following example from Handel's Messiah (HWV 56) shows the use of verse:

<measure>
  <staff>
    <layer>
      <note dots="1" dur="4" oct="5" pname="c">
        <verse n="1">
          <syl con="d" wordpos="i">Hal</syl>
        </verse>
      </note>
      <note dur="8" oct="4" pname="g">
        <verse n="1">
          <syl con="d" wordpos="m">le</syl>
        </verse>
      </note>
      <beam>
        <note dur="8" oct="4" pname="a">
          <verse n="1">
            <syl con="d" wordpos="m">lu</syl>
          </verse>
        </note>
        <note dur="8" oct="4" pname="g">
          <verse n="1">
            <syl wordpos="t">jah,</syl>
          </verse>
        </note>
      </beam>
      <rest dur="4"/>
    </layer>
  </staff>
</measure>

As is common practice in written text, a space is assumed to separate words. Many vocal texts, however, contain elisions that join two syllables into one unit. For example, the line Ho fermo il core in petto, sung by Don Giovanni in Finale II of Mozart's Don Giovanni, has an elision between the words fermo and il and another between core and in. An elision can be indicated by placing both syllables within the same note and setting the con attribute of the first syl element to 't':

<note>
  <verse>
    <syl con="t" wordpos="t">re</syl>
    <syl wordpos="i">in</syl>
  </verse>
</note>
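
The other elision in this line, between fermo and il, could be encoded in the same way. The following is a sketch that mirrors the preceding example; as there, the note's pitch and duration attributes are omitted, and leaving wordpos off the single-syllable word il is an illustrative choice rather than a requirement:

<note>
  <verse>
    <!-- final syllable of "fermo", elided with the following word -->
    <syl con="t" wordpos="t">mo</syl>
    <!-- "il" shares the note with "mo" -->
    <syl>il</syl>
  </verse>
</note>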

When there is more than one line of text, more than one verse element can be used. The following example from a piano reduction of Wagner's Rheingold has two lines of text, with an English translation on the second line. Note the use of the xml:lang attribute to differentiate the two languages:

Figure 51. Example from Wagner's Rheingold with translated text.
<scoreDef>
  <staffGrp>
    <staffDef clef.line="4" clef.shape="F" key.sig="4s" lines="5" n="1"/>
  </staffGrp>
</scoreDef>
<section>
  <measure>
    <staff n="1">
      <layer n="1">
        <note dur="2" oct="3" pname="f" stem.dir="down">
          <verse n="1" xml:lang="ger">
            <syl con="d" wordpos="i">Rei</syl>
          </verse>
          <verse n="2" xml:lang="eng">
            <syl>thinks</syl>
          </verse>
        </note>
        <note dur="8" oct="3" pname="f" stem.dir="down">
          <verse n="1">
            <syl wordpos="t">fes</syl>
          </verse>
          <verse n="2">
            <syl>it</syl>
          </verse>
        </note>
        <note dur="8" oct="3" pname="f" stem.dir="down">
          <verse n="1">
            <syl>zu</syl>
          </verse>
          <verse n="2">
            <syl>were</syl>
          </verse>
        </note>
      </layer>
    </staff>
  </measure>
  <measure>
    <staff n="1">
      <layer>
        <note dur="4" oct="3" pname="b" stem.dir="down">
          <verse n="1">
            <syl con="d" wordpos="i">wal</syl>
          </verse>
          <verse n="2">
            <syl>wise</syl>
          </verse>
        </note>
        <note dur="4" oct="3" pname="d" stem.dir="down">
          <accid accid="n"/>
          <verse n="1">
            <syl wordpos="t">ten,</syl>
          </verse>
          <verse n="2">
            <syl>now</syl>
          </verse>
        </note>
        <rest dur="4" dur.ges="8p"/>
      </layer>
    </staff>
  </measure>
</section>

Optionally, it is possible to include an lb element within verse to explicitly encode line and line group endings. This is specifically meant to facilitate karaoke applications.
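
A line ending might be recorded as in the following sketch, which reuses the final note of the Handel example above. Placing lb after the syl element is an assumption made for illustration; the schema should be consulted for the exact content model of verse:

<note dur="8" oct="4" pname="g">
  <verse n="1">
    <syl wordpos="t">jah,</syl>
    <!-- assumed placement: the line of text ends after this syllable -->
    <lb/>
  </verse>
</note>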

Finally, the rhythm attribute can be used to specify a rhythm for the syllable that differs from that of the notes on the staff.

15.3 Vocally Performed Text Encoded Separately

Vocally performed text may also be encoded separately from the notes with the lyrics element. These are the main components:

  • lyrics – Vocally performed 'text' of a musical composition, such as a song or opera.
    staff Signifies the staff on which a notated event occurs or to which a control event applies. Mandatory when applicable.
    layer Identifies the layer to which a feature applies.

Because the lyrics element is separate from the encoding of the notes, it must be associated with a staff that provides the rhythm information required for automated processing. The staff attribute identifies that staff; if the staff has more than one layer, the layer attribute may be used to indicate the layer from which the rhythm should be taken. If the rhythm of the vocally performed text diverges from that of the notes, the rhythm attribute on verse may be used to specify the text's rhythm.

The following example from Carl Maria von Weber's Der Freischütz illustrates this encoding method:

Figure 52. Weber, Der Freischütz
<section>
  <measure>
    <staff n="1">
      <layer n="1">
        <note dots="1" dur="4" oct="3" pname="a">
          <artic artic="acc"/>
        </note>
        <note dots="1" dur="4" oct="3" pname="a">
          <artic artic="acc"/>
        </note>
      </layer>
    </staff>
    <lyrics staff="1">
      <verse>
        <syl>Sturm</syl>
        <syl>und</syl>
      </verse>
    </lyrics>
  </measure>
  <measure>
    <staff n="1">
      <layer n="1">
        <note dots="1" dur="2" oct="3" pname="g" tie="i"/>
      </layer>
    </staff>
    <lyrics staff="1">
      <verse>
        <syl>Nacht!</syl>
      </verse>
    </lyrics>
  </measure>
  <measure>
    <staff n="1">
      <layer n="1">
        <note dots="1" dur="2" oct="3" pname="g" tie="t"/>
      </layer>
    </staff>
  </measure>
</section>
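
The layer and rhythm attributes described above can be combined as in the following sketch. It is not taken from a source: the pitches and the syllable text are placeholders, and the rhythm values reuse the illustration given for the rhythm attribute ('4,4,4,4' for the text against '4.,8,4.,8' in the notes):

<measure>
  <staff n="1">
    <layer n="1">
      <!-- notated rhythm: 4., 8, 4., 8 -->
      <note dots="1" dur="4" oct="4" pname="g"/>
      <note dur="8" oct="4" pname="g"/>
      <note dots="1" dur="4" oct="4" pname="a"/>
      <note dur="8" oct="4" pname="a"/>
    </layer>
  </staff>
  <!-- the text is associated with layer 1 of staff 1 -->
  <lyrics layer="1" staff="1">
    <!-- the text moves in four even quarter notes instead -->
    <verse rhythm="4,4,4,4">
      <syl>la</syl>
      <syl>la</syl>
      <syl>la</syl>
      <syl>la</syl>
    </verse>
  </lyrics>
</measure>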

In this encoding style, a syl element with its con attribute set to 't' and the following syllable are presumed to be associated with a single note. In the following example, the first two syllables occur on the first note and the third syllable occurs on the second note.

<staff n="1">
  <layer>
    <note dur="2" oct="3" pname="g"/>
    <note dur="2" oct="3" pname="f"/>
  </layer>
</staff>
<!-- later -->
<lyrics staff="1">
  <verse>
    <syl con="t" wordpos="t">re</syl>
    <syl wordpos="i">in</syl>
    <syl wordpos="i">pet</syl>
  </verse>
</lyrics>