21. Text in MEI

This chapter describes methods for encoding textual content with MEI. Text in scores serves several purposes, and some kinds of text are more closely tied to the music notation than others. For example, tempo marks, directives and lyrics are directly related to the functionality of the notated music and are, therefore, described in other chapters (see, for example, Vocal Text and Text Directives). This chapter, on the other hand, focuses on the text that accompanies the score: paratext (prefatory material, title pages, back matter, appendices, etc.), titles, prose, poetry, and so on.

Most of the elements described here take inspiration from encoding formats that deal primarily with text, such as HTML and the Text Encoding Initiative (TEI). These elements are provided to encode relatively basic textual information. For deeper encoding of text, these Guidelines recommend consideration of other text-specific encoding formats with embedded MEI markup.

21.1. Organizing Text into Divisions

Text can be organized in different parts, for example in chapters or sections. The div element is used to encode such structural divisions.

div
(division) – Major structural division of text, such as a preface, chapter or section.
@type
Characterizes the element in some sense, using any convenient classification scheme or typology.
@subtype
Provide any sub-classification for the element, additional to that given by its type attribute.

For example, a printed score may contain text before the actual notation, organized in multiple sections (e.g. a preface, a critical report, performance instructions, etc., for which see the following sections); each of these sections should be encoded as a separate div element. Text might also occur between music sections (see Content of Musical Divisions): in a collection of Romantic piano works, for example, a few pieces might be preceded or followed by poetry. Such text should be encoded with the div element, as demonstrated in the following example:

<mdiv>
   <score>
      <section>
         <!-- Score of Franz Liszt's "Sonetto 104 del Petrarca" -->
      </section>
      <div>
         <!-- Text of Francesco Petrarca's Sonnet no. 104 -->
         <lg>
            <l>L'aspectata vertù, che 'n voi fioriva</l>
            <l>quando Amor cominciò darvi bataglia,</l>
            <!-- ... -->
         </lg>
      </div>
   </score>
</mdiv>

Textual divisions may have titles or other forms of introductory material, which are encoded with the head element.

head
(heading) – Contains any heading, for example, the title of a section of text, or the heading of a list.

The following example shows the encoding of a preface translated into three different languages, each with a different heading:

<div xml:lang="en">
   <head>Preface</head>
   <!-- text -->
</div>
<div xml:lang="de">
   <head>Vorwort</head>
   <!-- text -->
</div>
<div xml:lang="it">
   <head>Prefazione</head>
   <!-- text -->
</div>

Although div can identify any structural division of text, it is often helpful to distinguish between kinds of division. The attributes @type and @subtype can be used for this purpose. @type must be present whenever @subtype is used, but the values of both can be freely chosen by the encoder.

The following example uses @type to indicate that three prefaces, in English, German and Italian, are set as columns on the same page.

<div n="1" type="column" xml:lang="en">
   <head>Preface</head>
   <!-- text -->
</div>
<div n="2" type="column" xml:lang="de">
   <head>Vorwort</head>
   <!-- text -->
</div>
<div n="3" type="column" xml:lang="it">
   <head>Prefazione</head>
   <!-- text -->
</div>
<pb></pb>
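
The pairing of @type and @subtype can be sketched in the same way. The attribute values and headings used here ('preface', 'editorial', 'composer') are hypothetical and only illustrate the convention; encoders remain free to choose their own scheme.

<div type="preface" subtype="editorial">
   <head>Editor's Preface</head>
   <!-- text -->
</div>
<div type="preface" subtype="composer">
   <head>Composer's Preface</head>
   <!-- text -->
</div>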

21.2. Paragraphs

Paragraphs are fundamental to prose text and typically group one or more sentences that form a logical passage. A paragraph is usually typographically distinct: the text begins on a new line and the first letter of the content is often indented, enlarged, or both.

A paragraph is encoded with the p element:

p
(paragraph) – One or more text phrases that form a logical prose passage.

Prose text is used for several different purposes within an MEI document; p can therefore occur in many situations. For example, it may be used within metadata elements (see The MEI Header):

<samplingDecl>
   <p>The encoding contains only the first 5 measures.</p>
</samplingDecl>

Alternatively, paragraphs may be part of the document contents (and therefore encoded within music), either as Paratext or within the music notation. In these cases, a paragraph will likely be contained by a div or other elements containing prose (e.g. annot, figDesc, etc.).

The following example shows a paragraph in a preface section:

<front>
   <div>
      <head>The Preludes
         <lb></lb>Symphonic Poem No. 3 by F. Liszt.
      </head>
      <p>What else is our life but a series of preludes to that unknown Hymn, the first and
         solemn note of which is intoned by Death?
      </p>
   </div>
</front>
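
A paragraph contained by one of the other prose-bearing elements mentioned above, such as annot, might be sketched as follows. The annotation text is invented for illustration, and the place of the annot within the surrounding encoding is deliberately left open.

<annot>
   <p>The slur in the upper staff is missing in the first edition.</p>
</annot>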

21.3. Lists

When a text contains lists, they can be encoded with the following elements:

list
A formatting element that contains a series of items separated from one another and arranged in a linear, often vertical, sequence.
@form
Captures the nature of the content of a list.
head
(heading) – Contains any heading, for example, the title of a section of text, or the heading of a list.
li
(list item) – Single item in a list.

The list element can identify any kind of list; the @form attribute can be used to specify whether the list is ordered, unordered, etc. Each item in the list is encoded with the li element. The @n attribute can be used to record a label for a list item, as in the following example:

<p>The modulation follows the following steps:
   <list form="ordered">
      <li n="1">C major</li>
      <li n="2">A minor</li>
      <li n="3">D major seventh</li>
      <li n="4">G major</li>
   </list>
</p>

Occasionally, lists have headers or titles, which can be encoded with head:

<list>
   <head>Ornaments in different languages</head>
   <li n="English" xml:lang="en">Turn</li>
   <li n="Italian" xml:lang="it">Gruppetto</li>
   <li n="French" xml:lang="fr">Gruppetto</li>
   <li n="German" xml:lang="de">Doppelschlag</li>
</list>

21.4. Quotation

Quotations are common in many types of text. A quotation is typically attributed to a text other than the one being encoded. Often, the quoted material is typographically distinct from the surrounding text, i.e., surrounded by quotation marks or rendered as a separate block of text. The quote element is used to mark this function:

quote
(block quote) – A formatting element that designates an extended quotation; that is, a passage attributed to a source external to the text and normally set off from the text by spacing or other typographic distinction.

The following examples show the use of quote.

<p>Hugh MacDonald has argued that Liszt's Symphonic Poems were meant to
   <quote>display the traditional logic of symphonic thought</quote>.
</p>
<p>The majority of the works represented in this catalogue were purchased in Paris and
   London between 1928 and 1934. After graduating from Harvard in 1924, Mackay-Smith
   spent several years in Europe:
   <quote>
      <p>I bought my first early music from Harold Reeves in London in the summer of 1928 when
         I was able to acquire virtually all the 18th century editions, particularly of trio
         music, which he then had in stock, going back not only through his current but also
         through earlier catalogues, picking out numbers which remained unsold. It is almost
         a shame today to think of the prices at which such things were then available, one
         or two pounds apiece.
      </p>
   </quote>
</p>

21.5. Poetry

The lg (line group) element is used generically to encode any section of text that is organized as a group of lines. Following the recommendations of the Text Encoding Initiative, these Guidelines recommend using it, along with the following elements, for marking up poetry:

lg
(line group) – May be used for any section of text that is organized as a group of lines; however, it is most often used for a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.
head
(heading) – Contains any heading, for example, the title of a section of text, or the heading of a list.
l
(line of text) – Contains a single line of text within a line group.

Because lg groups verses, it can be used to encode additional stanzas not integrated into the music notation. In addition, it is common for a poem to include a title or a header, as is demonstrated by the following example:

<mdiv>
   <score>
      <section>
         <!-- Score of Franz Liszt's "Sonetto 104 del Petrarca" -->
      </section>
      <div>
         <!-- Text of Francesco Petrarca's Sonnet no. 104 -->
         <lg>
            <head>Sonetto 104</head>
            <l>L'aspectata vertù, che 'n voi fioriva</l>
            <l>quando Amor cominciò darvi bataglia,</l>
            <l>produce or frutto, che quel fiore aguaglia,</l>
            <l>et che mia speme fa venire a riva.</l>
            <!-- ... -->
         </lg>
      </div>
   </score>
</mdiv>
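
Additional stanzas that are not underlaid in the notation might be grouped as in the following sketch, with one lg per stanza. The @type value 'stanzas' and the use of @n to number the stanzas are assumptions made for illustration, and the verse text is left as placeholder comments.

<div type="stanzas">
   <lg n="2">
      <l><!-- first line of stanza 2 --></l>
      <l><!-- second line of stanza 2 --></l>
   </lg>
   <lg n="3">
      <l><!-- first line of stanza 3 --></l>
      <l><!-- second line of stanza 3 --></l>
   </lg>
</div>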

21.6. Paratext

This section introduces paratextual material, such as title pages, prefaces, indexes and other text that precedes or follows the actual score.

21.6.1. Front Matter

By ‘front matter’ these Guidelines mean distinct sections of a text (usually, but not necessarily, a printed one), prefixed to it by way of introduction or identification as a part of its production. Features such as title pages or prefaces are clear examples; a less definite case might be the prologue attached to a dramatic work. The front matter of an encoded text should not be confused with the MEI header described in chapter The MEI Header, which provides metadata for the entire file.

An encoder may choose simply to ignore the front matter in a text, if the original presentation of the work is of no interest. No specific tags are provided for the various kinds of subdivision which may appear within front matter: instead, generic div (“division”) elements may be used, which should not be confused with mdiv (“musical division”) elements. The following suggested values for the @type attribute may be used to distinguish various kinds of division characteristic of front matter:

‘preface’: A foreword or preface addressed to the reader in which the author or publisher explains the content, purpose, or origin of the text.

‘ack’: A formal declaration of acknowledgement by the author in which persons and institutions are thanked for their part in the creation of a text.

‘dedication’: A formal offering or dedication of a text to one or more persons or institutions by the author.

‘abstract’: A summary of the content of a text as continuous prose.

‘contents’: A table of contents, specifying the structure of a work and listing its constituents. The list element should be used to mark its structure.

‘frontispiece’: A pictorial frontispiece, possibly including some text.
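
As a minimal sketch of one of these values, a frontispiece consisting of an image might be encoded as follows. The image file name and the description are hypothetical, and the fig is placed inside a p in the same way as in the title page transcription in section Title Pages.

<div type="frontispiece">
   <p>
      <fig>
         <graphic target="frontispiece.jpg"></graphic>
         <figDesc>Engraved portrait of the composer.</figDesc>
      </fig>
   </p>
</div>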

The following extended example demonstrates how various parts of the front matter of a text may be encoded. The front part begins with a title page, which is presented in section Title Pages, below. This is followed by a dedication and a preface, each of which is encoded as a distinct div:

<front>
   <titlePage>
      <!-- transcription of title page -->
   </titlePage>
   <div type="dedication">
      <p>
         <!-- Dedicatory text -->
      </p>
   </div>
   <div type="preface">
      <head>Preface</head>
      <p>
         <!-- paragraph 1 -->
      </p>
      <p>
         <!-- paragraph 2 -->
      </p>
      <!-- additional material -->
   </div>
</front>

The front matter concludes with another div element, shown in the next example, this time containing a table of contents encoded as a list element (as described in section Lists). Note the use of the ptr element to provide page references: the implication here is that the target identifiers (song1, song2, etc.) will correspond with identifiers used for the mdiv elements containing the individual songs. (For a description of the ptr element, see chapter Pointers and References.)

<div type="contents">
   <head>Contents</head>
   <list form="ordered">
      <li>On Wenlock Edge
         <ptr target="#song1"></ptr>
      </li>
      <li>From Far, From Eve and Morning
         <ptr target="#song2"></ptr>
      </li>
      <li>Is My Team Ploughing?
         <ptr target="#song3"></ptr>
      </li>
      <li>Oh, When I Was In Love With You
         <ptr target="#song4"></ptr>
      </li>
      <li>Bredon Hill
         <ptr target="#song5"></ptr>
      </li>
      <li>Clun
         <ptr target="#song6"></ptr>
      </li>
   </list>
</div>

Alternatively, the pointers in the table of contents might link to the page breaks at which a song begins, assuming that these have been included in the markup:

<list form="ordered">
   <li>On Wenlock Edge
      <ref target="#song1-p1">1</ref>
   </li>
   <li>From Far, From Eve and Morning
      <ref target="#song2-p15">15</ref>
   </li>
   <!-- .... -->
</list>
<!-- Later in the document -->
<mdiv type="song">
   <pb xml:id="song1-p1"></pb>
   <!-- .... -->
</mdiv>
<mdiv type="song">
   <pb xml:id="song2-p15"></pb>
   <!-- .... -->
</mdiv>
<!-- .... -->

21.6.2. Title Pages

Detailed analysis of the title page and other preliminaries of older printed books and manuscripts is of major importance in descriptive bibliography and the cataloging of printed books; such analysis, however, requires a more detailed approach than the general one described here. The following elements are suggested as a means of encoding the major features of most title pages for faithful rendition:

titlePage
Contains a transcription of the title page of a text.
p
(paragraph) – One or more text phrases that form a logical prose passage.
table
Contains text displayed in tabular form.
list
A formatting element that contains a series of items separated from one another and arranged in a linear, often vertical, sequence.
quote
(block quote) – A formatting element that designates an extended quotation; that is, a passage attributed to a source external to the text and normally set off from the text by spacing or other typographic distinction.
lg
(line group) – May be used for any section of text that is organized as a group of lines; however, it is most often used for a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.

The following example shows the encoding of the title page of Vaughan Williams’ On Wenlock Edge. Note the use of the lb element to mark the line breaks present in the original.

<titlePage>
   <p>ON WENLOCK EDGE</p>
   <p>A CYCLE OF SIX SONGS
      <lb></lb>FOR TENOR VOICE ___ WITH ACCOMPANIMENT OF
      <lb></lb>Pianoforte and String Quartet (ad lib)
      <lb></lb>THE WORDS BY A. E. HOUSMAN
      <lb></lb>(FROM "A SHROPSHIRE LAD")
   </p>
   <p>
      <fig></fig>
   </p>
   <p>MUSIC BY
      <lb></lb>R. VAUGHAN
      <lb></lb>WILLIAMS
   </p>
   <list>
      <li>PRICE $3.75</li>
      <li>(COMPLETE WITH SET OF STRING PARTS $5.00)</li>
      <li>STRING PARTS SEPARATELY $1.00</li>
   </list>
   <p>Boosey &amp; Hawkes, Inc.</p>
   <p>New York, U.S.A.</p>
   <p>London · Toronto · Sydney · Capetown</p>
</titlePage>

The physical rendition of title page information is often of considerable importance. One approach to this requirement would be to use the rend element, described in chapter Text Rendition, to specify the rendition of each of the components of the title page. Another would be to employ a CSS stylesheet. Finally, a module customized for the description of typographic entities such as pages, lines, rules, etc., bearing special-purpose attributes to describe line-height, leading, degree of kerning, font, etc., could be employed.
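
The first of these approaches might be sketched as follows, reusing part of the title page transcribed above. The @halign, @fontweight and @fontstyle values are assumed for illustration and do not record the actual typography of the source.

<titlePage>
   <p>
      <rend halign="center" fontweight="bold">ON WENLOCK EDGE</rend>
   </p>
   <p>
      <rend halign="center" fontstyle="italic">Pianoforte and String Quartet (ad lib)</rend>
   </p>
   <!-- remaining title page content -->
</titlePage>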

21.7. Back Matter

Conventions vary as to which elements are grouped as back matter and which as front. For example, some books place the table of contents at the front, and others at the back. For this reason, the content models of the front and back elements are identical.

The following suggested values may be used for the @type attribute on all division elements, in order to distinguish various kinds of divisions characteristic of back matter:

‘appendix’: An ancillary self-contained section of a work, often providing additional but in some sense extra-canonical text.

‘glossary’: A list of terms associated with definition texts (‘glosses’).

‘notes’: A section in which textual notes are gathered together.

‘bibliography’: A list of bibliographic citations.

‘index’: Any form of index to the work.

‘colophon’: A statement appearing at the end of a book describing the conditions of its physical production.

No additional elements are proposed for the encoding of back matter at present. Some characteristic examples follow; first, an index (for the case in which a printed index is of sufficient interest to merit transcription):

<back>
   <div type="index">
      <head>Index</head>
      <list type="index">
         <li>a2, a3, etc., 175-176</li>
         <li>Abbreviations, 3
            <list type="index">
               <li>Percussion, 205-213</li>
               <li>Strings, 307</li>
            </list>
         </li>
         <li>Afterbeats, 77</li>
      </list>
   </div>
</back>

Note that if the page breaks in the original source have also been explicitly encoded, and given identifiers, the references to them in the above index can more usefully be recorded as links. For example, assuming that the encoding of page 77 of the original source starts like this:

<pb xml:id="text.P77"></pb>

then the last item above might be encoded more usefully in the following form:

<li>Afterbeats,
   <ref target="#text.P77">77</ref>
</li>
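
Other kinds of back matter follow the same pattern of typed div elements. The sketch below assumes a source with a section of notes followed by a colophon; the heading is hypothetical and the content is left as placeholder comments.

<back>
   <div type="notes">
      <head>Notes</head>
      <p><!-- first note --></p>
      <p><!-- second note --></p>
   </div>
   <div type="colophon">
      <p><!-- statement of printing and production --></p>
   </div>
</back>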