21.1. Organizing Text into Divisions
21.2. Paragraphs
21.3. Lists
21.4. Quotation
21.5. Poetry
21.6. Paratext
21.6.1. Front Matter
21.6.2. Title Pages
21.7. Back Matter
This chapter describes methods for encoding textual content with MEI. Textual information on scores has several different uses, although some text is closer to music notation than other kinds. For example, tempo marks, directives and lyrics are directly related to the functionality of the notated music and are, therefore, described in other chapters (see for example Vocal Text and Text Directives). This chapter, on the other hand, focuses on the text that accompanies the score, i.e., paratext (prefatory material, title pages, back matter, appendices, etc.), titles, prose, poetry, etc.
Most of the elements described here take inspiration from encoding formats that deal primarily with text, such as HTML and the Text Encoding Initiative (TEI). These elements are provided to encode relatively basic textual information. For deeper encoding of text, these Guidelines recommend consideration of other text-specific encoding formats with embedded MEI markup.
Text can be organized in different parts, for example in chapters or sections. The div element is used to encode such structural divisions.
For example, a printed score may open with text organized in multiple sections (e.g., a preface, a critical report, performance instructions; see the following sections); each of these sections should be encoded in its own div element. Text may also occur between music sections (see Content of Musical Divisions): in a collection of Romantic piano works, for instance, a few pieces might be preceded or followed by poetry. Such text should be encoded with the div element, as demonstrated in the following example:
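As a minimal sketch of the second case, the fragment below places a div holding a poem between two sections of notation. The nesting inside an outer section, the type value "poem", and the bracketed placeholder text are assumptions made for illustration; the exact placement should be checked against the schema in use.

```xml
<!-- Hypothetical fragment: a poem printed between two pieces of a collection. -->
<section>
  <section>
    <!-- notation of the first piece (omitted) -->
  </section>
  <div type="poem">
    <lg>
      <l>[first line of the poem]</l>
      <l>[second line of the poem]</l>
    </lg>
  </div>
  <section>
    <!-- notation of the second piece (omitted) -->
  </section>
</section>
```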
Textual divisions may have titles or other forms of introductory material, which are encoded with the head element.
The following example shows the encoding of a preface translated into three different languages, each with a different heading:
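A sketch of such an encoding might look as follows. The choice of English, German and Italian, the use of xml:lang to record each language, and the bracketed placeholder text are illustrative assumptions rather than a transcription of any particular source.

```xml
<!-- Hypothetical front matter: one div per translation, each with its own head. -->
<front>
  <div xml:lang="en">
    <head>Preface</head>
    <p>[English text of the preface]</p>
  </div>
  <div xml:lang="de">
    <head>Vorwort</head>
    <p>[German text of the preface]</p>
  </div>
  <div xml:lang="it">
    <head>Prefazione</head>
    <p>[Italian text of the preface]</p>
  </div>
</front>
```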
Although div can mark any structural division of text, it is often helpful to distinguish the kind of division. The attributes @type and @subtype can be used for this purpose. @type must be present whenever @subtype is used, but the values of both can be set freely by the encoder.
The following example shows the use of @type to indicate that three prefaces, in English, German and Italian, are set as columns on the same page:
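One possible encoding is sketched below. Because the values of @type are chosen by the encoder, "preface" and "column" are assumed labels for this sketch, not a controlled vocabulary, and the bracketed text is a placeholder.

```xml
<!-- Hypothetical fragment: an outer preface div with one nested div per column. -->
<div type="preface">
  <div type="column" xml:lang="en">
    <head>Preface</head>
    <p>[English text]</p>
  </div>
  <div type="column" xml:lang="de">
    <head>Vorwort</head>
    <p>[German text]</p>
  </div>
  <div type="column" xml:lang="it">
    <head>Prefazione</head>
    <p>[Italian text]</p>
  </div>
</div>
```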
Paragraphs are fundamental to prose text and typically group one or more sentences that form a logical passage. A paragraph is usually typographically distinct: The text begins on a new line and the first letter of the content is often indented, enlarged, or both.
A paragraph is encoded with the p element:
Prose text serves several different purposes within an MEI document, so p can occur in many contexts. For example, it may be used within metadata elements (see The MEI Header):
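As a sketch of the metadata case, the header below uses p inside a project description. The choice of projectDesc as the containing element and the bracketed placeholder text are assumptions for illustration; other header elements accept paragraphs as well.

```xml
<!-- Hypothetical header: a paragraph of prose inside the encoding description. -->
<meiHead>
  <fileDesc>
    <titleStmt>
      <title>[title of the encoded work]</title>
    </titleStmt>
    <pubStmt/>
  </fileDesc>
  <encodingDesc>
    <projectDesc>
      <p>[Prose describing the aims of the encoding project.]</p>
    </projectDesc>
  </encodingDesc>
</meiHead>
```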
Alternatively, paragraphs may be part of the document contents (and therefore encoded within music), either as Paratext or within the music notation. In these cases, a paragraph will likely be contained by a div or other elements containing prose (e.g. annot, figDesc, etc.).
The following example shows a paragraph in a preface section:
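A minimal sketch, with assumed placeholder text and an assumed type value of "preface":

```xml
<!-- Hypothetical preface: each paragraph of prose is a separate p element. -->
<div type="preface">
  <head>Preface</head>
  <p>[First paragraph of the preface.]</p>
  <p>[Second paragraph of the preface.]</p>
</div>
```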
When a text contains lists, they can be encoded with the following elements:
The list element can identify any kind of list; the @form attribute can be used to specify whether the list is ordered, unordered, etc. Each item in the list is encoded with the li element. The @n attribute can be used to record a label for a list item, as in the following example:
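The sketch below illustrates this with a short numbered list. The form value "ordered" follows the wording above and should be checked against the schema version in use; the labels and the bracketed item text are placeholders.

```xml
<!-- Hypothetical list: @n records the label printed next to each item. -->
<list form="ordered">
  <li n="1">[first item]</li>
  <li n="2">[second item]</li>
  <li n="3">[third item]</li>
</list>
```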
Occasionally, lists have headers or titles, which can be encoded with head:
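For instance, a list with a heading might be sketched as follows, with placeholder text standing in for the heading and items:

```xml
<!-- Hypothetical list: head carries the title printed above the items. -->
<list>
  <head>[title of the list]</head>
  <li>[first item]</li>
  <li>[second item]</li>
</list>
```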
Quotations are common in many types of text. A quotation is typically attributed to a text other than the one being encoded. Often, the quoted material is typographically distinct from the surrounding text; i.e., surrounded by so-called ‘quote marks’ or rendered as a separate block of text. The quote element is used to mark this function:
The following examples show the use of quote.
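The two typographic situations mentioned above could be sketched roughly as follows: a short quotation inside a paragraph, and a longer one set off as its own block. The enclosing div, the bracketed placeholder text, and the placement of quote directly inside div are assumptions that should be validated against the schema.

```xml
<!-- Hypothetical fragment: an inline quotation and a block quotation. -->
<div>
  <p>[Sentence introducing a short quotation:] <quote>[quoted words]</quote>.</p>
  <quote>[A longer quoted passage rendered as a separate block of text.]</quote>
</div>
```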
The lg (line group) element is used generically to encode any section of text that is organized as a group of lines. Following the practice of the Text Encoding Initiative, these Guidelines recommend using it, along with the following elements, for marking up poetry:
Because lg groups verses, it can be used to encode additional stanzas not integrated into the music notation. In addition, it is common for a poem to include a title or a header, as is demonstrated by the following example:
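A sketch of a titled poem with two stanzas is given below. The nested lg elements, the type value "stanza", and the bracketed placeholder lines are illustrative assumptions; the l element marks each line.

```xml
<!-- Hypothetical poem: an outer lg with a head, one nested lg per stanza. -->
<lg>
  <head>[title of the poem]</head>
  <lg type="stanza" n="1">
    <l>[first line of stanza 1]</l>
    <l>[second line of stanza 1]</l>
  </lg>
  <lg type="stanza" n="2">
    <l>[first line of stanza 2]</l>
    <l>[second line of stanza 2]</l>
  </lg>
</lg>
```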
This section introduces paratextual material, such as title pages, prefaces, indexes and other text that precedes or follows the actual score.
By ‘front matter’ these Guidelines mean distinct sections of a text (usually, but not necessarily, a printed one), prefixed to it by way of introduction or identification as a part of its production. Features such as title pages or prefaces are clear examples; a less definite case might be the prologue attached to a dramatic work. The front matter of an encoded text should not be confused with the MEI header described in chapter The MEI Header, which provides metadata for the entire file.
An encoder may choose simply to ignore the front matter in a text, if the original presentation of the work is of no interest. No specific tags are provided for the various kinds of subdivision which may appear within front matter: instead, generic div (“division”) elements may be used, which should not be confused with mdiv (“musical division”) elements. The following suggested values for the @type attribute may be used to distinguish various kinds of division characteristic of front matter:
‘preface’: A foreword or preface addressed to the reader in which the author or publisher explains the content, purpose, or origin of the text.
‘ack’: A formal declaration of acknowledgement by the author in which persons and institutions are thanked for their part in the creation of a text.
‘dedication’: A formal offering or dedication of a text to one or more persons or institutions by the author.
‘abstract’: A summary of the content of a text as continuous prose.
‘contents’: A table of contents, specifying the structure of a work and listing its constituents. The list element should be used to mark its structure.
‘frontispiece’: A pictorial frontispiece, possibly including some text.
The following extended example demonstrates how various parts of the front matter of a text may be encoded. The front part begins with a title page, which is presented in section Title Pages, below. This is followed by a dedication and a preface, each of which is encoded as a distinct div:
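In outline, such front matter might be structured as in the sketch below. All text is reduced to bracketed placeholders, and the type values are those suggested in the list above.

```xml
<!-- Hypothetical front matter: title page, then dedication and preface as separate divs. -->
<front>
  <titlePage>
    <p>[transcription of the title page]</p>
  </titlePage>
  <div type="dedication">
    <p>[text of the dedication]</p>
  </div>
  <div type="preface">
    <head>Preface</head>
    <p>[first paragraph of the preface]</p>
    <p>[second paragraph of the preface]</p>
  </div>
</front>
```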
The front matter concludes with another div element, shown in the next example, this time containing a table of contents, which contains a list element (as described in chapter Lists). Note the use of the ptr element to provide page-references: the implication here is that the target identifiers (song1, song2, etc.) will correspond with identifiers used for the mdiv elements containing the individual songs. (For a description of the ptr element, see chapter Pointers and References.)
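A sketch of this arrangement follows. The identifiers song1 and song2 are the ones named above; the song titles are placeholders, and the contents of the mdiv elements are omitted.

```xml
<!-- Hypothetical fragment: each ptr targets the xml:id of the mdiv holding that song. -->
<music>
  <front>
    <div type="contents">
      <head>Contents</head>
      <list>
        <li>[title of the first song] <ptr target="#song1"/></li>
        <li>[title of the second song] <ptr target="#song2"/></li>
      </list>
    </div>
  </front>
  <body>
    <mdiv xml:id="song1">
      <!-- score of the first song (omitted) -->
    </mdiv>
    <mdiv xml:id="song2">
      <!-- score of the second song (omitted) -->
    </mdiv>
  </body>
</music>
```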
Alternatively, the pointers in the table of contents might link to the page breaks at which a song begins, assuming that these have been included in the markup:
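Under that assumption, the same table of contents might be sketched as below. The identifiers p1 and p5 and the page numbers are hypothetical, chosen only to show how a ptr can target a pb.

```xml
<!-- Hypothetical table of contents whose pointers target page breaks. -->
<div type="contents">
  <head>Contents</head>
  <list>
    <li>[title of the first song] <ptr target="#p1"/></li>
    <li>[title of the second song] <ptr target="#p5"/></li>
  </list>
</div>
<!-- Elsewhere in the encoding, the page breaks carry the matching identifiers: -->
<!--   <pb xml:id="p1" n="1"/>   and   <pb xml:id="p5" n="5"/>   -->
```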
Detailed analysis of the title page and other preliminaries of older printed books and manuscripts is of major importance in descriptive bibliography and the cataloging of printed books; such analysis, however, requires a more detailed approach than the general one described here. The following elements are suggested as a means of encoding the major features of most title pages for faithful rendition:
The following example shows the encoding of the title page of Vaughan Williams’ On Wenlock Edge. Note the use of the lb element to mark the line breaks present in the original.
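The general shape of such an encoding is sketched below. Only the work title and the composer's surname come from the text above; the remaining wording and the positions of the line breaks are bracketed placeholders, not a transcription of the actual title page.

```xml
<!-- Hypothetical sketch: lb marks each line break of the original title page. -->
<titlePage>
  <p>
    <title>On Wenlock Edge</title>
    <lb/>[subtitle or description of the work]
    <lb/>[statement naming the author of the poems]
    <lb/>[statement naming the composer, Vaughan Williams]
    <lb/>[publisher's imprint]
  </p>
</titlePage>
```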
The physical rendition of title page information is often of considerable importance. One approach to this requirement is to use the rend element, described in chapter Text Rendition, to specify the rendition of each component of the title page. Another is to employ a CSS stylesheet. Finally, a module customized for the description of typographic entities such as pages, lines, rules, etc., bearing special-purpose attributes to describe line-height, leading, degree of kerning, font, etc., could be employed.
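The first of these approaches could be sketched as follows. The attribute values shown (bold weight, italic style, centered alignment) are assumptions about a hypothetical source, and the attribute names should be confirmed in chapter Text Rendition for the schema version in use.

```xml
<!-- Hypothetical sketch: rend records the typography of individual title-page components. -->
<titlePage>
  <p>
    <rend fontweight="bold" halign="center">[main title]</rend>
    <lb/>
    <rend fontstyle="italic" halign="center">[subtitle]</rend>
  </p>
</titlePage>
```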
Conventions vary as to which elements are grouped as back matter and which as front. For example, some books place the table of contents at the front, and others at the back. For this reason, the content models of the front and back elements are identical.
The following suggested values may be used for the @type attribute on all division elements, in order to distinguish various kinds of divisions characteristic of back matter:
‘appendix’: An ancillary self-contained section of a work, often providing additional but in some sense extra-canonical text.
‘glossary’: A list of terms associated with definition texts (‘glosses’).
‘notes’: A section in which textual notes are gathered together.
‘bibliography’: A list of bibliographic citations.
‘index’: Any form of index to the work.
‘colophon’: A statement appearing at the end of a book describing the conditions of its physical production.
No additional elements are proposed for the encoding of back matter at present. Some characteristic examples follow; first, an index (for the case in which a printed index is of sufficient interest to merit transcription):
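A transcribed index could be sketched as a typed div containing a list, as below. The entries are placeholders and the page numbers, apart from page 77, which is discussed in the text that follows, are invented for illustration.

```xml
<!-- Hypothetical back matter: a printed index transcribed as a simple list. -->
<back>
  <div type="index">
    <head>Index</head>
    <list>
      <li>[first index entry], 12</li>
      <li>[second index entry], 45</li>
      <li>[last index entry], 77</li>
    </list>
  </div>
</back>
```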
Note that if the page breaks in the original source have also been explicitly encoded, and given identifiers, the references to them in the above index can more usefully be recorded as links. For example, assuming that the encoding of page 77 of the original source starts like this:
then the last item above might be encoded more usefully in the following form:
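The two steps can be sketched together as follows. The identifier p77 and the entry text are hypothetical, and ref is used here so that the printed page number itself becomes the link text.

```xml
<!-- Hypothetical sketch: the page break that opens page 77 carries an identifier... -->
<pb xml:id="p77" n="77"/>

<!-- ...so the index entry can link to it instead of citing the page as plain text. -->
<li>[last index entry], <ref target="#p77">77</ref></li>
```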