This chapter describes the basic principles and shared concepts of MEI. Besides giving
a general understanding of the basic structure of an MEI file, it introduces the
elements, models, and attributes that are part of the MEI.shared module and either describes
their use or points to the chapters of these guidelines or the tutorials that describe
their use and application.
Besides elements used by multiple other modules, the MEI.shared module defines the
main structural elements of an MEI file. Please be aware that there is also a short tutorial about the basics of XML & MEI that helps in understanding and learning the contents of this chapter.
2.1.1 Document Root Elements
MEI defines four elements qualifying as root elements (i.e., the element containing everything else) of an MEI document; the most common of these
are defined in the MEI.shared module:
Contains a single MEI-conformant document, consisting of an MEI header and a musical text,
either in isolation or as part of an meiCorpus element.
The most straightforward – and probably the most common choice fitting most of the
use cases when encoding music – is the mei element. It contains an meiHead element for capturing metadata and a music element for describing the musical text. A more detailed description of the application
of music can be found in the course of this section (see 2.1.2 General Music Structure Elements). If you want to learn more about the use of the meiHead element – formally declared in the MEI.header module – please visit the chapter 3.2 Structure of the MEI Header in the 3 Metadata in MEI section.
The example below shows the basic structure of an MEI file with mei as root element. Please be aware that this example still does not represent a valid MEI file.
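A minimal sketch of this structure, assuming the MEI namespace URI and an illustrative meiversion value; the comments stand in for required content that is not shown:

```xml
<mei xmlns="http://www.music-encoding.org/ns/mei" meiversion="4.0.1">
  <meiHead>
    <!-- metadata goes here -->
  </meiHead>
  <music>
    <!-- the musical text goes here -->
  </music>
</mei>
```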
The other potential root elements serve different use cases or purposes.
meiCorpus (MEI corpus) – A group of related MEI documents, consisting of a header for the group and
one or more mei elements, each with its own complete header.
meiCorpus contains an meiHead element describing a collection of related MEI-encoded texts – known as a corpus
– and an mei element for each text. Further information regarding the organization and encoding
of music corpora is given in chapter 3.8.5 Musical Corpora.
The example below shows the basic structure of an MEI file with meiCorpus as root element. Please be aware that this example still does not represent a valid MEI file.
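A minimal sketch of a corpus file, assuming the same namespace URI and an illustrative meiversion value; the number of mei children shown is arbitrary:

```xml
<meiCorpus xmlns="http://www.music-encoding.org/ns/mei" meiversion="4.0.1">
  <meiHead>
    <!-- metadata describing the corpus as a whole -->
  </meiHead>
  <mei>
    <meiHead><!-- metadata for the first text --></meiHead>
    <music><!-- the first musical text --></music>
  </mei>
  <mei>
    <meiHead><!-- metadata for the second text --></meiHead>
    <music><!-- the second musical text --></music>
  </mei>
</meiCorpus>
```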
The example below shows the basic structure of an MEI file with meiHead as root element. Please be aware that this example still does not represent a valid MEI file.
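A minimal sketch of a header-only file, again assuming the MEI namespace URI and an illustrative meiversion value:

```xml
<meiHead xmlns="http://www.music-encoding.org/ns/mei" meiversion="4.0.1">
  <!-- metadata only; the file contains no music element -->
</meiHead>
```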
The above examples all carry two attributes on their root elements. While @xmlns
is a general feature of XML and is not defined by MEI, it is crucial for stating
that the file you are dealing with is an MEI file. The second attribute is meiversion, provided by the att.meiVersion attribute class.
Although not required, the meiversion attribute is important for defining a stable reference to the specific MEI version
used in the enclosed encoding, and is thus highly recommended on your root element.
Contains a composite musical text, grouping together a sequence of distinct musical texts
(or groups of such musical texts) which are regarded as a unit for some purpose, for example,
the collected works of a composer.
While body holds the contents of a single musical text, group allows the textual body to consist of a series of (subordinate) musical texts or
other groups, e.g., to represent a collection of independent musical texts which is to be regarded as
a single unit for processing or other purposes. It is provided to simplify the encoding
of collections, anthologies, and cyclic works. It can also be used to record the potentially
complex internal structure of corpora, covered more fully in chapter 3.8.5 Musical Corpora. Whether the musical text being encoded should be structured one way or the other
is not to be decided here. For example, a collection of songs might be regarded as
a single item in some circumstances, or as a number of distinct items in others. In
such borderline cases, the encoder must choose whether to treat the text as unitary
or composite; each option may have advantages and disadvantages.
Please be aware that the following examples still do not reflect valid MEI files as
they are missing some required elements not defined in the MEI.shared module.
The basic structure of a unitary musical text:
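A sketch of such a unitary text, assuming the MEI.text module is active so that front and back are available; the comments are placeholders:

```xml
<music>
  <front>
    <!-- front matter, if any -->
  </front>
  <body>
    <mdiv>
      <!-- the musical text itself -->
    </mdiv>
  </body>
  <back>
    <!-- back matter, if any -->
  </back>
</music>
```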
Examples of composite texts which may be represented using the group element include anthologies and other collections. The presence of common front matter
referring to the whole collection, possibly in addition to front matter relating to
each individual musical text, is a good indication that a given musical text might
usefully be encoded in this way.
For example, the overall structure of a collection of songs might be encoded as follows:
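A sketch of a composite text with common front matter and two songs; the number of songs and the placement of the comments are illustrative assumptions:

```xml
<music>
  <front>
    <!-- front matter for the whole collection -->
  </front>
  <group>
    <music>
      <front><!-- front matter for the first song --></front>
      <body><!-- the first song --></body>
      <back><!-- back matter for the first song --></back>
    </music>
    <music>
      <body><!-- the second song --></body>
    </music>
  </group>
  <back>
    <!-- back matter for the whole collection -->
  </back>
</music>
```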
A group of musical texts may contain other unitary and grouped texts:
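A sketch of this kind of nesting, with one unitary text followed by a subordinate group; the arrangement is an illustrative assumption:

```xml
<music>
  <group>
    <music>
      <body><!-- a unitary text --></body>
    </music>
    <group>
      <music>
        <body><!-- first text of the nested group --></body>
      </music>
      <music>
        <body><!-- second text of the nested group --></body>
      </music>
    </group>
  </group>
</music>
```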
The group element may be used to encode any kind of collection in which the constituents are
regarded by the encoder as works in their own right, such as ad hoc single- or multiple-composer collections or anthologies of works not originally conceived
of as a single composition.
Divisions of the Body
This section describes sub-division of the body of a musical text. Front and back matter are described in chapter 9.2 Text in MEI.
Contains one or more URIs which denote classification terms that apply to the entity
bearing this attribute.
The body of a unitary musical text may contain one or more discrete, linear segments.
The names commonly used for these structural subdivisions vary with the genre, style,
and time period of the music, or even at the whim of the author, editor, or publisher.
For example, a major subdivision of a symphony is generally referred to as a ‘movement’.
An opera, on the other hand, is usually organized into ‘acts’ and then further by
‘scenes’. All such divisions are treated as occurrences of the same neutrally-named
mdiv element. The attributes type or class may be used to categorize them independently of their hierarchic level.
To accommodate "divisions within divisions", an mdiv element may contain additional mdiv sub-elements nested to any level required. For example, the encoding of a multi-movement
work, such as a symphony, might have the following structure:
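A sketch of such a structure; the type values and the number of movements are illustrative assumptions, not prescribed vocabulary:

```xml
<body>
  <mdiv type="symphony">
    <mdiv type="movement" n="1"><!-- first movement --></mdiv>
    <mdiv type="movement" n="2"><!-- second movement --></mdiv>
    <mdiv type="movement" n="3"><!-- third movement --></mdiv>
    <mdiv type="movement" n="4"><!-- fourth movement --></mdiv>
  </mdiv>
</body>
```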
Dramatic works, such as Verdi's opera Il Trovatore, often exhibit a more deeply nested structure:
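A sketch of a nested act and scene structure; the type values are illustrative, and only two acts are shown as a pattern rather than as a complete outline of the opera:

```xml
<body>
  <mdiv type="opera">
    <mdiv type="act" n="1">
      <mdiv type="scene" n="1"><!-- first scene --></mdiv>
      <mdiv type="scene" n="2"><!-- second scene --></mdiv>
    </mdiv>
    <mdiv type="act" n="2">
      <mdiv type="scene" n="1"><!-- first scene --></mdiv>
      <mdiv type="scene" n="2"><!-- second scene --></mdiv>
    </mdiv>
    <!-- further acts follow the same pattern -->
  </mdiv>
</body>
```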
Conventionally, in performance the musical structures represented by mdiv elements are separated by pauses; however, attacca, attacca subito, segue, or similar terms are sometimes used at the end of an mdiv to indicate that the next mdiv should begin immediately after the conclusion of the current one. These terms have
no effect, however, on the logical segmentation of musical content using mdiv elements.
Content of Musical Divisions
The contents of mdiv can be organised according to the two encoding paradigms provided by the score and parts elements.
The score element represents notation in which all the parts of an ensemble are arranged on
vertically aligned staves, while the parts element collects the individually notated parts for each performer or group of performers.
The explicit encoding of these two ‘views’ is necessary because it is not always possible
or desirable to automatically derive one view from the other. In addition, separating
scores and parts can eliminate a great deal of markup complexity.
The score and parts elements may also be employed to accommodate different methods of organizing the
markup – with no particular presentation implied. In this case, software may render
a collection of parts as a score or a score as a collection of parts.
Within the collective parts element, notation for a single performer is represented by the part element:
An alternative visual rendition of the score from the point of view of a particular
performer (or group of performers).
A part is effectively a small-scale score, allowing all the encoding features of a full
score, such as multiple staves, performance directives, and so on. A group of part elements is useful for encoding performing parts when there is no score, such as
in early music part books; when the parts have non-aligning bar lines; when different
layout features, such as page turns, are needed for the score and parts; or for accommodating
software that requires part-by-part encoding.
Please note that part elements in MEI are not an indication of voice leading or staff grouping. Voice leading
can be encoded using the next attribute, available on all the members of the model.eventLike class. The staffGrp element handles grouping of staves in the score context.
In both score and part views, the scoreDef element is used to describe logical characteristics of the encoded music, such as
key signature, the sounding key (as opposed to the notated key signature), meter,
etc., and visual features, such as page size, staff groupings and display labels,
etc. The staffGrp elements within scoreDef and the order of staffDef elements inside staffGrp should follow the score order of the source for the encoding.
A part or score may be further divided into linear segments called "sections".
section elements are often used as a scoping mechanism for clef signs, key and meter signatures,
as well as metronome, tempo, and expression markings. Using section elements can help to minimize the need for backward scanning to establish context
when the starting point for access is not at the beginning of the score. section elements may also be used for other user-defined, i.e., analytical or editorial, purposes and may therefore be arbitrarily nested.
The ending element shares the same model as the section element. Unlike section, however, it may not be recursively nested.
Alternative ending for a repeated passage of music, i.e., prima volta, seconda volta, etc.
The most common (non-analytical, non-editorial) use of section and ending elements is illustrated below:
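A structural sketch of a repeated passage with first and second endings; the comments stand in for the measures, and other required content is left out:

```xml
<score>
  <section>
    <section>
      <!-- measures of the repeated passage -->
    </section>
    <ending n="1">
      <!-- measures of the first ending (prima volta) -->
    </ending>
    <ending n="2">
      <!-- measures of the second ending (seconda volta) -->
    </ending>
  </section>
</score>
```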
Within section elements, several methods of organization are possible, depending upon the notational
style of the source material and the encoder's needs. For example, when the MEI.cmn
module is used, the default organization is measure-by-measure, with staff and layer sub-elements within each measure. Further discussion of CMN notation is continued in chapter 4 Repertoire: Common Music Notation.
However, staff-by-staff organization is more appropriate for music without measures
and is provided when either the MEI.mensural or MEI.neumes module is employed. Coverage
of mensural notation is provided in chapter 5 Repertoire: Mensural Notation, while 6 Repertoire: Neume Notation describes neumatic notation.
It must be noted that, when both the MEI.cmn and MEI.mensural modules are available,
it is possible to encode CMN notation without using measure elements; that is, staff-by-staff organization may be used and the ends of measures
marked using barLine elements.
In certain circumstances, this approach may be preferable for reproduction of the
visual layout of the music. However, the simultaneous use of the measure and barLine elements may lead to confusion and should be avoided.
Typically, MEI follows the order of sections as they appear in the document being
encoded. When performance requires a different order, for instance in the case of
D.C. and D.S. directives, the following element may be used to define the performance order:
Indicates how a section may be programmatically expanded into its 'through-composed' form.
In the following example, expansion is used to indicate how the notated sections should be ordered in a "through-composed"
rendition, for example for machine performance or analysis. The plist attribute contains an ordered list of identifiers of descendant section, ending, lem, or rdg elements. The sequence of values in the plist attribute indicates that the section labelled 'A' comes first, then the section labelled
'B', followed by the 'A' section again. This mechanism must be specified independently
of any textual directives, such as "Da capo" or "D.S. al Fine", that may be present
in the document.
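A sketch of the A B A ordering just described; the xml:id values, and the use of label to carry the section names, are assumptions made for illustration:

```xml
<section>
  <!-- performance order: A, then B, then A again -->
  <expansion plist="#A #B #A"/>
  <section xml:id="A" label="A">
    <!-- content of section A -->
  </section>
  <section xml:id="B" label="B">
    <!-- content of section B -->
  </section>
</section>
```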
2.1.3 Document Layout Elements
This section introduces the elements that can be used to represent document layout
features in MEI, be it for the sake of capturing an original source's layout when
transcribing or setting up layout features in so called ‘born digital’ documents.
Provides a number-like designation that indicates an element's position in a sequence
of similar elements. May not contain space characters.
The pb element can be used to mark page beginnings. When transcribing an existing document,
the n attribute should be used to record the page number displayed in the source. It need
not be an integer, e.g., 'iv', or 'p17-3'. The logical page number can be calculated by counting the preceding
pb elements. When used in a score context, a page beginning implies an accompanying
system beginning. This element is modelled on an element in the Text Encoding Initiative (TEI) standard.
sb (system beginning) – An empty formatting element that forces musical notation to begin
a new line.
Critical editions and collections of works often contain extensive text, such as a
title page, table of contents, an introductory essay, commentary, biographical sketch,
index, etc. These textual items may appear in either the front or back elements. The front and back elements, available only when the MEI.text module is
activated, are described more fully in chapter 9.2 Text in MEI.
lg (line group) – May be used for any section of text that is organized as a group of lines;
however, it is most often used for a group of verse lines functioning as a formal
unit, e.g., a stanza, refrain, verse paragraph, etc.
The character of elements specifying one or more score or staff parameters, such as
meter and key signature, clefs, etc., is that of a milestone; that is, they affect
all subsequent material until a following redefinition. A scoreDef element, which may affect more than just one staff, is allowed only within score, part and section elements, whereas staffDef is allowed only within staffGrp, staff and layer. A staffDef nested inside a staff must bear the same value for its n attribute as its parent staff and may thus not affect other staves.
The actual use of these elements depends on the repertoire and historical context
of the source material. For details on their use in Common Western Notation, please
refer to chapter 4.2.2 Defining Score Parameters for CMN.
2.2.2 Staves and Layers
The elements below are used to capture the logical organization of musical notation:
A group of equidistant horizontal lines on which notes are placed in order to represent
pitch or a grouping element for individual 'strands' of notes, rests, etc. that may
or may not actually be rendered on staff lines; that is, both diastematic and non-diastematic signs.
Records the output y coordinate of the stem's attachment point.
Because they can occur in the context of a stream of events on the staff, some elements
which are used in other contexts are also treated as events. For example, in addition
to being used to define the initial clef of a staff, the clef element can also be used to indicate a clef change.
Key Signatures and Clefs
Key signatures and clefs as well as intra-staff changes to these musical parameters
are treated as events.
A placeholder used to fill an incomplete measure, layer, etc. most often so that the
combined duration of the events equals the number of beats in the measure.
In this context, the term ‘space’ is used to mean whitespace that is required to meaningfully
align multiple voices in a multi-voice texture. In DARMS these were referred to as
‘push codes’. The space element is most often used when a new voice appears on a staff mid-measure.
The space element may also be used to align material that crosses staves.
‘Space’ can be thought of as another kind of event. In fact, some refer to this concept
as an ‘invisible rest’.
While ‘space’ is meaningful, ‘padding’ is non-essential whitespace that is used to
shift the position of the events which follow.
pad (padding) – An indication of extra visual space between notational elements.
The pad element is provided in order to capture software-dependent placement information
when it is desirable to do so. Unless the MEI file will be used as an intermediate
file format, this is usually not necessary.
Expression marks are instructions in the form of words, abbreviations, or symbols
that convey aspects of performance that cannot be expressed purely through the musical notation.
All of the following elements can be considered text directives; however, MEI uses
the dir element specifically for words, abbreviations, numbers, or symbols specifying or
suggesting the manner of performance that are not encoded elsewhere using the more
specific elements of tempo and dynam.
dir (directive) – An instruction expressed as a combination of text and symbols (such as
segno and coda symbols, fermatas over a bar line, etc.), typically placed above, below, or between
staves, but not on the staff, that is not encoded elsewhere in more specific elements, such as tempo or dynam.
Examples of directives include text strings such as 'affettuoso', fingering numbers,
or music symbols such as segno and coda symbols or fermatas over a bar line. Directives
can be control elements. That is, they can be linked via their attributes to other events.
The starting point of the directive may be indicated by either a tstamp, tstamp.ges,
tstamp.real or startid attribute, while the ending point may be recorded by either
a tstamp2, dur, dur.ges or endid attribute. It is a semantic error not to specify
a starting point attribute.
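A sketch of the two common ways to give a directive its starting point, assuming a CMN measure context; the note, its xml:id, and the second directive's text are hypothetical, and the two directives are independent illustrations:

```xml
<measure n="1">
  <staff n="1">
    <layer n="1">
      <note xml:id="n1" pname="c" oct="4" dur="4"/>
    </layer>
  </staff>
  <!-- starting point given as a timestamp within the measure -->
  <dir staff="1" place="above" tstamp="1">affettuoso</dir>
  <!-- starting point given as a pointer to an event -->
  <dir staff="1" place="below" startid="#n1">sul tasto</dir>
</measure>
```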
Tempo marks are indications through words, abbreviations, or specific metronome settings
of the speed at which a piece of music is to be performed. Both instantaneous and
continuous tempo markings may be encoded using this element.
Text and symbols descriptive of tempo, mood, or style, e.g., "allarg.", "a tempo",
"cantabile", "Moderato", "♩=60", "Moderato ♩ =60".
Dynamics, or dynamic marks, are terms, abbreviations, and symbols that indicate the
specific degrees of volume of a note, phrase, or section of music, e.g., "piano", "forte". Transitions from one volume level to another, e.g., "crescendo", "diminuendo", are also specified through dynamic marks.
Indication of 1) a "unified melodic idea" or 2) performance technique.
MEI maintains a distinction between phrase marks and slurs, the latter being curved
lines over or under a sequence of notes indicating that they are to be performed using
a particular playing or singing technique, e.g., notes that should be taken in a single breath
by wind instruments or played by string instruments using a single stroke of the bow.
Often, a slur also indicates that the affected notes should be played in a legato manner.
Even so, it is common for both of these concepts to be referred to generically as
"slurs". Therefore, unless one is encoding music from a repertoire in which this distinction
is important, the slur element should be preferred over phrase.
Ornaments are formulae of embellishment that can be realized by adding supplementary
notes to one or more notes of the melody.
An element indicating an ornament that is not a mordent, turn, or trill.
MEI provides a generic element for encoding an ornament symbol that is not a mordent,
turn, or trill. For those common CMN ornaments, please refer to 4.4 Common Music Notation Ornaments.
Ornaments can be represented as textual strings (e.g., with a Unicode symbol) or with a user defined symbol (for the latter also see 2.4 User-defined Symbols).
Ornaments may also be encoded as so-called control events (see also: 1.2.2 Events and Controlevents). That is, they can be linked via their attributes to other events. It is a semantic
error not to specify a starting point attribute with either tstamp or startid.
The following attributes, all of which are defined in separate attribute classes but
are also provided through the att.common attribute class, are available on nearly all elements in an MEI encoding. They provide
e.g., the means to identify, label, or reference elements in MEI-encoded files.
2.3.1 Attributes from the XML-namespace
The most general attributes that are very frequently encountered in MEI files are
not even native MEI attributes but are coming from the basic definition of XML in
the XML-namespace http://www.w3.org/XML/1998/namespace. MEI redefines some of them in the att.basic class.
Provides a base URI reference with which applications can resolve relative URI
references into absolute URI references.
At many locations in an MEI file one can reference internal or external entities.
For example, the following defines a graphic that references an external image (entity)
by means of the target attribute:
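A minimal sketch of such a reference, using a relative file name:

```xml
<!-- target holds a relative reference to the image file -->
<graphic target="myImage.jpg"/>
```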
When a reference to an external entity is not a complete URI, it is resolved against
the current base URI; unless defined by other means, this is the location of
the current document. The above example consequently means that the file `myImage.jpg`
referenced from graphic resides at the same location (in the same folder) as the MEI file.
The xml:base attribute may be used “to specify a base URI other than the base URI of the document
or external entity.” (Marsh, Jonathan; Tobin, Richard: XML Base (Second Edition).
W3C Recommendation 28 January 2009. Online at: http://www.w3.org/TR/2009/REC-xmlbase-20090128/).
The value of xml:base can be inherited from an ancestor. This is relevant for resolving relative links
or URIs within the document. A comprehensible use case can be illustrated by the following
example: the values of the graphic elements' target attribute can be completed by the xml:base value specified for the ancestor facsimile element:
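A sketch of this inheritance, assuming a hypothetical base URI and hypothetical image file names; the comments note how each relative target would resolve against the inherited base:

```xml
<facsimile xml:base="http://www.example.com/images/">
  <surface>
    <!-- resolves to http://www.example.com/images/image1.jpg -->
    <graphic target="image1.jpg"/>
  </surface>
  <surface>
    <!-- resolves to http://www.example.com/images/image2.jpg -->
    <graphic target="image2.jpg"/>
  </surface>
</facsimile>
```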
In order to determine an absolute URI, the base URIs of the element and all its ancestors
(including the document node) have to be taken into account. In the above case the
relative URIs of graphic/@target would consequently resolve to:
The xml:id and xml:base attributes are especially important when it comes to linking document fragments to
each other or to external entities. Many of the linking attributes are globally available
in MEI through the att.common attribute class.
Yet there are other attributes from the XML-Namespace encountered in MEI files.
Identifies the language of the element's content. The values for this attribute are
language 'tags' as defined in BCP 47. All language tags that make use of private use
sub-tags must be documented in a corresponding language element in the MEI header
whose id attribute is the same as the language tag's value.
Allows one to signal to an application whether an element's white space is
"significant". The behavior of xml:space cascades to all descendant elements, but it can
be turned off locally by setting the xml:space attribute to the value "default".
While the xml:lang attribute may be used to encode the language of an element's contents, the xml:space attribute lets you define the handling of whitespace, i.e., whether whitespace is important content ("preserve") or negligible ("default").
The latter is also the default value if no xml:space attribute is present.
Provides a numeric designation that indicates an element's position in a sequence
of similar elements. Its value must be a non-negative integer.
The label and n attributes both serve a labeling function; however, they differ in the values they
allow. The n attribute must be a single token, while label may contain a string value that includes spaces. This makes label useful for the capture of free-text labels, but a name or number specified with n may be easier to process.
The elements provided by the usersymbols module may be used in two ways:
For defining lines, curves and text elements that cannot be represented by a more specific element.
For defining reusable symbols and special graphical renditions.
For this purpose, it provides three elements as graphic primitives, line, curve and anchoredText. Anywhere these elements are allowed, the symbol element can be used as well. The symbol element facilitates the re-use of symbols that were defined by symbolDef elements.
Defining Reusable Symbols
The symbolDef element uses SVG markup or the aforementioned graphic primitives to describe a symbol.
A symbol definition may also use symbols defined by other symbolDef elements by employing the symbol element.
The following code snippet shows a definition of a triangle percussion symbol using
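A sketch of how such a definition might look, assuming the triangle is drawn with three line primitives; the xml:id value and all coordinates are hypothetical and given in the symbol's own abstract units:

```xml
<symbolTable>
  <symbolDef xml:id="userSymbols.triangle">
    <!-- left side, from the bottom-left corner up to the apex -->
    <line x="0" y="0" x2="2" y2="3.5"/>
    <!-- right side, from the apex down to the bottom-right corner -->
    <line x="2" y="3.5" x2="4" y2="0"/>
    <!-- base, from the bottom-right corner back to the start -->
    <line x="4" y="0" x2="0" y2="0"/>
  </symbolDef>
</symbolTable>
```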
The following snippet encodes a symbol composed of the symbol defined above and additional
Elements Without Semantic Implications
The graphics primitives and symbols can be used directly in the music to describe
text and lines on a purely graphical level, without implying a specific logical meaning.
If possible, however, more meaningful elements should be used. This means for example,
"a tempo" or "da capo" should in general not be put inside anchoredText. Instead, tempo and dir should be used. Likewise, slurs and ties should be encoded using their respective
elements, not using curve, and for glissandi, gliss should be used instead of line.
An example usage for line is the visualization of voice leading, which is not covered by a specific MEI element.
The following code snippet shows the encoding of the above example:
Defining a Specific Graphical Rendition for a Semantic Element
Usersymbols can define the rendition of different elements in two ways. Some elements,
for example dir and tempo, can have user symbol elements as content. In the following example, the content
of dir is used to provide pictograms of percussion instruments.
The corresponding encoding would be as follows:
A number of elements can point to an internally-defined symbol for rendering using
the altsym attribute.
Externally-defined symbols may be referenced using a glyph.name or glyph.num attribute from the att.extSym attribute class. Both attributes refer to Standard Music Font Layout (SMuFL) characters,
if not specified differently by the glyph.auth and glyph.uri attributes.
2.4.3 Positioning and Coordinates
MEI uses the classic axis directions where the x-axis points from left to right and
the y-axis points from bottom up. (This is compatible with PostScript's axis orientation,
while SVG's y-axis points in the opposite direction.)
There are two types of units used by MEI: Staff units and units of the output coordinate
system. Units of the output coordinate system can be translated to physical real world
distances by means of the vu.height and page.scale of a scoreDef element. Real world units are multiplied by the value of page.scale to get the corresponding value in output coordinate units.
If an element is scaled using the scale attribute, the actual size of the units changes accordingly.
An element may be positioned using either absolute or relative coordinates. If absolute
start point coordinates are specified using x/y coordinates (or their relatives x2/y2 for endpoints) they take precedence over relative positions specified by ho/vo/to (or startho/startvo/startto). Analogously, x2/y2 override endho/endvo/endto.
If to/startto/endto attributes are used, the start or end point is x-aligned with the indicated timestamp.
If relative start coordinates (ho/vo or startho/startvo) are used, the origin of the coordinate system to be used for the start point is
the first one found by the following search schema:
If startid is present, the origin of the referenced element;
If the element is inside running text (e.g., inside tempo), the end of the preceding text or element;
Otherwise, the origin of the containing element.
The start point is offset from this origin by the value of the start coordinates (ho/vo or startho/startvo), using staff units.
Analogously, the endpoint is determined using end coordinates (endho/endvo). If endid is specified, it takes precedence over startid.
Examples of origins are:
staff and layer: The horizontal origin is the starting point of the measure, the vertical one is
the bottom staff line;
note: The horizontal origin is the left end of the notehead, the vertical one is the center
of the notehead;
clef: The horizontal origin is the left end of the clef, the vertical one the line specified
by clef/line (or clef.line);
For elements containing text: The left end of the baseline;
symbolDef: As symbol definitions aren't rendered directly, their coordinate system and origin
are considered virtual.
When they are referenced by symbol or altsym, the origin of the context, i.e., the referencing symbol, is used. If neither absolute nor relative coordinates are
specified, determining visually suitable start and end points for line and curve elements is left to the rendering application. A value of 0 is not always assumed
for absent relative coordinates. A typical example where a rendering application may
not choose the origins of absent relative start and end coordinates to be the start
point as well is the line connecting two notes in the above Schumann example.
If neither a bezier nor bulge attribute is present, the renderer determines a suitable shape. However, if curvedir is present, the curve must respect the curvature direction specified there.
The attributes bezier and bulge define the shape of a curve in two different ways. If both are present, a rendering
application may choose either one. They override curvedir.
bezier defines the inner control points of a cubic Bézier curve, i.e., a Bézier curve with two inner control points. The coordinates are given by a space
separated list, first x and y offsets for the first control point, then x and y offsets
for the second one. The x and y offsets are given in staff units (or inside the context
of symbolDef in abstract units). The offsets for the first inner control point are relative to
the start point, the ones for the second inner control point are relative to the end point.
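A sketch of a curve whose shape is given by bezier, assuming two hypothetical notes with the identifiers n1 and n2 as start and end points; the offset values are arbitrary:

```xml
<!-- first control point: 1 unit to the right of and 2 units above the start point;
     second control point: 1 unit to the left of and 2 units above the end point -->
<curve startid="#n1" endid="#n2" bezier="1 2 -1 2"/>
```

Because the y-axis points upward, positive y offsets place both control points above the line connecting start and end, so the curve arches upward.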
The bulge attribute allows specification of the curve shape by a number of interpolation points.
The interpolation points are given by their distance from the line connecting the
start and end point. The distance values are stored as a space separated list.
The interpolation points are calculated as follows: If bulge provides n distance values, the connection line is divided into n+1 subsegments of equal length. The interpolation points are found by drawing a perpendicular
line of the respective length at each subsegment joint. Positive distance values are
drawn to the left of the connection line (left when traveling from start to end),
negative ones to the right.
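The construction described above can be written down as a formula. This is a sketch under the stated conventions (y-axis pointing up, "left" taken when traveling from start to end); the symbols S, E, d_i and P_i are introduced here for illustration and do not appear in the schema:

```latex
% S = (S_x, S_y): start point, E = (E_x, E_y): end point
% d_1, ..., d_n: the distance values listed in bulge
\[
  \hat{n} = \frac{\bigl(-(E_y - S_y),\; E_x - S_x\bigr)}{\lVert E - S \rVert},
  \qquad
  P_i = S + \frac{i}{n+1}\,(E - S) + d_i\,\hat{n},
  \qquad i = 1, \dots, n .
\]
```

Here the unit vector n-hat points to the left of the connection line, so a positive d_i moves the interpolation point P_i to the left and a negative d_i moves it to the right.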
The interpolation algorithm used by the rendering application is implementation dependent.
The form attribute of lines may take the following values:
These attribute values are only qualitative. Actual dash length and dot and dash spacing
are implementation dependent.
The width attribute may take the following values:
These values are also qualitative, but they are defined relative to one another: 'narrow'
is the default value, 'medium' is twice as wide as 'narrow', and 'wide' is twice as
wide as 'medium'.
In addition to these textual values, the width attribute may contain a numeric value
and an optional unit value, "2mm" for example. If the unit value is not provided,
staff interline units are presumed.
The lstartsym and lendsym attributes name the symbol that may start and/or end a line, while lstartsymsize and lendsymsize indicate the relative size of the symbol using a numeric value in the range from
1 to 9.
The usersymbols module does not currently support continuous composite lines or filled
areas. As mentioned above, the rendition of lines is highly implementation dependent.
Coordinate system transforms are restricted to scaling using scale.