12.1 Facsimiles
12.1.1 Elements of the Facsimile Module
12.2 Performances
12.2.1 Overview
MEI can be used to connect an encoding (either a transcription of existing material or the specification of some expected output) with existing sources. This existing material may be in different formats: music notation in any combination of print and manuscript, or audio or video footage. The concepts for establishing such connections between encoded music and source material are described in the following chapters.
Most often, MEI is used for the preparation of a digital musical text based on an existing music document, or with the intention of rendering the encoded notation into a document or audio rendition. MEI can, however, be used to provide a different kind of digital reproduction of a source document, which relies on the description and provision of digital imagery. Both approaches may be combined, so that the encoding of the musical content and digital facsimiles may add different facets to the same MEI document.
This module makes available the following elements for encoding facsimiles: facsimile, surface, and zone.
These elements are used to add a separate subtree to MEI, starting with the facsimile element inside music, as seen in the following example:
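A minimal sketch of such a subtree is given here. The empty surface elements and the comments are placeholders, and children that the schema may require inside body are omitted for brevity.

```xml
<music>
  <facsimile>
    <!-- one surface per page of the source -->
    <surface/>
    <surface/>
  </facsimile>
  <body>
    <!-- encoded musical content, omitted in this sketch -->
  </body>
</music>
```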
It is possible to have more than one facsimile element in this location. This is especially useful when multiple sources are encoded in the same file using the mechanisms described in chapter 11.2 Editorial Markup of these Guidelines. In this case, the decls (declarations) attribute of facsimile may be used to refer to a source defined in the document’s header, as seen in the following example:
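The following sketch illustrates one way this might be encoded. The xml:id values source1 and source2 are hypothetical, and the header is reduced to the bare minimum.

```xml
<mei xmlns="http://www.music-encoding.org/ns/mei">
  <meiHead>
    <fileDesc>
      <titleStmt>
        <title/>
      </titleStmt>
      <pubStmt/>
      <sourceDesc>
        <!-- hypothetical identifiers; source descriptions omitted -->
        <source xml:id="source1"/>
        <source xml:id="source2"/>
      </sourceDesc>
    </fileDesc>
  </meiHead>
  <music>
    <!-- each facsimile points to the source it reproduces -->
    <facsimile decls="#source1">
      <surface/>
    </facsimile>
    <facsimile decls="#source2">
      <surface/>
    </facsimile>
  </music>
</mei>
```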
When using the FRBR model (see 3.5 Functional Requirements for Bibliographic Records (FRBR)), it is equally possible to reference a manifestation element instead of source.
Within a facsimile element, each page of the source is represented by a surface element. Each surface may be assigned an identifying string utilizing the label attribute. In addition, it may encapsulate more detailed metadata about itself in a figDesc element. The coordinate space of the surface may be recorded in abstract terms in the ulx, uly, lrx, and lry attributes. For navigation purposes, surface has a startid attribute that accommodates pointing to the first object appearing on this particular writing surface.
Within surface elements, one may nest one or more graphic elements, each providing a reference to an image file that represents the writing surface. Multiple graphic elements are permitted in order to accommodate alternative versions (different resolutions or formats, for instance) of the surface image. In spite of changes in resolution or format, all images must contain the same content, i.e., the entire writing surface. A graphic may refer to a single page within a multi-page document, which is – at least for Adobe PDF documents – available through a #page=X suffix to the target attribute.
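As an illustration, surfaces with labels, an abstract coordinate space, and alternative images could be sketched as below. All file names, identifiers, and coordinate values are invented for this example, and the element referenced by startid is assumed to exist in the content tree.

```xml
<facsimile>
  <surface xml:id="surface1" label="page 1"
           ulx="0" uly="0" lrx="2000" lry="3000"
           startid="#m1">
    <figDesc>First page of the source.</figDesc>
    <!-- two versions of the same image; both show the entire surface -->
    <graphic target="page1-large.jpg"/>
    <graphic target="page1-small.png"/>
  </surface>
  <surface xml:id="surface2" label="page 2">
    <!-- a single page within a multi-page PDF document -->
    <graphic target="scan.pdf#page=2"/>
  </surface>
</facsimile>
```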
The preceding markup will provide the basis for most page-turning applications. Often, however, it is desirable to focus attention on particular areas of the graphical representation of the surface. The zone element fulfills this purpose:
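A sketch of a surface with two zones might look like this. The coordinates and the file name are illustrative assumptions.

```xml
<surface ulx="0" uly="0" lrx="2000" lry="3000">
  <graphic target="page1.jpg"/>
  <!-- two rectangular regions of the writing surface -->
  <zone ulx="300" uly="200" lrx="370" lry="410"/>
  <zone ulx="367" uly="200" lrx="439" lry="410"/>
</surface>
```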
The coordinates of each zone define a space relative to the coordinate space of its parent surface. Note that this is not necessarily the same coordinate space defined by the width and height attributes of the graphic that represents the surface. The zone coordinates in the preceding example do not represent regions within the graphic, but rather regions of the writing surface.
Because the coordinate space of a zone is defined relative to that of a surface, it is possible to provide multiple graphic elements and multiple zone elements within a single surface. In the following example, two different images representing the entire surface are provided alongside specification of two zones of interest within the surface:
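One possible encoding is sketched below. The image dimensions, file names, and coordinates are invented; note that the zones refer to the coordinate space of the surface and not to the pixel dimensions of either image.

```xml
<surface ulx="0" uly="0" lrx="2000" lry="3000">
  <!-- two renditions of the whole surface at different resolutions -->
  <graphic target="page1-full.jpg" width="2000px" height="3000px"/>
  <graphic target="page1-half.jpg" width="1000px" height="1500px"/>
  <!-- zones are expressed in surface coordinates -->
  <zone ulx="300" uly="200" lrx="370" lry="410"/>
  <zone ulx="367" uly="200" lrx="439" lry="410"/>
</surface>
```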
A zone element may contain figDesc or graphic elements that provide detailed descriptive information about the zone and additional images, e.g., at a different or higher resolution, of the rectangle defined by the zone. The data objects contained within the zone may also be specified through the use of the data attribute, which contains ID references to one or more elements in the content tree of the MEI file, such as a note, measure, etc.
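A hedged sketch of zones pointing to measures via the data attribute follows. The identifiers measure1 and measure2, the coordinates, and the file name are hypothetical, and the measures are left empty for brevity.

```xml
<music>
  <facsimile>
    <surface ulx="0" uly="0" lrx="2000" lry="3000">
      <graphic target="page1.jpg"/>
      <!-- each zone names the element(s) it depicts -->
      <zone data="#measure1" ulx="300" uly="200" lrx="370" lry="410"/>
      <zone data="#measure2" ulx="367" uly="200" lrx="439" lry="410"/>
    </surface>
  </facsimile>
  <body>
    <mdiv>
      <score>
        <section>
          <measure xml:id="measure1" n="1"/>
          <measure xml:id="measure2" n="2"/>
        </section>
      </score>
    </mdiv>
  </body>
</music>
```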
Conversely, an element in the content may refer to the facsimile subtree using its facs attribute, which is made available by the att.facsimile attribute class. The last example could therefore be encoded with pointers in the other direction:
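Pointing from the content tree to the facsimile subtree could be sketched as follows. The zone identifiers, coordinates, and file name are assumptions made for this illustration.

```xml
<music>
  <facsimile>
    <surface ulx="0" uly="0" lrx="2000" lry="3000">
      <graphic target="page1.jpg"/>
      <zone xml:id="zone1" ulx="300" uly="200" lrx="370" lry="410"/>
      <zone xml:id="zone2" ulx="367" uly="200" lrx="439" lry="410"/>
    </surface>
  </facsimile>
  <body>
    <mdiv>
      <score>
        <section>
          <!-- each measure refers to the zone that depicts it -->
          <measure n="1" facs="#zone1"/>
          <measure n="2" facs="#zone2"/>
        </section>
      </score>
    </mdiv>
  </body>
</music>
```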
The pb element, defined in 2 Shared Concepts in MEI, makes special use of the facs attribute in that it points not to a zone but to a surface element. A pb marks the beginning of a page, so all elements in the content tree that are encoded between two consecutive pb elements can be taken to encode musical symbols written on the page (surface) referenced by the facs attribute of the first of these two pb elements.
The encoding of facsimile elements is intended to support sequential display of page images. If an encoder wants to describe the physical setup of a source document, the foliaDesc element is more appropriate. The difference between the two approaches, and how to combine them, is described in chapter 3.7.2 Description of folia.
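A sketch of page beginnings that refer to surfaces is given below. The identifiers and file names are hypothetical, and the measures are left empty.

```xml
<music>
  <facsimile>
    <surface xml:id="surface1">
      <graphic target="page1.jpg"/>
    </surface>
    <surface xml:id="surface2">
      <graphic target="page2.jpg"/>
    </surface>
  </facsimile>
  <body>
    <mdiv>
      <score>
        <section>
          <!-- measures 1 and 2 are written on surface1 -->
          <pb facs="#surface1"/>
          <measure n="1"/>
          <measure n="2"/>
          <!-- measure 3 is written on surface2 -->
          <pb facs="#surface2"/>
          <measure n="3"/>
        </section>
      </score>
    </mdiv>
  </body>
</music>
```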
This chapter describes the ‘performance’ module, which can be used for organizing audio and video files of performances of a musical work. The elements provided allow the encoder to group different recordings of the same performance, identify temporal segments within the recordings, and encode simple alignments with a music text.
The following elements are available to encode information about a recorded performance: performance, recording, avFile, clip, and when.
The performance element begins a subtree of the music element and appears alongside, or instead of, body (described in 2.1.2 General Music Structure Elements) and facsimile (described in 12.1 Facsimiles). A performance element represents one recorded performance event. As a performance may be recorded in multiple formats or by different personnel or using different equipment, the performance element may group one or more recordings of the event.
The decls attribute can be used to point to performance medium metadata for the performed work. See 3.6.7 Performance Medium and 3.5 Functional Requirements for Bibliographic Records (FRBR) for more details.
The recording element identifies a single recording event taking place within an absolute temporal space. The class att.mediaBounds contains attributes that can be used to define this space: begin, end, and betype.
The avFile element identifies an external file associated with a recording act. In the simplest case, the recording element will contain one avFile element identifying a file that represents it. The target attribute contains the URI of the digital media file. Use of the mimetype attribute is recommended for the avFile element. Its value should be a valid MIME media type defined by the Internet Engineering Task Force in RFC 2046. It is also recommended that all avFile elements have a recording or clip parent which bears the begin, end, and betype attributes.
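The simplest case might be sketched as follows. The URI and the time values are placeholders chosen for illustration.

```xml
<performance>
  <!-- the recording defines the temporal space; the file represents it -->
  <recording begin="00:00:00.00" end="00:03:10.00" betype="time">
    <avFile mimetype="audio/wav"
            target="http://example.com/path/to/recording.wav"/>
  </recording>
</performance>
```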
Sometimes, multiple digital files are created in order to provide greater flexibility in redistribution and playback capabilities. In this case, multiple avFile elements may occur, each with a different mimetype. Keep in mind, however, that each file still represents the complete temporal extent of the recording act in spite of the change of file format:
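A sketch with two file formats for the same recording act is shown below. The URIs and time values are hypothetical.

```xml
<performance>
  <recording begin="00:00:00.00" end="00:03:10.00" betype="time">
    <!-- both files cover the complete temporal extent of the recording -->
    <avFile mimetype="audio/wav"
            target="http://example.com/path/to/recording.wav"/>
    <avFile mimetype="audio/mpeg"
            target="http://example.com/path/to/recording.mp3"/>
  </recording>
</performance>
```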
The clip element identifies a temporal segment of a recording act. In the following example, the clip begins two minutes into the timeframe of the recording and ends 20 seconds later:
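Such a clip could be encoded along these lines. The overall duration of the recording and the URI are assumptions.

```xml
<performance>
  <recording begin="00:00:00.00" end="00:03:10.00" betype="time">
    <avFile mimetype="audio/wav"
            target="http://example.com/path/to/recording.wav"/>
    <!-- starts at 2:00 and ends 20 seconds later -->
    <clip begin="00:02:00.00" end="00:02:20.00" betype="time"/>
  </recording>
</performance>
```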
Beyond these relatively simple uses, complex situations may occur that require equally complex markup. For example, a single performance may be represented by multiple digital media files. Because they have differing durations, the media files must be the result of separate recording acts, even if these recording acts took place at the same time:
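One way to sketch this situation is given below. The durations and URIs are invented for illustration.

```xml
<performance>
  <!-- two separate recording acts of the same performance -->
  <recording begin="00:00:00.00" end="00:03:10.00" betype="time">
    <avFile mimetype="audio/wav"
            target="http://example.com/path/to/recordingA.wav"/>
  </recording>
  <recording begin="00:00:00.00" end="00:03:15.00" betype="time">
    <avFile mimetype="audio/wav"
            target="http://example.com/path/to/recordingB.wav"/>
  </recording>
</performance>
```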
A single performance may also be represented by multiple, sequential digital files, as when a complete work is recorded in several so-called ‘takes’. In this case, the files may be considered to be parts of a single recording act, the extent of which is the combined extent of the individual clips. For example, a series of clip elements may be used to identify each movement of a piece and give start and end times for the movements in relation to the overall temporal space of the complete work:
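A sketch of a three-movement work recorded in separate files follows. The labels, time values, and URIs are hypothetical.

```xml
<performance>
  <!-- the recording spans the combined extent of its clips -->
  <recording begin="00:00:00.00" end="00:07:00.00" betype="time">
    <clip label="movement 1" begin="00:00:00.00" end="00:01:50.00" betype="time">
      <avFile mimetype="audio/wav"
              target="http://example.com/path/to/movement1.wav"/>
    </clip>
    <clip label="movement 2" begin="00:01:50.00" end="00:04:30.00" betype="time">
      <avFile mimetype="audio/wav"
              target="http://example.com/path/to/movement2.wav"/>
    </clip>
    <clip label="movement 3" begin="00:04:30.00" end="00:07:00.00" betype="time">
      <avFile mimetype="audio/wav"
              target="http://example.com/path/to/movement3.wav"/>
    </clip>
  </recording>
</performance>
```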
Similar markup is also applicable when a single file representing the entirety of a recording act is broken into segments later, as is often done for practical storage and distribution reasons. The file from which the clips are derived is indicated using an avFile element:
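A hedged sketch of this arrangement is shown here. The file names and time values are invented; the avFile directly inside recording stands for the complete file from which the segments are derived.

```xml
<performance>
  <recording begin="00:00:00.00" end="00:07:00.00" betype="time">
    <!-- the complete file from which the clips were cut -->
    <avFile mimetype="audio/wav"
            target="http://example.com/path/to/complete.wav"/>
    <clip begin="00:00:00.00" end="00:03:30.00" betype="time">
      <avFile mimetype="audio/wav"
              target="http://example.com/path/to/segment1.wav"/>
    </clip>
    <clip begin="00:03:30.00" end="00:07:00.00" betype="time">
      <avFile mimetype="audio/wav"
              target="http://example.com/path/to/segment2.wav"/>
    </clip>
  </recording>
</performance>
```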
A clip may be used to define any region of interest, such as a cadenza or a modulation, a song verse, etc. The following example shows the use of clip and its attributes to identify significant sections of a recording:
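Regions of interest might be marked as in the following sketch. The labels and time values are illustrative assumptions, and no media files are referenced.

```xml
<performance>
  <recording begin="00:00:00.00" end="00:07:00.00" betype="time">
    <!-- clips defined purely by their temporal boundaries -->
    <clip label="first verse" begin="00:00:15.00" end="00:01:05.00" betype="time"/>
    <clip label="modulation" begin="00:03:10.00" end="00:03:25.00" betype="time"/>
    <clip label="cadenza" begin="00:05:40.00" end="00:06:30.00" betype="time"/>
  </recording>
</performance>
```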
The preceding example also demonstrates that media files are not required in order to define the temporal space of a recording act or clip. This makes it possible to set the boundaries of these features, then use the content of the performance element as a rudimentary "edit decision list" to create the matching digital files.
If an encoding of the notated text with which the media files are associated is included in the MEI file, the startid attribute can be used to indicate the first element in the sequence of events to which the recording corresponds:
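A sketch of a recording aligned with the first encoded measure follows. The identifier m1, the URI, and the time values are hypothetical.

```xml
<music>
  <performance>
    <!-- startid names the first element to which the recording corresponds -->
    <recording begin="00:00:00.00" end="00:03:10.00" betype="time"
               startid="#m1">
      <avFile mimetype="audio/wav"
              target="http://example.com/path/to/recording.wav"/>
    </recording>
  </performance>
  <body>
    <mdiv>
      <score>
        <section>
          <measure xml:id="m1" n="1"/>
          <!-- further measures omitted -->
        </section>
      </score>
    </mdiv>
  </body>
</music>
```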
Clips can also be aligned with components of the musical text encoded in the body. The startid attribute can be used to specify the starting element in the sequence of events to which the clip corresponds. The following example shows the use of clip elements to identify the exposition of the first movement from Beethoven’s piano sonata Op. 14, no. 2 and its concluding ‘codetta’.
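Such an alignment could be sketched as below. The time values and the identifiers expo_start and codetta_start are invented and do not reflect any particular recording or edition; the measures are assumed to be defined in the body.

```xml
<performance>
  <recording begin="00:00:00.00" end="00:07:00.00" betype="time">
    <avFile mimetype="audio/wav"
            target="http://example.com/path/to/sonata.wav"/>
    <!-- exposition, starting at the hypothetical element expo_start -->
    <clip label="exposition" begin="00:00:00.00" end="00:01:55.00"
          betype="time" startid="#expo_start"/>
    <!-- codetta, lying within the extent of the exposition -->
    <clip label="codetta" begin="00:01:40.00" end="00:01:55.00"
          betype="time" startid="#codetta_start"/>
  </recording>
</performance>
```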
Please note that the begin and end times of clips may overlap. In the preceding example, the extent of the codetta is contained within that of the exposition. Overlapping beginning and ending points may also be used to provide additional performance context for a segment or because there is uncertainty with regard to precise values for these points.
A bibliographic description of a recording or metadata explaining how clip boundaries were determined may be associated with the recording and clip elements via the decls attribute:
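One possible arrangement is sketched here, assuming that the bibliographic description is given in a source element and the clip boundaries are explained in a segmentation element. All identifiers, time values, and the descriptive text are hypothetical.

```xml
<mei xmlns="http://www.music-encoding.org/ns/mei">
  <meiHead>
    <fileDesc>
      <titleStmt>
        <title/>
      </titleStmt>
      <pubStmt/>
      <sourceDesc>
        <!-- bibliographic description of the recording, omitted here -->
        <source xml:id="recSource1"/>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <editorialDecl>
        <segmentation xml:id="segmentation1">
          <p>Clip boundaries were determined by ear.</p>
        </segmentation>
      </editorialDecl>
    </encodingDesc>
  </meiHead>
  <music>
    <performance>
      <recording decls="#recSource1"
                 begin="00:00:00.00" end="00:07:00.00" betype="time">
        <clip decls="#segmentation1"
              begin="00:00:00.00" end="00:01:55.00" betype="time"/>
      </recording>
    </performance>
  </music>
</mei>
```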
Associations between a feature of the encoding, such as a note, dynamic mark, or annotation, and a time point, may be created using when elements and when attributes.
The when element identifies a particular point in time during the playback of a media file, such as an audio recording.
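A time point given in absolute terms might be sketched as follows. The time value, identifier, and URI are illustrative.

```xml
<performance>
  <recording>
    <avFile target="http://example.com/path/to/recording.wav"/>
    <!-- a point roughly 1.9 seconds into the recording -->
    <when xml:id="t1" absolute="00:00:01.915291666" abstype="time"/>
  </recording>
</performance>
```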
Time points may be identified in absolute terms as above; that is, in hours, minutes, and seconds since the beginning of the recording, or in relative terms using the interval, inttype, and since attributes. In the following example, the time point of interest happens 48 frames after the occurrence of the point labelled as "t1".
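A relative time point could be sketched like this. The inttype value smpte-25 is an assumption chosen to express the interval in frames; the identifiers and the absolute time are likewise hypothetical.

```xml
<performance>
  <recording>
    <avFile target="http://example.com/path/to/recording.wav"/>
    <when xml:id="t1" absolute="00:00:01.915291666" abstype="time"/>
    <!-- 48 frames after the point identified as t1 -->
    <when xml:id="t1.1" interval="48" inttype="smpte-25" since="#t1"/>
  </recording>
</performance>
```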
Having identified a point of interest, another feature of the encoding may be associated with this point using its when attribute:
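A sketch of an annotation tied to a time point follows. The identifier t1, the annotation text, and the placement of the annotation inside a measure are assumptions made for this illustration.

```xml
<music>
  <performance>
    <recording>
      <avFile target="http://example.com/path/to/recording.wav"/>
      <when xml:id="t1" absolute="00:00:01.915291666" abstype="time"/>
    </recording>
  </performance>
  <body>
    <mdiv>
      <score>
        <section>
          <measure n="1">
            <!-- the annotation is associated with time point t1 -->
            <annot when="#t1">The soloist enters here.</annot>
          </measure>
        </section>
      </score>
    </mdiv>
  </body>
</music>
```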
One use of the association created between the annotation and the time point is to display the text of the annotation as the recording or clip is played.
The when attribute allows only a single value, so only one-to-one relationships can be created using this mechanism. However, one-to-many relationships are accommodated in the opposite direction; that is, from a time point to other features of the markup. For example, a when element whose data attribute lists several identifiers indicates that the entities identified in data all occur at the same instant.
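A one-to-many association from a time point might be sketched as follows. The identifiers listed in data are hypothetical and are assumed to be defined elsewhere in the document.

```xml
<performance>
  <recording>
    <avFile target="http://example.com/path/to/recording.wav"/>
    <!-- three features of the encoding that occur at the same instant -->
    <when xml:id="t1" absolute="00:00:01.915291666" abstype="time"
          data="#feature1 #feature2 #feature3"/>
  </recording>
</performance>
```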
The extData element is a container for holding non-MEI data formats, similar to extMeta but available in when rather than in meiHead. It allows data from audio or other sources to be linked to notes or other score events. Data should be enclosed in a CDATA section.
The following example shows JSON-formatted performance data encoded with extData for a single note (presumed to be defined elsewhere in the document with the ID "note_1"). Both single-value summaries (e.g., pitch) and time series values (e.g., contF0) are encoded.
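A hedged sketch of such an encoding is given below. The key names pitch and contF0 follow the description above, while the numeric values, the time point, and the URI are invented for illustration.

```xml
<performance>
  <recording>
    <avFile target="http://example.com/path/to/recording.wav"/>
    <!-- time point linked to the note with the ID note_1 -->
    <when absolute="00:00:00.000" abstype="time" data="#note_1">
      <extData>
        <![CDATA[
          {
            "pitch": 440.0,
            "contF0": [439.2, 440.1, 440.6]
          }
        ]]>
      </extData>
    </when>
  </recording>
</performance>
```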