18. Performances

This chapter describes the ‘performance’ module, which can be used for organizing audio and video files of performances of a musical work. The elements provided allow the encoder to group different recordings of the same performance, identify temporal segments within the recordings, and encode simple alignments with a music text.

18.1. Overview

The following elements are available to encode information about a recorded performance:

performance: A presentation of one or more musical works.
recording: A recorded performance.
avFile (audio/video file): References an external digital audio or video file.
clip: Defines a time segment of interest within a recording or within a digital audio or video file.
when: Indicates a point in time either absolutely (using the absolute attribute), or relative to another when element (using the since, interval and inttype attributes).

The performance element begins a subtree of the music element and appears alongside, or instead of, body (described in Music Element) and facsimile (described in Facsimiles). A performance element represents one recorded performance event. As a performance may be recorded in multiple formats, by different personnel, or using different equipment, the performance element may group one or more recordings of the event.

The @decls attribute can be used to point to performance medium metadata for the performed work. See Performance Medium for more details.
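
A minimal sketch of such a pointer follows. It assumes that the performance medium is described by a perfMedium element in the header; the xml:id value and the resource listed are hypothetical, and the surrounding header markup is omitted:

<!-- in the header -->
<perfMedium xml:id="performance.perfMedium1">
   <perfResList>
      <perfRes>Piano</perfRes>
   </perfResList>
</perfMedium>
<!-- ... -->
<performance decls="#performance.perfMedium1">
   <!-- recordings of the performed work -->
</performance>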

The recording element identifies a single recording event taking place within an absolute temporal space. The class att.mediabounds contains attributes that can be used to define this space:

begin: Specifies a point where the relevant content begins. A numerical value must be less than, and a time value must be earlier than, that given by the end attribute.
end: Specifies a point where the relevant content ends. If not specified, the end of the content is assumed to be the end point. A numerical value must be greater than, and a time value must be later than, that given by the begin attribute.
betype: Type of values used in the begin/end attributes. The begin and end attributes can only be interpreted meaningfully in conjunction with this attribute.

The avFile element identifies an external file associated with a recording act. In the simplest case, the recording element will contain one avFile element identifying a file that represents it. The @target attribute contains the URI of the digital media file. Use of the @mimetype attribute is recommended for the avFile element. Its value should be a valid MIME media type defined by the Internet Engineering Task Force in RFC 2046. It is also recommended that all avFile elements have a recording or clip parent which bears the @begin, @end, and @betype attributes.

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:03:10.00">
      <avFile mimetype="audio/wav" target="http://example.com/path/to/audio/recording"></avFile>
   </recording>
</performance>

Sometimes, multiple digital files are created in order to provide greater flexibility in redistribution and playback capabilities. In this case, multiple avFile elements may occur, each with a different mimetype. Keep in mind, however, that each file still represents the complete temporal extent of the recording act in spite of the change of file format:

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:03:10.00">
      <avFile mimetype="audio/wav" target="http://example.com/path/to/audio/recording"></avFile>
      <avFile mimetype="audio/mpeg" target="http://example.com/path/to/audio/recording"></avFile>
   </recording>
</performance>

The clip element identifies a temporal segment of a recording act. In the following example, the clip begins two minutes into the timeframe of the recording and ends 20 seconds later:

<recording begin="00:00:00.00" betype="time" end="00:03:10.00">
   <clip begin="00:02:00.00" betype="time" end="00:02:20.00"></clip>
</recording>

Beyond these relatively simple uses, complex situations may occur that require equally complex markup. For example, a single performance may be represented by multiple digital media files. Because they have differing durations, the media files must be the result of separate recording acts, even if these recording acts took place at the same time:

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:03:10.00">
      <avFile mimetype="audio/wav" target="http://example.com/path/to/audio/recording"></avFile>
   </recording>
   <recording begin="00:00:00.00" betype="time" end="00:03:15.00">
      <avFile mimetype="audio/mpeg" target="http://example.com/path/to/audio/recording"></avFile>
   </recording>
</performance>

A single performance may also be represented by multiple, sequential digital files, as when a complete work is recorded in several so-called ‘takes’. In this case, the files may be considered to be parts of a single recording act, the extent of which is the combined extent of the individual clips. For example, a series of clip elements may be used to identify each movement of a piece and give start and end times for the movements in relation to the overall temporal space of the complete work:

<performance>
   <recording>
      <clip begin="00:00:00.00" betype="time" end="00:07:00.00" n="mov1">
         <avFile mimetype="audio/aiff" target="movement01.aiff"></avFile>
      </clip>
      <clip begin="00:07:01.00" betype="time" end="00:12:03.00" n="mov2">
         <avFile mimetype="audio/aiff" target="movement02.aiff"></avFile>
      </clip>
   </recording>
</performance>

Similar markup is also applicable when a single file representing the entirety of a recording act is broken into segments later, as is often done for practical storage and distribution reasons. The file from which the clips are derived is indicated using an avFile element:

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:12:03.00" n="completeWork">
      <avFile mimetype="audio/aiff" target="completeWork.aiff"></avFile>
      <clip begin="00:00:00.00" betype="time" end="00:07:00.00" n="mov1">
         <avFile mimetype="audio/aiff" target="movement01.aiff"></avFile>
      </clip>
      <clip begin="00:07:02.00" betype="time" end="00:12:03.00" n="mov2">
         <avFile mimetype="audio/aiff" target="movement02.aiff"></avFile>
      </clip>
   </recording>
</performance>

A clip may be used to define any region of interest, such as a cadenza, a modulation, or a song verse. The following example shows the use of clip and its attributes to identify significant sections of a recording:

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:05:21.00">
      <!-- Exposition -->
      <clip begin="00:00:00.00" betype="time" end="00:01:41.00"></clip>
      <!-- Development -->
      <clip begin="00:01:41.00" betype="time" end="00:03:14.00"></clip>
      <!-- Recapitulation -->
      <clip begin="00:03:14.00" betype="time" end="00:04:28.00"></clip>
      <!-- Coda -->
      <clip begin="00:04:28.00" betype="time" end="00:05:21.00"></clip>
   </recording>
</performance>

The preceding example also demonstrates that media files are not required in order to define the temporal space of a recording act or clip. This makes it possible to set the boundaries of these features, then use the content of the performance element as a rudimentary “edit decision list” to create the matching digital files.

If an encoding of the notated text with which the media files are associated is included in the MEI file, the @startid attribute can be used to indicate the first element in the sequence of events to which the recording corresponds:

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:07:00.00" n="mov1" startid="#performance.m1_1">
      <avFile mimetype="audio/aiff" target="fullpiece.aiff"></avFile>
   </recording>
</performance>
<!-- ... -->
<body>
   <mdiv>
      <score>
         <section>
            <measure n="1" xml:id="performance.m1_1">
               <!-- ... -->
            </measure>
         </section>
      </score>
   </mdiv>
</body>

Clips can also be aligned with components of the musical text encoded in the body. The @startid attribute can be used to specify the starting element in the sequence of events to which the clip corresponds. The following example shows the use of clip elements to identify the exposition of the first movement of Beethoven’s piano sonata Op. 14, no. 2 and its concluding ‘codetta’:

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:05:21.00">
      <avFile mimetype="audio/aiff" target="BeethovenOp14N2-Mov1.aiff"></avFile>
      <!-- Exposition -->
      <clip begin="00:00:00.00" betype="time" end="00:01:41.00" startid="#performance.m1"></clip>
      <!-- Exposition's "codetta" -->
      <clip begin="00:01:31.00" betype="time" end="00:01:41.00" startid="#performance.m48"></clip>
   </recording>
</performance>
<!-- ... -->
<body>
   <mdiv>
      <score>
         <section>
            <measure n="1" xml:id="performance.m1">
               <!-- ... -->
            </measure>
            <!-- ... -->
            <measure n="48" xml:id="performance.m48">
               <!-- ... -->
            </measure>
         </section>
      </score>
   </mdiv>
</body>

Please note that the begin and end times of clips may overlap. In the preceding example, the extent of the codetta is contained within that of the exposition. Overlapping beginning and ending points may also be used to provide additional performance context for a segment or because there is uncertainty with regard to precise values for these points.

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:03:06.54">
      <!-- a section of interest -->
      <clip begin="00:00:00.00" betype="time" end="00:00:41.00"></clip>
      <!-- the following section starts a little before the end of the 
         previous one to give some "adjustment" time -->
      <clip begin="00:00:31.00" betype="time" end="00:01:07.00"></clip>
      <!-- the boundaries of the following section are "fuzzy" -->
      <clip begin="00:02:18.00" betype="time" end="00:02:49.85"></clip>
   </recording>
</performance>

A bibliographic description of a recording or metadata explaining how clip boundaries were determined may be associated with the recording and clip elements via the @decls attribute:

<performance>
   <recording begin="00:00:00.00" betype="time" decls="#performance.recBibDesc" end="00:03:06.54">
      <!-- a section of interest -->
      <clip begin="00:00:00.00" betype="time" end="00:00:41.00"></clip>
      <!-- the following section starts a little before the end of the 
         previous one to give some "adjustment" time -->
      <clip begin="00:00:31.00" betype="time" decls="#performance.clipDesc" end="00:01:07.00"></clip>
      <!-- the boundaries of the following section are "fuzzy" -->
      <clip begin="00:02:18.00" betype="time" end="00:02:49.85"></clip>
   </recording>
</performance>

Associations between a feature of the encoding (such as a note, dynamic mark, or annotation) and a time point may be created using when elements and @when attributes.

The when element identifies a particular point in time during the playback of a media file, such as an audio recording.

<when absolute="00:00:01.915291666" xml:id="t1"></when>

Time points may be identified in absolute terms as above; that is, in hours, minutes, and seconds since the beginning of the recording, or in relative terms using the @interval, @inttype, and @since attributes. In the following example, the time point of interest happens 48 frames after the occurrence of the point labelled as “t1”.

<when interval="48" inttype="smpte-ndf29.97" since="#t1" xml:id="t1.1"></when>
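
As a rough check of the arithmetic, and assuming that the interval counts frames at a nominal rate of 29.97 frames per second, 48 frames last about 1.60 seconds (48 / 29.97), so the relative time point above falls at roughly 00:00:03.517. A sketch of an approximately equivalent absolute encoding follows; the xml:id value is hypothetical, and the rounding to milliseconds is a choice made only for this illustration:

<!-- 1.915 s (t1) + 48 frames at 29.97 fps (about 1.602 s) = about 3.517 s -->
<when absolute="00:00:03.517" xml:id="t1.1.absolute"></when>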

Having identified a point of interest, another feature of the encoding may be associated with this point using its @when attribute:

<annot plist="#LvB" when="#t1">
   <p>I like this part!</p>
</annot>

One use of the association created between the annotation and the time point is to display the text of the annotation as the recording or clip is played.

The @when attribute allows only a single value, so only one-to-one relationships can be created using this mechanism. However, one-to-many relationships are accommodated in the opposite direction; that is, from a time point to other features of the markup. For example,

<when absolute="00:00:01.915291666" data="#feature1 #feature2 #feature3" xml:id="t1.2"></when>

indicates that the entities identified in @data all occur at the same instant.
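
The two directions of association can be seen side by side in a single sketch. It assumes that the when elements are placed inside the recording whose timeline they describe and that the notes shown are encoded in the body; all xml:id values are hypothetical, and the musical content is abbreviated:

<performance>
   <recording begin="00:00:00.00" betype="time" end="00:03:10.00">
      <avFile mimetype="audio/wav" target="http://example.com/path/to/audio/recording"></avFile>
      <!-- from a time point to several features: both notes sound at this instant -->
      <when absolute="00:00:01.915291666" data="#performance.n1 #performance.n2" xml:id="performance.t1"></when>
      <!-- a second time point, 48 frames after the first -->
      <when interval="48" inttype="smpte-ndf29.97" since="#performance.t1" xml:id="performance.t2"></when>
   </recording>
</performance>
<!-- ... -->
<measure n="1">
   <staff n="1">
      <layer n="1">
         <chord dur="4">
            <note oct="4" pname="c" xml:id="performance.n1"></note>
            <note oct="4" pname="e" xml:id="performance.n2"></note>
         </chord>
         <!-- from a single feature to a time point -->
         <note dur="4" oct="4" pname="g" when="#performance.t2"></note>
      </layer>
   </staff>
</measure>

Whether to point from the time point (via @data) or from the feature (via @when) is largely a matter of which side of the relationship is expected to change more often; the sketch mixes both only for illustration.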