Richard Staehli and Jonathan Walpole
Oregon Graduate Institute of Science & Technology
20000 NW Walker Rd., PO Box 91000, Portland, OR 97291-1000
Real-time scripts are more complex than continuous media playback. Proposals for continuous media storage systems [Ande92,Loug93] have addressed the problem of guaranteeing resources for constant rate storage access. While a script may include continuous media elements, the rate of data access may vary greatly as new elements are introduced during the presentation. Furthermore, a script can compose any set of stored objects, making it difficult for the storage system to provide useful bounds on seek time between those objects. To satisfy a script, a real-time storage system must not only guarantee throughput for continuous media objects, but must anticipate the latency in retrieving the start of each object.
Unlike many popular scripting languages, our scripts do not include user interactions. User interaction introduces uncertainty into the prediction of resource requirements and would expand the size of the problems that we are trying to solve [Stae93]. Since interaction corresponds to choice points in a graph of scripts, it is reasonable to believe that support for non-interactive scripts will contribute to supporting more general scripting languages. For our work, we assume that user response should be quick, but does not require strong guarantees.
We have been investigating the feasibility of using the timing and other quality of service (QOS) information from presentation scripts to allow efficient utilization of storage system resources while still providing QOS guarantees. The architecture that we have developed is called Rowena after a windsurfing site famous for satisfying many users concurrently. It features a presentation service that accepts requests for script executions and attempts to reserve resources to satisfy as many requests as possible. Like other proposals for real-time multimedia support, the Rowena architecture provides an acceptance test for requests that considers the QOS requirements of the script. If a request is accepted, the presentation is guaranteed to meet the script's requirements, barring bugs or system failures. What is novel about our approach is that the presentation service can use heuristic scheduling methods to make advance reservations for resources such as a buffer that may be in use currently, but is known to be free for a specific future interval. In order to schedule resources for each request, the data objects and presentation tasks referenced in the application level script must be translated into retrieval, data processing, and device output tasks plus the physical resources needed for each.
In this paper, we focus on defining the parameters of a QOS-based request interface to the presentation service and a theory for translating requests into a resource scheduling problem. To put the value of this work into a real world context, we describe a digital television studio and its need for real-time resource management.
A script is a mapping of events to clock intervals. An event is an observable state change, e.g. the output of a message. An event is said to occur during an interval (t1,t2+1) if and only if a sequential process observes the clock value t1 before the event and the value t2 after observing the event. A clock produces a monotonically increasing sequence of integer values. A script that is intended only to specify synchronization between scripted events can be offset without changing its meaning so that the first interval (the one with the lowest time value) begins at zero.
For convenience, we require that multiple occurrences of the same event in a script be differentiated, by a counter if necessary, so that there is no ambiguity about which occurrence of an event the script refers to.
Like a script, a trace of a process execution is a mapping from events to the clock intervals that they were observed to occur in. A trace satisfies a script relative to start time t if there is a one-to-one mapping between event-interval pairs (e,I) in the script and (e,I') in the trace and t+I contains I'.
The quality of service requirements of a script may be relaxed by including an allowance for exceptions that tells what type of exceptions and how many of each can be tolerated. Exceptions can be categorized into three groups: missing events, extraneous events, and timing errors. A QOS specification may allow for specific events to be missed, or it may place some constraint on the number or pattern of missing events. Similarly, added events and timing errors may be allowed for individually or with aggregate constraints.
As an example, suppose that a script for the delivery of audio data to a voice quality output device requires a write of 512 bytes of data, 16 times per second. We can allow for imperfect data delivery by specifying that as many as 1 in every 100 successive writes can be replaced with null values if the correct data is not available.
In a multimedia script an event names a data object and specifies how it is to be presented. In the example above, an event names the nth block of a file to be written to the audio device. We define primitive media types such as audio and video that have default scripts specifying the mapping from storage objects to presentation events, where all events present the data at full resolution and normal rate on a single output device. Script composition operators are defined to clip from a source, synchronize subscripts, concatenate scripts and remap outputs.
In the Rowena architecture, the presentation manager must locate the data that is needed for a script and choose a set of tasks to retrieve and process that data for presentation. This choice must consider the resource costs and the impact of these tasks on future requests, e.g. cache management. Since data retrieval and processing tasks must complete before presentation, the result is a graph with precedence constraints forming the edges between tasks that may also compete for resources. The acceptance test must attempt to find a schedule that meets all timing constraints while also minimizing both resource usage and start-up delay for the presentation.
Note that the scheduling done in the acceptance test need not fix the interval that each task is performed in precisely. Instead, it will undoubtedly be more efficient to schedule the work to be done in large bins with the assurance that all work in the bin will complete by a suitable deadline. This loose scheduling allows the underlying scheduling mechanisms to reorder disk accesses in order to minimize inter-block latency.
If buffer space is plentiful, the presentation manager can provide a fixed size queue for each request with enough prefetched data to sustain playback through the worst case intervals of a slow prefetch rate. Fixed size queues are wasteful however, if the full capacity is rarely needed during playback. In that case more playback requests can be accepted with the weaker guarantee that buffer space will be available when it is needed. In the next section, we describe storage access in a digital television studio. From this work we hope to obtain traces of real world multimedia presentations to use in a comparison of fixed versus dynamic resource allocation strategies.
A single day's newsfeeds will constitute approximately 10 hours of video that require up to 360GB of storage. The studio can be expected to have many terabytes of archival video on site. Because of the amount of storage required, low-cost and typically high-latency storage devices will be used for archival storage. Although archival data that will be read frequently may be cached, the archive should support real-time scripted access for browsing. While editing, a single user may be interested in only an hour of video data and thus might be able to work effectively with only a 36 GB partition of the database.
In the database, video data is immutable. This simplifies the sharing of the data since the same segment of a video may be included by reference in independent presentations without making independent copies. Playback processes can require the simultaneous presentation of multiple video segments with one or more audio tracks. The first time a user requests such a presentation the database will need to concurrently retrieve all the data streams. The aggregate bandwidth will be the sum of the bandwidths required for each individual stream.
Editors will need to locate useful video footage, load new data, interactively view the data, and compose selected segments into new video presentations. Interactive viewing includes manual fast forward/reverse control by spinning a knob to locate visual and audio cues. A precise playback rate is not as important during this interactive search as it is during normal speed presentation, but it does require that full frames can be retrieved without reading the entire data stream serially from storage since the latter might need upto 100 times the normal bandwidth.
Composition operators include cuts from one data stream to another, concurrent presentation of two streams such as lip-synched audio and video, spatial layout of multiple regions of a display, and combination of inputs, including transitions such as wipes and fades. For reviewing the compositions, playback must be in real-time with broadcast quality in the signal reproduction and synchronization of media elements. The products of the editing process are scripts, as defined previously, that specify data sources, output devices, processing and real-time QOS requirements for the presentation. The operating system for the digital television studio must use this information to manage resources effectively for multiple users.