SlideShare una empresa de Scribd logo
1 de 26
Descargar para leer sin conexión
Continuous Media Sharing in
                                   Multimedia Database Systems
                          Mohan Kamathy , Krithi Ramamrithamy and Don Towsleyz
                                          Department of Computer Science
                                            University of Massachusetts
                                                Amherst MA 01003
                                       fkamath,krithi,towsleyg@cs.umass.edu

                                                        Abstract
    The timeliness and synchronization requirement of multimedia data demands e cient bu er
management and disk access schemes for multimedia database systems (MMDBS). The data rates
involved are also very high and despite the development of e cient storage and retrieval strategies,
disk I/O is likely to be a bottleneck, thereby limiting the number of concurrent sessions supported by
a system. This calls for better use of data that has already been brought into the bu er by exploiting
sharing whenever possible using advance knowledge of the multimedia stream to be accessed.
    This paper introduces the notion of continuous media caching which is a simple and novel tech-
nique where bu ers that have been played back by a user are preserved in a controlled fashion for
use by subsequent users requesting the same data. This is shown to have considerable impact on the
performance of bu er management schemes. When continuous media sharing is used in conjunction
with batching of user requests, even better performance improvement can be achieved. The results
of our performance study clearly indicate that schemes that utilize continuous media caching out-
perform existing schemes in environments where sharing is possible. We also discuss the impact of
fast-forward and rewind on continuous media sharing schemes.


Keywords: Bu er Management, Batching, Multimedia Databases, Data Sharing, Optimization,
                 Performance




y   Supported by the National Science Foundation under grant IRI - 9109210.
z   Supported by ARPA under contract F19628-92-C-0089.
Contents
1   Introduction                                                                                                           1
2   Data Characteristics                                                                                                   3
3   Continuous Media Sharing                                                                                               4
4   Bu er Management Schemes                                                                                               6
    4.1 Traditional Scheme - Use And Toss (UAT) : : : : : : : : : : : : : : : : : : : : : : : : : 8
    4.2 Proposed Scheme - Sharing Data (SHR) : : : : : : : : : : : : : : : : : : : : : : : : : : 8
        4.2.1 Criteria to attempt sharing : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 10
5 Batching Schemes                                                                                                         11
    5.1 Batching without Sharing (BAT-UAT) : : : : : : : : : : : : : : : : : : : : : : : : : : : 12
    5.2 Batching with Sharing (BAT-SHR) : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 12
6 Performance Tests                                                                                                        12
    6.1 Experimental Setting & Assumptions : : : : : : : : : : : : : : : : : : : : : : : : : : : : 12
    6.2 Parameters : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 14
    6.3 Metrics : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 14
7 Results                                                                                                                  15
    7.1 Comparison of UAT and SHR based on Disk Bandwidth : : : : : : :            :   :   :   :   :   :   :   :   :   :   15
        7.1.1 Comparison based on Inter Arrival Time : : : : : : : : : : : :       :   :   :   :   :   :   :   :   :   :   15
        7.1.2 Comparison based on Bu er Size : : : : : : : : : : : : : : : :       :   :   :   :   :   :   :   :   :   :   15
        7.1.3 Comparison based on Topic Parameters : : : : : : : : : : : :         :   :   :   :   :   :   :   :   :   :   16
    7.2 Comparison of BAT-UAT and BAT-SHR based on Disk Bandwidth                  :   :   :   :   :   :   :   :   :   :   17
        7.2.1 Comparison based on Inter Arrival Time : : : : : : : : : : : :       :   :   :   :   :   :   :   :   :   :   17
        7.2.2 Comparison based on Bu er Size : : : : : : : : : : : : : : : :       :   :   :   :   :   :   :   :   :   :   17
        7.2.3 Comparison based on Topic Parameters : : : : : : : : : : : :         :   :   :   :   :   :   :   :   :   :   18
        7.2.4 Comparison based on Batching Parameters : : : : : : : : : :          :   :   :   :   :   :   :   :   :   :   19
8 E ect of Fast Forward/Rewind on Continuous Media Caching                                                                 19
9 Extensions to this Work                                                                                                  21
10 Conclusion                                                                                                              21
1 Introduction
Multimedia systems have attracted the attention of many researchers and system developers, because
of the multimedia nature of the data used in modern applications and due to their capability to im-
prove the communication bandwidth between computers and users. These multimedia information
systems cater to a variety of scienti c and commercial elds ranging from medicine to insurance.
Lately the construction of the national information super-highway has also received enormous at-
tention and many commercial ventures are focusing on providing on-demand services. Multimedia
information systems will form the backbone of these services. Multimedia data can be classi ed
as static or dynamic (also known as continuous) media. While static media, like text and images,
do not vary with time (from the perspective of the output device), dynamic media, like audio and
video, vary with time. Whenever continuous media is involved, in order to ensure smooth and mean-
ingful ow of information to the user, timeliness and synchronization requirements are imposed.
Timeliness requires that consecutive pieces of data from a stream be displayed to the user within a
certain duration of time for continuity. Synchronization requires that data from di erent media be
coordinated so that the user is presented with a consistent view. Hence, unlike real-time database
systems where deadlines are associated with individual transactions, multimedia data has contin-
uous deadlines. Thus far, most of the research work in the area of multimedia data management
has concentrated on data modeling and storage mangement. While some of the issues relevant to
MMDBS are enumerated in 4, 5, 15, 20], object-oriented approaches for implementing MMDBS are
suggested in 6, 15, 21, 30]. Some architectural aspects of multimedia information systems like data
placement, bu ering and scheduling are described in 6] while issues related to a distributed multi-
media information system, performance in particular are discussed in 4]. Though e cient storage
and retrieval strategies are being developed 2, 12, 16, 24, 27, 29, 31], since the data rates involved
are very high (around 2 MB/sec of compressed data per request) disk I/O could still be a bottleneck,
i:e:; the number of concurrent sessions supported could be limited by the bandwidth of the storage
system. Hence, it is important to make better use of data brought from the disk into the database
bu er.
     In this paper, we explore the bene ts of continuous media sharing in multimedia database systems
(MMDBS). Speci cally, we consider the News-On-Demand-Service (NODS) which typi es applica-
tions where extensive data sharing is possible and focus on bu er management schemes that promote
sharing of continuous media data. Since multimedia data are typically read only, by sharing the data
between users, many trips to the disks can be avoided. The motivation for our work comes from the
fact that sharing of continuous media data can increase the number of users that can be supported
by a system. Furthermore, there are many interesting issues that arise in such an environment.
     The most important fact is that once the user has requested a playback, advance knowledge
of the multimedia stream to be accessed can be used to anticipate demand and promote sharing
(planned sharing for future accesses). To achieve this, we introduce the notion of continuous media
caching. It is a simple and novel technique where bu ers that have been played back by a user are
preserved in a controlled fashion for use by subsequent (or lagging) users requesting the same data.
This technique can be used when there is su cient bu er space such that it is possible to retain
data for the required duration. With this technique, the system can avoid fetching the data from the

                                                  1
disk again for the lagging user. Hence this technique makes it possible to support large number of
sessions than the number permitted by the disk bandwidth. Also the throughput of the system, i:e:;
the percentage of user requests permitted, can be increased by exercising the less expensive option
of additional memory rather than choosing a storage system with a higher bandwidth. Batching
is another technique to improve the performance of a system by grouping requests that demand
the same topic 7]. By using continuous media caching in conjunction with batching, even better
performance can be achieved. Throughout our discussion, we disregard issues that might arise if
users request fast-forward or rewind. We postpone our discussion on the impact of fast-forward and
rewind to the end of the paper.
    Bu er management schemes have been widely studied in traditional database systems 11, 26] and
in real-time database systems 18]. The objective of these bu er management schemes is to promote
sharing and reduce disk accesses, thereby increasing the throughput. However these schemes typically
consider only textual data. Some aspects of bu er management for multimedia data are addressed
in 2, 6]. While 6] does not speci cally consider concurrent sessions in multimedia systems, 2] does
address some performance issues when concurrent users are trying to access continuous media data.
However it disregards sharing of continuous media data. The Use-And-Toss (UAT) policy has been
suggested in 6] for multimedia data, under which, once a piece of data is used, it is tossed away so
that the next piece of required data is brought into the same bu er. Generally the UAT policy is good
only if data sharing does not exist or cannot be exploited. Essentially, there has not been any research
to evaluate the potential bene ts of continuous media sharing when users are accessing multimedia
data concurrently. Hence we propose a new bu er management scheme, SHR that enhances UAT
with continuous media caching. This scheme should perform very well since it uses past data and
also does planned sharing of future data. To study the bene ts of using continuous media caching
in conjunction with batching, we also study two other schemes BAT-UAT and BAT-SHR, which are
enhancements of UAT and SHR respectively with batching.
    The rest of the paper is organized as follows. In section 2 we discuss the data characteristics of
multimedia data and NODS in particular. Issues related to continuous media sharing are discussed
in section 3. Using simple examples, we illustrate how continuous media sharing can be useful in
increasing the number of users supported by a system. Section 4 discusses the UAT scheme and
its enhancement (SHR scheme) through the use of continuous media caching. Batching schemes for
grouping requests that demand the same data are discussed in section 5. Strategies to improve per-
formance through the use of continuous media caching are also discussed. Details of the performance
tests are discussed in section 6. This section describes the experimental setting, various assumptions
made, parameters and metrics used for the tests. Section 7 interprets the results of the performance
tests. These results show the superior performance of schemes that utilize continuous media caching.
In section 8 we discuss the impact of fast-forward and rewind in continuous media sharing envi-
ronments. Possible extensions to this work are described in section 9. Section 10 concludes with a
summary of this work.




                                                   2
2 Data Characteristics
Multimedia objects are captured using appropriate devices, digitized and edited, until they are in a
form that can be presented to the user. These independently created objects are then linked using
spatial and temporal information, to form a composite multimedia object that may correspond to a
presentation or some speci c application.
     In order to ensure smooth and meaningful ow of information to the user, there are timeliness
and synchronization requirements which vary from application to application. Timeliness requires
that consecutive pieces of data from a stream be displayed to the user within a certain duration
of time for continuity; hence multimedia data has continuous deadlines. Synchronization requires
that data from di erent media be coordinated so that the user is presented with a consistent view.
Essentially it refers to the temporal sequencing of events among multimedia data. It deals with the
interaction of independent objects that coexist on di erent media such that the user is presented
with a consistent view and is very much dependent on the type of application. Detailed analysis of
synchronization requirements for multimedia data can be found in 23, 28].
     The rest of this section describes the data characteristics of NODS, which typi es applications
where extensive sharing of data is possible. In NODS, given that the number of topics in high demand
is likely to be limited at any particular time, given the higher level of interest in certain topics like
sports/business, the probability of one or more users requesting the same item for playback is very
high. To request a service the user has to choose a topic and a language from, for example, the
following:
  1. Topic - Politics, International, Business, Sports, Entertainment, Health, Technology etc.
  2. Language - English, Spanish, Italian, French, German, Hindi etc.
Each topic consists of a list of news items. News items are composed of video and audio segments
and possibly some images. All news items to be displayed when a topic is chosen are compiled in
advance and known apriori.
     Sharing of data can occur at various levels as we discuss below. Typically there are many news
items that are shared between topics. For example, consider the news item on NAFTA1 , which can
be included in a variety of topics like Politics, International and Business because of its relevance to
all of them. Sharing of continuous media can also occur at the media level. Since the same news
items will be provided in di erent languages, to reduce storage, additional storage is necessary only
for the audio segments as the video segments can be shared. In addition, sharing can occur when
multiple users are requesting service on the same topic. This clearly shows that sharing at di erent
levels can lead to compound sharing. In this paper, we mainly focus on the last type of sharing since
it is likely to have the most impact on performance. Intelligent bu er management schemes can and
must utilize this to improve the performance of a system.
     Figure 1 shows the data stream of a request. Because video is often compressed, we assume that
the continuous media data is played out at a variable bit rate. The data rate depends on the amount
  1
      North American Free Trade Agreement

                                                   3
of action that is packed in the video. We assume that a data stream is divided into segments where
each segment is characterized by a xed playout data rate. For example a video stream of a news
reader reading news will require a lower data rate than the video stream from a moving camera (e.g.
sports clippings). In our work, segments are restricted to be of equal length (of 1 second duration)
for ease of implementation.
                                             Time

                                                                                  News Item boundary
                      Data Rate
                                                                                  segment boundary
                                                                                          audio data
                                                                                          video data

                                  Figure 1: Multimedia Data Stream of a request
    However since the data rate of segments can vary, the amount of bu er required by a segment
varies. Since the length of the segment also has an impact on the performance of the schemes we are
going to propose, if a segment is too long, it can be split into smaller segments that have the same
data rate.

3 Continuous Media Sharing
In this section we introduce the concept of continuous media caching which promotes planned sharing.
It is a simple and novel technique where bu ers that have been played back by a user are preserved
in a controlled fashion for use by subsequent users requesting the same data.
     Since the bu er requirements for images and audio are less compared to video (as we will see later
in this section), we ignore them and consider only the requirements of video data. Hence we mainly
focus on sharing that occurs when multiple users request service on the same topic (though they
may have chosen a di erent language). When there is sharing, many possibilities exist depending
upon the arrival time of the customers and we discuss them below. When the same service is being
requested by two (or more) users simultaneously, one of the following could be true:
    1. they arrive at the same time (they start at the same time)
    2. they arrive with a gap of few time units between them
    3. they arrive with a gap of large number of time units between them
     Depending upon the order in which users share an application, they may be classi ed as leading
(the rst one) or lagging (the subsequent ones that can share the bu ers). Using the same bu ers
for both users can de nitely payo in case 1 (since the same data is used at the same time by both
requests), whereas in case 2 it may payo . In case 2, after the data has been played out of the bu er
for the leading user, rather then discarding the data, it can be cached by preserving it in the bu er
so that the same data can be reused by the lagging user (rather than fetching it again from the disk).
We refer to this technique as continuous media caching. In case 3, it is easy to see that continuous
media caching will not payo since preserving bu ers may prove to be too costly in terms of bu er
requirements. In this case it is better to obtain the data for the lagging user directly from the disk
and reuse the data bu ers from the leading user immediately after it has been played back. In our
bu er management scheme, we allow sharing to occur as long as the di erence between the arrival
time of two consecutive users for the same topic is less than a particular value (details are described
                                                       4
in section 4.2.1). Barring information about the future arrivals, this is the earliest time instant when
a sharing situation can be recognized.
    Figure 2 compares the disk and bu er requirements as a function of time for the non-shared
case and the shared case (continuous media caching). Two requests for the same topic (containing 5
segments) that arrive with a time di erence of 2 segments is shown in the gure. In the non-shared
case, disk and bu er bandwidths are allocated independently for each request. However for the
shared case, since it is known that segments three through ve will be available in the bu er (it will
be loaded into the bu er for the rst request) when the second request needs it, the bu ers are reused
for the second request, i:e:; the bu ers are preserved until the corresponding playback time for the
second user. These segments need not be fetched again from the disk for the second request. This
results in an increase in the bu er requirement, but a decrease in the disk bandwidth requirement.
Thus it can be seen that continuous media caching exploits sharing to decrease the number of disk
I/Os and, hence, improve performance. This philosophy concurs with the increasing trend towards
main memory systems since disk transfer rates are typically limited by mechanical movements.
                        Disk Bandwidth for Present Request
                                                                                                  Disk Bandwidth for Present Request
                        Disk Bandwidth for Next Request
                                                                                                  Disk Bandwidth for Next Request
  Bytes/sec




                                                                               Bytes/sec




              Disk Bandwidth for Non−Shared Case                  Time                     Disk Bandwidth for Shared Case                        Time


                                                                                                         Buffers Allocated for Present Request

                                                                                                         New Buffers for Next Request
                      Buffers Allocated for Present Request
                                                                                                         Preserved Buffers for Next Request
                      New Buffers for Next Request
   Bytes




                                                                               Bytes




              Buffer Requirement for Non−Shared Case              Time                      Buffer Requirement for Shared Case                   Time


                                           Figure 2: Bu er Sharing between two requests
                                           1, 2, 3, 4, 5 − Segments Loaded for First User
                                           1,2 − Segments Loaded from disk for Second User
                                           3,4,5 − Segments preserved in buffer for Second User


                                                                     3            3
              Bytes




                                                          1
                                                                                 4          4
                                                                     2
                                                              3
                                            2                                     5
                                                                                            5                 5         Time
                             1                                       4

                            T1             T2             T3        T4           T5         T6               T7

   Figure 3: Segments available in bu er at each time slot (zooming into bottom right of gure 2)
    Figure 3 shows the details of the segments that are actually present in the bu er at each time
instant in the shared case (bottom right case of gure 2). It illustrates how the continuous media
caching technique preserves only the necessary segments for just the required amount of time. It
di erentiates the segments as follows | segments that are loaded for the rst user, segments fetched
                                                                           5
from the disk for the second user and segments that have been used by the rst user but preserved for
the second user. We present an example of a simple system, to illustrate how sharing can be exploited
to support more users. Consider a system which has a main bu er of 1280 MB (Mega Bytes) and
a storage system with a high storage capacity and a data transfer rate of 40 MB/s. MPEG-2 is an
accepted standard for continuous media data storage/transmission and the requirements of di erent
media based on this standard are as follows (in the compressed form): image 100 KB for a frame,
audio 16 KB/s (high quality) and video 2 MB/s. Ignoring the data requirements of audio and
images, we analyze the requirements of a session using the characteristics of its video data on a
segment (of duration 1 second and having same data rate of 2MB/s) basis. The available bu er can
hold 1280MB/(2MB/sec) = 640 seconds of video, which translates to 640 segments. This implies
that the bu er is capable of handling 640 concurrent sessions (since each session needs at least a
segment in memory). Neglecting the seek time (which is typically in the order of 10's of milliseconds)
and based on the data bandwidth of the storage device, the system can support (40MB/s)/(2MB/s)
= 20 concurrent sessions (topics). Thus we see that the number of concurrent sessions supported is
limited to 20 by the bandwidth of the storage system even though the bu er can handle 640 sessions.
However by exploiting sharing, the bu er that can support 640 seconds of video can be e ectively
utilized to satisfy more users, provided they request one of the 20 topics that can be fetched using the
available disk bandwidth. If the 20 topics are shared by u1 ; u2; :::; u20 concurrent users respectively,
then the number of users supported will be P20 u . This is however the best case and the probability
                                                 =1
                                                 i       i

of this is very low, since all the users sharing the sessions arrive at the same time in this scenario.
On the other hand, the worst case occurs when the users requesting the same topic come with larger
inter-arrival times. Continuous media caching may not be possible in this case (due to insu cient
bu er bandwidth), resulting in only 20 users being supported by the system. However if they arrive
within short intervals, instead of fetching the data from the disk for the lagging user, data read from
the disk by the leading user can be preserved for reuse by the lagging user. Thus depending on
the inter-arrival time and other possibilities of sharing (three cases discussed earlier), the number of
users supported will be between 20 and P20 u . This simple example gives an idea of how sharing
                                             i=1     i

can be exploited in a MMDBS for the case of NODS to provide better bu er management. If seek
times were also considered in the above example, continuous media caching will prove to be even
more bene cial.

4 Bu er Management Schemes
This section describes the two bu er management schemes (UAT and SHR) we are going to compare.
First we brie y review traditional bu er management techniques and then discuss the details of UAT
and SHR.
    The objective of the bu er manager is to reduce disk accesses for a given bu er size. Traditional
bu er management schemes for textual data use xed size pages to avoid fragmentation. When the
requested database page is not in the bu er, it is fetched from the disk into bu er. Then the page is
kept in the bu er using the fix operation, so that it is not paged out and future references to the page
do not cause page faults in a virtual memory operating system environment. Once it is determined

                                                         6
that the page is no longer necessary or is a candidate for replacement an unfix is done on that page.
Also in traditional database systems, bu er allocation and replacement are considered orthogonal
issues. A page can be allocated anywhere in the global bu er in a global scheme whereas in a local
scheme, speci c portions of the bu er are allocated for a given session and a page for a session is
allocated in the area allocated for that particular session. Hence in the global scheme there is only
one page replacement algorithm whereas in the local scheme several page replacement algorithms
may exist. In either case the goal of the page replacement algorithm is to minimize the bu er fault
rate. Before allocating a page, the bu er manager checks if the corresponding page is in the bu er
and if not, retrieves it from the disk. While static allocation schemes allocate a xed amount of
bu er to each session when it starts, dynamic allocation schemes allow the bu er requirements to
grow and shrink during the life of the session. In virtual memory operating systems where there
is paging, code pages (which are read only) are typically shared. For example, if many users are
using a particular program like an editor, then the code pages of the editor are shared between the
di erent users. Though this scheme cannot be used as such for continuous media data, it provides
some motivation for our continuous media caching scheme. Some of the common page replacement
strategies used for textual data include FIFO and LRU 11]. These traditional schemes do not have
provision for admission control and resource reservation and hence can not guarantee the success
of a request in multimedia environments. This can result in wastage of resources for requests that
eventually fail. We now discuss two schemes that have admission control and resource reservation
integrated with them. The rst is UAT, a scheme proposed previously for multimedia systems 6]
and SHR, our new scheme that exploits the potential for sharing.
    First we discuss the memory model, admission control and bandwidth reservation scheme. In
the case of a MMDBS, advance knowledge of the data stream to be accessed is available and this
can be utilized for better bu er management. Since we are assuming a large main bu er, we assume
that this information about the topic and segments can be stored in main bu er (i:e:; the number
of segments that make up a topic and the segment data rates for each segment). Bu er space that
is not used exists in a free-pool (this is a global pool). Bu ers needed for the segments of a request
are acquired (analogous to x) from the free-pool and returned (analogous to un x) to the free-pool
when they have been used. When bu ers are requested from the free-pool, the oldest bu er in the
free-pool is granted to the requester. We mentioned earlier that since the video rate dominates
the audio rates, only the requirements of video are considered for sharing. The availability of the
necessary bandwidth for audio is also to be checked by the admission control policy to meet the
synchronization requirements of audio and video streams. The admission control and reservation
scheme is dependent upon the sharing policy. Given (n ? 1) sessions already in progress, the n      th



request is accepted i the following inequalities hold.
                              Xl  n

                                              d          M       8t = 1; ::; s                    (1)
                                                  i;t

                              i=1
                                      Xd  n

                                                         R       8t = 1; ::; s                    (2)
                                                   i;t

                                      =1
                                      i




where d represents the continuous media rate (audio and video) for request i at time slot t, l
        i;t




                                                             7
represents the length of each segment (of 1 second duration), s represents the total number of
segments in the request. The 8t = 1; ::; s condition indicates that this inequality should hold for all
the segments of the request, thereby verifying that there is su cient resources for the entire duration
of the request. The rst equation gives the limiting case for the bu er bandwidth (bu er size = M )
and the second for disk bandwidth (data rate supported = R). If the inequalities hold for a request,
the required bu er and disk bandwidth is reserved and the request is guaranteed to succeed.
4.1 Traditional Scheme - Use And Toss (UAT)
The UAT policy that is traditionally used for MMDBS is described below with some enhancements.
UAT replaces data on a segment basis and it also uses admission control and bandwidth reservation
to anticipate future demand. A FIFO policy is used to decide which of the segments from the free-
pool are to be allocated to a new request, i:e:; when a new request is to be allocated a segment
from the free-pool, the oldest segment in the free-pool is allocated rst. A timestamp mechanism is
used to maintain information about the oldest segment in the free-pool. The UAT policy is ideal for
continuous media data where there is no sharing. UAT reuses segments that may already exist in
bu er, i:e:; if any of the required segments happen to exist in the free-pool then they are grabbed from
the free-pool, thereby avoiding fetching those segments again from the disk. UAT does not consider
sharing explicitly, i:e:; when the segments of a topic have been played back, they are not preserved in
bu er for a future user which might request the same topic. Hence requests from di erent users are
considered independent of each other for admission control and reservation purposes, even if they are
for the same topic. The admission control policy checks for each request separately to see if there is
su cient bu er and disk bandwidth available to fetch and store data for the request on a segment by
segment basis (using inequalities 1 and 2). If the required bandwidth is available, then the required
disk and bu er bandwidths are reserved; else the request is rejected.
4.2 Proposed Scheme - Sharing Data (SHR)
Here we propose a new bu er management scheme which uses the continuous media caching technique
to speci cally exploit the sharing characteristics of MMDBS applications. This scheme (SHR) is a
variation of the UAT scheme we discussed earlier, with enhancements for continuous media caching.
The bu er and disk bandwidth for these schemes are determined by utilizing the technique illustrated
in gure 2 (page 6). We do not discuss details of disk scheduling here. A rate monotonic type of
scheduling is the best, i:e:; since the disk bandwidth will be assured for the various sessions, some
disk bandwidth will be scheduled for each session, corresponding to the ratio of the data rate of the
segment it is fetching to the total demanded data rate.
    Since SHR utilizes continuous media caching, segments used by a leading user are explicitly
preserved for use by the lagging user. This can be explained better using gure 3. The segments
that are explicitly preserved begin with the segment that is being played back by the leading user
when the lagging user enters the system. All segments that follow this are also preserved in the
bu er until the lagging user needs it for playback. (Referring to gure 3, the segments that have
been preserved are 3, 4 and 5). If it happens that some of the segments are in the free-pool (say
segments 1 and 2 are in the free-pool after they have been used and returned by user 1), they can
be reacquired from the free-pool if the bu er space corresponding to those segments has not already

                                                   8
been granted to other sessions. If all the segments (loaded for the rst user prior to this user's
arrival) are available, then no data is to be fetched from the disk for the second user (only bu er
bandwidth is checked/reserved), else data is to be fetched from the disk for those segments that do
not exist in the free-pool(both bu er and disk bandwidth checked/reserved). Thus SHR uses past
and future information about segment access and hence should perform better that UAT as we verify
later in the performance tests. Essentially the performance di erence between UAT and SHR can
be attributed solely to continuous media caching.
                     do while there are requests in the queue
                         select next request ;
                        admitted = false ;
                        For each segment in the topic, check if the segment is already
                        in the free−pool and grab it if it is available ;
                       For each segment grabbed from the free−pool, modify the
                       total profile to reflect preserving of these segments and
                       check for buffer bandwidth violation ;
                        For segments that have to be fetched from the disk, modify
                        the total profile for disk and buffer accordingly and check
                        for buffer/disk bandwidth violation ;
                        For segments that exist in the buffer and are shareable, modify
                        the total profile for buffer and check for bandwidth violation ;
                        if no violation at any stage
                            admitted = true ;
                            /* total profile contains reservation for this request */
                        else
                           restore old total profile ;
                           reject request ;
                        endif
                     end do

                   Figure 4: Admission Control and Reserving future bandwidth
                   do forever
                      do for all active sessions
                        if buffers are shared with another session
                                if leading user
                                      preserve the used buffers for lagging users ;
                                      acquire new buffers from the free pool ;
                                      fetch data from disk into the buffer ;
                                else if lagging user
                                      if there are more lagging users
                                           preserve the used buffers for lagging users ;
                                           read from preserved buffers ;
                                      else
                                           return the used buffers to the free pool ;
                                           read from preserved buffers ;
                                      endif
                                endif
                       else
                               when a new segment is encountered
                                    return the used buffers to the free pool ;
                                    acquire new buffers from the free pool ;
                                    fetch segment from disk;
                       endif
                     end do
                  end do

                        Figure 5: Bu er Allocation & Replacement | SHR
   Figure 4 describes the admission control and bandwidth reservation scheme for SHR. The total
pro le refers to the total requirements of bu er/disk bandwidth. The gure describes the sequence
in which the availability of segments in the bu er is checked along with admission control and
bu er/disk bandwidth reservation. Since SHR considers sharing explicitly, the integrated admission
control and reservation scheme has to be modi ed accordingly. Speci cally, in SHR, the admission
control policy checks if there is su cient disk bandwidth available for segments to be loaded from the
disk (for segments 1 and 2 if they are not available in the free-pool) and su cient bu er bandwidth
for loading these segments in bu er and preserving the other segments (i:e:; loading segments 1 &

                                                      9
2 and preserving segments 3, 4 and 5). If the required bandwidth is available then appropriate disk
and bu er bandwidths are reserved, else sharing is not possible. When this is the case, the request
in considered independently (similar to the UAT scheme) and an attempt is made to reserve the
necessary bu er and disk bandwidth for each request. If this also fails, then the request is rejected.
    Bu er allocation and replacement is to be done in an integrated manner as shown in gure 5.
Bu er allocation is done when a inter segment synchronization point is encountered. Bu ers used
by a segment that has been played back are freed rst and then bu ers needed by the next segment
are allocated. Once the bu ers have been allocated for a segment, they are not freed till the end of
the segment. For every user, at the beginning of a data transfer from disk, new bu ers have to be
allocated. This is achieved by acquiring the necessary number of bu ers from the free pool. Once
data from these bu ers are played back by a user, if there are lagging users, instead of returning them
to the free pool, they are preserved in the bu er. For a lagging user, bu ers need not be acquired
from the free pool, since a disk transfer is not necessary. Instead data is read from the preserved
bu ers after which the bu ers are returned to the free pool. Information about allocated bu ers
can be maintained using a hash table since it is necessary for sharing bu ers. We do not explicitly
consider prefetching of data 26, 27] but if it is considered, then our schemes have to be modi ed
such that prefetching is applied to all or none of the sessions.

4.2.1 Criteria to attempt sharing
Another important tradeo related to continuous media caching is how to decide if it is advantageous
to share the data with another user. There are several factors that determine this. For example, if
the di erence between the arrival times of the leading and lagging users is not much then clearly this
technique pays o because in this case, bu ers needed by lagging users are not retained for a long
time. However if there is a large di erence in the arrival times, then the amount of time for which
the respective segments have to be carried over (cached) becomes large. Essentially this means that
more bu er space will be occupied to enable sharing between users. This may cause many other
requests to fail for lack of bu er space. Hence we performed some empirical tests to determine the
point beyond which there is no payo (i:e:, the maximum di erence in the arrival time beyond which
continuous media caching should not be used). Our tests indicate that continuous media caching
pays o as long as it is used when the inter arrival time (iat) is less than maximum tolerable inter
arrival time for sharing (max-iat) . Other than the arrival time, there are several other factors that
determine the value of max-iat as shown below.
                      max iat = C Mean Arrival TimeData Rate Size
                                            Topic Length
                                                              Memory                               (3)
                                                           iat      max iat                        (4)
If more memory is available, then segments can be cached for a longer duration of time and, hence,
max-iat is directly proportional to memory size. On the other hand, if the topic length is large or the
data rate is high, it is di cult to cache segments for a longer duration since the bu er space needed
will be too high. Hence max-iat is inversely proportional to the topic length and the data rate. The
constant C can be determined via empirical tests. Developing better heuristics to determine the
                                                  10
criteria for sharing can be a separate area of research and hence we do not go too much into the
details. Thus, for the SHR scheme, we have used inequality (4) as a basis for deciding when the
requests are to be shared. If iat is larger than max-iat then the request is satis ed using the UAT
policy. It should be noted that in all these cases, a request is considered for scheduling as soon as it
is received (zero response time).

5 Batching Schemes
In order to achieve better performance out of an existing system, i:e:; support more number of
concurrent users, batching can be used 7]. Batching tries to group requests that arrive within a
short duration of time for the same topic. Our objective is to explore how batching can enhance the
performance of continuous media caching schemes.
    There are a few tradeo s in the design of batching schemes. The response time of a request
is the time interval between the instant it enters the system and the instant at which it is served
(i:e:; the rst segment of the request is played back). Since customers will be inclined to choose a
service with a faster response time, it is important to study the impact of response time on batching
and sharing schemes. Thus the success of a scheme can be measured in terms of the number of
requests that succeed2 . The factors that determine the actual response time of a request include
the scheduling policy and the scheduling period. The scheduling period is the time interval between
two consecutive invocations of the scheduler. At each invocation of the scheduler (scheduling point),
requests for the same topic are batched and scheduled according to the scheduling policy. If the
tolerable response time is small, then it is necessary to have smaller scheduling periods or else the
percentage of successful requests will be low. However for better batching" (maximize sharing), the
scheduling period has to be large. Hence, as a compromise, some intermediate value has to be chosen
such that batching can be achieved with smaller response times. Next, we brie y discuss some of
the scheduling policies. The scheduling policy determines the order in which the requests are to be
considered (i:e:; if the admission control is successful, then the required resources are allocated) at
each scheduling point. Some of these batching policies have been studied for a video-on-demand
server in 7]. In the First-Come-First-Serve (FCFS) policy, requests for the same topic are batched
and the topic which has the earliest request is considered rst and so on. In the Maximum-Group-
Scheme (MGS) policy, requests for the same topic are batched and the topic which has the maximum
number of requests is considered rst. From the point of view of continuous media caching, other
variations are possible where the topic which was scheduled most recently can be scheduled again to
promote sharing (i:e:; schedule the topic with the smallest time di erence between now and when
the topic was scheduled last). While other policies favor topics that are requested more frequently,
the FCFS policy is fair and is recommended in 7]. For our study, we consider the FCFS batching
scheme.
    Having studied some of the issues related to batching, we now describe how continuous media
caching can be used in conjunction with batching to improve sharing. This results in considerable
   2
    a request succeeds if the required resources are allocated for the entire request and the actual response time does
not exceed the maximum tolerable response time


                                                          11
savings in the resources required to satisfy a given workload as we will see later in the performance
tests. For comparison purposes, we describe two batching schemes that use the FCFS policy for
scheduling but use di erent bu er management schemes.

5.1 Batching without Sharing (BAT-UAT)
BAT-UAT is the FCFS batching scheme that uses the UAT bu er management scheme. Since
UAT does not use continuous media caching, in BAT-UAT data sharing is achieved purely through
the e ect of batching. Data sharing possibilities cannot be fully exploited purely through batching
because of the dependency between scheduling period and response time which we discussed earlier.

5.2 Batching with Sharing (BAT-SHR)
BAT-SHR is the FCFS batching scheme that uses the SHR bu er management scheme. Here the
e ect of sharing is achieved in two di erent ways | through batching and continuous media caching.
Since BAT-SHR uses the same scheduling period as BAT-UAT, the response time will remain the
same or improve (since more resources will be available due to the cumulative e ects). However
the extra data sharing that could not be achieved by batching is achieved via continuous media
caching i:e; requests for the same topic may have been scheduled in di erent neighboring scheduling
points, but sharing is achieved through continuous media caching when the neighboring points are
within max-iat. Thus the di erence in performance between BAT-UAT and BAT-SHR can also be
attributed solely to continuous media caching.

6 Performance Tests
We have performed a number of tests to compare the di erent schemes and evaluate the bene ts
of continuous media caching. In this section we discuss our experimental setting, a few of our
assumptions and then describe the various parameters and metrics used for the tests.

6.1 Experimental Setting & Assumptions
We assume a client-server type of architecture shown in gure 6. A DMA transfer is assumed between
the disk and the bu er at the server and hence the CPU is not likely to be a bottleneck. So the
CPU has not been modeled in the performance tests. It is assumed that the network between the
client and the server provides certain quality-of-service (QOS) guarantees about the network delay
and bandwidth. A detailed discussion of reliability and networking issues for continuous media can
be found in 3]. We will assume a two level storage system: primary storage (main memory bu er)
and secondary storage (disk). Knowing the maximum seek time, rotational latency and data read
rate, the e ective transfer rate, R can be determined for a storage system. This is used in performing
bandwidth allocations for the disk. The secondary storage device could be a magnetic/optical disk or
even a RAID based system for increased bandwidth. We do not go into the details of how multimedia

                                                 12
data is organized and retrieved from the secondary storage in an e cient manner since there has
been extensive research in this area 2, 16, 24, 27, 29, 31] and it is orthogonal to the main focus
of this work. Since the data rate for video dominates, we assume that requests desiring the same
topic with di erent language options can share the segments. We also assume that segments are of
equal length (this simpli es our implementation for the performance tests but does not a ect the
results).
                                                                 Database


                                                        Server




                          Client      Client
                                                                              Client
                          Disk                                                             Client
                                                                                           Disk




                                               Client                Client
                                   Client                                         Client
                                   Disk                                           Disk


                                   Figure 6: Con guration of the System
    Admission control and bandwidth reservation is performed on a segment basis for each request and
if the bandwidth is insu cient at any point, the request is rejected. The bu er and disk bandwidth
checks are performed for the entire request at admission time, i:e:, even for segments that will have
to be played at later point in time. Before a segment is placed in the bu er, the following actions are
taken. First the bu er is searched to see if the requested segment exists in free-pool. If it does not
exist then the required bu er space is acquired from the free-pool. If this cannot be done because
the available bu er space in the free-pool is not su cient, then the request is rejected and it counts
as a reject due to insu cient bu er bandwidth. When segments have to be fetched from the disk,
if su cient bu er space is available then the disk bandwidth is checked for su ciency. If the disk
bandwidth is su cient then the request proceeds. If all the segments can be brought in, then the
request is successful else the request is rejected and it counts as a reject due to insu cient disk
bandwidth.
    For the batching schemes, at each scheduling point, the requests are grouped based on the
demanded topic. Since we use a FCFS scheme, the grouped requests are considered one by one
based on the earliest arrival time of a request in the group. A request that is considered undergoes
the admission control and bandwidth reservation described above. If the required resources are
available and allocated, then all the requests for that topic are successful. If this request cannot
be satis ed, then the remaining requests are considered for that scheduling point. If a particular
request cannot be satis ed at a scheduling point, it will be tried again at the next point and so on
as long as the response time requirements are met. Hence a request that could not be satis ed at a
particular scheduling point may be successful at the next point. If the actual response time increases
the maximum tolerable, then the request is no longer considered for scheduling and is rejected.




                                                            13
6.2 Parameters
The parameters to be considered fall under four di erent categories | customer, system, topic and
batching. They are listed below in the table with their default values. Topic details (length, data
rate, etc.) are generated using a uniform distribution. Customer arrival is modeled using Poisson
distribution. The topic chosen by a customer is modeled using Zipf's law 32]. According to this, if
the topics 1; 2; :::; n are ordered such that p1 p2 ::: p then p = c=i where c is a normalizing
                                                            n        i


constant and p represents the probability that a customer chooses topic i. This distribution has
               i


been shown to closely approximate real user behavior in videotex systems 14] which are similar to
on-demand services.
               Category Parameter                  Setting (Default Values)
             Customer    Inter-Arrival Time       Poisson Distribution with mean of 10 sec
                         Topic Chosen             Zipf's Distribution (1-10)
             System      Bu er Size               1.28 GB
                         Disk Rate                < determined based on 95 % success rate>
             Topic       Total Number             10
                         Length                   Uniform Distribution (500-700) sec
                         Data Rate of Segment     Uniform Distribution (1-2) MB/sec
             Batching    Maximum Response Time 40 sec
                         Scheduling Period     10 sec

6.3 Metrics
A popular metric for these tests would be the percentage of successful requests. But this alone is
not su cient for system designers since they are more interested in metrics that are closely tied to
QOS guarantees. An example of this would be the maximum arrival rate that can be sustained if
95 % of the requests are to succeed given certain resources. Another metric would be the resource
requirements for a 95 % success rate give certain workload. Evaluating metrics that are based on
QOS guarantees is a time consuming process since it is iterative in nature. For example, to evaluate
the disk bandwidth that ensures a 95 % success rate when the inter-arrival time is 10 sec, starting
from a small value of disk bandwidth (for which the success rate may be lower than 95 %), the disk
bandwidth has to be incremented till the success rate reaches 95 %.
    For all our tests, the metric we use is the disk bandwidth needed to guarantee that 95 % of the
requests will be successful. For the bu er management schemes, since we do not batch the requests,
95 % of the requests are to be satis ed with a zero response time. A non-zero value like 10 seconds
can also be used as the tolerable response time but since we do not batch requests, this might not
bene cial. Response time is more relevant for the batching schemes, since the scheduling is done
at periodic intervals. A tolerable response time is speci ed for these tests and 95 % of the requests
should have response times smaller than the tolerable response time.



                                                 14
7 Results
Each simulation consisted of 25 iterations and each iteration simulated 2000 customers. The con -
dence interval tests of the results indicate that 95 % of the results (percentage of successful requests)
are within 3 % of the mean in almost all cases. The results are discussed based on the disk bandwidth
required to guarantee that 95 % of the requests will succeed. All parameters are set to their default
values speci ed earlier unless explicitly indicated in the graphs. Topic details (the database accessed
by users) are generated for every iteration. We rst compare the results of the bu er management
schemes (UAT and SHR) and then compare the batching schemes (BAT-UAT and BAT-SHR). The
area below the curves are lled in light gray for schemes that do not use continuous media caching
and in dark gray (which blots out the light gray) for schemes that use continuous media caching.
Hence in all the graphs, the light gray region shows the bene t from continuous media caching.
7.1 Comparison of UAT and SHR based on Disk Bandwidth
This section compares the performance of UAT and SHR based on the various parameters we men-
tioned earlier. We analyze the behavior of the two schemes and discuss how continuous media caching
helps to improve the performance of the system.

7.1.1 Comparison based on Inter Arrival Time
Figure 7(a) shows the e ect of inter-arrival time on the disk bandwidth requirement. For smaller
values of inter-arrival time, higher disk bandwidths are needed to satisfy the requests. The disk
bandwidth requirement drops gradually as the inter arrival time increases. SHR requires about 50-
70 % of the bandwidth required by UAT depending upon the inter-arrival time. This clearly shows
the ability of continuous media caching schemes to improve the number of concurrent requests that
can be handled by a system.

7.1.2 Comparison based on Bu er Size
Figure 7(b) studies the e ect of bu er size on the performance of the schemes (a small portion of
the UAT curve is masked by the SHR curve at the left). In general if the bu er space available is
low, all bu er mangement schemes experience poor performance. Furthermore, to exploit sharing,
SHR needs a sizeable amount of bu er. It can be observed that at lower bu er sizes of 400-600 MB,
SHR is not able to exploit sharing as much and hence requires disk bandwidth comparable to that of
UAT. However, as the bu er size available increases (600MB-1.28GB), SHR uses the extra bu er to
exploit sharing and hence begins to outperform UAT. The disk bandwidth requirement drops down
to almost half of what is required by UAT. It can be noted that UAT does not utilize the available
bu er resources to the maximum whereas SHR does. Hence UAT requires a constant disk bandwidth
of 101 MB/s while the disk bandwidth for SHR reduces steadily with an increase in bu er size. This
graph is also more important since it gives the tradeo s in the system resources (bu er and disk)
required by the di erent schemes for 95 % success given a certain workload. Essentially the UAT
scheme is preferable when the available bu er size is small and if more bu er is available SHR can

                                                   15
be used.
                                               300                                                                               160




                                                     |




                                                                                                                                       |
                                                                                                                                                     UAT
                                                                                                                                 140




                                                                                                                                       |
           Disk Rate (MB/s) for 95 % success




                                                                                             Disk Rate (MB/s) for 95 % success
                                               250                                                                                                   SHR

                                                     |
                                                                           UAT
                                                                                                                                 120




                                                                                                                                       |
                                                                           SHR
                                               200
                                                     |
                                                                                                                                 100




                                                                                                                                       |
                                               150                                                                               80
                                                     |




                                                                                                                                       |
                                                                                                                                 60




                                                                                                                                       |
                                               100
                                                     |




                                                                                                                                 40




                                                                                                                                       |
                                               50
                                                     |




                                                                                                                                 20




                                                                                                                                       |
                                                0|        | | | | | | | |                                                         0|   |    |    |    |     |  |
                                                     |




                                                                                                                                       |
                                                 0       5 10 15 20 25 30 35 40                                                   320 480 640 800 960 1120 1280
                                                         (a) Inter-Arrival Time (sec)                                                     (b) Buffer Size (MB)
                                                           Figure 7: E ect of Inter Arrival Time and Bu er Size
7.1.3 Comparison based on Topic Parameters
The next comparison is based on the number of topics available. This test is important since the
probability of locating a required segment in memory with or without planning depends upon the
number of topics, i:e:; lesser the number of topics, higher the probability of nding a required segment.
It can be observed in gure 8(a) that the increase in bandwidth is steep initially and then becomes
more gradual. This behavior can be attributed to the Zipf's distribution we have chosen to model
the customer's topic selection. If a uniform distribution is chosen, the increase in disk bandwidth
may be steeper throughout. The main observation however is that SHR continues to dominate the
performance (with lesser disk bandwidth requirement) even with an increasing number of topics.
The bene t from continuous media caching is highly evident when there are fewer topics.
    Figure 8(b) explores the e ect of the length of the topic on the performance of the di erent
schemes. This is important due to the fact that increasing the length of the topic results in a
request taking longer time to complete and hence the number of concurrent customers in the system
increases, which in turn demands increased bu er and disk bandwidth. Thus the disk bandwidth
requirements of the schemes increases with an increase in length of topic. However if SHR is used,
some of the concurrent sessions are shared and hence the disk bandwidth requirement for SHR is
less than that of UAT.
    Another important parameter to be considered is the average data rate of the segments. Based on
the MPEG-2 standards, the average data rate for a compressed high quality video is around 2MB/s.
Hence we have performed tests for average data rate ranging from 2-5MB/s. As expected, we note
from gure 8(c) that with increasing data rates, the disk bandwidth requirement increases. However
we observe that SHR requires lesser disk bandwidth compared to UAT. As the data rate increases,
the gap between UAT and SHR tends to narrow down since more bu er space is needed for caching
                                                                                        16
larger amount of data and hence data has to be brought from the disk for some of the requests.
                                    130                                                                    160                                                                   340
                                          |




                                                                                                                 |




                                                                                                                                                                                       |
                                    120                                                                                                                                                               UAT
                                          |

                                                   UAT                                                                        UAT
                                                                                                           140                                                                                        SHR




                                                                                                                 |
                                                                                                                                                                                 300
Disk Rate (MB/s) for 95 % success




                                                                       Disk Rate (MB/s) for 95 % success




                                                                                                                                             Disk Rate (MB/s) for 95 % success
                                    110




                                                                                                                                                                                       |
                                                                                                                              SHR
                                          |



                                                   SHR
                                    100
                                          |




                                                                                                           120




                                                                                                                 |
                                                                                                                                                                                 260




                                                                                                                                                                                       |
                                    90
                                          |




                                    80                                                                     100




                                                                                                                 |
                                          |




                                                                                                                                                                                 220




                                                                                                                                                                                       |
                                    70
                                          |




                                                                                                           80




                                                                                                                 |
                                    60
                                          |




                                                                                                                                                                                 180




                                                                                                                                                                                       |
                                    50                                                                     60
                                          |




                                                                                                                 |
                                    40
                                          |




                                                                                                                                                                                 140




                                                                                                                                                                                       |
                                                                                                           40
                                                                                                                 |
                                    30
                                          |




                                    20                                                                                                                                           100
                                          |




                                                                                                                                                                                       |
                                                                                                           20
                                                                                                                 |
                                    10
                                          |




                                     0| | | | | | | | | | |                                                 0|   |     |   |   |    |    |   |                                    60 |      |         |         |
                                          |




                                                                                                                 |




                                                                                                                                                                                       |
                                      0 5 10 15 20 25 30 35 40 45 50                                        100 200 300 400 500 600 700 800                                          2      3         4         5
                                             (a) Number of Topics                                                  (b) Length of Topic (sec)                                               (c) Data Rate (MB/s)
                                                                    Figure 8: E ect of Topic Parameters

                                    7.2 Comparison of BAT-UAT and BAT-SHR based on Disk Bandwidth
                                    The batching schemes also behave similar to the bu er management schemes, the main di erence
                                    being that requests for the same topic are grouped and hence the overall disk bandwidth requirements
                                    is less in all cases. We also discuss results of some additional tests which are based on the batching
                                    parameters, i:e:; tolerable response time and scheduling period.

                                    7.2.1 Comparison based on Inter Arrival Time
                                    The e ect of inter-arrival time on the two batching schemes is shown in gure 9(a). Similar to the
                                    behavior we observed in gure 7(a) we see that BAT-SHR requires lesser disk bandwidth compared
                                    to BAT-UAT. Another important point to notice is that the disk bandwidth requirement has almost
                                    halved and this indicates that inter-arrival time has the most impact on data sharing through batching
                                    than any other parameter.

                                    7.2.2 Comparison based on Bu er Size
                                    It can be observed from gure 9(b) that the disk bandwidth requirement of BAT-SHR is less than
                                    BAT-UAT for all the bu er sizes we have tested for. This di ers from our observation in the case of
                                    a non batched system where UAT performed better than SHR for smaller bu er sizes. Here we have
                                    a slight departure from the behavior of gure 7(b) in the sense that the disk bandwidth required for
                                    BAT-SHR is always less than BAT-UAT and this can be attributed to batching. We also observe
                                    that the overall disk bandwidth requirements has dropped but not as much as we noticed in the case
                                    of the inter-arrival time.

                                                                                                                         17
140




                                                                                                                                                                                                           |
                                                                                  150




                                                                                        |
                                                                                                                                                                                                     120


                                              Disk Rate (MB/s) for 95 % success




                                                                                                                                                                 Disk Rate (MB/s) for 95 % success

                                                                                                                                                                                                           |
                                                                                                                                                                                                                                                             BAT-UAT
                                                                                                                                             BAT-UAT                                                                                                         BAT-SHR
                                                                                                                                                                                                     100




                                                                                                                                                                                                           |
                                                                                                                                             BAT-SHR
                                                                                  100
                                                                                        |
                                                                                                                                                                                                     80




                                                                                                                                                                                                           |
                                                                                                                                                                                                     60




                                                                                                                                                                                                           |
                                                                                  50
                                                                                        |



                                                                                                                                                                                                     40




                                                                                                                                                                                                           |
                                                                                                                                                                                                     20




                                                                                                                                                                                                           |
                                                                                   0|        | | | | | | | |                                                                                          0|   |    |    |     |    |   |
                                                                                        |




                                                                                                                                                                                                           |
                                                                                    0       5 10 15 20 25 30 35 40                                                                                    320 480 640 800 960 1120 1280
                                                                                            (a) Inter-Arrival Time (sec)                                                                                       (b) Buffer Size (MB)
                                                                                              Figure 9: E ect of Inter Arrival Time and Bu er Size

                                    130                                                                                                    140                                                                                                     300
                                          |




                                                                                                                                                 |




                                                                                                                                                                                                                                                         |
                                    120
                                          |




                                                                                            BAT-UAT                                                           BAT-UAT                                                                                                   BAT-UAT
                                                                                                                                           120
Disk Rate (MB/s) for 95 % success




                                                                                                       Disk Rate (MB/s) for 95 % success




                                                                                                                                                                                                               Disk Rate (MB/s) for 95 % success
                                    110
                                                                                                                                                 |




                                                                                                                                                                                                                                                   260
                                          |




                                                                                            BAT-SHR                                                           BAT-SHR




                                                                                                                                                                                                                                                         |
                                                                                                                                                                                                                                                                        BAT-SHR
                                    100
                                          |




                                                                                                                                           100
                                                                                                                                                 |




                                    90
                                          |




                                                                                                                                                                                                                                                   220

                                                                                                                                                                                                                                                         |
                                    80
                                          |




                                                                                                                                           80
                                                                                                                                                 |




                                    70
                                          |




                                                                                                                                                                                                                                                   180   |
                                    60
                                          |




                                                                                                                                           60
                                                                                                                                                 |




                                    50
                                          |




                                                                                                                                                                                                                                                   140
                                                                                                                                                                                                                                                         |



                                    40
                                          |




                                                                                                                                           40
                                                                                                                                                 |




                                    30
                                          |




                                    20                                                                                                                                                                                                             100
                                                                                                                                                                                                                                                         |




                                                                                                                                           20
                                          |




                                                                                                                                                 |




                                    10
                                          |




                                     0| | | | | | | | | | |                                                                                 0|   |     |   |   |    |    |   |                                                                     60 |       |         |         |
                                          |




                                                                                                                                                 |




                                                                                                                                                                                                                                                         |




                                      0 5 10 15 20 25 30 35 40 45 50                                                                        100 200 300 400 500 600 700 800                                                                           2       3         4         5
                                             (a) Number of Topics                                                                                  (b) Length of Topic (sec)                                                                                 (c) Data Rate (MB/s)
                                                                                                       Figure 10: E ect of Topic Parameters

                                    7.2.3 Comparison based on Topic Parameters
                                    The e ects of the topic parameters is shown in gure 10(a),(b) &(c). Even here we observe a behavior
                                    similar to that of gure 8. Again we see that the disk bandwidth requirement has dropped due to
                                    the e ect of batching. We also observe the e ect of continuous media caching in conjunction with
                                    batching in gure 10(a). The disk bandwidth requirement for BAT-SHR does not increase even when
                                    the number of topics is increased from 10 to 50 which is di erent from what we observed in the case
                                    of SHR in gure 8(a). Essentially batching contributes to additional data sharing and hence the
                                    reduction in the disk bandwidth requirement.
                                                                                                                                                         18
7.2.4 Comparison based on Batching Parameters
The e ect of the maximum tolerable response time on the two batching schemes is shown in gure
11(a). Here we observe that the drop in disk bandwidth requirement for BAT-SHR is more steeper
compared to BAT-UAT. This clearly indicates that continuous media caching in conjunction with
batching tries to e ectively utilize the increase in tolerable response time.
    Figure 11 (b) indicates more precisely how continuous media caching increases the e ect of
sharing. While it can be observed that at low scheduling intervals the e ect of sharing is not
completely achieved by BAT-UAT through batching, BAT-SHR achieves this by using continuous
media caching in conjunction with batching. Hence the disk bandwidth required for BAT-SHR is
almost the same irrespective of the scheduling period while BAT-UAT requires relatively higher disk
bandwidth at smaller scheduling periods. As discussed earlier, in order to satisfy the response time
requirements, it might be necessary to have smaller scheduling periods. To have better performance
(lesser disk bandwidth) at smaller scheduling periods, BAT-SHR scores over BAT-UAT.
                                              130                                                                            130
                                                    |




                                                                                                                                   |
                                              120                                                                            120
                                                    |




                                                                                                                                   |
                                                                     BAT-UAT                                                                           BAT-UAT
          Disk Rate (MB/s) for 95 % success




                                                                                         Disk Rate (MB/s) for 95 % success
                                              110                                                                            110
                                                    |




                                                                                                                                   |
                                                                     BAT-SHR                                                                           BAT-SHR
                                              100                                                                            100
                                                    |




                                                                                                                                   |
                                              90                                                                             90
                                                    |




                                                                                                                                   |
                                              80                                                                             80
                                                    |




                                                                                                                                   |

                                              70                                                                             70
                                                    |




                                                                                                                                   |


                                              60                                                                             60
                                                    |




                                                                                                                                   |




                                              50                                                                             50
                                                    |




                                                                                                                                   |




                                              40                                                                             40
                                                    |




                                                                                                                                   |




                                              30                                                                             30
                                                    |




                                                                                                                                   |




                                              20                                                                             20
                                                    |




                                                                                                                                   |




                                              10                                                                             10
                                                    |




                                                                                                                                   |




                                               0|     |     |   |    |    |     |                                             0|        |    |   |   |    |    |   |
                                                    |




                                                                                                                                   |




                                                10 15 20 25 30 35 40                                                           5       10 15 20 25 30 35 40
                                               (a) Tolerable Response Time (sec)                                                       (b) Scheduling Period (sec)
                                                             Figure 11: E ect of Batching Parameters
   Thus using these tests we have been able to demonstrate that in multimedia information systems
where sharing can be exploited, schemes can and must be designed such that they make use of this
sharing potential based on information about past and future access to perform more e cient bu er
management.


8 E ect of Fast Forward/Rewind on Continuous Media Caching
In this section we brie y discuss the e ects of fast forward and rewind while continuous media caching
is used. Support for fast forward and rewind (which are commonly known as VCR functions) is a
topic that is being researched 8]. Concrete solutions to this problem have not yet been proposed.

                                                                                    19
;;
;;
;;
;;
;;

Más contenido relacionado

Destacado

Primeros uxilios diapositivas
Primeros uxilios diapositivasPrimeros uxilios diapositivas
Primeros uxilios diapositivas
jhovannito22
 

Destacado (11)

Valu inlaga 1991-2010
Valu inlaga 1991-2010Valu inlaga 1991-2010
Valu inlaga 1991-2010
 
Bangladesh Project
Bangladesh Project Bangladesh Project
Bangladesh Project
 
Primeros uxilios diapositivas
Primeros uxilios diapositivasPrimeros uxilios diapositivas
Primeros uxilios diapositivas
 
Kf 161115, bilaga ärende 17 budgetförslag
Kf 161115, bilaga ärende 17 budgetförslagKf 161115, bilaga ärende 17 budgetförslag
Kf 161115, bilaga ärende 17 budgetförslag
 
Mal o-inriktningsdokument-2015-2019 Varberg
Mal o-inriktningsdokument-2015-2019 VarbergMal o-inriktningsdokument-2015-2019 Varberg
Mal o-inriktningsdokument-2015-2019 Varberg
 
White lily residency sonepat,sector 27 9350193692
White lily residency sonepat,sector 27 9350193692White lily residency sonepat,sector 27 9350193692
White lily residency sonepat,sector 27 9350193692
 
4 dr.rana
4  dr.rana4  dr.rana
4 dr.rana
 
Responding to extended writing electronically v3
Responding to extended writing electronically v3Responding to extended writing electronically v3
Responding to extended writing electronically v3
 
Developing Your Business In Asia: Country Briefings
Developing Your Business In Asia: Country BriefingsDeveloping Your Business In Asia: Country Briefings
Developing Your Business In Asia: Country Briefings
 
Content Marketing 2.0: How Content Promotion Can REvitalize Your Brand
Content Marketing 2.0: How Content Promotion Can REvitalize Your BrandContent Marketing 2.0: How Content Promotion Can REvitalize Your Brand
Content Marketing 2.0: How Content Promotion Can REvitalize Your Brand
 
How to Design For Mobile: 10 Easy Steps You Can Do Now
How to Design For Mobile: 10 Easy Steps You Can Do NowHow to Design For Mobile: 10 Easy Steps You Can Do Now
How to Design For Mobile: 10 Easy Steps You Can Do Now
 

Similar a ;;

Pivotal gem fire_wp_hardest-problems-data-management_053013
Pivotal gem fire_wp_hardest-problems-data-management_053013Pivotal gem fire_wp_hardest-problems-data-management_053013
Pivotal gem fire_wp_hardest-problems-data-management_053013
EMC
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
Antonio Severien
 
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
IJET - International Journal of Engineering and Techniques
 
Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...
Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...
Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...
1crore projects
 
Mobile Web Browsing Based On Content Preserving With Reduced Cost
Mobile Web Browsing Based On Content Preserving With Reduced CostMobile Web Browsing Based On Content Preserving With Reduced Cost
Mobile Web Browsing Based On Content Preserving With Reduced Cost
Eswar Publications
 
Thesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signedThesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signed
Cheng Guo
 

Similar a ;; (20)

Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
 
Novel hybrid framework for image compression for supportive hardware design o...
Novel hybrid framework for image compression for supportive hardware design o...Novel hybrid framework for image compression for supportive hardware design o...
Novel hybrid framework for image compression for supportive hardware design o...
 
VoD Solutions
VoD SolutionsVoD Solutions
VoD Solutions
 
TR14-05_Martindell.pdf
TR14-05_Martindell.pdfTR14-05_Martindell.pdf
TR14-05_Martindell.pdf
 
Pivotal gem fire_wp_hardest-problems-data-management_053013
Pivotal gem fire_wp_hardest-problems-data-management_053013Pivotal gem fire_wp_hardest-problems-data-management_053013
Pivotal gem fire_wp_hardest-problems-data-management_053013
 
Dynamic Chunks Distribution Scheme for Multiservice Load Balancing Using Fibo...
Dynamic Chunks Distribution Scheme for Multiservice Load Balancing Using Fibo...Dynamic Chunks Distribution Scheme for Multiservice Load Balancing Using Fibo...
Dynamic Chunks Distribution Scheme for Multiservice Load Balancing Using Fibo...
 
Big Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveBig Data Analytics: Architectural Perspective
Big Data Analytics: Architectural Perspective
 
sac10serviso
sac10servisosac10serviso
sac10serviso
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
IRJET- Improving Data Availability by using VPC Strategy in Cloud Environ...
IRJET-  	  Improving Data Availability by using VPC Strategy in Cloud Environ...IRJET-  	  Improving Data Availability by using VPC Strategy in Cloud Environ...
IRJET- Improving Data Availability by using VPC Strategy in Cloud Environ...
 
Public integrity auditing for shared dynamic cloud data with group user revoc...
Public integrity auditing for shared dynamic cloud data with group user revoc...Public integrity auditing for shared dynamic cloud data with group user revoc...
Public integrity auditing for shared dynamic cloud data with group user revoc...
 
P2P Video-On-Demand Systems
P2P Video-On-Demand SystemsP2P Video-On-Demand Systems
P2P Video-On-Demand Systems
 
Performing initiative data prefetching
Performing initiative data prefetchingPerforming initiative data prefetching
Performing initiative data prefetching
 
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
 
Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...
Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...
Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revoc...
 
etd7288_MHamidirad
etd7288_MHamidiradetd7288_MHamidirad
etd7288_MHamidirad
 
Mobile Web Browsing Based On Content Preserving With Reduced Cost
Mobile Web Browsing Based On Content Preserving With Reduced CostMobile Web Browsing Based On Content Preserving With Reduced Cost
Mobile Web Browsing Based On Content Preserving With Reduced Cost
 
An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...
An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...
An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...
 
An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...
An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...
An Approach to Reduce Energy Consumption in Cloud data centers using Harmony ...
 
Thesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signedThesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signed
 

;;

  • 1. Continuous Media Sharing in Multimedia Database Systems Mohan Kamathy , Krithi Ramamrithamy and Don Towsleyz Department of Computer Science University of Massachusetts Amherst MA 01003 fkamath,krithi,towsleyg@cs.umass.edu Abstract The timeliness and synchronization requirement of multimedia data demands e cient bu er management and disk access schemes for multimedia database systems (MMDBS). The data rates involved are also very high and despite the development of e cient storage and retrieval strategies, disk I/O is likely to be a bottleneck, thereby limiting the number of concurrent sessions supported by a system. This calls for better use of data that has already been brought into the bu er by exploiting sharing whenever possible using advance knowledge of the multimedia stream to be accessed. This paper introduces the notion of continuous media caching which is a simple and novel tech- nique where bu ers that have been played back by a user are preserved in a controlled fashion for use by subsequent users requesting the same data. This is shown to have considerable impact on the performance of bu er management schemes. When continuous media sharing is used in conjunction with batching of user requests, even better performance improvement can be achieved. The results of our performance study clearly indicate that schemes that utilize continuous media caching out- perform existing schemes in environments where sharing is possible. We also discuss the impact of fast-forward and rewind on continuous media sharing schemes. Keywords: Bu er Management, Batching, Multimedia Databases, Data Sharing, Optimization, Performance y Supported by the National Science Foundation under grant IRI - 9109210. z Supported by ARPA under contract F19628-92-C-0089.
  • 2. Contents 1 Introduction 1 2 Data Characteristics 3 3 Continuous Media Sharing 4 4 Bu er Management Schemes 6 4.1 Traditional Scheme - Use And Toss (UAT) : : : : : : : : : : : : : : : : : : : : : : : : : 8 4.2 Proposed Scheme - Sharing Data (SHR) : : : : : : : : : : : : : : : : : : : : : : : : : : 8 4.2.1 Criteria to attempt sharing : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 10 5 Batching Schemes 11 5.1 Batching without Sharing (BAT-UAT) : : : : : : : : : : : : : : : : : : : : : : : : : : : 12 5.2 Batching with Sharing (BAT-SHR) : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 12 6 Performance Tests 12 6.1 Experimental Setting & Assumptions : : : : : : : : : : : : : : : : : : : : : : : : : : : : 12 6.2 Parameters : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 14 6.3 Metrics : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 14 7 Results 15 7.1 Comparison of UAT and SHR based on Disk Bandwidth : : : : : : : : : : : : : : : : : 15 7.1.1 Comparison based on Inter Arrival Time : : : : : : : : : : : : : : : : : : : : : : 15 7.1.2 Comparison based on Bu er Size : : : : : : : : : : : : : : : : : : : : : : : : : : 15 7.1.3 Comparison based on Topic Parameters : : : : : : : : : : : : : : : : : : : : : : 16 7.2 Comparison of BAT-UAT and BAT-SHR based on Disk Bandwidth : : : : : : : : : : 17 7.2.1 Comparison based on Inter Arrival Time : : : : : : : : : : : : : : : : : : : : : : 17 7.2.2 Comparison based on Bu er Size : : : : : : : : : : : : : : : : : : : : : : : : : : 17 7.2.3 Comparison based on Topic Parameters : : : : : : : : : : : : : : : : : : : : : : 18 7.2.4 Comparison based on Batching Parameters : : : : : : : : : : : : : : : : : : : : 19 8 E ect of Fast Forward/Rewind on Continuous Media Caching 19 9 Extensions to this Work 21 10 Conclusion 21
  • 3. 1 Introduction Multimedia systems have attracted the attention of many researchers and system developers, because of the multimedia nature of the data used in modern applications and due to their capability to im- prove the communication bandwidth between computers and users. These multimedia information systems cater to a variety of scienti c and commercial elds ranging from medicine to insurance. Lately the construction of the national information super-highway has also received enormous at- tention and many commercial ventures are focusing on providing on-demand services. Multimedia information systems will form the backbone of these services. Multimedia data can be classi ed as static or dynamic (also known as continuous) media. While static media, like text and images, do not vary with time (from the perspective of the output device), dynamic media, like audio and video, vary with time. Whenever continuous media is involved, in order to ensure smooth and mean- ingful ow of information to the user, timeliness and synchronization requirements are imposed. Timeliness requires that consecutive pieces of data from a stream be displayed to the user within a certain duration of time for continuity. Synchronization requires that data from di erent media be coordinated so that the user is presented with a consistent view. Hence, unlike real-time database systems where deadlines are associated with individual transactions, multimedia data has contin- uous deadlines. Thus far, most of the research work in the area of multimedia data management has concentrated on data modeling and storage mangement. While some of the issues relevant to MMDBS are enumerated in 4, 5, 15, 20], object-oriented approaches for implementing MMDBS are suggested in 6, 15, 21, 30]. Some architectural aspects of multimedia information systems like data placement, bu ering and scheduling are described in 6] while issues related to a distributed multi- media information system, performance in particular are discussed in 4]. Though e cient storage and retrieval strategies are being developed 2, 12, 16, 24, 27, 29, 31], since the data rates involved are very high (around 2 MB/sec of compressed data per request) disk I/O could still be a bottleneck, i:e:; the number of concurrent sessions supported could be limited by the bandwidth of the storage system. Hence, it is important to make better use of data brought from the disk into the database bu er. In this paper, we explore the bene ts of continuous media sharing in multimedia database systems (MMDBS). Speci cally, we consider the News-On-Demand-Service (NODS) which typi es applica- tions where extensive data sharing is possible and focus on bu er management schemes that promote sharing of continuous media data. Since multimedia data are typically read only, by sharing the data between users, many trips to the disks can be avoided. The motivation for our work comes from the fact that sharing of continuous media data can increase the number of users that can be supported by a system. Furthermore, there are many interesting issues that arise in such an environment. The most important fact is that once the user has requested a playback, advance knowledge of the multimedia stream to be accessed can be used to anticipate demand and promote sharing (planned sharing for future accesses). To achieve this, we introduce the notion of continuous media caching. It is a simple and novel technique where bu ers that have been played back by a user are preserved in a controlled fashion for use by subsequent (or lagging) users requesting the same data. This technique can be used when there is su cient bu er space such that it is possible to retain data for the required duration. With this technique, the system can avoid fetching the data from the 1
  • 4. disk again for the lagging user. Hence this technique makes it possible to support large number of sessions than the number permitted by the disk bandwidth. Also the throughput of the system, i:e:; the percentage of user requests permitted, can be increased by exercising the less expensive option of additional memory rather than choosing a storage system with a higher bandwidth. Batching is another technique to improve the performance of a system by grouping requests that demand the same topic 7]. By using continuous media caching in conjunction with batching, even better performance can be achieved. Throughout our discussion, we disregard issues that might arise if users request fast-forward or rewind. We postpone our discussion on the impact of fast-forward and rewind to the end of the paper. Bu er management schemes have been widely studied in traditional database systems 11, 26] and in real-time database systems 18]. The objective of these bu er management schemes is to promote sharing and reduce disk accesses, thereby increasing the throughput. However these schemes typically consider only textual data. Some aspects of bu er management for multimedia data are addressed in 2, 6]. While 6] does not speci cally consider concurrent sessions in multimedia systems, 2] does address some performance issues when concurrent users are trying to access continuous media data. However it disregards sharing of continuous media data. The Use-And-Toss (UAT) policy has been suggested in 6] for multimedia data, under which, once a piece of data is used, it is tossed away so that the next piece of required data is brought into the same bu er. Generally the UAT policy is good only if data sharing does not exist or cannot be exploited. Essentially, there has not been any research to evaluate the potential bene ts of continuous media sharing when users are accessing multimedia data concurrently. Hence we propose a new bu er management scheme, SHR that enhances UAT with continuous media caching. This scheme should perform very well since it uses past data and also does planned sharing of future data. To study the bene ts of using continuous media caching in conjunction with batching, we also study two other schemes BAT-UAT and BAT-SHR, which are enhancements of UAT and SHR respectively with batching. The rest of the paper is organized as follows. In section 2 we discuss the data characteristics of multimedia data and NODS in particular. Issues related to continuous media sharing are discussed in section 3. Using simple examples, we illustrate how continuous media sharing can be useful in increasing the number of users supported by a system. Section 4 discusses the UAT scheme and its enhancement (SHR scheme) through the use of continuous media caching. Batching schemes for grouping requests that demand the same data are discussed in section 5. Strategies to improve per- formance through the use of continuous media caching are also discussed. Details of the performance tests are discussed in section 6. This section describes the experimental setting, various assumptions made, parameters and metrics used for the tests. Section 7 interprets the results of the performance tests. These results show the superior performance of schemes that utilize continuous media caching. In section 8 we discuss the impact of fast-forward and rewind in continuous media sharing envi- ronments. Possible extensions to this work are described in section 9. Section 10 concludes with a summary of this work. 2
  • 5. 2 Data Characteristics Multimedia objects are captured using appropriate devices, digitized and edited, until they are in a form that can be presented to the user. These independently created objects are then linked using spatial and temporal information, to form a composite multimedia object that may correspond to a presentation or some speci c application. In order to ensure smooth and meaningful ow of information to the user, there are timeliness and synchronization requirements which vary from application to application. Timeliness requires that consecutive pieces of data from a stream be displayed to the user within a certain duration of time for continuity; hence multimedia data has continuous deadlines. Synchronization requires that data from di erent media be coordinated so that the user is presented with a consistent view. Essentially it refers to the temporal sequencing of events among multimedia data. It deals with the interaction of independent objects that coexist on di erent media such that the user is presented with a consistent view and is very much dependent on the type of application. Detailed analysis of synchronization requirements for multimedia data can be found in 23, 28]. The rest of this section describes the data characteristics of NODS, which typi es applications where extensive sharing of data is possible. In NODS, given that the number of topics in high demand is likely to be limited at any particular time, given the higher level of interest in certain topics like sports/business, the probability of one or more users requesting the same item for playback is very high. To request a service the user has to choose a topic and a language from, for example, the following: 1. Topic - Politics, International, Business, Sports, Entertainment, Health, Technology etc. 2. Language - English, Spanish, Italian, French, German, Hindi etc. Each topic consists of a list of news items. News items are composed of video and audio segments and possibly some images. All news items to be displayed when a topic is chosen are compiled in advance and known apriori. Sharing of data can occur at various levels as we discuss below. Typically there are many news items that are shared between topics. For example, consider the news item on NAFTA1 , which can be included in a variety of topics like Politics, International and Business because of its relevance to all of them. Sharing of continuous media can also occur at the media level. Since the same news items will be provided in di erent languages, to reduce storage, additional storage is necessary only for the audio segments as the video segments can be shared. In addition, sharing can occur when multiple users are requesting service on the same topic. This clearly shows that sharing at di erent levels can lead to compound sharing. In this paper, we mainly focus on the last type of sharing since it is likely to have the most impact on performance. Intelligent bu er management schemes can and must utilize this to improve the performance of a system. Figure 1 shows the data stream of a request. Because video is often compressed, we assume that the continuous media data is played out at a variable bit rate. The data rate depends on the amount 1 North American Free Trade Agreement 3
  • 6. of action that is packed in the video. We assume that a data stream is divided into segments where each segment is characterized by a xed playout data rate. For example a video stream of a news reader reading news will require a lower data rate than the video stream from a moving camera (e.g. sports clippings). In our work, segments are restricted to be of equal length (of 1 second duration) for ease of implementation. Time News Item boundary Data Rate segment boundary audio data video data Figure 1: Multimedia Data Stream of a request However since the data rate of segments can vary, the amount of bu er required by a segment varies. Since the length of the segment also has an impact on the performance of the schemes we are going to propose, if a segment is too long, it can be split into smaller segments that have the same data rate. 3 Continuous Media Sharing In this section we introduce the concept of continuous media caching which promotes planned sharing. It is a simple and novel technique where bu ers that have been played back by a user are preserved in a controlled fashion for use by subsequent users requesting the same data. Since the bu er requirements for images and audio are less compared to video (as we will see later in this section), we ignore them and consider only the requirements of video data. Hence we mainly focus on sharing that occurs when multiple users request service on the same topic (though they may have chosen a di erent language). When there is sharing, many possibilities exist depending upon the arrival time of the customers and we discuss them below. When the same service is being requested by two (or more) users simultaneously, one of the following could be true: 1. they arrive at the same time (they start at the same time) 2. they arrive with a gap of few time units between them 3. they arrive with a gap of large number of time units between them Depending upon the order in which users share an application, they may be classi ed as leading (the rst one) or lagging (the subsequent ones that can share the bu ers). Using the same bu ers for both users can de nitely payo in case 1 (since the same data is used at the same time by both requests), whereas in case 2 it may payo . In case 2, after the data has been played out of the bu er for the leading user, rather then discarding the data, it can be cached by preserving it in the bu er so that the same data can be reused by the lagging user (rather than fetching it again from the disk). We refer to this technique as continuous media caching. In case 3, it is easy to see that continuous media caching will not payo since preserving bu ers may prove to be too costly in terms of bu er requirements. In this case it is better to obtain the data for the lagging user directly from the disk and reuse the data bu ers from the leading user immediately after it has been played back. In our bu er management scheme, we allow sharing to occur as long as the di erence between the arrival time of two consecutive users for the same topic is less than a particular value (details are described 4
  • 7. in section 4.2.1). Barring information about the future arrivals, this is the earliest time instant when a sharing situation can be recognized. Figure 2 compares the disk and bu er requirements as a function of time for the non-shared case and the shared case (continuous media caching). Two requests for the same topic (containing 5 segments) that arrive with a time di erence of 2 segments is shown in the gure. In the non-shared case, disk and bu er bandwidths are allocated independently for each request. However for the shared case, since it is known that segments three through ve will be available in the bu er (it will be loaded into the bu er for the rst request) when the second request needs it, the bu ers are reused for the second request, i:e:; the bu ers are preserved until the corresponding playback time for the second user. These segments need not be fetched again from the disk for the second request. This results in an increase in the bu er requirement, but a decrease in the disk bandwidth requirement. Thus it can be seen that continuous media caching exploits sharing to decrease the number of disk I/Os and, hence, improve performance. This philosophy concurs with the increasing trend towards main memory systems since disk transfer rates are typically limited by mechanical movements. Disk Bandwidth for Present Request Disk Bandwidth for Present Request Disk Bandwidth for Next Request Disk Bandwidth for Next Request Bytes/sec Bytes/sec Disk Bandwidth for Non−Shared Case Time Disk Bandwidth for Shared Case Time Buffers Allocated for Present Request New Buffers for Next Request Buffers Allocated for Present Request Preserved Buffers for Next Request New Buffers for Next Request Bytes Bytes Buffer Requirement for Non−Shared Case Time Buffer Requirement for Shared Case Time Figure 2: Bu er Sharing between two requests 1, 2, 3, 4, 5 − Segments Loaded for First User 1,2 − Segments Loaded from disk for Second User 3,4,5 − Segments preserved in buffer for Second User 3 3 Bytes 1 4 4 2 3 2 5 5 5 Time 1 4 T1 T2 T3 T4 T5 T6 T7 Figure 3: Segments available in bu er at each time slot (zooming into bottom right of gure 2) Figure 3 shows the details of the segments that are actually present in the bu er at each time instant in the shared case (bottom right case of gure 2). It illustrates how the continuous media caching technique preserves only the necessary segments for just the required amount of time. It di erentiates the segments as follows | segments that are loaded for the rst user, segments fetched 5
  • 8. from the disk for the second user and segments that have been used by the rst user but preserved for the second user. We present an example of a simple system, to illustrate how sharing can be exploited to support more users. Consider a system which has a main bu er of 1280 MB (Mega Bytes) and a storage system with a high storage capacity and a data transfer rate of 40 MB/s. MPEG-2 is an accepted standard for continuous media data storage/transmission and the requirements of di erent media based on this standard are as follows (in the compressed form): image 100 KB for a frame, audio 16 KB/s (high quality) and video 2 MB/s. Ignoring the data requirements of audio and images, we analyze the requirements of a session using the characteristics of its video data on a segment (of duration 1 second and having same data rate of 2MB/s) basis. The available bu er can hold 1280MB/(2MB/sec) = 640 seconds of video, which translates to 640 segments. This implies that the bu er is capable of handling 640 concurrent sessions (since each session needs at least a segment in memory). Neglecting the seek time (which is typically in the order of 10's of milliseconds) and based on the data bandwidth of the storage device, the system can support (40MB/s)/(2MB/s) = 20 concurrent sessions (topics). Thus we see that the number of concurrent sessions supported is limited to 20 by the bandwidth of the storage system even though the bu er can handle 640 sessions. However by exploiting sharing, the bu er that can support 640 seconds of video can be e ectively utilized to satisfy more users, provided they request one of the 20 topics that can be fetched using the available disk bandwidth. If the 20 topics are shared by u1 ; u2; :::; u20 concurrent users respectively, then the number of users supported will be P20 u . This is however the best case and the probability =1 i i of this is very low, since all the users sharing the sessions arrive at the same time in this scenario. On the other hand, the worst case occurs when the users requesting the same topic come with larger inter-arrival times. Continuous media caching may not be possible in this case (due to insu cient bu er bandwidth), resulting in only 20 users being supported by the system. However if they arrive within short intervals, instead of fetching the data from the disk for the lagging user, data read from the disk by the leading user can be preserved for reuse by the lagging user. Thus depending on the inter-arrival time and other possibilities of sharing (three cases discussed earlier), the number of users supported will be between 20 and P20 u . This simple example gives an idea of how sharing i=1 i can be exploited in a MMDBS for the case of NODS to provide better bu er management. If seek times were also considered in the above example, continuous media caching will prove to be even more bene cial. 4 Bu er Management Schemes This section describes the two bu er management schemes (UAT and SHR) we are going to compare. First we brie y review traditional bu er management techniques and then discuss the details of UAT and SHR. The objective of the bu er manager is to reduce disk accesses for a given bu er size. Traditional bu er management schemes for textual data use xed size pages to avoid fragmentation. When the requested database page is not in the bu er, it is fetched from the disk into bu er. Then the page is kept in the bu er using the fix operation, so that it is not paged out and future references to the page do not cause page faults in a virtual memory operating system environment. Once it is determined 6
  • 9. that the page is no longer necessary or is a candidate for replacement an unfix is done on that page. Also in traditional database systems, bu er allocation and replacement are considered orthogonal issues. A page can be allocated anywhere in the global bu er in a global scheme whereas in a local scheme, speci c portions of the bu er are allocated for a given session and a page for a session is allocated in the area allocated for that particular session. Hence in the global scheme there is only one page replacement algorithm whereas in the local scheme several page replacement algorithms may exist. In either case the goal of the page replacement algorithm is to minimize the bu er fault rate. Before allocating a page, the bu er manager checks if the corresponding page is in the bu er and if not, retrieves it from the disk. While static allocation schemes allocate a xed amount of bu er to each session when it starts, dynamic allocation schemes allow the bu er requirements to grow and shrink during the life of the session. In virtual memory operating systems where there is paging, code pages (which are read only) are typically shared. For example, if many users are using a particular program like an editor, then the code pages of the editor are shared between the di erent users. Though this scheme cannot be used as such for continuous media data, it provides some motivation for our continuous media caching scheme. Some of the common page replacement strategies used for textual data include FIFO and LRU 11]. These traditional schemes do not have provision for admission control and resource reservation and hence can not guarantee the success of a request in multimedia environments. This can result in wastage of resources for requests that eventually fail. We now discuss two schemes that have admission control and resource reservation integrated with them. The rst is UAT, a scheme proposed previously for multimedia systems 6] and SHR, our new scheme that exploits the potential for sharing. First we discuss the memory model, admission control and bandwidth reservation scheme. In the case of a MMDBS, advance knowledge of the data stream to be accessed is available and this can be utilized for better bu er management. Since we are assuming a large main bu er, we assume that this information about the topic and segments can be stored in main bu er (i:e:; the number of segments that make up a topic and the segment data rates for each segment). Bu er space that is not used exists in a free-pool (this is a global pool). Bu ers needed for the segments of a request are acquired (analogous to x) from the free-pool and returned (analogous to un x) to the free-pool when they have been used. When bu ers are requested from the free-pool, the oldest bu er in the free-pool is granted to the requester. We mentioned earlier that since the video rate dominates the audio rates, only the requirements of video are considered for sharing. The availability of the necessary bandwidth for audio is also to be checked by the admission control policy to meet the synchronization requirements of audio and video streams. The admission control and reservation scheme is dependent upon the sharing policy. Given (n ? 1) sessions already in progress, the n th request is accepted i the following inequalities hold. Xl n d M 8t = 1; ::; s (1) i;t i=1 Xd n R 8t = 1; ::; s (2) i;t =1 i where d represents the continuous media rate (audio and video) for request i at time slot t, l i;t 7
  • 10. represents the length of each segment (of 1 second duration), s represents the total number of segments in the request. The 8t = 1; ::; s condition indicates that this inequality should hold for all the segments of the request, thereby verifying that there is su cient resources for the entire duration of the request. The rst equation gives the limiting case for the bu er bandwidth (bu er size = M ) and the second for disk bandwidth (data rate supported = R). If the inequalities hold for a request, the required bu er and disk bandwidth is reserved and the request is guaranteed to succeed. 4.1 Traditional Scheme - Use And Toss (UAT) The UAT policy that is traditionally used for MMDBS is described below with some enhancements. UAT replaces data on a segment basis and it also uses admission control and bandwidth reservation to anticipate future demand. A FIFO policy is used to decide which of the segments from the free- pool are to be allocated to a new request, i:e:; when a new request is to be allocated a segment from the free-pool, the oldest segment in the free-pool is allocated rst. A timestamp mechanism is used to maintain information about the oldest segment in the free-pool. The UAT policy is ideal for continuous media data where there is no sharing. UAT reuses segments that may already exist in bu er, i:e:; if any of the required segments happen to exist in the free-pool then they are grabbed from the free-pool, thereby avoiding fetching those segments again from the disk. UAT does not consider sharing explicitly, i:e:; when the segments of a topic have been played back, they are not preserved in bu er for a future user which might request the same topic. Hence requests from di erent users are considered independent of each other for admission control and reservation purposes, even if they are for the same topic. The admission control policy checks for each request separately to see if there is su cient bu er and disk bandwidth available to fetch and store data for the request on a segment by segment basis (using inequalities 1 and 2). If the required bandwidth is available, then the required disk and bu er bandwidths are reserved; else the request is rejected. 4.2 Proposed Scheme - Sharing Data (SHR) Here we propose a new bu er management scheme which uses the continuous media caching technique to speci cally exploit the sharing characteristics of MMDBS applications. This scheme (SHR) is a variation of the UAT scheme we discussed earlier, with enhancements for continuous media caching. The bu er and disk bandwidth for these schemes are determined by utilizing the technique illustrated in gure 2 (page 6). We do not discuss details of disk scheduling here. A rate monotonic type of scheduling is the best, i:e:; since the disk bandwidth will be assured for the various sessions, some disk bandwidth will be scheduled for each session, corresponding to the ratio of the data rate of the segment it is fetching to the total demanded data rate. Since SHR utilizes continuous media caching, segments used by a leading user are explicitly preserved for use by the lagging user. This can be explained better using gure 3. The segments that are explicitly preserved begin with the segment that is being played back by the leading user when the lagging user enters the system. All segments that follow this are also preserved in the bu er until the lagging user needs it for playback. (Referring to gure 3, the segments that have been preserved are 3, 4 and 5). If it happens that some of the segments are in the free-pool (say segments 1 and 2 are in the free-pool after they have been used and returned by user 1), they can be reacquired from the free-pool if the bu er space corresponding to those segments has not already 8
  • 11. been granted to other sessions. If all the segments (loaded for the rst user prior to this user's arrival) are available, then no data is to be fetched from the disk for the second user (only bu er bandwidth is checked/reserved), else data is to be fetched from the disk for those segments that do not exist in the free-pool(both bu er and disk bandwidth checked/reserved). Thus SHR uses past and future information about segment access and hence should perform better that UAT as we verify later in the performance tests. Essentially the performance di erence between UAT and SHR can be attributed solely to continuous media caching. do while there are requests in the queue select next request ; admitted = false ; For each segment in the topic, check if the segment is already in the free−pool and grab it if it is available ; For each segment grabbed from the free−pool, modify the total profile to reflect preserving of these segments and check for buffer bandwidth violation ; For segments that have to be fetched from the disk, modify the total profile for disk and buffer accordingly and check for buffer/disk bandwidth violation ; For segments that exist in the buffer and are shareable, modify the total profile for buffer and check for bandwidth violation ; if no violation at any stage admitted = true ; /* total profile contains reservation for this request */ else restore old total profile ; reject request ; endif end do Figure 4: Admission Control and Reserving future bandwidth do forever do for all active sessions if buffers are shared with another session if leading user preserve the used buffers for lagging users ; acquire new buffers from the free pool ; fetch data from disk into the buffer ; else if lagging user if there are more lagging users preserve the used buffers for lagging users ; read from preserved buffers ; else return the used buffers to the free pool ; read from preserved buffers ; endif endif else when a new segment is encountered return the used buffers to the free pool ; acquire new buffers from the free pool ; fetch segment from disk; endif end do end do Figure 5: Bu er Allocation & Replacement | SHR Figure 4 describes the admission control and bandwidth reservation scheme for SHR. The total pro le refers to the total requirements of bu er/disk bandwidth. The gure describes the sequence in which the availability of segments in the bu er is checked along with admission control and bu er/disk bandwidth reservation. Since SHR considers sharing explicitly, the integrated admission control and reservation scheme has to be modi ed accordingly. Speci cally, in SHR, the admission control policy checks if there is su cient disk bandwidth available for segments to be loaded from the disk (for segments 1 and 2 if they are not available in the free-pool) and su cient bu er bandwidth for loading these segments in bu er and preserving the other segments (i:e:; loading segments 1 & 9
  • 12. 2 and preserving segments 3, 4 and 5). If the required bandwidth is available then appropriate disk and bu er bandwidths are reserved, else sharing is not possible. When this is the case, the request in considered independently (similar to the UAT scheme) and an attempt is made to reserve the necessary bu er and disk bandwidth for each request. If this also fails, then the request is rejected. Bu er allocation and replacement is to be done in an integrated manner as shown in gure 5. Bu er allocation is done when a inter segment synchronization point is encountered. Bu ers used by a segment that has been played back are freed rst and then bu ers needed by the next segment are allocated. Once the bu ers have been allocated for a segment, they are not freed till the end of the segment. For every user, at the beginning of a data transfer from disk, new bu ers have to be allocated. This is achieved by acquiring the necessary number of bu ers from the free pool. Once data from these bu ers are played back by a user, if there are lagging users, instead of returning them to the free pool, they are preserved in the bu er. For a lagging user, bu ers need not be acquired from the free pool, since a disk transfer is not necessary. Instead data is read from the preserved bu ers after which the bu ers are returned to the free pool. Information about allocated bu ers can be maintained using a hash table since it is necessary for sharing bu ers. We do not explicitly consider prefetching of data 26, 27] but if it is considered, then our schemes have to be modi ed such that prefetching is applied to all or none of the sessions. 4.2.1 Criteria to attempt sharing Another important tradeo related to continuous media caching is how to decide if it is advantageous to share the data with another user. There are several factors that determine this. For example, if the di erence between the arrival times of the leading and lagging users is not much then clearly this technique pays o because in this case, bu ers needed by lagging users are not retained for a long time. However if there is a large di erence in the arrival times, then the amount of time for which the respective segments have to be carried over (cached) becomes large. Essentially this means that more bu er space will be occupied to enable sharing between users. This may cause many other requests to fail for lack of bu er space. Hence we performed some empirical tests to determine the point beyond which there is no payo (i:e:, the maximum di erence in the arrival time beyond which continuous media caching should not be used). Our tests indicate that continuous media caching pays o as long as it is used when the inter arrival time (iat) is less than maximum tolerable inter arrival time for sharing (max-iat) . Other than the arrival time, there are several other factors that determine the value of max-iat as shown below. max iat = C Mean Arrival TimeData Rate Size Topic Length Memory (3) iat max iat (4) If more memory is available, then segments can be cached for a longer duration of time and, hence, max-iat is directly proportional to memory size. On the other hand, if the topic length is large or the data rate is high, it is di cult to cache segments for a longer duration since the bu er space needed will be too high. Hence max-iat is inversely proportional to the topic length and the data rate. The constant C can be determined via empirical tests. Developing better heuristics to determine the 10
  • 13. criteria for sharing can be a separate area of research and hence we do not go too much into the details. Thus, for the SHR scheme, we have used inequality (4) as a basis for deciding when the requests are to be shared. If iat is larger than max-iat then the request is satis ed using the UAT policy. It should be noted that in all these cases, a request is considered for scheduling as soon as it is received (zero response time). 5 Batching Schemes In order to achieve better performance out of an existing system, i:e:; support more number of concurrent users, batching can be used 7]. Batching tries to group requests that arrive within a short duration of time for the same topic. Our objective is to explore how batching can enhance the performance of continuous media caching schemes. There are a few tradeo s in the design of batching schemes. The response time of a request is the time interval between the instant it enters the system and the instant at which it is served (i:e:; the rst segment of the request is played back). Since customers will be inclined to choose a service with a faster response time, it is important to study the impact of response time on batching and sharing schemes. Thus the success of a scheme can be measured in terms of the number of requests that succeed2 . The factors that determine the actual response time of a request include the scheduling policy and the scheduling period. The scheduling period is the time interval between two consecutive invocations of the scheduler. At each invocation of the scheduler (scheduling point), requests for the same topic are batched and scheduled according to the scheduling policy. If the tolerable response time is small, then it is necessary to have smaller scheduling periods or else the percentage of successful requests will be low. However for better batching" (maximize sharing), the scheduling period has to be large. Hence, as a compromise, some intermediate value has to be chosen such that batching can be achieved with smaller response times. Next, we brie y discuss some of the scheduling policies. The scheduling policy determines the order in which the requests are to be considered (i:e:; if the admission control is successful, then the required resources are allocated) at each scheduling point. Some of these batching policies have been studied for a video-on-demand server in 7]. In the First-Come-First-Serve (FCFS) policy, requests for the same topic are batched and the topic which has the earliest request is considered rst and so on. In the Maximum-Group- Scheme (MGS) policy, requests for the same topic are batched and the topic which has the maximum number of requests is considered rst. From the point of view of continuous media caching, other variations are possible where the topic which was scheduled most recently can be scheduled again to promote sharing (i:e:; schedule the topic with the smallest time di erence between now and when the topic was scheduled last). While other policies favor topics that are requested more frequently, the FCFS policy is fair and is recommended in 7]. For our study, we consider the FCFS batching scheme. Having studied some of the issues related to batching, we now describe how continuous media caching can be used in conjunction with batching to improve sharing. This results in considerable 2 a request succeeds if the required resources are allocated for the entire request and the actual response time does not exceed the maximum tolerable response time 11
  • 14. savings in the resources required to satisfy a given workload as we will see later in the performance tests. For comparison purposes, we describe two batching schemes that use the FCFS policy for scheduling but use di erent bu er management schemes. 5.1 Batching without Sharing (BAT-UAT) BAT-UAT is the FCFS batching scheme that uses the UAT bu er management scheme. Since UAT does not use continuous media caching, in BAT-UAT data sharing is achieved purely through the e ect of batching. Data sharing possibilities cannot be fully exploited purely through batching because of the dependency between scheduling period and response time which we discussed earlier. 5.2 Batching with Sharing (BAT-SHR) BAT-SHR is the FCFS batching scheme that uses the SHR bu er management scheme. Here the e ect of sharing is achieved in two di erent ways | through batching and continuous media caching. Since BAT-SHR uses the same scheduling period as BAT-UAT, the response time will remain the same or improve (since more resources will be available due to the cumulative e ects). However the extra data sharing that could not be achieved by batching is achieved via continuous media caching i:e; requests for the same topic may have been scheduled in di erent neighboring scheduling points, but sharing is achieved through continuous media caching when the neighboring points are within max-iat. Thus the di erence in performance between BAT-UAT and BAT-SHR can also be attributed solely to continuous media caching. 6 Performance Tests We have performed a number of tests to compare the di erent schemes and evaluate the bene ts of continuous media caching. In this section we discuss our experimental setting, a few of our assumptions and then describe the various parameters and metrics used for the tests. 6.1 Experimental Setting & Assumptions We assume a client-server type of architecture shown in gure 6. A DMA transfer is assumed between the disk and the bu er at the server and hence the CPU is not likely to be a bottleneck. So the CPU has not been modeled in the performance tests. It is assumed that the network between the client and the server provides certain quality-of-service (QOS) guarantees about the network delay and bandwidth. A detailed discussion of reliability and networking issues for continuous media can be found in 3]. We will assume a two level storage system: primary storage (main memory bu er) and secondary storage (disk). Knowing the maximum seek time, rotational latency and data read rate, the e ective transfer rate, R can be determined for a storage system. This is used in performing bandwidth allocations for the disk. The secondary storage device could be a magnetic/optical disk or even a RAID based system for increased bandwidth. We do not go into the details of how multimedia 12
  • 15. data is organized and retrieved from the secondary storage in an e cient manner since there has been extensive research in this area 2, 16, 24, 27, 29, 31] and it is orthogonal to the main focus of this work. Since the data rate for video dominates, we assume that requests desiring the same topic with di erent language options can share the segments. We also assume that segments are of equal length (this simpli es our implementation for the performance tests but does not a ect the results). Database Server Client Client Client Disk Client Disk Client Client Client Client Disk Disk Figure 6: Con guration of the System Admission control and bandwidth reservation is performed on a segment basis for each request and if the bandwidth is insu cient at any point, the request is rejected. The bu er and disk bandwidth checks are performed for the entire request at admission time, i:e:, even for segments that will have to be played at later point in time. Before a segment is placed in the bu er, the following actions are taken. First the bu er is searched to see if the requested segment exists in free-pool. If it does not exist then the required bu er space is acquired from the free-pool. If this cannot be done because the available bu er space in the free-pool is not su cient, then the request is rejected and it counts as a reject due to insu cient bu er bandwidth. When segments have to be fetched from the disk, if su cient bu er space is available then the disk bandwidth is checked for su ciency. If the disk bandwidth is su cient then the request proceeds. If all the segments can be brought in, then the request is successful else the request is rejected and it counts as a reject due to insu cient disk bandwidth. For the batching schemes, at each scheduling point, the requests are grouped based on the demanded topic. Since we use a FCFS scheme, the grouped requests are considered one by one based on the earliest arrival time of a request in the group. A request that is considered undergoes the admission control and bandwidth reservation described above. If the required resources are available and allocated, then all the requests for that topic are successful. If this request cannot be satis ed, then the remaining requests are considered for that scheduling point. If a particular request cannot be satis ed at a scheduling point, it will be tried again at the next point and so on as long as the response time requirements are met. Hence a request that could not be satis ed at a particular scheduling point may be successful at the next point. If the actual response time increases the maximum tolerable, then the request is no longer considered for scheduling and is rejected. 13
  • 16. 6.2 Parameters The parameters to be considered fall under four di erent categories | customer, system, topic and batching. They are listed below in the table with their default values. Topic details (length, data rate, etc.) are generated using a uniform distribution. Customer arrival is modeled using Poisson distribution. The topic chosen by a customer is modeled using Zipf's law 32]. According to this, if the topics 1; 2; :::; n are ordered such that p1 p2 ::: p then p = c=i where c is a normalizing n i constant and p represents the probability that a customer chooses topic i. This distribution has i been shown to closely approximate real user behavior in videotex systems 14] which are similar to on-demand services. Category Parameter Setting (Default Values) Customer Inter-Arrival Time Poisson Distribution with mean of 10 sec Topic Chosen Zipf's Distribution (1-10) System Bu er Size 1.28 GB Disk Rate < determined based on 95 % success rate> Topic Total Number 10 Length Uniform Distribution (500-700) sec Data Rate of Segment Uniform Distribution (1-2) MB/sec Batching Maximum Response Time 40 sec Scheduling Period 10 sec 6.3 Metrics A popular metric for these tests would be the percentage of successful requests. But this alone is not su cient for system designers since they are more interested in metrics that are closely tied to QOS guarantees. An example of this would be the maximum arrival rate that can be sustained if 95 % of the requests are to succeed given certain resources. Another metric would be the resource requirements for a 95 % success rate give certain workload. Evaluating metrics that are based on QOS guarantees is a time consuming process since it is iterative in nature. For example, to evaluate the disk bandwidth that ensures a 95 % success rate when the inter-arrival time is 10 sec, starting from a small value of disk bandwidth (for which the success rate may be lower than 95 %), the disk bandwidth has to be incremented till the success rate reaches 95 %. For all our tests, the metric we use is the disk bandwidth needed to guarantee that 95 % of the requests will be successful. For the bu er management schemes, since we do not batch the requests, 95 % of the requests are to be satis ed with a zero response time. A non-zero value like 10 seconds can also be used as the tolerable response time but since we do not batch requests, this might not bene cial. Response time is more relevant for the batching schemes, since the scheduling is done at periodic intervals. A tolerable response time is speci ed for these tests and 95 % of the requests should have response times smaller than the tolerable response time. 14
  • 17. 7 Results Each simulation consisted of 25 iterations and each iteration simulated 2000 customers. The con - dence interval tests of the results indicate that 95 % of the results (percentage of successful requests) are within 3 % of the mean in almost all cases. The results are discussed based on the disk bandwidth required to guarantee that 95 % of the requests will succeed. All parameters are set to their default values speci ed earlier unless explicitly indicated in the graphs. Topic details (the database accessed by users) are generated for every iteration. We rst compare the results of the bu er management schemes (UAT and SHR) and then compare the batching schemes (BAT-UAT and BAT-SHR). The area below the curves are lled in light gray for schemes that do not use continuous media caching and in dark gray (which blots out the light gray) for schemes that use continuous media caching. Hence in all the graphs, the light gray region shows the bene t from continuous media caching. 7.1 Comparison of UAT and SHR based on Disk Bandwidth This section compares the performance of UAT and SHR based on the various parameters we men- tioned earlier. We analyze the behavior of the two schemes and discuss how continuous media caching helps to improve the performance of the system. 7.1.1 Comparison based on Inter Arrival Time Figure 7(a) shows the e ect of inter-arrival time on the disk bandwidth requirement. For smaller values of inter-arrival time, higher disk bandwidths are needed to satisfy the requests. The disk bandwidth requirement drops gradually as the inter arrival time increases. SHR requires about 50- 70 % of the bandwidth required by UAT depending upon the inter-arrival time. This clearly shows the ability of continuous media caching schemes to improve the number of concurrent requests that can be handled by a system. 7.1.2 Comparison based on Bu er Size Figure 7(b) studies the e ect of bu er size on the performance of the schemes (a small portion of the UAT curve is masked by the SHR curve at the left). In general if the bu er space available is low, all bu er mangement schemes experience poor performance. Furthermore, to exploit sharing, SHR needs a sizeable amount of bu er. It can be observed that at lower bu er sizes of 400-600 MB, SHR is not able to exploit sharing as much and hence requires disk bandwidth comparable to that of UAT. However, as the bu er size available increases (600MB-1.28GB), SHR uses the extra bu er to exploit sharing and hence begins to outperform UAT. The disk bandwidth requirement drops down to almost half of what is required by UAT. It can be noted that UAT does not utilize the available bu er resources to the maximum whereas SHR does. Hence UAT requires a constant disk bandwidth of 101 MB/s while the disk bandwidth for SHR reduces steadily with an increase in bu er size. This graph is also more important since it gives the tradeo s in the system resources (bu er and disk) required by the di erent schemes for 95 % success given a certain workload. Essentially the UAT scheme is preferable when the available bu er size is small and if more bu er is available SHR can 15
  • 18. be used. 300 160 | | UAT 140 | Disk Rate (MB/s) for 95 % success Disk Rate (MB/s) for 95 % success 250 SHR | UAT 120 | SHR 200 | 100 | 150 80 | | 60 | 100 | 40 | 50 | 20 | 0| | | | | | | | | 0| | | | | | | | | 0 5 10 15 20 25 30 35 40 320 480 640 800 960 1120 1280 (a) Inter-Arrival Time (sec) (b) Buffer Size (MB) Figure 7: E ect of Inter Arrival Time and Bu er Size 7.1.3 Comparison based on Topic Parameters The next comparison is based on the number of topics available. This test is important since the probability of locating a required segment in memory with or without planning depends upon the number of topics, i:e:; lesser the number of topics, higher the probability of nding a required segment. It can be observed in gure 8(a) that the increase in bandwidth is steep initially and then becomes more gradual. This behavior can be attributed to the Zipf's distribution we have chosen to model the customer's topic selection. If a uniform distribution is chosen, the increase in disk bandwidth may be steeper throughout. The main observation however is that SHR continues to dominate the performance (with lesser disk bandwidth requirement) even with an increasing number of topics. The bene t from continuous media caching is highly evident when there are fewer topics. Figure 8(b) explores the e ect of the length of the topic on the performance of the di erent schemes. This is important due to the fact that increasing the length of the topic results in a request taking longer time to complete and hence the number of concurrent customers in the system increases, which in turn demands increased bu er and disk bandwidth. Thus the disk bandwidth requirements of the schemes increases with an increase in length of topic. However if SHR is used, some of the concurrent sessions are shared and hence the disk bandwidth requirement for SHR is less than that of UAT. Another important parameter to be considered is the average data rate of the segments. Based on the MPEG-2 standards, the average data rate for a compressed high quality video is around 2MB/s. Hence we have performed tests for average data rate ranging from 2-5MB/s. As expected, we note from gure 8(c) that with increasing data rates, the disk bandwidth requirement increases. However we observe that SHR requires lesser disk bandwidth compared to UAT. As the data rate increases, the gap between UAT and SHR tends to narrow down since more bu er space is needed for caching 16
  • 19. larger amount of data and hence data has to be brought from the disk for some of the requests. 130 160 340 | | | 120 UAT | UAT UAT 140 SHR | 300 Disk Rate (MB/s) for 95 % success Disk Rate (MB/s) for 95 % success Disk Rate (MB/s) for 95 % success 110 | SHR | SHR 100 | 120 | 260 | 90 | 80 100 | | 220 | 70 | 80 | 60 | 180 | 50 60 | | 40 | 140 | 40 | 30 | 20 100 | | 20 | 10 | 0| | | | | | | | | | | 0| | | | | | | | 60 | | | | | | | 0 5 10 15 20 25 30 35 40 45 50 100 200 300 400 500 600 700 800 2 3 4 5 (a) Number of Topics (b) Length of Topic (sec) (c) Data Rate (MB/s) Figure 8: E ect of Topic Parameters 7.2 Comparison of BAT-UAT and BAT-SHR based on Disk Bandwidth The batching schemes also behave similar to the bu er management schemes, the main di erence being that requests for the same topic are grouped and hence the overall disk bandwidth requirements is less in all cases. We also discuss results of some additional tests which are based on the batching parameters, i:e:; tolerable response time and scheduling period. 7.2.1 Comparison based on Inter Arrival Time The e ect of inter-arrival time on the two batching schemes is shown in gure 9(a). Similar to the behavior we observed in gure 7(a) we see that BAT-SHR requires lesser disk bandwidth compared to BAT-UAT. Another important point to notice is that the disk bandwidth requirement has almost halved and this indicates that inter-arrival time has the most impact on data sharing through batching than any other parameter. 7.2.2 Comparison based on Bu er Size It can be observed from gure 9(b) that the disk bandwidth requirement of BAT-SHR is less than BAT-UAT for all the bu er sizes we have tested for. This di ers from our observation in the case of a non batched system where UAT performed better than SHR for smaller bu er sizes. Here we have a slight departure from the behavior of gure 7(b) in the sense that the disk bandwidth required for BAT-SHR is always less than BAT-UAT and this can be attributed to batching. We also observe that the overall disk bandwidth requirements has dropped but not as much as we noticed in the case of the inter-arrival time. 17
  • 20. 140 | 150 | 120 Disk Rate (MB/s) for 95 % success Disk Rate (MB/s) for 95 % success | BAT-UAT BAT-UAT BAT-SHR 100 | BAT-SHR 100 | 80 | 60 | 50 | 40 | 20 | 0| | | | | | | | | 0| | | | | | | | | 0 5 10 15 20 25 30 35 40 320 480 640 800 960 1120 1280 (a) Inter-Arrival Time (sec) (b) Buffer Size (MB) Figure 9: E ect of Inter Arrival Time and Bu er Size 130 140 300 | | | 120 | BAT-UAT BAT-UAT BAT-UAT 120 Disk Rate (MB/s) for 95 % success Disk Rate (MB/s) for 95 % success Disk Rate (MB/s) for 95 % success 110 | 260 | BAT-SHR BAT-SHR | BAT-SHR 100 | 100 | 90 | 220 | 80 | 80 | 70 | 180 | 60 | 60 | 50 | 140 | 40 | 40 | 30 | 20 100 | 20 | | 10 | 0| | | | | | | | | | | 0| | | | | | | | 60 | | | | | | | 0 5 10 15 20 25 30 35 40 45 50 100 200 300 400 500 600 700 800 2 3 4 5 (a) Number of Topics (b) Length of Topic (sec) (c) Data Rate (MB/s) Figure 10: E ect of Topic Parameters 7.2.3 Comparison based on Topic Parameters The e ects of the topic parameters is shown in gure 10(a),(b) &(c). Even here we observe a behavior similar to that of gure 8. Again we see that the disk bandwidth requirement has dropped due to the e ect of batching. We also observe the e ect of continuous media caching in conjunction with batching in gure 10(a). The disk bandwidth requirement for BAT-SHR does not increase even when the number of topics is increased from 10 to 50 which is di erent from what we observed in the case of SHR in gure 8(a). Essentially batching contributes to additional data sharing and hence the reduction in the disk bandwidth requirement. 18
  • 21. 7.2.4 Comparison based on Batching Parameters The e ect of the maximum tolerable response time on the two batching schemes is shown in gure 11(a). Here we observe that the drop in disk bandwidth requirement for BAT-SHR is more steeper compared to BAT-UAT. This clearly indicates that continuous media caching in conjunction with batching tries to e ectively utilize the increase in tolerable response time. Figure 11 (b) indicates more precisely how continuous media caching increases the e ect of sharing. While it can be observed that at low scheduling intervals the e ect of sharing is not completely achieved by BAT-UAT through batching, BAT-SHR achieves this by using continuous media caching in conjunction with batching. Hence the disk bandwidth required for BAT-SHR is almost the same irrespective of the scheduling period while BAT-UAT requires relatively higher disk bandwidth at smaller scheduling periods. As discussed earlier, in order to satisfy the response time requirements, it might be necessary to have smaller scheduling periods. To have better performance (lesser disk bandwidth) at smaller scheduling periods, BAT-SHR scores over BAT-UAT. 130 130 | | 120 120 | | BAT-UAT BAT-UAT Disk Rate (MB/s) for 95 % success Disk Rate (MB/s) for 95 % success 110 110 | | BAT-SHR BAT-SHR 100 100 | | 90 90 | | 80 80 | | 70 70 | | 60 60 | | 50 50 | | 40 40 | | 30 30 | | 20 20 | | 10 10 | | 0| | | | | | | 0| | | | | | | | | | 10 15 20 25 30 35 40 5 10 15 20 25 30 35 40 (a) Tolerable Response Time (sec) (b) Scheduling Period (sec) Figure 11: E ect of Batching Parameters Thus using these tests we have been able to demonstrate that in multimedia information systems where sharing can be exploited, schemes can and must be designed such that they make use of this sharing potential based on information about past and future access to perform more e cient bu er management. 8 E ect of Fast Forward/Rewind on Continuous Media Caching In this section we brie y discuss the e ects of fast forward and rewind while continuous media caching is used. Support for fast forward and rewind (which are commonly known as VCR functions) is a topic that is being researched 8]. Concrete solutions to this problem have not yet been proposed. 19