Challenges and the Solutions for Multimedia Metadata Sharing in Networks


Introduction
Multimedia devices such as PCs, smartphones, and televisions now have storage capacities large enough to hold many multimedia files and network connections fast enough to link them with each other and with the Internet. This availability of storage and network speed has increased the demand for multimedia content sharing in networks. In response, innovative infrastructures and technologies have been introduced to share multimedia content efficiently in distributed systems.
A fundamental feature of this sharing is metadata acquisition from the multimedia devices: users of client devices need information about the multimedia content on the servers before they decide to manipulate or share it. However, obtaining multimedia metadata from multiple servers in a network usually incurs a high round-trip latency between a request and its response. When responses are this slow, the user interface may feel sluggish, delayed, or frozen for significant periods.
Additionally, users expect a Search function with conditions and a Sort function by title, artist, date, and so on to be supported regardless of where the content is located or how many servers there are. However, adding protocols to support such metadata-based services is very difficult to implement and deploy because of interoperability problems.
Several solutions have been introduced to address this poor responsiveness and limited service extensibility from the user's perspective: caching responses, prefetching based on historical and statistical information, aggressive link based prefetching, full prefetching, metadata database object acquisition, and the metadata aggregator-based centralized solution. Some of these solutions have usability problems such as a long warm-up time or lack service extensibility, while others hurt the interoperability of the multimedia metadata sharing system.
In this chapter, we introduce these solutions and help you understand their limitations, pros, and cons. The chapter is structured as follows: Section 2 covers multimedia metadata sharing solutions and examples; Section 3 explains the critical problems of metadata acquisition in multimedia sharing systems; Section 4 introduces solutions to these problems; and the Conclusion briefly restates the challenges and the solutions for multimedia metadata sharing in networks.

Multimedia metadata sharing solutions
As demand for multimedia content sharing in networks has grown rapidly, innovative infrastructures and technologies have been introduced to efficiently browse multimedia content lists, control playback, and deliver multimedia content in distributed systems. The Universal Plug and Play Audio and Video (UPnP AV) architecture [1] has been introduced as a candidate for multimedia distribution middleware in local networks and has become widespread in many multimedia devices and PC software solutions. This architecture has been supervised by the Digital Living Network Alliance (DLNA) [2], a forum supported by many companies in the consumer electronics, multimedia, entertainment, and mobile industries.

Fig. 1. A typical multimedia sharing scenario of DLNA

The core of the UPnP AV architecture and DLNA is multimedia metadata sharing. Their specifications and guidelines therefore define protocols and constraints for multimedia metadata sharing.

UPnP AV
UPnP is defined by the UPnP Forum [3] as an architecture that brings easy-to-use, flexible, standards-based connectivity to peer-to-peer networks. It is an open architecture that uses established standards such as IPv4 [4], HTTP [5], XML [6], and SOAP [7]. The UPnP AV architecture defines the AV device classes and the general interaction between its Control Points and devices.

UPnP AV device classes
A UPnP AV Control Point is a device class responsible for coordinating and synchronizing distributed devices, and a Media Server [8] is a device class defined as the source of the media content, responsible for exposing and streaming multimedia. To respond to user input for the playback of multimedia content, the Control Point coordinates, synchronizes, and interacts with a source device, the Media Server, and a sink device, the Media Renderer [9].

Metadata exchange in UPnP AV devices
The Media Server exposes its contents via the Content Directory service (CDS), which establishes and maintains a hierarchical structure of UPnP container objects. Each object carries multimedia content information and metadata such as the type, creator, date, and a locator for accessing it, and each CDS may have its own structure and set of objects. UPnP defines two requests for inspecting a CDS: Browse, which browses the CDS structure, and Search, which obtains the items that match specified search terms. Browse is mandatory in every CDS, whereas Search is optional in the UPnP AV specification. UPnP also defines two events, SystemUpdateID [10] and ContainerUpdateIDs [11], that indicate changes in the CDS. The UPnP AV architecture defines its control messages in XML using SOAP. SOAP works like a remote procedure call (RPC): the client sends a request to the server, and the server sends a response to the client. The architecture also standardizes HTTP as the transport for SOAP. SOAP over HTTP is used to achieve high interoperability when exchanging information in networks.
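To make the exchange concrete, the following minimal sketch builds the HTTP headers and SOAP body of a Browse request. The service type string and argument names follow the ContentDirectory:1 service as commonly deployed; the function name `build_browse_request` and its default values are illustrative assumptions, and nothing is sent over the network.

```python
from xml.sax.saxutils import escape

# Service type of the Content Directory service, version 1.
CDS_SERVICE_TYPE = "urn:schemas-upnp-org:service:ContentDirectory:1"
BROWSE_FLAGS = ("BrowseMetadata", "BrowseDirectChildren")


def build_browse_request(object_id="0", browse_flag="BrowseDirectChildren",
                         starting_index=0, requested_count=50,
                         filter_="*", sort_criteria=""):
    """Return (headers, body) for a CDS Browse action carried as SOAP over HTTP.

    The helper name and defaults are hypothetical; object_id "0" is the
    root container of a CDS.
    """
    if browse_flag not in BROWSE_FLAGS:
        raise ValueError(f"browse_flag must be one of {BROWSE_FLAGS}")
    body = (
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        "<s:Body>"
        f'<u:Browse xmlns:u="{CDS_SERVICE_TYPE}">'
        f"<ObjectID>{escape(object_id)}</ObjectID>"
        f"<BrowseFlag>{browse_flag}</BrowseFlag>"
        f"<Filter>{escape(filter_)}</Filter>"
        f"<StartingIndex>{int(starting_index)}</StartingIndex>"
        f"<RequestedCount>{int(requested_count)}</RequestedCount>"
        f"<SortCriteria>{escape(sort_criteria)}</SortCriteria>"
        "</u:Browse>"
        "</s:Body>"
        "</s:Envelope>"
    )
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        # The SOAPACTION value is the service type plus the action name, quoted.
        "SOAPACTION": f'"{CDS_SERVICE_TYPE}#Browse"',
    }
    return headers, body


headers, body = build_browse_request("0", requested_count=10)
```

A client would POST this body to the control URL that the Media Server advertises in its device description. Each such request costs one full round trip plus XML parsing on both sides, which is the overhead discussed in the following sections.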

DLNA
The aim of the DLNA is to enable end-to-end interoperability among digital multimedia devices that store, play, and share digital content, and to help put an end to the fragmentation of multimedia sharing standards for such devices. As a collaboration of the world's leading consumer electronics, PC, and mobile companies, the DLNA has created design guidelines for DLNA Certified products that can work together no matter what the brand is. The DLNA is based on the UPnP AV architecture and puts some constraints on the architecture's metadata exchange protocols to ensure interoperability among devices in local networks.

Challenges of multimedia metadata sharing solutions
Poor performance due to the round-trip latencies between servers and clients is a typical problem of network-based solutions. Because the additional protocols used to express the metadata further hurt the responsiveness of multimedia applications, multimedia sharing systems suffer from serious usability problems. Additionally, because multimedia metadata is distributed across several multimedia devices and the sharing protocols are subject to constraints that improve interoperability, the extensibility of metadata-based services is usually poor. For instance, if you want to sort multimedia content by geographic information, neither the UPnP AV architecture nor DLNA supports it. You would have to introduce your own proprietary protocol, which hurts interoperability.

Performance
A common and serious usability problem in networked multimedia sharing systems is the poor responsiveness of metadata acquisition. Generally, 100 to 200 ms is the threshold beyond which users perceive a lag in an application [12]. However, metadata acquisition in these systems causes long round-trip latencies, mainly because of the protocol exchange mechanism. As a result, interactive multimedia sharing applications have suffered and have inconvenienced their users. To avoid this problem, some solutions aggregate all the metadata in a controller device that has a user interface. In that case, the user usually suffers from a long warm-up time while the metadata is gathered, instead of the responsiveness problem discussed above.

Fig. 9. Comparison of the average latencies of UPnP AV Browse requests to obtain metadata [13]. The round-trip latencies of Browse requests with local MediaServers were definitely short.
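As a rough illustration of how quickly paged metadata requests exceed that threshold, the sketch below estimates the time to list one container when every page costs a full round trip. The item count, page size, and round-trip time are made-up numbers chosen only for the arithmetic; they are not measurements from this chapter.

```python
import math

# Upper bound of the 100 to 200 ms perception threshold cited in the text.
PERCEPTION_THRESHOLD_MS = 200


def browse_latency_ms(item_count, page_size, round_trip_ms):
    """Estimate the time to list a container using sequential paged requests.

    Even an empty container costs one request, hence the max(1, ...).
    """
    round_trips = max(1, math.ceil(item_count / page_size))
    return round_trips * round_trip_ms


# Hypothetical case: 1000 items, 50 items per request, 80 ms per round trip.
total_ms = browse_latency_ms(item_count=1000, page_size=50, round_trip_ms=80)
feels_laggy = total_ms > PERCEPTION_THRESHOLD_MS
```

Under these assumed values the listing takes 20 round trips, or 1600 ms, which is well beyond the point where a user perceives lag.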

Service extensibility problem
Another critical problem is the lack of additional metadata-based functions, such as searching and sorting by title, date, or content type, that give users convenient ways to find the multimedia content they want. When multimedia metadata is spread over several servers, these functions are difficult to implement. Moreover, they also cause high round-trip latencies.

Solutions
To address the challenges described above, namely poor responsiveness and limited service extensibility from the user's perspective, several solutions have been introduced: caching responses, prefetching based on historical and statistical information, aggressive link based prefetching, full prefetching, metadata database object acquisition, and the metadata aggregator-based centralized solution.

Caching
Caching the metadata in the client is one classical solution. The response to a user's metadata request is stored in a cache. Because the round-trip latency of a request that hits the cache is very short, the user gets a very fast response. Additionally, the cache hit rate tends to rise as the number of cache entries grows. However, the cache misses that occur during cache warm-up, known as cold-start or compulsory misses, are unavoidable, and a cache does not guarantee a 100% hit rate even after warm-up is complete. Moreover, caching can hardly meet users' needs for a variety of metadata-based functions, because the cache holds only part of the whole metadata.
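A minimal sketch of such a client-side cache is shown below. It assumes a least-recently-used replacement policy, which is one common choice rather than something the text prescribes, and a caller-supplied `fetch` function that stands in for the real network request. The class and attribute names are illustrative.

```python
from collections import OrderedDict


class MetadataCache:
    """Client-side response cache with LRU replacement (an assumed policy)."""

    def __init__(self, fetch, capacity=128):
        self._fetch = fetch          # performs the slow network round trip
        self._capacity = capacity    # maximum number of cached responses
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, request):
        """Return the response for a hashable request key."""
        if request in self._entries:
            self._entries.move_to_end(request)   # mark as most recently used
            self.hits += 1
            return self._entries[request]
        # Cache miss: pay the full round-trip latency once.
        self.misses += 1
        response = self._fetch(request)
        self._entries[request] = response
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)    # evict least recently used
        return response
```

The first request for any key is always a miss, which is the compulsory miss described above. With a small capacity, entries are evicted and later requested again, so the hit rate stays below 100% even after warm-up.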

Prefetching
Another solution is to prefetch the metadata responses to expected requests. The client sends requests derived from the user's prior requests and stores the responses in the cache before the user asks for them. Prefetching is needed to prevent cache misses, and it can dramatically improve the round-trip latency seen by the user whenever the response to a request is already in the cache. There are several kinds of prefetching algorithms: prefetching based on historical and statistical information, aggressive link based prefetching, and full prefetching before the user's first input.

Fig. 13. An example of metadata sharing systems with prefetching. Steps 5 to 9 obtain metadata for the user's future requests, according to historical and statistical information or the hierarchical structure of the contents, before the next request.

Historical and statistical information based prefetching
To increase the cache hit rate, we can predict the user's next requests from historical and statistical data, for example, from the kinds of multimedia content the user has chosen or the search keywords the user has entered. However, cache misses remain inevitable because a user's preferences can change at any time.
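One simple way to realize such a prediction is to count which request has most often followed each request in the past. The sketch below does exactly that; the text does not prescribe a particular predictor, so the class `HistoryPredictor`, the helper `prefetch_predicted`, and the use of a plain dictionary as the cache are all illustrative assumptions.

```python
from collections import Counter, defaultdict


class HistoryPredictor:
    """Predict likely next requests from observed request-to-request transitions."""

    def __init__(self):
        self._followers = defaultdict(Counter)
        self._previous = None

    def record(self, request):
        """Record a user request and the transition that led to it."""
        if self._previous is not None:
            self._followers[self._previous][request] += 1
        self._previous = request

    def predict(self, request, top_n=2):
        """Return up to top_n requests that most often followed `request`."""
        counts = self._followers.get(request, Counter())
        return [candidate for candidate, _ in counts.most_common(top_n)]


def prefetch_predicted(predictor, current_request, fetch, cache, top_n=2):
    """Fetch the predicted follow-up requests that are not cached yet."""
    for request in predictor.predict(current_request, top_n):
        if request not in cache:
            cache[request] = fetch(request)
```

A request the user has never made before has no history, so it still misses the cache, which matches the limitation noted above.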

Aggressive link based prefetching
Aggressive link based prefetching means that every link the user may select next is prefetched. Multimedia metadata is usually stored and expressed as a hierarchical structure. While the user navigates this structure, the metadata behind every link the user can follow is obtained before the user follows one of them. This ensures a cache hit rate of 100% at all times, except when the user's input arrives before the responses for the links have been acquired. However, it is not suitable when a link holds a large number of items or when there are many links the user can follow.
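The idea can be sketched as follows: whenever the user enters a container, the client also fetches the listing of every child container. The `browse` callable and the dictionary-based object format (`id`, `is_container`) are assumptions made for this example, and the prefetch runs synchronously here for clarity, whereas a real client would likely issue it in the background.

```python
def enter_container(container_id, browse, cache):
    """Return the listing of a container and prefetch all of its child containers.

    browse(container_id) is assumed to return a list of dicts with the keys
    "id" and "is_container"; cache is a plain dict keyed by container id.
    """
    if container_id not in cache:
        # Only happens for the first container, or when the user outruns the prefetch.
        cache[container_id] = browse(container_id)
    listing = cache[container_id]
    for child in listing:
        # Every child container is a link the user may follow next.
        if child["is_container"] and child["id"] not in cache:
            cache[child["id"]] = browse(child["id"])
    return listing
```

The number of extra requests per navigation step equals the number of child containers, which is why the approach degrades when a container has many links or when each link holds many items.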

Full prefetching
The third prefetching solution goes further to improve usability. The entire metadata stored in the servers is copied into the client through the metadata acquisition protocols, even if the initial setup takes a very long time. This establishes an illusion of the original metadata storage at the client. Once it has been established, users can access the storage with very low latency, and a variety of metadata-based functions can be supported as if the metadata storage were the real one. However, acquiring the whole metadata at initial setup may cause heavy protocol processing, long data access times, and heavy network traffic, because of the complicated low-level protocol exchange, the XML processing, and the retrieval of metadata from the database in the server and its transfer to the client.

Fig. 14. An example of metadata sharing systems with full prefetching. Steps 1 to 5 obtain the whole metadata in all the servers via metadata exchange protocols and establish the illusion of metadata storage. Then, a user's request refers to the illusion instead of the real servers.
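A minimal sketch of full prefetching is given below: the client walks every container on every server once, keeps all items in a local list, and then answers sort and search requests locally. The `browse(server, container_id)` callable, the object fields, and the helper names are assumptions for illustration only.

```python
def full_prefetch(servers, browse, root_id="0"):
    """Copy the metadata of every item on every server into a local list.

    browse(server, container_id) is assumed to return a list of dicts with
    "id", "is_container", and arbitrary metadata fields.
    """
    items = []
    for server in servers:
        pending = [root_id]
        while pending:
            container_id = pending.pop()
            for obj in browse(server, container_id):   # one round trip per container
                if obj["is_container"]:
                    pending.append(obj["id"])
                else:
                    items.append({**obj, "server": server})
    return items


def sort_items(items, field):
    """Sort the local copy by a metadata field; no network access is needed."""
    return sorted(items, key=lambda item: item[field])


def search_items(items, field, text):
    """Case-insensitive substring search over the local copy."""
    needle = text.lower()
    return [item for item in items if needle in str(item[field]).lower()]
```

All the cost is paid inside `full_prefetch`, which corresponds to the long initial setup time; afterwards sorting and searching across servers are ordinary local operations.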

Metadata database object acquisition
Establishing the illusion of metadata storage at the client is very effective for usability, apart from the very long initial setup time. Therefore, we propose a more aggressive solution that copies the metadata database object itself from the servers to the client instead of copying the metadata. If a client uses the same metadata database schema as the servers, we can copy the database object directly from the servers to the client, and the client can then access the database object without a complicated and redundant metadata acquisition method over the network.
A database object acquisition protocol is much simpler than a metadata acquisition protocol. For example, a typical multimedia metadata acquisition method uses HTTP and SOAP for the extensibility, interoperability, and flexibility of services, but these protocols are very complicated and cause very long round-trip latencies when exchanging metadata. To obtain the database object, in contrast, we can use FTP, which is a very simple and fast protocol. Additionally, since multiple database objects can be consulted to answer a user's request, various metadata-based functions can be implemented very easily.
However, this solution requires a unified interface to the metadata database across the servers and clients. Thus, unlike the other solutions, both the servers and the clients must be changed to apply this solution to your application.

Fig. 15. An example of metadata sharing systems with metadata database object acquisition. Steps 1 to 5 obtain the whole metadata in all servers via file transfer protocols and establish the illusion of metadata storage. Then, a user's request refers to the illusion instead of the real servers.
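The following sketch illustrates the idea with SQLite, chosen here only because it stores a whole database in a single file. The schema, the helper names, and the use of a local file copy as a stand-in for the FTP transfer are all assumptions made for this example; a real client would download each file from its server.

```python
import os
import shutil
import sqlite3
import tempfile

COLUMNS = ("title", "artist", "date")   # hypothetical shared schema


def make_server_db(path, rows):
    """Create a toy server-side metadata database using the shared schema."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (title TEXT, artist TEXT, date TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()
    return path


def acquire_database(remote_path, local_path):
    """Stand-in for the file transfer; a real client would fetch the file over FTP."""
    shutil.copyfile(remote_path, local_path)
    return local_path


def query_all(local_paths, order_by="date"):
    """Answer one sorted query across all acquired database objects."""
    if order_by not in COLUMNS:
        raise ValueError(f"order_by must be one of {COLUMNS}")
    conn = sqlite3.connect(":memory:")
    try:
        selects = []
        for index, path in enumerate(local_paths):
            alias = f"server{index}"
            conn.execute(f"ATTACH DATABASE ? AS {alias}", (path,))
            selects.append(f"SELECT title, artist, date FROM {alias}.items")
        sql = " UNION ALL ".join(selects) + f" ORDER BY {order_by}"
        return conn.execute(sql).fetchall()
    finally:
        conn.close()
```

Because every file shares one schema, a cross-server sort becomes a single local query. Note that SQLite limits the number of attached databases (10 by default), so a client with many servers would need to merge the copies instead of attaching them all.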

Metadata aggregator-based centralized solution
To obtain metadata in a distributed system, a client must access all the servers, which causes the service extensibility problem. For example, if a user wants to get items ordered by date, the client must send messages to all the servers to obtain the metadata, wait for the replies, and then sort them. If the whole metadata is aggregated in one server, a client needs to access only that server instead of all the servers. The clients can also use a caching or prefetching mechanism. The aggregator is suited to running on a non-mobile device that works with many metadata servers. If there are only a few servers, applying this solution brings little benefit.

Fig. 17. An example of metadata sharing systems with a metadata aggregator. Steps 1 to 5 obtain the whole metadata from all the servers and fill the metadata aggregator with it. Then, a client accesses the aggregator instead of all the servers in the network.
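The aggregator's role can be sketched as a small class that gathers the metadata from all servers once and then answers sorted queries from a single place. The class name, the `fetch_all(server)` callable, and the item format are illustrative assumptions, not part of any UPnP or DLNA interface.

```python
class MetadataAggregator:
    """Central node that gathers metadata from all servers and answers clients."""

    def __init__(self, servers, fetch_all):
        self._servers = list(servers)
        self._fetch_all = fetch_all   # fetch_all(server) -> list of item dicts
        self._items = []

    def warm_up(self):
        """Gather the whole metadata from every server (the slow step)."""
        self._items = []
        for server in self._servers:
            for item in self._fetch_all(server):
                self._items.append({**item, "server": server})

    def sorted_by(self, field, reverse=False):
        """Answer a client's sort request from the aggregated metadata."""
        return sorted(self._items, key=lambda item: item[field], reverse=reverse)
```

After `warm_up` completes, a client that wants items ordered by date sends one request to the aggregator rather than one request per server followed by a client-side merge.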

Comparison of the solutions
None of the solutions described above resolves both the performance problem and the service extensibility problem of metadata sharing in multimedia sharing systems.

Conclusion
In this chapter, we have presented the typical problems of metadata sharing. The latency problem and the service extensibility problem are typical and crucial in industrial solutions, but they are very difficult to resolve. To address them, caching responses, prefetching based on historical and statistical information, aggressive link based prefetching, full prefetching, metadata database object acquisition, and the metadata aggregator-based centralized solution have been introduced. Each of these solutions has its own pros, cons, and limitations, so you should understand them and work out which solutions can be applied to your application. Above all, the most important factor to consider when choosing a solution is how well it satisfies the users of your application.

15. It depends on the cache algorithm, the number of cache entries, and the cache replacement algorithm.
16. When a cache miss happens, a long round-trip time is required.
17. It depends on the number of links, the duration between a user's two inputs, and the speed of the servers' responses.
18. The metadata database object acquisition protocol must be implemented on the servers and clients. Also, the clients must be able to determine whether the servers support that protocol.
19. It takes a long time for the aggregator to gather the whole metadata from the servers in the network. After that, the clients can use the metadata immediately.