Traditionally, approaches to content-based image retrieval (CBIR) have adopted one of two directions. In the first, image contents are modeled as a set of image attributes managed within the framework of conventional database management systems. Such systems require considerable manual effort and offer limited support for ad hoc queries and similarity-based retrieval. The second approach depends on an integrated feature-extraction/object-recognition subsystem to overcome the limitations of attribute-based retrieval. However, such automated approaches to feature extraction and object recognition are computationally expensive and difficult, and they tend to be domain specific.
More recently, the CBIR research community has recognized the need for synergy between these two approaches. Toward this goal, efforts draw upon ideas from several related fields, such as knowledge-based systems, user modeling, data mining, and information retrieval. Generally speaking, two categories of features may be identified: primitive and logical. Primitive, or low-level, features can usually be extracted automatically. Logical features (e.g., snow-covered mountains) are more abstract representations of images and denote the deeper domain semantics manifested in the images. Some logical features may be synthesized from primitive features, whereas others can only be obtained through considerable human involvement.
Given the advances in various kinds of sensor technologies, images are acquired at an ever-increasing rate, and retrieval capabilities are needed that can scale to very large image collections. Additionally, advances in knowledge-based systems and data mining should be exploited to achieve automatic indexing at the level of logical features, whereby logical features can be the basis for user interactions with the system. Specifically, CBIR systems should be capable of dynamically computing the required primitive features and then synthesizing the logical features from them, both under the guidance of a domain expert.
This presentation discusses some recent efforts in building sophisticated CBIR systems that address scalability issues in both the retrieval and indexing stages. For illustrative purposes, color information in images is chosen as the primitive feature set. Two approaches for achieving retrieval efficiency are provided: i) feature space transformation and image clustering, and ii) novel forms of retrieval functions.
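To make the first approach concrete, the following is a minimal sketch of cluster-pruned retrieval over color features. The presentation does not specify the feature space transformation, the clustering algorithm, or the similarity measure, so everything here is an assumption for illustration: a coarsely quantized RGB histogram as the primitive feature, clusters that are already assigned, the mean histogram as cluster centroid, and histogram intersection as the similarity. The function names (`color_histogram`, `centroid`, `retrieve`) are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels (0-255 per channel) into a normalized histogram."""
    if not pixels:
        raise ValueError("an image must contain at least one pixel")
    n = bins_per_channel
    counts = [0] * (n ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in the range 0..n-1.
        idx = ((r * n // 256) * n + (g * n // 256)) * n + (b * n // 256)
        counts[idx] += 1
    return [c / len(pixels) for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def centroid(histograms):
    """Mean histogram of a cluster (assumed cluster representative)."""
    return [sum(column) / len(histograms) for column in zip(*histograms)]


def retrieve(query_hist, clusters, top_k=3):
    """Rank only the images in the cluster whose centroid best matches the query.

    clusters: list of (centroid_hist, [(image_id, hist), ...]) pairs.
    """
    _, members = max(
        clusters, key=lambda c: histogram_intersection(query_hist, c[0])
    )
    ranked = sorted(
        members,
        key=lambda item: histogram_intersection(query_hist, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:top_k]]
```

The efficiency gain in this sketch comes from comparing the query against one centroid per cluster instead of every image, at the cost of missing relevant images that fall in other clusters.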
To achieve effective indexing at the logical level, a learning method called Kernel Rocchio is introduced in the context of the second approach; it computes optimal linear decision functions. It is shown that, with this adaptive capability, the system can implicitly incorporate dependencies among color features, which results in efficient indexing of images. These strategies are expected to apply to other primitive feature sets as well (e.g., shape or texture). Continued work along these lines should lead to sophisticated CBIR systems that retrieve images at the level of logical concepts and scale up to much larger image collections than earlier approaches allowed.
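The formulation of Kernel Rocchio is not given here, so the sketch below should not be read as the presented method. It shows a generic kernelized Rocchio-style (nearest-centroid) classifier, which is one plausible reading of the name: the decision function is linear in the kernel-induced feature space and compares a query's average kernel similarity to positive and negative example images. The RBF kernel, the `gamma` value, and the function names are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed choice)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs; equals <centroid(xs), centroid(ys)>."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def train_kernel_rocchio(positives, negatives, kernel=rbf_kernel):
    """Return a decision function f(x); f(x) > 0 means 'matches the concept'.

    The bias places the boundary midway between the two class centroids
    in the kernel-induced feature space.
    """
    bias = 0.5 * (
        mean_kernel(positives, positives, kernel)
        - mean_kernel(negatives, negatives, kernel)
    )

    def decision(x):
        pos = sum(kernel(x, p) for p in positives) / len(positives)
        neg = sum(kernel(x, n) for n in negatives) / len(negatives)
        return pos - neg - bias

    return decision
```

Because a nonlinear kernel compares whole feature vectors rather than individual bins, a classifier of this kind can respond to combinations of color features, which is one way dependencies among features could be captured implicitly.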