WhittleSearch: Interactive image search with relative attribute feedback

Background

In image search, a user often has a mental picture of the desired content. For example, a shopper may want to retrieve images of clothing in a certain style (sportier, more formal, etc.), which goes beyond what the usual one-shot query can express (see Figure 1). Keywords alone are not enough, even if all images were tagged to enable keyword search, because it is infeasible to pre-assign tags sufficient to satisfy any future query a user may dream up. For many search tasks, vision algorithms are needed to parse the content of images further.

Advances in image descriptors, learning algorithms, and large-scale indexing have all improved image search in recent years. However, existing image search methods provide only a narrow channel of feedback to the system: classic relevance feedback consists of binary relevance/irrelevance statements about selected examples. As a result, the “semantic gap” between low-level visual cues and the high-level intent of the user remains. WhittleSearch introduces a novel mode of feedback in which a user describes how high-level properties of exemplar images should be adjusted to more closely match the envisioned target images.


Technical description

Researchers at The University of Texas at Austin have developed an algorithm that widens human-machine communication for interactive image search by allowing users to communicate their preferences precisely and efficiently through visual comparisons. This relative feedback approach, called WhittleSearch, allows users to “whittle away” irrelevant portions of the visual feature space via precise, intuitive statements of their attribute preferences.
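
The whittling idea above can be illustrated with a minimal Python sketch. It is not the researchers' implementation: it assumes every image already has a real-valued strength for each attribute (the `scores` table, the image names, and the `whittle` function are all hypothetical), and it ranks images by how many "more"/"less" feedback statements they satisfy.

```python
from typing import Dict, List, Tuple

# Hypothetical attribute strengths per image (higher = more of the attribute).
Scores = Dict[str, Dict[str, float]]
# One feedback statement: (reference image id, attribute, "more" or "less").
Feedback = Tuple[str, str, str]


def satisfies(scores: Scores, candidate: str, fb: Feedback) -> bool:
    """Return True if `candidate` agrees with one relative feedback statement."""
    ref, attr, direction = fb
    if direction not in ("more", "less"):
        raise ValueError(f"unknown direction: {direction!r}")
    c, r = scores[candidate][attr], scores[ref][attr]
    return c > r if direction == "more" else c < r


def whittle(scores: Scores, feedback: List[Feedback]) -> List[str]:
    """Rank images by the number of feedback statements they satisfy."""
    def count(img: str) -> int:
        return sum(satisfies(scores, img, fb) for fb in feedback)
    # Most constraints satisfied first; ties are broken by image id.
    return sorted(scores, key=lambda img: (-count(img), img))


# Illustrative data only: made-up strengths for four shoe images.
scores = {
    "shoe_a": {"sporty": 0.2, "formal": 0.9},
    "shoe_b": {"sporty": 0.5, "formal": 0.5},
    "shoe_c": {"sporty": 0.8, "formal": 0.3},
    "shoe_d": {"sporty": 0.9, "formal": 0.1},
}
# "Sportier than shoe_b, but less sporty than shoe_d."
feedback = [("shoe_b", "sporty", "more"), ("shoe_d", "sporty", "less")]
ranking = whittle(scores, feedback)
print(ranking)
```

In this toy example only shoe_c satisfies both statements, so it moves to the top of the ranking; each further statement shrinks the set of images that agree with all feedback so far.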

WhittleSearch uses binary search trees on attributes to guide relevance feedback, leading the user through a coarse-to-fine search based on relative attributes. Two variants of WhittleSearch are shown in Figure 2. The “active variant” provides efficiency both for the system (which analyzes a small number of candidates per iteration) and for the user (who locates content via a small number of well-chosen interactions).
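
The coarse-to-fine search described above can be sketched for a single attribute as follows. This is a simplified illustration under stated assumptions, not the WhittleSearch algorithm itself: images are assumed to be pre-sorted by one attribute's strength, the user is assumed to answer consistently, and the selection of which attribute to ask about (the information gain model) is omitted. The names `pivot_search`, `respond`, and the image ids are hypothetical.

```python
from typing import Callable, List


def pivot_search(ranked: List[str],
                 respond: Callable[[str], str],
                 max_rounds: int = 10) -> List[str]:
    """Narrow a list of images, sorted ascending by one attribute's strength.

    `respond(pivot)` stands in for the user and returns "more", "less",
    or "equal": how the target compares with the pivot on this attribute.
    """
    lo, hi = 0, len(ranked) - 1
    for _ in range(max_rounds):
        if lo >= hi:
            break
        mid = (lo + hi) // 2          # pivot: median of the remaining range
        answer = respond(ranked[mid])
        if answer == "more":
            lo = mid + 1              # discard the pivot and everything below
        elif answer == "less":
            hi = mid - 1              # discard the pivot and everything above
        else:
            lo = hi = mid             # "equal": the pivot matches the target
    # Remaining candidates (empty if the answers were inconsistent).
    return ranked[lo:hi + 1]


# Illustrative data only: eight images with made-up attribute strengths.
strengths = {f"img{i}": i / 10 for i in range(8)}
ranked = sorted(strengths, key=strengths.get)
target = "img5"


def respond(pivot: str) -> str:
    """Simulated user who knows the target's attribute strength."""
    if strengths[target] > strengths[pivot]:
        return "more"
    if strengths[target] < strengths[pivot]:
        return "less"
    return "equal"


print(pivot_search(ranked, respond))
```

Each answer discards roughly half of the remaining candidates, which is why only the pivot image, rather than every database image, needs to be examined per iteration in this sketch.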


Benefits

  • Compared with traditional binary relevance feedback, WhittleSearch’s attribute feedback refines search results more effectively, often with less total user interaction.
  • Binary search trees on attributes avoid a naïve scan through all database images at each iteration; a simpler binary search tree baseline that lacks the information gain prediction model serves as a point of comparison.
  • Ideal for image search but also more broadly relevant to other information retrieval tasks, including document search.