Hash Techniques

Nearest neighbor (NN) search or similarity search is one of the most fundamental problems in computer vision and machine learning and has a wide range of practical applications, e.g., image and video retrieval, object recognition, image classification, image matching, human pose estimation, and etc. With the explosive growth of large-scale multimodal database containing millions or even billions of data points together with the curse of dimensionality of feature descriptors (e.g., SIFT, SURF, and BoW), the naive approach of exhaustive search using linear scan becomes infeasible due to the linear time complexity and excessive memory requirement.

To address these issues, approximate nearest neighbor (ANN) search is commonly adopted to reduce the search complexity. Among many techniques for ANN search, hashing has been widely used for efficient similarity search due to two folds: 1) fast computation of the similarity measure on the basis of Hamming distance, 2) compact representation for large-scale datasets. Essentially, hash techniques aim to construct a set of hash functions that map the high-dimensional data space into more compact space while capturing and preserving the semantics and similarity of data. This task is particularly challenging due the rapid increase in data dimensionality, modality, and scale. Besides, these large-scale data in real world often possess and exhibit complex semantic structures, complicated interrelationship among multimodality.

In this project, we aim to study the efficient hash techniques by investigating data saliency and inter-dependency of all sources, individually and collectively, to semantically mash-up all available information for effective binary code learning. We will consider a variety of data modelings (unsupervised and supervised) in the design to discover and capture both global and local discriminant structures in these complex data. More importantly, the powerful deep convolution network will be exploited to discover the hidden non-linear structures and features at different levels of data abstractions, in which the conventional hash techniques (e.g. linear-based and heuristic kernel-based approaches) often fall short. These newly developed hash techniques will play an important role to address other challenging vision tasks  in our grand research project.