Distance Encoded Product Quantization for Approximate K-Nearest Neighbor Search in High-Dimensional Space
| Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 41(2019), no. 9, 12 Sept., pages 2084-2097 |
|---|---|
| Main author: | |
| Other authors: | , |
| Format: | Online article |
| Language: | English |
| Published: | 2019 |
| Access to the parent work: | IEEE transactions on pattern analysis and machine intelligence |
| Subjects: | Journal Article; Research Support, Non-U.S. Gov't |
| Summary: | Approximate K-nearest neighbor search is a fundamental problem in computer science. The problem is especially important for high-dimensional and large-scale data. Recently, many techniques encoding high-dimensional data to compact codes have been proposed. The product quantization and its variations that encode the cluster index in each subspace have been shown to provide impressive accuracy. In this paper, we explore a simple question: is it best to use all the bit-budget for encoding a cluster index? We have found that as data points are located farther away from the cluster centers, the error of estimated distance becomes larger. To address this issue, we propose a novel compact code representation that encodes both the cluster index and quantized distance between a point and its cluster center in each subspace by distributing the bit-budget. We also propose two distance estimators tailored to our representation. We further extend our method to encode global residual distances in the original space. We have evaluated our proposed methods on benchmarks consisting of GIST, VLAD, and CNN features. Our extensive experiments show that the proposed methods significantly and consistently improve the search accuracy over other tested techniques. This result is achieved mainly because our methods accurately estimate distances |
| Description: | Date Completed 11.09.2019; Date Revised 11.09.2019; published: Print-Electronic; Citation Status PubMed-not-MEDLINE |
| ISSN: | 1939-3539 |
| DOI: | 10.1109/TPAMI.2018.2853161 |
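
The summary describes a code that stores, for each subspace, both a cluster index and a quantized distance from the point to its cluster center. The following is a minimal sketch of that encoding idea only, not the authors' implementation. The function names, the fixed codebooks, and the use of one shared set of distance-bin boundaries per subspace (`edges`) are all assumptions made for illustration; the record does not say how the centers or the distance bins are learned, and the two distance estimators mentioned in the summary are not sketched here.

```python
import numpy as np

def split_subspaces(X, m):
    """Split (n, d) data into m equal-width subspaces, shape (m, n, d // m)."""
    n, d = X.shape
    assert d % m == 0, "dimension must be divisible by the number of subspaces"
    return X.reshape(n, m, d // m).transpose(1, 0, 2)

def encode(X, codebooks, edges):
    """Hypothetical encoder: per-subspace cluster index plus distance bin.

    codebooks: (m, k, d // m) cluster centers per subspace (assumed given).
    edges:     (m, b - 1) ascending distance-bin boundaries per subspace
               (an assumption; shared by all clusters of a subspace).
    Returns two (n, m) integer arrays: cluster indices and distance bins.
    """
    m = codebooks.shape[0]
    n = X.shape[0]
    subs = split_subspaces(X, m)
    cluster_ids = np.empty((n, m), dtype=np.int64)
    dist_bins = np.empty((n, m), dtype=np.int64)
    for j in range(m):
        # Euclidean distance from every point to every center: (n, k)
        diff = subs[j][:, None, :] - codebooks[j][None, :, :]
        dist = np.linalg.norm(diff, axis=2)
        cid = dist.argmin(axis=1)
        cluster_ids[:, j] = cid
        # Quantize the distance to the chosen center into one of b bins.
        nearest = dist[np.arange(n), cid]
        dist_bins[:, j] = np.searchsorted(edges[j], nearest)
    return cluster_ids, dist_bins

# Toy data: 4-D points, m = 2 subspaces, k = 2 centers, b = 2 distance bins.
codebooks = np.array([[[0.0, 0.0], [10.0, 10.0]],
                      [[0.0, 0.0], [10.0, 10.0]]])
edges = np.array([[1.0], [1.0]])
X = np.array([[0.1, 0.0, 10.0, 13.0],
              [10.0, 10.5, 0.0, 0.2]])
ids, bins = encode(X, codebooks, edges)
```

With k centers and b distance bins, each subspace costs log2(k) + log2(b) bits, so a fixed budget is split between the two parts. For example, an 8-bit budget per subspace could hold 128 centers and 2 distance bins instead of 256 centers.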