Info

ANN-Benchmarks is a benchmarking environment for approximate nearest neighbor search algorithms. This website contains the current benchmarking results. Please visit http://github.com/erikbern/ann-benchmarks/ for an overview of the evaluated datasets and algorithms. Make a pull request on GitHub to add your own code or improvements to the benchmarking system.

Benchmarking Results

Results are split by distance measure and dataset. At the bottom, you can find an overview of each algorithm's performance on all datasets. Each dataset is annotated with (k = ...), the number of nearest neighbors an algorithm was supposed to return. Each plot shows recall (the fraction of true nearest neighbors found, averaged over all queries) against queries per second. Clicking on a plot reveals detailed interactive plots, including approximate recall, index size, and build time.
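The two quantities on the plot axes can be illustrated with a minimal Python sketch. It is not the code used by the benchmarking system: the function names (exact_neighbors, recall, evaluate) and the toy data are assumptions made for illustration, and recall is computed here by comparing neighbor indices against an exhaustive search, which may differ in detail from how the benchmark scores results.

```python
import math
import time


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_neighbors(data, query, k):
    """Indices of the k true nearest neighbors, found by exhaustive search."""
    order = sorted(range(len(data)), key=lambda i: euclidean(data[i], query))
    return order[:k]


def recall(true_ids, returned_ids):
    """Fraction of the true nearest neighbors present in the returned list."""
    return len(set(true_ids) & set(returned_ids)) / len(true_ids)


def evaluate(search, data, queries, k):
    """Return (mean recall, queries per second) for a search callable.

    `search(query, k)` is a hypothetical interface: it takes one query
    vector and returns a list of k candidate indices into `data`.
    """
    # Ground truth is computed outside the timed region.
    truth = [exact_neighbors(data, q, k) for q in queries]

    start = time.perf_counter()
    results = [search(q, k) for q in queries]
    elapsed = time.perf_counter() - start

    mean_recall = sum(recall(t, r) for t, r in zip(truth, results)) / len(queries)
    qps = len(queries) / elapsed if elapsed > 0 else float("inf")
    return mean_recall, qps


if __name__ == "__main__":
    # Toy data: five points on a line, two queries, k = 2.
    data = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    queries = [[0.1], [3.9]]

    # A deliberately poor "index" that always answers with the first two points.
    mean_recall, qps = evaluate(lambda q, k: [0, 1], data, queries, k=2)
    print(f"recall={mean_recall:.2f}, queries per second={qps:.0f}")
```

An algorithm that trades accuracy for speed moves along this curve: a parameter setting that inspects fewer candidates typically raises queries per second and lowers recall.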

The standard evaluation is done on a single CPU. At the bottom of this page, you will find results for batched queries.

Machine Details

All experiments were run in Docker containers on Amazon EC2 c5.4xlarge instances equipped with an Intel Xeon Platinum 8124M CPU (16 cores available, 3.00 GHz, 25.0 MB cache) and 32 GB of RAM. For each parameter setting and dataset, the process was given five hours to build the index and answer the queries.

Raw Data & Configuration

Please find the raw experimental data here (link follows). The query set is available with the datasets; see the paper for a description. The algorithms used the following parameter choices in the experiments.

Benchmarks for Single Queries

Results by Dataset

Distance: Angular

glove-100-angular (k = 10)


glove-100-angular (k = 100)


glove-25-angular (k = 10)


nytimes-256-angular (k = 10)


nytimes-256-angular (k = 100)


Distance: Euclidean

fashion-mnist-784-euclidean (k = 10)


fashion-mnist-784-euclidean (k = 100)


gist-960-euclidean (k = 10)


gist-960-euclidean (k = 100)


random-10nn-euclidean (k = 10)


sift-128-euclidean (k = 10)


sift-128-euclidean (k = 100)


Distance: Hamming

sift-256-hamming (k = 10)


word2bits-800-hamming (k = 10)


Results by Algorithm

rpforest


SW-graph(nmslib)


flann


hnswlib


mrpt


MP-lsh(lshkit)


BallTree(nmslib)


annoy-eucl


mih


annoy


hnsw(nmslib)


kd


bruteforce-blas


NGT-onng


hnsw(faiss)


pynndescent


faiss-ivf


NGT-panng


kgraph


Benchmarks for Batched Queries

Results by Dataset

Distance: Euclidean

sift-128-euclidean (k = 10)


Results by Algorithm

faiss-ivf-batch


faiss-gpu-batch


faiss-gpu-bf-batch


hnsw(nmslib)-batch


Contact

ANN-Benchmarks has been developed by Martin Aumueller (maau@itu.dk), Erik Bernhardsson (mail@erikbern.com), and Alec Faitfull (alef@itu.dk). Please use GitHub to submit your implementation or improvements.