GustANN is a high-throughput, billion-scale, graph-based vector store built for GPUs. It is based on our SIGMOD '26 paper: *High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU*.

- High Throughput: Achieves ~250K QPS on billion-scale datasets (SIFT1B, top-10, recall=0.9), 7.81x faster than DiskANN.
- Memory Efficient: Requires only ~40GB of memory for both GPU and CPU on billion-scale datasets.
- Flexible Interface: Supports multiple search modes (SSD-based, all-in-DRAM, all-in-GPU) and storage backends (SPDK, liburing, libaio).
> [!TIP]
> For convenience, we highly recommend first editing `scripts/setup.sh` to specify your file paths and save locations, and then using the provided automated scripts.
> If you prefer to execute the commands step-by-step manually (or need deeper customization), please refer to our guide.
You can quickly set up a 1M-vector database to try GustANN using our automated script:

```shell
./scripts/quick_start.sh
```

To ensure optimal performance, please verify that your hardware meets the following requirements:
- CPU: x86 CPU supporting huge pages (verify with `grep pdpe1gb /proc/cpuinfo`).
- System RAM (DRAM): ~40GB minimum for vector search. (Note: building the index requires additional memory.)
- GPU: ~40GB VRAM required for billion-scale search (e.g., NVIDIA A100).
- Dataset Constraints: fewer than 2 billion vectors (to avoid integer overflow); the record size (`vector_size + 4 + 4 * num_neighbors`) must be < 4KB.
- Storage (SSD): ~700GB for SIFT1B or ~1TB for DEEP1B. Multi-SSD configurations are supported.
- Recommendation: Use SPDK for maximum performance. Alternatively, use io_uring, aio, or in-memory indexing.
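The record-size constraint above is easy to check before building anything. The sketch below is a hypothetical helper, not part of GustANN; it only evaluates the formula from the constraint list:

```shell
# Hypothetical helper (not part of GustANN): checks whether a dataset's
# per-record size (vector_size + 4 + 4 * num_neighbors) stays under 4KB.
record_fits() {
  local vector_size=$1 num_neighbors=$2
  local record=$(( vector_size + 4 + 4 * num_neighbors ))
  if (( record < 4096 )); then
    echo "OK ($record bytes)"
  else
    echo "TOO LARGE ($record bytes)"
  fi
}

record_fits 128 64    # e.g., 128-byte uint8 vectors with 64 neighbors
record_fits 3072 300  # e.g., 768-dim float32 vectors with a very dense graph
```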
GustANN relies on DiskANN for index building. Install the following system dependencies (Ubuntu 22.04):

```shell
sudo apt update
sudo apt install make cmake g++ libaio-dev libgoogle-perftools-dev clang-format libboost-all-dev libmkl-full-dev libjemalloc-dev
```

> [!NOTE]
> You must also install CUDA following NVIDIA's official instructions.
```shell
git clone https://github.com/thustorage/GustANN.git --recursive
cd GustANN
mkdir -p build && cd build
cmake .. # Use flags here for specific backends (see below)
make -j
cd ..
```

Note: To build with a specific storage backend, append the corresponding switch to the CMake command: `-DCMAKE_USE_{SPDK,URING,AIO}=ON`.
> [!NOTE]
> If you already have a built index, you can skip this step. For complete dataset-preparation instructions from scratch, refer to the PipeANN repository.
First, compile DiskANN:
```shell
cd deps/DiskANN
mkdir build && cd build
cmake .. && make -j
cd ../../..
```

To build a DiskANN index, you need to prepare your dataset in `bin` format. DiskANN provides a utility to convert from bvec/fvec formats (e.g., the format used by the SIFT dataset):

```shell
./deps/DiskANN/build/apps/utils/fvecs_to_bin <float/uint8> input_vecs output_bin
```

Once the dataset is converted, update `scripts/setup.sh` with your paths, and run:

```shell
./scripts/build_disann_index.sh <pq_size> <memory>
```

- `pq_size`: 3.3 for 100M-scale datasets, 33 for 1B-scale datasets (generates 32-bit PQ vectors).
- `memory`: Maximum memory available for building the index.
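The fvecs/bvecs inputs accepted by the converter follow the standard TEXMEX layout: each record is an int32 dimension followed by the vector components. A small self-contained way to see that layout (the toy file and its path are illustrative, not GustANN assets):

```shell
# Build a tiny 2-vector fvecs file, then read back the leading dimension field.
python3 - <<'EOF'
import struct
with open('/tmp/toy.fvecs', 'wb') as f:
    for v in [[1.0, 2.0], [3.0, 4.0]]:
        f.write(struct.pack('<i', len(v)))       # int32 dimension
        f.write(struct.pack(f'<{len(v)}f', *v))  # dim float32 values
EOF
# The first 4 bytes of each record hold the dimension (here: 2).
od -An -t d4 -N 4 /tmp/toy.fvecs | tr -d ' '
```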
In addition to the DiskANN index, GustANN requires building a pivot graph. Update scripts/setup.sh, then run:
```shell
./scripts/gen_pivot_graph.sh
```

For smaller datasets, keeping data in DRAM or GPU memory yields the best performance. After updating `scripts/setup.sh`, run:

```shell
./scripts/run_mem.sh --topk <topk> --ef_search <L> [<L2> ...] -R <R> [-G]
```

- `-G`: Enables pure GPU search (fastest, but limited to small datasets).
- `-L` (`--ef_search`): Number of vectors stored during search (higher = more accurate).
- `-R`: Repeat the query `R` times for accurate benchmarking on small query sets.
You can run GustANN using SPDK (fastest), liburing, or libaio.
> [!WARNING]
> SPDK requires root privileges. There must be NO partitions or filesystems on the target SSDs. Using `nvme format` WILL ERASE ALL DATA ON THE DISK. Do this at your own risk!
- Build SPDK & GustANN:

  ```shell
  git clone https://github.com/spdk/spdk deps/spdk # Note: We have tested GustANN on git commit 7c0720d1d
  cd deps/spdk
  sudo scripts/pkgdep.sh # Install the dependencies of SPDK
  ./configure && make -j
  cd ../..
  # Remember to rebuild GustANN with SPDK enabled:
  # cd build && cmake .. -DCMAKE_USE_SPDK=ON && make -j && cd ..
  ```

- Setup SPDK:

  ```shell
  sudo ./deps/spdk/scripts/setup.sh
  sudo ./deps/spdk/build/examples/hello_world # Verify it works
  ```

- Prepare SSD List & Write Index: Collect the PCIe addresses of your SSDs into a file (`ssd_list.txt`), update `scripts/setup.sh`, and run:

  ```shell
  sudo ./scripts/write_spdk.sh
  ```
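For reference, a plausible `ssd_list.txt` layout is one NVMe PCIe address per line; the addresses below are illustrative (find yours with `lspci | grep -i nvme`), and the exact format expected by `scripts/setup.sh` may differ:

```
0000:3b:00.0
0000:5e:00.0
```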
- liburing:

  ```shell
  git clone https://github.com/axboe/liburing.git deps/liburing # Note: We have tested on commit 20b3fe67
  cd deps/liburing
  ./configure && make -j
  cd ../..
  # Remember to rebuild GustANN with:
  # cd build && cmake .. -DCMAKE_USE_URING=ON && make -j && cd ..
  ```

- libaio: Ensure your Linux kernel supports libaio. Rebuild GustANN with `-DCMAKE_USE_AIO=ON`.
Update scripts/setup.sh, then run the appropriate script for your backend:
```shell
sudo ./scripts/run_spdk.sh --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R> # for SPDK
./scripts/run_uring.sh --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R>     # for liburing
./scripts/run_aio.sh --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R>       # for libaio
./scripts/run.sh --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R>           # for in-memory fallback testing
```

> [!IMPORTANT]
> Crucial tuning parameters (`B`, `T`, `C`): different I/O backends favor different worker configurations.
>
> - SPDK: `-T 2` (worker threads), `-C 20` (minibatches per thread), `-B >= 1000` (minibatch size)
> - Uring / Memory: `-T 20`, `-C 1`, `-B >= 1000`
> - AIO: `-T 2`, `-C 10`, `-B 256` (setting `B` too large may cause AIO to crash!)
After execution, the runtime, total SSD I/Os, and recall metrics will be printed to stdout.
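The recall reported is the standard recall@k: the fraction of the k true nearest neighbors that appear in the returned top-k list. A small standalone sketch of that computation (the `recall_at_k` helper is illustrative, not a GustANN function):

```shell
# Illustrative recall@k: |retrieved ∩ ground_truth| / k, on space-separated
# ID lists. Helper is not part of GustANN.
recall_at_k() {
  local retrieved=$1 truth=$2 k=$3
  local hits
  # comm -12 prints lines common to both (sorted) inputs.
  hits=$(comm -12 <(tr ' ' '\n' <<<"$retrieved" | sort) \
                  <(tr ' ' '\n' <<<"$truth" | sort) | wc -l)
  awk -v h="$hits" -v k="$k" 'BEGIN { printf "%.2f\n", h / k }'
}

recall_at_k "4 9 12 31 77" "4 12 31 50 88" 5   # 3 hits out of 5
```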
- GPU Direct-IO Support: See bam.md for experimental GPU Direct Storage setup.
If you find GustANN useful in your research, please cite our SIGMOD '26 paper:
```bibtex
@inproceedings{sigmod26gustann,
  author    = {Haodi Jiang and Hao Guo and Minhui Xie and Jiwu Shu and Youyou Lu},
  title     = {{High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU}},
  year      = {2026},
  publisher = {Association for Computing Machinery},
  booktitle = {Proceedings of the 2026 International Conference on Management of Data},
  address   = {Bengaluru, India},
  series    = {SIGMOD '26}
}
```

Some GPU kernel implementations are adapted from CuHNSW. We greatly appreciate their open-source contributions.