GeistHaus

Continuous recall measurement

turbopuffer.com

Describes how turbopuffer continuously measures the recall (accuracy) of its vector indexes in production. This helps ensure that turbopuffer's search results stay accurate and reliable, even though it uses approximate nearest-neighbour algorithms to speed up queries.
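For readers new to the term, recall@k is usually computed by comparing the ids returned by the approximate index against the true top-k from an exhaustive search. The sketch below illustrates that general idea only; it is not turbopuffer's implementation. The function names, the `ann_search` callback, the squared-Euclidean distance, and the `sample_rate` parameter are all assumptions made for illustration.

```python
import random


def exact_top_k(query, vectors, k):
    """Exhaustive search: ids of the k vectors closest to query (squared Euclidean)."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(query, v))
    return sorted(range(len(vectors)), key=lambda i: dist(vectors[i]))[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def sample_recall(queries, vectors, ann_search, k=10, sample_rate=0.01, rng=random):
    """Average recall@k over a random sample of queries (hypothetical helper).

    ann_search(query, k) stands in for whatever approximate index is being checked.
    Returns None if no query happened to be sampled.
    """
    scores = [
        recall_at_k(ann_search(q, k), exact_top_k(q, vectors, k))
        for q in queries
        if rng.random() < sample_rate
    ]
    return sum(scores) / len(scores) if scores else None
```

Sampling only a fraction of queries keeps the cost of the exhaustive comparison bounded, which is one plausible way to make such a check cheap enough to run all the time.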

1 page links to this URL
Document search using Claude and an inverted index.

Soooo vector databases are pretty popular right now for document search, but they have some drawbacks: choosing the right embedding models, finding the right chunk size, indexing into high-dimensional embeddings, and managing specialized infrastructure.