Eugene Yan 7/20/2021

Mailbag: How to Bootstrap Labels for Relevant Docs in Search

This article addresses a reader's question on how industry engineers obtain the 'total number of relevant documents' to evaluate search systems with metrics like Recall@K. It advises starting with a lexical search system (e.g., BM25), deploying it to collect user click data as labels, and then using that data to train and evaluate a semantic search model, avoiding the high cost of large-scale human annotation.
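As a rough illustration of the idea above, the sketch below treats clicked documents as proxy relevance labels and computes Recall@K for a candidate ranking. The log format, the function names (`build_click_labels`, `recall_at_k`), and the sample data are all hypothetical and not taken from the original article. Clicks only approximate relevance, so the "total number of relevant documents" here is really the number of distinct clicked documents per query.

```python
from collections import defaultdict


def build_click_labels(click_log):
    """Group clicked doc ids by query; clicks act as proxy relevance labels.

    `click_log` is assumed to be an iterable of (query, doc_id) pairs,
    e.g. collected from a deployed lexical (BM25-style) search system.
    """
    labels = defaultdict(set)
    for query, doc_id in click_log:
        labels[query].add(doc_id)
    return dict(labels)


def recall_at_k(ranked_doc_ids, relevant_doc_ids, k):
    """Fraction of the relevant docs that appear in the top-k results.

    Returns None when there are no known relevant docs for the query,
    since recall is undefined in that case.
    """
    if not relevant_doc_ids:
        return None
    top_k = set(ranked_doc_ids[:k])
    return len(top_k & relevant_doc_ids) / len(relevant_doc_ids)


# Hypothetical click log from the bootstrapped lexical system.
click_log = [
    ("running shoes", "d1"),
    ("running shoes", "d4"),
    ("running shoes", "d1"),  # repeated clicks collapse into one label
    ("trail map", "d7"),
]
labels = build_click_labels(click_log)

# Hypothetical ranking produced by a semantic search model under evaluation.
candidate_ranking = ["d4", "d2", "d1", "d9"]
score = recall_at_k(candidate_ranking, labels["running shoes"], k=2)
print(score)
```

With this sample data, only one of the two clicked documents appears in the top 2, so Recall@2 is 0.5. In practice, click labels inherit the biases of the system that collected them (position bias, and no labels for documents the lexical system never surfaced), so the resulting recall is best read as a relative signal between models.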
