A two-stage retrieval technique that uses a cross-encoder to re-score and reorder initial search results for precision.
Semantic re-ranking is a two-stage retrieval technique that improves the precision of semantic search results. In the first stage, a fast bi-encoder retrieves a candidate set of documents using approximate nearest-neighbor vector search. In the second stage, a more accurate but computationally expensive cross-encoder model re-scores each candidate by jointly encoding the query and document together, producing a more precise relevance score. The final results are re-ordered by the cross-encoder scores.
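The two stages can be sketched in a few lines of Python. The toy embeddings and the word-overlap scoring function below are stand-ins for real models: in production, stage 1 would query an approximate nearest-neighbor index over bi-encoder embeddings, and stage 2 would run a transformer cross-encoder over each pair.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vec, doc_vecs, k):
    """Stage 1: fast candidate retrieval by vector similarity."""
    order = sorted(range(len(doc_vecs)),
                   key=lambda i: dot(query_vec, doc_vecs[i]),
                   reverse=True)
    return order[:k]

def cross_encoder_score(query, doc):
    """Stage 2 stand-in: score the (query, document) pair jointly.
    Here: simple word overlap. A real cross-encoder runs a transformer
    forward pass over the concatenated query and document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def rerank(query, docs, candidate_ids):
    """Re-order the candidate set by cross-encoder score."""
    return sorted(candidate_ids,
                  key=lambda i: cross_encoder_score(query, docs[i]),
                  reverse=True)

docs = [
    "patent law covers software inventions",
    "cooking recipes for pasta",
    "obviousness standard in software patent law",
]
doc_vecs = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]]  # pretend embeddings
query_vec = [1.0, 0.0]

candidates = retrieve(query_vec, doc_vecs, k=2)  # stage 1 picks docs 0 and 2
final = rerank("software patent obviousness standard", docs, candidates)
print(final)  # → [2, 0]: the cross-encoder promotes doc 2 above doc 0
```

Note that stage 1's ranking (doc 0 first) and stage 2's ranking (doc 2 first) disagree: the cheap vector score and the joint pairwise score capture different signals, which is the whole point of the second stage.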
Semantic re-ranking has become a standard component of production RAG systems. Bi-encoder retrieval is fast and scalable, but it often returns results that are topically related rather than precisely relevant to the query. Cross-encoder re-ranking dramatically improves precision: returning the 10 most relevant documents instead of 10 related-but-not-quite-right ones can be the difference between a correct and an incorrect generated answer. Cohere, Jina AI, and Voyage AI all offer specialized re-ranking models.
Bi-encoder models encode queries and documents independently into fixed-size vectors, enabling pre-computation of document embeddings. Cross-encoder models process query-document pairs jointly through a transformer, attending to both simultaneously — this allows the model to detect subtle relevance signals that bi-encoders miss. The trade-off is speed: bi-encoders can search millions of documents in milliseconds, while cross-encoders must process each candidate pair individually.
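The cost asymmetry described above can be made concrete with a toy counter: the stand-in "models" below just count invocations, each call representing one transformer forward pass. All names here are illustrative.

```python
class CountingModel:
    """Stand-in for an encoder; counts forward passes instead of running them."""
    def __init__(self):
        self.calls = 0

def encode(model, text):
    model.calls += 1   # one forward pass per text
    return None        # stand-in for a fixed-size embedding

def score_pair(model, query, doc):
    model.calls += 1   # one forward pass per (query, document) pair
    return 0.0         # stand-in for a relevance score

bi_encoder = CountingModel()
cross_encoder = CountingModel()
docs = [f"doc {i}" for i in range(1000)]

# Bi-encoder: document embeddings are computed once, offline...
doc_embeddings = [encode(bi_encoder, d) for d in docs]
offline_calls = bi_encoder.calls

# ...so at query time only the query itself needs encoding.
query_embedding = encode(bi_encoder, "some query")
query_time_bi = bi_encoder.calls - offline_calls

# Cross-encoder: every candidate pair needs its own forward pass at query time.
for d in docs:
    score_pair(cross_encoder, "some query", d)
query_time_cross = cross_encoder.calls

print(query_time_bi, query_time_cross)  # → 1 1000
```

This is why the cross-encoder is applied only to the small candidate set from stage 1, never to the full corpus.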
A legal research AI retrieves 100 candidate cases using vector similarity search for the query 'software patent obviousness standard.' A cross-encoder re-ranker then scores each of the 100 cases by jointly analyzing the query and case text, re-ordering them by precise relevance. The top 5 results after re-ranking are dramatically more relevant than the top 5 from vector search alone — the difference between useful and unusable AI assistance.