Particle.news

New RAG Research Reports Adaptive Retrieval Gains, Multimodal Retrieval Edge

Authors tout efficiency gains alongside accuracy improvements, though the claims come from early arXiv preprints that have not yet been peer reviewed.

Overview

  • A new arXiv study finds direct multimodal embedding retrieval outperforms text-summary pipelines for image–text corpora, improving mAP@5 by 13% and nDCG@5 by 11% on a financial earnings benchmark.
  • The multimodal analysis reports more accurate, factually consistent answers when images are stored natively in the vector space rather than summarized into text before embedding.
  • Another arXiv preprint introduces Cluster-based Adaptive Retrieval (CAR), which selects retrieval depth per query by detecting clustering transitions in the distribution of similarity distances.
  • In the authors' tests, CAR reports 60% lower LLM token usage, 22% faster end-to-end latency, and 10% fewer hallucinations, and they also claim a 200% engagement lift after integration into Coinbase’s virtual assistant.
  • A domain-focused preprint presents Mycophyto, a RAG pipeline for arbuscular mycorrhizal fungi that pairs semantic retrieval with structured extraction of experimental metadata stored in a vector database.
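For readers unfamiliar with the metrics behind the multimodal study's 13% and 11% deltas, the following is a minimal, self-contained sketch of mAP@5 and nDCG@5 for a single query with binary relevance. The function names and toy document ids are illustrative, not from the paper.

```python
import math

def average_precision_at_k(ranked, relevant, k=5):
    """AP@k: average of precision values at the ranks where hits occur."""
    hits, score = 0, 0.0
    for i, doc in enumerate(ranked[:k]):
        if doc in relevant:
            hits += 1
            score += hits / (i + 1)  # precision at this hit's rank
    return score / min(len(relevant), k) if relevant else 0.0

def ndcg_at_k(ranked, relevant, k=5):
    """nDCG@k with binary gains: discounted hits over the ideal ordering."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(ranked[:k]) if doc in relevant)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0

# Toy example: two relevant docs retrieved at ranks 1 and 3.
ranked = ["d3", "d7", "d1", "d9", "d2"]
relevant = {"d3", "d1"}
print(round(average_precision_at_k(ranked, relevant), 3))  # 0.833
print(round(ndcg_at_k(ranked, relevant), 2))  # 0.92
```

mAP@5 is then the mean of AP@5 over all benchmark queries; an 11-13% relative lift on these metrics means relevant image-text evidence lands noticeably higher in the top five results.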
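The core idea behind CAR's adaptive depth can be sketched in a few lines. This is a hypothetical simplification, not the paper's algorithm: it treats the largest gap in the sorted similarity scores as the "clustering transition" between the tight cluster of relevant chunks and the long tail, and cuts the retrieval depth there.

```python
def adaptive_depth(similarities, max_depth=20):
    """Pick a per-query retrieval depth at the largest similarity gap.

    Illustrative stand-in for CAR's clustering-transition detection:
    fewer chunks reach the LLM when scores drop off sharply, which is
    how a scheme like this would cut token usage on easy queries.
    """
    scores = sorted(similarities, reverse=True)[:max_depth]
    if len(scores) < 2:
        return len(scores)
    gaps = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    # Keep everything up to and including the score before the biggest drop.
    return gaps.index(max(gaps)) + 1

# Three tightly clustered relevant hits, then a long tail of weak matches:
sims = [0.91, 0.89, 0.88, 0.52, 0.50, 0.49, 0.47]
print(adaptive_depth(sims))  # 3
```

A fixed top-k pipeline would pass all seven chunks here; gap-based truncation passes three, which illustrates how adaptive depth can reduce tokens and latency without discarding the high-similarity cluster.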