Passage, Sentence, or Proposition? An Empirical Comparison of Retrieval Granularity Effects on LLM Answer Accuracy in Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) has become a dominant paradigm for grounding large language model (LLM) outputs in external knowledge. While extensive research has focused on retriever architectures and generation strategies, the choice of retrieval granularity—the textual unit indexed and retrieved—remains insufficiently studied. This paper presents a controlled empirical comparison of four retrieval granularity levels: document, passage (100-word window), sentence, and proposition. Experiments are conducted across three open-domain question answering benchmarks (Natural Questions, TriviaQA, and HotpotQA) using two representative dense retrievers (DPR and Contriever) paired with LLaMA-2-7B-Chat as the reader. Results indicate that finer-grained retrieval units consistently improve retrieval recall, with proposition-level indexing achieving up to 6.8 absolute points higher Recall@20 than passage-level on Natural Questions under DPR. End-to-end answer accuracy follows a similar trend for single-hop factoid questions, where proposition-level retrieval yields the highest Exact Match scores. On multi-hop questions in HotpotQA, this advantage diminishes and passage-level retrieval produces comparable or slightly superior accuracy, suggesting that broader contextual units are beneficial when reasoning across multiple evidence pieces. These findings provide practical guidance for RAG pipeline design: retrieval granularity should be selected in accordance with question complexity, and no single granularity level dominates across all conditions.
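To make the units and metrics named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes non-overlapping 100-word windows for passages, a naive punctuation-based sentence splitter, SQuAD-style answer normalization for Exact Match, and answer-string containment as the criterion for a retrieval hit in Recall@k; the abstract specifies none of these details. All function names are hypothetical. Proposition-level segmentation is omitted because it typically requires a trained model.

```python
import re
import string


def split_passages(text, window=100):
    """Split text into non-overlapping windows of `window` words (assumed scheme)."""
    words = text.split()
    return [" ".join(words[i:i + window]) for i in range(0, len(words), window)]


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace (an assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def normalize(answer):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    answer = answer.lower()
    answer = "".join(ch for ch in answer if ch not in string.punctuation)
    answer = re.sub(r"\b(a|an|the)\b", " ", answer)
    return " ".join(answer.split())


def exact_match(prediction, gold_answers):
    """True if the normalized prediction equals any normalized gold answer."""
    return any(normalize(prediction) == normalize(g) for g in gold_answers)


def recall_at_k(ranked_units, gold_answers, k=20):
    """Per-question hit: 1.0 if any top-k unit contains a gold answer, else 0.0."""
    top = [normalize(u) for u in ranked_units[:k]]
    golds = [g for g in (normalize(a) for a in gold_answers) if g]
    return float(any(g in u for u in top for g in golds))
```

Averaging `recall_at_k` over all questions in a benchmark gives the corpus-level Recall@k; averaging `exact_match` gives the Exact Match score. Because the same functions apply to any list of retrieved units, the index can be swapped between passages and sentences without changing the evaluation code.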
Journal of Global Engineering Review

