Agri-Query: A Case Study on RAG vs. Long-Context LLMs for Cross-Lingual Technical Question Answering
By: Julius Gun, Timo Oksanen
Potential Business Impact:
Helps computers answer questions from technical manuals more accurately.
We present a case study evaluating large language models (LLMs) with 128K-token context windows on a technical question answering (QA) task. The benchmark is built on a user manual for an agricultural machine, available in English, French, and German, and simulates a cross-lingual information retrieval scenario in which questions posed in English are answered against all three language versions of the manual. The evaluation focuses on realistic "needle-in-a-haystack" retrieval challenges and includes unanswerable questions to test for hallucination. Across nine long-context LLMs, we compare direct long-context prompting against three Retrieval-Augmented Generation (RAG) strategies (keyword, semantic, and hybrid), using an LLM-as-a-judge for evaluation. For this specific manual, Hybrid RAG consistently outperforms direct long-context prompting: models such as Gemini 2.5 Flash and the smaller Qwen 2.5 7B achieve high accuracy (over 85%) in all three languages with RAG. The paper contributes a detailed analysis of LLM performance in a specialized industrial domain and an open framework for similar evaluations, highlighting practical trade-offs and challenges.
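To make the hybrid strategy concrete: hybrid RAG fuses keyword retrieval with semantic (embedding-based) retrieval before passing the top passages to the LLM. The abstract does not specify the authors' libraries or fusion method, so the following is a minimal sketch under stated assumptions: BM25 via the rank_bm25 package for keyword scoring, a sentence-transformers model for embeddings, and reciprocal rank fusion as one common way to combine the two rankings. The toy passages and model name are illustrative, not the paper's data or implementation.

import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

# Toy "manual" split into passages; a real pipeline would chunk the full document.
chunks = [
    "Check the hydraulic oil level before each operation.",
    "Front tire pressure must be 2.4 bar under normal load.",
    "Grease the PTO shaft every 50 operating hours.",
]

# Keyword index: BM25 over whitespace-tokenized, lowercased chunks.
bm25 = BM25Okapi([c.lower().split() for c in chunks])

# Semantic index: unit-normalized embeddings, so dot product equals cosine similarity.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = encoder.encode(chunks, normalize_embeddings=True)

def hybrid_retrieve(question: str, k: int = 2, rrf_c: int = 60) -> list[str]:
    """Fuse BM25 and embedding rankings with reciprocal rank fusion (RRF)."""
    kw_scores = bm25.get_scores(question.lower().split())
    q_vec = encoder.encode(question, normalize_embeddings=True)
    sem_scores = chunk_vecs @ q_vec
    fused = np.zeros(len(chunks))
    for scores in (kw_scores, sem_scores):
        for rank, idx in enumerate(np.argsort(-scores)):  # best rank first
            fused[idx] += 1.0 / (rrf_c + rank + 1)
    return [chunks[i] for i in np.argsort(-fused)[:k]]

print(hybrid_retrieve("What tire pressure does the manual specify?"))

In a cross-lingual setting like the paper's, the embedding side would use a multilingual encoder so English questions can match French or German passages, while the keyword side benefits mainly when question and manual share a language.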
Similar Papers
Comparing the Performance of LLMs in RAG-based Question-Answering: A Case Study in Computer Science Literature
Computation and Language
Helps AI answer questions more truthfully and accurately.
Aligning LLMs for the Classroom with Knowledge-Based Retrieval -- A Comparative RAG Study
Artificial Intelligence
Makes AI answers for classroom use more truthful.
On the Consistency of Multilingual Context Utilization in Retrieval-Augmented Generation
Computation and Language
Helps computers answer questions in any language.