Context-Aware Search and Retrieval Over Erasure Channels
By: Sara Ghasvarianjahromi, Yauhen Yakimenka, Jörg Kliewer
Potential Business Impact:
Finds better answers even with messy information.
This paper introduces and analyzes a search and retrieval model that adopts key semantic communication principles from retrieval-augmented generation. Specifically, we present an information-theoretic analysis of a remote document retrieval system operating over a symbol erasure channel. The proposed model encodes the feature vector of a query, derived from term-frequency weights of a language corpus, using a repetition code whose rate adapts to the contextual importance of the terms. At the decoder, one of two documents is selected based on its contextual closeness to the recovered query. By leveraging a jointly Gaussian approximation for the true and reconstructed similarity scores, we derive an explicit expression for the retrieval error probability, i.e., the probability that the less similar document is selected. Numerical simulations on synthetic and real-world data (Google NQ) confirm the validity of the analysis. They further demonstrate that assigning greater redundancy to critical features reduces the error rate, highlighting the effectiveness of semantic-aware feature encoding in error-prone communication settings.
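The pipeline described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' code or their analytical result: the proportional allocation rule, the choice to decode a fully erased feature as zero, the dot-product similarity, the uniformly random documents, and all function names (`allocate_repetitions`, `transmit`, `retrieval_error_rate`) are assumptions made here for illustration only.

```python
import random


def allocate_repetitions(importance, budget):
    """Split a total symbol budget across features (hypothetical rule).

    Every feature gets one copy; the remaining symbols are shared in
    proportion to importance, using largest-remainder rounding.
    """
    n = len(importance)
    reps = [1] * n
    extra = budget - n
    total = sum(importance)
    shares = [extra * w / total for w in importance]
    for i, s in enumerate(shares):
        reps[i] += int(s)
    leftover = budget - sum(reps)
    by_fraction = sorted(range(n), key=lambda i: shares[i] - int(shares[i]),
                         reverse=True)
    for i in by_fraction[:leftover]:
        reps[i] += 1
    return reps


def transmit(query, reps, erasure_prob, rng):
    """Send each feature reps[i] times over a symbol erasure channel.

    A feature is recovered if at least one copy survives; otherwise it is
    decoded as 0.0 (an assumption of this sketch).
    """
    received = []
    for x, r in zip(query, reps):
        survived = any(rng.random() >= erasure_prob for _ in range(r))
        received.append(x if survived else 0.0)
    return received


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def retrieval_error_rate(reps, importance, erasure_prob, trials, seed=0):
    """Estimate how often the received query picks a different document
    than the true query would, for two random documents per trial."""
    rng = random.Random(seed)
    n = len(reps)
    errors = 0
    for _ in range(trials):
        # Synthetic query: feature magnitude scales with term importance.
        query = [w * rng.random() for w in importance]
        doc_a = [rng.random() for _ in range(n)]
        doc_b = [rng.random() for _ in range(n)]
        received = transmit(query, reps, erasure_prob, rng)
        true_pick = dot(query, doc_a) >= dot(query, doc_b)
        noisy_pick = dot(received, doc_a) >= dot(received, doc_b)
        errors += true_pick != noisy_pick
    return errors / trials


if __name__ == "__main__":
    importance = [4.0, 1.0, 1.0, 1.0, 1.0]  # hypothetical term weights
    budget = 15                              # total channel symbols
    uniform = [budget // len(importance)] * len(importance)
    adaptive = allocate_repetitions(importance, budget)
    for name, reps in (("uniform", uniform), ("adaptive", adaptive)):
        rate = retrieval_error_rate(reps, importance, 0.5, 2000)
        print(name, reps, rate)
```

Running the script prints the estimated error rate for a uniform and an importance-weighted allocation under the same symbol budget, which gives a rough way to explore the trade-off the abstract describes. The numbers depend entirely on the synthetic assumptions above and should not be read as reproducing the paper's results.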
Similar Papers
GeAR: Generation Augmented Retrieval
Information Retrieval
Finds better answers in lots of text.
Diffusion-Aided Bandwidth-Efficient Semantic Communication with Adaptive Requests
Information Theory
Sends pictures using less data, fixing errors.
A Secure Semantic Communication System Based on Knowledge Graph
Cryptography and Security
Keeps secret messages safe from spies.