An Efficient Long-Context Ranking Architecture With Calibrated LLM Distillation: Application to Person-Job Fit
By: Warren Jouanneau, Emma Jouffroy, Marc Palyart
Finding the most relevant person for a job proposal in real time is challenging, especially when resumes are long, structured, and multilingual. In this paper, we propose a re-ranking model based on a new late cross-attention architecture that decomposes both resumes and project briefs to handle long-context inputs efficiently, with minimal computational overhead. To mitigate biases in historical data, we use a generative large language model (LLM) as a teacher that produces fine-grained, semantically grounded supervision. This signal is distilled into our student model via an enriched distillation loss function. The resulting model produces skill-fit scores that enable consistent and interpretable person-job matching. Experiments on relevance, ranking, and calibration metrics demonstrate that our approach outperforms state-of-the-art baselines.
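The abstract does not specify the form of the enriched distillation loss. A minimal illustrative sketch, assuming (hypothetically) that it blends a pointwise calibration term pulling student scores toward the teacher's soft skill-fit scores with a pairwise term preserving the teacher's ranking; the function name, margin, and blending weight are all assumptions, not the authors' definition:

```python
def distillation_loss(student_scores, teacher_scores, margin=0.1, alpha=0.5):
    """Sketch of a calibrated distillation loss (hypothetical form).

    Combines:
      - a pointwise MSE term against the teacher's soft scores (calibration),
      - a pairwise hinge term enforcing the teacher's ordering (ranking).
    """
    n = len(student_scores)
    # Pointwise term: push student scores toward the teacher's calibrated scores.
    mse = sum((s - t) ** 2 for s, t in zip(student_scores, teacher_scores)) / n
    # Pairwise term: whenever the teacher ranks item i above item j, penalize
    # the student if it does not score i above j by at least `margin`.
    hinge, pairs = 0.0, 0
    for i in range(n):
        for j in range(n):
            if teacher_scores[i] > teacher_scores[j]:
                hinge += max(0.0, margin - (student_scores[i] - student_scores[j]))
                pairs += 1
    rank = hinge / pairs if pairs else 0.0
    return alpha * mse + (1 - alpha) * rank
```

For example, a student that already respects the teacher's ordering by more than the margin pays only the small calibration term.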