
Improving Crash Data Quality with Large Language Models: Evidence from Secondary Crash Narratives in Kentucky

Published: August 6, 2025 | arXiv ID: 2508.04399v1

By: Xu Zhang, Mei Chen

Potential Business Impact:

Finds hidden car crash causes in police reports.

This study evaluates advanced natural language processing (NLP) techniques to enhance crash data quality by mining crash narratives, using secondary crash identification in Kentucky as a case study. Drawing from 16,656 manually reviewed narratives from 2015-2022, with 3,803 confirmed secondary crashes, we compare three model classes: zero-shot open-source large language models (LLMs) (LLaMA3:70B, DeepSeek-R1:70B, Qwen3:32B, Gemma3:27B); fine-tuned transformers (BERT, DistilBERT, RoBERTa, XLNet, Longformer); and traditional logistic regression as a baseline. Models were calibrated on 2015-2021 data and tested on 1,771 narratives from 2022. Fine-tuned transformers achieved superior performance, with RoBERTa yielding the highest F1-score (0.90) and accuracy (95%). Zero-shot LLaMA3:70B reached a comparable F1 of 0.86 but required 139 minutes of inference; the logistic baseline lagged well behind (F1: 0.66). LLMs excelled in recall for some variants (e.g., Gemma3:27B at 0.94) but incurred high computational costs (up to 723 minutes for DeepSeek-R1:70B), while fine-tuned models processed the test set in seconds after brief training. Further analysis indicated that mid-sized LLMs (e.g., DeepSeek-R1:32B) can rival larger counterparts in performance while reducing runtime, suggesting opportunities for optimized deployments. Results highlight trade-offs between accuracy, efficiency, and data requirements, with fine-tuned transformer models balancing precision and recall effectively on Kentucky data. Practical deployment considerations emphasize privacy-preserving local deployment, ensemble approaches for improved accuracy, and incremental processing for scalability, providing a replicable scheme for enhancing crash-data quality with advanced NLP.
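The abstract compares models on F1, precision, and recall, and notes that a model can score high on recall without leading on F1. The sketch below illustrates why under stated assumptions: the function name `precision_recall_f1` and the toy label lists are hypothetical and invented for illustration, not taken from the study's data or code.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 = secondary crash."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration only (not from the Kentucky data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Recall-oriented classifier: finds every positive but flags four negatives.
y_high_recall = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

# Balanced classifier: misses one positive and flags one negative.
y_balanced = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

for name, y_pred in [("high recall", y_high_recall), ("balanced", y_balanced)]:
    p, r, f1 = precision_recall_f1(y_true, y_pred)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In this toy case the recall-oriented classifier reaches perfect recall yet ends with a lower F1 than the balanced one, because F1 is the harmonic mean of precision and recall and is pulled down by the false alarms.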

Domain-Adapted Pre-trained Language Models for Implicit Information Extraction in Crash Narratives

Computation and Language

Helps cars understand crash details better.

10 Oct 2025 1

89%

On-Premise AI for the Newsroom: Evaluating Small Language Models for Investigative Document Search

Information Retrieval

Helps reporters find facts faster and safer.

29 Sep 2025 1

89%

Fine-tuning of lightweight large language models for sentiment classification on heterogeneous financial textual data

Computation and Language

Small AI models understand money news well.

30 Nov 2025 0


Country of Origin

🇺🇸 United States

Page Count

19 pages
