
Classification of kinetic-related injury in hospital triage data using NLP

Published: September 5, 2025 | arXiv ID: 2509.04969v1

By: Midhun Shyam, Jim Basilakis, Kieran Luken and more

Potential Business Impact:

Helps doctors sort patient notes faster.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Triage notes, created at the start of a patient's hospital visit, contain a wealth of information that can help medical staff and researchers understand Emergency Department patient epidemiology and the degree of time-dependent illness or injury. Unfortunately, applying modern Natural Language Processing and Machine Learning techniques to analyse triage data faces several challenges. First, hospital data contains highly sensitive information that is subject to privacy regulation and thus needs to be analysed on site. Second, most hospitals and medical facilities lack the hardware necessary to fine-tune a Large Language Model (LLM), much less train one from scratch. Lastly, identifying the records of interest requires expert input to manually label the datasets, which can be time-consuming and costly. In this paper we present a pipeline that enables the classification of triage data using an LLM and limited compute resources. We first fine-tuned a pre-trained LLM with a classifier on a GPU using a small (2k) open-source dataset, and then further fine-tuned the model on a CPU with a hospital-specific dataset of 1000 samples. We demonstrate that, by carefully curating the datasets and leveraging existing models and open-source data, triage data can be classified successfully with limited compute resources.
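The two-stage training order described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: a tiny bias-free bag-of-words logistic regression stands in for the pre-trained LLM and its classifier head, the GPU/CPU split is not modelled, and every note, label, function name and hyperparameter below is hypothetical and invented for illustration. Only the sequence is taken from the abstract: train on a general open dataset first, then continue training the same weights on a small hospital-specific dataset.

```python
import math

def tokenize(note):
    """Lowercase whitespace tokens; a stand-in for an LLM tokenizer."""
    return note.lower().split()

def predict_proba(weights, note):
    """Probability that a note describes a kinetic-related injury."""
    z = sum(weights.get(tok, 0.0) for tok in tokenize(note))
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune(weights, dataset, epochs=20, lr=0.5):
    """Continue training from the given weights with plain SGD on log loss."""
    weights = dict(weights)  # copy, so the earlier stage stays available
    for _ in range(epochs):
        for note, label in dataset:
            error = predict_proba(weights, note) - label
            for tok in tokenize(note):
                weights[tok] = weights.get(tok, 0.0) - lr * error
    return weights

# Hypothetical toy data (1 = kinetic-related injury, 0 = other).
# Stand-in for the small open-source dataset used in the first stage.
OPEN_NOTES = [
    ("fell from ladder injured wrist", 1),
    ("car crash with neck pain", 1),
    ("chest tightness and shortness of breath", 0),
    ("fever and cough for three days", 0),
]
# Stand-in for the hospital-specific dataset, with local abbreviations.
HOSPITAL_NOTES = [
    ("mva driver restrained c/o neck pain", 1),
    ("foosh left wrist swelling", 1),
    ("sob on exertion nil trauma", 0),
    ("febrile vomiting since last night", 0),
]

# Stage 1: train on the open data, starting from empty weights.
stage1_weights = fine_tune({}, OPEN_NOTES)
# Stage 2: keep the stage 1 weights and continue on hospital data.
stage2_weights = fine_tune(stage1_weights, HOSPITAL_NOTES)

for note in ["mva passenger neck pain", "febrile cough vomiting"]:
    label = "kinetic" if predict_proba(stage2_weights, note) > 0.5 else "other"
    print(f"{note!r}: {label}")
```

In this toy setting, an abbreviation such as "foosh" is unknown after the first stage and only becomes informative after the second, which is the role the hospital-specific fine-tuning plays in the abstract's description.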

Country of Origin
🇦🇺 Australia

Page Count
8 pages

Category
Computer Science:
Computation and Language