Standard Occupation Classifier -- A Natural Language Processing Approach

Published: November 28, 2025 | arXiv ID: 2511.23057v1

By: Sidharth Rony, Jack Patman

Potential Business Impact:

Helps match job ads to job types.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Standard Occupational Classification (SOC) systems categorize jobs and occupations by their similarities in job duties, skills, and qualifications. Combining these classifications with Big Data from job advertisements offers the prospect of investigating labour demand specific to individual occupations. This project investigates the use of recent developments in natural language processing to construct a classifier capable of assigning an occupation code to a given job advertisement. We develop various classifiers for both UK ONS SOC and US O*NET SOC, using different language models. We find that an ensemble model, which combines Google BERT and a neural network classifier while considering job title, description, and skills, achieved the highest prediction accuracy. Specifically, the ensemble model exhibited a classification accuracy of up to 61% for the lower (or fourth) tier of SOC, and 72% for the third tier of SOC. This model could provide up-to-date, accurate information on the evolution of the labour market using job advertisements.
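The gap between fourth-tier and third-tier accuracy follows from the hierarchical structure of SOC codes: a prediction that misses the exact code can still land in the correct parent group. The sketch below illustrates one way such tier-level accuracy could be computed, assuming UK ONS SOC-style four-digit codes in which the first three digits identify the third tier. The function name `tier_accuracy` and the sample labels are hypothetical and invented for illustration; they are not the paper's evaluation code or data.

```python
from typing import Sequence


def tier_accuracy(
    true_codes: Sequence[str],
    predicted_codes: Sequence[str],
    digits: int,
) -> float:
    """Share of predictions whose first `digits` characters match the true code.

    Assumes prefix-structured codes (as in UK ONS SOC), where truncating a
    four-digit code to three digits gives its parent group.
    """
    if len(true_codes) != len(predicted_codes):
        raise ValueError("true_codes and predicted_codes must have the same length")
    if not true_codes:
        raise ValueError("at least one example is required")
    hits = sum(t[:digits] == p[:digits] for t, p in zip(true_codes, predicted_codes))
    return hits / len(true_codes)


# Hypothetical four-digit labels, invented for illustration (not from the paper).
true_codes = ["2134", "2134", "4159", "6131", "3544"]
predicted_codes = ["2134", "2139", "4151", "6131", "1132"]

unit_group_acc = tier_accuracy(true_codes, predicted_codes, digits=4)   # fourth tier
minor_group_acc = tier_accuracy(true_codes, predicted_codes, digits=3)  # third tier

print(f"fourth-tier accuracy: {unit_group_acc:.2f}")
print(f"third-tier accuracy:  {minor_group_acc:.2f}")
```

In this toy example, two near-miss predictions differ from the true code only in the last digit, so they count as errors at the fourth tier but as correct at the third. US O*NET SOC codes use a different format, so the prefix truncation shown here would need adapting for them.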

Page Count
43 pages

Category
Computer Science:
Computation and Language