Hey GPT-OSS, Looks Like You Got It -- Now Walk Me Through It! An Assessment of the Reasoning Language Models' Chain of Thought Mechanism for Digital Forensics
By: Gaëtan Michelet, Janine Schneider, Aruna Withanage, and more
Potential Business Impact:
Shows how an AI explains its answers about computer evidence in crime investigations.
The use of large language models in digital forensics has been widely explored. Beyond identifying potential applications, research has also focused on optimizing model performance for forensic tasks through fine-tuning. However, limited result explainability reduces their operational and legal usability. Recently, a new class of reasoning language models has emerged, designed to handle logic-based tasks through an "internal reasoning" mechanism. Yet, users typically see only the final answer, not the underlying reasoning. One of these reasoning models is gpt-oss, which can be deployed locally, providing full access to its underlying reasoning process. This article presents the first investigation into the potential of reasoning language models for digital forensics. Four test use cases are examined to assess the usability of the reasoning component in supporting result explainability. The evaluation combines a new quantitative metric with qualitative analysis. Findings show that the reasoning component aids in explaining and validating language model outputs in digital forensics at medium reasoning levels, but this support is often limited, and higher reasoning levels do not enhance response quality.
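To make the setup concrete, below is a minimal sketch of what "deploying gpt-oss locally and reading its reasoning trace" might look like. It is not taken from the paper: the endpoint URL, model identifier, the `reasoning_effort` option, the `reasoning_content` field, and the example prompt are all assumptions about a typical locally hosted OpenAI-compatible server (e.g. vLLM or Ollama), whose exact field names vary by server.

```python
# Hypothetical sketch: query a locally served gpt-oss model and inspect the
# reasoning trace alongside the final answer. All endpoint paths, model
# names, and field names below are assumptions, not details from the paper.
import requests

BASE_URL = "http://localhost:8000/v1/chat/completions"  # assumed local server


def ask_with_reasoning(question: str, effort: str = "medium") -> tuple[str, str]:
    """Send a prompt and return (final_answer, reasoning_trace)."""
    payload = {
        "model": "gpt-oss-20b",        # assumed model identifier
        "reasoning_effort": effort,     # "low" | "medium" | "high" (assumed option)
        "messages": [{"role": "user", "content": question}],
    }
    resp = requests.post(BASE_URL, json=payload, timeout=300)
    resp.raise_for_status()
    msg = resp.json()["choices"][0]["message"]
    # Servers that expose the chain of thought often attach it as a field
    # next to the final answer; the key name varies by server.
    return msg.get("content", ""), msg.get("reasoning_content", "")


# Hypothetical forensic-style prompt, in the spirit of the paper's use cases.
answer, trace = ask_with_reasoning(
    "Which entries in this browser history suggest file exfiltration?"
)
print("Answer:", answer)
print("Reasoning trace:", trace[:500])  # the trace is what supports explainability
```

The point of the sketch is the separation the paper relies on: the final answer goes to the user, while the reasoning trace is a distinct artifact an examiner can read to explain or validate the output, at a chosen reasoning-effort level.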
Similar Papers
Learning to Reason: Training LLMs with GPT-OSS or DeepSeek R1 Reasoning Traces
Computation and Language
Teaches smaller computers to think like big ones.
Understanding LLM Scientific Reasoning through Promptings and Model's Explanation on the Answers
Artificial Intelligence
Makes AI better at solving hard science problems.
Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models
Computation and Language
Tests how well computers "think" inside their brains.