A Method for the Architecture of a Medical Vertical Large Language Model Based on Deepseek R1

Published: April 25, 2025 | arXiv ID: 2505.00025v2

By: Mingda Zhang, Jianglong Qin

Potential Business Impact:

Makes AI doctors understand medical info faster.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Despite significant advances in foundation models like DeepSeek-R1 and ChatGPT, their deployment in medical settings faces critical challenges including computational requirements and professional knowledge barriers. This paper presents an efficient lightweight medical large language model architecture that systematically addresses these challenges through three-dimensional optimization: knowledge acquisition, model compression, and computational enhancement. We design a knowledge transfer pipeline from DeepSeek-R1-Distill-70B to DeepSeek-R1-Distill-7B using Low-Rank Adaptation (LoRA) for precise medical knowledge retention. Through 4-bit quantization and mixed-precision strategies, we achieve substantial model compression while preserving medical reasoning capabilities. The inference framework incorporates Flash Attention acceleration and continuous batching, complemented by specialized prompt templates for diverse medical queries. Experimental evaluation on medical benchmarks demonstrates that our approach maintains 92.1% accuracy on USMLE examinations while reducing memory consumption by 64.7% and inference latency by 12.4% compared to baseline models. This work provides a practical solution for deploying advanced language models in resource-constrained medical environments, enabling broader accessibility of AI-assisted healthcare.
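As a rough illustration of why LoRA and 4-bit quantization reduce resource requirements, the sketch below counts the trainable parameters of a low-rank adapter and estimates weight-only memory at different bit widths. The layer shape, the rank, and the 7e9 parameter count are assumptions chosen for illustration and are not taken from the paper. The estimate ignores activations, the KV cache, and quantization metadata, so it is not expected to reproduce the reported 64.7% reduction.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GiB (no activations, KV cache, or metadata)."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical layer shape and adapter rank, for illustration only.
d_out, d_in, rank = 4096, 4096, 16
full = d_out * d_in
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full matrix: {full} params, LoRA factors: {lora} params ({lora / full:.2%})")

# Assumed parameter count of roughly 7 billion for a "7B" model.
fp16 = weight_memory_gib(7e9, 16)
int4 = weight_memory_gib(7e9, 4)
print(f"16-bit weights: {fp16:.2f} GiB, 4-bit weights: {int4:.2f} GiB")
```

The adapter trains only the two small factors while the original weight matrix stays frozen, which is why the trainable fraction is small. The gap between the ideal 4x weight reduction and any measured figure would come from layers kept at higher precision and from runtime overhead.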

Related Papers:

Lightweight Clinical Decision Support System using QLoRA-Fine-Tuned LLMs and Retrieval-Augmented Generation
Computation and Language | 6 May 2025
Helps doctors suggest treatments using patient info.

Medical Reasoning in LLMs: An In-Depth Analysis of DeepSeek R1
Computation and Language | 27 Mar 2025
Helps doctors diagnose illness with a smart computer.

Scaling Down to Scale Up: Towards Operationally-Efficient and Deployable Clinical Models via Cross-Modal Low-Rank Adaptation for Medical Vision-Language Models
CV and Pattern Recognition | 29 Nov 2025
Helps doctors find diseases in CT scans faster.

Country of Origin

🇨🇳 China

Page Count

14 pages
