CodeBC: A More Secure Large Language Model for Smart Contract Code Generation in Blockchain

Published: April 28, 2025 | arXiv ID: 2504.21043v2

By: Lingxiang Wang, Hainan Zhang, Qinnan Zhang and more

Potential Business Impact:

Makes computer code for money safe from hackers.

Business Areas:

Ethereum Blockchain and Cryptocurrency

Large language models (LLMs) excel at generating code from natural language instructions, yet they often lack an understanding of security vulnerabilities. This limitation makes it difficult for LLMs to avoid security risks in generated code, particularly in high-security programming tasks such as smart contract development for blockchain. Researchers have attempted to enhance the vulnerability awareness of these models by training them to differentiate between vulnerable and fixed code snippets. However, this approach relies heavily on manually labeled vulnerability data, which is only available for popular languages like Python and C++. For low-resource languages like Solidity, used in smart contracts, large-scale annotated datasets are scarce and difficult to obtain. To address this challenge, we introduce CodeBC, a code generation model specifically designed for generating secure smart contracts in blockchain. CodeBC employs a three-stage fine-tuning approach based on CodeLlama, distinguishing itself from previous methods by not relying on pairwise vulnerability location annotations. Instead, it leverages vulnerability and security tags to teach the model the differences between vulnerable and secure code. During the inference phase, the model leverages security tags to generate secure and robust code. Experimental results demonstrate that CodeBC outperforms baseline models in terms of BLEU, CodeBLEU, and compilation pass rates, while significantly reducing vulnerability rates. These findings validate the effectiveness and cost-efficiency of our three-stage fine-tuning strategy, making CodeBC a promising solution for generating secure smart contract code.
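The tag-based conditioning described in the abstract can be illustrated with a minimal sketch. It assumes each training sample carries only a sample-level label (vulnerable or secure) rather than pairwise vulnerability-location annotations, and that the label is exposed to the model as a text tag prepended to the sample. The tag strings, the `Sample` type, and both helper functions are hypothetical names chosen for illustration; the abstract does not specify the actual tag format or prompt template used by CodeBC.

```python
from dataclasses import dataclass

# Hypothetical tag strings: the abstract does not give the real tag text.
SECURE_TAG = "[SECURE]"
VULNERABLE_TAG = "[VULNERABLE]"


@dataclass
class Sample:
    """One fine-tuning sample with a sample-level label only."""
    instruction: str      # natural-language description of the contract
    code: str             # Solidity source text
    is_vulnerable: bool   # coarse label, no vulnerability location needed


def format_training_example(sample: Sample) -> str:
    """Prepend the tag matching the label so the model can associate
    each tag with the style of code that follows it."""
    tag = VULNERABLE_TAG if sample.is_vulnerable else SECURE_TAG
    return f"{tag}\n// Instruction: {sample.instruction}\n{sample.code}"


def format_inference_prompt(instruction: str) -> str:
    """At inference time, always condition on the secure tag."""
    return f"{SECURE_TAG}\n// Instruction: {instruction}\n"


if __name__ == "__main__":
    sample = Sample(
        instruction="Let the owner withdraw the contract balance.",
        code="function withdraw() external onlyOwner { /* ... */ }",
        is_vulnerable=False,
    )
    print(format_training_example(sample))
    print(format_inference_prompt("Implement a simple token transfer."))
```

The point of the sketch is the asymmetry between the two phases: both tags appear during training, while only the secure tag is used to steer generation.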

Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection

Cryptography and Security

Finds hidden bugs in computer money code.

4 Jan 2025 0

88%

Code Vulnerability Detection Across Different Programming Languages with AI Models

Cryptography and Security

Finds hidden bugs in computer code.

14 Aug 2025 0

88%

CFCEval: Evaluating Security Aspects in Code Generated by Large Language Models

Software Engineering

Tests computer code for mistakes and safety.

6 Dec 2025 1


Country of Origin

🇨🇳 China

Repos / Data Links

github.com github.com huggingface.co

Page Count

13 pages
