SymFlow: Optimizing CFG Coverage for EVM Bytecode Symbolic Execution with LLM Embedding

mth0801/SE4SC-LLM


Blockchain-based crowdsourcing logistics offers a decentralized solution to the "last-mile delivery" challenge. Smart contracts are central to automating these workflows, but coding errors can introduce vulnerabilities that cause significant financial losses. Symbolic execution, a powerful program analysis technique, is widely used to explore vulnerable paths and detect such flaws. However, it suffers from the path explosion problem: the exponential growth of symbolic states limits Control Flow Graph (CFG) coverage. This limitation results in the overestimation of critical paths with potential security flaws, reducing the precision of vulnerability detection.

In this work, we study how a Large Language Model (LLM) can improve machine learning-based symbolic execution. We propose SE4SC-LLM, a novel symbolic execution framework for crowdsourcing logistics smart contracts that restructures feature extraction. Using the bytecode-level runtime context, it redesigns the symbolic execution features so that machine learning can be applied to path exploration of smart contracts. More importantly, it uses LLM embedding to convert textual features into numerical features that capture control flow semantics. A coverage-driven feature fusion process then combines these features, and iterative training enhances the path exploration capability of the regression model.

We develop a prototype tool for SE4SC-LLM and release it at https://github.com/mth0801/SE4SC-LLM. Experiments on two public datasets and a vulnerable crowdsourcing logistics smart contract that we constructed show that it improves CFG coverage by 5.7% and the vulnerability detection rate by 9.0%, offering enhanced security for smart contracts in crowdsourcing logistics.
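The pipeline described above (textual features turned into vectors, fused with numeric features, then scored by a regression model that guides path exploration) can be illustrated with a minimal sketch. This is not the SE4SC-LLM implementation: `embed_text` is a hashed bag-of-tokens stand-in for a real LLM embedding, the weighted concatenation in `fuse` is an assumed simplification of the coverage-driven fusion, the SGD linear regressor is a placeholder for whatever regression model the tool uses, and every feature name, value, and reward below is hypothetical.

```python
import hashlib
import math

EMBED_DIM = 8  # hypothetical size; real LLM embeddings are much larger


def embed_text(text, dim=EMBED_DIM):
    """Stand-in for an LLM embedding: hashed bag of tokens, L2-normalised."""
    vec = [0.0] * dim
    for token in text.split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def fuse(numeric, text, text_weight=0.5):
    """Assumed fusion: numeric features followed by the weighted embedding."""
    return list(numeric) + [text_weight * v for v in embed_text(text)]


def predict(weights, features):
    """Linear score: predicted coverage reward of a symbolic state."""
    return sum(w * x for w, x in zip(weights, features))


def train(samples, epochs=200, lr=0.1):
    """Fit a linear regressor by SGD on (features, reward) pairs."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for features, reward in samples:
            error = predict(weights, features) - reward
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights


def pick_state(weights, candidates):
    """Choose the candidate state with the highest predicted reward."""
    return max(candidates, key=lambda feats: predict(weights, feats))


# Hypothetical states: two numeric features plus the opcode text of the
# basic block the state is about to execute.
state_a = fuse([1.0, 0.0], "PUSH1 0x40 MLOAD JUMPI")
state_b = fuse([0.0, 1.0], "JUMPDEST POP STOP")

# Hypothetical rewards: state_a led to new CFG coverage, state_b did not.
weights = train([(state_a, 1.0), (state_b, 0.0)])
best = pick_state(weights, [state_a, state_b])
```

In an iterative setup, the rewards observed after each exploration round would be appended to the training samples and `train` called again, so the scorer gradually prefers states that have historically increased coverage.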
