Rethinking Pointer Reasoning in Symbolic Execution
Symbolic execution is a popular program analysis technique that allows seeking for bugs by reasoning over multiple alternative execution states at once. In the last decade, techniques for symbolic execution have evolved significantly, leading to major practical breakthroughs. In 2008, Microsoft has announced that symbolic execution has been able to reveal nearly 30% of all the bugs discovered during the development of Windows 7. In 2016, the DARPA Cyber Grand Challenge (CGC) hosted systems that can detect and fix vulnerabilities in unknown software with no human intervention. The event demonstrated that symbolic executors can be competitive with human experts, paving the road to unprecedented applications that have the potential to shape software reliability in the next decades.
In practice, as the number of states to explore may grow exponentially, a symbolic executor may quickly run out of space. For instance, a memory access to a symbolic address may potentially reference the entire address space, leading to a combinatorial explosion of the possible execution states. To cope with this issue, state-of-the-art executors concretize symbolic addresses, i.e., a single value for the address is chosen using a heuristic. Unfortunately, this could result in missing interesting states, e.g., where a bug arises.
The main goal of this project is to devise a new approach to symbolic memory that reduces the need for concretization, hence offering the opportunity for broader state explorations and more precise pointer reasoning. Rather than mapping address instances to data as previous tools do, our technique will map symbolic address expressions with data, maintaining the possible alternative states resulting from the memory referenced by a symbolic address in a compact, implicit form. To test our approach, we plan to implement it in ANGR, a state-of-the-art symbolic engine, and to perform an experimental investigation using benchmarks taken from the DARPA CGC.