Better Program Analysis for Security via Data Flow Tracking and Symbolic Execution [thesis]


Navid Emamdoost (Ph.D. 2021)


Program analysis techniques have numerous applications in software optimization and correctness. Focusing on software security, analyses like data flow analysis and symbolic execution have been proven effective in many settings. Information-flow security measures and bounds the propagation of sensitive information throughout the code. Automatic vulnerability detection provides tooling for tracking and identifying potentially buggy code. Data flow analysis is scalable but imprecise due to over-approximations. Symbolic execution is precise and sound but suffers on scalability. In this thesis, we develop multiple dynamic and static analysis techniques that employ data flow analysis and symbolic execution together to address imprecision and scalability issues. More specifically, we revisit quantitative information flow analysis where the goal is measuring the amount of information flow from source to sink. We propose techniques to enable the incorporation of symbolic influence measurement and improve the precision of the final result. For static vulnerability detection, we propose a new tool to effectively detect memory leak bugs in big codebases like an OS kernel even in specialized modules. Our tool automatically identifies allocation and deallocation functions, and then reasons about the true location of memory release. It also employs under-constrained symbolic execution to improve the true positive ratio.We were able to detect numerous new memory leak bugs in the Linux kernel code. In both scenarios, we employ data flow analysis to shrink the search space for symbolic execution in a way that still can benefit from its precise property checking.

Link to full paper

Better Program Analysis for Security via Data Flow Tracking and Symbolic Execution


data flow analysis, security