(Virtualization-obfuscation protects a program from manual or automated analysis by compiling it into bytecode for a randomized virtual architecture and attaching a corresponding interpreter. Static analysis appears to be helpless on such programs, where only the code of the interpreter is directly visible.
In this paper, we explain the particular challenges of statically analyzing the combination of interpreter and bytecode. A static analysis that computes possible variable values is commonly precise only up to the program location: it keeps one abstract state per location. In the interpreter loop, however, this merges unrelated data flow information from different locations of the bytecode program.
To avoid this loss of information, we show how to lift an existing static analysis to an additional dimension of location, making it sensitive to the value of the virtual program counter. The lifted analysis thus merges data flow only from equal bytecode locations. We lift an existing analysis implemented in the JAKSTAB static analyzer and present preliminary results for processing a virtualization-obfuscated binary.)
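
The effect of the lifting can be illustrated with a minimal sketch. It is not the JAKSTAB implementation: the three-opcode bytecode, the register names, the sets-of-constants abstract domain, and the function names `transfer`, `join`, and `analyze` are all assumptions made for this example. The same worklist analysis is run twice, once keyed by the single interpreter-loop location and once keyed by the value of the virtual program counter (vpc).

```python
# Hypothetical toy bytecode: (opcode, destination register, operand).
PROGRAM = [
    ("set", "a", 1),       # vpc 0: a = 1
    ("set", "b", 2),       # vpc 1: b = 2
    ("mov", "a", "b"),     # vpc 2: a = b
    ("halt", None, None),  # vpc 3
]


def transfer(vpc, regs):
    """Abstract effect of the instruction at one vpc; None for halt."""
    op, dst, arg = PROGRAM[vpc]
    if op == "halt":
        return None
    out = dict(regs)
    out[dst] = frozenset({arg}) if op == "set" else regs[arg]
    return frozenset({vpc + 1}), out


def join(old, new):
    """Join two abstract states (vpc set, register value sets) by union."""
    vpcs = old[0] | new[0]
    regs = {r: old[1][r] | new[1][r] for r in old[1]}
    return vpcs, regs


def analyze(key):
    """Worklist fixpoint; `key` maps a vpc to the location used for merging."""
    entry = (frozenset({0}), {"a": frozenset({0}), "b": frozenset({0})})
    states = {key(0): entry}
    worklist = [key(0)]
    while worklist:
        k = worklist.pop()
        vpcs, regs = states[k]
        for vpc in vpcs:
            succ = transfer(vpc, regs)
            if succ is None:
                continue
            k2 = key(vpc + 1)
            merged = join(states[k2], succ) if k2 in states else succ
            if merged != states.get(k2):
                states[k2] = merged
                worklist.append(k2)
    return states


# Location-sensitive only: every bytecode instruction shares the loop head.
merged = analyze(lambda vpc: "interp_loop")
# Lifted: one abstract state per value of the virtual program counter.
lifted = analyze(lambda vpc: vpc)

print(sorted(merged["interp_loop"][1]["a"]))  # [0, 1, 2]
print(sorted(lifted[3][1]["a"]))              # [2]
```

In this sketch, the analysis keyed by the loop head can only conclude that `a` is one of 0, 1, or 2, because values from all bytecode locations are joined. The vpc-keyed analysis recovers the exact value 2 at the `halt` instruction, since it joins only states that belong to the same bytecode location.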