Identifying Android library dependencies in the presence of code obfuscation and minimization
Abstract
The fast growth of the Android app market motivates the need for tools and techniques to analyze and improve Android apps. A basic capability in this context is to identify the libraries present in a given Android app, including their exact versions. Identifying library dependencies is made difficult by two common build-time transformations: code minimization and obfuscation. Minimization typically incorporates only the used library fragments into an app, while obfuscation renames symbols globally across an app. In this paper, we tackle both challenges with a unified approach that abstracts app and library classes into summaries of their interactions with system libraries. The summarization technique is resistant to obfuscation and is amenable to efficient similarity detection (matching). We lift the class-wise matches into a set of library dependencies by encoding this problem as a global constraint/optimization system across all app classes and available libraries. Our techniques identify the exact libraries, and their versions, used in the apps. For clear apps, recall is almost perfect at 98%; for obfuscated/minimized apps, it stands at 85%.
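To make the idea of obfuscation-resistant class summaries concrete, the following is a minimal sketch, not the method of the paper. It assumes that a class summary is simply the set of system-API signatures the class calls (names in system libraries cannot be renamed by the app's obfuscator), that summaries are compared with Jaccard similarity, and that a library version is scored by the fraction of its classes that have a close match in the app. The greedy scoring stands in for the global constraint/optimization system described above and does not account for minimization. All names (`Summary`, `jaccard`, `score_library`, `best_version`) and the threshold value are hypothetical.

```python
from typing import Dict, FrozenSet

# Assumption: a class summary is the set of system-API signatures it calls.
Summary = FrozenSet[str]


def jaccard(a: Summary, b: Summary) -> float:
    """Jaccard similarity of two summaries, in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def score_library(app: Dict[str, Summary],
                  lib: Dict[str, Summary],
                  threshold: float = 0.8) -> float:
    """Fraction of library classes that have a similar-enough app class.

    Class names (the dict keys) are ignored on purpose, because
    obfuscation may have renamed them in the app.
    """
    if not lib:
        return 0.0
    matched = 0
    for lib_summary in lib.values():
        best = max((jaccard(lib_summary, s) for s in app.values()), default=0.0)
        if best >= threshold:
            matched += 1
    return matched / len(lib)


def best_version(app: Dict[str, Summary],
                 candidates: Dict[str, Dict[str, Summary]]) -> str:
    """Greedy stand-in for the global optimization: pick the top-scoring version."""
    return max(candidates, key=lambda name: score_library(app, candidates[name]))
```

In this sketch, an app whose classes were renamed to `a.a` and `a.b` would still match a library version whose classes call the same system APIs, because only the summaries are compared.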