N-gram Opcode Analysis for Android Malware Detection

Date
2016-11
Abstract
Android malware has been on the rise in recent years due to the increasing popularity of Android and the proliferation of third-party application markets. Emerging Android malware families are increasingly adopting sophisticated detection avoidance techniques, and this calls for more effective approaches to Android malware detection. Hence, in this paper we present and evaluate an approach based on n-gram opcode features that utilizes machine learning to identify and categorize Android malware. This approach enables automated feature discovery without relying on prior expert or domain knowledge for pre-determined features. Furthermore, by using a data segmentation technique for feature selection, our analysis is able to scale up to 10-gram opcodes. Our experiments on a dataset of 2520 samples achieved an f-measure of 98% using the n-gram opcode based approach. We also provide empirical findings that illustrate factors likely to affect the overall performance trends of n-gram opcodes.
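As a rough illustration of what an n-gram opcode feature is, the sketch below counts contiguous opcode n-grams in a single sequence. It is not the authors' implementation: the function name opcode_ngrams and the sample opcode sequence are hypothetical, and the disassembly, data segmentation, feature selection, and classification steps mentioned in the abstract are not shown.

```python
from collections import Counter
from typing import List


def opcode_ngrams(opcodes: List[str], n: int) -> Counter:
    """Count contiguous n-grams in one opcode sequence (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n over the sequence; each window is one n-gram.
    # If the sequence is shorter than n, the range is empty and so is the result.
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


# Hypothetical opcode sequence, standing in for one disassembled method.
sample = [
    "const", "invoke-virtual", "move-result",
    "const", "invoke-virtual", "return",
]

bigrams = opcode_ngrams(sample, 2)
for gram, count in bigrams.most_common():
    print(gram, count)
```

Counts like these, collected per application, could serve as the feature vector passed to a classifier. Larger values of n produce far more distinct n-grams, which is presumably why the abstract pairs the approach with a feature selection step.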
Description
The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the URI link.
Citation : Kang, B., Yerima, S. Y., Sezer, S., McLaughlin, K. (2016) N-gram opcode analysis for Android malware detection. International Journal on Cyber Situational Awareness, 1(1), pp. 231-255.
ISSN : 2057-2182
Research Group : Cyber Technology Institute (CTI)
Research Institute : Cyber Technology Institute (CTI)