Abstract
Malware analysis and detection is an endless competitive battle between malware designers and the anti-malware community. Recently, researchers have proposed state-of-the-art malware detection models built using machine learning and deep learning, which are necessary to detect advanced metamorphic malware. However, these malware detection models are susceptible to adversarial attacks. We therefore propose to use reinforcement learning to design Android malware detection models that are robust against adversarial attacks. We propose Single Policy and Multi Policy based evasion attacks for the Perfect Knowledge and Limited Knowledge scenarios, respectively, against many malware detection models built using four different sets of classifiers (including bagging, boosting, and deep neural networks). The motivation is to identify the adversarial vulnerabilities in Android malware detection models and then propose defences against them.