AI Malware PoC: Reinforcement Learning Evades Microsoft Defender

A new proof-of-concept Trojan, developed using targeted reinforcement learning (RL), has demonstrated the ability to consistently bypass Microsoft Defender for Endpoint. Kyle Avery of Outflank is set to present the tool at the upcoming Black Hat conference. It marks a significant advance in malware development, using a large language model (LLM) to produce sophisticated evasion techniques.

Since late 2023, concerns have grown about hackers using LLMs to enhance malware creation. While previous AI applications in cybercrime focused on generating simple malware and phishing content, Avery's project showcases a more advanced approach. He placed an open-source model, Qwen 2.5, in a sandbox environment and built a program that rewards the model for producing effective evasion tools.

The key innovation lies in using RL with verifiable rewards, which allows the model to specialize in evading security software. By integrating an API that queries Microsoft Defender alerts into the training loop, Avery gave the model a measurable signal, and it learned to generate malware that triggered progressively less severe alerts.

